Shared from "tweets-explain" on Inkdown

Claude Code Compaction Architecture

Claude Code has an interesting recipe for compaction. This is how it works.

Again, shared by Claude Code

Claude Code does not rely on a single compaction mechanism. It has three layers, and each layer handles a different kind of overload.

Layers

  • MicroCompact — cheap, every turn
  • Session Memory Compact — medium, no API call
  • Legacy Compact — expensive, full summarization

Window and Triggering Behavior

  • Effective window = model window minus reserved space
  • Auto-compact trigger = effective window minus ~13K tokens
  • Manual blocking limit = effective window minus ~3K tokens
  • The system tries to compact before hitting the hard “prompt too long” wall
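
The threshold arithmetic above can be sketched in a few lines. This is a minimal illustration, not Claude Code's actual implementation: the function and class names are hypothetical, the margins are the approximate figures from the list, and reading "manual blocking limit" as the point where a request is refused until compaction happens is an assumption.

```python
from dataclasses import dataclass

# Approximate margins taken from the list above; the real values may differ.
AUTO_COMPACT_MARGIN = 13_000
BLOCKING_MARGIN = 3_000


@dataclass(frozen=True)
class Thresholds:
    effective_window: int
    auto_compact_at: int
    blocking_limit: int


def compute_thresholds(model_window: int, reserved: int) -> Thresholds:
    """Derive both trigger points from the model window and the reserved space."""
    effective = model_window - reserved
    return Thresholds(
        effective_window=effective,
        auto_compact_at=effective - AUTO_COMPACT_MARGIN,
        blocking_limit=effective - BLOCKING_MARGIN,
    )


def classify(token_count: int, t: Thresholds) -> str:
    """Say what a given context size should trigger (hypothetical labels)."""
    if token_count >= t.blocking_limit:
        return "blocked"        # assumed: too close to the hard wall to proceed
    if token_count >= t.auto_compact_at:
        return "auto_compact"   # compact now, before hitting the wall
    return "ok"
```

Because the auto-compact trigger sits roughly 10K tokens below the blocking limit, compaction normally fires well before a request would be refused.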

    1. Cheap Compaction Happens Every Turn

    • The cheapest layer, MicroCompact, runs on every single turn
    • It runs before every API request
    • It saves tokens without changing the actual conversation structure
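
One way to picture a per-turn pass that saves tokens while leaving the conversation structure alone is to blank out large, old tool results and keep every message in place. The sketch below assumes that reading. The message shape, the placeholder text, and the `keep_recent` and `max_chars` parameters are all hypothetical.

```python
from typing import TypedDict


class Message(TypedDict):
    role: str      # "user", "assistant", or "tool"
    content: str


PLACEHOLDER = "[old tool result cleared]"  # hypothetical marker text


def micro_compact(
    messages: list[Message], keep_recent: int = 2, max_chars: int = 200
) -> list[Message]:
    """Replace large, old tool results with a short placeholder.

    The number, order, and roles of messages are unchanged; only the
    content of stale tool results shrinks.
    """
    tool_indices = [i for i, m in enumerate(messages) if m["role"] == "tool"]
    # The most recent tool results are protected from trimming.
    protected = set(tool_indices[-keep_recent:]) if keep_recent > 0 else set()

    trimmed: list[Message] = []
    for i, m in enumerate(messages):
        is_stale_tool = m["role"] == "tool" and i not in protected
        if is_stale_tool and len(m["content"]) > max_chars:
            trimmed.append({"role": "tool", "content": PLACEHOLDER})
        else:
            trimmed.append(m)
    return trimmed
```

Since the function returns a new list, it can run before every request without touching the stored transcript.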

    2. Real Compaction Is Boundary-Based

    • Full compaction works by marking a compact boundary in the transcript
    • Claude is asked to summarize prior conversation
    • The old transcript is not rewritten or deleted

    3. Session Memory Compaction Is Tried First

    • Session memory compaction is attempted before full summarization
    • It rebuilds context from the session memory file plus recent messages
    • No model call is needed
    • Only if that fails does it fall back to full summarization
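
The try-cheap-first ordering can be sketched as below. This is an illustration under assumptions: `memory_notes` stands in for whatever would be read from the session memory file, `summarize` stands in for the model call, and the function names and `keep_recent` parameter are hypothetical.

```python
from typing import Callable, Optional


def rebuild_from_session_memory(
    memory_notes: Optional[str], recent: list[str]
) -> Optional[list[str]]:
    """Rebuild context from stored notes plus recent messages, with no model call.

    Returns None when there is nothing usable, which signals a fallback.
    """
    if memory_notes is None or not memory_notes.strip():
        return None
    return ["[session memory]\n" + memory_notes.strip(), *recent]


def compact(
    messages: list[str],
    memory_notes: Optional[str],
    summarize: Callable[[list[str]], str],
    keep_recent: int = 4,
) -> tuple[list[str], str]:
    """Try the session-memory path first; summarize only if it fails."""
    split = max(len(messages) - keep_recent, 0)
    older, recent = messages[:split], messages[split:]

    rebuilt = rebuild_from_session_memory(memory_notes, recent)
    if rebuilt is not None:
        return rebuilt, "session_memory"

    # Expensive fallback: ask the model to summarize the older messages.
    return ["[summary]\n" + summarize(older), *recent], "legacy"
```

The point of the ordering is cost: the summarizer is only invoked when the cheaper reconstruction has nothing to work with.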

    4. Resume Loads Only the World After the Last Boundary

    • On resume, Claude loads only the context after the last compact boundary
    • Once that boundary is found, the pre-boundary payload is dropped from the in-memory load buffer
    • So Claude resumes with only post-boundary context
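
A simple way to get this behavior is to stream through the stored transcript and reset the load buffer every time a boundary marker appears. The sketch below assumes transcript entries are dictionaries with a `type` field and that boundaries are tagged `"compact_boundary"`. Both are illustrative choices, not the actual on-disk format.

```python
from typing import Iterable


def load_for_resume(entries: Iterable[dict]) -> list[dict]:
    """Return only the entries that follow the last compact boundary."""
    buffer: list[dict] = []
    for entry in entries:
        if entry.get("type") == "compact_boundary":
            # Everything read so far is pre-boundary: drop it from memory.
            buffer.clear()
            continue
        buffer.append(entry)
    return buffer
```

Note that the stored transcript is only read here, never modified, which matches the point that old history is not rewritten or deleted.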

    5. The Smartest Trick: Preserved-Tail Relinking

    • Compaction does not just keep a summary
    • It also keeps a preserved tail of recent live messages
    • On resume, that tail is stitched back onto the summary chain
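
To make "stitched back onto the summary chain" concrete, the sketch below assumes messages form a linked chain through parent pointers, so the first preserved message still points at a parent that was dropped at the boundary. The `Node` type, its fields, and the relinking rule are assumptions for illustration. The source does not describe the actual data structure.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class Node:
    uuid: str
    parent: Optional[str]  # uuid of the previous message, if any
    text: str


def relink_tail(summary: Node, tail: list[Node]) -> list[Node]:
    """Attach the preserved tail to the summary so the chain is unbroken."""
    if not tail:
        return [summary]
    # The first tail message pointed at a dropped pre-boundary message;
    # repoint it at the summary instead.
    head = replace(tail[0], parent=summary.uuid)
    return [summary, head, *tail[1:]]


def is_connected(chain: list[Node]) -> bool:
    """Check that every node's parent is the node directly before it."""
    return all(chain[i].parent == chain[i - 1].uuid for i in range(1, len(chain)))
```

With a summary node `s` and a tail `m8 -> m9` whose head originally pointed at the dropped `m7`, the result is the chain `s -> m8 -> m9`: recent messages survive verbatim while older history is represented only by the summary.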

    6. Compaction Is Really a Pipeline

    Compaction is not one event. It is a pipeline:

    • Trim cheap tool-result bloat
    • Check thresholds to decide whether full compaction is needed
    • Try disk-backed summary reconstruction first
    • Fall back to legacy compact and ask Claude to summarize if needed
    • Append a compact boundary so future loads start from there
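
The five steps above can be sketched as a single function with each stage injected as a callable. Everything here is hypothetical scaffolding: the `Result` type, the entry dictionaries, and the stage signatures are assumptions, and the stored log is modeled as append-only to reflect that the old transcript is not rewritten.

```python
from dataclasses import dataclass
from typing import Callable, Optional

BOUNDARY = {"type": "compact_boundary"}  # hypothetical marker entry


@dataclass
class Result:
    log: list[dict]      # stored transcript (append-only)
    context: list[dict]  # what the next request would actually send
    path: str            # which branch of the pipeline ran


def run_compaction(
    transcript: list[dict],
    token_count: Callable[[list[dict]], int],
    auto_compact_at: int,
    trim: Callable[[list[dict]], list[dict]],
    rebuild_from_memory: Callable[[list[dict]], Optional[list[dict]]],
    summarize: Callable[[list[dict]], str],
) -> Result:
    # 1. Trim cheap tool-result bloat from the request view.
    working = trim(transcript)

    # 2. Check the threshold; below it, trimming alone is enough.
    if token_count(working) < auto_compact_at:
        return Result(log=transcript, context=working, path="trim_only")

    # 3. Try the disk-backed reconstruction first (no model call).
    rebuilt = rebuild_from_memory(working)
    path = "session_memory"

    # 4. Fall back to asking the model for a summary.
    if rebuilt is None:
        rebuilt = [{"type": "summary", "text": summarize(working)}]
        path = "legacy"

    # 5. Append a boundary so future loads start from there.
    return Result(log=[*transcript, BOUNDARY, *rebuilt], context=rebuilt, path=path)
```

In this model the pre-boundary entries stay in the log, but the next request and any later resume only see what follows the boundary.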

    7. What Users Experience as “Claude Forgot”

    What users experience as “Claude forgot” is usually not deletion.

    It is usually a combination of:

    • boundary truncation
    • summary substitution
    • preserved-tail relinking trying to keep the active working set alive