Shared from "Claude-Code" on Inkdown

Claude Code — Agentic Loop & Architecture Deep Dive

A principal-engineer-level walkthrough of how Claude Code works end-to-end: the agentic loop, agent workflow, tool execution, state management, and everything in between.


Table of Contents

  1. High-Level Architecture
  2. Startup & Initialization
  3. The Agentic Loop
  4. LLM API Call Pipeline
  5. Tool Execution Pipeline
  6. The Tool System
  7. Agent Spawning & Subagents
  8. State Management
  9. Message System
  10. Permission System
  11. Hooks System
  12. Context & Compaction
  13. MCP Server Integration
  14. Skills System
  15. Coordinator Mode
  16. Loop Termination
  17. Key Architectural Patterns

    1. High-Level Architecture

    Claude Code is a TypeScript/React CLI application built with Ink (React for terminals). It runs on Bun and interfaces with the Anthropic API to create an agentic coding assistant.

    Three-Layer Architecture
    Plain text
    ┌─────────────────────────────────────────────────┐
    │  REPL / TUI Layer (Ink React Components)        │  ← User sees this
    │  main.tsx → interactiveHelpers.tsx → REPL.tsx   │
    ├─────────────────────────────────────────────────┤
    │  Query Engine Layer                             │  ← The brain
    │  QueryEngine.ts → query.ts → queryLoop()        │
    ├─────────────────────────────────────────────────┤
    │  API & Tool Execution Layer                     │  ← The hands
    │  claude.ts (API) + tools/ (43 tool impls)       │
    └─────────────────────────────────────────────────┘
    Core Components
    | Component | File(s) | Role |
    |---|---|---|
    | Entry Point | main.tsx | CLI parsing, initialization, mode detection |
    | Query Engine | QueryEngine.ts | High-level orchestrator, one per conversation |
    | Agentic Loop | query.ts (queryLoop()) | The infinite loop that drives agent behavior |
    | Tool Registry | Tool.ts, tools.ts | Tool type definitions and registry |
    | State Store | state/store.ts, AppStateStore.ts | Zustand-like pub/sub store |
    | API Client | services/api/claude.ts | Anthropic API calls with streaming |
    | Bootstrap State | bootstrap/state.ts | Global singleton state (session, cost, telemetry) |

    2. Startup & Initialization

    Phase 1: Pre-Import Optimizations (main.tsx:1-20)

    Before any heavy imports load, main.tsx makes three calls: one timing checkpoint and two prefetches that run in the background.

    1. profileCheckpoint('main_tsx_entry') — Marks startup timing
    2. startMdmRawRead() — Fires MDM subprocesses (macOS config reads)
    3. startKeychainPrefetch() — Fires macOS keychain reads (OAuth + API key)

    These run in parallel with the ~135ms of module evaluation that follows, eliminating sequential I/O bottlenecks.
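
The fire-early, await-later pattern described above can be sketched as follows. This is a minimal illustration, not the actual main.tsx code: `readMdmConfig`, `readKeychain`, `startPrefetches`, and `awaitPrefetches` are hypothetical names standing in for the work behind startMdmRawRead() and startKeychainPrefetch(), and the resolved values are placeholders.

```typescript
// Records which prefetches have been kicked off, for illustration only.
const started: string[] = [];

// Hypothetical stand-ins for the slow I/O; the real calls spawn subprocesses.
function readMdmConfig(): Promise<string> {
  started.push("mdm");
  return Promise.resolve("mdm-config");
}

function readKeychain(): Promise<string> {
  started.push("keychain");
  return Promise.resolve("api-key");
}

type Prefetches = { mdm: Promise<string>; keychain: Promise<string> };

// Calling the functions starts the I/O immediately. The promises are only
// awaited later, once module evaluation has finished.
function startPrefetches(): Prefetches {
  return { mdm: readMdmConfig(), keychain: readKeychain() };
}

// Both reads are already in flight, so this waits for the slower one only.
async function awaitPrefetches(p: Prefetches): Promise<{ mdm: string; keychain: string }> {
  const [mdm, keychain] = await Promise.all([p.mdm, p.keychain]);
  return { mdm, keychain };
}
```

In this sketch, `startPrefetches()` would run at the top of the entry module and `awaitPrefetches()` would run from the pre-action hook, so the I/O overlaps with module evaluation instead of following it.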

    Phase 2: CLI Parsing (main.tsx:884+)

    The app uses Commander.js to parse CLI arguments. Key modes:

    | Mode | Flag | Behavior |
    |---|---|---|
    | Interactive REPL | (default) | Full TUI with Ink React |
    | Headless/Print | -p / --print | Non-interactive, output to stdout |
    | SDK Mode | --sdk-url | SDK consumer drives the session |
    | SSH Remote | ssh <host> | Remote execution via SSH tunnel |
    | Direct Connect | cc://... | Connect to a running Claude Code instance |
    | Assistant Mode | assistant [id] | Kairos/assistant feature (feature-gated) |
    Phase 3: Pre-Action Hook (main.tsx:907-967)

    Runs once before any command executes:

    Plain text
    1. Await MDM + keychain prefetches
    2. init() — loads settings, config, auth
    3. Set process.title = 'claude'
    4. Initialize logging sinks
    5. Run migrations (CURRENT_MIGRATION_VERSION = 11)
    6. Load remote managed settings (non-blocking)
    7. Upload user settings (if feature-gated)
    Phase 4: Deferred Prefetches (main.tsx:388-431)

    After first render, background work fires:

    • initUser(), getUserContext(), getSystemContext()
    • AWS/GCP credential prefetch (if Bedrock/Vertex enabled)
    • File count via ripgrep
    • Model capabilities refresh
    • Settings/skills change detectors
    • Event loop stall detector (ant-only)

    3. The Agentic Loop

    This is the heart of Claude Code. Everything revolves around queryLoop().

    Call Chain
    Plain text
    User submits prompt
      → QueryEngine.submitMessage()
        → query() [outer generator]
          → queryLoop() [core infinite loop]
            → deps.callModel() [API call]
            → execute tools [tool pipeline]
            → loop again with results
    The Loop: Step by Step (query.ts:241+)

    Each iteration of queryLoop() follows this exact sequence:

    Plain text
    ┌──────────────────────────────────────────────────────────────┐
    │                    queryLoop() Iteration                      │
    ├──────────────────────────────────────────────────────────────┤
    │                                                               │
    │  1. Read current state (messages, toolUseContext, tracking)   │
    │                                                               │
    │  2. Skill discovery prefetch (non-blocking, during streaming) │
    │                                                               │
    │  3. Yield 'stream_request_start' event                        │
    │                                                               │
    │  4. Increment query chain tracking (chainId, depth)           │
    │                                                               │
    │  5. Get messages after compact boundary                       │
    │                                                               │
    │  6. Apply tool result budget (truncate oversized results)     │
    │                                                               │
    │  7. Apply history snip (if enabled)                           │
    │                                                               │
    │  8. Apply microcompact (time-based or cached)                 │
    │                                                               │
    │  9. Apply context collapse (if enabled, at 90% usage)         │
    │                                                               │
    │  10. Apply auto-compact (if context exceeds threshold)        │
    │                                                               │
    │  11. Build full system prompt                                 │
    │                                                               │
    │  12. Check blocking token limit (reject if too large)         │
    │                                                               │
    │  ┌─────────────────────────────────────────────────────────┐  │
    │  │  API CALL LOOP (with fallback retry)                     │  │
    │  │  a. deps.callModel() → streams assistant response        │  │
    │  │  b. Collect assistantMessages, toolUseBlocks             │  │
    │  │  c. Handle streaming fallback (model switch on error)    │  │
    │  │  d. Withhold recoverable errors (PTL, max_tokens)        │  │
    │  │  e. Yield messages to consumer                           │  │
    │  └─────────────────────────────────────────────────────────┘  │
    │                                                               │
    │  13. Execute post-sampling hooks                              │
    │                                                               │
    │  14. Check for abort (user interrupt)                         │
    │                                                               │
    │  15. Yield previous turn's tool-use summary (if any)          │
    │                                                               │
    │  16. ┌─ NO tool_use blocks? ──────────────────────────────┐  │
    │      │  a. Handle withheld 413/max_tokens errors           │  │
    │      │  b. Run stop hooks                                  │  │
    │      │  c. Check token budget                              │  │
    │      │  d. Return { reason: 'completed' }                  │  │
    │      └─────────────────────────────────────────────────────┘  │
    │                                                               │
    │  17. ┌─ HAS tool_use blocks? ─────────────────────────────┐  │
    │      │  a. Execute tools (streaming or batched)            │  │
    │      │  b. Collect tool results                            │  │
    │      │  c. Generate tool-use summary (async)               │  │
    │      │  d. Check abort mid-execution                       │  │
    │      │  e. Collect attachments (queued commands, memory)   │  │
    │      │  f. Refresh tools (MCP servers)                     │  │
    │      │  g. Check maxTurns                                  │  │
    │      │  h. Continue loop with new messages                 │  │
    │      └─────────────────────────────────────────────────────┘  │
    │                                                               │
    └──────────────────────────────────────────────────────────────┘
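
Stripped of compaction, hooks, and error recovery, the core of the iteration above reduces to "call the model, run any requested tools, repeat". The sketch below is a deliberately simplified, synchronous illustration of that skeleton. All types and names (`LoopMessage`, `Deps`, `runLoop`) are hypothetical, and the real queryLoop() is an async generator with far more steps.

```typescript
type ToolUse = { id: string; name: string; input: string };
type AssistantTurn = { text: string; toolUses: ToolUse[] };
type LoopMessage =
  | { role: "user"; content: string }
  | { role: "assistant"; content: string }
  | { role: "tool_result"; toolUseId: string; content: string };

// Injected dependencies, so the loop can be driven by a scripted model.
type Deps = {
  callModel: (messages: LoopMessage[]) => AssistantTurn;
  runTool: (use: ToolUse) => string;
};

function runLoop(
  prompt: string,
  deps: Deps,
  maxTurns: number,
): { reason: "completed" | "max_turns"; messages: LoopMessage[] } {
  let messages: LoopMessage[] = [{ role: "user", content: prompt }];
  for (let turn = 1; ; turn++) {
    const reply = deps.callModel(messages);
    messages = [...messages, { role: "assistant", content: reply.text }];

    // No tool_use blocks: the model is done, so the loop terminates.
    if (reply.toolUses.length === 0) return { reason: "completed", messages };

    // Otherwise run each tool and feed the results into the next iteration.
    const results = reply.toolUses.map(
      (use): LoopMessage => ({
        role: "tool_result",
        toolUseId: use.id,
        content: deps.runTool(use),
      }),
    );
    messages = [...messages, ...results];

    if (turn >= maxTurns) return { reason: "max_turns", messages };
  }
}
```
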
    State Carried Between Iterations

    The State type carries the loop's evolving state across iterations:

    TypeScript
    type State = {
      messages: Message[]                    // Full conversation history
      toolUseContext: ToolUseContext         // Execution context (tools, abort, state)
      autoCompactTracking: ...               // Tracks compaction state
      maxOutputTokensRecoveryCount: number   // Recovery attempt counter
      hasAttemptedReactiveCompact: boolean   // Prevents infinite compact loops
      pendingToolUseSummary: ...             // Async summary from previous turn
      turnCount: number                      // Iteration counter
      transition: ...                        // Why previous iteration continued
    }
    Key Insight: Immutable State Pattern

    A State object is never mutated in place. Each loop continuation builds a new State object and assigns it to state:

    TypeScript
    state = {
      messages: [...messagesForQuery, ...assistantMessages, ...toolResults],
      toolUseContext: toolUseContextWithQueryTracking,
      turnCount: nextTurnCount,
      // ...other fields
    }

    This makes the loop deterministic and testable, and it enables clean recovery paths.
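
A minimal sketch of this pattern, assuming a cut-down state type. `LoopState` and `continueLoop` are hypothetical names, and only three of the real State fields are modeled.

```typescript
// Simplified stand-in for the loop's State type.
type LoopState = {
  readonly messages: readonly string[];
  readonly turnCount: number;
  readonly hasAttemptedReactiveCompact: boolean;
};

// Build the next state from the previous one without touching the original.
function continueLoop(prev: LoopState, newMessages: readonly string[]): LoopState {
  return {
    ...prev,
    messages: [...prev.messages, ...newMessages],
    turnCount: prev.turnCount + 1,
  };
}
```

Because the previous object is left intact, a recovery path can keep a reference to an earlier state and resume from it without undoing any changes.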


    4. LLM API Call Pipeline

    The Call Chain
    Plain text
    queryLoop()
      → deps.callModel()
        → queryModelWithStreaming()
          → queryModel()
            → anthropic.beta.messages.create() [Anthropic SDK]
    API Request Construction (services/api/claude.ts:752+)

    queryModel() builds a BetaMessageStreamParams object:

    | Parameter | Source | Purpose |
    |---|---|---|
    | model | getRuntimeMainLoopModel() | Resolved from permission mode + token count |
    | system | System prompt construction | Full system prompt with cache breakpoints |
    | messages | normalizeMessagesForAPI() | Normalized conversation history |
    | tools | toolToAPISchema() | Tool schemas (Zod → JSON Schema) |
    | thinking | Thinking config | Adaptive/disabled thinking |
    | max_tokens | Model config | Output token limit |
    | temperature, top_p, top_k | Model config | Sampling parameters |
    | betas | Feature flags | Prompt caching, context management, structured outputs |
    | metadata | Session state | User ID, session ID for tracking |
    Streaming Protocol

    The API uses Anthropic's MessageStream for server-sent events:

    Plain text
    message_start          → Initial message with usage
    content_block_start    → Beginning of text/tool_use/thinking block
    content_block_delta    → Streaming deltas (text chunks, tool input)
    content_block_stop     → Block completed
    message_delta          → Stop reason + final usage
    message_stop           → Request complete
    Fallback Mechanism

    If a FallbackTriggeredError occurs (529, rate limit, etc.):

    1. Switch to fallbackModel (e.g., Opus → Sonnet)
    2. Clear all accumulated assistant messages and tool results
    3. Strip thinking signature blocks (model-bound)
    4. Retry the entire API call with the fallback model
    5. Yield a system warning about the model switch
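
The retry steps above can be sketched roughly as follows. This is an assumption-laden illustration: the error class shape, `callWithFallback`, and the synchronous `callModel` callback are hypothetical, and step 3 (stripping thinking signatures) is not modeled.

```typescript
// Illustrative error class; the real FallbackTriggeredError may carry other fields.
class FallbackTriggeredError extends Error {
  readonly fallbackModel: string;
  constructor(fallbackModel: string) {
    super(`fallback to ${fallbackModel}`);
    this.name = "FallbackTriggeredError";
    this.fallbackModel = fallbackModel;
  }
}

function isFallback(err: unknown): err is FallbackTriggeredError {
  return err instanceof Error && err.name === "FallbackTriggeredError";
}

type Attempt = { model: string; assistantMessages: string[]; warnings: string[] };

function callWithFallback(
  model: string,
  callModel: (model: string, emit: (message: string) => void) => void,
): Attempt {
  let current = model;
  const warnings: string[] = [];
  for (;;) {
    // Each attempt starts with an empty buffer, so partial output from a
    // failed attempt is discarded rather than mixed into the retry.
    const assistantMessages: string[] = [];
    try {
      callModel(current, (message) => assistantMessages.push(message));
      return { model: current, assistantMessages, warnings };
    } catch (err) {
      // Rethrow anything that is not a fallback signal, and avoid looping
      // forever if the fallback model is the one that just failed.
      if (!isFallback(err) || err.fallbackModel === current) throw err;
      warnings.push(`Switched from ${current} to ${err.fallbackModel}`);
      current = err.fallbackModel;
    }
  }
}
```
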
    Recovery Paths
    | Error | Recovery Strategy |
    |---|---|
    | Prompt too long (413) | Context collapse drain → Reactive compact → Surface error |
    | Max output tokens | Escalate to 64k (once) → Inject "Resume directly" message → Retry 3x → Surface |
    | Media size error | Strip media → Retry once → Surface |
    | Structured output retry | Retry with relaxed constraints → Surface after limit |

    5. Tool Execution Pipeline

    This is where the agent acts on the world. Five phases:

    Phase 1: Tool Discovery During Streaming

    Two execution modes exist, controlled by config.gates.streamingToolExecution:

    StreamingToolExecutor (enabled):

    • Tools are added to a queue as they stream in from the API
    • Concurrency-safe tools (read-only) execute in parallel
    • Non-concurrent tools execute serially
    • Results are buffered and yielded in order
    • Bash errors cascade: if one Bash tool errors, sibling subprocesses are killed via siblingAbortController

    runTools() (fallback):

    • After streaming completes, all tool_use blocks are processed
    • partitionToolCalls() groups consecutive concurrency-safe tools into batches
    • Concurrency-safe batches run via runToolsConcurrently() (up to 10 parallel, configurable via CLAUDE_CODE_MAX_TOOL_USE_CONCURRENCY)
    • Non-concurrent tools run via runToolsSerially()
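
The batching rule used by the fallback path can be sketched as below. Only the function name partitionToolCalls() comes from the text above. The `ToolCall` and `Batch` shapes and the implementation are assumptions made for illustration.

```typescript
type ToolCall = { name: string; concurrencySafe: boolean };
type Batch = { concurrent: boolean; calls: ToolCall[] };

// Group consecutive concurrency-safe calls into one batch. Every other call
// gets a batch of its own, so it runs serially and in its original order.
function partitionToolCalls(calls: ToolCall[]): Batch[] {
  const batches: Batch[] = [];
  for (const call of calls) {
    const last = batches[batches.length - 1];
    if (call.concurrencySafe && last !== undefined && last.concurrent) {
      last.calls.push(call);
    } else {
      batches.push({ concurrent: call.concurrencySafe, calls: [call] });
    }
  }
  return batches;
}
```

For the sequence Read, Grep, Bash, Read this yields three batches: a concurrent batch of two, a serial Bash batch, and a final concurrent batch of one. Ordering across batches is preserved, which matters when a later read depends on an earlier write.
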
    Phase 2: Permission Resolution (per tool)
    Plain text
    runToolUse()
      → streamedCheckPermissionsAndCallTool()
        → checkPermissionsAndCallTool()

    The permission pipeline is strictly ordered:

    Plain text
    1. Input validation     → Zod schema parse (tool.inputSchema.safeParse())
    2. Value validation     → tool.validateInput() (e.g., blocked sleep patterns)
    3. PreToolUse hooks     → runPreToolUseHooks() (can modify input, block, decide)
    4. Permission decision  → resolveHookPermissionDecision() + canUseTool()
       ├── Permission mode check (default, plan, auto, bypass)
       ├── Always-allow rules (session, CLI, settings)
       ├── Always-deny rules
       ├── Auto-mode classifier (security check for Bash)
       └── Permission hooks
    5. If denied            → Yield error tool_result, run PermissionDenied hooks
    6. If allowed           → Proceed to execution
    Phase 3: Tool Execution
    TypeScript
    tool.call(input, toolUseContext, canUseTool, assistantMessage, onProgress)

    The tool's call() method executes with full context. Tools return:

    TypeScript
    type ToolResult<T> = {
      data: T                          // The result data
      newMessages?: Message[]          // Optional additional messages to inject
      contextModifier?: (ctx) => ctx   // Function to update ToolUseContext
      mcpMeta?: { ... }                // MCP protocol metadata
    }
    Phase 4: Result Processing
    1. tool.mapToolResultToToolResultBlockParam() — Maps result to API format
    2. Large results are persisted to disk with a preview (configurable maxResultSizeChars)
    3. PostToolUse hooks run (runPostToolUseHooks())
    4. Result wrapped in createUserMessage({ content: [{ type: 'tool_result', ... }] })
    5. Yielded back to the query loop
    Phase 5: Loop Continuation

    Tool results are appended to messagesForQuery and the loop continues:

    TypeScript
    state = {
      messages: [...messagesForQuery, ...assistantMessages, ...toolResults],
      // ...
    }

    This triggers another API call with the tool results as context.


    6. The Tool System

    Tool Interface (Tool.ts)

    Every tool implements the Tool<Input, Output, Progress> interface:

    TypeScript
    type Tool<Input, Output, Progress> = {
      name: string                           // Unique identifier
      inputSchema: ZodType<Input>            // Input validation schema
      description(...): Promise<string>      // Dynamic description for the model
      call(args, context, canUseTool, ...): Promise<ToolResult<Output>>
      checkPermissions(input, context): Promise<PermissionResult>
      isConcurrencySafe(input): boolean      // Can run in parallel?
      isReadOnly(input): boolean             // Does it modify state?
      isDestructive(input): boolean          // Irreversible operation?
      isEnabled(): boolean                   // Feature-gated?
      validateInput?(input, context): ValidationResult
      // ... plus ~30 more optional methods for UI rendering, progress, etc.
    }
    Tool Builder Pattern

    Tools are created via buildTool() which fills in safe defaults:

    TypeScript
    const MyTool = buildTool({
      name: 'MyTool',
      inputSchema: z.object({ ... }),
      description: async () => 'Does something useful',
      call: async (args, ctx) => ({ data: 'result' }),
      // ... only override what's needed
    })

    Defaults (fail-closed where it matters):

    • isEnabled → true
    • isConcurrencySafe → false (assume not safe)
    • isReadOnly → false (assume writes)
    • isDestructive → false
    • checkPermissions → { behavior: 'allow' } (defer to general system)
    • toAutoClassifierInput → '' (skip classifier — security tools must override)
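
How a builder can fill in these defaults is sketched below. This is a reduced illustration rather than the real buildTool(): the `ToolDef` and `ToolSpec` types are hypothetical, inputs are plain strings, and only four of the defaulted methods are modeled.

```typescript
type ToolDef = {
  name: string;
  call: (input: string) => string;
  isEnabled: () => boolean;
  isConcurrencySafe: (input: string) => boolean;
  isReadOnly: (input: string) => boolean;
  isDestructive: (input: string) => boolean;
};

// Only name and call are required; everything else falls back to a default.
type ToolSpec = Pick<ToolDef, "name" | "call"> & Partial<ToolDef>;

const TOOL_DEFAULTS = {
  isEnabled: () => true,
  isConcurrencySafe: () => false, // assume not safe to run in parallel
  isReadOnly: () => false, // assume the tool writes
  isDestructive: () => false,
};

function buildTool(spec: ToolSpec): ToolDef {
  // Spread order matters: explicit overrides in spec win over the defaults.
  return { ...TOOL_DEFAULTS, ...spec };
}
```

A tool that only reads would override `isReadOnly` (and usually `isConcurrencySafe`) and inherit the rest, so forgetting an override leaves the tool on the conservative serial path.
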
    Tool Registry (tools.ts)

    All tools are assembled into a Tools array (readonly Tool[]) and passed through the ToolUseContext. The tool pool draws from four sources:

    1. Built-in tools — Read, Edit, Write, Bash, Glob, Grep, TodoWrite, etc.
    2. MCP tools — Dynamically loaded from connected MCP servers
    3. Skill tools — Generated from skill definitions
    4. Agent tools — Custom agent definitions from .claude/agents/
    Key Tools
    | Tool | File | What It Does |
    |---|---|---|
    | Bash | tools/BashTool/ | Execute shell commands with sandbox, timeout, backgrounding |
    | Read | tools/FileReadTool/ | Read files (text, images, PDFs, notebooks) with dedup |
    | Edit | tools/FileEditTool/ | String replacement in files with staleness checks |
    | Write | tools/FileWriteTool/ | Write/overwrite files with read-first requirement |
    | Agent | tools/AgentTool/ | Spawn subagents (sync, async, fork, remote, teammate) |
    | Glob | tools/GlobTool/ | File pattern matching |
    | Grep | tools/GrepTool/ | Content search with ripgrep |
    | TodoWrite | tools/TodoWriteTool/ | Task tracking panel |
    | WebSearch | tools/WebSearchTool/ | Web search via Exa |
    | WebFetch | tools/WebFetchTool/ | Fetch URL content |
    | Skill | tools/SkillTool/ | Execute skill commands |
    | MCP | tools/MCPTool/ | Execute MCP server tools |
    | ToolSearch | tools/ToolSearchTool/ | Deferred tool loading (lazy schema loading) |
    Tool Result Persistence

    When a tool result exceeds maxResultSizeChars:

    1. Result is saved to a file in the tool-results directory
    2. The model receives a preview + file path
    3. The model can Read the file if it needs full content
    4. 64MB cap on persisted output
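
The preview-plus-path behavior can be sketched as follows. The function name `capToolResult`, the injected `persist` callback, and the wording of the truncation notice are all assumptions, and the 64MB cap is not modeled.

```typescript
type CappedResult = { content: string; persistedPath?: string };

// `persist` is a hypothetical writer that saves the full text somewhere and
// returns the path it wrote to. Injecting it keeps this sketch free of I/O.
function capToolResult(
  result: string,
  maxResultSizeChars: number,
  persist: (fullText: string) => string,
): CappedResult {
  // Small results pass through untouched and nothing is written.
  if (result.length <= maxResultSizeChars) return { content: result };

  const persistedPath = persist(result);
  const preview = result.slice(0, maxResultSizeChars);
  return {
    content: `${preview}\n[Output truncated. Full result saved to ${persistedPath}]`,
    persistedPath,
  };
}
```
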
    Streaming Tool Execution

    When streamingToolExecution is enabled, tools start executing while the API response is still streaming:

    Plain text
    API streams: "I'll read file A and file B..."
      → ToolUse block for Read(fileA) starts streaming
        → Read(fileA) added to execution queue
      → ToolUse block for Read(fileB) starts streaming
        → Read(fileB) added to execution queue
      → Both Read tools execute in parallel (concurrency-safe)
    API continues: "...and also run this command"
      → ToolUse block for Bash starts streaming
        → Bash added to queue (waits for serial execution)

    Because tool execution overlaps with response streaming, independent tool calls complete with lower latency.


    7. Agent Spawning & Subagents

    The Agent tool is the primary mechanism for spawning nested agents. It supports multiple spawn modes:

    Task Types (Task.ts)
    TypeScript
    type TaskType =
      | 'local_bash'          // Shell command execution
      | 'local_agent'         // Subagent in same process
      | 'remote_agent'        // Agent in CCR (cloud) environment
      | 'in_process_teammate' // Teammate in same process
      | 'local_workflow'      // Workflow execution
      | 'monitor_mcp'         // MCP server monitor
      | 'dream'               // Dream/experimental mode
    Agent Spawn Modes
    Sync vs Async
    TypeScript
    shouldRunAsync = run_in_background || selectedAgent.background ||
                     isCoordinator || forceAsync || assistantForceAsync ||
                     proactiveActive
    | Mode | Behavior | Use Case |
    |---|---|---|
    | Sync | Blocks parent's turn, shares abort controller | Quick delegations |
    | Async | Independent lifecycle, own abort controller | Long-running tasks |
    Spawn Paths
    1. Teammate (Agent Swarms):

      • When team_name + name provided
      • spawnTeammate() → launches in tmux split-pane with its own process
      • Communicates via message passing
    2. Remote Agent:

      • When isolation: 'remote'
      • teleportToRemote() → launches in CCR (cloud) environment
      • Full session teleport with git state
    3. Fork Subagent (experiment, feature-gated):

      • When subagent_type omitted and fork gate enabled
      • Inherits parent's exact conversation context and system prompt
      • Byte-identical API prefixes for maximum prompt cache hits
      • Recursive fork guard prevents fork-within-fork
      • Worktree isolation: can operate in isolated git worktrees
    4. Standard Subagent:

      • Selected agent definition → runAgent() → calls query() recursively
      • Creates a nested agentic loop with isolated context
    runAgent() Execution Environment

    Each subagent gets an isolated execution environment:

    • Agent-specific system prompt (from agent definition's getSystemPrompt())
    • Agent-specific tool pool (filtered by permission mode)
    • Agent-specific MCP servers (additive to parent's)
    • Agent-specific file state cache
    • Agent-specific abort controller (async) or shared (sync)
    • Agent-specific transcript recording (sidechain)
    • Registers frontmatter hooks, skills, session hooks
    Background Agent Lifecycle

    Async agents are managed by LocalAgentTask:

    Plain text
    1. registerAsyncAgent() — registers with AppState
    2. createProgressTracker() — tracks execution progress
    3. updateAsyncAgentProgress() — updates UI
    4. Agent runs query() in background
    5. On completion: enqueues <task-notification> XML to parent's message queue
    6. Parent drains notifications via getCommandsByMaxPriority() filtered by agentId
    Fork Subagent Mechanics

    When fork is enabled, each child request is built to share the parent's prompt-cache prefix:

    1. Child inherits parent's exact system prompt bytes
    2. Child inherits parent's exact tool pool
    3. buildForkedMessages() creates:
      Plain text
      [...history, assistant(all_tool_uses), user(placeholder_results..., per_child_directive)]
    4. All fork children produce byte-identical prefixes — only the final directive text block differs
    5. This maximizes Anthropic's prompt cache hit rate across parallel forks
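
The prefix-sharing idea can be illustrated with a toy version of the message layout. Only the name buildForkedMessages() and the overall layout come from the text above. The `ForkMessage` type, the string block encoding, and `sharedPrefixLength` are hypothetical simplifications.

```typescript
type ForkMessage = { role: "user" | "assistant"; content: string[] };

// Each content entry is a plain string block here; the real function works
// on full message objects.
function buildForkedMessages(
  history: ForkMessage[],
  toolUseIds: string[],
  directive: string,
): ForkMessage[] {
  return [
    ...history,
    { role: "assistant", content: toolUseIds.map((id) => `tool_use:${id}`) },
    {
      role: "user",
      // Placeholder results are identical for every child. Only the last
      // block, the per-child directive, differs.
      content: [...toolUseIds.map((id) => `tool_result:${id}:placeholder`), directive],
    },
  ];
}

// Number of leading characters two serialized requests have in common.
function sharedPrefixLength(a: string, b: string): number {
  let i = 0;
  while (i < a.length && i < b.length && a[i] === b[i]) i++;
  return i;
}
```

Serializing two children built from the same history and tool-use IDs shows that they diverge only inside the final directive block, which is the property a prefix-based cache relies on.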

    8. State Management

    Three-Layer State Architecture
    Plain text
    ┌─────────────────────────────────────────────────────┐
    │  Bootstrap State (bootstrap/state.ts)               │  ← Global singleton
    │  Session ID, cost, telemetry, feature latches       │
    ├─────────────────────────────────────────────────────┤
    │  AppState Store (state/AppStateStore.ts)            │  ← Immutable state object
    │  Permissions, tasks, MCP, todos, notifications      │
    ├─────────────────────────────────────────────────────┤
    │  ToolUseContext (Tool.ts)                           │  ← Per-query execution context
    │  Tools, abort controller, state accessors, agentId  │
    └─────────────────────────────────────────────────────┘
    Bootstrap State (bootstrap/state.ts)

    A global STATE singleton containing:

    • Session identity: sessionId, parentSessionId, projectRoot, originalCwd
    • Cost tracking: totalCostUSD, per-model modelUsage
    • Duration tracking: API, tool, hook, classifier durations
    • Telemetry: OpenTelemetry meter, counters, logger, tracer
    • Feature latches: promptCache1hEligible, afkModeHeaderLatched, etc.
    • Hook registry: registeredHooks by event type
    • Agent state: agentColorMap, invokedSkills
    AppState Store (state/AppStateStore.ts)

    A massive immutable state object:

    TypeScript
    type AppState = {
      toolPermissionContext: ToolPermissionContext
      tasks: Map<string, TaskState>
      mcp: { clients, tools, commands, resources }
      todos: Map<AgentId, TodoItem[]>
      notifications: Notification[]
      agentNameRegistry: Map<string, AgentId>
      settings: Settings
      mainLoopModel: string
      verbose: boolean
      // ... many more fields
    }
    Store Implementation (state/store.ts)

    A minimal Zustand-like pub/sub store:

    TypeScript
    createStore(initialState, onChange?) → {
      getState(),           // Current state snapshot
      setState(updater),    // Functional update: (prev) => next
      subscribe(listener)   // Returns unsubscribe function
    }

    Key design: setState(updater) does identity checking to skip no-op updates, preventing unnecessary re-renders.
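
A store with this shape fits in a few lines. The sketch below follows the interface shown above, but the implementation details are assumptions. In particular, the `onChange(next, prev)` signature is hypothetical, since the source only shows that an optional `onChange` exists.

```typescript
type Listener = () => void;

type Store<T> = {
  getState: () => T;
  setState: (updater: (prev: T) => T) => void;
  subscribe: (listener: Listener) => () => void;
};

function createStore<T>(initialState: T, onChange?: (next: T, prev: T) => void): Store<T> {
  let state = initialState;
  const listeners = new Set<Listener>();
  return {
    getState: () => state,
    setState: (updater) => {
      const prev = state;
      const next = updater(prev);
      // Identity check: returning the same reference is treated as a no-op,
      // so subscribers are not notified and nothing re-renders.
      if (Object.is(next, prev)) return;
      state = next;
      if (onChange) onChange(next, prev);
      listeners.forEach((listener) => listener());
    },
    subscribe: (listener) => {
      listeners.add(listener);
      return () => {
        listeners.delete(listener);
      };
    },
  };
}
```

The identity check only works if updaters return `prev` unchanged when nothing needs to change. An updater that always spreads into a new object defeats it.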

    ToolUseContext

    The per-query execution context passed to every tool:

    TypeScript
    type ToolUseContext = {
      options: {
        commands: Command[]
        tools: Tools
        mainLoopModel: string
        mcpClients: MCPServerConnection[]
        agentDefinitions: AgentDefinitionsResult
        // ...
      }
      abortController: AbortController
      getAppState(): AppState
      setAppState(f: (prev: AppState) => AppState): void
      setAppStateForTasks?: (...)  // Always-shared for session-scoped infrastructure
      agentId?: AgentId            // Subagent identity (undefined for main thread)
      agentType?: string           // Subagent type name
      messages: Message[]          // Current message array
      readFileState: FileStateCache
      // ... many more fields for UI, hooks, telemetry
    }
    State Flow in the Loop

    Each loop iteration creates a new State object (immutable pattern):

    TypeScript
    // At loop continuation:
    state = {
      messages: [...messagesForQuery, ...assistantMessages, ...toolResults],
      toolUseContext: toolUseContextWithQueryTracking,
      turnCount: nextTurnCount,
      autoCompactTracking: updatedTracking,
      // ...
    }

    This ensures clean state snapshots at each transition point and enables deterministic recovery.


    9. Message System

    Message Types (types/message.ts)
    | Type | Purpose |
    |---|---|
    | user | User input, tool results, meta messages |
    | assistant | LLM responses with text/thinking/tool_use blocks |
    | system | Internal messages (compact_boundary, api_error, local_command) |
    | progress | Tool execution progress updates |
    | attachment | System attachments (file changes, task notifications, max_turns) |
    | tombstone | Control signal to remove messages from UI |
    | stream_event | Raw API streaming events |
    | tool_use_summary | Haiku-generated summaries of tool batches |
    Conversation Flow
    Plain text
    1. User message enters via QueryEngine.submitMessage(prompt)
    2. processUserInput() handles slash commands, attachments, mode changes
    3. Message pushed to mutableMessages and persisted to transcript
    4. query() generator starts the agentic loop
    5. Each API call produces assistant messages with optional tool_use blocks
    6. Tool results become user messages with tool_result blocks
    7. Loop continues until a response contains no tool_use blocks and no stop hook prevents termination
    8. Final result yielded to QueryEngine which formats SDK response
    Message Normalization

    normalizeMessagesForAPI() strips UI-only messages before sending to the API. The API only sees:

    • User messages with text/tool_result content
    • Assistant messages with text/tool_use/thinking content

    UI-only messages (system messages, progress messages, tombstones) are excluded.
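
The filtering step can be sketched with a discriminated union. This is a simplified illustration: `AnyMessage`, `ApiMessage`, and `normalizeForApi` are hypothetical, content is reduced to a plain string, and only five message types are modeled. The real normalizeMessagesForAPI() does more than filter.

```typescript
type AnyMessage =
  | { type: "user"; content: string }
  | { type: "assistant"; content: string }
  | { type: "system"; subtype: string }
  | { type: "progress"; toolUseId: string }
  | { type: "tombstone"; targetId: string };

type ApiMessage = { role: "user" | "assistant"; content: string };

function normalizeForApi(messages: AnyMessage[]): ApiMessage[] {
  const out: ApiMessage[] = [];
  for (const m of messages) {
    // Only user and assistant messages are forwarded; the rest are UI-only.
    if (m.type === "user" || m.type === "assistant") {
      out.push({ role: m.type, content: m.content });
    }
  }
  return out;
}
```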

    Attachment System

    Mid-turn attachments inject additional context into the conversation:

    • Queued commands — Task notifications from completed agents
    • Memory prefetch — Relevant CLAUDE.md files
    • Skill discovery — Discovered skills for touched files
    • File change notifications — When watched files change

    10. Permission System

    Permission Modes
    | Mode | Behavior |
    |---|---|
    | default | Normal prompting for each tool use |
    | plan | Plan mode: pauses before execution |
    | acceptEdits | Auto-accepts safe edits |
    | bypassPermissions | All tools allowed without prompting |
    | dontAsk | Denies anything that would prompt |
    | auto (ant-only) | AI classifier auto-approves/denies |
    | bubble | Internal subagent mode |
    Permission Pipeline (hasPermissionsToUseTool)

    The decision pipeline is strictly ordered:

    Plain text
    Step 1: Rule-based checks
      1a. Entire tool denied by rule → deny
      1b. Entire tool has ask rule → ask (unless sandboxed Bash can auto-allow)
      1c. Tool's own checkPermissions() → delegates to tool implementation
      1d. Tool implementation denied → deny
      1e. Tool requires user interaction → ask (bypass-immune)
      1f. Content-specific ask rules → ask (bypass-immune)
      1g. Safety checks (.git/, .claude/, .vscode/, shell configs) → ask (bypass-immune)
    
    Step 2: Mode-based checks
      2a. Bypass mode → allow
      2b. Entire tool allowed by rule → allow
    
    Step 3: Default
      passthrough → converted to ask
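
The point of the ordering is that the bypass-immune checks run before bypass mode is consulted. A minimal sketch, assuming every check has already been reduced to a boolean: `PermissionFacts` and `decidePermission` are hypothetical names, the sandboxed-Bash exception in 1b is not modeled, and the handling of a tool-level allow or ask in the final step is an assumption.

```typescript
type Decision = "allow" | "deny" | "ask";

type PermissionFacts = {
  toolDeniedByRule: boolean; // 1a
  toolHasAskRule: boolean; // 1b
  toolResult: Decision | "passthrough"; // 1c: the tool's own checkPermissions()
  bypassImmuneAsk: boolean; // 1e-1g: interaction, content ask rules, safety checks
  bypassMode: boolean; // 2a
  toolAllowedByRule: boolean; // 2b
};

function decidePermission(f: PermissionFacts): Decision {
  if (f.toolDeniedByRule) return "deny"; // 1a
  if (f.toolHasAskRule) return "ask"; // 1b
  if (f.toolResult === "deny") return "deny"; // 1d
  if (f.bypassImmuneAsk) return "ask"; // 1e-1g, checked before bypass mode
  if (f.bypassMode) return "allow"; // 2a
  if (f.toolAllowedByRule) return "allow"; // 2b
  // Step 3: passthrough becomes ask. Returning the tool's own allow/ask
  // here is an assumption; the listed steps do not spell that case out.
  return f.toolResult === "passthrough" ? "ask" : f.toolResult;
}
```
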
    Permission Rules

    Format: ToolName(content) or just ToolName

    Plain text
    Bash(git *)        → matches git commands
    Bash               → matches all Bash
    mcp__server1       → matches all tools from MCP server
    mcp__server1__*    → wildcard for server tools
    Agent(Explore)     → matches a specific agent type

    Sources (priority order): policySettings, userSettings, projectSettings, localSettings, cliArg, command, session
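
Parsing and matching the `ToolName(content)` format might look like the sketch below. The matching semantics are an assumption: `*` inside the content is treated as "any run of characters", and the MCP server forms (mcp__server1, mcp__server1__*) are not modeled. `Rule`, `parseRule`, and `matchesRule` are hypothetical names.

```typescript
type Rule = { toolName: string; content?: string };

// Parse "ToolName(content)" or a bare "ToolName".
function parseRule(rule: string): Rule {
  const match = /^([^()]+)\((.*)\)$/.exec(rule);
  if (match === null) return { toolName: rule };
  const [, toolName = rule, content = ""] = match;
  return { toolName, content };
}

function matchesRule(rule: Rule, toolName: string, input: string): boolean {
  if (rule.toolName !== toolName) return false;
  // A bare tool name matches every input for that tool.
  if (rule.content === undefined) return true;
  // Escape regex metacharacters in each literal part, then join with ".*".
  const pattern = rule.content
    .split("*")
    .map((part) => part.replace(/[.*+?^${}()|[\]\\]/g, "\\$&"))
    .join(".*");
  return new RegExp(`^${pattern}$`).test(input);
}
```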

    Auto Mode Classifier (ant-only)

    When mode is auto and result is ask:

    1. Fast path 1: Check if acceptEdits mode would allow → auto-approve
    2. Fast path 2: Check safe-tool allowlist → auto-approve
    3. Classifier: Calls classifyYoloAction() with transcript + formatted action
      • Two-stage classifier with detailed telemetry
      • Denial tracking: consecutive and total denial limits
      • On denial limit exceeded: falls back to interactive prompting
      • Fail-closed vs fail-open controlled by feature gate
    Headless Agent Handling

    When shouldAvoidPermissionPrompts is set (background agents and subagents):

    1. Runs PermissionRequest hooks first
    2. If no hook decides → auto-deny
    3. Auto mode classifier still runs (with local denial tracking)

    11. Hooks System

    What Are Hooks?

    Hooks are user-defined shell commands (or JS callbacks) executed at specific lifecycle points. They are defined in:

    • .claude/settings.json
    • Skill/plugin frontmatter
    • SDK registration
    Hook Events (types/hooks.ts)
    | Event | When It Fires | What It Can Do |
    |---|---|---|
    | PreToolUse | Before any tool executes | Modify input, change permission, block |
    | PostToolUse | After successful tool execution | Inject context, modify MCP output |
    | PostToolUseFailure | After tool failure | React to failures |
    | SessionStart | At session start | Inject context, register watchPaths |
    | Setup | During setup phase | Initialize state |
    | Stop | After model finishes (no tool_use) | Prevent continuation, inject errors |
    | StopFailure | When model errors | Silent notification |
    | SubagentStart/Stop | Around subagent lifecycle | Track subagent activity |
    | UserPromptSubmit | When user submits a prompt | Inject additional context |
    | PermissionDenied | When permission is denied | Retry with different input |
    | PermissionRequest | For headless agents | Allow/deny without UI |
    | CwdChanged | When working directory changes | Register new watchPaths |
    | FileChanged | When watched files change | React to file changes |
    | WorktreeCreate | When git worktree created | Setup worktree state |
    | Notification | For system notifications | Handle notifications |
    | ConfigChange | When configuration changes | Reload config |
    | TaskCreated/Completed | Around background task lifecycle | Track tasks |
    | TeammateIdle | When a teammate is idle | React to idle state |

    Execution Model

    Hooks execute as child processes:

    • Shell: bash (Git Bash on Windows, /bin/sh elsewhere). PowerShell hooks use pwsh
    • Input: JSON serialized via stdin with trailing newline
    • Output: Parsed from stdout as JSON
    • Async hooks: Backgrounded via executeInBackground(), results delivered later via task-notification
    • Sync hooks: Wait for process exit, validate against hookJSONOutputSchema
    • Timeout: Default 10 minutes (TOOL_HOOK_EXECUTION_TIMEOUT_MS)
    • Trust gate: ALL hooks require workspace trust accepted
    Hook Output Protocol

    Hooks return JSON:

    JSON
    {
      "continue": false,
      "stopReason": "string",
      "decision": "approve|block",
      "systemMessage": "string",
      "hookSpecificOutput": {
        "hookEventName": "PreToolUse",
        "permissionDecision": "allow|deny|ask",
        "updatedInput": {},
        "additionalContext": "..."
      }
    }
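    As a rough illustration of the protocol, the sketch below serializes hook input the way the Execution Model describes (JSON plus a trailing newline) and validates the JSON a hook prints on stdout. It is not the real hookJSONOutputSchema: HookOutput mirrors only the fields listed above, serializeHookInput and parseHookOutput are hypothetical helpers, and treating empty stdout as "no opinion" is an assumption.

```typescript
interface HookOutput {
  continue?: boolean;
  stopReason?: string;
  decision?: "approve" | "block";
  systemMessage?: string;
  hookSpecificOutput?: {
    hookEventName: string;
    permissionDecision?: "allow" | "deny" | "ask";
    updatedInput?: Record<string, unknown>;
    additionalContext?: string;
  };
}

// Input is written to the hook's stdin as one JSON document with a trailing newline.
function serializeHookInput(input: unknown): string {
  return JSON.stringify(input) + "\n";
}

// Returns null when stdout is not valid hook JSON; the caller decides how to report that.
function parseHookOutput(stdout: string): HookOutput | null {
  const text = stdout.trim();
  if (text === "") return {}; // assumption: a silent hook expresses no opinion

  let raw: unknown;
  try {
    raw = JSON.parse(text);
  } catch {
    return null;
  }
  if (typeof raw !== "object" || raw === null || Array.isArray(raw)) return null;

  const fields = raw as Record<string, unknown>;
  const cont = fields["continue"];
  if (cont !== undefined && typeof cont !== "boolean") return null;
  const decision = fields["decision"];
  if (decision !== undefined && decision !== "approve" && decision !== "block") return null;

  const specific = fields["hookSpecificOutput"];
  if (specific !== undefined) {
    if (typeof specific !== "object" || specific === null) return null;
    const inner = specific as Record<string, unknown>;
    if (typeof inner["hookEventName"] !== "string") return null;
    const permission = inner["permissionDecision"];
    if (permission !== undefined && permission !== "allow" && permission !== "deny" && permission !== "ask") {
      return null;
    }
  }
  return raw as HookOutput;
}
```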
    Matcher System

    Hooks use if conditions with patterns:

    Plain text
    Bash(git *)     → matches git commands
    Write|Edit      → matches Write or Edit
    regex:.*\.ts$   → regex matching

    The prepareIfConditionMatcher function does expensive work (tree-sitter parsing for Bash) once, then the closure is called per hook.
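    The "prepare once, call per hook" idea can be shown with a closure. This sketch is not prepareIfConditionMatcher itself: prepareMatcher is a hypothetical name, glob matching replaces the tree-sitter parsing mentioned above, and matching regex: patterns against the call content (falling back to the tool name) is an assumption.

```typescript
type Matcher = (toolName: string, content?: string) => boolean;

function escapeRegExp(text: string): string {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Assumed glob semantics: "*" matches any run of characters.
function globToRegExp(glob: string): RegExp {
  return new RegExp("^" + glob.split("*").map(escapeRegExp).join(".*") + "$");
}

// All parsing and compilation happens here, once per pattern.
// The returned closure only runs cheap tests.
function prepareMatcher(pattern: string): Matcher {
  if (pattern.startsWith("regex:")) {
    const regex = new RegExp(pattern.slice("regex:".length));
    return (toolName, content) => regex.test(content ?? toolName);
  }

  const call = /^([^()|]+)\((.*)\)$/.exec(pattern);
  if (call) {
    const tool = call[1] ?? "";
    const contentRegex = globToRegExp(call[2] ?? "");
    return (toolName, content) =>
      toolName === tool && content !== undefined && contentRegex.test(content);
  }

  // "Write|Edit" style alternation over tool names.
  const names = new Set(pattern.split("|").map((name) => name.trim()));
  return (toolName) => names.has(toolName);
}
```

    A caller would build the matcher when hooks are loaded and reuse it for every tool call, so the per-call cost is a Set lookup or a single RegExp test.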

    Stop Hooks

    Called from query.ts after model finishes with no tool_use. Can:

    • Return preventContinuation: true to end the turn
    • Return blockingErrors to inject error messages and re-enter the loop
    • When a stop hook blocks, sets stopHookActive: true so that repeated blocking cannot re-enter the loop forever (a "death spiral")

    12. Context & Compaction

    Four-Layer Architecture

    The query loop applies context management in this order each iteration:

    Layer 1: History Snip (HISTORY_SNIP feature)
    • Removes oldest message groups to free tokens
    • Runs before microcompact
    • Returns snipTokensFreed which is subtracted from token counts
    Layer 2: Microcompact

    Two implementations:

    Time-based microcompact (active path):

    • Triggered when gap since last assistant message exceeds gapThresholdMinutes
    • Content-clears all but the most recent N compactable tool results
    • Only compacts specific tools: Read, Bash, Grep, Glob, WebSearch, WebFetch, Edit, Write
    • Resets cached-MC state afterward (cache is cold)
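    A minimal sketch of the time-based path follows. The Message and ToolResult shapes, the keepRecent parameter (the "N" above), and the placeholder string are all hypothetical; only the trigger condition and the list of compactable tools come from the description.

```typescript
interface ToolResult {
  toolName: string;
  content: string;
}

interface Message {
  role: "user" | "assistant";
  timestampMs: number;
  toolResult?: ToolResult;
}

const COMPACTABLE_TOOLS = new Set([
  "Read", "Bash", "Grep", "Glob", "WebSearch", "WebFetch", "Edit", "Write",
]);

// Hypothetical placeholder; the real replacement text is not given in this document.
const CLEARED = "[tool result cleared]";

function microcompact(
  messages: Message[],
  nowMs: number,
  gapThresholdMinutes: number,
  keepRecent: number,
): Message[] {
  const lastAssistant = [...messages].reverse().find((m) => m.role === "assistant");
  if (!lastAssistant) return messages;

  // Only fire after a long idle gap, when the prompt cache is assumed to be cold anyway.
  const gapMinutes = (nowMs - lastAssistant.timestampMs) / 60_000;
  if (gapMinutes <= gapThresholdMinutes) return messages;

  // Indices of compactable tool results, oldest first.
  const compactable = messages
    .map((m, i) => (m.toolResult && COMPACTABLE_TOOLS.has(m.toolResult.toolName) ? i : -1))
    .filter((i) => i >= 0);
  const toClear = new Set(keepRecent > 0 ? compactable.slice(0, -keepRecent) : compactable);

  // Return new message objects rather than mutating the input.
  return messages.map((m, i) =>
    toClear.has(i) && m.toolResult
      ? { ...m, toolResult: { ...m.toolResult, content: CLEARED } }
      : m,
  );
}
```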

    Cached microcompact (CACHED_MICROCOMPACT feature, ant-only):

    • Uses the API's cache_edits feature to delete tool results without invalidating the cached prefix
    • Tracks tool results in global cachedMCState, queues cache_edits blocks
    • Count-based trigger/keep thresholds from GrowthBook config
    Layer 3: Context Collapse (CONTEXT_COLLAPSE feature)
    • Commits collapsed summaries at 90% context usage
    • Blocking spawn at 95%
    • Uses a commit log that persists across turns via projectView() replay
    • Recovery path: recoverFromOverflow() drains staged collapses on real API 413
    Layer 4: Auto Compact
    • Threshold: effectiveContextWindow minus a 13,000-token buffer
    • Fires a forked agent to summarize conversation
    • First tries session memory compaction (prunes messages), then full conversation compaction
    • Circuit breaker: stops after 3 consecutive failures
    • Post-compact:
      • Restores up to 5 recently-read files
      • Re-announces tools/MCP/agents
      • Runs SessionStart hooks
      • Re-appends session metadata
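    The threshold and circuit breaker can be sketched in a few lines. AutoCompactor and its method names are hypothetical, and summarize is a synchronous stand-in for the forked summarization agent; only the 13,000-token buffer and the limit of 3 consecutive failures come from the list above.

```typescript
const AUTOCOMPACT_BUFFER_TOKENS = 13_000;
const MAX_CONSECUTIVE_FAILURES = 3;

type CompactResult = "skipped" | "compacted" | "failed" | "circuit-open";

function autoCompactThreshold(effectiveContextWindow: number): number {
  return effectiveContextWindow - AUTOCOMPACT_BUFFER_TOKENS;
}

class AutoCompactor {
  private consecutiveFailures = 0;
  private readonly summarize: () => boolean;

  // `summarize` returns true on success; in the real system this is a forked agent.
  constructor(summarize: () => boolean) {
    this.summarize = summarize;
  }

  maybeCompact(tokenCount: number, effectiveContextWindow: number): CompactResult {
    if (tokenCount < autoCompactThreshold(effectiveContextWindow)) return "skipped";

    // Circuit breaker: after repeated failures, stop spending tokens on doomed attempts.
    if (this.consecutiveFailures >= MAX_CONSECUTIVE_FAILURES) return "circuit-open";

    if (this.summarize()) {
      this.consecutiveFailures = 0;
      return "compacted";
    }
    this.consecutiveFailures += 1;
    return "failed";
  }
}
```

    With a 200,000-token effective window the threshold works out to 200,000 - 13,000 = 187,000 tokens.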
    Reactive Compact (REACTIVE_COMPACT feature)
    • Suppresses proactive autocompact when enabled
    • Catches API prompt_too_long errors and fires compact as recovery
    • Two-stage recovery: first collapse drain (cheap), then reactive compact (full summary)
    • Also handles media-size errors (image/PDF) via strip-retry
    Post-Compact Cleanup
    • Resets readFileState cache
    • Clears loadedNestedMemoryPaths
    • Intentionally does NOT reset sentSkillNames (saves ~4K tokens)
    • Notifies cache break detection to prevent false positives
    Token Warning States
    | Threshold | Tokens Below Window | Behavior |
    |---|---|---|
    | Warning | 20K | Warning shown to user |
    | Error | 20K | Error state entered |
    | Blocking limit | 3K | Hard block: no more API calls |
    | Manual compact buffer | 3K | Reserved for /compact command |
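    One way to read the table is as a set of flags derived from the remaining headroom. The sketch below assumes the buffers are measured against the context window passed in and that reaching a buffer exactly counts as crossing it; tokenWarningState and its field names are hypothetical.

```typescript
const WARNING_BUFFER_TOKENS = 20_000;
const ERROR_BUFFER_TOKENS = 20_000;
const BLOCKING_BUFFER_TOKENS = 3_000;

interface TokenWarningState {
  remaining: number;
  isAboveWarning: boolean;
  isAboveError: boolean;
  isAtBlockingLimit: boolean;
}

function tokenWarningState(tokensUsed: number, contextWindow: number): TokenWarningState {
  const remaining = contextWindow - tokensUsed;
  return {
    remaining,
    // Warning and error share the same 20K buffer in the table, so they flip together.
    isAboveWarning: remaining <= WARNING_BUFFER_TOKENS,
    isAboveError: remaining <= ERROR_BUFFER_TOKENS,
    isAtBlockingLimit: remaining <= BLOCKING_BUFFER_TOKENS,
  };
}
```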

    13. MCP Server Integration

    What Is MCP?

    Model Context Protocol (MCP) allows Claude Code to connect to external tool servers that provide additional tools, resources, and prompts.

    Transport Types
    | Transport | How It Works |
    |---|---|
    | stdio | Spawns subprocess, communicates via stdin/stdout |
    | sse | Server-Sent Events with auth provider |
    | http | Streamable HTTP with OAuth, session management |
    | ws | WebSocket with TLS/mTLS options |
    | claudeai-proxy | Routes through claude.ai OAuth proxy |
    | in-process | Chrome MCP and Computer Use run in-process |

    Connection Flow
    Plain text
    1. Memoized by getServerCacheKey(name, serverRef)
    2. Creates transport based on type
    3. Creates Client with capabilities: roots, elicitation
    4. Sets ListRoots request handler (returns file://<cwd>)
    5. Connects with timeout (default 30s)
    6. On success: fetches capabilities, server version, instructions (capped at 2048 chars)
    7. Sets up error/close handlers with reconnection logic
    8. Registers elicitation handler
    Tool Integration
    • MCP tools prefixed as mcp__serverName__toolName
    • Tools merged into tool pool via useMergedTools
    • IDE tools filtered: only mcp__ide__executeCode and mcp__ide__getDiagnostics allowed
    • Tool output truncation via truncateMcpContentIfNeeded
    • Large output persisted to disk via persistToolResult
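    The naming scheme and the IDE filter can be sketched as below. The helper names are hypothetical, and the parser assumes server names never contain a double underscore, which this document does not confirm.

```typescript
const MCP_PREFIX = "mcp__";

function mcpToolName(serverName: string, toolName: string): string {
  return `${MCP_PREFIX}${serverName}__${toolName}`;
}

// Assumes the server name contains no "__"; the first "__" after the prefix is the separator.
function parseMcpToolName(fullName: string): { server: string; tool: string } | null {
  if (!fullName.startsWith(MCP_PREFIX)) return null;
  const rest = fullName.slice(MCP_PREFIX.length);
  const separator = rest.indexOf("__");
  if (separator <= 0 || separator + 2 >= rest.length) return null;
  return { server: rest.slice(0, separator), tool: rest.slice(separator + 2) };
}

const ALLOWED_IDE_TOOLS = new Set(["mcp__ide__executeCode", "mcp__ide__getDiagnostics"]);

// Drops IDE-server tools that are not on the allowlist; everything else passes through.
function filterIdeTools(toolNames: string[]): string[] {
  return toolNames.filter(
    (name) => parseMcpToolName(name)?.server !== "ide" || ALLOWED_IDE_TOOLS.has(name),
  );
}
```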
    Authentication
    • ClaudeAuthProvider: OAuth token management per server
    • Auth cache: 15-min TTL file-based cache
    • 401 handling: Force-refresh, serialized via write chain
    • claude.ai proxy: Custom fetch wrapper with OAuth retry

    14. Skills System

    What Are Skills?

    Skills are markdown files with YAML frontmatter that become Command objects. They provide reusable capabilities that Claude can discover and execute.

    Skill Locations
    | Location | Scope |
    |---|---|
    | Bundled | Compiled into CLI binary |
    | User | ~/.claude/skills/ |
    | Project | .claude/skills/ (walks up to home) |
    | Managed | Policy-managed path |
    | Plugin | From plugin directories |
    | MCP | From MCP server skill builders |
    | Legacy | .claude/commands/ (deprecated) |

    Skill Format
    Markdown
    ---
    name: my-skill
    description: What this skill does
    allowed-tools: Bash, Read, Write
    argument-hint: <arg1> <arg2>
    when_to_use: When to apply this skill
    model: sonnet
    user-invocable: true
    disable-model-invocation: false
    context: fork
    paths: src/**/*.ts
    effort: low
    hooks:
      PreToolUse:
        - if: Bash(*)
          command: ./hook.sh
    ---
    Skill instructions here...
    Loading Flow
    1. Parallel discovery: Loads from managed, user, project, additional, and legacy dirs simultaneously
    2. Parsing: parseFrontmatter() extracts YAML + markdown content
    3. Deduplication: By realpath (handles symlinks), first-wins ordering
    4. Conditional skills: Skills with paths frontmatter stored separately, activated when matching files are touched
    5. Dynamic discovery: When Read/Edit/Write touches a file, walks up to cwd looking for .claude/skills dirs
    6. Gitignore check: Skips gitignored skill dirs
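    Steps 3 and 4 lend themselves to a short sketch. SkillFile, dedupeSkills, and partitionConditional are hypothetical names, and realpath is injected as a plain function so the example stays independent of the file system.

```typescript
interface SkillFile {
  path: string;
  source: string;   // e.g. "managed", "user", "project"
  paths?: string[]; // the `paths` frontmatter, when present
}

// First-wins deduplication keyed by the resolved path, so two symlinks to the
// same file count as one skill and the earlier source takes precedence.
function dedupeSkills(files: SkillFile[], realpath: (path: string) => string): SkillFile[] {
  const seen = new Set<string>();
  const unique: SkillFile[] = [];
  for (const file of files) {
    const resolved = realpath(file.path);
    if (seen.has(resolved)) continue;
    seen.add(resolved);
    unique.push(file);
  }
  return unique;
}

// Skills with a `paths` pattern are held back until a matching file is touched.
function partitionConditional(skills: SkillFile[]): {
  unconditional: SkillFile[];
  conditional: SkillFile[];
} {
  const isConditional = (skill: SkillFile) => skill.paths !== undefined && skill.paths.length > 0;
  return {
    unconditional: skills.filter((skill) => !isConditional(skill)),
    conditional: skills.filter(isConditional),
  };
}
```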
    Skill Command Generation
    • getPromptForCommand: Substitutes arguments, replaces ${CLAUDE_SKILL_DIR} and ${CLAUDE_SESSION_ID}, executes inline shell commands (!`...`)
    • MCP skills: Never execute inline shell commands (remote/untrusted)
    • Argument substitution via substituteArguments()
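    A sketch of the variable replacement and the MCP restriction follows. It is not getPromptForCommand: renderSkillPrompt and SkillContext are hypothetical, runShell is an injected stand-in for real command execution, and the inline-command syntax (an exclamation mark followed by a backtick-quoted command) is an assumption. Argument substitution is left out because its placeholder format is not described here.

```typescript
interface SkillContext {
  skillDir: string;
  sessionId: string;
  fromMcp: boolean;                   // MCP skills are remote and untrusted
  runShell: (command: string) => string; // injected stand-in for real execution
}

function renderSkillPrompt(template: string, ctx: SkillContext): string {
  // Plain string replacement; split/join avoids regex-escaping the "${...}" markers.
  let prompt = template
    .split("${CLAUDE_SKILL_DIR}").join(ctx.skillDir)
    .split("${CLAUDE_SESSION_ID}").join(ctx.sessionId);

  // Inline shell commands are only expanded for local skills.
  if (!ctx.fromMcp) {
    prompt = prompt.replace(/!`([^`]*)`/g, (_match, command: string) => ctx.runShell(command));
  }
  return prompt;
}
```

    Skipping the shell step for MCP skills means a remote server can contribute prompt text but cannot cause a command to run on the local machine through this path.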

    15. Coordinator Mode

    What Is It?

    A feature-gated mode (CLAUDE_CODE_COORDINATOR_MODE=1) that transforms Claude into a task orchestrator rather than a direct code editor.

    Coordinator System Prompt

    The coordinator receives a specialized system prompt instructing it to:

    1. Delegate work to workers via the Agent tool with subagent_type: 'worker'
    2. Synthesize research findings into implementation specs
    3. Launch parallel workers for independent tasks
    4. Continue workers via SendMessage tool (preserving context)
    5. Stop workers via TaskStop tool
    6. Never fabricate results — wait for actual <task-notification> messages
    Worker Tool Restrictions

    Workers get a restricted tool set:

    • Simple mode: Bash, Read, Edit + MCP tools
    • Full mode: ASYNC_AGENT_ALLOWED_TOOLS minus internal tools (TeamCreate, TeamDelete, SendMessage, SyntheticOutput)
    Task Workflow
    Plain text
    Research (parallel workers) → Synthesis (coordinator) → Implementation (workers) → Verification (workers)
    Context Injection

    getCoordinatorUserContext() injects a workerToolsContext into the system prompt telling the coordinator which tools its workers have access to.


    16. Loop Termination

    The loop terminates via return { reason: ... } at multiple points:

    Normal Termination
    | Reason | When |
    |---|---|
    | completed | No tool_use blocks, stop hooks pass, no blocking errors |
    | stop_hook_prevented | Stop hook indicated not to continue |

    Budget/Limit Termination
    | Reason | When |
    |---|---|
    | max_turns | Exceeded maxTurns parameter |
    | blocking_limit | Context exceeds hard token limit |
    | Budget exceeded | Checked in QueryEngine via getTotalCost() >= maxBudgetUsd |

    Error Termination
    | Reason | When |
    |---|---|
    | model_error | API call threw an exception |
    | prompt_too_long | Recovery exhausted for 413 errors |
    | image_error | Media size error recovery exhausted |
    | error_max_structured_output_retries | Structured output retry limit |

    Abort Termination
    | Reason | When |
    |---|---|
    | aborted_streaming | Abort signal fired during API streaming |
    | aborted_tools | Abort signal fired during tool execution |
    | hook_stopped | Hook indicated to prevent continuation |

    QueryEngine Result Mapping

    When queryLoop returns, QueryEngine.submitMessage() maps the terminal state to an SDK result:

    TypeScript
    // Success
    { type: 'result', subtype: 'success', result: textResult, ... }
    
    // Error
    { type: 'result', subtype: 'error_during_execution', errors: [...], ... }
    
    // Budget exceeded
    { type: 'result', subtype: 'error_max_budget_usd', ... }
    
    // Max turns
    { type: 'result', subtype: 'error_max_turns', ... }
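    The mapping could look roughly like the function below. The document lists the terminal reasons and the four result subtypes but not which reason maps to which subtype, so the assignments here are assumptions: both normal reasons map to success, the budget check takes precedence, and every other reason falls into error_during_execution. toResultSubtype is a hypothetical name.

```typescript
type TerminalReason =
  | "completed" | "stop_hook_prevented"
  | "max_turns" | "blocking_limit"
  | "model_error" | "prompt_too_long" | "image_error"
  | "error_max_structured_output_retries"
  | "aborted_streaming" | "aborted_tools" | "hook_stopped";

type ResultSubtype =
  | "success"
  | "error_during_execution"
  | "error_max_budget_usd"
  | "error_max_turns";

function toResultSubtype(
  reason: TerminalReason,
  totalCostUsd: number,
  maxBudgetUsd?: number,
): ResultSubtype {
  // Budget is checked outside the loop (in QueryEngine), so it is tested first here.
  if (maxBudgetUsd !== undefined && totalCostUsd >= maxBudgetUsd) {
    return "error_max_budget_usd";
  }
  switch (reason) {
    case "completed":
    case "stop_hook_prevented": // assumption: a normal termination counts as success
      return "success";
    case "max_turns":
      return "error_max_turns";
    default: // assumption: all remaining reasons surface as a generic execution error
      return "error_during_execution";
  }
}
```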

    17. Key Architectural Patterns

    1. Everything Is a Generator

    The entire loop is an async generator, yielding events/messages as they happen. This enables:

    • Real-time UI updates in the TUI
    • SDK streaming for programmatic consumers
    • Clean cancellation at any point
    • Test recording/playback via VCR
    2. Immutable State Pattern

    State is never mutated; new State objects are created at each continue site. This makes:

    • The loop's behavior deterministic and testable
    • Recovery paths clean (no rollback needed)
    • Fork subagents possible (snapshot and branch)
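    A toy version of the pattern, with invented field names (LoopState, nextState, and runLoop are not from the codebase): every continue site builds a fresh frozen object instead of editing the previous one, so earlier states stay valid as snapshots.

```typescript
interface LoopState {
  readonly messages: readonly string[];
  readonly turnCount: number;
}

// Builds the next state from the previous one; the previous object is left untouched.
function nextState(prev: LoopState, patch: Partial<LoopState>): LoopState {
  return Object.freeze({ ...prev, ...patch });
}

function runLoop(initial: LoopState, maxTurns: number): LoopState {
  let state = initial;
  while (state.turnCount < maxTurns) {
    // A "continue site": rebind `state` to a new object rather than mutating it.
    state = nextState(state, {
      messages: [...state.messages, `turn ${state.turnCount + 1}`],
      turnCount: state.turnCount + 1,
    });
  }
  return state;
}
```

    Forking is then just holding on to a state and calling nextState on it twice with different patches. Neither branch can observe the other.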
    3. Recursive Agents

    Subagents call query() recursively, creating nested agentic loops. Each has:

    • Its own ToolUseContext
    • Its own abort controller (async agents) or a shared one (sync agents)
    • Its own message history
    • Its own transcript recording (sidechain)
    4. Streaming Tool Execution

    Tools can start executing while the API response is still streaming, which reduces latency when a response contains several independent tool calls.

    5. Feature-Gated Complexity

    Most advanced features are behind feature() gates with conditional require() for dead code elimination:

    • Fork subagents
    • Coordinator mode
    • Context collapse
    • Reactive compact
    • History snip
    • Cached microcompact
    • Auto mode classifier
    • Assistant mode (Kairos)
    • SSH remote
    • Direct connect
    6. Prompt Cache Optimization

    The fork subagent path keeps every part of the request that forks share identical, to maximize Anthropic's prompt cache hit rate:

    • Exact tool arrays (no reordering)
    • Exact system prompt bytes (frozen at turn start)
    • Identical placeholder results across forks
    • Content-hash-based temp file paths (not UUIDs)
    7. Recovery-First Design

    The loop has multiple recovery paths before surfacing errors:

    • Reactive compact
    • Context collapse drain
    • Token escalation
    • Fallback models
    • Structured output retry
    • Media strip retry
    8. Trust Gate

    ALL hooks, git operations, and system context reads require workspace trust to be established first. This prevents arbitrary code execution before the user has explicitly trusted the workspace.

    9. Spill-to-Disk for Large Outputs

    Tool results that exceed size limits are persisted to disk with a preview shown to the model. This keeps the context window manageable while preserving full output accessibility.

    10. Permission Rule Sources

    Permission rules come from multiple sources with clear priority ordering:

    Plain text
    policySettings > userSettings > projectSettings > localSettings > cliArg > command > session

    This lets enterprise policies override user preferences, which in turn override project defaults, and so on down the list.


    Quick Reference: File Map

    | File | Role |
    |---|---|
    | main.tsx | Entry point, CLI parsing, initialization |
    | QueryEngine.ts | High-level conversation orchestrator |
    | query.ts | The agentic loop (queryLoop()) |
    | query/deps.ts | Production dependencies injected into query |
    | Tool.ts | Tool type definitions and buildTool() |
    | tools.ts | Tool registry assembly |
    | tools/ | Individual tool implementations (43 tools) |
    | state/store.ts | Zustand-like pub/sub store |
    | state/AppStateStore.ts | Immutable AppState type and store |
    | bootstrap/state.ts | Global singleton state |
    | Task.ts | Task type definitions and ID generation |
    | tasks.ts | Task management utilities |
    | types/message.ts | Message type definitions |
    | types/hooks.ts | Hook event type definitions |
    | types/permissions.ts | Permission type definitions |
    | utils/permissions/ | Permission system implementation |
    | utils/hooks.ts | Hook execution engine |
    | services/api/claude.ts | Anthropic API client |
    | services/mcp/ | MCP server integration |
    | skills/ | Skills system |
    | coordinator/ | Coordinator mode |
    | context/ | React contexts (notifications, modals, etc.) |

    This document covers the core agentic loop, agent workflow, tool execution, state management, permission system, hooks, context management, MCP integration, skills, and coordinator mode. The codebase is roughly 100K or more lines of TypeScript, with heavy use of feature flags, generators, and immutable state patterns.