agentic-architecture

Claude Code — Agentic Loop & Architecture Deep Dive

A principal-engineer-level walkthrough of how Claude Code works end to end: startup, the agentic loop, the LLM API call pipeline, tool execution, subagents, state management, messages, permissions, hooks, and context compaction.

High-Level Architecture
Startup & Initialization
The Agentic Loop
LLM API Call Pipeline
Tool Execution Pipeline
The Tool System

Plain text

┌─────────────────────────────────────────────────┐
│  REPL / TUI Layer (Ink React Components)        │  ← User sees this
│  main.tsx → interactiveHelpers.tsx → REPL.tsx   │
├─────────────────────────────────────────────────┤
│  Query Engine Layer                             │  ← The brain
│  QueryEngine.ts → query.ts → queryLoop()        │
├─────────────────────────────────────────────────┤
│  API & Tool Execution Layer                     │  ← The hands
│  claude.ts (API) + tools/ (43 tool impls)       │
└─────────────────────────────────────────────────┘

Component	File(s)	Role
Entry Point	`main.tsx`	CLI parsing, initialization, mode detection
Query Engine	`QueryEngine.ts`	High-level orchestrator, one per conversation
Agentic Loop	`query.ts` (`queryLoop()`)	The infinite loop that drives agent behavior
Tool Registry	`Tool.ts`, `tools.ts`	Tool type definitions and registry
State Store	`state/store.ts`, `AppStateStore.ts`	Zustand-like pub/sub store
API Client	`services/api/claude.ts`	Anthropic API calls with streaming
Bootstrap State	`bootstrap/state.ts`	Global singleton state (session, cost, telemetry)

Mode	Flag	Behavior
Interactive REPL	(default)	Full TUI with Ink React
Headless/Print	`-p` / `--print`	Non-interactive, output to stdout
SDK Mode	`--sdk-url`	SDK consumer drives the session
SSH Remote	`ssh <host>`	Remote execution via SSH tunnel
Direct Connect	`cc://...`	Connect to a running Claude Code instance
Assistant Mode	`assistant [id]`	Kairos/assistant feature (feature-gated)

Plain text

1. Await MDM + keychain prefetches
2. init() — loads settings, config, auth
3. Set process.title = 'claude'
4. Initialize logging sinks
5. Run migrations (CURRENT_MIGRATION_VERSION = 11)
6. Load remote managed settings (non-blocking)
7. Upload user settings (if feature-gated)

Plain text

User submits prompt
  → QueryEngine.submitMessage()
    → query() [outer generator]
      → queryLoop() [core infinite loop]
        → deps.callModel() [API call]
        → execute tools [tool pipeline]
        → loop again with results

Plain text

┌──────────────────────────────────────────────────────────────┐
│                    queryLoop() Iteration                      │
├──────────────────────────────────────────────────────────────┤
│                                                               │
│  1. Read current state (messages, toolUseContext, tracking)   │
│                                                               │
│  2. Skill discovery prefetch (non-blocking, during streaming) │
│                                                               │
│  3. Yield 'stream_request_start' event                        │
│                                                               │
│  4. Increment query chain tracking (chainId, depth)           │
│                                                               │
│  5. Get messages after compact boundary                       │
│                                                               │
│  6. Apply tool result budget (truncate oversized results)     │
│                                                               │
│  7. Apply history snip (if enabled)                           │
│                                                               │
│  8. Apply microcompact (time-based or cached)                 │
│                                                               │
│  9. Apply context collapse (if enabled, at 90% usage)         │
│                                                               │
│  10. Apply auto-compact (if context exceeds threshold)        │
│                                                               │
│  11. Build full system prompt                                 │
│                                                               │
│  12. Check blocking token limit (reject if too large)         │
│                                                               │
│  ┌─────────────────────────────────────────────────────────┐  │
│  │  API CALL LOOP (with fallback retry)                     │  │
│  │  a. deps.callModel() → streams assistant response        │  │
│  │  b. Collect assistantMessages, toolUseBlocks             │  │
│  │  c. Handle streaming fallback (model switch on error)    │  │
│  │  d. Withhold recoverable errors (PTL, max_tokens)        │  │
│  │  e. Yield messages to consumer                           │  │
│  └─────────────────────────────────────────────────────────┘  │
│                                                               │
│  13. Execute post-sampling hooks                              │
│                                                               │
│  14. Check for abort (user interrupt)                         │
│                                                               │
│  15. Yield previous turn's tool-use summary (if any)          │
│                                                               │
│  16. ┌─ NO tool_use blocks? ──────────────────────────────┐  │
│      │  a. Handle withheld 413/max_tokens errors           │  │
│      │  b. Run stop hooks                                  │  │
│      │  c. Check token budget                              │  │
│      │  d. Return { reason: 'completed' }                  │  │
│      └─────────────────────────────────────────────────────┘  │
│                                                               │
│  17. ┌─ HAS tool_use blocks? ─────────────────────────────┐  │
│      │  a. Execute tools (streaming or batched)            │  │
│      │  b. Collect tool results                            │  │
│      │  c. Generate tool-use summary (async)               │  │
│      │  d. Check abort mid-execution                       │  │
│      │  e. Collect attachments (queued commands, memory)   │  │
│      │  f. Refresh tools (MCP servers)                     │  │
│      │  g. Check maxTurns                                  │  │
│      │  h. Continue loop with new messages                 │  │
│      └─────────────────────────────────────────────────────┘  │
│                                                               │
└──────────────────────────────────────────────────────────────┘
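
The core of the iteration above (call the model, stop when there are no `tool_use` blocks, otherwise run the tools and loop) can be sketched in a few lines. This is a minimal illustration, not the actual `queryLoop()`: the message shapes, `runLoopSketch`, and `LoopDeps` are hypothetical, the calls are synchronous for brevity, and compaction, hooks, abort handling, and streaming are all left out. The exact point at which `maxTurns` is compared is also an assumption.

```typescript
// Hypothetical, simplified shapes: the real Message and State types carry far more.
type ToolUseBlock = { type: 'tool_use'; id: string; name: string; input: unknown }
type TextBlock = { type: 'text'; text: string }
type ToolResultBlock = { type: 'tool_result'; tool_use_id: string; content: string }
type AssistantMsg = { role: 'assistant'; content: (TextBlock | ToolUseBlock)[] }
type UserMsg = { role: 'user'; content: string | ToolResultBlock[] }
type LoopMsg = AssistantMsg | UserMsg

type LoopState = { messages: LoopMsg[]; turnCount: number }
type LoopResult = { reason: 'completed' | 'max_turns'; messages: LoopMsg[] }

// Injected dependencies, mirroring the deps.callModel() idea (synchronous here).
type LoopDeps = {
  callModel: (messages: LoopMsg[]) => AssistantMsg
  runTool: (block: ToolUseBlock) => string
}

function runLoopSketch(initial: LoopMsg[], deps: LoopDeps, maxTurns: number): LoopResult {
  let state: LoopState = { messages: initial, turnCount: 1 }
  while (true) {
    const assistant = deps.callModel(state.messages)
    const toolUses = assistant.content.filter(
      (block): block is ToolUseBlock => block.type === 'tool_use',
    )
    const withAssistant: LoopMsg[] = [...state.messages, assistant]

    // No tool_use blocks: the model is done (stop hooks and budgets omitted).
    if (toolUses.length === 0) return { reason: 'completed', messages: withAssistant }

    // tool_use blocks present: run each tool, return the results as one user message.
    const results: ToolResultBlock[] = toolUses.map(block => ({
      type: 'tool_result',
      tool_use_id: block.id,
      content: deps.runTool(block),
    }))
    const messages: LoopMsg[] = [...withAssistant, { role: 'user', content: results }]

    if (state.turnCount >= maxTurns) return { reason: 'max_turns', messages }

    // Continue with a freshly built state object rather than mutating the old one.
    state = { messages, turnCount: state.turnCount + 1 }
  }
}
```

Passing `callModel` and `runTool` in as dependencies keeps the loop testable with stubs, which is presumably the same motivation behind the injected `deps` object in the real code.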

TypeScript

type State = {
  messages: Message[]                    // Full conversation history
  toolUseContext: ToolUseContext         // Execution context (tools, abort, state)
  autoCompactTracking: ...               // Tracks compaction state
  maxOutputTokensRecoveryCount: number   // Recovery attempt counter
  hasAttemptedReactiveCompact: boolean   // Prevents infinite compact loops
  pendingToolUseSummary: ...             // Async summary from previous turn
  turnCount: number                      // Iteration counter
  transition: ...                        // Why previous iteration continued
}

Parameter	Source	Purpose
`model`	`getRuntimeMainLoopModel()`	Resolved from permission mode + token count
`system`	System prompt construction	Full system prompt with cache breakpoints
`messages`	`normalizeMessagesForAPI()`	Normalized conversation history
`tools`	`toolToAPISchema()`	Tool schemas (Zod → JSON Schema)
`thinking`	Thinking config	Adaptive/disabled thinking
`max_tokens`	Model config	Output token limit
`temperature`, `top_p`, `top_k`	Model config	Sampling parameters
`betas`	Feature flags	Prompt caching, context management, structured outputs
`metadata`	Session state	User ID, session ID for tracking

Plain text

message_start          → Initial message with usage
content_block_start    → Beginning of text/tool_use/thinking block
content_block_delta    → Streaming deltas (text chunks, tool input)
content_block_stop     → Block completed
message_delta          → Stop reason + final usage
message_stop           → Request complete
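
To make the event sequence concrete, here is a minimal sketch of folding these events into finished content blocks. The event field names (`index`, `content_block`, `delta`, `text_delta`, `input_json_delta`, `partial_json`, `stop_reason`) follow the public Anthropic streaming format; `assembleStream` and the trimmed-down types are hypothetical, and thinking blocks and usage accounting are omitted.

```typescript
// Trimmed-down event shapes; real events carry more fields (usage, message metadata).
type StreamEvt =
  | { type: 'message_start' }
  | { type: 'content_block_start'; index: number;
      content_block: { type: 'text' } | { type: 'tool_use'; id: string; name: string } }
  | { type: 'content_block_delta'; index: number;
      delta: { type: 'text_delta'; text: string } | { type: 'input_json_delta'; partial_json: string } }
  | { type: 'content_block_stop'; index: number }
  | { type: 'message_delta'; delta: { stop_reason: string } }
  | { type: 'message_stop' }

type AssembledBlock =
  | { type: 'text'; text: string }
  | { type: 'tool_use'; id: string; name: string; input: unknown }

function assembleStream(events: StreamEvt[]): { blocks: AssembledBlock[]; stopReason: string | null } {
  const blocks: AssembledBlock[] = []
  const partialJson = new Map<number, string>() // tool input arrives as JSON fragments
  let stopReason: string | null = null

  for (const ev of events) {
    switch (ev.type) {
      case 'content_block_start':
        if (ev.content_block.type === 'text') {
          blocks[ev.index] = { type: 'text', text: '' }
        } else {
          blocks[ev.index] = { type: 'tool_use', id: ev.content_block.id, name: ev.content_block.name, input: {} }
          partialJson.set(ev.index, '')
        }
        break
      case 'content_block_delta': {
        const block = blocks[ev.index]
        if (!block) break
        if (ev.delta.type === 'text_delta' && block.type === 'text') {
          block.text += ev.delta.text
        } else if (ev.delta.type === 'input_json_delta') {
          partialJson.set(ev.index, (partialJson.get(ev.index) ?? '') + ev.delta.partial_json)
        }
        break
      }
      case 'content_block_stop': {
        // The tool input is only parseable once the whole block has arrived.
        const block = blocks[ev.index]
        const json = partialJson.get(ev.index)
        if (block && block.type === 'tool_use' && json) block.input = JSON.parse(json)
        break
      }
      case 'message_delta':
        stopReason = ev.delta.stop_reason
        break
    }
  }
  return { blocks, stopReason }
}
```

Because a `tool_use` block is complete at its own `content_block_stop`, a consumer can hand it to the tool pipeline before `message_stop` arrives.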

Error	Recovery Strategy
Prompt too long (413)	Context collapse drain → Reactive compact → Surface error
Max output tokens	Escalate to 64k (once) → Inject "Resume directly" message → Retry 3x → Surface
Media size error	Strip media → Retry once → Surface
Structured output retry	Retry with relaxed constraints → Surface after limit
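
The "Max output tokens" row reads as a small escalation ladder. One way to picture it is a pure decision function over two counters, sketched below. The names `nextMaxTokensAction` and `RecoveryProgress` are hypothetical, and whether the real code tracks escalation as a separate flag (rather than folding it into `maxOutputTokensRecoveryCount`) is an assumption.

```typescript
// Hypothetical view of recovery progress for max-output-token errors.
type RecoveryProgress = {
  escalated: boolean    // has the one-time output limit escalation been used?
  resumeRetries: number // how many "Resume directly" retries have been made
}

type RecoveryAction =
  | { kind: 'escalate' }                // raise the output token limit, once
  | { kind: 'resume'; attempt: number } // inject the resume message and retry
  | { kind: 'surface' }                 // give up and show the error

const MAX_RESUME_RETRIES = 3 // "Retry 3x" in the table above

function nextMaxTokensAction(p: RecoveryProgress): RecoveryAction {
  if (!p.escalated) return { kind: 'escalate' }
  if (p.resumeRetries < MAX_RESUME_RETRIES) {
    return { kind: 'resume', attempt: p.resumeRetries + 1 }
  }
  return { kind: 'surface' }
}
```

Keeping the counters in loop state is what prevents a recoverable error from retrying forever.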

Plain text

1. Input validation     → Zod schema parse (tool.inputSchema.safeParse())
2. Value validation     → tool.validateInput() (e.g., blocked sleep patterns)
3. PreToolUse hooks     → runPreToolUseHooks() (can modify input, block, decide)
4. Permission decision  → resolveHookPermissionDecision() + canUseTool()
   ├── Permission mode check (default, plan, auto, bypass)
   ├── Always-allow rules (session, CLI, settings)
   ├── Always-deny rules
   ├── Auto-mode classifier (security check for Bash)
   └── Permission hooks
5. If denied            → Yield error tool_result, run PermissionDenied hooks
6. If allowed           → Proceed to execution

TypeScript

type ToolResult<T> = {
  data: T                          // The result data
  newMessages?: Message[]          // Optional additional messages to inject
  contextModifier?: (ctx) => ctx   // Function to update ToolUseContext
  mcpMeta?: { ... }                // MCP protocol metadata
}

TypeScript

type Tool<Input, Output, Progress> = {
  name: string                           // Unique identifier
  inputSchema: ZodType<Input>            // Input validation schema
  description(...): Promise<string>      // Dynamic description for the model
  call(args, context, canUseTool, ...): Promise<ToolResult<Output>>
  checkPermissions(input, context): Promise<PermissionResult>
  isConcurrencySafe(input): boolean      // Can run in parallel?
  isReadOnly(input): boolean             // Does it modify state?
  isDestructive(input): boolean          // Irreversible operation?
  isEnabled(): boolean                   // Feature-gated?
  validateInput?(input, context): ValidationResult
  // ... plus ~30 more optional methods for UI rendering, progress, etc.
}

TypeScript

const MyTool = buildTool({
  name: 'MyTool',
  inputSchema: z.object({ ... }),
  description: async () => 'Does something useful',
  call: async (args, ctx) => ({ data: 'result' }),
  // ... only override what's needed
})
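
The "only override what's needed" comment implies that the builder fills in defaults for everything a tool does not define. A minimal sketch of that pattern follows; `MiniTool` and `buildMiniTool` are hypothetical stand-ins for the much larger real interface, and the specific default values (not concurrency-safe, not read-only, enabled) are assumptions chosen to be conservative.

```typescript
// Hypothetical, trimmed-down tool shape.
type MiniTool<I, O> = {
  name: string
  call(args: I): Promise<{ data: O }>
  isConcurrencySafe(input: I): boolean
  isReadOnly(input: I): boolean
  isEnabled(): boolean
}

// A definition must supply name and call; everything else is optional.
type MiniToolDef<I, O> = Pick<MiniTool<I, O>, 'name' | 'call'> &
  Partial<Omit<MiniTool<I, O>, 'name' | 'call'>>

function buildMiniTool<I, O>(def: MiniToolDef<I, O>): MiniTool<I, O> {
  return {
    // Assumed defaults: treat a tool as unsafe to parallelize and as
    // state-modifying unless it explicitly says otherwise.
    isConcurrencySafe: () => false,
    isReadOnly: () => false,
    isEnabled: () => true,
    ...def, // explicit overrides win
  }
}

// Usage: a read-only tool overrides just one predicate.
const echoTool = buildMiniTool<string, string>({
  name: 'Echo',
  call: async (text) => ({ data: text }),
  isReadOnly: () => true,
})
```

Defaulting to the restrictive answer means a tool author who forgets a predicate gets serial, permission-checked execution rather than accidental parallel writes.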

Tool	File	What It Does
Bash	`tools/BashTool/`	Execute shell commands with sandbox, timeout, backgrounding
Read	`tools/FileReadTool/`	Read files (text, images, PDFs, notebooks) with dedup
Edit	`tools/FileEditTool/`	String replacement in files with staleness checks
Write	`tools/FileWriteTool/`	Write/overwrite files with read-first requirement
Agent	`tools/AgentTool/`	Spawn subagents (sync, async, fork, remote, teammate)
Glob	`tools/GlobTool/`	File pattern matching
Grep	`tools/GrepTool/`	Content search with ripgrep
TodoWrite	`tools/TodoWriteTool/`	Task tracking panel
WebSearch	`tools/WebSearchTool/`	Web search via Exa
WebFetch	`tools/WebFetchTool/`	Fetch URL content
Skill	`tools/SkillTool/`	Execute skill commands
MCP	`tools/MCPTool/`	Execute MCP server tools
ToolSearch	`tools/ToolSearchTool/`	Deferred tool loading (lazy schema loading)

Plain text

API streams: "I'll read file A and file B..."
  → ToolUse block for Read(fileA) starts streaming
    → Read(fileA) added to execution queue
  → ToolUse block for Read(fileB) starts streaming
    → Read(fileB) added to execution queue
  → Both Read tools execute in parallel (concurrency-safe)
API continues: "...and also run this command"
  → ToolUse block for Bash starts streaming
    → Bash added to queue (waits for serial execution)
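
The scheduling behavior in this trace can be approximated by grouping queued calls into batches. The sketch below is one plausible policy, not the actual scheduler: consecutive concurrency-safe calls share a parallel batch, and any other call runs in a batch of its own. `QueuedCall` and `partitionCalls` are hypothetical names.

```typescript
// Hypothetical queue entry: the flag would come from tool.isConcurrencySafe(input).
type QueuedCall = { name: string; concurrencySafe: boolean }

// Group consecutive concurrency-safe calls into one batch (run in parallel);
// every other call gets a single-element batch (run serially, in order).
function partitionCalls(calls: QueuedCall[]): QueuedCall[][] {
  const batches: QueuedCall[][] = []
  for (const call of calls) {
    const last = batches[batches.length - 1]
    const lastIsParallel = last !== undefined && last.every(c => c.concurrencySafe)
    if (call.concurrencySafe && lastIsParallel) {
      last.push(call)
    } else {
      batches.push([call])
    }
  }
  return batches
}

// The trace above: two reads, then a Bash command.
const exampleBatches = partitionCalls([
  { name: 'Read(fileA)', concurrencySafe: true },
  { name: 'Read(fileB)', concurrencySafe: true },
  { name: 'Bash', concurrencySafe: false },
])
```

Here `exampleBatches` holds two batches: the two reads together, then Bash alone. Preserving order between batches keeps a write from overtaking an earlier read of the same file.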

TypeScript

type TaskType =
  | 'local_bash'          // Shell command execution
  | 'local_agent'         // Subagent in same process
  | 'remote_agent'        // Agent in CCR (cloud) environment
  | 'in_process_teammate' // Teammate in same process
  | 'local_workflow'      // Workflow execution
  | 'monitor_mcp'         // MCP server monitor
  | 'dream'               // Dream/experimental mode

Mode	Behavior	Use Case
Sync	Blocks parent's turn, shares abort controller	Quick delegations
Async	Independent lifecycle, own abort controller	Long-running tasks

Plain text

1. registerAsyncAgent() — registers with AppState
2. createProgressTracker() — tracks execution progress
3. updateAsyncAgentProgress() — updates UI
4. Agent runs query() in background
5. On completion: enqueues <task-notification> XML to parent's message queue
6. Parent drains notifications via getCommandsByMaxPriority() filtered by agentId

Child inherits parent's exact system prompt bytes
Child inherits parent's exact tool pool

buildForkedMessages() creates:

Plain text

[...history, assistant(all_tool_uses), user(placeholder_results..., per_child_directive)]

All fork children produce byte-identical prefixes; only the final directive text block differs.
This maximizes Anthropic's prompt cache hit rate across parallel forks.
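
The prefix-sharing property is easy to check in a toy model. The sketch below builds the message list for two fork children and measures how much of their serialized form is identical. `ForkMsg`, `buildForkedMessagesSketch`, and `sharedPrefixLength` are hypothetical, and plain strings stand in for the real content blocks.

```typescript
// Hypothetical message shape: strings stand in for real content blocks.
type ForkMsg = { role: 'user' | 'assistant'; content: string[] }

// Mirrors the layout above: history, one assistant message holding every
// tool_use, then one user message with placeholder results plus the directive.
function buildForkedMessagesSketch(history: ForkMsg[], toolUseIds: string[], directive: string): ForkMsg[] {
  return [
    ...history,
    { role: 'assistant', content: toolUseIds.map(id => `tool_use:${id}`) },
    { role: 'user', content: [...toolUseIds.map(id => `placeholder_result:${id}`), directive] },
  ]
}

// Length of the common prefix of two strings.
function sharedPrefixLength(a: string, b: string): number {
  let i = 0
  while (i < a.length && i < b.length && a[i] === b[i]) i++
  return i
}

const forkHistory: ForkMsg[] = [{ role: 'user', content: ['Audit the repo'] }]
const forkIds = ['t1', 't2']
const childA = JSON.stringify(buildForkedMessagesSketch(forkHistory, forkIds, 'Explore src/'))
const childB = JSON.stringify(buildForkedMessagesSketch(forkHistory, forkIds, 'Explore tests/'))
const sharedPrefix = childA.slice(0, sharedPrefixLength(childA, childB))
```

`sharedPrefix` covers the whole history, the assistant message, and every placeholder result; the two serializations diverge only inside the directive. Putting the per-child text last is what lets a prefix-based cache serve everything before it.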

Plain text

┌─────────────────────────────────────────────────────┐
│  Bootstrap State (bootstrap/state.ts)               │  ← Global singleton
│  Session ID, cost, telemetry, feature latches       │
├─────────────────────────────────────────────────────┤
│  AppState Store (state/AppStateStore.ts)            │  ← Immutable state object
│  Permissions, tasks, MCP, todos, notifications      │
├─────────────────────────────────────────────────────┤
│  ToolUseContext (Tool.ts)                           │  ← Per-query execution context
│  Tools, abort controller, state accessors, agentId  │
└─────────────────────────────────────────────────────┘

TypeScript

type AppState = {
  toolPermissionContext: ToolPermissionContext
  tasks: Map<string, TaskState>
  mcp: { clients, tools, commands, resources }
  todos: Map<AgentId, TodoItem[]>
  notifications: Notification[]
  agentNameRegistry: Map<string, AgentId>
  settings: Settings
  mainLoopModel: string
  verbose: boolean
  // ... many more fields
}

TypeScript

createStore(initialState, onChange?) → {
  getState(),           // Current state snapshot
  setState(updater),    // Functional update: (prev) => next
  subscribe(listener)   // Returns unsubscribe function
}
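
A store with this surface fits in about twenty lines. The following is a minimal sketch of the pattern, not the code in `state/store.ts`: `createMiniStore` is a hypothetical name, and both the `onChange(next, prev)` signature and the skip-on-identical-reference behavior are assumptions.

```typescript
type StoreListener = () => void

type MiniStore<T> = {
  getState(): T
  setState(updater: (prev: T) => T): void
  subscribe(listener: StoreListener): () => void
}

function createMiniStore<T>(initialState: T, onChange?: (next: T, prev: T) => void): MiniStore<T> {
  let state = initialState
  const listeners = new Set<StoreListener>()
  return {
    getState: () => state,
    setState(updater) {
      const prev = state
      const next = updater(prev)
      if (Object.is(next, prev)) return // same reference: nothing changed
      state = next
      if (onChange) onChange(next, prev)
      listeners.forEach(listener => listener())
    },
    subscribe(listener) {
      listeners.add(listener)
      return () => {
        listeners.delete(listener)
      }
    },
  }
}

// Usage: updaters return a new object instead of mutating the previous one.
const counterStore = createMiniStore({ verbose: false, turns: 0 })
let notifications = 0
const unsubscribe = counterStore.subscribe(() => { notifications++ })
counterStore.setState(prev => ({ ...prev, turns: prev.turns + 1 }))
counterStore.setState(prev => prev) // no-op, listeners are not called
unsubscribe()
counterStore.setState(prev => ({ ...prev, verbose: true }))
```

Because state objects are never mutated, a reference comparison is enough to detect change, which is what makes subscriptions from UI components cheap.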

TypeScript

type ToolUseContext = {
  options: {
    commands: Command[]
    tools: Tools
    mainLoopModel: string
    mcpClients: MCPServerConnection[]
    agentDefinitions: AgentDefinitionsResult
    // ...
  }
  abortController: AbortController
  getAppState(): AppState
  setAppState(f: (prev: AppState) => AppState): void
  setAppStateForTasks?: (...)  // Always-shared for session-scoped infrastructure
  agentId?: AgentId            // Subagent identity (undefined for main thread)
  agentType?: string           // Subagent type name
  messages: Message[]          // Current message array
  readFileState: FileStateCache
  // ... many more fields for UI, hooks, telemetry
}

TypeScript

// At loop continuation:
state = {
  messages: [...messagesForQuery, ...assistantMessages, ...toolResults],
  toolUseContext: toolUseContextWithQueryTracking,
  turnCount: nextTurnCount,
  autoCompactTracking: updatedTracking,
  // ...
}

Type	Purpose
`user`	User input, tool results, meta messages
`assistant`	LLM responses with text/thinking/tool_use blocks
`system`	Internal messages (compact_boundary, api_error, local_command)
`progress`	Tool execution progress updates
`attachment`	System attachments (file changes, task notifications, max_turns)
`tombstone`	Control signal to remove messages from UI
`stream_event`	Raw API streaming events
`tool_use_summary`	Haiku-generated summaries of tool batches

Plain text

1. User message enters via QueryEngine.submitMessage(prompt)
2. processUserInput() handles slash commands, attachments, mode changes
3. Message pushed to mutableMessages and persisted to transcript
4. query() generator starts the agentic loop
5. Each API call produces assistant messages with optional tool_use blocks
6. Tool results become user messages with tool_result blocks
7. Loop continues until a response contains no tool_use blocks and the stop hooks pass
8. Final result yielded to QueryEngine which formats SDK response

Mode	Behavior
`default`	Normal prompting for each tool use
`plan`	Plan mode — pauses before execution
`acceptEdits`	Auto-accepts safe edits
`bypassPermissions`	All tools allowed without prompting
`dontAsk`	Denies anything that would prompt
`auto` (ant-only)	AI classifier auto-approves/denies
`bubble`	Internal subagent mode

Plain text

Step 1: Rule-based checks
  1a. Entire tool denied by rule → deny
  1b. Entire tool has ask rule → ask (unless sandboxed Bash can auto-allow)
  1c. Tool's own checkPermissions() → delegates to tool implementation
  1d. Tool implementation denied → deny
  1e. Tool requires user interaction → ask (bypass-immune)
  1f. Content-specific ask rules → ask (bypass-immune)
  1g. Safety checks (.git/, .claude/, .vscode/, shell configs) → ask (bypass-immune)

Step 2: Mode-based checks
  2a. Bypass mode → allow
  2b. Entire tool allowed by rule → allow

Step 3: Default
  passthrough → converted to ask
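
The ordering matters more than any single check: everything in Step 1 runs before bypass mode is consulted, which is why some "ask" outcomes are bypass-immune. A minimal sketch of that ordering follows. `PermissionInputs` and `decidePermission` are hypothetical; the real pipeline computes these facts lazily from rules and tool code, whereas here they are passed in as precomputed flags.

```typescript
type Verdict = 'allow' | 'deny' | 'ask'

// Hypothetical precomputed facts about one tool call.
type PermissionInputs = {
  toolDeniedByRule: boolean       // 1a
  toolHasAskRule: boolean         // 1b
  sandboxedBashAutoAllow: boolean // 1b exception
  // 1c-1g: outcome of the tool's own checks and the safety checks
  toolCheck: { behavior: 'deny' | 'ask' | 'passthrough'; bypassImmune: boolean }
  bypassMode: boolean             // 2a
  toolAllowedByRule: boolean      // 2b
}

function decidePermission(p: PermissionInputs): Verdict {
  // Step 1: rule-based checks, evaluated before any mode is considered.
  if (p.toolDeniedByRule) return 'deny'
  if (p.toolHasAskRule && !p.sandboxedBashAutoAllow) return 'ask'
  if (p.toolCheck.behavior === 'deny') return 'deny'
  if (p.toolCheck.behavior === 'ask' && p.toolCheck.bypassImmune) return 'ask'

  // Step 2: mode-based checks.
  if (p.bypassMode) return 'allow'
  if (p.toolAllowedByRule) return 'allow'

  // Step 3: anything still undecided becomes a prompt.
  return 'ask'
}
```

With these rules, bypass mode skips ordinary prompts but never a deny rule or a safety check on paths such as `.git/`.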

Plain text

Bash(git *)        → matches git commands
Bash               → matches all Bash
mcp__server1       → matches all tools from MCP server
mcp__server1__*    → wildcard for server tools
Agent(Explore)     → matches a specific agent type (e.g. in a deny rule)
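
A rule string of this form splits into a tool name and an optional content pattern. The sketch below shows one way such rules could be matched. It is an illustration under assumptions: `parseRule`, `globToRegExp`, and `ruleMatches` are hypothetical names, `*` is treated as "any run of characters", and the real matcher may use different semantics (for example prefix matching for Bash commands).

```typescript
type ParsedRule = { toolName: string; content?: string }

// "Bash(git *)" -> { toolName: "Bash", content: "git *" }; "Bash" -> { toolName: "Bash" }
function parseRule(rule: string): ParsedRule {
  const match = /^([^()]+)\((.*)\)$/.exec(rule)
  if (!match) return { toolName: rule }
  const [, toolName = '', content = ''] = match
  return { toolName, content }
}

// Escape regex metacharacters, then let "*" match any run of characters.
function globToRegExp(pattern: string): RegExp {
  const escaped = pattern
    .replace(/[.+?^${}()|[\]\\]/g, '\\$&')
    .replace(/\*/g, '[\\s\\S]*')
  return new RegExp(`^${escaped}$`)
}

function ruleMatches(rule: string, toolName: string, content?: string): boolean {
  const parsed = parseRule(rule)
  const nameMatches =
    globToRegExp(parsed.toolName).test(toolName) ||
    // A bare MCP server name covers every tool exposed by that server.
    (parsed.toolName.startsWith('mcp__') && toolName.startsWith(parsed.toolName + '__'))
  if (!nameMatches) return false
  if (parsed.content === undefined) return true // rule applies to the whole tool
  return content !== undefined && globToRegExp(parsed.content).test(content)
}
```

For example, `ruleMatches('Bash(git *)', 'Bash', 'git status')` is true while the same rule against `'rm -rf /'` is false, and `ruleMatches('mcp__server1', 'mcp__server1__search')` is true through the server-prefix branch.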

Event	When It Fires	What It Can Do
PreToolUse	Before any tool executes	Modify input, change permission, block
PostToolUse	After successful tool execution	Inject context, modify MCP output
PostToolUseFailure	After tool failure	React to failures
SessionStart	At session start	Inject context, register watchPaths
Setup	During setup phase	Initialize state
Stop	After model finishes (no tool_use)	Prevent continuation, inject errors
StopFailure	When model errors	Silent notification
SubagentStart/Stop	Around subagent lifecycle	Track subagent activity
UserPromptSubmit	When user submits a prompt	Inject additional context
PermissionDenied	When permission is denied	Retry with different input
PermissionRequest	For headless agents	Allow/deny without UI
CwdChanged	When working directory changes	Register new watchPaths
FileChanged	When watched files change	React to file changes
WorktreeCreate	When git worktree created	Setup worktree state
Notification	For system notifications	Handle notifications
ConfigChange	When configuration changes	Reload config
TaskCreated/Completed	Around background task lifecycle	Track tasks
TeammateIdle	When a teammate is idle	React to idle state

JSON

{
  "continue": false,
  "stopReason": "string",
  "decision": "approve|block",
  "systemMessage": "string",
  "hookSpecificOutput": {
    "hookEventName": "PreToolUse",
    "permissionDecision": "allow|deny|ask",
    "updatedInput": {},
    "additionalContext": "..."
  }
}
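
As an illustration of the protocol, the sketch below shows the decision logic a PreToolUse hook might run before printing its JSON to stdout. Only fields from the JSON above are used. The input shape (`tool_name`, `tool_input.command`), the function name `guardForcePush`, the optionality of every field, and the idea that an empty object means "no opinion" are all assumptions of this sketch.

```typescript
// Output fields taken from the JSON above; treating them all as optional is an assumption.
type HookOutput = {
  continue?: boolean
  stopReason?: string
  decision?: 'approve' | 'block'
  systemMessage?: string
  hookSpecificOutput?: {
    hookEventName: 'PreToolUse'
    permissionDecision?: 'allow' | 'deny' | 'ask'
    updatedInput?: Record<string, unknown>
    additionalContext?: string
  }
}

// Hypothetical hook input shape.
type PreToolUseInput = { tool_name: string; tool_input: { command?: string } }

// Deny any Bash command that force-pushes; stay silent about everything else.
function guardForcePush(input: PreToolUseInput): HookOutput {
  const command = input.tool_input.command ?? ''
  if (input.tool_name === 'Bash' && /\bgit\s+push\b.*--force\b/.test(command)) {
    return {
      hookSpecificOutput: {
        hookEventName: 'PreToolUse',
        permissionDecision: 'deny',
        additionalContext: 'Force pushes are blocked by policy.',
      },
    }
  }
  return {} // assumed to leave the decision to the normal permission pipeline
}

const hookReply = JSON.stringify(
  guardForcePush({ tool_name: 'Bash', tool_input: { command: 'git push origin main --force' } }),
)
```

A real hook script would read the input from wherever the hook runner provides it and write `hookReply` to stdout.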

Threshold	Tokens Below Window	Behavior
Warning	20K	Warning shown to user
Error	20K	Error state entered
Blocking limit	3K	Hard block — no more API calls
Manual compact buffer	3K	Reserved for `/compact` command

Transport	How It Works
stdio	Spawns subprocess, communicates via stdin/stdout
sse	Server-Sent Events with auth provider
http	Streamable HTTP with OAuth, session management
ws	WebSocket with TLS/mTLS options
claudeai-proxy	Routes through claude.ai OAuth proxy
in-process	Chrome MCP and Computer Use run in-process

Plain text

1. Memoized by getServerCacheKey(name, serverRef)
2. Creates transport based on type
3. Creates Client with capabilities: roots, elicitation
4. Sets ListRoots request handler (returns file://<cwd>)
5. Connects with timeout (default 30s)
6. On success: fetches capabilities, server version, instructions (capped at 2048 chars)
7. Sets up error/close handlers with reconnection logic
8. Registers elicitation handler

Location	Scope
Bundled	Compiled into CLI binary
User	`~/.claude/skills/`
Project	`.claude/skills/` (walks up to home)
Managed	Policy-managed path
Plugin	From plugin directories
MCP	From MCP server skill builders
Legacy	`.claude/commands/` (deprecated)

Markdown

---
name: my-skill
description: What this skill does
allowed-tools: Bash, Read, Write
argument-hint: <arg1> <arg2>
when_to_use: When to apply this skill
model: sonnet
user-invocable: true
disable-model-invocation: false
context: fork
paths: src/**/*.ts
effort: low
hooks:
  PreToolUse:
    - if: Bash(*)
      command: ./hook.sh
---
Skill instructions here...

Reason	When
`completed`	No tool_use blocks, stop hooks pass, no blocking errors
`stop_hook_prevented`	Stop hook indicated not to continue

Reason	When
`max_turns`	Exceeded `maxTurns` parameter
`blocking_limit`	Context exceeds hard token limit
Budget exceeded	Checked in QueryEngine via `getTotalCost() >= maxBudgetUsd`

Reason	When
`model_error`	API call threw an exception
`prompt_too_long`	Recovery exhausted for 413 errors
`image_error`	Media size error recovery exhausted
`error_max_structured_output_retries`	Structured output retry limit

Reason	When
`aborted_streaming`	Abort signal fired during API streaming
`aborted_tools`	Abort signal fired during tool execution
`hook_stopped`	Hook indicated to prevent continuation

TypeScript

// Success
{ type: 'result', subtype: 'success', result: textResult, ... }

// Error
{ type: 'result', subtype: 'error_during_execution', errors: [...], ... }

// Budget exceeded
{ type: 'result', subtype: 'error_max_budget_usd', ... }

// Max turns
{ type: 'result', subtype: 'error_max_turns', ... }
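
On the consumer side, these four shapes form a discriminated union keyed on `subtype`. A minimal sketch of handling them exhaustively is shown below. The elided fields (`...`) are left out, `errors` is assumed to be a list of strings, and `SdkResult` and `describeResult` are hypothetical names.

```typescript
// Only the fields shown above; the real messages carry more.
type SdkResult =
  | { type: 'result'; subtype: 'success'; result: string }
  | { type: 'result'; subtype: 'error_during_execution'; errors: string[] }
  | { type: 'result'; subtype: 'error_max_budget_usd' }
  | { type: 'result'; subtype: 'error_max_turns' }

function describeResult(msg: SdkResult): string {
  switch (msg.subtype) {
    case 'success':
      return msg.result
    case 'error_during_execution':
      return `failed: ${msg.errors.join('; ')}`
    case 'error_max_budget_usd':
      return 'stopped: budget exceeded'
    case 'error_max_turns':
      return 'stopped: turn limit reached'
  }
}
```

Switching on `subtype` lets the compiler narrow each branch, so `msg.result` and `msg.errors` are only accessible where they exist, and a newly added subtype shows up as a type error rather than a silent fallthrough.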

File	Role
`main.tsx`	Entry point, CLI parsing, initialization
`QueryEngine.ts`	High-level conversation orchestrator
`query.ts`	The agentic loop (`queryLoop()`)
`query/deps.ts`	Production dependencies injected into query
`Tool.ts`	Tool type definitions and `buildTool()`
`tools.ts`	Tool registry assembly
`tools/`	Individual tool implementations (43 tools)
`state/store.ts`	Zustand-like pub/sub store
`state/AppStateStore.ts`	Immutable AppState type and store
`bootstrap/state.ts`	Global singleton state
`Task.ts`	Task type definitions and ID generation
`tasks.ts`	Task management utilities
`types/message.ts`	Message type definitions
`types/hooks.ts`	Hook event type definitions
`types/permissions.ts`	Permission type definitions
`utils/permissions/`	Permission system implementation
`utils/hooks.ts`	Hook execution engine
`services/api/claude.ts`	Anthropic API client
`services/mcp/`	MCP server integration
`skills/`	Skills system
`coordinator/`	Coordinator mode
`context/`	React contexts (notifications, modals, etc.)

Table of Contents

1. High-Level Architecture

Three-Layer Architecture

Core Components

2. Startup & Initialization

Phase 1: Pre-Import Optimizations (main.tsx:1-20)

Phase 2: CLI Parsing (main.tsx:884+)

Phase 3: Pre-Action Hook (main.tsx:907-967)

Phase 4: Deferred Prefetches (main.tsx:388-431)

3. The Agentic Loop

Call Chain

The Loop: Step by Step (query.ts:241+)

State Carried Between Iterations

Key Insight: Immutable State Pattern

4. LLM API Call Pipeline

The Call Chain

API Request Construction (services/api/claude.ts:752+)

Streaming Protocol

Fallback Mechanism

Recovery Paths

5. Tool Execution Pipeline

Phase 1: Tool Discovery During Streaming

Phase 2: Permission Resolution (per tool)

Phase 3: Tool Execution

Phase 4: Result Processing

Phase 5: Loop Continuation

6. The Tool System

Tool Interface (Tool.ts)

Tool Builder Pattern

Tool Registry (tools.ts)

Key Tools

Tool Result Persistence

Streaming Tool Execution

7. Agent Spawning & Subagents

Task Types (Task.ts)

Agent Spawn Modes

Sync vs Async

Spawn Paths

runAgent() Execution Environment

Background Agent Lifecycle

Fork Subagent Mechanics

8. State Management

Three-Layer State Architecture

Bootstrap State (bootstrap/state.ts)

AppState Store (state/AppStateStore.ts)

Store Implementation (state/store.ts)

ToolUseContext

State Flow in the Loop

9. Message System

Message Types (types/message.ts)

Conversation Flow

Message Normalization

Attachment System

10. Permission System

Permission Modes

Permission Pipeline (hasPermissionsToUseTool)

Permission Rules

Auto Mode Classifier (ant-only)

Headless Agent Handling

11. Hooks System

What Are Hooks?

Hook Events (types/hooks.ts)

Execution Model

Hook Output Protocol

Matcher System

Stop Hooks

12. Context & Compaction

Four-Layer Architecture

Layer 1: History Snip (HISTORY_SNIP feature)

Layer 2: Microcompact

Layer 3: Context Collapse (CONTEXT_COLLAPSE feature)

Layer 4: Auto Compact

Reactive Compact (REACTIVE_COMPACT feature)

Post-Compact Cleanup

Token Warning States
