Shared from "Claude-Code" on Inkdown

Query System Architecture

Overview

The Query System is the heart of Claude Code - it manages the conversation with Claude's API, handles tool execution, and orchestrates the entire interaction flow.

Plain text
┌─────────────────────────────────────────────────────────────────────────────┐
│                           QUERY SYSTEM                                       │
├─────────────────────────────────────────────────────────────────────────────┤
│                                                                             │
│  ┌──────────────┐    ┌──────────────┐    ┌──────────────┐                │
│  │   Query      │───►│   Query      │───►│   Claude     │                │
│  │   Engine     │    │   (query.ts) │    │   API        │                │
│  │   (class)    │◄───│              │◄───│              │                │
│  └──────────────┘    └──────────────┘    └──────────────┘                │
│         │                   │                                               │
│         │            ┌──────▼──────┐                                     │
│         │            │  Streaming  │                                     │
│         │            │  Executor   │                                     │
│         │            └──────┬──────┘                                     │
│         │                   │                                               │
│         │            ┌──────▼──────┐                                     │
│         └───────────►│   Tool      │                                     │
│                      │  Execution  │                                     │
│                      └─────────────┘                                     │
│                                                                             │
└─────────────────────────────────────────────────────────────────────────────┘

Core Files

| File | Purpose |
|------|---------|
| QueryEngine.ts | Main class managing query lifecycle |
| query.ts | Core streaming logic and API communication |
| query/config.ts | Query configuration building |
| query/deps.ts | Dependency injection for testing |
| query/transitions.ts | State machine transitions |
| query/tokenBudget.ts | Token limit management |

Query Engine Class

TypeScript
// QueryEngine.ts
export class QueryEngine {
  private config: QueryEngineConfig
  private mutableMessages: Message[]  // Conversation history
  private abortController: AbortController
  private totalUsage: NonNullableUsage
  private readFileState: FileStateCache

  constructor(config: QueryEngineConfig) {
    // Setup initial state
  }

  async *submitMessage(
    userMessage: UserMessage
  ): AsyncGenerator<QueryEvent, QueryResult> {
    // Main entry point - processes a user message
  }

  async *submitPrompt(
    systemPrompt: SystemPrompt,
    userMessages: UserMessage[]
  ): AsyncGenerator<QueryEvent, QueryResult> {
    // Lower-level prompt submission
  }
}
Key Insight

QueryEngine is reusable across turns. You create it once per conversation, then call submitMessage() for each user input.
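
A minimal sketch of that usage pattern, assuming a stripped-down engine. `MiniEngine`, `Msg`, and the canned reply are hypothetical stand-ins, not the real QueryEngine; the sketch only shows that history accumulates across submitMessage() calls on a single instance.

```typescript
// Hypothetical, simplified message type; the real Message type is richer.
type Msg = { role: "user" | "assistant"; content: string };

class MiniEngine {
  // History survives between turns because it lives on the instance.
  private mutableMessages: Msg[] = [];

  async *submitMessage(text: string): AsyncGenerator<string, Msg[]> {
    this.mutableMessages.push({ role: "user", content: text });
    // Stand-in for the API call: report how many messages the engine has seen.
    const reply = `seen ${this.mutableMessages.length} message(s)`;
    yield reply;
    this.mutableMessages.push({ role: "assistant", content: reply });
    return [...this.mutableMessages];
  }
}

async function demoTurns(): Promise<string[]> {
  const engine = new MiniEngine(); // created once per conversation
  const replies: string[] = [];
  for (const input of ["hello", "and again"]) {
    for await (const chunk of engine.submitMessage(input)) replies.push(chunk);
  }
  return replies;
}
```

The second reply reports three messages because the first turn's user and assistant messages are still held by the engine.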


Query Flow

Plain text
┌─────────────────────────────────────────────────────────────────┐
│                     QUERY LIFECYCLE                              │
├─────────────────────────────────────────────────────────────────┤
│                                                                 │
│  1. SUBMIT                                                      │
│     UserMessage → add to messages → call API                     │
│                                                                 │
│  2. STREAM                                                      │
│     Receive chunks from Claude API                              │
│     ├─ Text blocks → yield message events                       │
│     ├─ Tool use blocks → queue for execution                    │
│     └─ Error blocks → handle errors                             │
│                                                                 │
│  3. EXECUTE TOOLS                                               │
│     For each ToolUseBlock:                                      │
│     ├─ Check permissions                                        │
│     ├─ Execute tool                                             │
│     ├─ Collect results                                          │
│     └─ Add ToolResultBlock to messages                          │
│                                                                 │
│  4. CONTINUE?                                                   │
│     If more tool uses → go back to step 2                      │
│     If max turns reached → stop with warning                    │
│     If completion → yield final result                          │
│                                                                 │
│  5. COMPLETE                                                    │
│     Return final QueryResult                                     │
│                                                                 │
└─────────────────────────────────────────────────────────────────┘
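
The lifecycle above can be sketched as a plain loop. Everything in this block is a hypothetical stand-in (`ModelTurn`, `Entry`, `callModel`, `runTool`, the `maxTurns` default); streaming and permission checks are deliberately left out so the control flow stays visible.

```typescript
// Hypothetical shapes for illustration only.
type ToolUse = { id: string; name: string; input: string };
type ModelTurn = { text: string; toolUses: ToolUse[] };
type Entry = { role: "user" | "assistant" | "tool_result"; content: string };
type Outcome = { status: "complete" | "max_turns"; messages: Entry[] };

async function runLifecycle(
  userText: string,
  callModel: (messages: Entry[]) => Promise<ModelTurn>,
  runTool: (use: ToolUse) => Promise<string>,
  maxTurns = 5,
): Promise<Outcome> {
  // 1. SUBMIT: add the user message to the history.
  const messages: Entry[] = [{ role: "user", content: userText }];

  for (let turn = 0; turn < maxTurns; turn++) {
    // 2. STREAM (collapsed here into a single awaited response).
    const response = await callModel(messages);
    messages.push({ role: "assistant", content: response.text });

    // 4. CONTINUE? No tool uses means the model is done.
    if (response.toolUses.length === 0) {
      return { status: "complete", messages }; // 5. COMPLETE
    }

    // 3. EXECUTE TOOLS: append one result per requested tool use.
    for (const use of response.toolUses) {
      messages.push({ role: "tool_result", content: await runTool(use) });
    }
  }
  // Turn budget exhausted while the model was still requesting tools.
  return { status: "max_turns", messages };
}
```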

Query Function (Streaming Core)

TypeScript
// query.ts - The main streaming implementation
export async function* query(
  params: QueryParams
): AsyncGenerator<StreamEvent, QueryResult> {
  const {
    messages,
    systemPrompt,
    tools,
    canUseTool,
    toolUseContext,
  } = params

  // Build API configuration
  const apiConfig = buildQueryConfig(params)

  // Start API call with streaming
  const stream = await streamMessages(apiConfig)

  // Tool uses requested by the model during this response
  const collectedToolUses: ToolUseBlock[] = []

  // Process stream chunks
  for await (const chunk of stream) {
    const content = chunk.delta?.content

    // Handle different content types
    if (content?.type === 'text') {
      yield { type: 'text', text: content.text }
    }

    if (content?.type === 'tool_use') {
      // Queue tool for execution
      collectedToolUses.push(content)
      yield { type: 'tool_use', toolUse: content }
    }

    // Handle thinking blocks (for reasoning models)
    if (content?.type === 'thinking') {
      yield { type: 'thinking', thinking: content.thinking }
    }

    // Handle usage statistics
    if (chunk.usage) {
      yield { type: 'usage', usage: chunk.usage }
    }
  }

  // No tool calls: the turn is complete
  if (collectedToolUses.length === 0) {
    return { type: 'complete', messages }
  }

  // Execute tools, forwarding their progress events to the consumer
  const result = yield* runTools(collectedToolUses, toolUseContext, canUseTool)

  // Recursive call to continue the conversation with the tool results
  return yield* query({
    ...params,
    messages: [...messages, ...result.messages],
  })
}

Tool Execution Flow

Plain text
┌─────────────────────────────────────────────────────────────────┐
│                    TOOL EXECUTION                              │
├─────────────────────────────────────────────────────────────────┤
│                                                                 │
│  ┌──────────────┐                                              │
│  │ ToolUseBlock │                                              │
│  │ from API     │                                              │
│  └──────┬───────┘                                              │
│         │                                                       │
│         ▼                                                       │
│  ┌──────────────┐     ┌──────────────┐     ┌──────────────┐  │
│  │  Find Tool   │────►│ Check        │────►│  Execute     │  │
│  │  by Name     │     │ Permissions  │     │  Tool        │  │
│  └──────────────┘     └──────────────┘     └──────┬───────┘  │
│                                                    │          │
│                              ┌─────────────────────┘          │
│                              │                                │
│                              ▼                                │
│                    ┌─────────────────┐                       │
│                    │  Tool Progress  │                       │
│                    │  (yield events) │                       │
│                    └────────┬────────┘                       │
│                             │                                 │
│                             ▼                                 │
│                    ┌─────────────────┐                       │
│                    │  ToolResultBlock│                       │
│                    │  (add to msgs)  │                       │
│                    └─────────────────┘                       │
│                                                                 │
└─────────────────────────────────────────────────────────────────┘
Tool Execution Code
TypeScript
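// services/tools/toolOrchestration.ts
// Async generator: progress events are yielded, the final ToolRunResult is returned
export async function* runTools(
  toolUses: ToolUseBlock[],
  context: ToolUseContext,
  canUseTool: QueryParams['canUseTool'],
): AsyncGenerator<StreamEvent, ToolRunResult> {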
  const results: ToolResultBlock[] = []

  for (const toolUse of toolUses) {
    // 1. Find the tool
    const tool = findToolByName(toolUse.name, context.options.tools)

    // 2. Check if allowed
    const permission = await canUseTool(tool, toolUse.input)
    if (!permission.allowed) {
      results.push(createPermissionDeniedResult(toolUse, permission))
      continue
    }

    // 3. Execute with progress tracking
    const executor = new StreamingToolExecutor(tool, toolUse, context)

    for await (const progress of executor.execute()) {
      // Yield progress to UI
      yield progress
    }

    // 4. Get final result
    const result = await executor.getResult()
    results.push(result)
  }

  return { results, messages: [...results] }
}
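
The loop above assumes that `findToolByName` always finds a tool and that execution never throws. A minimal sketch of the failure paths follows, under the assumption that both cases are reported back to the model as error results rather than aborting the loop. `MiniTool`, `MiniToolUse`, `MiniToolResult`, and `resolveAndRun` are hypothetical names.

```typescript
// Hypothetical minimal shapes; the real Tool and ToolResultBlock types are richer.
type MiniTool = { name: string; call: (input: string) => Promise<string> };
type MiniToolUse = { id: string; name: string; input: string };
type MiniToolResult = {
  type: "tool_result";
  tool_use_id: string;
  content: string;
  is_error: boolean;
};

async function resolveAndRun(use: MiniToolUse, tools: MiniTool[]): Promise<MiniToolResult> {
  const base = { type: "tool_result" as const, tool_use_id: use.id };
  const tool = tools.find((t) => t.name === use.name);
  if (!tool) {
    // Unknown tool: tell the model instead of throwing.
    return { ...base, content: `Unknown tool: ${use.name}`, is_error: true };
  }
  try {
    return { ...base, content: await tool.call(use.input), is_error: false };
  } catch (err) {
    // A failing tool is also surfaced as an error result so the loop can continue.
    const message = err instanceof Error ? err.message : String(err);
    return { ...base, content: message, is_error: true };
  }
}
```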

Streaming Architecture

Plain text
┌─────────────────────────────────────────────────────────────────┐
│                    STREAMING FLOW                                │
├─────────────────────────────────────────────────────────────────┤
│                                                                 │
│   API Response (SSE/Streaming)                                    │
│        │                                                        │
│        ▼                                                        │
│   ┌─────────────┐                                                │
│   │ Raw Chunks  │  "{\"type\": \"content_block_delta\"...}"      │
│   └──────┬──────┘                                                │
│          │                                                       │
│          ▼                                                       │
│   ┌─────────────┐                                                │
│   │ Parse JSON  │                                                │
│   └──────┬──────┘                                                │
│          │                                                       │
│          ▼                                                       │
│   ┌─────────────┐     ┌─────────────┐     ┌─────────────┐     │
│   │ Text Delta  │────►│ Update      │────►│ Yield to    │     │
│   │             │     │ Assistant   │     │ Generator   │     │
│   └─────────────┘     │ Message     │     └─────────────┘     │
│                       └─────────────┘                          │
│                                                                 │
│   ┌─────────────┐     ┌─────────────┐     ┌─────────────┐     │
│   │ Tool Use    │────►│ Create      │────►│ Yield       │     │
│   │ Start       │     │ ToolUseBlock│     │ Tool Event  │     │
│   └─────────────┘     └─────────────┘     └─────────────┘     │
│                                                                 │
└─────────────────────────────────────────────────────────────────┘
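
A sketch of the "parse, then dispatch by block type" step in the diagram. The event shapes follow the Anthropic Messages streaming format (`content_block_start`, `content_block_delta` carrying `text_delta` or `input_json_delta`, `content_block_stop`). The `accumulate` function and the `Block` type are hypothetical; they handle only text and tool-use blocks and ignore every other event type.

```typescript
type StreamEvt =
  | { type: "content_block_start"; index: number;
      content_block: { type: "text"; text: string } | { type: "tool_use"; id: string; name: string } }
  | { type: "content_block_delta"; index: number;
      delta: { type: "text_delta"; text: string } | { type: "input_json_delta"; partial_json: string } }
  | { type: "content_block_stop"; index: number };

type Block =
  | { type: "text"; text: string }
  | { type: "tool_use"; id: string; name: string; json: string; input?: unknown };

// Folds raw JSON event payloads into content blocks, indexed by block position.
function accumulate(rawChunks: string[]): Block[] {
  const blocks: Block[] = [];
  for (const raw of rawChunks) {
    const evt = JSON.parse(raw) as StreamEvt; // "Parse JSON" step
    if (evt.type === "content_block_start") {
      const start = evt.content_block;
      blocks[evt.index] =
        start.type === "text"
          ? { type: "text", text: start.text }
          : { type: "tool_use", id: start.id, name: start.name, json: "" };
    } else if (evt.type === "content_block_delta") {
      const block = blocks[evt.index];
      if (block?.type === "text" && evt.delta.type === "text_delta") {
        block.text += evt.delta.text; // "Update Assistant Message"
      } else if (block?.type === "tool_use" && evt.delta.type === "input_json_delta") {
        block.json += evt.delta.partial_json; // tool input arrives as JSON fragments
      }
    } else if (evt.type === "content_block_stop") {
      const block = blocks[evt.index];
      // The tool input is only parseable once the block is complete.
      if (block?.type === "tool_use") block.input = JSON.parse(block.json || "{}");
    }
  }
  return blocks;
}
```

Tool input is buffered as a string and parsed at `content_block_stop` because the fragments are not valid JSON on their own.
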
Why Generators?

Using AsyncGenerator allows incremental UI updates:

TypeScript
// Consumer can show progress as it happens
for await (const event of queryEngine.submitMessage(userMsg)) {
  switch (event.type) {
    case 'text':
      appendToUI(event.text)  // Stream text live
      break
    case 'tool_use':
      showToolStart(event.toolUse)  // Show "Running BashTool..."
      break
    case 'tool_progress':
      updateToolProgress(event.progress)  // Update progress bar
      break
    case 'tool_result':
      showToolResult(event.result)  // Show final output
      break
  }
}
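
One caveat with `for await`: it discards the generator's return value, so a final result such as `QueryResult` has to be read through manual iteration (or through `yield*` inside another generator). A minimal sketch follows; `drain` and the toy `countdown` generator are hypothetical helpers, not part of the codebase.

```typescript
// Collects every yielded event and also captures the generator's return value.
async function drain<E, R>(gen: AsyncGenerator<E, R>): Promise<{ events: E[]; result: R }> {
  const events: E[] = [];
  while (true) {
    const step = await gen.next();
    if (step.done) return { events, result: step.value };
    events.push(step.value);
  }
}

// Toy generator standing in for query(): yields events, then returns a summary.
async function* countdown(from: number): AsyncGenerator<number, string> {
  for (let n = from; n > 0; n--) yield n;
  return "liftoff";
}
```

Calling `drain(countdown(3))` resolves to the events `[3, 2, 1]` together with the result `"liftoff"`.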

Message Compilation

Before sending to API, messages are normalized:

TypeScript
// utils/messages.ts
export function normalizeMessagesForAPI(
  messages: Message[],
  options: NormalizeOptions
): SDKMessage[] {
  return messages.map(msg => {
    switch (msg.type) {
      case 'user':
        return {
          role: 'user',
          content: normalizeContent(msg.content),
        }
      case 'assistant':
        return {
          role: 'assistant',
          content: [
            // Thinking blocks, if present, go first: the API expects them
            // ahead of the other content in an assistant turn
            ...(msg.thinking ? [{ type: 'thinking', ...msg.thinking }] : []),
            ...normalizeContent(msg.content),
          ],
        }
      case 'tool_result':
        return {
          role: 'user',  // Tool results are user-role for API
          content: [{
            type: 'tool_result',
            tool_use_id: msg.tool_use_id,
            content: msg.content,
            is_error: msg.is_error,
          }],
        }
    }
  })
}

Context Window Management

Plain text
┌─────────────────────────────────────────────────────────────────┐
│                 CONTEXT WINDOW FLOW                              │
├─────────────────────────────────────────────────────────────────┤
│                                                                 │
│  Input Messages                                                 │
│       │                                                         │
│       ▼                                                         │
│  ┌─────────┐   Too large?   ┌─────────────┐                     │
│  │ Measure │───────────────►│   Compact   │                     │
│  │ Tokens  │                │ (Summarize) │                     │
│  └────┬────┘                └──────┬──────┘                     │
│       │ No                         │                            │
│       ▼                            ▼                            │
│  ┌─────────┐                  ┌─────────┐                       │
│  │  Send   │                  │  Send   │                       │
│  │ to API  │                  │ to API  │                       │
│  └─────────┘                  └─────────┘                       │
│                                                                 │
│  Compaction triggers:                                           │
│  - Token count exceeds threshold (~80% of model limit)          │
│  - User hits soft limit warning                                 │
│  - Auto-compact enabled and threshold reached                 │
│                                                                 │
└─────────────────────────────────────────────────────────────────┘
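
The first trigger can be sketched as a simple threshold check. The 0.8 default mirrors the approximate figure in the diagram. `estimateTokens` uses a rough four-characters-per-token heuristic, which is an assumption for illustration and not the real token counter.

```typescript
// Rough heuristic (assumption): about 4 characters per token for English text.
function estimateTokens(text: string): number {
  return Math.ceil(text.length / 4);
}

// True once the conversation uses at least `threshold` of the model's context limit.
function shouldCompact(messages: string[], modelLimit: number, threshold = 0.8): boolean {
  const used = messages.reduce((sum, m) => sum + estimateTokens(m), 0);
  return used >= modelLimit * threshold;
}
```
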
Compaction Types
TypeScript
// services/compact/compact.ts
export async function buildPostCompactMessages(
  messages: Message[],
  boundary: CompactBoundary,
  config: CompactConfig
): Promise<Message[]> {
  // 1. Keep recent messages (most recent N)
  const recentMessages = getRecentMessages(messages, boundary)

  // 2. Summarize older messages
  const summary = await generateSummary(
    getOlderMessages(messages, boundary)
  )

  // 3. Return: [summary as system message, ...recentMessages]
  return [
    createSystemMessage({ content: summary }),
    ...recentMessages,
  ]
}
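
A self-contained sketch of the same keep-recent, summarize-older idea. `ChatMsg`, `compactMessages`, and the `keepRecent` and `summarize` parameters are hypothetical; the function above derives its split from a `CompactBoundary` instead of a fixed count.

```typescript
type ChatMsg = { role: "system" | "user" | "assistant"; content: string };

async function compactMessages(
  messages: ChatMsg[],
  keepRecent: number,
  summarize: (older: ChatMsg[]) => Promise<string>,
): Promise<ChatMsg[]> {
  // Nothing is old enough to summarize.
  if (messages.length <= keepRecent) return messages;

  const splitAt = messages.length - keepRecent;
  const older = messages.slice(0, splitAt);
  const recent = messages.slice(splitAt);

  // The summary replaces the older messages as a single leading system message.
  const summary = await summarize(older);
  return [{ role: "system", content: summary }, ...recent];
}
```

In a real conversation the split point would likely need to avoid separating a tool call from its result, which a fixed count does not guarantee.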

Error Handling

TypeScript
// services/api/errors.ts
export function categorizeRetryableAPIError(error: APIError): ErrorCategory {
  if (error.status === 429) {
    return { type: 'rate_limit', retryable: true, delayMs: 60000 }
  }
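  if (error.status >= 500) {
    // Any 5xx response counts as a server error, not only 500
    return { type: 'server_error', retryable: true, delayMs: 1000 }
  }
  if (error.status === 413 || isPromptTooLong(error)) {
    // Not retryable as-is: the context must be compacted before re-sending
    return { type: 'context_length', retryable: false }
  }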
  return { type: 'unknown', retryable: false }
}
Recovery Strategies
| Error Type | Strategy |
|------------|----------|
| Rate Limit | Wait and retry with backoff |
| Server Error | Retry with exponential backoff |
| Context Length | Compact and retry |
| Max Output | Truncate and continue |
| Invalid Tool | Return error to LLM |
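
A minimal sketch of the "retry with exponential backoff" strategy. `withBackoff` and its options are hypothetical; the delay doubles after each failed attempt, and `sleep` is injectable so the behaviour can be exercised without real waiting.

```typescript
// Retries `fn` while `isRetryable(err)` is true, doubling the delay after each failure.
async function withBackoff<T>(
  fn: () => Promise<T>,
  isRetryable: (err: unknown) => boolean,
  opts: { maxAttempts: number; baseDelayMs: number; sleep?: (ms: number) => Promise<void> },
): Promise<T> {
  const sleep =
    opts.sleep ?? ((ms: number) => new Promise<void>((resolve) => setTimeout(resolve, ms)));
  for (let attempt = 1; ; attempt++) {
    try {
      return await fn();
    } catch (err) {
      // Give up on non-retryable errors or once the attempt budget is spent.
      if (attempt >= opts.maxAttempts || !isRetryable(err)) throw err;
      await sleep(opts.baseDelayMs * 2 ** (attempt - 1)); // 1x, 2x, 4x, ...
    }
  }
}
```

Production retry loops often add random jitter to the delay so that many clients do not retry in lockstep; that is omitted here for brevity.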

Configuration Building

TypeScript
// query/config.ts
export function buildQueryConfig(params: QueryParams): APIConfig {
  // Resolve the model once so every model-dependent setting agrees
  const model = params.userSpecifiedModel ?? getDefaultMainLoopModel()
  return {
    model,
    max_tokens: getMaxTokensForModel(model),
    temperature: getTemperatureForMode(params.mode),
    system: normalizeSystemPrompt(params.systemPrompt),
    messages: normalizeMessagesForAPI(params.messages),
    tools: params.tools.map(tool => ({
      name: tool.name,
      description: tool.description,
      input_schema: tool.inputJSONSchema,
    })),
    // Beta features
    thinking: params.thinkingConfig,
    task_budget: params.taskBudget,
  }
}

Testing the Query System

TypeScript
// Using dependency injection
const mockDeps: QueryDeps = {
  streamMessages: mockStream,
  executeTool: mockToolExecution,
  checkPermissions: mockPermissionCheck,
}

const engine = new QueryEngine({
  ...config,
  deps: mockDeps,
})

// Test can now control all side effects
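
A runnable sketch of what such a test can look like. The `Deps` shape mirrors the three fields above, but its signatures, the toy `runTurn` function, and the `tool:<name>` chunk convention are all hypothetical simplifications.

```typescript
// Hypothetical dependency bundle with simplified signatures.
type Deps = {
  streamMessages: (prompt: string) => AsyncIterable<string>;
  executeTool: (name: string) => Promise<string>;
  checkPermissions: (name: string) => Promise<boolean>;
};

// Toy turn: a streamed chunk of the form "tool:<name>" is treated as a tool request.
async function runTurn(prompt: string, deps: Deps): Promise<string[]> {
  const log: string[] = [];
  for await (const chunk of deps.streamMessages(prompt)) {
    if (!chunk.startsWith("tool:")) {
      log.push(chunk);
      continue;
    }
    const name = chunk.slice("tool:".length);
    const allowed = await deps.checkPermissions(name);
    log.push(allowed ? await deps.executeTool(name) : `denied: ${name}`);
  }
  return log;
}
```

With a mock `checkPermissions` that resolves to `false`, a test can assert both that the log records the denial and that `executeTool` was never called.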

Key Concepts

1. Turn-Based

Each submitMessage() call is one "turn" from the user's point of view: a complete request/response cycle, which can span several API round trips when tools run.

2. Recursive for Tool Loops

The query function calls itself recursively whenever tool results have to be sent back to the model, and stops recursing once a response contains no tool uses.

3. Streaming is Primary

All responses stream - no blocking waits. UI updates incrementally.

4. Stateless Core

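QueryEngine holds state, but the query() function keeps none between calls: everything it needs arrives in params and everything it produces leaves as events. It still performs I/O (API calls, tool execution), so it is stateless rather than pure in the strict sense.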

5. Compaction is Automatic

Context window management happens transparently to the user.


Related Documentation

  • State Management - How query updates state
  • Tools System - How tools integrate with queries
  • Entrypoints - How queries are initiated