InkdownInkdown
Start writing

Edward

3 files·0 subfolders

Shared Workspace

Edward
Orchestration Layer

Orchestration Layer

Shared from "Edward" on Inkdown

Edward Orchestration Workflow - End to End

Simple Technical KT for Engineers


Table of Contents

  1. What is Orchestration in Edward?
  2. The 5-Minute Overview
  3. Step-by-Step Flow
  4. Deep Dive: Each Layer
  5. Key Files to Read
  6. Common Questions

1. What is Orchestration in Edward?

Orchestration = How Edward coordinates all the pieces to turn a user's chat message into working code.

Overview
Stream Continuation

Think of it like a conductor leading an orchestra:

  • Conductor = Orchestration layer
  • Musicians = LLM, Docker sandbox, file system, package manager, build tools
  • Music = The generated code

The orchestration layer makes sure everyone plays at the right time, in the right order.


2. The 5-Minute Overview

The Big Picture
Plain text
User sends message in chat
        ↓
API validates + admits the request
        ↓
Queues work to background worker
        ↓
Worker starts stream session
        ↓
LLM streams response (chunk by chunk)
        ↓
Parser reads chunks → produces events
        ↓
Events trigger actions (write files, install deps, run commands)
        ↓
Loop continues until done
        ↓
Finalize + save results
The 3 Main Phases
Plain text
┌─────────────────────────────────────────────────────────┐
│  PHASE 1: ADMISSION (API)                               │
│  - Validate user, check limits                          │
│  - Create run record                                    │
│  - Queue to worker                                      │
└─────────────────────────────────────────────────────────┘
                          ↓
┌─────────────────────────────────────────────────────────┐
│  PHASE 2: EXECUTION (Worker)                            │
│  - Stream from LLM                                      │
│  - Parse events                                         │
│  - Execute side effects (files, installs, commands)     │
│  - Multi-turn loop                                      │
└─────────────────────────────────────────────────────────┘
                          ↓
┌─────────────────────────────────────────────────────────┐
│  PHASE 3: FINALIZE (Worker)                             │
│  - Apply fixes                                          │
│  - Validate output                                      │
│  - Persist assistant message                            │
└─────────────────────────────────────────────────────────┘

3. Step-by-Step Flow

Step 1: User Sends Message

File: apps/web/stores/chatStream/useStartStream.ts

Plain text
User types: "Create a todo app with React"
        ↓
Frontend sends POST /chat/message
        ↓
Opens SSE (Server-Sent Events) connection for streaming

Step 2: API Admission

File: apps/api/services/runs/messageOrchestrator.service.ts

JavaScript
async function unifiedSendMessage(req, res) {
  // 1. Check system load
  const admissionWindow = await getRunAdmissionWindow();
  if (admissionWindow.overloaded) {
    return error("System busy, try again");
  }
  
  // 2. Load + decrypt user's API key
  const userData = await getUserWithApiKey(userId);
  const decryptedKey = decrypt(userData.apiKey);
  
  // 3. Validate model matches provider
  // (Can't use Gemini key with OpenAI model)
  
  // 4. Create or load chat
  const { chatId } = await getOrCreateChat(userId, body.chatId);
  
  // 5. Save user message to DB
  const userMessageId = await saveMessage(chatId, userId, "user", content);
  
  // 6. Create workflow (planning state machine)
  const workflow = await createWorkflow(userId, chatId, {
    userRequest: content,
    mode: "GENERATE",
  });
  
  // 7. Create run record
  const run = await createAdmittedRun({
    chatId,
    userId,
    userMessageId,
    metadata: { workflow, model, ... },
  });
  
  // 8. Enqueue to worker
  await enqueueAgentRunJob({ runId: run.id });
  
  // 9. Start streaming events to browser
  await streamRunEventsFromPersistence({ res, runId: run.id });
}

Key Points:

  • Admission control prevents overload (global + per-user + per-chat limits)
  • Run is persisted BEFORE execution (durable)
  • Browser gets runId immediately for tracking

Step 3: Worker Picks Up Job

File: apps/api/services/runs/agent-run-worker/processor.ts

JavaScript
async function processAgentRunJob(runId, publisher) {
  // 1. Load run from DB
  const run = await getRunById(runId);
  
  // 2. Check not already done/cancelled
  if (isTerminalRunStatus(run.status)) {
    return; // Already finished
  }
  
  // 3. Subscribe to cancel signal
  const cancelSub = createRedisClient();
  await cancelSub.subscribe(`edward:run-cancel:${runId}`);
  cancelSub.on("message", () => {
    workerAbort.abort(); // Stop everything
  });
  
  // 4. Load user's API key
  const userData = await getUserWithApiKey(run.userId);
  const decryptedKey = decrypt(userData.apiKey);
  
  // 5. Mark run as RUNNING
  await markRunRunningIfAdmissible(runId);
  
  // 6. Create fake HTTP response (captures events)
  const capturedRes = createRunEventCaptureResponse(async (event) => {
    await persistRunEvent(runId, event, publisher);
  });
  
  // 7. Execute stream session
  await runStreamSession({
    res: capturedRes,
    workflow: metadata.workflow,
    decryptedApiKey: decryptedKey,
    historyMessages,
    // ...params
  });
  
  // 8. Finalize (success or failure)
  await finalizeSuccessfulRun({...});
}

Key Points:

  • Worker is independent (can restart without losing progress)
  • Cancel signal via Redis pub/sub (fast)
  • Events persisted to DB as they happen (resumable)

Step 4: Stream Session Setup

File: apps/api/services/chat/session/orchestrator/runStreamSession.orchestrator.ts

JavaScript
async function runStreamSession(params) {
  // ─── PHASE 1: PREPARE ──────────────────────────────────
  
  // 1. Resolve framework
  let framework = await resolveFramework({
    workflow,
    userRequest: userTextContent,
  });
  // Checks: user explicitly requested? → existing sandbox?
  
  // 2. Build messages for LLM
  const { baseMessages } = await prepareBaseMessages({
    userTextContent,
    isFollowUp,
    historyMessages,      // Previous conversation
    projectContext,       // Current project state
  });
  
  // 3. Compose system prompt
  const systemPrompt = composePrompt({
    framework,            // Next.js, Vite, etc.
    complexity,           // simple/moderate/complex
    mode,                 // GENERATE/EDIT/FIX
    profile: COMPACT,
  });
  
  // 4. Check token budget
  const tokenUsage = await computeTokenUsage({
    apiKey: decryptedApiKey,
    systemPrompt,
    messages: baseMessages,
    model,
  });
  
  if (isOverContextLimit(tokenUsage)) {
    sendError("Context window exceeded");
    return;
  }
  
  // ─── PHASE 2: EXECUTE ──────────────────────────────────
  
  // 5. Run agent loop (multi-turn)
  const loopResult = await runAgentLoop({
    decryptedApiKey,
    initialMessages: baseMessages,
    systemPrompt,
    framework,
    abortController,
    generatedFiles: new Map(),
    declaredPackages: [],
    // ...params
  });
  
  // ─── PHASE 3: FINALIZE ─────────────────────────────────
  
  // 6. Apply deterministic fixes
  await applyDeterministicPostgenAutofixes({
    framework,
    generatedFiles,
    sandboxId: workflow.sandboxId,
  });
  
  // 7. Validate + maybe retry
  const retryResult = await maybeRunStrictPostgenRetry({
    violations: getBlockingPostgenViolations({...}),
    // ...params
  });
  
  // 8. Save assistant message to DB
  const finalized = await finalizeStreamSession({
    fullRawResponse: loopResult.fullRawResponse,
    generatedFiles,
    // ...params
  });
}

Key Points:

  • Framework resolved before LLM call (better prompts)
  • Token budget checked BEFORE calling LLM (fail fast)
  • Post-generation validation + autofix (quality control)

Step 5: Agent Loop (Multi-Turn)

File: apps/api/services/chat/session/loop/agentLoop.runner.ts

JavaScript
async function runAgentLoop(params) {
  let agentMessages = params.initialMessages;
  let agentTurn = 0;
  let fullRawResponse = "";
  let loopStopReason = null;
  
  // ─── MULTI-TURN LOOP ───────────────────────────────────
  
  while (agentTurn < MAX_AGENT_TURNS) {
    agentTurn += 1;
    
    // 1. Check token budget for this turn
    const turnTokenUsage = await computeTokenUsage({
      apiKey: decryptedApiKey,
      systemPrompt,
      messages: agentMessages,
      model,
    });
    
    if (isOverContextLimit(turnTokenUsage)) {
      loopStopReason = "CONTEXT_LIMIT_EXCEEDED";
      break;
    }
    
    // 2. Execute one turn (stream LLM + parse + act)
    const turnResult = await executeAgentTurnStream({
      decryptedApiKey,
      agentMessages,
      systemPrompt,
      abortController,
      turn: agentTurn,
      // ...params
    });
    
    fullRawResponse = turnResult.fullRawResponse;
    
    // 3. Decide: continue or stop?
    const outcome = await resolveTurnOutcome({
      agentTurn,
      turnRawResponse: turnResult.turnRawResponse,
      toolResultsThisTurn: turnResult.toolResultsThisTurn,
      budgetState: turnResult.budgetState,
      // ...params
    });
    
    if (outcome.action === "break") {
      loopStopReason = outcome.loopStopReason;
      break;
    }
    
    // 4. Continue with tool results as context
    agentMessages = outcome.agentMessages;
  }
  
  return {
    fullRawResponse,
    agentTurn,
    loopStopReason,
  };
}

Why Multiple Turns?

Plain text
Turn 1: LLM says "I'll create App.jsx"
        → Writes App.jsx
        → No <done> tag yet
        
Turn 2: LLM says "Now I'll add styles.css"
        → Writes styles.css
        → Still no <done>
        
Turn 3: LLM says "Done!"
        → Emits <edward_done />
        → Loop exits

Loop Continues When:

  • Tools were called but no file output yet
  • No <done> tag received
  • Under turn budget

Loop Stops When:

  • Code/file output detected
  • <done> tag received
  • Tool budget exceeded
  • Max turns reached
  • Client aborted

Step 6: Turn Execution (Stream + Parse)

File: apps/api/services/chat/session/loop/agentLoop.stream.ts

JavaScript
async function executeAgentTurnStream(params) {
  // 1. Create parser (state machine)
  const parser = createStreamParser();
  
  // 2. Stream from LLM
  const stream = streamResponse(
    params.decryptedApiKey,
    params.agentMessages,
    params.abortController.signal,
    params.systemPrompt,
    params.framework,
    params.model,
  );
  
  let turnRawResponse = "";
  const toolResultsThisTurn = [];
  
  // 3. Process chunks as they arrive
  for await (const chunk of stream) {
    if (params.abortController.signal.aborted) {
      break;
    }
    
    turnRawResponse += chunk;
    
    // 4. Parse chunk into events
    const events = parser.process(chunk);
    
    // 5. Handle each event (side effects)
    await processParserEvents({
      events,
      turnState,
      budgetState,
      toolResultsThisTurn,
      context: parserContext,
    });
    
    // 6. Check budgets
    if (hasAnyTurnBudgetExceeded(budgetState)) {
      break;
    }
  }
  
  return {
    fullRawResponse: turnRawResponse,
    toolResultsThisTurn,
    budgetState,
  };
}

Key Points:

  • Chunks processed as they arrive (not waiting for full response)
  • Parser converts raw text → structured events
  • Events trigger immediate side effects

Step 7: Parser State Machine

File: apps/api/lib/llm/parser.ts

JavaScript
function createStreamParser() {
  const context = {
    state: "TEXT",    // Current parsing state
    buffer: "",       // Accumulated text
  };
  
  function process(chunk) {
    context.buffer += chunk;
    
    let events = [];
    let iterations = 0;
    
    while (context.buffer.length > 0 && iterations < MAX_ITERATIONS) {
      handleState(events);  // Process based on current state
      iterations++;
    }
    
    return events;
  }
  
  function handleState(events) {
    switch (context.state) {
      case "TEXT":
        // Look for tags: <thinking>, <edward_sandbox>, <file>
        if (buffer.includes("<thinking>")) {
          context.state = "THINKING";
          events.push({ type: "THINKING_START" });
        }
        if (buffer.includes("<edward_sandbox>")) {
          context.state = "SANDBOX";
          events.push({ type: "SANDBOX_START" });
        }
        if (buffer.includes("<file path=")) {
          context.state = "FILE";
          const path = extractPath(buffer);
          events.push({ type: "FILE_START", path });
        }
        break;
        
      case "THINKING":
        // Accumulate thinking content
        if (buffer.includes("</thinking>")) {
          context.state = "TEXT";
          events.push({ type: "THINKING_END" });
        }
        break;
        
      case "FILE":
        // Accumulate file content
        if (buffer.includes("</file>")) {
          context.state = "TEXT";
          events.push({ type: "FILE_END" });
        }
        break;
        
      // ... other states
    }
  }
  
  return { process, flush };
}

Parser States:

StateTriggerExit
TEXTDefault<thinking>, <edward_sandbox>, <file>
THINKING<thinking></thinking>
SANDBOX<edward_sandbox></edward_sandbox>
FILE<file path="..."></file>
INSTALL<install></install>

Why State Machine?

  • Chunks can split tags across boundaries
  • Need to handle incomplete output safely
  • Can't just regex over full string

Step 8: Event Handler (Side Effects)

File: apps/api/services/chat/session/events/handler.ts

JavaScript
async function handleParserEvent(ctx, event) {
  switch (event.type) {
    
    case "SANDBOX_START":
      // Provision sandbox if needed
      if (!ctx.workflow.sandboxId) {
        await ensureSandbox(ctx.workflow);
      }
      break;
      
    case "FILE_START":
      // Prepare file in sandbox
      await prepareSandboxFile(ctx.workflow.sandboxId, event.path);
      ctx.currentFilePath = event.path;
      ctx.generatedFiles.set(event.path, "");
      break;
      
    case "FILE_CONTENT":
      // Buffer content to Redis
      await handleFileContent(
        ctx.workflow.sandboxId,
        ctx.currentFilePath,
        event.content,
        ctx.isFirstFileChunk,
      );
      ctx.generatedFiles.set(
        ctx.currentFilePath,
        ctx.generatedFiles.get(ctx.currentFilePath) + event.content
      );
      break;
      
    case "FILE_END":
      // Sanitize file (remove markdown fences)
      await sanitizeSandboxFile(ctx.workflow.sandboxId, ctx.currentFilePath);
      ctx.currentFilePath = undefined;
      break;
      
    case "SANDBOX_END":
      // Flush Redis buffers to container filesystem
      await flushSandbox(ctx.workflow.sandboxId);
      break;
      
    case "INSTALL_CONTENT":
      // Queue dependency install
      ctx.installTaskQueue.enqueue(async () => {
        await handleInstallContent(ctx, event.dependencies);
      });
      break;
      
    case "COMMAND":
      // Wait for installs, then run command
      await ctx.installTaskQueue?.waitForIdle();
      await handleCommandEvent(ctx, event.command, event.args);
      break;
      
    case "WEB_SEARCH":
      // Execute web search tool
      await handleWebSearchEvent(ctx, event.query, event.maxResults);
      break;
  }
}

Event Types:

EventAction
SANDBOX_STARTProvision Docker container
FILE_STARTPrepare file path
FILE_CONTENTBuffer to Redis
FILE_ENDSanitize file
SANDBOX_ENDFlush buffers to disk
INSTALL_CONTENTQueue npm install
COMMANDRun shell command
WEB_SEARCHSearch web

Step 9: Sandbox Write Flow (Buffered)

File: apps/api/services/sandbox/write/buffer.ts + flush.ts

Write (Buffered to Redis)
JavaScript
async function writeSandboxFile(sandboxId, filePath, content) {
  const bufferKey = `buffer:${sandboxId}:${filePath}`;
  const filesSetKey = `files:${sandboxId}`;
  
  // Append to Redis buffer
  const pipeline = redis.pipeline();
  pipeline.append(bufferKey, content);
  pipeline.sadd(filesSetKey, filePath);
  await pipeline.exec();
  
  // Schedule flush (happens shortly after)
  scheduleSandboxFlush(sandboxId);
}
Flush (Redis → Container)
JavaScript
async function flushSandbox(sandboxId) {
  // 1. Acquire distributed lock
  const handle = await acquireDistributedLock(`flush:${sandboxId}`);
  
  // 2. Get all buffered files
  const filePaths = await redis.smembers(`files:${sandboxId}`);
  
  // 3. Write each file to container
  for (const filePath of filePaths) {
    const content = await redis.get(`buffer:${sandboxId}:${filePath}`);
    
    // Docker exec: cat >> /app/path/to/file
    const exec = await container.exec({
      Cmd: ["sh", "-c", `cat >> '/app/${filePath}'`],
      AttachStdin: true,
    });
    
    const stream = await exec.start({ hijack: true });
    stream.write(content);
    stream.end();
    
    // Clean up buffer
    await redis.del(`buffer:${sandboxId}:${filePath}`);
  }
  
  // 4. Release lock
  await releaseDistributedLock(handle);
}

Why Buffer?

Plain text
Without buffering:
  Write chunk 1 → Docker exec
  Write chunk 2 → Docker exec
  Write chunk 3 → Docker exec
  (Slow, many round trips)

With buffering:
  Write chunk 1 → Redis (fast)
  Write chunk 2 → Redis (fast)
  Write chunk 3 → Redis (fast)
  Flush once → Docker exec
  (Fast, one round trip)

Benefits:

  • Resilient to partial failures
  • Can batch multiple writes
  • Can replay/repair on failure

Step 10: Install Task Queue

File: apps/api/services/chat/session/loop/agentLoop.runner.ts

JavaScript
// Create serialized install queue
let installQueueTail = Promise.resolve();
const installTaskQueue = {
  enqueue(task) {
    const queuedTask = installQueueTail.then(task, task);
    installQueueTail = queuedTask.catch(() => undefined);
  },
  async waitForIdle() {
    await installQueueTail;
  },
};

// Usage in event handler:
case "INSTALL_CONTENT":
  // Queue install (doesn't block)
  installTaskQueue.enqueue(async () => {
    await execCommand(sandboxId, "npm install react");
  });
  break;

case "COMMAND":
  // Wait for all installs before running command
  await installTaskQueue.waitForIdle();
  await execCommand(sandboxId, "npm run build");
  break;

Why Serialize Installs?

Plain text
Without serialization:
  npm install react    (concurrent)
  npm install lodash   (concurrent)
  → Race conditions, lock file conflicts

With serialization:
  npm install react    (wait...)
  npm install lodash   (wait...)
  → Clean, sequential installs

Step 11: Turn Outcome Decision

File: apps/api/services/chat/session/loop/agentLoop.turnOutcome.ts

JavaScript
async function resolveTurnOutcome(params) {
  // 1. Check budgets first
  if (toolBudgetExceededThisTurn) {
    return { action: "break", reason: "TOOL_BUDGET_EXCEEDED" };
  }
  
  // 2. Code output detected? → Stop (success)
  if (codeOutputDetected) {
    return { action: "break", reason: "DONE" };
  }
  
  // 3. Tools called but no code? → Continue
  if (toolResultsThisTurn.length > 0 && !codeOutputDetected) {
    const continuationPrompt = buildAgentContinuationPrompt(
      userContent,
      turnRawResponse,
      toolResultsThisTurn,
    );
    return { 
      action: "continue",
      agentMessages: [{ role: "user", content: continuationPrompt }]
    };
  }
  
  // 4. <done> tag detected? → Stop
  if (doneTagDetectedThisTurn) {
    return { action: "break", reason: "DONE" };
  }
  
  // 5. Conversational response? → Stop
  if (isConversationalReply) {
    return { action: "break", reason: "DONE" };
  }
  
  // 6. No output at all? → Nudge once
  if (noProgressContinuations < MAX_NO_PROGRESS_CONTINUATIONS) {
    const nudgePrompt = buildNoProgressContinuationPrompt();
    return { 
      action: "continue",
      agentMessages: [{ role: "user", content: nudgePrompt }]
    };
  }
  
  // 7. Default: stop
  return { action: "break", reason: "NO_TOOL_RESULTS" };
}

Decision Tree:

Plain text
                    ┌─────────────────┐
                    │ Turn Complete   │
                    └────────┬────────┘
                             │
                    ┌────────▼────────┐
                    │ Budget Exceeded?│──Yes──→ BREAK
                    └────────┬────────┘
                             │ No
                    ┌────────▼────────┐
                    │ Code Output?    │──Yes──→ BREAK
                    └────────┬────────┘
                             │ No
                    ┌────────▼────────┐
                    │ Tools Called?   │──Yes──→ CONTINUE (with results)
                    └────────┬────────┘
                             │ No
                    ┌────────▼────────┐
                    │ <done> Tag?     │──Yes──→ BREAK
                    └────────┬────────┘
                             │ No
                    ┌────────▼────────┐
                    │ Conversational? │──Yes──→ BREAK
                    └────────┬────────┘
                             │ No
                    ┌────────▼────────┐
                    │ Can Nudge?      │──Yes──→ CONTINUE (nudge)
                    └────────┬────────┘
                             │ No
                             ↓
                          BREAK

Step 12: Finalize

File: apps/api/services/chat/session/orchestrator/runStreamSession.finalize.ts

JavaScript
async function finalizeStreamSession(params) {
  // 1. Build assistant message content
  const assistantContent = buildAssistantMessageContent({
    fullRawResponse: params.fullRawResponse,
    generatedFiles: params.generatedFiles,
    declaredPackages: params.declaredPackages,
  });
  
  // 2. Save to DB
  await saveMessage(
    params.chatId,
    params.userId,
    "assistant",
    assistantContent,
  );
  
  // 3. Emit final meta event
  emitMeta({
    phase: "SESSION_COMPLETE",
    outputTokens: params.outputTokens,
    duration: Date.now() - params.messageStartTime,
  });
  
  // 4. Return stored content
  return { storedAssistantContent: assistantContent };
}

4. Deep Dive: Each Layer

Layer 1: Message Orchestrator

Purpose: Admission control + queue + stream handoff

Key Functions:

  • unifiedSendMessage() - Entry point
  • createAdmittedRun() - Create run with limits
  • enqueueAdmittedRun() - Queue to worker
  • streamRunEventsFromPersistence() - SSE to browser

What Could Go Wrong:

  • API key decryption fails
  • Model/provider mismatch
  • Run admission rejected (limits)
  • Queue enqueue fails

Layer 2: Stream Session

Purpose: Framework resolve + message prep + token budget + finalize

Key Functions:

  • resolveFramework() - Detect/prefer framework
  • prepareBaseMessages() - Build LLM context
  • composePrompt() - System prompt
  • computeTokenUsage() - Budget check
  • finalizeStreamSession() - Persist results

What Could Go Wrong:

  • Context limit exceeded
  • Framework detection fails
  • Finalize persistence fails

Layer 3: Agent Loop

Purpose: Multi-turn execution + outcome decisions

Key Functions:

  • runAgentLoop() - Main loop
  • executeAgentTurnStream() - Single turn
  • resolveTurnOutcome() - Continue/stop decision

What Could Go Wrong:

  • Turn budget exceeded
  • Max turns reached
  • Abort signal received
  • Continuation prompt fails

Layer 4: Parser

Purpose: Chunk → event conversion

Key Functions:

  • createStreamParser() - State machine
  • process() - Parse chunk
  • flush() - Handle incomplete output

What Could Go Wrong:

  • Tag split across chunks
  • Incomplete output
  • State machine stuck

Layer 5: Event Handler

Purpose: Side effect execution

Key Functions:

  • handleParserEvent() - Dispatch by type
  • handleFileContent() - Buffer writes
  • handleInstallContent() - Queue installs
  • handleCommandEvent() - Run commands

What Could Go Wrong:

  • Sandbox not provisioned
  • File write fails
  • Install conflicts
  • Command timeout

Layer 6: Sandbox Write

Purpose: Buffered writes to container

Key Functions:

  • writeSandboxFile() - Buffer to Redis
  • flushSandbox() - Redis → container
  • scheduleSandboxFlush() - Debounced flush

What Could Go Wrong:

  • Redis unavailable
  • Docker exec fails
  • Lock acquisition fails
  • Container stopped

5. Key Files to Read

Core Orchestration
FilePurpose
apps/api/services/runs/messageOrchestrator.service.tsEntry point
apps/api/services/runs/agent-run-worker/processor.tsWorker execution
apps/api/services/chat/session/orchestrator/runStreamSession.orchestrator.tsStream session
apps/api/services/chat/session/loop/agentLoop.runner.tsAgent loop
apps/api/services/chat/session/loop/agentLoop.stream.tsTurn execution
Parser + Events
FilePurpose
apps/api/lib/llm/parser.tsState machine parser
apps/api/services/chat/session/events/handler.tsEvent side effects
apps/api/services/chat/session/loop/events.tsEvent processing
apps/api/services/chat/session/loop/agentLoop.turnOutcome.tsContinue/stop logic
Sandbox
FilePurpose
apps/api/services/sandbox/write/buffer.tsRedis buffering
apps/api/services/sandbox/write/flush.tsFlush to container
apps/api/services/sandbox/write/flush.scheduler.tsDebounced flush
apps/api/services/chat/file.handlers.tsFile content handling

6. Common Questions

Q: Why multi-turn loop instead of one LLM call?

A: Complex tasks need multiple steps:

Plain text
One call approach:
  User: "Build a todo app"
  LLM: [tries to output everything at once]
  → Context overflow, messy output

Multi-turn approach:
  Turn 1: Create App.jsx
  Turn 2: Create styles.css
  Turn 3: Create utils.js
  Turn 4: <done>
  → Clean, bounded, verifiable

Q: Why buffer writes to Redis instead of writing directly?

A: Three reasons:

  1. Resilience: If Docker fails, buffer survives in Redis
  2. Performance: One flush vs many small writes
  3. Batching: Multiple chunks → one file write

Q: Why serialize installs?

A: Prevent race conditions:

Plain text
Concurrent installs:
  npm install react &
  npm install lodash &
  → package-lock.json conflicts
  → Corrupted node_modules

Serialized installs:
  npm install react (wait)
  npm install lodash (wait)
  → Clean state

Q: How does cancellation work?

A: Two mechanisms:

  1. Redis pub/sub (fast):

    Plain text
    Browser → POST /cancel → Redis publish
    Worker subscribes → receives signal → abort
  2. DB polling (backup):

    Plain text
    Worker polls run.status every N seconds
    If status = CANCELLED → abort

Q: What happens if worker crashes mid-turn?

A: Checkpoint system allows resume:

Plain text
Turn 1: Complete ✓ (checkpoint saved)
Turn 2: Worker crashes ✗
        ↓
Worker restarts → loads checkpoint → resumes from Turn 3

Q: How are tokens budgeted?

A: Multiple levels:

Plain text
Level 1: Context window
  - computeTokenUsage() before LLM call
  - Fail if over limit

Level 2: Turn tool budget
  - MAX_AGENT_TOOL_CALLS_PER_TURN (e.g., 5)
  - Break turn if exceeded

Level 3: Run tool budget
  - MAX_AGENT_TOOL_CALLS_PER_RUN (e.g., 20)
  - Break run if exceeded

Level 4: Turn count
  - MAX_AGENT_TURNS (e.g., 10)
  - Break loop if exceeded

Q: How does framework detection work?

A: Three sources:

Plain text
1. User explicit request:
   "Create a Next.js app" → framework = "nextjs"

2. Existing sandbox:
   Sandbox has package.json with "next" → framework = "nextjs"

3. Workflow inference:
   Planning workflow suggests framework based on request

Q: What's the difference between run_event and message?

A: Different purposes:

Plain text
message table:
  - User/assistant conversation history
  - Final output visible to user
  - Queried for chat UI

run_event table:
  - Execution trace (stream events)
  - Used for replay/resume
  - Debugging + audit trail

7. Debugging Guide

Trace a Turn
Plain text
1. Check worker logs for turn start
   → "Agent turn 1 started"

2. Check LLM stream chunks
   → Chunk 1, Chunk 2, ...

3. Check parser events
   → FILE_START, FILE_CONTENT, FILE_END

4. Check event handler
   → "Writing file: App.jsx"

5. Check sandbox flush
   → "Flushed 3 files to sandbox"

6. Check turn outcome
   → "Turn 1 complete: codeOutputDetected=true"
Common Failures
SymptomLikely CauseFix
Context limit exceededToo much historyTruncate context
Tool budget exceededToo many tool callsReduce per-turn limit
Turn stuck in loopNo code output detectedCheck parser, tags
Files not writtenFlush failedCheck Redis, Docker
Install conflictsConcurrent installsCheck queue serialization

8. Summary

The Orchestration Flow in One Diagram
Plain text
┌─────────────────────────────────────────────────────────────┐
│  USER sends message                                         │
└────────────────────┬────────────────────────────────────────┘
                     │
                     ▼
┌─────────────────────────────────────────────────────────────┐
│  API: Admission Control                                     │
│  - Validate, check limits, create run, enqueue              │
└────────────────────┬────────────────────────────────────────┘
                     │
                     ▼
┌─────────────────────────────────────────────────────────────┐
│  WORKER: Process Agent Run                                  │
│  - Load context, subscribe to cancel, mark running          │
└────────────────────┬────────────────────────────────────────┘
                     │
                     ▼
┌─────────────────────────────────────────────────────────────┐
│  STREAM SESSION: Setup                                      │
│  - Resolve framework, prepare messages, check tokens        │
└────────────────────┬────────────────────────────────────────┘
                     │
                     ▼
┌─────────────────────────────────────────────────────────────┐
│  AGENT LOOP: Multi-Turn Execution                           │
│  ┌───────────────────────────────────────────────────────┐  │
│  │  Turn 1: Stream → Parse → Execute → Decide            │  │
│  │  Turn 2: Stream → Parse → Execute → Decide            │  │
│  │  Turn 3: Stream → Parse → Execute → Decide            │  │
│  │  ... continue until done ...                          │  │
│  └───────────────────────────────────────────────────────┘  │
└────────────────────┬────────────────────────────────────────┘
                     │
                     ▼
┌─────────────────────────────────────────────────────────────┐
│  POST-GENERATION: Validate + Fix                            │
│  - Apply autofixes, validate, maybe retry                   │
└────────────────────┬────────────────────────────────────────┘
                     │
                     ▼
┌─────────────────────────────────────────────────────────────┐
│  FINALIZE: Persist Results                                  │
│  - Save assistant message, emit metrics                     │
└─────────────────────────────────────────────────────────────┘
Key Takeaways
  1. Orchestration is layered - Each layer has a clear responsibility
  2. Multi-turn is essential - Complex tasks need iteration
  3. Buffering matters - Redis buffers make writes resilient
  4. Budgets prevent runaway - Token, tool, turn limits
  5. Events are durable - Persisted for replay/resume
  6. Cancellation is dual - Pub/sub + DB polling

End of Document