Shared from "Edward" on Inkdown

Edward KT

1. What Edward actually is

Edward is an AI-assisted frontend application builder.

At the product level, the user experience is:

  1. A user signs in with GitHub.
  2. The user stores their own LLM API key inside the product.
  3. The user asks Edward to create, edit, or fix a web app in chat.
  4. Edward plans the work, streams progress, writes files into an isolated sandbox, installs dependencies, runs commands, builds the app, uploads preview assets, and optionally syncs the code to GitHub.
  5. The user sees the generated app, file tree, terminal/build feedback, and live preview.

The system is intentionally split across:

  • apps/web: the product UI and auth surface
  • apps/api: the synchronous HTTP API, SSE delivery, orchestration, and infrastructure adapters
  • apps/api/queue.worker.ts: the asynchronous worker process for long-running build/run jobs
  • packages/auth: auth + database schema + database access
  • packages/shared: shared contracts, enums, stream event types, chat types, model metadata
  • packages/ui: shared UI primitives for the frontend
  • packages/octokit: shared GitHub sync client helpers

This split is not accidental. Edward is not just “a chat app”. It is a distributed system with:

    • durable state
    • transient coordination state
    • long-running execution
    • resumable streaming
    • isolated code execution
    • artifact publishing
    • external system integration

    That is the fundamental lens juniors should use when reading this repo.


    2. The highest-level architectural decisions and why they exist

    2.1 Why a monorepo

    Edward uses a pnpm + Turbo monorepo because the frontend, API, worker, DB schema, shared contracts, and UI primitives evolve together.

    Why this was chosen instead of separate repos:

    • Shared contracts are first-class. Stream events, chat types, API response shapes, model catalogs, and rate-limit scopes must not drift.
    • Frontend and backend changes usually land together. A single PR often changes stream event shape, API behavior, and UI rendering.
    • Tooling is simpler. One workspace gives consistent TypeScript, ESLint, and build orchestration.
    • Local development is more realistic. pnpm dev starts the whole system shape, not disconnected fragments.

    Tradeoff:

    • Monorepos become large and noisy.
    • Build graph discipline is required.

    This repo counters that with:

    • separate packages for shared concerns
    • Turbo task boundaries
    • architecture boundary checks in apps/api/scripts/quality

    2.2 Why Next.js on the frontend

    apps/web is a Next.js 16 App Router app.

    Why Next.js instead of plain Vite for the product UI:

    • server-rendered marketing/auth/changelog pages are easier
    • metadata, sitemap, robots, SEO handling are built in
    • auth route handling integrates cleanly with Better Auth
    • product pages and landing pages can live in the same app

    Important nuance:

    Edward generates Vite/Next/Vanilla projects for users, but the Edward product itself uses Next.js. Those are different concerns.

    2.3 Why Express for the API instead of putting everything inside Next route handlers

    The repo deliberately keeps orchestration in apps/api as a separate Express app.

    Why:

    • stream-oriented chat delivery is easier to reason about in a dedicated API
    • queue worker and API share backend services cleanly
    • long-running orchestration should not be tightly coupled to the frontend deployment shape
    • HTTP API can scale independently from the web UI
    • operational concerns like graceful shutdown, Redis connections, worker coordination, and SSE backpressure are clearer in a dedicated server

    Why not “just use Next API routes”:

    • the runtime model becomes mixed and harder to reason about
    • long-running SSE and worker orchestration become more awkward
    • explicit process separation is valuable here

    2.4 Why a dedicated worker process

    The API process accepts requests. The worker process executes long-running jobs.

    Why this is the correct split:

    • user HTTP connections are short-lived and unreliable
    • LLM execution, sandbox interaction, dependency install, build, and artifact publishing can outlive the browser connection
    • workers give retries, concurrency control, and isolation from request latency
    • the system can continue work even if the client disconnects

    This is one of the most important architectural choices in the whole repo.

    Without it, Edward would behave like a fragile synchronous demo.

    With it, Edward behaves like a real product with durable execution.

    2.5 Why Postgres

    Postgres is the durable source of truth.

    It stores:

    • users, sessions, accounts
    • chats and messages
    • runs and run events
    • builds
    • attachments
    • repo bindings and related product state

    Why Postgres instead of Redis-only or document storage:

    • chat history and run history are durable product data, not cache
    • run events need ordering and replay
    • auth tables fit naturally into a relational model
    • ownership and filtering by user/chat/run are frequent and relational
    • transactional admission control is easier and safer

    The strongest proof is createRunWithUserLimit in packages/auth/lib/run.ts. It uses DB transactions and advisory locks for safe concurrent run admission.

    2.6 Why Redis

    Redis appears throughout the backend, but only for fast, ephemeral, coordination-heavy workloads.

    Redis responsibilities:

    • BullMQ queue backend
    • distributed locks
    • sandbox state and TTLs
    • write buffers for streamed file output
    • pub/sub for run cancellation and live status fanout
    • request rate-limit stores

    Why Redis instead of doing all of this in Postgres:

    • queues and pub/sub need low-latency operational semantics
    • lock contention and TTL semantics are simpler in Redis
    • sandbox write buffering is a classic ephemeral buffering problem
    • rate limiting is a cache-like, time-windowed workload

    This is a good architecture split:

    • Postgres = durable truth
    • Redis = fast coordination and ephemeral runtime state

    2.7 Why BullMQ

    BullMQ is the job orchestration layer on top of Redis.

    Why BullMQ:

    • already fits the Redis runtime Edward needs
    • supports worker concurrency and queue separation
    • familiar operational model for TypeScript/Node systems
    • good fit for “enqueue durable work from API, execute elsewhere”

    Why not build a custom queue:

    • queue correctness is hard
    • retries, visibility, backpressure, and operational observability are all non-trivial

    Edward keeps two main job categories:

    • build jobs
    • agent run jobs

    That separation matters because build workloads and agent-stream workloads have different runtime behavior and failure modes.

    2.8 Why Docker-backed sandboxes

    This is the most product-defining infrastructure choice.

    Edward writes and executes generated code in Docker containers.

    Why:

    • isolation from the host machine
    • deterministic environment for install/build/command execution
    • safer command execution boundary
    • framework-specific template images can be prewarmed
    • easier cleanup and lifecycle control

    Why not run code directly on the API machine filesystem:

    • far less safe
    • dependency conflicts become unmanageable
    • cleanup is harder
    • one broken project can poison the environment for others

    The sandbox is not just an implementation detail. It is the product’s execution boundary.

    2.9 Why S3 + CloudFront + optional Cloudflare KV

    Edward separates “code generation” from “preview hosting”.

    • S3 stores preview artifacts
    • CloudFront serves previews and can be invalidated
    • Cloudflare KV optionally maps subdomains to preview storage prefixes

    Why this is better than serving builds directly from live containers:

    • static previews are cheaper and more stable
    • finished artifacts survive sandbox lifecycle cleanup
    • containers can be ephemeral while previews remain available
    • preview serving and code execution are decoupled

    This is a strong architecture choice because it turns preview hosting into a static asset problem instead of a container serving problem.

    2.10 Why GitHub OAuth + GitHub sync

    GitHub is used both for user authentication and for repository integration.

    Why GitHub auth:

    • Edward’s target users already live in GitHub-centric workflows
    • repo sync is a core feature, so GitHub identity is a natural anchor
    • one provider reduces auth complexity

    Why sync through GitHub APIs instead of shelling out to git in the sandbox:

    • less credential management complexity inside containers
    • direct tree/blob APIs are deterministic and auditable
    • easier to manage partial file sync and manifests

    2.11 Why bring-your-own API keys

    Users store their own OpenAI/Gemini/Anthropic API keys.

    Why:

    • cost ownership stays with the user
    • provider choice stays with the user
    • product does not need to centralize all model billing risk
    • enterprise-style customers often prefer controlling their own provider credentials

    Why encrypted at rest:

    • these are real credentials, not preferences
    • API keys must not sit in plaintext in Postgres

    This is why apps/api/utils/encryption.ts and apps/api/utils/secretEnvelope.ts exist.

    2.12 Why SSE instead of WebSockets for streaming

    Edward streams progress to the browser using Server-Sent Events.

    Why SSE:

    • server-to-client streaming is the actual need
    • simpler than full bidirectional socket infra
    • reconnect semantics are straightforward
    • Last-Event-ID replay model fits persisted run events well

    Why not WebSockets:

    • more operational complexity
    • bidirectional realtime control is not the dominant requirement
    • durability + replay matter more than low-level socket interactivity
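
The reconnect model can be sketched in a few lines. This is a minimal illustration, not Edward's actual code: the RunEvent shape and both function names are hypothetical, and the real event contracts live in packages/shared. It shows how a persisted event might be serialized as an SSE frame whose id the browser echoes back as Last-Event-ID.

```typescript
// Hypothetical event shape for illustration only.
interface RunEvent {
  seq: number; // monotonically increasing per run
  type: string; // e.g. "text", "file", "done"
  payload: unknown; // JSON-serializable body
}

// Serialize one event as a Server-Sent Events frame.
// The "id:" field is what the browser sends back as Last-Event-ID on reconnect.
function formatSseFrame(event: RunEvent): string {
  return `id: ${event.seq}\nevent: ${event.type}\ndata: ${JSON.stringify(event.payload)}\n\n`;
}

// Turn the Last-Event-ID header into a sequence number.
// Missing or malformed values fall back to 0, meaning "replay from the start".
function parseLastEventId(header: string | undefined): number {
  const n = Number.parseInt(header ?? "", 10);
  return Number.isFinite(n) && n >= 0 ? n : 0;
}
```

Because the id is just the persisted sequence number, resuming a stream reduces to "send everything with a higher sequence number", which a WebSocket setup would have to reinvent by hand.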

    2.13 Why persist stream events

    This is one of the smartest choices in the repo.

    Edward does not treat stream events as disposable transport. It stores them in run_event.

    Why:

    • browser reconnects can resume streams
    • active run pages can rehydrate state after refresh
    • debugging becomes far easier
    • a run has a durable audit trail
    • the stream is no longer coupled to one live TCP connection

    This is the reason the system feels durable instead of “best effort”.


    3. Runtime topology

    At runtime, think in terms of five planes.

    3.1 Presentation plane
    • apps/web
    • Next.js pages, route handlers, React state, chat workspace, sandbox UI

    3.2 Delivery/API plane
    • apps/api/server.http.ts
    • Express routes/controllers/middleware
    • SSE response management

    3.3 Execution plane
    • apps/api/queue.worker.ts
    • BullMQ workers
    • agent runs, builds, backups

    3.4 Durable data plane
    • Postgres via Drizzle in packages/auth

    3.5 Ephemeral coordination plane
    • Redis
    • locks
    • queue backend
    • pub/sub
    • sandbox state
    • rate limiting

    The system is healthy when these planes are kept conceptually separate.


    4. Repo walkthrough

    4.1 Root

    Important root files:

    • package.json: workspace scripts and security overrides
    • pnpm-workspace.yaml: workspace package boundaries
    • turbo.json: task graph and env propagation
    • README.md: product/dev bootstrap overview
    • eslint.config.mjs: root lint baseline
    • tsconfig.json: root TS setup
    • scripts/build-local-sandboxes.sh: builds local sandbox images

    Why root-level task discipline matters:

    • the repo has multiple deployable units
    • env propagation is explicit
    • broken build assumptions must fail early

    4.2 apps/web

    What it owns:

    • landing page
    • auth session bootstrapping
    • chat workspace
    • file/editor/preview UI
    • changelog UI
    • browser-side stream orchestration

    Most important files:

    • app/layout.tsx: global metadata, fonts, providers, navbar shell
    • app/providers.tsx: React Query, theme, notifications, chat stream provider
    • app/page.tsx: landing page entry
    • app/chat/[id]/page.tsx: server-side access probe + metadata generation for chat pages
    • components/chat/chatPageClient.tsx: client orchestration for a chat route
    • components/chat/chatWorkspace.tsx: core desktop/mobile workspace composition
    • stores/chatStream/controller.ts: start/resume/cancel stream orchestration
    • stores/chatStream/useStartStream.ts: new message stream mutation flow
    • lib/streaming/processors/chatStreamProcessor.ts: SSE event consumption into UI state
    • stores/sandbox/*: file/build/terminal UI state
    • hooks/server-state/*: React Query data fetching hooks
    • lib/api/*: API client surface

    Why this frontend is split between React Query and Zustand:

    • React Query handles server state: chat history, metadata, active runs, quotas
    • Zustand handles high-churn UI runtime state: live streaming text, file chunks, open panel state, build errors, terminal lines

    That is the right split.

    If streaming state lived fully inside React Query, it would be awkward and too mutation-heavy. If server state lived fully inside Zustand, cache invalidation and stale-fetch logic would get worse.

    4.3 apps/api

    What it owns:

    • request authentication and validation
    • chat/run/build endpoints
    • run orchestration
    • planning workflow
    • LLM abstraction
    • parser and tool event handling
    • sandbox orchestration
    • GitHub sync orchestration
    • preview routing

    Composition roots:

    • server.http.ts: API process bootstrap
    • queue.worker.ts: worker bootstrap

    Important structure:

    • routes/: HTTP surface
    • controllers/: transport and response wiring
    • services/: application + infra orchestration
    • lib/: adapters/clients/shared helpers
    • middleware/: auth, rate limit, validation, telemetry
    • schemas/: Zod request contracts
    • tests/: mostly API-side tests, mirrored by module

    4.4 packages/auth

    What it owns:

    • Better Auth instance
    • Drizzle/Postgres connection
    • database schema
    • basic data access helpers for runs/builds
    • migrations

    Why it is a separate package:

    • both web and API need auth/schema awareness
    • keeping schema in API app only would over-couple layers

    4.5 packages/shared

    What it owns:

    • stream event contracts
    • chat UI types
    • API contract types
    • model catalog and provider detection
    • rate-limit scopes/policies
    • shared parsing helpers

    This package is the anti-drift package.

    If this package did not exist, frontend/backend stream contracts would break constantly.

    4.6 packages/ui

    What it owns:

    • reusable UI building blocks
    • shared styling artifacts
    • navigation, skeletons, toasts, hooks, utilities

    Why it exists:

    • product UI should reuse primitives instead of re-implementing them
    • keeps apps/web focused on product behavior, not low-level component plumbing

    4.7 packages/octokit

    What it owns:

    • GitHub API client creation
    • repo/branch creation helpers
    • manifest-based sync

    Why separate:

    • GitHub sync logic is infrastructure-adjacent and reusable
    • keeping raw Octokit usage isolated reduces API surface spread

    5. Data model and why each table exists

    The schema in packages/auth/lib/schema.ts is the durable model of the product.

    5.1 Auth tables
    • user
    • session
    • account
    • verification

    Why:

    • Better Auth expects these concepts
    • GitHub OAuth identity and session state need durable storage
    • Edward also stores product-specific fields on user, especially apiKey and preferredModel

    5.2 Product tables
    • chat: the top-level project/conversation container
    • message: user/assistant message history
    • attachment: images attached to messages
    • build: build lifecycle records
    • run: durable execution record for one assistant generation flow
    • run_event: persisted stream events
    • run_tool_call: durable tool invocation records

    5.3 Why chat

    chat is not just a thread id. It is the unit of project identity.

    It owns:

    • title/description
    • SEO fields
    • GitHub repo binding
    • custom subdomain

    Why not put these elsewhere:

    • the project is anchored to the conversation
    • preview and repo sync are project-level concerns

    5.4 Why message

    Messages are user-facing history.

    Important detail:

    • assistant output is persisted as a normal message
    • run events are not a replacement for message history

    Why both message and run_event exist:

    • message = final conversational artifact
    • run_event = execution trace / replay log

    That distinction is important.

    5.5 Why run

    run exists because one assistant generation is not a simple request-response.

    A run has:

    • queue state
    • current state machine position
    • current turn
    • termination reason
    • metadata
    • linkage to user and assistant messages

    Why not derive all of this from messages:

    • messages do not capture execution status, retries, checkpoints, cancellation, or turn-level progress

    5.6 Why run_event

    run_event is the stream replay ledger.

    Why this table is critical:

    • live SSE can reconnect from a sequence number
    • past run behavior can be debugged
    • session completion can be inferred from persisted events
    • worker/API/browser all get a shared truth
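
A replay ledger like this can be modeled with two small queries. The sketch below uses an in-memory array in place of the run_event table; the StoredEvent fields and the terminal event names ("done", "error") are assumptions for illustration, not the real schema.

```typescript
// Hypothetical row shape; the real columns are defined in packages/auth/lib/schema.ts.
interface StoredEvent {
  runId: string;
  seq: number;
  type: string;
}

// Events the client has not seen yet, in sequence order.
function eventsAfter(log: StoredEvent[], runId: string, lastSeenSeq: number): StoredEvent[] {
  return log
    .filter((e) => e.runId === runId && e.seq > lastSeenSeq)
    .sort((a, b) => a.seq - b.seq);
}

// Assumed names for terminal events.
const TERMINAL_TYPES = new Set(["done", "error"]);

// A run counts as complete once any terminal event has been persisted for it.
function isRunComplete(log: StoredEvent[], runId: string): boolean {
  return log.some((e) => e.runId === runId && TERMINAL_TYPES.has(e.type));
}
```

In a database-backed version the same two questions would become an ordered range query on (run id, sequence) and an existence check.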

    5.7 Why run_tool_call

    This is the durability and idempotency guard for tools.

    Why not only emit tool output into run_event:

    • tools have inputs, outputs, duration, status, and idempotency semantics
    • tool calls are not just display events; they are execution records

    5.8 Why build

    Preview build lifecycle is separate from generation lifecycle.

    That is the correct model because:

    • generation can succeed while build fails
    • build status must be independently queryable and streamable
    • previews need their own duration/error metadata

    5.9 Why attachments are separate

    Attachments are not embedded directly inside messages because:

    • metadata is structured
    • message text and binary/media references are different concerns
    • image uploads have different constraints and lifecycle

    6. The most important end-to-end flow: send a chat message

    This is the core product path.

    6.1 Browser starts the stream

    Frontend entry:

    • apps/web/stores/chatStream/useStartStream.ts
    • apps/web/lib/api/chat.ts

    The browser:

    1. acquires a submission lock
    2. creates optimistic UI state
    3. calls POST /chat/message
    4. begins consuming SSE frames

    Why the submission lock exists:

    • to prevent accidental duplicate sends
    • to avoid overlapping UI-side submissions before the server admits a run
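
A client-side submission lock can be very small. The class below is a hypothetical sketch, not the implementation in useStartStream.ts; it assumes locking is keyed per chat.

```typescript
// Per-chat submission lock: at most one in-flight send per chat id.
class SubmissionLock {
  private held = new Set<string>();

  // Returns true if the caller now owns the lock, false if a send is already in flight.
  tryAcquire(chatId: string): boolean {
    if (this.held.has(chatId)) return false;
    this.held.add(chatId);
    return true;
  }

  // Call once the server has admitted or rejected the run.
  release(chatId: string): void {
    this.held.delete(chatId);
  }
}
```

A double click on "send" would then hit tryAcquire twice, and only the first call proceeds to POST /chat/message.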

    6.2 API authenticates and validates

    Backend route:

    • apps/api/routes/chat.routes.ts

    Middleware:

    • auth
    • rate limit
    • request validation

    Why this layering exists:

    • fail cheap and early
    • keep orchestration code free from repeated input checks

    6.3 API resolves model and user credentials

    In unifiedSendMessage:

    • user API key is loaded
    • key is decrypted
    • provider is inferred
    • chosen model is checked against provider

    Why this validation exists:

    • a stored Gemini key with an OpenAI model choice is an avoidable operator error
    • fail-fast here is much cleaner than failing deep in LLM execution
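
The fail-fast check can be illustrated as follows. The key and model prefixes here are assumptions chosen for the example; the real provider detection and model catalog live in packages/shared, and all function names are hypothetical.

```typescript
type Provider = "openai" | "anthropic" | "gemini";

// Guess the provider from the key prefix (assumed prefixes).
function inferProviderFromKey(apiKey: string): Provider | null {
  if (apiKey.startsWith("sk-ant-")) return "anthropic"; // must be checked before the generic "sk-"
  if (apiKey.startsWith("sk-")) return "openai";
  if (apiKey.startsWith("AIza")) return "gemini";
  return null;
}

// Guess the provider from the model name (assumed naming scheme).
function providerForModel(model: string): Provider | null {
  if (model.startsWith("gpt-")) return "openai";
  if (model.startsWith("claude-")) return "anthropic";
  if (model.startsWith("gemini-")) return "gemini";
  return null;
}

// Reject mismatches before any run is created or any LLM call is made.
function assertModelMatchesKey(apiKey: string, model: string): void {
  const keyProvider = inferProviderFromKey(apiKey);
  const modelProvider = providerForModel(model);
  if (keyProvider === null) throw new Error("Unrecognized API key format");
  if (modelProvider !== keyProvider) {
    throw new Error(`Model "${model}" does not belong to provider "${keyProvider}"`);
  }
}
```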

    6.4 API creates or loads chat + persists the user message

    This happens via chat.service.ts.

    Why persist before execution:

    • user intent must be durable even if downstream execution fails
    • history should not disappear because a worker crashed

    6.5 API runs planning workflow

    Planning is not the same as generation.

    Workflow engine:

    • services/planning/workflow/engine.ts

    Main early phases:

    • analyze intent
    • resolve packages
    • install packages
    • build/deploy/recover as needed

    Why have a workflow at all:

    • generation quality improves when framework/packages/intent are normalized first
    • retries and step-level state become explicit
    • the system can reason in phases instead of one giant black-box prompt

    6.6 API creates an admitted durable run

    Run admission:

    • services/runs/runAdmission.service.ts
    • packages/auth/lib/run.ts

    Why run admission matters:

    • the product must limit active execution globally, per user, and per chat
    • one chat should not have multiple conflicting active generations
    • the API must reject overload before enqueuing dangerous work

    The use of transactional advisory locks here is a sign of mature concurrency thinking.
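
The admission decision itself is a pure function once the active counts are known. The sketch below shows only that decision; the type names, limit names, and rejection reasons are hypothetical. In a database-backed version, counting and inserting the run would happen inside one transaction guarded by an advisory lock (for example Postgres's pg_advisory_xact_lock) so two concurrent requests cannot both pass the check.

```typescript
interface ActiveCounts {
  global: number;
  forUser: number;
  forChat: number;
}

interface AdmissionLimits {
  maxGlobal: number;
  maxPerUser: number;
  maxPerChat: number;
}

type Admission =
  | { admitted: true }
  | { admitted: false; reason: "global_limit" | "user_limit" | "chat_limit" };

// Decide whether one more run may start, given the currently active runs.
function decideAdmission(counts: ActiveCounts, limits: AdmissionLimits): Admission {
  if (counts.global >= limits.maxGlobal) return { admitted: false, reason: "global_limit" };
  if (counts.forUser >= limits.maxPerUser) return { admitted: false, reason: "user_limit" };
  if (counts.forChat >= limits.maxPerChat) return { admitted: false, reason: "chat_limit" };
  return { admitted: true };
}
```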

    6.7 API enqueues worker job

    After the run is admitted, the API enqueues the job in BullMQ.

    Why queue after durable DB write:

    • DB becomes the source of truth
    • if enqueue fails, the run can be marked failed explicitly
    • the system is not dependent on HTTP lifetime

    6.8 Browser switches from request stream to durable run stream

    The API immediately begins streaming persisted/live run events via streamRunEventsFromPersistence.

    Why this handoff is elegant:

    • the frontend can keep one streaming UX
    • internally, the source is durable run event persistence, not just the original request handler

    This is how Edward bridges synchronous UX and asynchronous execution.


    7. The second important flow: worker-run execution

    Worker entry:

    • apps/api/services/runs/agent-run-worker/processor.ts

    7.1 Worker reloads durable context

    The worker fetches:

    • run record
    • metadata
    • user API key
    • historical conversation context

    Why:

    • worker must be independently restartable
    • it cannot depend on the request process keeping in-memory state alive

    7.2 Worker subscribes to cancellation

    It listens on Redis pub/sub channels like edward:run-cancel:<runId>.

    Why pub/sub plus DB terminal checks both exist:

    • pub/sub gives low-latency cancellation
    • DB polling gives durable truth if a pub/sub signal is missed

    This dual mechanism is deliberate defense-in-depth.


    7.3 Worker marks run running

    The worker updates durable run state to running.

    Why not mark it when API enqueues:

    • enqueued is not the same as actively executing
    • status must reflect reality, not intent

    7.4 Worker captures run events through a fake response

    This is a subtle but strong pattern.

    createRunEventCaptureResponse lets the streaming session code write events as if it were writing to an HTTP response, while the worker intercepts those events and persists/publishes them.

    Why this is good:

    • shared stream-session logic can be reused by API and worker paths
    • the event producer does not need to know whether the sink is a real socket or a persistence pipeline
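
The pattern can be sketched with a response-like sink that parses SSE frames instead of sending them. This is an illustration of the idea only: createCaptureSink, StreamSink, and CapturedEvent are hypothetical names, and the real createRunEventCaptureResponse may differ in shape and behavior.

```typescript
interface CapturedEvent {
  type: string;
  data: unknown;
}

// The small slice of an HTTP response that a stream producer typically needs.
interface StreamSink {
  write(chunk: string): boolean;
  end(): void;
}

// Returns a sink that looks like a response but hands complete events to onEvent,
// where a worker could persist and publish them.
function createCaptureSink(onEvent: (event: CapturedEvent) => void): StreamSink {
  let buffer = "";
  return {
    write(chunk: string): boolean {
      buffer += chunk;
      let idx: number;
      // An SSE frame ends at a blank line; frames may arrive split across writes.
      while ((idx = buffer.indexOf("\n\n")) !== -1) {
        const frame = buffer.slice(0, idx);
        buffer = buffer.slice(idx + 2);
        let type = "message";
        const dataLines: string[] = [];
        for (const line of frame.split("\n")) {
          if (line.startsWith("event: ")) type = line.slice(7);
          else if (line.startsWith("data: ")) dataLines.push(line.slice(6));
        }
        if (dataLines.length > 0) onEvent({ type, data: JSON.parse(dataLines.join("\n")) });
      }
      return true;
    },
    end(): void {
      buffer = ""; // drop any incomplete trailing frame
    },
  };
}
```

The producer keeps calling write() exactly as it would on a socket, which is what lets the same session code serve both paths.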

    7.5 Worker finalizes success or failure

    Success path:

    • update terminal run state
    • clear checkpoint
    • store duration/latency metadata

    Failure path:

    • persist error and terminal completion events
    • mark run failed

    Why explicit finalize helpers exist:

    • terminal transitions are high-risk correctness points
    • centralizing them reduces double-completion bugs

    8. The stream runtime and why it is designed this way

    Core files:

    • runStreamSession.orchestrator.ts
    • agentLoop.runner.ts
    • events/handler.ts
    • lib/llm/parser.ts

    8.1 Why orchestration is separate from the raw LLM client

    LLM API calls are the smallest part of the feature.

    Edward also needs:

    • prompt composition
    • token budgeting
    • parser state handling
    • sandbox side effects
    • tool execution
    • validation/autofix/retry
    • persistence/finalization

    That is why the orchestration layer exists above provider.client.ts.

    8.2 Why the model outputs tagged markup

    Edward instructs the model to output strict Edward tags:

    • thinking
    • response
    • sandbox
    • file
    • install
    • command
    • web search
    • done

    Why tagged output instead of “just ask for code”:

    • the product needs machine-readable execution intent
    • file boundaries must be recoverable
    • installs and commands must be explicit
    • partial streaming must still be parseable

    This is a classic “LLM as structured protocol emitter” design.

    8.3 Why there is a streaming parser state machine

    lib/llm/parser.ts is a state machine because streamed output arrives in incomplete chunks.

    Why not parse with simple regex over full strings:

    • chunks can split tags across boundaries
    • file/install/sandbox sections can be nested, so the parser has to remember which section is open from one chunk to the next
    • incomplete output must still be handled safely

    This parser is not overengineering. It is required for correctness in streamed generation.
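
The chunk-boundary problem is easy to see in a toy version. The sketch below is not lib/llm/parser.ts: it assumes an XML-like tag syntax and a three-tag subset, neither of which is confirmed by this document, and it ignores attributes and nesting. It only demonstrates holding back a possibly incomplete tag until the next chunk arrives.

```typescript
type ParseEvent =
  | { kind: "open"; tag: string }
  | { kind: "text"; tag: string | null; text: string }
  | { kind: "close"; tag: string };

// Assumed tag syntax: <name> ... </name> for a small, fixed set of names.
const TAG_NAMES = ["response", "file", "done"];
const TOKENS: string[] = [];
for (const name of TAG_NAMES) TOKENS.push(`<${name}>`, `</${name}>`);

class StreamingTagParser {
  private buffer = "";
  private current: string | null = null; // tag we are currently inside, if any

  // Feed one chunk; returns the events that became complete with it.
  push(chunk: string): ParseEvent[] {
    this.buffer += chunk;
    const events: ParseEvent[] = [];
    while (this.buffer.length > 0) {
      const lt = this.buffer.indexOf("<");
      if (lt === -1) {
        this.text(events, this.buffer);
        this.buffer = "";
        break;
      }
      this.text(events, this.buffer.slice(0, lt));
      const rest = this.buffer.slice(lt);
      const token = TOKENS.find((t) => rest.startsWith(t));
      if (token) {
        const closing = token.startsWith("</");
        const tag = token.slice(closing ? 2 : 1, -1);
        events.push(closing ? { kind: "close", tag } : { kind: "open", tag });
        this.current = closing ? null : tag;
        this.buffer = rest.slice(token.length);
      } else if (TOKENS.some((t) => t.startsWith(rest))) {
        this.buffer = rest; // may be a tag split across chunks: wait for more input
        break;
      } else {
        this.text(events, "<"); // a literal "<" in content, such as JSX
        this.buffer = rest.slice(1);
      }
    }
    return events;
  }

  private text(events: ParseEvent[], text: string): void {
    if (text.length > 0) events.push({ kind: "text", tag: this.current, text });
  }
}
```

A regex over the concatenated output would only work after the stream ends, which defeats the point of streaming file content to the sandbox as it arrives.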

    8.4 Why there is an agent loop, not one LLM call

    runAgentLoop supports multiple turns.

    Why:

    • the model may need to inspect files, write files, install packages, run commands, and then continue
    • tool results need to feed back into later reasoning
    • retries/continuations need bounded turn accounting

    Why hard budgets exist:

    • prevent runaway loops
    • bound cost
    • bound context growth
    • preserve operational predictability
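
A bounded loop of this kind might look like the sketch below. It is synchronous for brevity, and every name in it (runBoundedLoop, TurnResult, LoopBudget, the stop reasons) is hypothetical rather than taken from agentLoop.runner.ts.

```typescript
interface TurnResult {
  done: boolean; // the model signalled completion
  tokensUsed: number; // tokens consumed by this turn
}

interface LoopBudget {
  maxTurns: number;
  maxTokens: number;
}

type StopReason = "done" | "turn_budget" | "token_budget";

// Run turns until the model finishes or a hard budget is exhausted.
function runBoundedLoop(
  step: (turn: number) => TurnResult,
  budget: LoopBudget,
): { turns: number; tokens: number; reason: StopReason } {
  let tokens = 0;
  for (let turn = 1; turn <= budget.maxTurns; turn++) {
    const result = step(turn);
    tokens += result.tokensUsed;
    if (result.done) return { turns: turn, tokens, reason: "done" };
    if (tokens >= budget.maxTokens) return { turns: turn, tokens, reason: "token_budget" };
  }
  return { turns: budget.maxTurns, tokens, reason: "turn_budget" };
}
```

Returning an explicit stop reason matters because it is what a run record would persist as its termination reason.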

    8.5 Why token usage is computed before and during execution

    Edward computes provider-aware token usage because context exhaustion is one of the most common real failure modes in agent systems.

    Why this is necessary:

    • different providers have different token windows
    • multimodal content changes token budgeting
    • reserving output tokens up front prevents the input context from crowding out the response budget
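
The arithmetic behind output reservation is simple. The sketch below uses a rough characters-per-token heuristic as a stand-in; real token counts depend on each provider's tokenizer, and all names and numbers here are illustrative assumptions.

```typescript
interface ModelLimits {
  contextWindow: number; // total tokens the model accepts (input + output)
  reservedOutput: number; // tokens held back for the response
}

// Very rough heuristic (about 4 characters per token); not a real tokenizer.
function estimateTokens(text: string): number {
  return Math.ceil(text.length / 4);
}

// Tokens available for the prompt once the response budget is set aside.
function inputBudget(limits: ModelLimits): number {
  return Math.max(0, limits.contextWindow - limits.reservedOutput);
}

// True if the given messages fit in the input budget.
function fitsBudget(messages: string[], limits: ModelLimits): boolean {
  const used = messages.reduce((sum, m) => sum + estimateTokens(m), 0);
  return used <= inputBudget(limits);
}
```

When fitsBudget returns false, the caller has to trim or summarize history before calling the provider, rather than discovering the overflow as a mid-run failure.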

    8.6 Why post-generation validation, autofix, and strict retry exist

    Generated code is probabilistic. Production systems must add deterministic safety rails.

    Edward uses:

    • postgen validation
    • deterministic autofixes
    • strict retry

    Why this layered approach is better than “just regenerate everything”:

    • deterministic fixes are cheaper and faster
    • validation localizes problems
    • retry is only used when the output contract is still violated

    This is one of the strongest “productionized AI” patterns in the repo.
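
The layering can be expressed as a small control loop. This is a sketch under assumptions: the Pipeline interface, its method names, and the retry limit are hypothetical and only illustrate the order "validate, fix cheaply, regenerate as a last resort".

```typescript
type Issue = { file: string; rule: string };
type Files = Record<string, string>;

interface Pipeline {
  validate(files: Files): Issue[];
  autofix(files: Files, issues: Issue[]): Files; // deterministic, no LLM call
  regenerate(issues: Issue[]): Files; // expensive: asks the model again
}

// Validate, apply deterministic fixes, and only regenerate if issues remain.
function finalizeOutput(
  files: Files,
  pipeline: Pipeline,
  maxRetries = 1,
): { files: Files; issues: Issue[]; retries: number } {
  let current = files;
  let retries = 0;
  for (;;) {
    let issues = pipeline.validate(current);
    if (issues.length > 0) {
      current = pipeline.autofix(current, issues);
      issues = pipeline.validate(current);
    }
    if (issues.length === 0 || retries >= maxRetries) {
      return { files: current, issues, retries };
    }
    current = pipeline.regenerate(issues);
    retries++;
  }
}
```

Note that the result may still carry issues once the retry limit is reached; the caller decides whether that is a failed run.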


    9. Planning workflow and why it exists separately from stream execution

    Planning modules:

    • services/planning/schemas.ts
    • workflow/engine.ts
    • analyzers/intentAnalyzer.ts
    • resolvers/dependency.resolver.ts
    • validators/postgenValidator.ts

    9.1 Why planning is a workflow

    Because planning has recoverable phases, not just a single pass.

    Phases include:

    • ANALYZE
    • RESOLVE_PACKAGES
    • INSTALL_PACKAGES
    • GENERATE
    • BUILD
    • DEPLOY
    • RECOVER

    Why explicit phase modeling matters:

    • allows retries with context
    • improves debuggability
    • lets the system fail in a known stage
    • reduces the amount of work shoved into one prompt
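
Explicit phases lend themselves to a small transition function. The sketch below reuses the phase names listed above, but the transition rules are assumptions: the linear ordering, the DONE and FAILED terminal states, and the "recover, then retry the failed phase" policy are illustrative, not taken from workflow/engine.ts.

```typescript
const STEP_PHASES = [
  "ANALYZE",
  "RESOLVE_PACKAGES",
  "INSTALL_PACKAGES",
  "GENERATE",
  "BUILD",
  "DEPLOY",
] as const;

type StepPhase = (typeof STEP_PHASES)[number];
type Phase = StepPhase | "RECOVER" | "DONE" | "FAILED"; // DONE/FAILED are hypothetical

interface WorkflowState {
  phase: Phase;
  failedPhase?: StepPhase; // which phase RECOVER is trying to repair
  recoveries: number;
}

// Compute the next state from the outcome of the current phase.
function advance(state: WorkflowState, ok: boolean, maxRecoveries = 1): WorkflowState {
  const { phase, failedPhase, recoveries } = state;
  if (phase === "DONE" || phase === "FAILED") return state;
  if (phase === "RECOVER") {
    return ok && failedPhase ? { phase: failedPhase, recoveries } : { phase: "FAILED", recoveries };
  }
  if (!ok) {
    return recoveries < maxRecoveries
      ? { phase: "RECOVER", failedPhase: phase, recoveries: recoveries + 1 }
      : { phase: "FAILED", recoveries };
  }
  const next = STEP_PHASES[STEP_PHASES.indexOf(phase) + 1];
  return { phase: next ?? "DONE", recoveries };
}
```

Because the state always names a phase, a failure report can say "failed in BUILD after one recovery" instead of "the prompt did not work".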

    9.2 Why intent analysis uses the LLM but is schema-constrained

    intentAnalyzer.ts asks the model for JSON and validates it with Zod.

    Why:

    • intent classification is a fuzzy problem
    • but downstream code wants structured outputs

    So the design is:

    • use LLM for ambiguity resolution
    • use schema validation for control
    • use fallback logic when classification fails

    This is the right balance.
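
The "fuzzy input, strict output" shape looks roughly like this. To stay dependency-free, the sketch validates by hand instead of using Zod as intentAnalyzer.ts does; the Intent fields, their allowed values, and the fallback default are assumptions for illustration.

```typescript
type Action = "create" | "edit" | "fix";
type Framework = "vite" | "next" | "vanilla";

interface Intent {
  action: Action;
  framework: Framework;
}

// Assumed default when the model's answer cannot be trusted.
const FALLBACK_INTENT: Intent = { action: "create", framework: "vite" };

function isAction(x: unknown): x is Action {
  return x === "create" || x === "edit" || x === "fix";
}

function isFramework(x: unknown): x is Framework {
  return x === "vite" || x === "next" || x === "vanilla";
}

// Parse the model's raw JSON answer; anything malformed falls back to the default.
function parseIntent(raw: string): Intent {
  try {
    const value: unknown = JSON.parse(raw);
    if (typeof value !== "object" || value === null) return FALLBACK_INTENT;
    const { action, framework } = value as Record<string, unknown>;
    if (!isAction(action) || !isFramework(framework)) return FALLBACK_INTENT;
    return { action, framework };
  } catch {
    return FALLBACK_INTENT; // not valid JSON at all
  }
}
```

Downstream code then only ever sees a well-typed Intent, regardless of what the model actually returned.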

    9.3 Why dependency resolution exists

    The model may recommend packages, but the runtime must filter/verify them.

    Why:

    • package names can be wrong
    • peer conflicts matter
    • some packages are blocked for sandbox/runtime reasons

    This is why the system does not blindly trust model-emitted package lists.


    10. Sandbox architecture and why it is unusually important here

    Key modules:

    • lifecycle/provisioning.ts
    • docker.service.ts
    • write/buffer.ts
    • write/flush.ts
    • command.service.ts
    • builder/unified-build/orchestrator.ts
    • state.service.ts

    10.1 Why sandbox state is in Redis

    Sandbox instances are ephemeral runtime resources with TTLs.

    Why Redis instead of Postgres here:

    • sandbox liveness is operational state, not primary product truth
    • TTL refresh and quick lookup matter
    • container lifecycle reconciliation is fast-path coordination

    10.2 Why sandbox writes are buffered

    Edward streams file content incrementally. Writing every tiny chunk directly to disk/container would be noisy and slow.

    So the system:

    • buffers file chunks in Redis
    • periodically flushes them to the container
    • uses distributed locks around flush

    Why this is smart:

    • reduces write churn
    • handles chunked file streaming naturally
    • coordinates concurrent writes safely
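
The buffering idea can be shown with an in-memory stand-in. In this sketch a Map replaces the Redis buffer and a boolean flag replaces the distributed lock; the class and callback names are hypothetical and do not mirror write/buffer.ts or write/flush.ts.

```typescript
class FileWriteBuffer {
  private pending = new Map<string, string[]>();
  private flushing = false; // stands in for the distributed flush lock

  // Record a streamed chunk without touching the container.
  append(path: string, chunk: string): void {
    const chunks = this.pending.get(path) ?? [];
    chunks.push(chunk);
    this.pending.set(path, chunks);
  }

  // Append everything buffered so far, one write per file.
  // Returns the number of files written; 0 if another flush holds the lock.
  flush(appendToFile: (path: string, content: string) => void): number {
    if (this.flushing) return 0;
    this.flushing = true;
    try {
      const batch = this.pending;
      this.pending = new Map(); // chunks arriving during the flush go to the next batch
      batch.forEach((chunks, path) => appendToFile(path, chunks.join("")));
      return batch.size;
    } finally {
      this.flushing = false;
    }
  }
}
```

Many small chunks for the same file collapse into a single append per flush, which is the write-churn reduction described above.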

    10.3 Why protected framework files exist

    Template registry marks files like package.json, tsconfig, framework configs, and core CSS files as protected.

    Why:

    • models are much less reliable when editing sensitive build/config files
    • most user value is in app code, not infra/config drift
    • protecting these files preserves build stability

    This is a deliberate product safety rail, not an accidental limitation.


    10.4 Why command execution is allowlisted

    command.service.ts validates:

    • command name
    • argument count/length
    • path safety
    • dangerous patterns

    Why:

    • the sandbox is isolated, but still not trusted blindly
    • guardrails reduce accidental destructive behavior
    • product behavior becomes auditable and predictable
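
The four checks listed above map naturally onto one validation function. The allowlist contents, limits, and dangerous-character pattern below are assumptions picked for the example, not the rules in command.service.ts.

```typescript
// Assumed values for illustration.
const ALLOWED_COMMANDS = new Set(["ls", "cat", "node", "npm", "pnpm"]);
const MAX_ARGS = 16;
const MAX_ARG_LENGTH = 256;
const DANGEROUS_CHARS = /[;&|`$<>]/; // shell metacharacters

type Verdict = { ok: true } | { ok: false; reason: string };

// Check command name, argument count/length, dangerous patterns, and path safety.
function validateCommand(command: string, args: string[]): Verdict {
  if (!ALLOWED_COMMANDS.has(command)) return { ok: false, reason: "command not allowlisted" };
  if (args.length > MAX_ARGS) return { ok: false, reason: "too many arguments" };
  for (const arg of args) {
    if (arg.length > MAX_ARG_LENGTH) return { ok: false, reason: "argument too long" };
    if (DANGEROUS_CHARS.test(arg)) return { ok: false, reason: "dangerous pattern" };
    // Reject absolute paths and parent-directory traversal.
    if (arg.startsWith("/") || arg.split("/").indexOf("..") !== -1) {
      return { ok: false, reason: "path escapes workspace" };
    }
  }
  return { ok: true };
}
```

Returning a reason string, rather than a bare boolean, is what makes rejected commands auditable.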

    10.5 Why builds happen after generation in a unified build orchestrator

    buildAndUploadUnified handles:

    • dependency presence checks
    • framework detection
    • merge/install dependency logic
    • build execution
    • preview upload
    • cache invalidation

    Why not “just run npm run build”:

    • different frameworks output differently
    • preview hosting needs path/base handling
    • dependencies may need reconciliation
    • upload and routing are part of the build product

    11. Preview and deployment architecture

    11.1 Path mode vs subdomain mode

    Configured via EDWARD_DEPLOYMENT_TYPE.

    Why two modes:

    • local/self-hosted environments often want simple path-based previews
    • production environments may want nicer subdomain-based previews

    This avoids forcing one infrastructure assumption everywhere.
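
The two modes differ only in how a preview address is built. The sketch below is purely illustrative: the mode values "path" and "subdomain", the URL layout, and the function name are assumptions, since the accepted values of EDWARD_DEPLOYMENT_TYPE are not listed here.

```typescript
// Assumed values of EDWARD_DEPLOYMENT_TYPE.
type DeploymentType = "path" | "subdomain";

// Build a preview URL for a chat. The URL layout is hypothetical.
function previewUrl(
  mode: DeploymentType,
  baseDomain: string,
  chatId: string,
  subdomain?: string,
): string {
  if (mode === "subdomain") {
    // Prefer the chat's custom subdomain when one is set.
    return `https://${subdomain ?? chatId}.${baseDomain}/`;
  }
  return `https://${baseDomain}/preview/${encodeURIComponent(chatId)}/`;
}
```

Path mode needs no wildcard DNS or edge routing, which is why it suits local and self-hosted setups.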

    11.2 Why preview routing uses KV

    Subdomain routing needs a fast edge lookup from subdomain -> storage prefix.

    Cloudflare KV is a practical fit because:

    • low-latency reads at edge
    • simple key-value mapping
    • decoupled from the main DB

    Why not use Postgres for request-time routing:

    • worse latency profile for edge routing
    • unnecessary coupling between preview serving and primary transactional DB

    11.3 Why preview URL is also stored on build records

    Because the user cares about “what is the latest preview for this chat/build”.

    Persisting preview URLs on builds means:

    • API can answer build status quickly
    • UI can bootstrap preview state without recomputing routing every time

    12. GitHub integration architecture

    Important files:

    • packages/octokit/index.ts
    • apps/api/services/github/sync.service.ts
    • apps/api/services/github/token.service.ts
    • apps/api/services/github/repoBinding.service.ts

    12.1 Why repo binding is a first-class concept

    Chats/projects can be linked to repos.

    Why bind at chat level:

    • a chat represents a project
    • repo sync is a project concern, not a user-global concern

    12.2 Why GitHub token handling is wrapped

    token.service.ts decrypts and migrates token storage.

    Why centralize this:

    • auth provider data should not be parsed ad hoc everywhere
    • token encryption migration needs one place

    12.3 Why sync uses a manifest

    packages/octokit/index.ts uses .edward-sync-manifest.json.

    Why:

    • lets Edward track which files it manages
    • supports deletion of files removed locally
    • avoids blind destructive sync over unknown repo content

    This is a very practical design choice.
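
A minimal sketch of the idea follows, assuming a hypothetical manifest shape with a flat list of managed paths. The real .edward-sync-manifest.json format and the helper name planSync are not taken from the repo.

```typescript
// Hypothetical manifest shape: the paths Edward wrote during the last sync.
interface SyncManifest {
  files: string[];
}

interface SyncPlan {
  toUpsert: string[]; // files to create or update in the repo
  toDelete: string[]; // previously managed files that no longer exist locally
  nextManifest: SyncManifest; // what to write back after the sync
}

function planSync(previous: SyncManifest, localFiles: string[]): SyncPlan {
  const local = new Set(localFiles);
  // Only files Edward managed before are eligible for deletion. Anything
  // else in the repo (for example a hand-written README) is never touched.
  const toDelete = previous.files.filter((path) => !local.has(path));
  const sorted = [...localFiles].sort();
  return { toUpsert: sorted, toDelete, nextManifest: { files: sorted } };
}
```

The key property is that deletion candidates come from the previous manifest, not from a listing of the repo, so unknown repo content is left alone.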


    13. Frontend state architecture

    13.1 Server state

    Handled mainly with React Query:

    • chat history
    • metadata
    • quotas
    • active run lookup
    • GitHub status

    Why: React Query provides these without custom code:

    • cache lifecycle
    • stale time
    • refetch policies
    • request deduplication

    13.2 Stream state

    Handled with a Zustand chat stream store.

    Why:

    • append-heavy mutable event streams
    • low friction updates per chunk/frame
    • simpler than putting stream mutation logic into React component trees
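
To show why an external store suits append-heavy streams, here is a dependency-free stand-in that mimics the getState / setState / subscribe shape of a Zustand store. It is not Zustand itself and not the real chat stream store; createStore, ChatStreamState, and appendChunk are names invented for this sketch.

```typescript
type Listener<T> = (state: T) => void;

// Tiny external store: state lives outside any component tree.
function createStore<T>(initial: T) {
  let state = initial;
  const listeners = new Set<Listener<T>>();
  return {
    getState: () => state,
    setState(update: (prev: T) => T) {
      state = update(state);
      listeners.forEach((listener) => listener(state));
    },
    subscribe(listener: Listener<T>) {
      listeners.add(listener);
      // Return an unsubscribe function.
      return () => {
        listeners.delete(listener);
      };
    },
  };
}

// Hypothetical stream state: accumulated assistant text plus a chunk counter.
interface ChatStreamState {
  text: string;
  chunks: number;
}

const chatStream = createStore<ChatStreamState>({ text: "", chunks: 0 });

// Called once per incoming stream chunk; no component needs to be involved.
function appendChunk(chunk: string): void {
  chatStream.setState((prev) => ({
    text: prev.text + chunk,
    chunks: prev.chunks + 1,
  }));
}
```

The stream processor can call appendChunk from anywhere, and UI components only subscribe to the parts they render.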

    13.3 Sandbox UI state

    Handled with Zustand sandbox slices:

    • files
    • editor selection
    • preview URL
    • build status/errors
    • terminal output
    • open/close state

    Why separate from chat stream state:

    • stream state represents live assistant output
    • sandbox state represents persistent project workspace UI

    These are related but not identical concerns.

    13.4 Why the chat route does server-side access probing

    apps/web/app/chat/[id]/page.tsx checks access and metadata server-side.

    Why:

    • avoids client-only “flash then deny”
    • supports route metadata generation
    • improves correctness for private chat pages

    14. Security posture and why these controls exist

    Important controls:

    • auth middleware for all protected API routes
    • rate limits backed by Redis
    • encrypted API keys
    • command allowlists
    • protected template files
    • Docker isolation
    • CSP, Helmet, and CORS on the API
    • request IDs and security telemetry

    Why the repo has many small security modules instead of one giant security file:

    • security concerns happen at different layers
    • auth, rate limit, encryption, runtime isolation, and telemetry are separate controls

    This is the correct decomposition.


    15. Reliability posture and why the repo feels “production-minded”

    Signals of maturity:

    • graceful shutdown in API and worker
    • durable run events
    • queue-based long-running execution
    • resumable streams using Last-Event-ID
    • cancellation via pub/sub plus durable verification
    • DB-backed admission control
    • checkpointing of agent loop state
    • explicit terminal finalization logic
    • post-generation validators
    • quality gate scripts

    These are not “extra code”. They are the difference between demo code and production-oriented code.
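
The "resumable streams using Last-Event-ID" item can be illustrated with a small sketch. It assumes run events are persisted with a monotonically increasing integer seq starting at 1; the RunEvent shape and the helper names are hypothetical.

```typescript
// Hypothetical persisted run event; `seq` is assumed to start at 1 and increase.
interface RunEvent {
  seq: number;
  type: string;
  data: unknown;
}

// Format one event as an SSE frame. The `id:` field is what the browser
// sends back in the Last-Event-ID header when it reconnects.
function toSseFrame(event: RunEvent): string {
  return `id: ${event.seq}\nevent: ${event.type}\ndata: ${JSON.stringify(event.data)}\n\n`;
}

// Decide which persisted events a reconnecting client still needs.
function eventsToReplay(persisted: RunEvent[], lastEventId: string | undefined): RunEvent[] {
  const parsed = lastEventId === undefined ? NaN : Number(lastEventId);
  // A missing or malformed header means "replay from the beginning".
  const cursor = Number.isInteger(parsed) ? parsed : 0;
  return persisted.filter((e) => e.seq > cursor).sort((a, b) => a.seq - b.seq);
}
```

On reconnect the server replays eventsToReplay(...) from durable storage first, then attaches the client to the live feed, so a dropped connection loses no events.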


    16. Testing and quality gates

    Current rough shape:

    • API tests: about 91 files
    • Web tests: light, about 3 files
    • Shared package tests: light but present

    Why API tests dominate:

    • most complexity and failure modes live in orchestration, streaming, sandboxing, and workers
    • UI is large, but much of it is composition/presentation on top of backend contracts

    Quality scripts in apps/api/scripts/quality enforce:

    • architecture boundaries
    • duplication checks
    • coverage checks
    • function-length checks
    • file audit generation

    Why these custom scripts exist:

    • generic linting does not enforce architecture well enough
    • this codebase has non-trivial layering rules

    17. Key things a junior engineer must understand before making changes

    17.1 Never confuse message history with run execution history

    • message is the conversation artifact
    • run_event is the execution/replay log

    17.2 Never treat Redis state as the source of truth for business data

    Redis is coordination state. Postgres is durable business truth.

    17.3 Never assume the browser connection is the lifetime of the work

    The worker owns durable execution. The browser only observes it.

    17.4 Never casually edit protected sandbox template files or remove guardrails

    Those protections exist because AI-generated config churn destroys stability.

    17.5 Never bypass run admission and queueing for “quick fixes”

    That breaks fairness, concurrency guarantees, and operational predictability.

    17.6 Never add a new stream event without updating both sides

    If you change stream contracts, review:

    • backend emitters
    • persistence/replay
    • frontend stream processor
    • shared type contracts
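
One way to make "update both sides" hard to forget is an exhaustive switch over a shared discriminated union. The event names below are hypothetical; the real union lives in the shared package.

```typescript
// Hypothetical stream event union shared by backend emitters and the frontend.
type StreamEvent =
  | { type: "text"; content: string }
  | { type: "file_write"; path: string }
  | { type: "build_status"; status: "running" | "success" | "failed" };

// A frontend-side consumer. The `never` assignment in the default branch
// stops compiling as soon as a new member is added to StreamEvent without
// a matching case here.
function describeEvent(event: StreamEvent): string {
  switch (event.type) {
    case "text":
      return `text: ${event.content}`;
    case "file_write":
      return `wrote ${event.path}`;
    case "build_status":
      return `build ${event.status}`;
    default: {
      const unhandled: never = event;
      throw new Error(`unhandled stream event: ${JSON.stringify(unhandled)}`);
    }
  }
}
```

The compiler covers the type contract and the processor, but persistence and replay still need a manual review, because old persisted events must remain parseable after the change.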

    18. How I would explain the main architectural “WHY” in one paragraph

    Edward is built as a durable, queue-backed, sandboxed AI execution platform rather than a thin chat wrapper because real code generation is slow, stateful, failure-prone, and operationally dangerous. The architecture separates durable truth (Postgres), fast coordination (Redis), long-running work (BullMQ worker), isolated execution (Docker sandboxes), and progressive UX (SSE + persisted run events). The frontend is split between React Query for server truth and Zustand for live streaming/UI runtime state. The repo uses shared packages to keep contracts aligned and validators/guardrails to turn probabilistic model output into something closer to a deterministic product.


    19. File map by major area

    This section is the “how do I navigate the repo quickly” map.

    19.1 Root and workspace
    • README.md: product + local setup overview
    • package.json: top-level scripts
    • turbo.json: build graph and env config
    • pnpm-workspace.yaml: workspace package boundaries
    • scripts/build-local-sandboxes.sh: local sandbox image prep
    19.2 API composition and delivery
    • apps/api/server.http.ts: API bootstrap and shutdown
    • apps/api/queue.worker.ts: worker bootstrap and background loops
    • apps/api/server/http/app.factory.ts: Express app assembly
    • apps/api/routes/*.ts: route wiring
    • apps/api/controllers/chat/query/*.ts: read/query/build/run delivery controllers
    • apps/api/middleware/*.ts: auth, rate limiting, validation, telemetry
    19.3 Runs and execution
    • apps/api/services/runs/messageOrchestrator.service.ts: main send-message entry
    • apps/api/services/runs/runAdmission.service.ts: load shedding and admission control
    • apps/api/services/runs/runMetadata.ts: durable metadata/checkpoint schema
    • apps/api/services/runs/runEvents.service.ts: publish/persist run events
    • apps/api/services/runs/agent-run-worker/*: worker execution engine
    • apps/api/services/run-event-stream-utils/service.ts: replay + live SSE bridge
    19.4 Chat session runtime
    • apps/api/services/chat/session/orchestrator/runStreamSession.orchestrator.ts
    • apps/api/services/chat/session/loop/agentLoop.runner.ts
    • apps/api/services/chat/session/events/handler.ts
    • apps/api/services/chat/session/orchestrator/*
    • apps/api/services/chat/session/loop/*
    19.5 Planning
    • apps/api/services/planning/schemas.ts
    • apps/api/services/planning/analyzers/intentAnalyzer.ts
    • apps/api/services/planning/resolvers/dependency.resolver.ts
    • apps/api/services/planning/validators/*
    • apps/api/services/planning/workflow/*
    19.6 LLM abstraction
    • apps/api/lib/llm/provider.client.ts: provider-specific generation/streaming
    • apps/api/lib/llm/provider.helpers.ts: normalization and provider/model checks
    • apps/api/lib/llm/compose.ts: prompt assembly
    • apps/api/lib/llm/prompts/sections.ts: main system prompt contract
    • apps/api/lib/llm/parser*.ts: streaming parser
    • apps/api/lib/llm/tokens*.ts: token budgeting
    19.7 Sandbox
    • apps/api/services/sandbox/lifecycle/provisioning.ts
    • apps/api/services/sandbox/lifecycle/cleanup.ts
    • apps/api/services/sandbox/docker.service.ts
    • apps/api/services/sandbox/state.service.ts
    • apps/api/services/sandbox/command.service.ts
    • apps/api/services/sandbox/write/*
    • apps/api/services/sandbox/read/*
    • apps/api/services/sandbox/builder/*
    • apps/api/services/sandbox/templates/*
    19.8 Preview/storage/routing
    • apps/api/services/storage.service.ts
    • apps/api/services/storage/*
    • apps/api/services/preview.service.ts
    • apps/api/services/previewRouting/*
    19.9 GitHub
    • apps/api/services/github/github.useCase.ts
    • apps/api/services/github/sync.service.ts
    • apps/api/services/github/token.service.ts
    • apps/api/services/github/repoBinding.service.ts
    • packages/octokit/index.ts
    19.10 Frontend routing and app shell
    • apps/web/app/layout.tsx
    • apps/web/app/providers.tsx
    • apps/web/app/page.tsx
    • apps/web/app/chat/[id]/page.tsx
    • apps/web/app/changelog/page.tsx
    • apps/web/app/api/auth/[...all]/route.ts
    19.11 Frontend chat workspace
    • apps/web/components/chat/chatPageClient.tsx
    • apps/web/components/chat/chatWorkspace.tsx
    • apps/web/components/chat/chatWorkspaceDesktop.tsx
    • apps/web/components/chat/chatWorkspaceMobile.tsx
    • apps/web/components/chat/messages/*
    • apps/web/components/chat/sandbox/*
    • apps/web/stores/chatStream/*
    • apps/web/stores/sandbox/*
    • apps/web/lib/streaming/processors/chatStreamProcessor.ts
    • apps/web/lib/parsing/*
    • apps/web/lib/api/*
    19.12 Shared packages
    • packages/auth/lib/schema.ts
    • packages/auth/lib/auth.ts
    • packages/auth/lib/db.ts
    • packages/auth/lib/run.ts
    • packages/auth/lib/build.ts
    • packages/shared/src/constants.ts
    • packages/shared/src/schema.ts
    • packages/shared/src/streamEvents.ts
    • packages/shared/src/chat/types.ts
    • packages/shared/src/chat/streamActions.ts
    • packages/shared/src/api/contracts.ts
    • packages/ui/src/components/*

    20. Directory-level inventory

    This is not a prose description of every leaf UI component, because that would bury the actual KT. Instead, use this as the completeness map for where code lives.

    20.1 apps/api

    Major subareas:

    • controllers/
    • routes/
    • middleware/
    • lib/
    • services/
    • schemas/
    • utils/
    • tests/

    High-density domains:

    • services/chat
    • services/runs
    • services/sandbox
    • services/planning
    • services/github
    • services/queue

    20.2 apps/web

    Major subareas:

    • app/
    • components/chat/
    • components/home/
    • components/changelog/
    • hooks/
    • lib/
    • stores/chatStream/
    • stores/sandbox/

    High-density domains:

    • chat UI and workspace
    • SSE parsing and stream orchestration
    • sandbox/editor/preview UI

    20.3 packages/auth

    Major subareas:

    • lib/: auth, db, schema, build/run helpers
    • drizzle/: SQL migrations

    20.4 packages/shared

    Major subareas:

    • src/constants.ts
    • src/schema.ts
    • src/streamEvents.ts
    • src/chat/*
    • src/github/*
    • src/api/*
    • src/llm/*

    20.5 packages/ui

    Major subareas:

    • src/components/: reusable primitives
    • src/hooks/
    • src/lib/
    • src/styles/

    20.6 docker/templates

    Major subareas:

    • nextjs
    • vite-react
    • vanilla
    • base

    These templates define the scaffold/runtime assumptions for generated projects and sandbox images.


    21. Practical onboarding order for a new engineer

    If I were onboarding someone senior-but-new, I would ask them to read in this exact order:

    1. Root README.md
    2. apps/api/README.md
    3. packages/auth/lib/schema.ts
    4. apps/api/server.http.ts
    5. apps/api/server/http/app.factory.ts
    6. apps/api/routes/chat.routes.ts
    7. apps/api/services/runs/messageOrchestrator.service.ts
    8. apps/api/services/runs/runAdmission.service.ts
    9. apps/api/services/runs/agent-run-worker/processor.ts
    10. apps/api/services/chat/session/orchestrator/runStreamSession.orchestrator.ts
    11. apps/api/services/chat/session/loop/agentLoop.runner.ts
    12. apps/api/services/chat/session/events/handler.ts
    13. apps/api/services/sandbox/lifecycle/provisioning.ts
    14. apps/api/services/sandbox/builder/unified-build/orchestrator.ts
    15. apps/web/app/chat/[id]/page.tsx
    16. apps/web/components/chat/chatPageClient.tsx
    17. apps/web/stores/chatStream/controller.ts
    18. apps/web/lib/streaming/processors/chatStreamProcessor.ts

    If they understand those files, they understand the heart of Edward.


    22. Final summary

    Edward is a monorepo for a durable AI code-generation product, not a thin LLM wrapper. The architecture optimizes for correctness, replayability, operator safety, and product durability:

    • Postgres keeps durable truth.
    • Redis handles fast coordination.
    • BullMQ decouples request acceptance from long-running execution.
    • Docker sandboxes isolate generated code.
    • S3/CDN/KV separate preview hosting from execution.
    • SSE plus persisted run events make streaming resumable.
    • Shared packages prevent contract drift.
    • Planning, validation, autofix, and retry layers turn model output into something operationally usable.

    That is the core “why” behind almost every serious architectural decision in this repo.