Overview

Last updated on March 25, 2026

Edward KT

1. What Edward actually is

Edward is an AI-assisted frontend application builder.

At the product level, the user experience is:

  1. A user signs in with GitHub.
  2. The user stores their own LLM API key inside the product.
  3. The user asks Edward to create, edit, or fix a web app in chat.
  4. Edward plans the work, streams progress, writes files into an isolated sandbox, installs dependencies, runs commands, builds the app, uploads preview assets, and optionally syncs the code to GitHub.
  5. The user sees the generated app, file tree, terminal/build feedback, and live preview.

The system is intentionally split across:

  • apps/web: the product UI and auth surface
  • apps/api: the synchronous HTTP API, SSE delivery, orchestration, and infrastructure adapters
  • apps/api/queue.worker.ts: the asynchronous worker process for long-running build/run jobs
  • packages/auth: auth + database schema + database access
  • packages/shared: shared contracts, enums, stream event types, chat types, model metadata
  • packages/ui: shared UI primitives for the frontend
  • packages/octokit: shared GitHub sync client helpers

This split is not accidental. Edward is not just “a chat app”. It is a distributed system with:

  • durable state
  • transient coordination state
  • long-running execution
  • resumable streaming
  • isolated code execution
  • artifact publishing
  • external system integration

That is the fundamental lens juniors should use when reading this repo.


2. The highest-level architectural decisions and why they exist

2.1 Why a monorepo

Edward uses a pnpm + Turbo monorepo because the frontend, API, worker, DB schema, shared contracts, and UI primitives evolve together.

Why this was chosen instead of separate repos:

  • Shared contracts are first-class. Stream events, chat types, API response shapes, model catalogs, and rate-limit scopes must not drift.
  • Frontend and backend changes usually land together. A single PR often changes stream event shape, API behavior, and UI rendering.
  • Tooling is simpler. One workspace gives consistent TypeScript, ESLint, and build orchestration.
  • Local development is more realistic. pnpm dev starts the whole system shape, not disconnected fragments.

Tradeoff:

  • Monorepos become large and noisy.
  • Build graph discipline is required.

This repo counters that with:

  • separate packages for shared concerns
  • Turbo task boundaries
  • architecture boundary checks in apps/api/scripts/quality

2.2 Why Next.js on the frontend

apps/web is a Next.js 16 App Router app.

Why Next.js instead of plain Vite for the product UI:

  • server-rendered marketing/auth/changelog pages are easier
  • metadata, sitemap, robots, SEO handling are built in
  • auth route handling integrates cleanly with Better Auth
  • product pages and landing pages can live in the same app

Important nuance:

Edward generates Vite/Next/Vanilla projects for users, but the Edward product itself uses Next.js. Those are different concerns.

2.3 Why Express for the API instead of putting everything inside Next route handlers

The repo deliberately keeps orchestration in apps/api as a separate Express app.

Why:

  • stream-oriented chat delivery is easier to reason about in a dedicated API
  • queue worker and API share backend services cleanly
  • long-running orchestration should not be tightly coupled to the frontend deployment shape
  • HTTP API can scale independently from the web UI
  • operational concerns like graceful shutdown, Redis connections, worker coordination, and SSE backpressure are clearer in a dedicated server

Why not “just use Next API routes”:

  • the runtime model becomes mixed and harder to reason about
  • long-running SSE and worker orchestration become more awkward
  • explicit process separation is valuable here

2.4 Why a dedicated worker process

The API process accepts requests. The worker process executes long-running jobs.

Why this is the correct split:

  • user HTTP connections are short-lived and unreliable
  • LLM execution, sandbox interaction, dependency install, build, and artifact publishing can outlive the browser connection
  • workers give retries, concurrency control, and isolation from request latency
  • the system can continue work even if the client disconnects

This is one of the most important architectural choices in the whole repo.

Without it, Edward would behave like a fragile synchronous demo.

With it, Edward behaves like a real product with durable execution.

2.5 Why Postgres

Postgres is the durable source of truth.

It stores:

  • users, sessions, accounts
  • chats and messages
  • runs and run events
  • builds
  • attachments
  • repo bindings and related product state

Why Postgres instead of Redis-only or document storage:

  • chat history and run history are durable product data, not cache
  • run events need ordering and replay
  • auth tables fit naturally into a relational model
  • ownership and filtering by user/chat/run are frequent and relational
  • transactional admission control is easier and safer

The strongest proof is createRunWithUserLimit in packages/auth/lib/run.ts. It uses DB transactions and advisory locks for safe concurrent run admission.

2.6 Why Redis

Redis is used throughout the backend, but always for fast, ephemeral, coordination-heavy workloads.

Redis responsibilities:

  • BullMQ queue backend
  • distributed locks
  • sandbox state and TTLs
  • write buffers for streamed file output
  • pub/sub for run cancellation and live status fanout
  • request rate-limit stores

Why Redis instead of doing all of this in Postgres:

  • queues and pub/sub need low-latency operational semantics
  • lock contention and TTL semantics are simpler in Redis
  • sandbox write buffering is a classic ephemeral buffering problem
  • rate limiting is a cache-like, time-windowed workload

This is a good architecture split:

  • Postgres = durable truth
  • Redis = fast coordination and ephemeral runtime state

2.7 Why BullMQ

BullMQ is the job orchestration layer on top of Redis.

Why BullMQ:

  • already fits the Redis runtime Edward needs
  • supports worker concurrency and queue separation
  • familiar operational model for TypeScript/Node systems
  • good fit for “enqueue durable work from API, execute elsewhere”

Why not build a custom queue:

  • queue correctness is hard
  • retries, visibility, backpressure, and operational observability are all non-trivial

Edward keeps two main job categories:

  • build jobs
  • agent run jobs

That separation matters because build workloads and agent-stream workloads have different runtime behavior and failure modes.

2.8 Why Docker-backed sandboxes

This is the most product-defining infrastructure choice.

Edward writes and executes generated code in Docker containers.

Why:

  • isolation from the host machine
  • deterministic environment for install/build/command execution
  • safer command execution boundary
  • framework-specific template images can be prewarmed
  • easier cleanup and lifecycle control

Why not run code directly on the API machine filesystem:

  • far less safe
  • dependency conflicts become unmanageable
  • cleanup is harder
  • one broken project can poison the environment for others

The sandbox is not just an implementation detail. It is the product’s execution boundary.

2.9 Why S3 + CloudFront + optional Cloudflare KV

Edward separates “code generation” from “preview hosting”.

  • S3 stores preview artifacts
  • CloudFront serves previews and can be invalidated
  • Cloudflare KV optionally maps subdomains to preview storage prefixes

Why this is better than serving builds directly from live containers:

  • static previews are cheaper and more stable
  • finished artifacts survive sandbox lifecycle cleanup
  • containers can be ephemeral while previews remain available
  • preview serving and code execution are decoupled

This is a strong architecture choice because it turns preview hosting into a static asset problem instead of a container serving problem.

2.10 Why GitHub OAuth + GitHub sync

GitHub is used both for user authentication and for repository integration.

Why GitHub auth:

  • Edward’s target users already live in GitHub-centric workflows
  • repo sync is a core feature, so GitHub identity is a natural anchor
  • one provider reduces auth complexity

Why sync through GitHub APIs instead of shelling out to git in the sandbox:

  • less credential management complexity inside containers
  • direct tree/blob APIs are deterministic and auditable
  • easier to manage partial file sync and manifests

2.11 Why bring-your-own API keys

Users store their own OpenAI/Gemini/Anthropic API keys.

Why:

  • cost ownership stays with the user
  • provider choice stays with the user
  • product does not need to centralize all model billing risk
  • enterprise-style customers often prefer controlling their own provider credentials

Why encrypted at rest:

  • these are real credentials, not preferences
  • API keys must not sit in plaintext in Postgres

This is why apps/api/utils/encryption.ts and apps/api/utils/secretEnvelope.ts exist.

2.12 Why SSE instead of WebSockets for streaming

Edward streams progress to the browser using Server-Sent Events.

Why SSE:

  • server-to-client streaming is the actual need
  • simpler than full bidirectional socket infra
  • reconnect semantics are straightforward
  • Last-Event-ID replay model fits persisted run events well

Why not WebSockets:

  • more operational complexity
  • bidirectional realtime control is not the dominant requirement
  • durability + replay matter more than low-level socket interactivity
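
A minimal sketch of how a persisted run event could be written as an SSE frame and how a reconnect header could be read back. The id/event/data fields and the Last-Event-ID header are standard SSE; the RunEventFrame shape and both function names are hypothetical and are not taken from Edward's code.

```typescript
// Hypothetical event shape; Edward's real contracts live in packages/shared.
interface RunEventFrame {
  seq: number;      // increasing sequence number within one run
  type: string;     // illustrative event name, e.g. "text" or "done"
  payload: unknown; // JSON-serializable body
}

// Serialize one event as an SSE frame. The browser echoes the last `id:`
// it saw in the Last-Event-ID header when it reconnects.
function toSseFrame(event: RunEventFrame): string {
  const data = JSON.stringify(event.payload);
  return `id: ${event.seq}\nevent: ${event.type}\ndata: ${data}\n\n`;
}

// Turn the Last-Event-ID header into a sequence number.
// Missing or malformed values mean "replay from the start" (0).
function parseLastEventId(header: string | undefined): number {
  if (header === undefined) return 0;
  const parsed = Number.parseInt(header, 10);
  return Number.isInteger(parsed) && parsed >= 0 ? parsed : 0;
}
```

Because the frame id is just the persisted sequence number, a reconnect needs no session state on the server beyond the stored events.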

2.13 Why persist stream events

This is one of the smartest choices in the repo.

Edward does not treat stream events as disposable transport. It stores them in run_event.

Why:

  • browser reconnects can resume streams
  • active run pages can rehydrate state after refresh
  • debugging becomes far easier
  • a run has a durable audit trail
  • the stream is no longer coupled to one live TCP connection

This is the reason the system feels durable instead of “best effort”.


3. Runtime topology

At runtime, think in terms of five planes.

3.1 Presentation plane

  • apps/web
  • Next.js pages, route handlers, React state, chat workspace, sandbox UI

3.2 Delivery/API plane

  • apps/api/server.http.ts
  • Express routes/controllers/middleware
  • SSE response management

3.3 Execution plane

  • apps/api/queue.worker.ts
  • BullMQ workers
  • agent runs, builds, backups

3.4 Durable data plane

  • Postgres via Drizzle in packages/auth

3.5 Ephemeral coordination plane

  • Redis
  • locks
  • queue backend
  • pub/sub
  • sandbox state
  • rate limiting

The system is healthy when these planes are kept conceptually separate.


4. Repo walkthrough

4.1 Root

Important root files:

  • package.json: workspace scripts and security overrides
  • pnpm-workspace.yaml: workspace package boundaries
  • turbo.json: task graph and env propagation
  • README.md: product/dev bootstrap overview
  • eslint.config.mjs: root lint baseline
  • tsconfig.json: root TS setup
  • scripts/build-local-sandboxes.sh: builds local sandbox images

Why root-level task discipline matters:

  • the repo has multiple deployable units
  • env propagation is explicit
  • broken build assumptions must fail early

4.2 apps/web

What it owns:

  • landing page
  • auth session bootstrapping
  • chat workspace
  • file/editor/preview UI
  • changelog UI
  • browser-side stream orchestration

Most important files:

  • app/layout.tsx: global metadata, fonts, providers, navbar shell
  • app/providers.tsx: React Query, theme, notifications, chat stream provider
  • app/page.tsx: landing page entry
  • app/chat/[id]/page.tsx: server-side access probe + metadata generation for chat pages
  • components/chat/chatPageClient.tsx: client orchestration for a chat route
  • components/chat/chatWorkspace.tsx: core desktop/mobile workspace composition
  • stores/chatStream/controller.ts: start/resume/cancel stream orchestration
  • stores/chatStream/useStartStream.ts: new message stream mutation flow
  • lib/streaming/processors/chatStreamProcessor.ts: SSE event consumption into UI state
  • stores/sandbox/*: file/build/terminal UI state
  • hooks/server-state/*: React Query data fetching hooks
  • lib/api/*: API client surface

Why this frontend is split between React Query and Zustand:

  • React Query handles server state: chat history, metadata, active runs, quotas
  • Zustand handles high-churn UI runtime state: live streaming text, file chunks, open panel state, build errors, terminal lines

That is the right split.

If streaming state lived fully inside React Query, it would be awkward and too mutation-heavy. If server state lived fully inside Zustand, cache invalidation and stale-fetch logic would get worse.

4.3 apps/api

What it owns:

  • request authentication and validation
  • chat/run/build endpoints
  • run orchestration
  • planning workflow
  • LLM abstraction
  • parser and tool event handling
  • sandbox orchestration
  • GitHub sync orchestration
  • preview routing

Composition roots:

  • server.http.ts: API process bootstrap
  • queue.worker.ts: worker bootstrap

Important structure:

  • routes/: HTTP surface
  • controllers/: transport and response wiring
  • services/: application + infra orchestration
  • lib/: adapters/clients/shared helpers
  • middleware/: auth, rate limit, validation, telemetry
  • schemas/: Zod request contracts
  • tests/: mostly API-side tests, mirrored by module

4.4 packages/auth

What it owns:

  • Better Auth instance
  • Drizzle/Postgres connection
  • database schema
  • basic data access helpers for runs/builds
  • migrations

Why it is a separate package:

  • both web and API need auth/schema awareness
  • keeping schema in API app only would over-couple layers

4.5 packages/shared

What it owns:

  • stream event contracts
  • chat UI types
  • API contract types
  • model catalog and provider detection
  • rate-limit scopes/policies
  • shared parsing helpers

This package is the anti-drift package.

If this package did not exist, frontend/backend stream contracts would break constantly.

4.6 packages/ui

What it owns:

  • reusable UI building blocks
  • shared styling artifacts
  • navigation, skeletons, toasts, hooks, utilities

Why it exists:

  • product UI should reuse primitives instead of re-implementing them
  • keeps apps/web focused on product behavior, not low-level component plumbing

4.7 packages/octokit

What it owns:

  • GitHub API client creation
  • repo/branch creation helpers
  • manifest-based sync

Why separate:

  • GitHub sync logic is infrastructure-adjacent and reusable
  • keeping raw Octokit usage isolated reduces API surface spread

5. Data model and why each table exists

The schema in packages/auth/lib/schema.ts is the durable model of the product.

5.1 Auth tables

  • user
  • session
  • account
  • verification

Why:

  • Better Auth expects these concepts
  • GitHub OAuth identity and session state need durable storage
  • Edward also stores product-specific fields on user, especially apiKey and preferredModel

5.2 Product tables

  • chat: the top-level project/conversation container
  • message: user/assistant message history
  • attachment: images attached to messages
  • build: build lifecycle records
  • run: durable execution record for one assistant generation flow
  • run_event: persisted stream events
  • run_tool_call: durable tool invocation records

5.3 Why chat

chat is not just a thread id. It is the unit of project identity.

It owns:

  • title/description
  • SEO fields
  • GitHub repo binding
  • custom subdomain

Why not put these elsewhere:

  • the project is anchored to the conversation
  • preview and repo sync are project-level concerns

5.4 Why message

Messages are user-facing history.

Important detail:

  • assistant output is persisted as a normal message
  • run events are not a replacement for message history

Why both message and run_event exist:

  • message = final conversational artifact
  • run_event = execution trace / replay log

That distinction is important.

5.5 Why run

run exists because one assistant generation is not a simple request-response.

A run has:

  • queue state
  • current state machine position
  • current turn
  • termination reason
  • metadata
  • linkage to user and assistant messages

Why not derive all of this from messages:

  • messages do not capture execution status, retries, checkpoints, cancellation, or turn-level progress
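
One way to picture the "state machine position" of a run is a small transition table. This is a sketch under assumptions: the status names below are hypothetical and may not match the values stored in the run table.

```typescript
// Hypothetical run statuses; the real column values may differ.
type RunStatus = "queued" | "running" | "completed" | "failed" | "cancelled";

// For each status, the statuses it may move to next.
const ALLOWED_TRANSITIONS: Record<RunStatus, RunStatus[]> = {
  queued: ["running", "failed", "cancelled"],
  running: ["completed", "failed", "cancelled"],
  completed: [],
  failed: [],
  cancelled: [],
};

// True when moving from `from` to `to` is a legal transition.
function canTransition(from: RunStatus, to: RunStatus): boolean {
  return ALLOWED_TRANSITIONS[from].indexOf(to) !== -1;
}

// Terminal statuses have no outgoing transitions, so a second
// "complete" or "fail" on the same run is rejected.
function isTerminal(status: RunStatus): boolean {
  return ALLOWED_TRANSITIONS[status].length === 0;
}
```

None of this information can be reconstructed from message rows alone, which is the point of keeping run as its own record.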

5.6 Why run_event

run_event is the stream replay ledger.

Why this table is critical:

  • live SSE can reconnect from a sequence number
  • past run behavior can be debugged
  • session completion can be inferred from persisted events
  • worker/API/browser all get a shared truth
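
The replay idea can be sketched with an in-memory array standing in for the run_event table. The StoredRunEvent shape, the function names, and the terminal event names "done" and "error" are assumptions made for illustration.

```typescript
// Hypothetical row shape standing in for a run_event record.
interface StoredRunEvent {
  runId: string;
  seq: number;
  type: string;
}

// Events for one run that the client has not seen yet, in sequence order.
function eventsToReplay(
  all: StoredRunEvent[],
  runId: string,
  lastSeenSeq: number,
): StoredRunEvent[] {
  return all
    .filter((e) => e.runId === runId && e.seq > lastSeenSeq)
    .sort((a, b) => a.seq - b.seq);
}

// Completion inferred from persisted events: a run is complete once a
// terminal event has been stored. The terminal names are illustrative.
function isRunComplete(
  events: StoredRunEvent[],
  terminalTypes: string[] = ["done", "error"],
): boolean {
  return events.some((e) => terminalTypes.indexOf(e.type) !== -1);
}
```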

5.7 Why run_tool_call

This is the durability and idempotency guard for tools.

Why not only emit tool output into run_event:

  • tools have inputs, outputs, duration, status, and idempotency semantics
  • tool calls are not just display events; they are execution records

5.8 Why build

Preview build lifecycle is separate from generation lifecycle.

That is the correct model because:

  • generation can succeed while build fails
  • build status must be independently queryable and streamable
  • previews need their own duration/error metadata

5.9 Why attachments are separate

Attachments are not embedded directly inside messages because:

  • metadata is structured
  • message text and binary/media references are different concerns
  • image uploads have different constraints and lifecycle

6. The most important end-to-end flow: send a chat message

This is the core product path.

6.1 Browser starts the stream

Frontend entry:

  • apps/web/stores/chatStream/useStartStream.ts
  • apps/web/lib/api/chat.ts

The browser:

  1. acquires a submission lock
  2. creates optimistic UI state
  3. calls POST /chat/message
  4. begins consuming SSE frames

Why the submission lock exists:

  • to prevent accidental duplicate sends
  • to avoid overlapping UI-side submissions before the server admits a run
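
The submission lock can be pictured as a per-chat guard held in browser memory. This is a sketch only; the SubmissionLock class is hypothetical and the real hook may key or store the lock differently.

```typescript
// Per-chat guard against double submits (hypothetical helper).
class SubmissionLock {
  private held = new Set<string>();

  // Returns true if the caller now owns the lock for this chat,
  // false if a submission for the same chat is already in flight.
  tryAcquire(chatId: string): boolean {
    if (this.held.has(chatId)) return false;
    this.held.add(chatId);
    return true;
  }

  // Called once the server has admitted or rejected the run.
  release(chatId: string): void {
    this.held.delete(chatId);
  }
}
```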

6.2 API authenticates and validates

Backend route:

  • apps/api/routes/chat.routes.ts

Middleware:

  • auth
  • rate limit
  • request validation

Why this layering exists:

  • fail cheap and early
  • keep orchestration code free from repeated input checks

6.3 API resolves model and user credentials

In unifiedSendMessage:

  • user API key is loaded
  • key is decrypted
  • provider is inferred
  • chosen model is checked against provider

Why this validation exists:

  • a stored Gemini key with an OpenAI model choice is an avoidable operator error
  • fail-fast here is much cleaner than failing deep in LLM execution
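
A sketch of the fail-fast check, assuming the provider can be guessed from common key and model-name prefixes. The prefixes below ("sk-ant-", "sk-", "AIza", "gpt-", "claude-", "gemini-") are illustrative assumptions; Edward's actual detection lives in the shared model catalog and may work differently.

```typescript
type Provider = "openai" | "anthropic" | "gemini";

type ProviderCheck =
  | { ok: true; provider: Provider }
  | { ok: false; reason: string };

// Assumed key prefixes, for illustration only.
function inferProviderFromKey(apiKey: string): Provider | null {
  if (apiKey.startsWith("sk-ant-")) return "anthropic"; // must precede the generic "sk-" check
  if (apiKey.startsWith("sk-")) return "openai";
  if (apiKey.startsWith("AIza")) return "gemini";
  return null;
}

// Assumed model-name prefixes, for illustration only.
function inferProviderFromModel(model: string): Provider | null {
  if (model.startsWith("gpt-")) return "openai";
  if (model.startsWith("claude-")) return "anthropic";
  if (model.startsWith("gemini-")) return "gemini";
  return null;
}

// Reject a mismatched key/model pair before any LLM call is made.
function checkModelMatchesKey(apiKey: string, model: string): ProviderCheck {
  const keyProvider = inferProviderFromKey(apiKey);
  if (keyProvider === null) return { ok: false, reason: "unrecognized API key format" };
  const modelProvider = inferProviderFromModel(model);
  if (modelProvider === null) return { ok: false, reason: `unknown model: ${model}` };
  if (keyProvider !== modelProvider) {
    return {
      ok: false,
      reason: `model ${model} belongs to ${modelProvider}, but the stored key is for ${keyProvider}`,
    };
  }
  return { ok: true, provider: keyProvider };
}
```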

6.4 API creates or loads chat + persists the user message

This happens via chat.service.ts.

Why persist before execution:

  • user intent must be durable even if downstream execution fails
  • history should not disappear because a worker crashed

6.5 API runs planning workflow

Planning is not the same as generation.

Workflow engine:

  • services/planning/workflow/engine.ts

Main early phases:

  • analyze intent
  • resolve packages
  • install packages
  • build/deploy/recover as needed

Why have a workflow at all:

  • generation quality improves when framework/packages/intent are normalized first
  • retries and step-level state become explicit
  • the system can reason in phases instead of one giant black-box prompt

6.6 API creates an admitted durable run

Run admission:

  • services/runs/runAdmission.service.ts
  • packages/auth/lib/run.ts

Why run admission matters:

  • the product must limit active execution globally, per user, and per chat
  • one chat should not have multiple conflicting active generations
  • the API must reject overload before enqueuing dangerous work

The use of transactional advisory locks here is a sign of mature concurrency thinking.
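
The admission decision itself can be sketched as a pure function. The limit names and reason codes are hypothetical. In the real system the counting and the insert have to happen inside one transaction while the lock is held; this sketch shows only the decision and does not reproduce that locking.

```typescript
// Counts of currently active runs, as read inside the admission transaction.
interface ActiveCounts {
  global: number;
  forUser: number;
  forChat: number;
}

// Hypothetical limit configuration.
interface AdmissionLimits {
  maxGlobal: number;
  maxPerUser: number;
  maxPerChat: number;
}

type AdmissionDecision =
  | { admitted: true }
  | { admitted: false; reason: "global_limit" | "user_limit" | "chat_limit" };

// Decide whether one more run may start, checking the widest scope first.
function decideAdmission(counts: ActiveCounts, limits: AdmissionLimits): AdmissionDecision {
  if (counts.global >= limits.maxGlobal) return { admitted: false, reason: "global_limit" };
  if (counts.forUser >= limits.maxPerUser) return { admitted: false, reason: "user_limit" };
  if (counts.forChat >= limits.maxPerChat) return { admitted: false, reason: "chat_limit" };
  return { admitted: true };
}
```

Without the lock, two concurrent requests could both read the same counts and both be admitted, which is why the decision must not be separated from the write.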

6.7 API enqueues worker job

After the run is admitted, the API enqueues the job in BullMQ.

Why queue after durable DB write:

  • DB becomes the source of truth
  • if enqueue fails, the run can be marked failed explicitly
  • the system is not dependent on HTTP lifetime

6.8 Browser switches from request stream to durable run stream

The API immediately begins streaming persisted/live run events via streamRunEventsFromPersistence.

Why this handoff is elegant:

  • the frontend can keep one streaming UX
  • internally, the source is durable run event persistence, not just the original request handler

This is how Edward bridges synchronous UX and asynchronous execution.


7. The second important flow: worker-run execution

Worker entry:

  • apps/api/services/runs/agent-run-worker/processor.ts

7.1 Worker reloads durable context

The worker fetches:

  • run record
  • metadata
  • user API key
  • historical conversation context

Why:

  • worker must be independently restartable
  • it cannot depend on the request process keeping in-memory state alive

7.2 Worker subscribes to cancellation

It listens on Redis pub/sub channels like edward:run-cancel:<runId>.

Why pub/sub plus DB terminal checks both exist:

  • pub/sub gives low-latency cancellation
  • DB polling gives durable truth if a pub/sub signal is missed

This dual mechanism is deliberate defense-in-depth.
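
The two cancellation paths can be sketched as one flag with two writers. The CancellationWatch class and the "cancelled" status string are hypothetical; the Redis subscription and the database query are left out, and only the way the two signals combine is shown.

```typescript
// One flag, two independent ways to set it.
class CancellationWatch {
  private cancelled = false;

  // Fast path: invoked by the pub/sub subscriber when a cancel message arrives.
  onCancelSignal(): void {
    this.cancelled = true;
  }

  // Slow path: invoked periodically with the run status read from the database,
  // so a missed pub/sub message is still caught.
  onDurableStatus(status: string): void {
    if (status === "cancelled") this.cancelled = true;
  }

  // The agent loop checks this between steps.
  isCancelled(): boolean {
    return this.cancelled;
  }
}
```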

7.3 Worker marks run running

The worker updates durable run state to running.

Why not mark it when API enqueues:

  • enqueued is not the same as actively executing
  • status must reflect reality, not intent

7.4 Worker captures run events through a fake response

This is a subtle but strong pattern.

createRunEventCaptureResponse lets the streaming session code write events as if it were writing to an HTTP response, while the worker intercepts those events and persists/publishes them.

Why this is good:

  • shared stream-session logic can be reused by API and worker paths
  • the event producer does not need to know whether the sink is a real socket or a persistence pipeline
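
The pattern can be sketched with a small sink interface. EventSink, CaptureSink, and emitProgress are hypothetical names; this illustrates the idea behind createRunEventCaptureResponse and is not its actual signature.

```typescript
// The only surface the event producer depends on.
interface EventSink {
  write(chunk: string): void;
  end(): void;
}

// A sink that hands every written frame to a callback (for example
// "persist and publish") instead of sending it down a socket.
class CaptureSink implements EventSink {
  ended = false;
  private readonly onFrame: (frame: string) => void;

  constructor(onFrame: (frame: string) => void) {
    this.onFrame = onFrame;
  }

  write(chunk: string): void {
    this.onFrame(chunk);
  }

  end(): void {
    this.ended = true;
  }
}

// Producer code: identical whether the sink is a real HTTP response
// wrapper or a CaptureSink inside the worker.
function emitProgress(sink: EventSink, steps: string[]): void {
  for (const step of steps) {
    sink.write(`data: ${step}\n\n`);
  }
  sink.end();
}
```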

7.5 Worker finalizes success or failure

Success path:

  • update terminal run state
  • clear checkpoint
  • store duration/latency metadata

Failure path:

  • persist error and terminal completion events
  • mark run failed

Why explicit finalize helpers exist:

  • terminal transitions are high-risk correctness points
  • centralizing them reduces double-completion bugs

8. The stream runtime and why it is designed this way

Core files:

  • runStreamSession.orchestrator.ts
  • agentLoop.runner.ts
  • events/handler.ts
  • lib/llm/parser.ts

8.1 Why orchestration is separate from the raw LLM client

LLM API calls are the smallest part of the feature.

Edward also needs:

  • prompt composition
  • token budgeting
  • parser state handling
  • sandbox side effects
  • tool execution
  • validation/autofix/retry
  • persistence/finalization

That is why the orchestration layer exists above provider.client.ts.

8.2 Why the model outputs tagged markup

Edward instructs the model to output strict Edward tags:

  • thinking
  • response
  • sandbox
  • file
  • install
  • command
  • web search
  • done

Why tagged output instead of “just ask for code”:

  • the product needs machine-readable execution intent
  • file boundaries must be recoverable
  • installs and commands must be explicit
  • partial streaming must still be parseable

This is a classic “LLM as structured protocol emitter” design.

8.3 Why there is a streaming parser state machine

lib/llm/parser.ts is a state machine because streamed output arrives in incomplete chunks.

Why not parse with simple regex over full strings:

  • chunks can split tags across boundaries
  • file, install, and sandbox sections can be nested, so the parser must track which section it is currently inside
  • incomplete output must still be handled safely

This parser is not overengineering. It is required for correctness in streamed generation.
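
A minimal sketch of a chunk-tolerant tag parser, assuming a simplified syntax with plain <name> and </name> tags. The tag syntax, the ParserEvent shape, and the TagStreamParser class are hypothetical. The sketch treats every "<...>" as a tag and does not handle nesting or attributes, so it is not suitable for file content that itself contains angle brackets; a real parser has to recognize only its own tag names.

```typescript
type ParserEvent =
  | { kind: "open"; tag: string }
  | { kind: "text"; tag: string | null; text: string }
  | { kind: "close"; tag: string };

class TagStreamParser {
  private buffer = "";                   // unprocessed input, may end mid-tag
  private current: string | null = null; // tag currently open, if any

  // Feed one streamed chunk; returns the events that became decidable.
  push(chunk: string): ParserEvent[] {
    this.buffer += chunk;
    const events: ParserEvent[] = [];
    for (;;) {
      const lt = this.buffer.indexOf("<");
      if (lt === -1) {
        // No tag start in the buffer: everything is text.
        this.emitText(events, this.buffer);
        this.buffer = "";
        return events;
      }
      this.emitText(events, this.buffer.slice(0, lt));
      const gt = this.buffer.indexOf(">", lt);
      if (gt === -1) {
        // The tag is split across chunks: keep it until more input arrives.
        this.buffer = this.buffer.slice(lt);
        return events;
      }
      const name = this.buffer.slice(lt + 1, gt);
      this.buffer = this.buffer.slice(gt + 1);
      if (name.startsWith("/")) {
        events.push({ kind: "close", tag: name.slice(1) });
        this.current = null;
      } else {
        events.push({ kind: "open", tag: name });
        this.current = name;
      }
    }
  }

  private emitText(events: ParserEvent[], text: string): void {
    if (text.length > 0) events.push({ kind: "text", tag: this.current, text });
  }
}
```

The key property is that a tag split across two chunks produces no event until its closing ">" arrives, which a regex over each chunk cannot guarantee.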

8.4 Why there is an agent loop, not one LLM call

runAgentLoop supports multiple turns.

Why:

  • the model may need to inspect, write, install, command, then continue
  • tool results need to feed back into later reasoning
  • retries/continuations need bounded turn accounting

Why hard budgets exist:

  • prevent runaway loops
  • bound cost
  • bound context growth
  • preserve operational predictability
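
The effect of hard budgets can be sketched with a bounded loop. The budget fields, outcome names, and runBoundedLoop are hypothetical; the real runAgentLoop is asynchronous and tracks more state. The sketch is synchronous for brevity.

```typescript
// Result of one model turn, as reported by the caller-supplied step function.
interface TurnResult {
  done: boolean;      // the model signalled completion
  tokensUsed: number; // tokens consumed by this turn
}

interface LoopBudget {
  maxTurns: number;
  maxTokens: number;
}

interface LoopOutcome {
  reason: "done" | "max_turns" | "max_tokens";
  turns: number;
  tokens: number;
}

// Run turns until the model finishes or a budget is exhausted.
function runBoundedLoop(step: (turn: number) => TurnResult, budget: LoopBudget): LoopOutcome {
  let tokens = 0;
  for (let turn = 1; turn <= budget.maxTurns; turn++) {
    const result = step(turn);
    tokens += result.tokensUsed;
    if (result.done) return { reason: "done", turns: turn, tokens };
    if (tokens >= budget.maxTokens) return { reason: "max_tokens", turns: turn, tokens };
  }
  return { reason: "max_turns", turns: budget.maxTurns, tokens };
}
```

Every exit path carries an explicit reason, which is what makes a stopped run explainable afterwards.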

8.5 Why token usage is computed before and during execution

Edward computes provider-aware token usage because context exhaustion is one of the most common real failure modes in agent systems.

Why this is necessary:

  • different providers have different token windows
  • multimodal content changes token budgeting
  • reserving output tokens up front prevents the input context from crowding out the response budget
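
The budgeting arithmetic can be sketched as follows: input budget = context window - reserved output. The ModelWindow shape and the trimming policy (drop the oldest messages first) are assumptions, and the numbers in the usage note are illustrative rather than real provider limits.

```typescript
// Hypothetical per-model budget description.
interface ModelWindow {
  contextWindow: number;  // total tokens the model accepts
  reservedOutput: number; // tokens held back for the response
}

// Tokens available for the prompt once the output reservation is removed.
function inputBudget(window: ModelWindow): number {
  return Math.max(0, window.contextWindow - window.reservedOutput);
}

// Keep the newest messages that fit next to the system prompt.
// `messageTokens` is ordered oldest to newest; the result keeps that order.
function trimHistory(messageTokens: number[], systemTokens: number, window: ModelWindow): number[] {
  let remaining = inputBudget(window) - systemTokens;
  const kept: number[] = [];
  for (let i = messageTokens.length - 1; i >= 0; i--) {
    const tokens = messageTokens[i];
    if (tokens === undefined || tokens > remaining) break;
    kept.unshift(tokens);
    remaining -= tokens;
  }
  return kept;
}
```

With a 1000-token window and 200 tokens reserved, the input budget is 800. A 100-token system prompt leaves 700, so of the messages [500, 300, 200, 100] only the newest three (600 tokens) are kept.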

8.6 Why post-generation validation, autofix, and strict retry exist

Generated code is probabilistic. Production systems must add deterministic safety rails.

Edward uses:

  • postgen validation
  • deterministic autofixes
  • strict retry

Why this layered approach is better than “just regenerate everything”:

  • deterministic fixes are cheaper and faster
  • validation localizes problems
  • retry is only used when the output contract is still violated
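
The layering can be sketched as a small pipeline: validate, apply deterministic fixes, validate again, and only then fall back to regeneration. All names here (Files, stabilize, the callback signatures) are hypothetical, and the sketch is synchronous for brevity.

```typescript
type Files = Record<string, string>;

interface StabilizeResult {
  files: Files;
  problems: string[]; // problems still present when the function returned
  retries: number;    // how many regenerations were needed
}

function stabilize(
  initial: Files,
  validate: (files: Files) => string[],
  autofixes: Array<(files: Files) => Files>,
  regenerate: (problems: string[]) => Files,
  maxRetries: number,
): StabilizeResult {
  let files = initial;
  let retries = 0;
  for (;;) {
    let problems = validate(files);
    if (problems.length > 0) {
      // Cheap deterministic repairs first.
      for (const fix of autofixes) files = fix(files);
      problems = validate(files);
    }
    if (problems.length === 0 || retries >= maxRetries) {
      return { files, problems, retries };
    }
    // Expensive path: ask the model again, telling it what was wrong.
    files = regenerate(problems);
    retries++;
  }
}
```

When an autofix resolves every problem, regenerate is never called, which is where the cost and latency savings come from.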

This is one of the strongest “productionized AI” patterns in the repo.


9. Planning workflow and why it exists separately from stream execution

Planning modules:

  • services/planning/schemas.ts
  • workflow/engine.ts
  • analyzers/intentAnalyzer.ts
  • resolvers/dependency.resolver.ts
  • validators/postgenValidator.ts

9.1 Why planning is a workflow

Because planning has recoverable phases, not just a single pass.

Phases include:

  • ANALYZE
  • RESOLVE_PACKAGES
  • INSTALL_PACKAGES
  • GENERATE
  • BUILD
  • DEPLOY
  • RECOVER

Why explicit phase modeling matters:

  • allows retries with context
  • improves debuggability
  • lets the system fail in a known stage
  • reduces the amount of work shoved into one prompt

9.2 Why intent analysis uses the LLM but is schema-constrained

intentAnalyzer.ts asks the model for JSON and validates it with Zod.

Why:

  • intent classification is a fuzzy problem
  • but downstream code wants structured outputs

So the design is:

  • use LLM for ambiguity resolution
  • use schema validation for control
  • use fallback logic when classification fails

This is the right balance.
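
A sketch of the parse, validate, fall back pattern. The repo uses Zod for this; a hand-written guard is used here only to keep the example free of dependencies. The Intent fields and the fallback value are hypothetical.

```typescript
type Framework = "vite" | "next" | "vanilla";
type Action = "create" | "edit" | "fix";

// Hypothetical structured intent; the real schema lives in services/planning/schemas.ts.
interface Intent {
  framework: Framework;
  action: Action;
}

const FRAMEWORKS: Framework[] = ["vite", "next", "vanilla"];
const ACTIONS: Action[] = ["create", "edit", "fix"];
const FALLBACK_INTENT: Intent = { framework: "vite", action: "create" };

// Accept the model's JSON only if every field is one of the known values.
function parseIntent(rawModelOutput: string): Intent {
  let value: unknown;
  try {
    value = JSON.parse(rawModelOutput);
  } catch {
    return FALLBACK_INTENT; // not JSON at all
  }
  if (typeof value !== "object" || value === null) return FALLBACK_INTENT;
  const candidate = value as Record<string, unknown>;
  const framework = FRAMEWORKS.find((f) => f === candidate["framework"]);
  const action = ACTIONS.find((a) => a === candidate["action"]);
  if (framework === undefined || action === undefined) return FALLBACK_INTENT;
  return { framework, action };
}
```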

9.3 Why dependency resolution exists

The model may recommend packages, but the runtime must filter/verify them.

Why:

  • package names can be wrong
  • peer conflicts matter
  • some packages are blocked for sandbox/runtime reasons

This is why the system does not blindly trust model-emitted package lists.
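
A minimal sketch of filtering a model-suggested package list. The name pattern is a simplified approximation of npm naming rules, and the blocklist passed in the usage note is made up for illustration. Peer-conflict checks and registry lookups are out of scope here.

```typescript
interface PackageDecision {
  accepted: string[];
  rejected: Array<{ name: string; reason: string }>;
}

// Simplified: lowercase, optional @scope/, URL-safe characters only.
const NPM_NAME_PATTERN = /^(@[a-z0-9][a-z0-9._-]*\/)?[a-z0-9][a-z0-9._-]*$/;

function filterPackages(requested: string[], blocked: string[]): PackageDecision {
  const accepted: string[] = [];
  const rejected: Array<{ name: string; reason: string }> = [];
  for (const name of requested) {
    if (accepted.indexOf(name) !== -1) continue; // skip duplicates silently
    if (name.length > 214 || !NPM_NAME_PATTERN.test(name)) {
      rejected.push({ name, reason: "invalid package name" });
    } else if (blocked.indexOf(name) !== -1) {
      rejected.push({ name, reason: "blocked in sandbox" });
    } else {
      accepted.push(name);
    }
  }
  return { accepted, rejected };
}
```

For example, filterPackages(["react", "React", "left pad", "node-pty", "@types/node"], ["node-pty"]) accepts react and @types/node and rejects the other three with a reason each.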


10. Sandbox architecture and why it is unusually important here

Key modules:

  • lifecycle/provisioning.ts
  • docker.service.ts
  • write/buffer.ts
  • write/flush.ts
  • command.service.ts
  • builder/unified-build/orchestrator.ts
  • state.service.ts

10.1 Why sandbox state is in Redis

Sandbox instances are ephemeral runtime resources with TTLs.

Why Redis instead of Postgres here:

  • sandbox liveness is operational state, not primary product truth
  • TTL refresh and quick lookup matter
  • container lifecycle reconciliation is fast-path coordination

10.2 Why sandbox writes are buffered

Edward streams file content incrementally. Writing every tiny chunk directly to disk/container would be noisy and slow.

So the system:

  • buffers file chunks in Redis
  • periodically flushes them to the container
  • uses distributed locks around flush

Why this is smart:

  • reduces write churn
  • handles chunked file streaming naturally
  • coordinates concurrent writes safely
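
The buffer-then-flush idea can be sketched in memory. Here a Map stands in for the Redis buffer and a boolean stands in for the distributed lock; both substitutions, and the FileWriteBuffer class itself, are assumptions made to keep the example self-contained.

```typescript
class FileWriteBuffer {
  private pending = new Map<string, string[]>(); // path -> chunks not yet flushed
  private flushing = false;                      // stand-in for the distributed flush lock

  // Called for every streamed chunk; cheap, no container I/O.
  append(path: string, chunk: string): void {
    const chunks = this.pending.get(path);
    if (chunks === undefined) {
      this.pending.set(path, [chunk]);
    } else {
      chunks.push(chunk);
    }
  }

  // Called periodically; writes one combined chunk per file.
  // Returns the number of files written, or 0 if another flush holds the lock.
  flush(appendToFile: (path: string, content: string) => void): number {
    if (this.flushing) return 0;
    this.flushing = true;
    try {
      const snapshot = this.pending;
      this.pending = new Map<string, string[]>();
      snapshot.forEach((chunks, path) => appendToFile(path, chunks.join("")));
      return snapshot.size;
    } finally {
      this.flushing = false;
    }
  }
}
```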

10.3 Why protected framework files exist

The template registry marks files such as package.json, tsconfig, framework configs, and core CSS files as protected.

Why:

  • models are much less reliable when editing sensitive build/config files
  • most user value is in app code, not infra/config drift
  • protecting these files preserves build stability

This is a product safety rail, not a limitation by accident.
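
A sketch of how a protected-file check might gate model-emitted writes. The file list and patterns are illustrative assumptions; the real list comes from the template registry and is framework-specific.

```typescript
// Illustrative protected entries at the project root.
const PROTECTED_FILES = ["package.json", "tsconfig.json"];
const PROTECTED_PATTERNS = [/^(vite|next)\.config\.[cm]?[jt]s$/];

// True when the path names a protected root-level file.
function isProtectedPath(path: string): boolean {
  const normalized = path.replace(/^\.\//, "");
  if (PROTECTED_FILES.indexOf(normalized) !== -1) return true;
  return PROTECTED_PATTERNS.some((pattern) => pattern.test(normalized));
}

// Split a batch of writes into those that may proceed and those that are refused.
function partitionWrites(paths: string[]): { allowed: string[]; blocked: string[] } {
  return {
    allowed: paths.filter((p) => !isProtectedPath(p)),
    blocked: paths.filter((p) => isProtectedPath(p)),
  };
}
```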

10.4 Why command execution is allowlisted

command.service.ts validates:

  • command name
  • argument count/length
  • path safety
  • dangerous patterns

Why:

  • the sandbox is isolated, but still not trusted blindly
  • guardrails reduce accidental destructive behavior
  • product behavior becomes auditable and predictable
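
The four checks listed above can be sketched as one validation function. The allowlist, the limits, and the metacharacter pattern are hypothetical values chosen for illustration and are not the ones in command.service.ts.

```typescript
type CommandCheck = { ok: true } | { ok: false; reason: string };

// Illustrative policy values.
const ALLOWED_COMMANDS = ["ls", "cat", "node", "npm", "pnpm"];
const MAX_ARGS = 16;
const MAX_ARG_LENGTH = 256;
const SHELL_METACHARACTERS = /[;&|`$<>]/;

function validateCommand(command: string, args: string[]): CommandCheck {
  // 1. Command name must be on the allowlist.
  if (ALLOWED_COMMANDS.indexOf(command) === -1) {
    return { ok: false, reason: `command not allowed: ${command}` };
  }
  // 2. Argument count and length are bounded.
  if (args.length > MAX_ARGS) return { ok: false, reason: "too many arguments" };
  for (const arg of args) {
    if (arg.length > MAX_ARG_LENGTH) return { ok: false, reason: "argument too long" };
    // 3. Dangerous patterns: anything a shell would interpret.
    if (SHELL_METACHARACTERS.test(arg)) {
      return { ok: false, reason: "shell metacharacters are not allowed" };
    }
    // 4. Path safety: no absolute paths, no ".." segments.
    if (arg.startsWith("/") || arg.split("/").indexOf("..") !== -1) {
      return { ok: false, reason: "path escapes the project directory" };
    }
  }
  return { ok: true };
}
```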

10.5 Why builds happen after generation in a unified build orchestrator

buildAndUploadUnified handles:

  • dependency presence checks
  • framework detection
  • merge/install dependency logic
  • build execution
  • preview upload
  • cache invalidation

Why not “just run npm run build”:

  • different frameworks output differently
  • preview hosting needs path/base handling
  • dependencies may need reconciliation
  • upload and routing are part of the build product

11. Preview and deployment architecture

11.1 Path mode vs subdomain mode

Configured via EDWARD_DEPLOYMENT_TYPE.

Why two modes:

  • local/self-hosted environments often want simple path-based previews
  • production environments may want nicer subdomain-based previews

This avoids forcing one infrastructure assumption everywhere.

11.2 Why preview routing uses KV

Subdomain routing needs a fast edge lookup from subdomain -> storage prefix.

Cloudflare KV is a practical fit because:

  • low-latency reads at edge
  • simple key-value mapping
  • decoupled from the main DB

Why not use Postgres for request-time routing:

  • worse latency profile for edge routing
  • unnecessary coupling between preview serving and primary transactional DB

11.3 Why preview URL is also stored on build records

Because the user cares about “what is the latest preview for this chat/build”.

Persisting preview URLs on builds means:

  • API can answer build status quickly
  • UI can bootstrap preview state without recomputing routing every time

12. GitHub integration architecture

Important files:

  • packages/octokit/index.ts
  • apps/api/services/github/sync.service.ts
  • apps/api/services/github/token.service.ts
  • apps/api/services/github/repoBinding.service.ts

12.1 Why repo binding is a first-class concept

Chats/projects can be linked to repos.

Why bind at chat level:

  • a chat represents a project
  • repo sync is a project concern, not a user-global concern

12.2 Why GitHub token handling is wrapped

token.service.ts decrypts and migrates token storage.

Why centralize this:

  • auth provider data should not be parsed ad hoc everywhere
  • token encryption migration needs one place

12.3 Why sync uses a manifest

packages/octokit/index.ts uses .edward-sync-manifest.json.

Why:

  • lets Edward track which files it manages
  • supports deletion of files removed locally
  • avoids blind destructive sync over unknown repo content

This is a very practical design choice.
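
The manifest logic can be sketched as a diff between the previously synced paths and the current local files. Representing the manifest as a plain list of paths is an assumption; the actual format of .edward-sync-manifest.json is not described here, and planSync is a hypothetical helper.

```typescript
interface SyncPlan {
  upsert: string[]; // files to create or update in the repo
  remove: string[]; // files Edward previously wrote that no longer exist locally
}

// Only paths recorded in the previous manifest are ever removed, so files
// that were added to the repo by other means are left untouched.
function planSync(previousManifest: string[], localFiles: string[]): SyncPlan {
  const local = new Set(localFiles);
  return {
    upsert: localFiles.slice().sort(),
    remove: previousManifest.filter((path) => !local.has(path)).sort(),
  };
}
```

After a successful sync, the new manifest would be the upsert list, which becomes previousManifest for the next run.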


13. Frontend state architecture

13.1 Server state

Handled mainly with React Query:

  • chat history
  • metadata
  • quotas
  • active run lookup
  • GitHub status

Why:

  • cache lifecycle
  • stale time
  • refetch policies
  • request deduplication

13.2 Stream state

Handled with the Zustand chat stream store.

Why:

  • append-heavy mutable event streams
  • low friction updates per chunk/frame
  • simpler than putting stream mutation logic into React component trees

13.3 Sandbox UI state

Handled with Zustand sandbox slices:

  • files
  • editor selection
  • preview URL
  • build status/errors
  • terminal output
  • open/close state

Why separate from chat stream state:

  • stream state represents live assistant output
  • sandbox state represents persistent project workspace UI

These are related but not identical concerns.

13.4 Why the chat route does server-side access probing

apps/web/app/chat/[id]/page.tsx checks access and metadata server-side.

Why:

  • avoids client-only “flash then deny”
  • supports route metadata generation
  • improves correctness for private chat pages

14. Security posture and why these controls exist

Important controls:

  • auth middleware for all protected API routes
  • rate limits backed by Redis
  • encrypted API keys
  • command allowlists
  • protected template files
  • Docker isolation
  • CSP, Helmet, and CORS on the API
  • request ids and security telemetry

Why the repo has many small security modules instead of one giant security file:

  • security concerns happen at different layers
  • auth, rate limit, encryption, runtime isolation, and telemetry are separate controls

Keeping each control in its own module matches the layer where it is enforced.
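
To show what one of these small controls might look like, here is a minimal sketch of a command allowlist check. The allowlist contents, the `checkCommand` function, and the metacharacter rule are assumptions for illustration; the real rules live in the sandbox command service and may differ.

```typescript
// Hypothetical allowlist: binary name -> permitted first argument (subcommand).
const ALLOWED_COMMANDS = new Map<string, readonly string[]>([
  ["pnpm", ["install", "add", "run", "build"]],
  ["npm", ["install", "run"]],
]);

// Characters that would let an argument chain, redirect, or substitute in a shell.
const SHELL_METACHARACTERS = /[;&|`$<>\\\n]/;

interface CommandCheck {
  allowed: boolean;
  reason?: string;
}

// Deny by default: a command passes only if every rule explicitly permits it.
function checkCommand(binary: string, args: string[]): CommandCheck {
  const subcommands = ALLOWED_COMMANDS.get(binary);
  if (subcommands === undefined) {
    return { allowed: false, reason: `binary not allowed: ${binary}` };
  }
  const [subcommand] = args;
  if (subcommand === undefined || !subcommands.includes(subcommand)) {
    return { allowed: false, reason: `subcommand not allowed for ${binary}` };
  }
  const suspicious = args.find((arg) => SHELL_METACHARACTERS.test(arg));
  if (suspicious !== undefined) {
    return { allowed: false, reason: `shell metacharacter in argument: ${suspicious}` };
  }
  return { allowed: true };
}

const okInstall = checkCommand("pnpm", ["install"]);
const deniedBinary = checkCommand("rm", ["-rf", "/"]);
const deniedChaining = checkCommand("pnpm", ["run", "build; curl example.invalid"]);
```

The check is deliberately deny-by-default, so a new binary or subcommand has to be added to the list on purpose rather than slipping through.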


15. Reliability posture and why the repo feels “production-minded”

Signals of maturity:

  • graceful shutdown in API and worker
  • durable run events
  • queue-based long-running execution
  • resumable streams using Last-Event-ID
  • cancellation via pub/sub plus durable verification
  • DB-backed admission control
  • checkpointing of agent loop state
  • explicit terminal finalization logic
  • post-generation validators
  • quality gate scripts

These are not “extra code”. They are the difference between demo code and production-oriented code.
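
As one example, resumable streaming with Last-Event-ID can be sketched as follows. The `RunEvent` shape, the numeric `seq` field, and the helper names are hypothetical; the actual replay logic lives in apps/api/services/run-event-stream-utils/service.ts and may work differently.

```typescript
// Hypothetical persisted event row; the real run_event schema may differ.
interface RunEvent {
  seq: number; // assumed to increase monotonically within a run
  type: string;
  payload: unknown;
}

// A missing or non-numeric Last-Event-ID means "replay from the start".
function parseLastEventId(header: string | undefined): number {
  if (header === undefined || !/^\d+$/.test(header)) return 0;
  return Number(header);
}

// Select the persisted events the client has not seen yet, in order.
function eventsToReplay(persisted: RunEvent[], lastEventId: string | undefined): RunEvent[] {
  const lastSeen = parseLastEventId(lastEventId);
  return persisted.filter((e) => e.seq > lastSeen).sort((a, b) => a.seq - b.seq);
}

// Serialize one event as an SSE frame. The browser's EventSource echoes the last
// `id:` it received in the Last-Event-ID request header when it reconnects.
function toSseFrame(event: RunEvent): string {
  return `id: ${event.seq}\nevent: ${event.type}\ndata: ${JSON.stringify(event.payload)}\n\n`;
}

const log: RunEvent[] = [
  { seq: 1, type: "text", payload: { content: "Planning" } },
  { seq: 2, type: "text", payload: { content: "Writing files" } },
  { seq: 3, type: "done", payload: { ok: true } },
];
const replay = eventsToReplay(log, "2");
const frame = toSseFrame(log[2]);
```

Because events are persisted before they are streamed, a client that reconnects after event 2 receives only event 3, and a client with no header receives the full log.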


16. Testing and quality gates

Current rough shape:

  • API tests: about 91 files
  • Web tests: light, about 3 files
  • Shared package tests: light but present

Why API tests dominate:

  • most complexity and failure modes live in orchestration, streaming, sandboxing, and workers
  • UI is large, but much of it is composition/presentation on top of backend contracts

Quality scripts in apps/api/scripts/quality cover:

  • architecture boundaries
  • duplication checks
  • coverage checks
  • function-length checks
  • file audit generation

Why these custom scripts exist:

  • generic linting does not enforce architecture well enough
  • this codebase has non-trivial layering rules
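
A boundary check of this kind can be small. The sketch below assumes a hypothetical layering rule and a pre-extracted list of import edges; the layer names mirror directories listed in this document, but the rule table and the `findViolations` function are illustrative and not the actual scripts.

```typescript
// Hypothetical layering rule: which layers each layer may import from.
const ALLOWED_IMPORTS = new Map<string, readonly string[]>([
  ["controllers", ["services", "lib", "utils"]],
  ["services", ["lib", "utils"]],
  ["lib", ["utils"]],
  ["utils", []],
]);

// One import statement, expressed as paths relative to apps/api.
interface ImportEdge {
  from: string;
  to: string;
}

// The layer is the first path segment, e.g. "services/runs/x.ts" -> "services".
function layerOf(path: string): string {
  const [layer = ""] = path.split("/");
  return layer;
}

// Return every edge that crosses layers in a direction the rule table does not permit.
// Files in a layer missing from the table are reported too, so new layers must be declared.
function findViolations(edges: ImportEdge[]): ImportEdge[] {
  return edges.filter(({ from, to }) => {
    const source = layerOf(from);
    const target = layerOf(to);
    if (source === target) return false; // imports within a layer are fine
    const allowed = ALLOWED_IMPORTS.get(source);
    return allowed === undefined || !allowed.includes(target);
  });
}

const violations = findViolations([
  { from: "controllers/chat/send.ts", to: "services/runs/runEvents.service.ts" },
  { from: "services/runs/runEvents.service.ts", to: "controllers/chat/send.ts" },
  { from: "lib/llm/compose.ts", to: "utils/text.ts" },
]);
```

A generic linter can flag unused imports, but it has no notion of "services must not import controllers", which is why a rule table like this needs its own script.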

17. Key things a junior engineer must understand before making changes

17.1 Never confuse message history with run execution history

  • message is the conversation artifact
  • run_event is the execution/replay log

17.2 Never treat Redis state as the source of truth for business data

Redis is coordination state. Postgres is durable business truth.

17.3 Never assume the browser connection is the lifetime of the work

The worker owns durable execution. The browser only observes it.

17.4 Never casually edit protected sandbox template files or remove guardrails

Those protections exist because AI-generated config churn destroys stability.

17.5 Never bypass run admission and queueing for “quick fixes”

That breaks fairness, concurrency guarantees, and operational predictability.

17.6 Never add a new stream event without updating both sides

If you change stream contracts, review:

  • backend emitters
  • persistence/replay
  • frontend stream processor
  • shared type contracts
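
One way to make the compiler enforce this checklist is an exhaustive switch over a shared discriminated union. The event names and the `describeEvent` function below are hypothetical; the real contract lives in packages/shared/src/streamEvents.ts.

```typescript
// Hypothetical shared contract, imported by both backend emitters and the frontend.
type StreamEvent =
  | { type: "text"; content: string }
  | { type: "file_written"; path: string }
  | { type: "done" };

// Frontend-side handler. The `never` assignment makes the compiler reject this file
// as soon as a new event type is added to the union without a matching case.
function describeEvent(event: StreamEvent): string {
  switch (event.type) {
    case "text":
      return `text:${event.content}`;
    case "file_written":
      return `file:${event.path}`;
    case "done":
      return "done";
    default: {
      const unhandled: never = event;
      throw new Error(`Unhandled stream event: ${JSON.stringify(unhandled)}`);
    }
  }
}

const described = describeEvent({ type: "file_written", path: "src/App.tsx" });
```

The runtime `throw` still matters for replay: persisted events written by a newer backend can reach an older frontend bundle, and failing loudly is easier to debug than silently dropping them.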

18. How I would explain the main architectural “WHY” in one paragraph

Edward is built as a durable, queue-backed, sandboxed AI execution platform rather than a thin chat wrapper because real code generation is slow, stateful, failure-prone, and operationally dangerous. The architecture separates durable truth (Postgres), fast coordination (Redis), long-running work (BullMQ worker), isolated execution (Docker sandboxes), and progressive UX (SSE + persisted run events). The frontend is split between React Query for server truth and Zustand for live streaming/UI runtime state. The repo uses shared packages to keep contracts aligned and validators/guardrails to turn probabilistic model output into something closer to a deterministic product.


19. File map by major area

This section is the “how do I navigate the repo quickly” map.

19.1 Root and workspace

  • README.md: product + local setup overview
  • package.json: top-level scripts
  • turbo.json: build graph and env config
  • pnpm-workspace.yaml: workspace package boundaries
  • scripts/build-local-sandboxes.sh: local sandbox image prep

19.2 API composition and delivery

  • apps/api/server.http.ts: API bootstrap and shutdown
  • apps/api/queue.worker.ts: worker bootstrap and background loops
  • apps/api/server/http/app.factory.ts: Express app assembly
  • apps/api/routes/*.ts: route wiring
  • apps/api/controllers/chat/query/*.ts: read/query/build/run delivery controllers
  • apps/api/middleware/*.ts: auth, rate limiting, validation, telemetry

19.3 Runs and execution

  • apps/api/services/runs/messageOrchestrator.service.ts: main send-message entry
  • apps/api/services/runs/runAdmission.service.ts: load shedding and admission control
  • apps/api/services/runs/runMetadata.ts: durable metadata/checkpoint schema
  • apps/api/services/runs/runEvents.service.ts: publish/persist run events
  • apps/api/services/runs/agent-run-worker/*: worker execution engine
  • apps/api/services/run-event-stream-utils/service.ts: replay + live SSE bridge

19.4 Chat session runtime

  • apps/api/services/chat/session/orchestrator/runStreamSession.orchestrator.ts
  • apps/api/services/chat/session/loop/agentLoop.runner.ts
  • apps/api/services/chat/session/events/handler.ts
  • apps/api/services/chat/session/orchestrator/*
  • apps/api/services/chat/session/loop/*

19.5 Planning

  • apps/api/services/planning/schemas.ts
  • apps/api/services/planning/analyzers/intentAnalyzer.ts
  • apps/api/services/planning/resolvers/dependency.resolver.ts
  • apps/api/services/planning/validators/*
  • apps/api/services/planning/workflow/*

19.6 LLM abstraction

  • apps/api/lib/llm/provider.client.ts: provider-specific generation/streaming
  • apps/api/lib/llm/provider.helpers.ts: normalization and provider/model checks
  • apps/api/lib/llm/compose.ts: prompt assembly
  • apps/api/lib/llm/prompts/sections.ts: main system prompt contract
  • apps/api/lib/llm/parser*.ts: streaming parser
  • apps/api/lib/llm/tokens*.ts: token budgeting

19.7 Sandbox

  • apps/api/services/sandbox/lifecycle/provisioning.ts
  • apps/api/services/sandbox/lifecycle/cleanup.ts
  • apps/api/services/sandbox/docker.service.ts
  • apps/api/services/sandbox/state.service.ts
  • apps/api/services/sandbox/command.service.ts
  • apps/api/services/sandbox/write/*
  • apps/api/services/sandbox/read/*
  • apps/api/services/sandbox/builder/*
  • apps/api/services/sandbox/templates/*

19.8 Preview/storage/routing

  • apps/api/services/storage.service.ts
  • apps/api/services/storage/*
  • apps/api/services/preview.service.ts
  • apps/api/services/previewRouting/*

19.9 GitHub

  • apps/api/services/github/github.useCase.ts
  • apps/api/services/github/sync.service.ts
  • apps/api/services/github/token.service.ts
  • apps/api/services/github/repoBinding.service.ts
  • packages/octokit/index.ts

19.10 Frontend routing and app shell

  • apps/web/app/layout.tsx
  • apps/web/app/providers.tsx
  • apps/web/app/page.tsx
  • apps/web/app/chat/[id]/page.tsx
  • apps/web/app/changelog/page.tsx
  • apps/web/app/api/auth/[...all]/route.ts

19.11 Frontend chat workspace

  • apps/web/components/chat/chatPageClient.tsx
  • apps/web/components/chat/chatWorkspace.tsx
  • apps/web/components/chat/chatWorkspaceDesktop.tsx
  • apps/web/components/chat/chatWorkspaceMobile.tsx
  • apps/web/components/chat/messages/*
  • apps/web/components/chat/sandbox/*
  • apps/web/stores/chatStream/*
  • apps/web/stores/sandbox/*
  • apps/web/lib/streaming/processors/chatStreamProcessor.ts
  • apps/web/lib/parsing/*
  • apps/web/lib/api/*

19.12 Shared packages

  • packages/auth/lib/schema.ts
  • packages/auth/lib/auth.ts
  • packages/auth/lib/db.ts
  • packages/auth/lib/run.ts
  • packages/auth/lib/build.ts
  • packages/shared/src/constants.ts
  • packages/shared/src/schema.ts
  • packages/shared/src/streamEvents.ts
  • packages/shared/src/chat/types.ts
  • packages/shared/src/chat/streamActions.ts
  • packages/shared/src/api/contracts.ts
  • packages/ui/src/components/*

20. Directory-level inventory

This is not a prose description of every leaf UI component, because that would bury the actual KT. Instead, use this as the completeness map for where code lives.

20.1 apps/api

Major subareas:

  • controllers/
  • routes/
  • middleware/
  • lib/
  • services/
  • schemas/
  • utils/
  • tests/

High-density domains:

  • services/chat
  • services/runs
  • services/sandbox
  • services/planning
  • services/github
  • services/queue

20.2 apps/web

Major subareas:

  • app/
  • components/chat/
  • components/home/
  • components/changelog/
  • hooks/
  • lib/
  • stores/chatStream/
  • stores/sandbox/

High-density domains:

  • chat UI and workspace
  • SSE parsing and stream orchestration
  • sandbox/editor/preview UI

20.3 packages/auth

Major subareas:

  • lib/: auth, db, schema, build/run helpers
  • drizzle/: SQL migrations

20.4 packages/shared

Major subareas:

  • src/constants.ts
  • src/schema.ts
  • src/streamEvents.ts
  • src/chat/*
  • src/github/*
  • src/api/*
  • src/llm/*

20.5 packages/ui

Major subareas:

  • src/components/: reusable primitives
  • src/hooks/
  • src/lib/
  • src/styles/

20.6 docker/templates

Major subareas:

  • nextjs
  • vite-react
  • vanilla
  • base

These templates define the scaffold/runtime assumptions for generated projects and sandbox images.


21. Practical onboarding order for a new engineer

If I were onboarding someone senior-but-new, I would ask them to read in this exact order:

  1. Root README.md
  2. apps/api/README.md
  3. packages/auth/lib/schema.ts
  4. apps/api/server.http.ts
  5. apps/api/server/http/app.factory.ts
  6. apps/api/routes/chat.routes.ts
  7. apps/api/services/runs/messageOrchestrator.service.ts
  8. apps/api/services/runs/runAdmission.service.ts
  9. apps/api/services/runs/agent-run-worker/processor.ts
  10. apps/api/services/chat/session/orchestrator/runStreamSession.orchestrator.ts
  11. apps/api/services/chat/session/loop/agentLoop.runner.ts
  12. apps/api/services/chat/session/events/handler.ts
  13. apps/api/services/sandbox/lifecycle/provisioning.ts
  14. apps/api/services/sandbox/builder/unified-build/orchestrator.ts
  15. apps/web/app/chat/[id]/page.tsx
  16. apps/web/components/chat/chatPageClient.tsx
  17. apps/web/stores/chatStream/controller.ts
  18. apps/web/lib/streaming/processors/chatStreamProcessor.ts

If they understand those files, they understand the heart of Edward.


22. Final summary

Edward is a monorepo for a durable AI code-generation product, not a thin LLM wrapper. The architecture optimizes for correctness, replayability, operator safety, and product durability:

  • Postgres keeps durable truth.
  • Redis handles fast coordination.
  • BullMQ decouples request acceptance from long-running execution.
  • Docker sandboxes isolate generated code.
  • S3/CDN/KV separate preview hosting from execution.
  • SSE plus persisted run events make streaming resumable.
  • Shared packages prevent contract drift.
  • Planning, validation, autofix, and retry layers turn model output into something operationally usable.

That is the core “why” behind almost every serious architectural decision in this repo.
