Overview

Edward KT

1. What Edward actually is

Edward is an AI-assisted frontend application builder.

At the product level, the user experience is:

A user signs in with GitHub.
The user stores their own LLM API key inside the product.
The user asks Edward to create, edit, or fix a web app in chat.
Edward plans the work, streams progress, writes files into an isolated sandbox, installs dependencies, runs commands, builds the app, uploads preview assets, and optionally syncs the code to GitHub.
The user sees the generated app, file tree, terminal/build feedback, and live preview.

The system is intentionally split across:

apps/web: the product UI and auth surface
apps/api: the synchronous HTTP API, SSE delivery, orchestration, and infrastructure adapters
apps/api/queue.worker.ts: the asynchronous worker process for long-running build/run jobs
packages/auth: auth + database schema + database access
packages/shared: shared contracts, enums, stream event types, chat types, model metadata
packages/ui: shared UI primitives for the frontend
packages/octokit: shared GitHub sync client helpers

This split is not accidental. Edward is not just “a chat app”. It is a distributed system with:

durable state
transient coordination state
long-running execution
resumable streaming
isolated code execution
artifact publishing
external system integration

That is the fundamental lens juniors should use when reading this repo.

2. The highest-level architectural decisions and why they exist

2.1 Why a monorepo

Edward uses a pnpm + Turbo monorepo because the frontend, API, worker, DB schema, shared contracts, and UI primitives evolve together.

Why this was chosen instead of separate repos:

Shared contracts are first-class. Stream events, chat types, API response shapes, model catalogs, and rate-limit scopes must not drift.
Frontend and backend changes usually land together. A single PR often changes stream event shape, API behavior, and UI rendering.
Tooling is simpler. One workspace gives consistent TypeScript, ESLint, and build orchestration.
Local development is more realistic. pnpm dev starts the whole system shape, not disconnected fragments.

Tradeoff:

Monorepos become large and noisy.
Build graph discipline is required.

This repo counters that with:

separate packages for shared concerns
Turbo task boundaries
architecture boundary checks in apps/api/scripts/quality

2.2 Why Next.js on the frontend

apps/web is a Next.js 16 App Router app.

Why Next.js instead of plain Vite for the product UI:

server-rendered marketing/auth/changelog pages are easier
metadata, sitemap, robots, SEO handling are built in
auth route handling integrates cleanly with Better Auth
product pages and landing pages can live in the same app

Important nuance:

Edward generates Vite/Next/Vanilla projects for users, but the Edward product itself uses Next.js. Those are different concerns.

2.3 Why Express for the API instead of putting everything inside Next route handlers

The repo deliberately keeps orchestration in apps/api as a separate Express app.

Why:

stream-oriented chat delivery is easier to reason about in a dedicated API
queue worker and API share backend services cleanly
long-running orchestration should not be tightly coupled to the frontend deployment shape
HTTP API can scale independently from the web UI
operational concerns like graceful shutdown, Redis connections, worker coordination, and SSE backpressure are clearer in a dedicated server

Why not “just use Next API routes”:

the runtime model becomes mixed and harder to reason about
long-running SSE and worker orchestration become more awkward
explicit process separation is valuable here

2.4 Why a dedicated worker process

The API process accepts requests. The worker process executes long-running jobs.

Why this is the correct split:

user HTTP connections are short-lived and unreliable
LLM execution, sandbox interaction, dependency install, build, and artifact publishing can outlive the browser connection
workers give retries, concurrency control, and isolation from request latency
the system can continue work even if the client disconnects

This is one of the most important architectural choices in the whole repo.

Without it, Edward would behave like a fragile synchronous demo.

With it, Edward behaves like a real product with durable execution.

2.5 Why Postgres

Postgres is the durable source of truth.

It stores:

users, sessions, accounts
chats and messages
runs and run events
builds
attachments
repo bindings and related product state

Why Postgres instead of Redis-only or document storage:

chat history and run history are durable product data, not cache
run events need ordering and replay
auth tables fit naturally into a relational model
ownership and filtering by user/chat/run are frequent and relational
transactional admission control is easier and safer

The strongest proof is createRunWithUserLimit in packages/auth/lib/run.ts. It uses DB transactions and advisory locks for safe concurrent run admission.

2.6 Why Redis

Redis is used in many places, but always for fast, ephemeral, coordination-heavy workloads.

Redis responsibilities:

BullMQ queue backend
distributed locks
sandbox state and TTLs
write buffers for streamed file output
pub/sub for run cancellation and live status fanout
request rate-limit stores

Why Redis instead of doing all of this in Postgres:

queues and pub/sub need low-latency operational semantics
lock contention and TTL semantics are simpler in Redis
sandbox write buffering is a classic ephemeral buffering problem
rate limiting is a cache-like, time-windowed workload

This is a good architecture split:

Postgres = durable truth
Redis = fast coordination and ephemeral runtime state

2.7 Why BullMQ

BullMQ is the job orchestration layer on top of Redis.

Why BullMQ:

already fits the Redis runtime Edward needs
supports worker concurrency and queue separation
familiar operational model for TypeScript/Node systems
good fit for “enqueue durable work from API, execute elsewhere”

Why not build a custom queue:

queue correctness is hard
retries, visibility, backpressure, and operational observability are all non-trivial

Edward keeps two main job categories:

build jobs
agent run jobs

That separation matters because build workloads and agent-stream workloads have different runtime behavior and failure modes.

2.8 Why Docker-backed sandboxes

This is the most product-defining infrastructure choice.

Edward writes and executes generated code in Docker containers.

Why:

isolation from the host machine
deterministic environment for install/build/command execution
safer command execution boundary
framework-specific template images can be prewarmed
easier cleanup and lifecycle control

Why not run code directly on the API machine filesystem:

far less safe
dependency conflicts become unmanageable
cleanup is harder
one broken project can poison the environment for others

The sandbox is not just an implementation detail. It is the product’s execution boundary.

2.9 Why S3 + CloudFront + optional Cloudflare KV

Edward separates “code generation” from “preview hosting”.

S3 stores preview artifacts
CloudFront serves previews and can be invalidated
Cloudflare KV optionally maps subdomains to preview storage prefixes

Why this is better than serving builds directly from live containers:

static previews are cheaper and more stable
finished artifacts survive sandbox lifecycle cleanup
containers can be ephemeral while previews remain available
preview serving and code execution are decoupled

This is a strong architecture choice because it turns preview hosting into a static asset problem instead of a container serving problem.

2.10 Why GitHub OAuth + GitHub sync

GitHub is used both for user authentication and for repository integration.

Why GitHub auth:

Edward’s target users already live in GitHub-centric workflows
repo sync is a core feature, so GitHub identity is a natural anchor
one provider reduces auth complexity

Why sync through GitHub APIs instead of shelling out to git in the sandbox:

less credential management complexity inside containers
direct tree/blob APIs are deterministic and auditable
easier to manage partial file sync and manifests

2.11 Why bring-your-own API keys

Users store their own OpenAI/Gemini/Anthropic API keys.

Why:

cost ownership stays with the user
provider choice stays with the user
product does not need to centralize all model billing risk
enterprise-style customers often prefer controlling their own provider credentials

Why encrypted at rest:

these are real credentials, not preferences
API keys must not sit in plaintext in Postgres

This is why apps/api/utils/encryption.ts and apps/api/utils/secretEnvelope.ts exist.

2.12 Why SSE instead of WebSockets for streaming

Edward streams progress to the browser using Server-Sent Events.

Why SSE:

server-to-client streaming is the actual need
simpler than full bidirectional socket infra
reconnect semantics are straightforward
Last-Event-ID replay model fits persisted run events well

Why not WebSockets:

more operational complexity
bidirectional realtime control is not the dominant requirement
durability + replay matter more than low-level socket interactivity

2.13 Why persist stream events

This is one of the smartest choices in the repo.

Edward does not treat stream events as disposable transport. It stores them in run_event.

Why:

browser reconnects can resume streams
active run pages can rehydrate state after refresh
debugging becomes far easier
a run has a durable audit trail
the stream is no longer coupled to one live TCP connection

This is the reason the system feels durable instead of “best effort”.
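
The resume behaviour that persisted events make possible can be sketched roughly as follows. This is a minimal illustration, not the repo's code: RunEvent, formatSseFrame, parseLastEventId, and replayAfter are hypothetical names, and it assumes each persisted event carries a monotonically increasing seq (starting at 1) that is sent as the SSE id field.

```typescript
// Hypothetical shape of a persisted run event (assumption, not the real schema).
interface RunEvent {
  seq: number;      // monotonically increasing per run, starting at 1
  type: string;     // e.g. "text", "file", "done"
  payload: unknown; // JSON-serializable body
}

// Serialize one event as an SSE frame. The "id:" field is what the browser
// echoes back in the Last-Event-ID header when it reconnects.
function formatSseFrame(event: RunEvent): string {
  return `id: ${event.seq}\nevent: ${event.type}\ndata: ${JSON.stringify(event.payload)}\n\n`;
}

// A missing or malformed header means "replay from the start".
function parseLastEventId(header: string | undefined): number {
  const n = Number.parseInt(header ?? "", 10);
  return Number.isFinite(n) && n >= 0 ? n : 0;
}

// Return the frames a reconnecting client still needs, in sequence order.
function replayAfter(events: readonly RunEvent[], lastEventId: string | undefined): string[] {
  const after = parseLastEventId(lastEventId);
  return events
    .filter((e) => e.seq > after)
    .sort((a, b) => a.seq - b.seq)
    .map(formatSseFrame);
}
```

Because the replay source is the stored event list rather than the live connection, a client that reconnects with Last-Event-ID: 1 receives only events 2 and later.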

3. Runtime topology

At runtime, think in terms of five planes.

3.1 Presentation plane

apps/web
Next.js pages, route handlers, React state, chat workspace, sandbox UI

3.2 Delivery/API plane

apps/api/server.http.ts
Express routes/controllers/middleware
SSE response management

3.3 Execution plane

apps/api/queue.worker.ts
BullMQ workers
agent runs, builds, backups

3.4 Durable data plane

Postgres via Drizzle in packages/auth

3.5 Ephemeral coordination plane

Redis
locks
queue backend
pub/sub
sandbox state
rate limiting

The system is healthy when these planes are kept conceptually separate.

4. Repo walkthrough

4.1 Root

Important root files:

package.json: workspace scripts and security overrides
pnpm-workspace.yaml: workspace package boundaries
turbo.json: task graph and env propagation
README.md: product/dev bootstrap overview
eslint.config.mjs: root lint baseline
tsconfig.json: root TS setup
scripts/build-local-sandboxes.sh: builds local sandbox images

Why root-level task discipline matters:

the repo has multiple deployable units
env propagation is explicit
broken build assumptions must fail early

4.2 apps/web

What it owns:

landing page
auth session bootstrapping
chat workspace
file/editor/preview UI
changelog UI
browser-side stream orchestration

Most important files:

app/layout.tsx: global metadata, fonts, providers, navbar shell
app/providers.tsx: React Query, theme, notifications, chat stream provider
app/page.tsx: landing page entry
app/chat/[id]/page.tsx: server-side access probe + metadata generation for chat pages
components/chat/chatPageClient.tsx: client orchestration for a chat route
components/chat/chatWorkspace.tsx: core desktop/mobile workspace composition
stores/chatStream/controller.ts: start/resume/cancel stream orchestration
stores/chatStream/useStartStream.ts: new message stream mutation flow
lib/streaming/processors/chatStreamProcessor.ts: SSE event consumption into UI state
stores/sandbox/*: file/build/terminal UI state
hooks/server-state/*: React Query data fetching hooks
lib/api/*: API client surface

Why this frontend is split between React Query and Zustand:

React Query handles server state: chat history, metadata, active runs, quotas
Zustand handles high-churn UI runtime state: live streaming text, file chunks, open panel state, build errors, terminal lines

That is the right split.

If streaming state lived fully inside React Query, it would be awkward and too mutation-heavy. If server state lived fully inside Zustand, cache invalidation and stale-fetch logic would get worse.

4.3 apps/api

What it owns:

request authentication and validation
chat/run/build endpoints
run orchestration
planning workflow
LLM abstraction
parser and tool event handling
sandbox orchestration
GitHub sync orchestration
preview routing

Composition roots:

server.http.ts: API process bootstrap
queue.worker.ts: worker bootstrap

Important structure:

routes/: HTTP surface
controllers/: transport and response wiring
services/: application + infra orchestration
lib/: adapters/clients/shared helpers
middleware/: auth, rate limit, validation, telemetry
schemas/: Zod request contracts
tests/: mostly API-side tests, mirrored by module

4.4 packages/auth

What it owns:

Better Auth instance
Drizzle/Postgres connection
database schema
basic data access helpers for runs/builds
migrations

Why it is a separate package:

both web and API need auth/schema awareness
keeping schema in API app only would over-couple layers

4.5 packages/shared

What it owns:

stream event contracts
chat UI types
API contract types
model catalog and provider detection
rate-limit scopes/policies
shared parsing helpers

This package is the anti-drift package.

If this package did not exist, frontend/backend stream contracts would break constantly.
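
One way a shared contract package can prevent drift is a discriminated union with an exhaustive handler. The sketch below is illustrative only: the StreamEvent variants and describeEvent are hypothetical and do not reflect the actual event names in packages/shared.

```typescript
// Hypothetical event union; the real contracts live in packages/shared.
type StreamEvent =
  | { type: "text"; content: string }
  | { type: "file_start"; path: string }
  | { type: "done"; runId: string };

// Exhaustive handler: adding a new variant to StreamEvent without handling it
// here becomes a compile-time error, because the value no longer narrows to never.
function describeEvent(event: StreamEvent): string {
  switch (event.type) {
    case "text":
      return `text(${event.content.length} chars)`;
    case "file_start":
      return `file ${event.path}`;
    case "done":
      return `run ${event.runId} finished`;
    default: {
      const unreachable: never = event;
      throw new Error(`Unhandled stream event: ${JSON.stringify(unreachable)}`);
    }
  }
}
```

If both the backend emitter and the frontend processor import the same union, a contract change fails the type check on whichever side was not updated.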

4.6 packages/ui

What it owns:

reusable UI building blocks
shared styling artifacts
navigation, skeletons, toasts, hooks, utilities

Why it exists:

product UI should reuse primitives instead of re-implementing them
keeps apps/web focused on product behavior, not low-level component plumbing

4.7 packages/octokit

What it owns:

GitHub API client creation
repo/branch creation helpers
manifest-based sync

Why separate:

GitHub sync logic is infrastructure-adjacent and reusable
keeping raw Octokit usage isolated reduces API surface spread

5. Data model and why each table exists

The schema in packages/auth/lib/schema.ts is the durable model of the product.

5.1 Auth tables

user
session
account
verification

Why:

Better Auth expects these concepts
GitHub OAuth identity and session state need durable storage
Edward also stores product-specific fields on user, especially apiKey and preferredModel

5.2 Product tables

chat: the top-level project/conversation container
message: user/assistant message history
attachment: images attached to messages
build: build lifecycle records
run: durable execution record for one assistant generation flow
run_event: persisted stream events
run_tool_call: durable tool invocation records

5.3 Why `chat`

chat is not just a thread id. It is the unit of project identity.

It owns:

title/description
SEO fields
GitHub repo binding
custom subdomain

Why not put these elsewhere:

the project is anchored to the conversation
preview and repo sync are project-level concerns

5.4 Why `message`

Messages are user-facing history.

Important detail:

assistant output is persisted as a normal message
run events are not a replacement for message history

Why both message and run_event exist:

message = final conversational artifact
run_event = execution trace / replay log

That distinction is important.

5.5 Why `run`

run exists because one assistant generation is not a simple request-response.

A run has:

queue state
current state machine position
current turn
termination reason
metadata
linkage to user and assistant messages

Why not derive all of this from messages:

messages do not capture execution status, retries, checkpoints, cancellation, or turn-level progress

5.6 Why `run_event`

run_event is the stream replay ledger.

Why this table is critical:

live SSE can reconnect from a sequence number
past run behavior can be debugged
session completion can be inferred from persisted events
worker/API/browser all get a shared truth

5.7 Why `run_tool_call`

This is the durability and idempotency guard for tools.

Why not only emit tool output into run_event:

tools have inputs, outputs, duration, status, and idempotency semantics
tool calls are not just display events; they are execution records

5.8 Why `build`

Preview build lifecycle is separate from generation lifecycle.

That is the correct model because:

generation can succeed while build fails
build status must be independently queryable and streamable
previews need their own duration/error metadata

5.9 Why attachments are separate

Attachments are not embedded directly inside messages because:

metadata is structured
message text and binary/media references are different concerns
image uploads have different constraints and lifecycle

6. The most important end-to-end flow: send a chat message

This is the core product path.

6.1 Browser starts the stream

Frontend entry:

apps/web/stores/chatStream/useStartStream.ts
apps/web/lib/api/chat.ts

The browser:

acquires a submission lock
creates optimistic UI state
calls POST /chat/message
begins consuming SSE frames

Why the submission lock exists:

to prevent accidental duplicate sends
to avoid overlapping UI-side submissions before the server admits a run

6.2 API authenticates and validates

Backend route:

apps/api/routes/chat.routes.ts

Middleware:

auth
rate limit
request validation

Why this layering exists:

fail cheap and early
keep orchestration code free from repeated input checks

6.3 API resolves model and user credentials

In unifiedSendMessage:

user API key is loaded
key is decrypted
provider is inferred
chosen model is checked against provider

Why this validation exists:

a stored Gemini key with an OpenAI model choice is an avoidable operator error
fail-fast here is much cleaner than failing deep in LLM execution
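
The fail-fast check above might look roughly like this. Everything here is an assumption for illustration: inferring the provider from key prefixes, the model ids in the catalog, and the names inferProvider and validateModelChoice are all hypothetical; the real detection logic and model catalog live in packages/shared and may work differently.

```typescript
type Provider = "openai" | "anthropic" | "gemini";

// Assumption: providers are inferred from commonly seen key prefixes.
function inferProvider(apiKey: string): Provider | null {
  if (apiKey.startsWith("sk-ant-")) return "anthropic"; // must be checked before the generic "sk-"
  if (apiKey.startsWith("sk-")) return "openai";
  if (apiKey.startsWith("AIza")) return "gemini";
  return null;
}

// Hypothetical catalog mapping model ids to the provider that serves them.
const MODEL_PROVIDER = new Map<string, Provider>([
  ["example-openai-model", "openai"],
  ["example-anthropic-model", "anthropic"],
  ["example-gemini-model", "gemini"],
]);

type ModelCheck = { ok: true; provider: Provider } | { ok: false; reason: string };

// Reject mismatched key/model pairs before any queueing or LLM call happens.
function validateModelChoice(apiKey: string, model: string): ModelCheck {
  const keyProvider = inferProvider(apiKey);
  if (keyProvider === null) return { ok: false, reason: "unrecognized API key format" };
  const modelProvider = MODEL_PROVIDER.get(model);
  if (modelProvider === undefined) return { ok: false, reason: `unknown model: ${model}` };
  if (modelProvider !== keyProvider) {
    return {
      ok: false,
      reason: `model ${model} requires a ${modelProvider} key, but the stored key is for ${keyProvider}`,
    };
  }
  return { ok: true, provider: keyProvider };
}
```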

6.4 API creates or loads chat + persists the user message

This happens via chat.service.ts.

Why persist before execution:

user intent must be durable even if downstream execution fails
history should not disappear because a worker crashed

6.5 API runs planning workflow

Planning is not the same as generation.

Workflow engine:

services/planning/workflow/engine.ts

Main early phases:

analyze intent
resolve packages
install packages
build/deploy/recover as needed

Why have a workflow at all:

generation quality improves when framework/packages/intent are normalized first
retries and step-level state become explicit
the system can reason in phases instead of one giant black-box prompt

6.6 API creates an admitted durable run

Run admission:

services/runs/runAdmission.service.ts
packages/auth/lib/run.ts

Why run admission matters:

the product must limit active execution globally, per user, and per chat
one chat should not have multiple conflicting active generations
the API must reject overload before enqueuing dangerous work

The use of transactional advisory locks here is a sign of mature concurrency thinking.
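
The admission decision itself can be sketched as a pure function. This is a simplified, in-memory illustration with hypothetical names (ActiveRun, AdmissionLimits, decideAdmission); it deliberately leaves out the part that makes the real implementation safe, namely running the count and the insert atomically inside a Postgres transaction guarded by advisory locks.

```typescript
interface ActiveRun {
  runId: string;
  userId: string;
  chatId: string;
}

interface AdmissionLimits {
  maxGlobal: number;
  maxPerUser: number;
  maxPerChat: number;
}

type AdmissionResult =
  | { admitted: true }
  | { admitted: false; reason: "global_limit" | "user_limit" | "chat_limit" };

// Decide whether a new run may start, given the currently active runs.
// In a real system this check must be atomic with the insert of the new run.
function decideAdmission(
  active: readonly ActiveRun[],
  candidate: { userId: string; chatId: string },
  limits: AdmissionLimits,
): AdmissionResult {
  if (active.length >= limits.maxGlobal) return { admitted: false, reason: "global_limit" };
  const forUser = active.filter((r) => r.userId === candidate.userId).length;
  if (forUser >= limits.maxPerUser) return { admitted: false, reason: "user_limit" };
  const forChat = active.filter((r) => r.chatId === candidate.chatId).length;
  if (forChat >= limits.maxPerChat) return { admitted: false, reason: "chat_limit" };
  return { admitted: true };
}
```

Setting maxPerChat to 1 expresses the rule that one chat should not have multiple conflicting active generations.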

6.7 API enqueues worker job

After the run is admitted, the API enqueues the job in BullMQ.

Why queue after durable DB write:

DB becomes the source of truth
if enqueue fails, the run can be marked failed explicitly
the system is not dependent on HTTP lifetime

6.8 Browser switches from request stream to durable run stream

The API immediately begins streaming persisted/live run events via streamRunEventsFromPersistence.

Why this handoff is elegant:

the frontend can keep one streaming UX
internally, the source is durable run event persistence, not just the original request handler

This is how Edward bridges synchronous UX and asynchronous execution.

7. The second important flow: worker-run execution

Worker entry:

apps/api/services/runs/agent-run-worker/processor.ts

7.1 Worker reloads durable context

The worker fetches:

run record
metadata
user API key
historical conversation context

Why:

worker must be independently restartable
it cannot depend on the request process keeping in-memory state alive

7.2 Worker subscribes to cancellation

It listens on Redis pub/sub channels like edward:run-cancel:<runId>.

Why pub/sub plus DB terminal checks both exist:

pub/sub gives low-latency cancellation
DB polling gives durable truth if a pub/sub signal is missed

This dual mechanism is deliberate defense-in-depth.
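
A minimal sketch of the two-path cancellation idea, assuming a hypothetical CancellationWatch helper: the pub/sub subscriber would call signal() as the fast path, and a periodic database check would call reconcile() with the stored status as the durable path. Neither the class nor the status names are taken from the repo.

```typescript
type RunStatus = "queued" | "running" | "cancelled" | "completed" | "failed";

class CancellationWatch {
  private cancelled = false;

  // Fast path: invoked when a message arrives on the run's cancel channel.
  signal(): void {
    this.cancelled = true;
  }

  // Durable path: invoked with the status read from the database, so a missed
  // pub/sub message is still observed on the next check.
  reconcile(durableStatus: RunStatus): void {
    if (durableStatus === "cancelled") this.cancelled = true;
  }

  // Once cancelled, the watch never flips back.
  get isCancelled(): boolean {
    return this.cancelled;
  }
}
```

The agent loop would then consult isCancelled between turns, regardless of which path set it.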

7.3 Worker marks run running

The worker updates durable run state to running.

Why not mark it when API enqueues:

enqueued is not the same as actively executing
status must reflect reality, not intent

7.4 Worker captures run events through a fake response

This is a subtle but strong pattern.

createRunEventCaptureResponse lets the streaming session code write events as if it were writing to an HTTP response, while the worker intercepts those events and persists/publishes them.

Why this is good:

shared stream-session logic can be reused by API and worker paths
the event producer does not need to know whether the sink is a real socket or a persistence pipeline
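
The pattern can be illustrated with a small sink interface. This sketch is not the implementation of createRunEventCaptureResponse; EventSink, createCaptureSink, and emitGreeting are hypothetical names, and it assumes the producer writes SSE-style frames whose data line holds JSON.

```typescript
// The minimal write-only surface the stream session needs from a "response".
interface EventSink {
  write(chunk: string): boolean;
  end(): void;
}

// A capture sink: looks like a response to the producer, but hands each
// complete frame's JSON payload to a callback (for persistence or publishing).
function createCaptureSink(onEvent: (data: unknown) => void): EventSink {
  let buffer = "";
  return {
    write(chunk: string): boolean {
      buffer += chunk;
      let idx: number;
      // A blank line terminates an SSE frame; partial frames stay buffered.
      while ((idx = buffer.indexOf("\n\n")) !== -1) {
        const frame = buffer.slice(0, idx);
        buffer = buffer.slice(idx + 2);
        for (const line of frame.split("\n")) {
          if (line.startsWith("data: ")) onEvent(JSON.parse(line.slice(6)));
        }
      }
      return true;
    },
    end(): void {
      buffer = "";
    },
  };
}

// A producer that has no idea whether the sink is a socket or a capture pipeline.
function emitGreeting(sink: EventSink): void {
  sink.write(`data: ${JSON.stringify({ type: "text", content: "hello" })}\n\n`);
  sink.end();
}
```

An HTTP response object that exposes write and end can satisfy the same interface, which is what lets one producer serve both paths.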

7.5 Worker finalizes success or failure

Success path:

update terminal run state
clear checkpoint
store duration/latency metadata

Failure path:

persist error and terminal completion events
mark run failed

Why explicit finalize helpers exist:

terminal transitions are high-risk correctness points
centralizing them reduces double-completion bugs
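
A centralized finalize helper usually boils down to an idempotent terminal transition. The following is a sketch under assumptions: RunRecord, the state names, and finalizeRun are hypothetical, and a real version would perform the check and update as a single conditional database write rather than on an in-memory object.

```typescript
type RunState = "queued" | "running" | "completed" | "failed" | "cancelled";
type TerminalState = "completed" | "failed" | "cancelled";

const TERMINAL_STATES: ReadonlySet<RunState> = new Set<RunState>(["completed", "failed", "cancelled"]);

interface RunRecord {
  id: string;
  state: RunState;
  finishedAt?: number;
}

// Apply a terminal transition at most once. Returns true if this call performed
// the transition, false if the run was already terminal (a double completion).
function finalizeRun(run: RunRecord, next: TerminalState, now: number): boolean {
  if (TERMINAL_STATES.has(run.state)) return false;
  run.state = next;
  run.finishedAt = now;
  return true;
}
```

Callers can use the boolean to decide whether to emit the terminal completion event, so it is emitted only by the call that actually won.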

8. The stream runtime and why it is designed this way

Core files:

runStreamSession.orchestrator.ts
agentLoop.runner.ts
events/handler.ts
lib/llm/parser.ts

8.1 Why orchestration is separate from the raw LLM client

LLM API calls are the smallest part of the feature.

Edward also needs:

prompt composition
token budgeting
parser state handling
sandbox side effects
tool execution
validation/autofix/retry
persistence/finalization

That is why the orchestration layer exists above provider.client.ts.

8.2 Why the model outputs tagged markup

Edward instructs the model to output strict Edward tags:

thinking
response
sandbox
file
install
command
web search
done

Why tagged output instead of “just ask for code”:

the product needs machine-readable execution intent
file boundaries must be recoverable
installs and commands must be explicit
partial streaming must still be parseable

This is a classic “LLM as structured protocol emitter” design.

8.3 Why there is a streaming parser state machine

lib/llm/parser.ts is a state machine because streamed output arrives in incomplete chunks.

Why not parse with simple regex over full strings:

chunks can split tags across boundaries
file, install, and sandbox sections can be nested, so the parser must track which section it is currently inside
incomplete output must still be handled safely

This parser is not overengineering. It is required for correctness in streamed generation.
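
To make the chunk-boundary problem concrete, here is a deliberately small streaming tag parser. It is a sketch, not a copy of lib/llm/parser.ts: the tag syntax (lowercase names in angle brackets), the ParseEvent shape, and the 32-character tag limit are all assumptions made for illustration.

```typescript
type ParseEvent =
  | { kind: "open"; tag: string }
  | { kind: "close"; tag: string }
  | { kind: "text"; text: string };

const MAX_TAG_LENGTH = 32; // assumption: anything longer cannot be a tag

class StreamingTagParser {
  private buffer = "";

  // Feed one chunk; returns the events that are now unambiguous.
  push(chunk: string): ParseEvent[] {
    this.buffer += chunk;
    const events: ParseEvent[] = [];
    while (this.buffer.length > 0) {
      const lt = this.buffer.indexOf("<");
      if (lt === -1) {
        events.push({ kind: "text", text: this.buffer });
        this.buffer = "";
        break;
      }
      if (lt > 0) {
        events.push({ kind: "text", text: this.buffer.slice(0, lt) });
        this.buffer = this.buffer.slice(lt);
      }
      const gt = this.buffer.indexOf(">");
      if (gt === -1) {
        // Possibly a tag split across chunks: wait for more input.
        if (this.buffer.length <= MAX_TAG_LENGTH) break;
        events.push({ kind: "text", text: "<" }); // too long to be a tag
        this.buffer = this.buffer.slice(1);
        continue;
      }
      const candidate = this.buffer.slice(0, gt + 1);
      const isClose = candidate.startsWith("</");
      const name = candidate.slice(isClose ? 2 : 1, -1);
      if (/^[a-z_]+$/.test(name)) {
        events.push(isClose ? { kind: "close", tag: name } : { kind: "open", tag: name });
        this.buffer = this.buffer.slice(gt + 1);
      } else {
        events.push({ kind: "text", text: "<" }); // a literal "<", e.g. inside code
        this.buffer = this.buffer.slice(1);
      }
    }
    return events;
  }

  // Flush whatever is left when the stream ends.
  end(): ParseEvent[] {
    const rest = this.buffer;
    this.buffer = "";
    return rest.length > 0 ? [{ kind: "text", text: rest }] : [];
  }
}
```

A regex over the full string would never see the opening tag in a stream chunked as "<resp" followed by "onse>", whereas the buffered parser simply waits until the tag is complete.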

8.4 Why there is an agent loop, not one LLM call

runAgentLoop supports multiple turns.

Why:

the model may need to inspect files, write files, install packages, and run commands before it can continue
tool results need to feed back into later reasoning
retries/continuations need bounded turn accounting

Why hard budgets exist:

prevent runaway loops
bound cost
bound context growth
preserve operational predictability
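
The effect of hard budgets can be shown with a bounded loop. This is a simplified sketch and not runAgentLoop itself: runBoundedLoop, TurnResult, and LoopBudget are hypothetical, and the step function is synchronous for brevity where real model calls would be asynchronous.

```typescript
interface TurnResult {
  done: boolean;      // the model signalled completion
  tokensUsed: number; // tokens consumed by this turn
}

interface LoopBudget {
  maxTurns: number;
  maxTotalTokens: number;
}

type StopReason = "done" | "max_turns" | "token_budget";

// Run turns until the step reports completion or a budget is exhausted.
function runBoundedLoop(
  step: (turn: number) => TurnResult,
  budget: LoopBudget,
): { turns: number; tokens: number; reason: StopReason } {
  let tokens = 0;
  for (let turn = 1; turn <= budget.maxTurns; turn++) {
    const result = step(turn);
    tokens += result.tokensUsed;
    if (result.done) return { turns: turn, tokens, reason: "done" };
    if (tokens >= budget.maxTotalTokens) return { turns: turn, tokens, reason: "token_budget" };
  }
  return { turns: budget.maxTurns, tokens, reason: "max_turns" };
}
```

Returning an explicit stop reason is what lets a run record say why it terminated instead of only that it did.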

8.5 Why token usage is computed before and during execution

Edward computes provider-aware token usage because context exhaustion is one of the most common real failure modes in agent systems.

Why this is necessary:

different providers have different token windows
multimodal content changes token budgeting
reserving output tokens up front prevents the prompt and history from crowding out the response budget
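
Output reservation is simple arithmetic: the input budget is the context window minus the tokens held back for the response. The sketch below uses hypothetical names (ModelLimits, inputBudget, trimHistory) and made-up numbers; it assumes token counts have already been computed by a provider-aware counter.

```typescript
interface ModelLimits {
  contextWindow: number;  // total tokens the model accepts
  reservedOutput: number; // tokens held back for the response
}

// Tokens available for system prompt, history, and the new message.
function inputBudget(limits: ModelLimits): number {
  return Math.max(0, limits.contextWindow - limits.reservedOutput);
}

// Keep the newest history entries that fit after the fixed prompt cost.
function trimHistory<T extends { tokens: number }>(
  history: readonly T[],
  fixedTokens: number,
  limits: ModelLimits,
): T[] {
  const budget = inputBudget(limits) - fixedTokens;
  const kept: T[] = [];
  let used = 0;
  // Walk from newest to oldest so the most recent turns survive.
  for (const message of [...history].reverse()) {
    if (used + message.tokens > budget) break;
    used += message.tokens;
    kept.unshift(message);
  }
  return kept;
}
```

For example, with a 1000-token window and 200 tokens reserved for output, a 100-token fixed prompt leaves 700 tokens for history.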

8.6 Why post-generation validation, autofix, and strict retry exist

Generated code is probabilistic. Production systems must add deterministic safety rails.

Edward uses:

postgen validation
deterministic autofixes
strict retry

Why this layered approach is better than “just regenerate everything”:

deterministic fixes are cheaper and faster
validation localizes problems
retry is only used when the output contract is still violated

This is one of the strongest “productionized AI” patterns in the repo.

9. Planning workflow and why it exists separately from stream execution

Planning modules:

services/planning/schemas.ts
workflow/engine.ts
analyzers/intentAnalyzer.ts
resolvers/dependency.resolver.ts
validators/postgenValidator.ts

9.1 Why planning is a workflow

Because planning has recoverable phases, not just a single pass.

Phases include:

ANALYZE
RESOLVE_PACKAGES
INSTALL_PACKAGES
GENERATE
BUILD
DEPLOY
RECOVER

Why explicit phase modeling matters:

allows retries with context
improves debuggability
lets the system fail in a known stage
reduces the amount of work shoved into one prompt

9.2 Why intent analysis uses the LLM but is schema-constrained

intentAnalyzer.ts asks the model for JSON and validates it with Zod.

Why:

intent classification is a fuzzy problem
but downstream code wants structured outputs

So the design is:

use LLM for ambiguity resolution
use schema validation for control
use fallback logic when classification fails

This is the right balance.
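
The "LLM for ambiguity, schema for control, fallback on failure" design can be sketched without any dependency. Note the substitution: the source says intentAnalyzer.ts validates with Zod, while this illustration uses a hand-written guard to stay self-contained. The Intent fields, the three intent kinds, and the fallback value are assumptions.

```typescript
type IntentKind = "create" | "edit" | "fix";

interface Intent {
  kind: IntentKind;
  framework: string;
}

// Hypothetical default used when the model's answer cannot be trusted.
const FALLBACK_INTENT: Intent = { kind: "create", framework: "unknown" };

// Parse and validate raw model output; never throws, always returns an Intent.
function parseIntent(raw: string): Intent {
  let value: unknown;
  try {
    value = JSON.parse(raw);
  } catch {
    return FALLBACK_INTENT; // not JSON at all
  }
  if (typeof value !== "object" || value === null) return FALLBACK_INTENT;
  const candidate = value as Record<string, unknown>;
  const kind = candidate["kind"];
  const framework = candidate["framework"];
  if ((kind === "create" || kind === "edit" || kind === "fix") && typeof framework === "string") {
    return { kind, framework };
  }
  return FALLBACK_INTENT; // JSON, but not the expected shape
}
```

Downstream code can then rely on the Intent type unconditionally, because every failure mode has already been collapsed into the fallback.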

9.3 Why dependency resolution exists

The model may recommend packages, but the runtime must filter/verify them.

Why:

package names can be wrong
peer conflicts matter
some packages are blocked for sandbox/runtime reasons

This is why the system does not blindly trust model-emitted package lists.
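
A first filtering pass over model-suggested packages might look like the sketch below. It covers only a simplified name-syntax check, de-duplication, and a blocklist; registry lookups and peer-dependency checks are out of scope. The filterPackages function and the example blocklist entry are hypothetical.

```typescript
// Simplified npm package name rule: optional @scope/, lowercase, URL-safe characters.
const NPM_NAME = /^(@[a-z0-9~-][a-z0-9._~-]*\/)?[a-z0-9~-][a-z0-9._~-]*$/;

interface FilterResult {
  accepted: string[];
  rejected: { name: string; reason: "invalid_name" | "blocked" }[];
}

// Split requested packages into accepted and rejected, preserving order.
function filterPackages(requested: readonly string[], blocked: ReadonlySet<string>): FilterResult {
  const result: FilterResult = { accepted: [], rejected: [] };
  const seen = new Set<string>();
  for (const name of requested) {
    if (seen.has(name)) continue; // ignore duplicates silently
    seen.add(name);
    if (!NPM_NAME.test(name)) {
      result.rejected.push({ name, reason: "invalid_name" });
    } else if (blocked.has(name)) {
      result.rejected.push({ name, reason: "blocked" });
    } else {
      result.accepted.push(name);
    }
  }
  return result;
}
```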

10. Sandbox architecture and why it is unusually important here

Key modules:

lifecycle/provisioning.ts
docker.service.ts
write/buffer.ts
write/flush.ts
command.service.ts
builder/unified-build/orchestrator.ts
state.service.ts

10.1 Why sandbox state is in Redis

Sandbox instances are ephemeral runtime resources with TTLs.

Why Redis instead of Postgres here:

sandbox liveness is operational state, not primary product truth
TTL refresh and quick lookup matter
container lifecycle reconciliation is fast-path coordination

10.2 Why sandbox writes are buffered

Edward streams file content incrementally. Writing every tiny chunk directly to disk/container would be noisy and slow.

So the system:

buffers file chunks in Redis
periodically flushes them to the container
uses distributed locks around flush

Why this is smart:

reduces write churn
handles chunked file streaming naturally
coordinates concurrent writes safely
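
The buffer-then-flush idea can be sketched in memory. In this illustration a Map stands in for the Redis buffer and a boolean flag stands in for the distributed lock; FileWriteBuffer and the appendToFile callback are hypothetical names, not the API of write/buffer.ts or write/flush.ts.

```typescript
type AppendToFile = (path: string, content: string) => void;

class FileWriteBuffer {
  private readonly pending = new Map<string, string[]>();
  private readonly appendToFile: AppendToFile;
  private flushing = false; // stand-in for a distributed flush lock

  constructor(appendToFile: AppendToFile) {
    this.appendToFile = appendToFile;
  }

  // Cheap operation called for every streamed chunk.
  append(path: string, chunk: string): void {
    const chunks = this.pending.get(path);
    if (chunks) chunks.push(chunk);
    else this.pending.set(path, [chunk]);
  }

  // Write each file's accumulated chunks in one call; returns files flushed.
  flush(): number {
    if (this.flushing) return 0; // another flush holds the "lock"
    this.flushing = true;
    try {
      const entries = Array.from(this.pending.entries());
      this.pending.clear();
      for (const [path, chunks] of entries) {
        this.appendToFile(path, chunks.join(""));
      }
      return entries.length;
    } finally {
      this.flushing = false;
    }
  }
}
```

Many small append calls collapse into one write per file per flush, which is where the reduction in write churn comes from.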

10.3 Why protected framework files exist

Template registry marks files like package.json, tsconfig, framework configs, and core CSS files as protected.

Why:

models are much less reliable when editing sensitive build/config files
most user value is in app code, not infra/config drift
protecting these files preserves build stability

This is a deliberate product safety rail, not an accidental limitation.

10.4 Why command execution is allowlisted

command.service.ts validates:

command name
argument count/length
path safety
dangerous patterns

Why:

the sandbox is isolated, but still not trusted blindly
guardrails reduce accidental destructive behavior
product behavior becomes auditable and predictable
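
The four checks listed above can be sketched as one validation function. The allowlist contents, the numeric limits, and the metacharacter pattern are assumptions chosen for illustration; command.service.ts may use different rules.

```typescript
// Hypothetical allowlist and limits.
const ALLOWED_COMMANDS: ReadonlySet<string> = new Set(["ls", "cat", "node", "npm", "pnpm"]);
const MAX_ARGS = 16;
const MAX_ARG_LENGTH = 256;
const SHELL_METACHARACTERS = /[;&|`$<>\\]/;

type CommandCheck = { ok: true } | { ok: false; reason: string };

// Validate a command name and its arguments before anything is executed.
function validateCommand(name: string, args: readonly string[]): CommandCheck {
  if (!ALLOWED_COMMANDS.has(name)) return { ok: false, reason: `command not allowed: ${name}` };
  if (args.length > MAX_ARGS) return { ok: false, reason: "too many arguments" };
  for (const arg of args) {
    if (arg.length > MAX_ARG_LENGTH) return { ok: false, reason: "argument too long" };
    if (SHELL_METACHARACTERS.test(arg)) return { ok: false, reason: "dangerous pattern in argument" };
    // Path safety: no absolute paths and no parent-directory traversal.
    if (arg.startsWith("/") || arg.split("/").includes("..")) {
      return { ok: false, reason: "unsafe path in argument" };
    }
  }
  return { ok: true };
}
```

Returning a reason string rather than a bare boolean keeps rejections auditable, which matches the stated goal of predictable product behavior.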

10.5 Why builds happen after generation in a unified build orchestrator

buildAndUploadUnified handles:

dependency presence checks
framework detection
merge/install dependency logic
build execution
preview upload
cache invalidation

Why not just run “npm run build”:

different frameworks output differently
preview hosting needs path/base handling
dependencies may need reconciliation
upload and routing are part of the build product

11. Preview and deployment architecture

11.1 Path mode vs subdomain mode

Configured via EDWARD_DEPLOYMENT_TYPE.

Why two modes:

local/self-hosted environments often want simple path-based previews
production environments may want nicer subdomain-based previews

This avoids forcing one infrastructure assumption everywhere.

11.2 Why preview routing uses KV

Subdomain routing needs a fast edge lookup from subdomain -> storage prefix.

Cloudflare KV is a practical fit because:

low-latency reads at edge
simple key-value mapping
decoupled from the main DB

Why not use Postgres for request-time routing:

worse latency profile for edge routing
unnecessary coupling between preview serving and primary transactional DB

11.3 Why preview URL is also stored on build records

Because the user cares about “what is the latest preview for this chat/build”.

Persisting preview URLs on builds means:

API can answer build status quickly
UI can bootstrap preview state without recomputing routing every time

12. GitHub integration architecture

Important files:

packages/octokit/index.ts
apps/api/services/github/sync.service.ts
apps/api/services/github/token.service.ts
apps/api/services/github/repoBinding.service.ts

12.1 Why repo binding is a first-class concept

Chats/projects can be linked to repos.

Why bind at chat level:

a chat represents a project
repo sync is a project concern, not a user-global concern

12.2 Why GitHub token handling is wrapped

token.service.ts decrypts and migrates token storage.

Why centralize this:

auth provider data should not be parsed ad hoc everywhere
token encryption migration needs one place

12.3 Why sync uses a manifest

packages/octokit/index.ts uses .edward-sync-manifest.json.

Why:

lets Edward track which files it manages
supports deletion of files removed locally
avoids blind destructive sync over unknown repo content

This is a very practical design choice.
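
The role of the manifest can be shown with a small planning function. This sketch assumes the manifest is simply a list of file paths that Edward wrote during the previous sync; the real format of .edward-sync-manifest.json is not described here, and planSync is a hypothetical name.

```typescript
interface SyncPlan {
  upsert: string[]; // files to create or update in the repo
  remove: string[]; // previously managed files that no longer exist locally
}

// Compare the previous manifest with the current project files.
// Files that were never in the manifest are never scheduled for deletion.
function planSync(previousManifest: readonly string[], currentFiles: readonly string[]): SyncPlan {
  const current = new Set(currentFiles);
  const remove = previousManifest.filter((path) => !current.has(path)).sort();
  const upsert = Array.from(current).sort();
  return { upsert, remove };
}
```

A README.md that a user added to the repo by hand never appears in remove, because deletion candidates are drawn only from the previous manifest.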

13. Frontend state architecture

13.1 Server state

Handled mainly with React Query:

chat history
metadata
quotas
active run lookup
GitHub status

Why:

cache lifecycle
stale time
refetch policies
request deduplication

13.2 Stream state

Handled with Zustand chat stream store.

Why:

append-heavy mutable event streams
low friction updates per chunk/frame
simpler than putting stream mutation logic into React component trees

13.3 Sandbox UI state

Handled with Zustand sandbox slices:

files
editor selection
preview URL
build status/errors
terminal output
open/close state

Why separate from chat stream state:

stream state represents live assistant output
sandbox state represents persistent project workspace UI

These are related but not identical concerns.

13.4 Why the chat route does server-side access probing

apps/web/app/chat/[id]/page.tsx checks access and metadata server-side.

Why:

avoids client-only “flash then deny”
supports route metadata generation
improves correctness for private chat pages

14. Security posture and why these controls exist

Important controls:

auth middleware for all protected API routes
rate limits backed by Redis
encrypted API keys
command allowlists
protected template files
Docker isolation
CSP/helmet/cors on API
request ids and security telemetry

Why the repo has many small security modules instead of one giant security file:

security concerns happen at different layers
auth, rate limit, encryption, runtime isolation, and telemetry are separate controls

This is the correct decomposition.

15. Reliability posture and why the repo feels “production-minded”

Signals of maturity:

graceful shutdown in API and worker
durable run events
queue-based long-running execution
resumable streams using Last-Event-ID
cancellation via pub/sub plus durable verification
DB-backed admission control
checkpointing of agent loop state
explicit terminal finalization logic
post-generation validators
quality gate scripts

These are not “extra code”. They are the difference between demo code and production-oriented code.

16. Testing and quality gates

Current rough shape:

API tests: about 91 files
Web tests: light, about 3 files
Shared package tests: light but present

Why API tests dominate:

most complexity and failure modes live in orchestration, streaming, sandboxing, and workers
UI is large, but much of it is composition/presentation on top of backend contracts

Quality scripts in apps/api/scripts/quality enforce:

architecture boundaries
duplication checks
coverage checks
function-length checks
file audit generation

Why these custom scripts exist:

generic linting does not enforce architecture well enough
this codebase has non-trivial layering rules

17. Key things a junior engineer must understand before making changes

17.1 Never confuse message history with run execution history

message is the conversation artifact
run_event is the execution/replay log

17.2 Never treat Redis state as the source of truth for business data

Redis is coordination state. Postgres is durable business truth.
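
Cancellation is a good illustration of this rule. In the sketch below, a Redis signal only decides when to look; the durable status decides what to do. The status values, the `pollDue` flag, and the injected `LoadRunStatus` function are assumptions standing in for the real Postgres read.

```typescript
// Hypothetical status values; the real run state machine may have more states.
type RunStatus = "queued" | "running" | "cancelled" | "completed" | "failed";

// Stand-in for a Postgres read, injected so the sketch stays self-contained.
type LoadRunStatus = (runId: string) => RunStatus;

function shouldStopRun(
  runId: string,
  cancelSignalSeen: boolean, // a pub/sub message arrived (fast, but can be missed)
  pollDue: boolean, // the periodic durable check is due (slow, but reliable)
  loadRunStatus: LoadRunStatus,
): boolean {
  // Only read durable state when there is a reason to: a signal or a due poll.
  if (!cancelSignalSeen && !pollDue) return false;
  // The durable record decides, never the Redis signal by itself.
  const status = loadRunStatus(runId);
  return status === "cancelled" || status === "completed" || status === "failed";
}
```

A missed pub/sub message delays the stop until the next poll; a spurious one costs a single extra read. Neither can corrupt the run's recorded state.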

17.3 Never assume the browser connection is the lifetime of the work

The worker owns durable execution. The browser only observes it.

17.4 Never casually edit protected sandbox template files or remove guardrails

Those protections exist because AI-generated config churn destroys stability.

17.5 Never bypass run admission and queueing for “quick fixes”

That breaks fairness, concurrency guarantees, and operational predictability.
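
The admission decision itself is simple to state. The sketch below is an in-memory illustration with hypothetical limits and names (`admitRun`, `Limits`); it is not the repo's implementation, which has to make the same decision safely under concurrency in the database.

```typescript
// Hypothetical shapes and limits, for illustration only.
type ActiveRun = { userId: string; chatId: string };
type Limits = { global: number; perUser: number; perChat: number };
type Admission =
  | { admitted: true }
  | { admitted: false; reason: "global" | "user" | "chat" };

function admitRun(active: ActiveRun[], candidate: ActiveRun, limits: Limits): Admission {
  // Global cap: shed load before anything else.
  if (active.length >= limits.global) return { admitted: false, reason: "global" };
  // Per-user cap: one user cannot starve the others.
  const forUser = active.filter((run) => run.userId === candidate.userId).length;
  if (forUser >= limits.perUser) return { admitted: false, reason: "user" };
  // Per-chat cap: avoid conflicting generations in the same project.
  const forChat = active.filter((run) => run.chatId === candidate.chatId).length;
  if (forChat >= limits.perChat) return { admitted: false, reason: "chat" };
  return { admitted: true };
}
```

A "quick fix" that enqueues work without passing through a check like this skips all three caps at once.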

17.6 Never add a new stream event without updating both sides

If you change stream contracts, review:

backend emitters
persistence/replay
frontend stream processor
shared type contracts
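
One way to make a missed update fail at compile time is an exhaustive switch over a discriminated union. The event names below are hypothetical; the real contracts live in packages/shared.

```typescript
// Hypothetical event variants, for illustration only.
type StreamEvent =
  | { type: "text"; content: string }
  | { type: "file_start"; path: string }
  | { type: "done" };

// Compile-time guard: if a new variant is added to StreamEvent and not handled
// in the switch below, the call to assertNever no longer type-checks.
function assertNever(value: never): never {
  throw new Error(`Unhandled stream event: ${JSON.stringify(value)}`);
}

function describeEvent(event: StreamEvent): string {
  switch (event.type) {
    case "text":
      return `text(${event.content.length} chars)`;
    case "file_start":
      return `file ${event.path}`;
    case "done":
      return "done";
    default:
      return assertNever(event);
  }
}
```

With this pattern in both the backend emitters and the frontend processor, adding a variant to the shared type produces a type error in every consumer that has not been updated.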

18. How I would explain the main architectural “WHY” in one paragraph

Edward is built as a durable, queue-backed, sandboxed AI execution platform rather than a thin chat wrapper because real code generation is slow, stateful, failure-prone, and operationally dangerous. The architecture separates durable truth (Postgres), fast coordination (Redis), long-running work (BullMQ worker), isolated execution (Docker sandboxes), and progressive UX (SSE + persisted run events). The frontend is split between React Query for server truth and Zustand for live streaming/UI runtime state. The repo uses shared packages to keep contracts aligned and validators/guardrails to turn probabilistic model output into something closer to a deterministic product.

19. File map by major area

This section is the “how do I navigate the repo quickly” map.

19.1 Root and workspace

README.md: product + local setup overview
package.json: top-level scripts
turbo.json: build graph and env config
pnpm-workspace.yaml: workspace package boundaries
scripts/build-local-sandboxes.sh: local sandbox image prep

19.2 API composition and delivery

apps/api/server.http.ts: API bootstrap and shutdown
apps/api/queue.worker.ts: worker bootstrap and background loops
apps/api/server/http/app.factory.ts: Express app assembly
apps/api/routes/*.ts: route wiring
apps/api/controllers/chat/query/*.ts: read/query/build/run delivery controllers
apps/api/middleware/*.ts: auth, rate limiting, validation, telemetry

19.3 Runs and execution

apps/api/services/runs/messageOrchestrator.service.ts: main send-message entry
apps/api/services/runs/runAdmission.service.ts: load shedding and admission control
apps/api/services/runs/runMetadata.ts: durable metadata/checkpoint schema
apps/api/services/runs/runEvents.service.ts: publish/persist run events
apps/api/services/runs/agent-run-worker/*: worker execution engine
apps/api/services/run-event-stream-utils/service.ts: replay + live SSE bridge

19.4 Chat session runtime

apps/api/services/chat/session/orchestrator/runStreamSession.orchestrator.ts
apps/api/services/chat/session/loop/agentLoop.runner.ts
apps/api/services/chat/session/events/handler.ts
apps/api/services/chat/session/orchestrator/*
apps/api/services/chat/session/loop/*

19.5 Planning

apps/api/services/planning/schemas.ts
apps/api/services/planning/analyzers/intentAnalyzer.ts
apps/api/services/planning/resolvers/dependency.resolver.ts
apps/api/services/planning/validators/*
apps/api/services/planning/workflow/*

19.6 LLM abstraction

apps/api/lib/llm/provider.client.ts: provider-specific generation/streaming
apps/api/lib/llm/provider.helpers.ts: normalization and provider/model checks
apps/api/lib/llm/compose.ts: prompt assembly
apps/api/lib/llm/prompts/sections.ts: main system prompt contract
apps/api/lib/llm/parser*.ts: streaming parser
apps/api/lib/llm/tokens*.ts: token budgeting

19.7 Sandbox

apps/api/services/sandbox/lifecycle/provisioning.ts
apps/api/services/sandbox/lifecycle/cleanup.ts
apps/api/services/sandbox/docker.service.ts
apps/api/services/sandbox/state.service.ts
apps/api/services/sandbox/command.service.ts
apps/api/services/sandbox/write/*
apps/api/services/sandbox/read/*
apps/api/services/sandbox/builder/*
apps/api/services/sandbox/templates/*

19.8 Preview/storage/routing

apps/api/services/storage.service.ts
apps/api/services/storage/*
apps/api/services/preview.service.ts
apps/api/services/previewRouting/*

19.9 GitHub

apps/api/services/github/github.useCase.ts
apps/api/services/github/sync.service.ts
apps/api/services/github/token.service.ts
apps/api/services/github/repoBinding.service.ts
packages/octokit/index.ts

19.10 Frontend routing and app shell

apps/web/app/layout.tsx
apps/web/app/providers.tsx
apps/web/app/page.tsx
apps/web/app/chat/[id]/page.tsx
apps/web/app/changelog/page.tsx
apps/web/app/api/auth/[...all]/route.ts

19.11 Frontend chat workspace

apps/web/components/chat/chatPageClient.tsx
apps/web/components/chat/chatWorkspace.tsx
apps/web/components/chat/chatWorkspaceDesktop.tsx
apps/web/components/chat/chatWorkspaceMobile.tsx
apps/web/components/chat/messages/*
apps/web/components/chat/sandbox/*
apps/web/stores/chatStream/*
apps/web/stores/sandbox/*
apps/web/lib/streaming/processors/chatStreamProcessor.ts
apps/web/lib/parsing/*
apps/web/lib/api/*

19.12 Shared packages

packages/auth/lib/schema.ts
packages/auth/lib/auth.ts
packages/auth/lib/db.ts
packages/auth/lib/run.ts
packages/auth/lib/build.ts
packages/shared/src/constants.ts
packages/shared/src/schema.ts
packages/shared/src/streamEvents.ts
packages/shared/src/chat/types.ts
packages/shared/src/chat/streamActions.ts
packages/shared/src/api/contracts.ts
packages/ui/src/components/*

20. Directory-level inventory

This is not a prose description of every leaf UI component, because that would bury the actual KT. Instead, use this as the completeness map for where code lives.

20.1 `apps/api`

Major subareas:

controllers/
routes/
middleware/
lib/
services/
schemas/
utils/
tests/

High-density domains:

services/chat
services/runs
services/sandbox
services/planning
services/github
services/queue

20.2 `apps/web`

Major subareas:

app/
components/chat/
components/home/
components/changelog/
hooks/
lib/
stores/chatStream/
stores/sandbox/

High-density domains:

chat UI and workspace
SSE parsing and stream orchestration
sandbox/editor/preview UI

20.3 `packages/auth`

Major subareas:

lib/: auth, db, schema, build/run helpers
drizzle/: SQL migrations

20.4 `packages/shared`

Major subareas:

src/constants.ts
src/schema.ts
src/streamEvents.ts
src/chat/*
src/github/*
src/api/*
src/llm/*

20.5 `packages/ui`

Major subareas:

src/components/: reusable primitives
src/hooks/
src/lib/
src/styles/

20.6 `docker/templates`

Major subareas:

nextjs
vite-react
vanilla
base

These templates define the scaffold/runtime assumptions for generated projects and sandbox images.

21. Practical onboarding order for a new engineer

If I were onboarding someone senior-but-new, I would ask them to read in this exact order:

Root README.md
apps/api/README.md
packages/auth/lib/schema.ts
apps/api/server.http.ts
apps/api/server/http/app.factory.ts
apps/api/routes/chat.routes.ts
apps/api/services/runs/messageOrchestrator.service.ts
apps/api/services/runs/runAdmission.service.ts
apps/api/services/runs/agent-run-worker/processor.ts
apps/api/services/chat/session/orchestrator/runStreamSession.orchestrator.ts
apps/api/services/chat/session/loop/agentLoop.runner.ts
apps/api/services/chat/session/events/handler.ts
apps/api/services/sandbox/lifecycle/provisioning.ts
apps/api/services/sandbox/builder/unified-build/orchestrator.ts
apps/web/app/chat/[id]/page.tsx
apps/web/components/chat/chatPageClient.tsx
apps/web/stores/chatStream/controller.ts
apps/web/lib/streaming/processors/chatStreamProcessor.ts

If they understand those files, they understand the heart of Edward.

22. Final summary

Edward is a monorepo for a durable AI code-generation product, not a thin LLM wrapper. The architecture optimizes for correctness, replayability, operator safety, and product durability:

Postgres keeps durable truth.
Redis handles fast coordination.
BullMQ decouples request acceptance from long-running execution.
Docker sandboxes isolate generated code.
S3, CloudFront, and optional Cloudflare KV separate preview hosting from execution.
SSE plus persisted run events make streaming resumable.
Shared packages prevent contract drift.
Planning, validation, autofix, and retry layers turn model output into something operationally usable.

That is the core “why” behind almost every serious architectural decision in this repo.

Edward KT

1. What Edward actually is

Edward is an AI-assisted frontend application builder.

At the product level, the user experience is:s

A user signs in with GitHub.
The user stores their own LLM API key inside the product.
The user asks Edward to create, edit, or fix a web app in chat.
Edward plans the work, streams progress, writes files into an isolated sandbox, installs dependencies, runs commands, builds the app, uploads preview assets, and optionally syncs the code to GitHub.
The user sees the generated app, file tree, terminal/build feedback, and live preview.

The system is intentionally split across:

apps/web: the product UI and auth surface
apps/api: the synchronous HTTP API, SSE delivery, orchestration, and infrastructure adapters
apps/api/queue.worker.ts: the asynchronous worker process for long-running build/run jobs
packages/auth: auth + database schema + database access
packages/shared: shared contracts, enums, stream event types, chat types, model metadata
packages/ui: shared UI primitives for the frontend
packages/octokit: shared GitHub sync client helpers

This split is not accidental. Edward is not just “a chat app”. It is a distributed system with:

durable state
transient coordination state
long-running execution
resumable streaming
isolated code execution
artifact publishing
external system integration

That is the fundamental lens juniors should use when reading this repo.

2. The highest-level architectural decisions and why they exist

2.1 Why a monorepo

Edward uses a pnpm + Turbo monorepo because the frontend, API, worker, DB schema, shared contracts, and UI primitives evolve together.

Why this was chosen instead of separate repos:

Shared contracts are first-class. Stream events, chat types, API response shapes, model catalogs, and rate-limit scopes must not drift.
Frontend and backend changes usually land together. A single PR often changes stream event shape, API behavior, and UI rendering.
Tooling is simpler. One workspace gives consistent TypeScript, ESLint, and build orchestration.
Local development is more realistic. pnpm dev starts the whole system shape, not disconnected fragments.

Tradeoff:

Monorepos become large and noisy.
Build graph discipline is required.

This repo counters that with:

separate packages for shared concerns
Turbo task boundaries
architecture boundary checks in apps/api/scripts/quality

2.2 Why Next.js on the frontend

apps/web is a Next.js 16 App Router app.

Why Next.js instead of plain Vite for the product UI:

server-rendered marketing/auth/changelog pages are easier
metadata, sitemap, robots, SEO handling are built in
auth route handling integrates cleanly with Better Auth
product pages and landing pages can live in the same app

Important nuance:

Edward generates Vite/Next/Vanilla projects for users, but the Edward product itself uses Next.js. Those are different concerns.

2.3 Why Express for the API instead of putting everything inside Next route handlers

The repo deliberately keeps orchestration in apps/api as a separate Express app.

Why:

stream-oriented chat delivery is easier to reason about in a dedicated API
queue worker and API share backend services cleanly
long-running orchestration should not be tightly coupled to the frontend deployment shape
HTTP API can scale independently from the web UI
operational concerns like graceful shutdown, Redis connections, worker coordination, and SSE backpressure are clearer in a dedicated server

Why not “just use Next API routes”:

the runtime model becomes mixed and harder to reason about
long-running SSE and worker orchestration become more awkward
explicit process separation is valuable here

2.4 Why a dedicated worker process

The API process accepts requests. The worker process executes long-running jobs.

Why this is the correct split:

user HTTP connections are short-lived and unreliable
LLM execution, sandbox interaction, dependency install, build, and artifact publishing can outlive the browser connection
workers give retries, concurrency control, and isolation from request latency
the system can continue work even if the client disconnects

This is one of the most important architectural choices in the whole repo.

Without it, Edward would behave like a fragile synchronous demo.

With it, Edward behaves like a real product with durable execution.

2.5 Why Postgres

Postgres is the durable source of truth.

It stores:

users, sessions, accounts
chats and messages
runs and run events
builds
attachments
repo bindings and related product state

Why Postgres instead of Redis-only or document storage:

chat history and run history are durable product data, not cache
run events need ordering and replay
auth tables fit naturally into a relational model
ownership and filtering by user/chat/run are frequent and relational
transactional admission control is easier and safer

The strongest proof is createRunWithUserLimit in packages/auth/lib/run.ts. It uses DB transactions and advisory locks for safe concurrent run admission.

2.6 Why Redis

Redis is used everywhere, but always for fast, ephemeral, coordination-heavy workloads.

Redis responsibilities:

BullMQ queue backend
distributed locks
sandbox state and TTLs
write buffers for streamed file output
pub/sub for run cancellation and live status fanout
request rate-limit stores

Why Redis instead of doing all of this in Postgres:

queues and pub/sub need low-latency operational semantics
lock contention and TTL semantics are simpler in Redis
sandbox write buffering is a classic ephemeral buffering problem
rate limiting is a cache-like, time-windowed workload

This is a good architecture split:

Postgres = durable truth
Redis = fast coordination and ephemeral runtime state

2.7 Why BullMQ

BullMQ is the job orchestration layer on top of Redis.

Why BullMQ:

already fits the Redis runtime Edward needs
supports worker concurrency and queue separation
familiar operational model for TypeScript/Node systems
good fit for “enqueue durable work from API, execute elsewhere”

Why not build a custom queue:

queue correctness is hard
retries, visibility, backpressure, and operational observability are all non-trivial

Edward keeps two main job categories:

build jobs
agent run jobs

That separation matters because build workloads and agent-stream workloads have different runtime behavior and failure modes.

2.8 Why Docker-backed sandboxes

This is the most product-defining infrastructure choice.

Edward writes and executes generated code in Docker containers.

Why:

isolation from the host machine
deterministic environment for install/build/command execution
safer command execution boundary
framework-specific template images can be prewarmed
easier cleanup and lifecycle control

Why not run code directly on the API machine filesystem:

far less safe
dependency conflicts become unmanageable
cleanup is harder
one broken project can poison the environment for others

The sandbox is not just an implementation detail. It is the product’s execution boundary.

2.9 Why S3 + CloudFront + optional Cloudflare KV

Edward separates “code generation” from “preview hosting”.

S3 stores preview artifacts
CloudFront serves previews and can be invalidated
Cloudflare KV optionally maps subdomains to preview storage prefixes

Why this is better than serving builds directly from live containers:

static previews are cheaper and more stable
finished artifacts survive sandbox lifecycle cleanup
containers can be ephemeral while previews remain available
preview serving and code execution are decoupled

This is a strong architecture choice because it turns preview hosting into a static asset problem instead of a container serving problem.

2.10 Why GitHub OAuth + GitHub sync

GitHub is used both for user authentication and for repository integration.

Why GitHub auth:

Edward’s target users already live in GitHub-centric workflows
repo sync is a core feature, so GitHub identity is a natural anchor
one provider reduces auth complexity

Why sync through GitHub APIs instead of shelling out to git in the sandbox:

less credential management complexity inside containers
direct tree/blob APIs are deterministic and auditable
easier to manage partial file sync and manifests

2.11 Why bring-your-own API keys

Users store their own OpenAI/Gemini/Anthropic API keys.

Why:

cost ownership stays with the user
provider choice stays with the user
product does not need to centralize all model billing risk
enterprise-style customers often prefer controlling their own provider credentials

Why encrypted at rest:

these are real credentials, not preferences
API keys must not sit in plaintext in Postgres

This is why apps/api/utils/encryption.ts and apps/api/utils/secretEnvelope.ts exist.

2.12 Why SSE instead of WebSockets for streaming

Edward streams progress to the browser using Server-Sent Events.

Why SSE:

server-to-client streaming is the actual need
simpler than full bidirectional socket infra
reconnect semantics are straightforward
Last-Event-ID replay model fits persisted run events well

Why not WebSockets:

more operational complexity
bidirectional realtime control is not the dominant requirement
durability + replay matter more than low-level socket interactivity

2.13 Why persist stream events

This is one of the smartest choices in the repo.

Edward does not treat stream events as disposable transport. It stores them in run_event.

Why:

browser reconnects can resume streams
active run pages can rehydrate state after refresh
debugging becomes far easier
a run has a durable audit trail
the stream is no longer coupled to one live TCP connection

This is the reason the system feels durable instead of “best effort”.

3. Runtime topology

At runtime, think in terms of five planes.

3.1 Presentation plane

apps/web
Next.js pages, route handlers, React state, chat workspace, sandbox UI

3.2 Delivery/API plane

apps/api/server.http.ts
Express routes/controllers/middleware
SSE response management

3.3 Execution plane

apps/api/queue.worker.ts
BullMQ workers
agent runs, builds, backups

3.4 Durable data plane

Postgres via Drizzle in packages/auth

3.5 Ephemeral coordination plane

Redis
locks
queue backend
pub/sub
sandbox state
rate limiting

The system is healthy when these planes are kept conceptually separate.

4. Repo walkthrough

4.1 Root

Important root files:

package.json: workspace scripts and security overrides
pnpm-workspace.yaml: workspace package boundaries
turbo.json: task graph and env propagation
README.md: product/dev bootstrap overview
eslint.config.mjs: root lint baseline
tsconfig.json: root TS setup
scripts/build-local-sandboxes.sh: builds local sandbox images

Why root-level task discipline matters:

the repo has multiple deployable units
env propagation is explicit
broken build assumptions must fail early

4.2 apps/web

What it owns:

landing page
auth session bootstrapping
chat workspace
file/editor/preview UI
changelog UI
browser-side stream orchestration

Most important files:

app/layout.tsx: global metadata, fonts, providers, navbar shell
app/providers.tsx: React Query, theme, notifications, chat stream provider
app/page.tsx: landing page entry
app/chat/[id]/page.tsx: server-side access probe + metadata generation for chat pages
components/chat/chatPageClient.tsx: client orchestration for a chat route
components/chat/chatWorkspace.tsx: core desktop/mobile workspace composition
stores/chatStream/controller.ts: start/resume/cancel stream orchestration
stores/chatStream/useStartStream.ts: new message stream mutation flow
lib/streaming/processors/chatStreamProcessor.ts: SSE event consumption into UI state
stores/sandbox/*: file/build/terminal UI state
hooks/server-state/*: React Query data fetching hooks
lib/api/*: API client surface

Why this frontend is split between React Query and Zustand:

React Query handles server state: chat history, metadata, active runs, quotas
Zustand handles high-churn UI runtime state: live streaming text, file chunks, open panel state, build errors, terminal lines

That is the right split.

If streaming state lived fully inside React Query, it would be awkward and too mutation-heavy. If server state lived fully inside Zustand, cache invalidation and stale-fetch logic would get worse.

4.3 apps/api

What it owns:

request authentication and validation
chat/run/build endpoints
run orchestration
planning workflow
LLM abstraction
parser and tool event handling
sandbox orchestration
GitHub sync orchestration
preview routing

Composition roots:

server.http.ts: API process bootstrap
queue.worker.ts: worker bootstrap

Important structure:

routes/: HTTP surface
controllers/: transport and response wiring
services/: application + infra orchestration
lib/: adapters/clients/shared helpers
middleware/: auth, rate limit, validation, telemetry
schemas/: Zod request contracts
tests/: mostly API-side tests, mirrored by module

4.4 packages/auth

What it owns:

Better Auth instance
Drizzle/Postgres connection
database schema
basic data access helpers for runs/builds
migrations

Why it is a separate package:

both web and API need auth/schema awareness
keeping schema in API app only would over-couple layers

4.5 packages/shared

What it owns:

stream event contracts
chat UI types
API contract types
model catalog and provider detection
rate-limit scopes/policies
shared parsing helpers

This package is the anti-drift package.

If this package did not exist, frontend/backend stream contracts would break constantly.

4.6 packages/ui

What it owns:

reusable UI building blocks
shared styling artifacts
navigation, skeletons, toasts, hooks, utilities

Why it exists:

product UI should reuse primitives instead of re-implementing them
keeps apps/web focused on product behavior, not low-level component plumbing

4.7 packages/octokit

What it owns:

GitHub API client creation
repo/branch creation helpers
manifest-based sync

Why separate:

GitHub sync logic is infrastructure-adjacent and reusable
keeping raw Octokit usage isolated reduces API surface spread

5. Data model and why each table exists

The schema in packages/auth/lib/schema.ts is the durable model of the product.

5.1 Auth tables

user
session
account
verification

Why:

Better Auth expects these concepts
GitHub OAuth identity and session state need durable storage
Edward also stores product-specific fields on user, especially apiKey and preferredModel

5.2 Product tables

chat: the top-level project/conversation container
message: user/assistant message history
attachment: images attached to messages
build: build lifecycle records
run: durable execution record for one assistant generation flow
run_event: persisted stream events
run_tool_call: durable tool invocation records

5.3 Why `chat`

chat is not just a thread id. It is the unit of project identity.

It owns:

title/description
SEO fields
GitHub repo binding
custom subdomain

Why not put these elsewhere:

the project is anchored to the conversation
preview and repo sync are project-level concerns

5.4 Why `message`

Messages are user-facing history.

Important detail:

assistant output is persisted as a normal message
run events are not a replacement for message history

Why both message and run_event exist:

message = final conversational artifact
run_event = execution trace / replay log

That distinction is important.

5.5 Why `run`

run exists because one assistant generation is not a simple request-response.

A run has:

queue state
current state machine position
current turn
termination reason
metadata
linkage to user and assistant messages

Why not derive all of this from messages:

messages do not capture execution status, retries, checkpoints, cancellation, or turn-level progress

5.6 Why `run_event`

run_event is the stream replay ledger.

Why this table is critical:

live SSE can reconnect from a sequence number
past run behavior can be debugged
session completion can be inferred from persisted events
worker/API/browser all get a shared truth

5.7 Why `run_tool_call`

This is the durability and idempotency guard for tools.

Why not only emit tool output into run_event:

tools have inputs, outputs, duration, status, and idempotency semantics
tool calls are not just display events; they are execution records

5.8 Why `build`

Preview build lifecycle is separate from generation lifecycle.

That is the correct model because:

generation can succeed while build fails
build status must be independently queryable and streamable
previews need their own duration/error metadata

5.9 Why attachments are separate

Attachments are not embedded directly inside messages because:

metadata is structured
message text and binary/media references are different concerns
image uploads have different constraints and lifecycle

6. The most important end-to-end flow: send a chat message

This is the core product path.

6.1 Browser starts the stream

Frontend entry:

apps/web/stores/chatStream/useStartStream.ts
apps/web/lib/api/chat.ts

The browser:

acquires a submission lock
creates optimistic UI state
calls POST /chat/message
begins consuming SSE frames

Why the submission lock exists:

to prevent accidental duplicate sends
to avoid overlapping UI-side submissions before the server admits a run

6.2 API authenticates and validates

Backend route:

apps/api/routes/chat.routes.ts

Middleware:

auth
rate limit
request validation

Why this layering exists:

fail cheap and early
keep orchestration code free from repeated input checks

6.3 API resolves model and user credentials

In unifiedSendMessage:

user API key is loaded
key is decrypted
provider is inferred
chosen model is checked against provider

Why this validation exists:

a stored Gemini key with an OpenAI model choice is an avoidable operator error
fail-fast here is much cleaner than failing deep in LLM execution

6.4 API creates or loads chat + persists the user message

This happens via chat.service.ts.

Why persist before execution:

user intent must be durable even if downstream execution fails
history should not disappear because a worker crashed

6.5 API runs planning workflow

Planning is not the same as generation.

Workflow engine:

services/planning/workflow/engine.ts

Main early phases:

analyze intent
resolve packages
install packages
build/deploy/recover as needed

Why have a workflow at all:

generation quality improves when framework/packages/intent are normalized first
retries and step-level state become explicit
the system can reason in phases instead of one giant black-box prompt

6.6 API creates an admitted durable run

Run admission:

services/runs/runAdmission.service.ts
packages/auth/lib/run.ts

Why run admission matters:

the product must limit active execution globally, per user, and per chat
one chat should not have multiple conflicting active generations
the API must reject overload before enqueuing dangerous work

The use of transactional advisory locks here is a sign of mature concurrency thinking.

6.7 API enqueues worker job

After the run is admitted, the API enqueues the job in BullMQ.

Why queue after durable DB write:

DB becomes the source of truth
if enqueue fails, the run can be marked failed explicitly
the system is not dependent on HTTP lifetime

6.8 Browser switches from request stream to durable run stream

The API immediately begins streaming persisted/live run events via streamRunEventsFromPersistence.

Why this handoff is elegant:

the frontend can keep one streaming UX
internally, the source is durable run event persistence, not just the original request handler

This is how Edward bridges synchronous UX and asynchronous execution.

7. The second important flow: worker-run execution

Worker entry:

apps/api/services/runs/agent-run-worker/processor.ts

7.1 Worker reloads durable context

The worker fetches:

run record
metadata
user API key
historical conversation context

Why:

worker must be independently restartable
it cannot depend on the request process keeping in-memory state alive

7.2 Worker subscribes to cancellation

It listens on Redis pub/sub channels like edward:run-cancel:<runId>.

Why pub/sub plus DB terminal checks both exist:

pub/sub gives low-latency cancellation
DB polling gives durable truth if a pub/sub signal is missed

This dual mechanism is deliberate defense-in-depth.

7.3 Worker marks run running

The worker updates durable run state to running.

Why not mark it when API enqueues:

enqueued is not the same as actively executing
status must reflect reality, not intent

7.4 Worker captures run events through a fake response

This is a subtle but strong pattern.

createRunEventCaptureResponse lets the streaming session code write events as if it were writing to an HTTP response, while the worker intercepts those events and persists/publishes them.

Why this is good:

shared stream-session logic can be reused by API and worker paths
the event producer does not need to know whether the sink is a real socket or a persistence pipeline

7.5 Worker finalizes success or failure

Success path:

update terminal run state
clear checkpoint
store duration/latency metadata

Failure path:

persist error and terminal completion events
mark run failed

Why explicit finalize helpers exist:

terminal transitions are high-risk correctness points
centralizing them reduces double-completion bugs

8. The stream runtime and why it is designed this way

Core files:

runStreamSession.orchestrator.ts
agentLoop.runner.ts
events/handler.ts
lib/llm/parser.ts

8.1 Why orchestration is separate from the raw LLM client

LLM API calls are the smallest part of the feature.

Edward also needs:

prompt composition
token budgeting
parser state handling
sandbox side effects
tool execution
validation/autofix/retry
persistence/finalization

That is why the orchestration layer exists above provider.client.ts.

8.2 Why the model outputs tagged markup

Edward instructs the model to output strict Edward tags:

thinking
response
sandbox
file
install
command
web search
done

Why tagged output instead of “just ask for code”:

the product needs machine-readable execution intent
file boundaries must be recoverable
installs and commands must be explicit
partial streaming must still be parseable

This is a classic “LLM as structured protocol emitter” design.

8.3 Why there is a streaming parser state machine

lib/llm/parser.ts is a state machine because streamed output arrives in incomplete chunks.

Why not parse with simple regex over full strings:

chunks can split tags across boundaries
file/install/sandbox sections can nest temporal states
incomplete output must still be handled safely

This parser is not overengineering. It is required for correctness in streamed generation.

8.4 Why there is an agent loop, not one LLM call

runAgentLoop supports multiple turns.

Why:

the model may need to inspect, write, install, command, then continue
tool results need to feed back into later reasoning
retries/continuations need bounded turn accounting

Why hard budgets exist:

prevent runaway loops
bound cost
bound context growth
preserve operational predictability

8.5 Why token usage is computed before and during execution

Edward computes provider-aware token usage because context exhaustion is one of the most common real failure modes in agent systems.

Why this is necessary:

different providers have different token windows
multimodal content changes token budgeting
strictly reserving output tokens prevents input context from crowding out the response budget
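
Output reservation comes down to simple arithmetic: the input budget is the context window minus the tokens held back for the response. The sketch below assumes a hypothetical `ModelLimits` shape and a "drop the oldest messages first" trimming policy. Neither is taken from Edward's token code, and the numbers in the usage note are made up.

```typescript
interface ModelLimits {
  contextWindow: number;   // total tokens the provider accepts
  reservedOutput: number;  // tokens held back for the response
}

function inputBudget(limits: ModelLimits): number {
  return Math.max(0, limits.contextWindow - limits.reservedOutput);
}

// Keeps the newest messages that fit; returns their indices in original order.
function trimToBudget(
  tokenCounts: number[],
  limits: ModelLimits,
): { kept: number[]; total: number } {
  const budget = inputBudget(limits);
  const kept: number[] = [];
  let total = 0;
  for (let i = tokenCounts.length - 1; i >= 0; i--) {
    const cost = tokenCounts[i] ?? 0;
    if (total + cost > budget) break; // older messages no longer fit
    total += cost;
    kept.unshift(i);
  }
  return { kept, total };
}
```

With a 1000-token window and 300 tokens reserved, messages costing 400, 300, 200, and 100 tokens (oldest first) keep only the last three (600 tokens), because adding the oldest would exceed the 700-token input budget.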

8.6 Why post-generation validation, autofix, and strict retry exist

Generated code is probabilistic. Production systems must add deterministic safety rails.

Edward uses:

postgen validation
deterministic autofixes
strict retry

Why this layered approach is better than “just regenerate everything”:

deterministic fixes are cheaper and faster
validation localizes problems
retry is only used when the output contract is still violated

This is one of the strongest “productionized AI” patterns in the repo.
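
The layering can be sketched as a small control loop: validate, apply deterministic fixes where possible, and fall back to regeneration only when issues remain. Everything here (`finalizeOutput`, `Steps`, `Issue`, the retry cap) is a hypothetical shape for illustration and does not reflect the actual validator or autofix interfaces.

```typescript
type Files = Record<string, string>;
interface Issue { file: string; rule: string; fixable: boolean }
interface Steps {
  validate(files: Files): Issue[];
  autofix(files: Files, issues: Issue[]): Files;     // deterministic, no LLM call
  regenerate(files: Files, issues: Issue[]): Files;  // stand-in for a strict LLM retry
}
interface Outcome { ok: boolean; files: Files; retries: number; issues: Issue[] }

function finalizeOutput(files: Files, steps: Steps, maxRetries: number): Outcome {
  let current = files;
  let retries = 0;
  for (;;) {
    let issues = steps.validate(current);
    const fixable = issues.filter((i) => i.fixable);
    if (fixable.length > 0) {
      // Cheap path first: deterministic fixes, then re-validate.
      current = steps.autofix(current, fixable);
      issues = steps.validate(current);
    }
    if (issues.length === 0) return { ok: true, files: current, retries, issues };
    if (retries >= maxRetries) return { ok: false, files: current, retries, issues };
    // Expensive path: only reached when the contract is still violated.
    current = steps.regenerate(current, issues);
    retries++;
  }
}
```

The `retries` counter in the result makes it visible how often the expensive path was actually needed.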

9. Planning workflow and why it exists separately from stream execution

Planning modules:

services/planning/schemas.ts
workflow/engine.ts
analyzers/intentAnalyzer.ts
resolvers/dependency.resolver.ts
validators/postgenValidator.ts

9.1 Why planning is a workflow

Because planning has recoverable phases, not just a single pass.

Phases include:

ANALYZE
RESOLVE_PACKAGES
INSTALL_PACKAGES
GENERATE
BUILD
DEPLOY
RECOVER

Why explicit phase modeling matters:

allows retries with context
improves debuggability
lets the system fail in a known stage
reduces the amount of work shoved into one prompt
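
A rough sketch of explicit phase modeling follows. The phase names come from the list above, but the control flow is an assumption: this example treats RECOVER as a step entered between attempts of a failed phase, which may not match how workflow/engine.ts actually sequences recovery. `runWorkflow` and `PhaseRunner` are hypothetical names.

```typescript
type Phase =
  | "ANALYZE" | "RESOLVE_PACKAGES" | "INSTALL_PACKAGES"
  | "GENERATE" | "BUILD" | "DEPLOY";

const PHASE_ORDER: Phase[] = [
  "ANALYZE", "RESOLVE_PACKAGES", "INSTALL_PACKAGES", "GENERATE", "BUILD", "DEPLOY",
];

// Returns true when the phase succeeded on this attempt.
type PhaseRunner = (phase: Phase, attempt: number) => boolean;

interface WorkflowResult { completed: Phase[]; failedAt: Phase | null; log: string[] }

function runWorkflow(run: PhaseRunner, maxAttempts: number): WorkflowResult {
  const completed: Phase[] = [];
  const log: string[] = [];
  for (const phase of PHASE_ORDER) {
    let ok = false;
    for (let attempt = 1; attempt <= maxAttempts && !ok; attempt++) {
      ok = run(phase, attempt);
      log.push(`${phase}#${attempt}:${ok ? "ok" : "fail"}`);
      // Assumed behaviour: recovery runs before the next attempt of the same phase.
      if (!ok && attempt < maxAttempts) log.push(`RECOVER:${phase}`);
    }
    if (!ok) return { completed, failedAt: phase, log }; // fail in a known stage
    completed.push(phase);
  }
  return { completed, failedAt: null, log };
}
```

Because each phase is named, a failure reports exactly where it happened (`failedAt`), and the log records how many attempts each phase needed.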

9.2 Why intent analysis uses the LLM but is schema-constrained

intentAnalyzer.ts asks the model for JSON and validates it with Zod.

Why:

intent classification is a fuzzy problem
but downstream code wants structured outputs

So the design is:

use LLM for ambiguity resolution
use schema validation for control
use fallback logic when classification fails

This is the right balance.
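
The "LLM for ambiguity, schema for control, fallback on failure" pattern can be sketched as follows. To keep the example dependency-free, a hand-written type guard stands in for the Zod schema. The `Intent` shape, the action values (borrowed from the create/edit/fix wording in the overview), and the fallback value are all assumptions.

```typescript
type Action = "create" | "edit" | "fix";
interface Intent { action: Action; confidence: number }

// Hypothetical fallback used whenever the model output cannot be trusted.
const FALLBACK_INTENT: Intent = { action: "create", confidence: 0 };

function isAction(x: unknown): x is Action {
  return x === "create" || x === "edit" || x === "fix";
}

// Parses raw model output; never throws, always returns a usable Intent.
function parseIntent(raw: string): Intent {
  let value: unknown;
  try {
    value = JSON.parse(raw);
  } catch {
    return FALLBACK_INTENT; // model did not return JSON at all
  }
  if (typeof value !== "object" || value === null) return FALLBACK_INTENT;
  const record = value as Record<string, unknown>;
  const action = record["action"];
  const confidence = record["confidence"];
  if (!isAction(action)) return FALLBACK_INTENT;
  if (typeof confidence !== "number" || confidence < 0 || confidence > 1) return FALLBACK_INTENT;
  return { action, confidence };
}
```

Downstream code only ever sees a valid `Intent`, so a malformed or unexpected model response degrades to the fallback instead of propagating.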

9.3 Why dependency resolution exists

The model may recommend packages, but the runtime must filter/verify them.

Why:

package names can be wrong
peer conflicts matter
some packages are blocked for sandbox/runtime reasons

This is why the system does not blindly trust model-emitted package lists.
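
A minimal sketch of the filtering step, assuming a simplified name pattern and a caller-supplied blocklist. `filterPackages`, the regular expression, and the rejection reasons are illustrative only. The pattern approximates npm naming rules rather than implementing them fully, and peer-conflict checks are out of scope here.

```typescript
// Simplified approximation of npm's naming rules, not the full specification.
const NPM_NAME = /^(@[a-z0-9][a-z0-9._-]*\/)?[a-z0-9][a-z0-9._-]*$/;

interface Rejection { name: string; reason: "invalid_name" | "blocked" | "duplicate" }

function filterPackages(
  requested: string[],
  blocked: string[],
): { accepted: string[]; rejected: Rejection[] } {
  const accepted: string[] = [];
  const rejected: Rejection[] = [];
  for (const raw of requested) {
    const name = raw.trim();
    if (name.length === 0 || name.length > 214 || !NPM_NAME.test(name)) {
      rejected.push({ name: raw, reason: "invalid_name" });
    } else if (blocked.indexOf(name) !== -1) {
      rejected.push({ name, reason: "blocked" });
    } else if (accepted.indexOf(name) !== -1) {
      rejected.push({ name, reason: "duplicate" });
    } else {
      accepted.push(name);
    }
  }
  return { accepted, rejected };
}
```

Keeping the rejection reasons lets the caller surface why a model-suggested package was dropped instead of silently ignoring it.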

10. Sandbox architecture and why it is unusually important here

Key modules:

lifecycle/provisioning.ts
docker.service.ts
write/buffer.ts
write/flush.ts
command.service.ts
builder/unified-build/orchestrator.ts
state.service.ts

10.1 Why sandbox state is in Redis

Sandbox instances are ephemeral runtime resources with TTLs.

Why Redis instead of Postgres here:

sandbox liveness is operational state, not primary product truth
TTL refresh and quick lookup matter
container lifecycle reconciliation is fast-path coordination

10.2 Why sandbox writes are buffered

Edward streams file content incrementally. Writing every tiny chunk directly to disk/container would be noisy and slow.

So the system:

buffers file chunks in Redis
periodically flushes them to the container
uses distributed locks around flush

Why this is smart:

reduces write churn
handles chunked file streaming naturally
coordinates concurrent writes safely
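
The buffering idea can be sketched with an in-memory stand-in. In this example a `Map` replaces Redis, a boolean flag replaces the distributed lock (it only guards against re-entrant flushes within one process), and `appendToFile` is a hypothetical callback representing the write into the container.

```typescript
type AppendFn = (path: string, content: string) => void;

class WriteBuffer {
  private pending = new Map<string, string[]>(); // stand-in for Redis
  private flushing = false;                      // stand-in for a distributed lock
  private appendToFile: AppendFn;

  constructor(appendToFile: AppendFn) {
    this.appendToFile = appendToFile;
  }

  // Called for every streamed chunk; cheap, no container I/O.
  append(path: string, chunk: string): void {
    const parts = this.pending.get(path);
    if (parts) parts.push(chunk);
    else this.pending.set(path, [chunk]);
  }

  // Writes one combined append per file; returns the number of files written.
  flush(): number {
    if (this.flushing) return 0; // another flush is already in progress
    this.flushing = true;
    try {
      const batch = this.pending;
      this.pending = new Map<string, string[]>();
      batch.forEach((parts, path) => this.appendToFile(path, parts.join("")));
      return batch.size;
    } finally {
      this.flushing = false;
    }
  }
}
```

Three chunks for the same file result in a single write on flush. This sketch drops a batch if the callback throws partway through, so a production version would need to re-queue unflushed chunks.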

10.3 Why protected framework files exist

The template registry marks files like package.json, tsconfig, framework configs, and core CSS files as protected.

Why:

models are much less reliable when editing sensitive build/config files
most user value is in app code, not infra/config drift
protecting these files preserves build stability

This is a deliberate product safety rail, not an accidental limitation.

10.4 Why command execution is allowlisted

command.service.ts validates:

command name
argument count/length
path safety
dangerous patterns

Why:

the sandbox is isolated, but still not trusted blindly
guardrails reduce accidental destructive behavior
product behavior becomes auditable and predictable
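
The four checks listed above can be sketched as one validation function. The allowlist contents, the limits, and the metacharacter pattern below are hypothetical values chosen for the example and are not taken from command.service.ts.

```typescript
const ALLOWED_COMMANDS = ["npm", "pnpm", "node", "ls", "cat"]; // hypothetical allowlist
const MAX_ARGS = 16;
const MAX_ARG_LENGTH = 256;
const DANGEROUS = /[;&|`$<>\n]/; // shell metacharacters that enable chaining or redirection

interface Verdict { ok: boolean; reason?: string }

function validateCommand(name: string, args: string[]): Verdict {
  // 1. Command name must be explicitly allowed.
  if (ALLOWED_COMMANDS.indexOf(name) === -1) return { ok: false, reason: "command_not_allowed" };
  // 2. Argument count and length are bounded.
  if (args.length > MAX_ARGS) return { ok: false, reason: "too_many_args" };
  for (const arg of args) {
    if (arg.length > MAX_ARG_LENGTH) return { ok: false, reason: "arg_too_long" };
    // 3. Dangerous patterns are rejected outright.
    if (DANGEROUS.test(arg)) return { ok: false, reason: "dangerous_pattern" };
    // 4. Paths must stay inside the project: no absolute paths, no ".." segments.
    const escapesRoot = arg.charAt(0) === "/" || arg.split("/").indexOf("..") !== -1;
    if (escapesRoot) return { ok: false, reason: "unsafe_path" };
  }
  return { ok: true };
}
```

Returning a machine-readable reason rather than a bare boolean is what makes rejected commands auditable.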

10.5 Why builds happen after generation in a unified build orchestrator

buildAndUploadUnified handles:

dependency presence checks
framework detection
merge/install dependency logic
build execution
preview upload
cache invalidation

Why not “just run npm run build”:

different frameworks output differently
preview hosting needs path/base handling
dependencies may need reconciliation
upload and routing are part of the build product

11. Preview and deployment architecture

11.1 Path mode vs subdomain mode

Configured via EDWARD_DEPLOYMENT_TYPE.

Why two modes:

local/self-hosted environments often want simple path-based previews
production environments may want nicer subdomain-based previews

This avoids forcing one infrastructure assumption everywhere.
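
As a sketch of how one setting can switch URL shape, consider the helper below. The accepted values (`"path"`, `"subdomain"`), the default, and both URL layouts are assumptions. This document only states that EDWARD_DEPLOYMENT_TYPE selects the mode.

```typescript
type DeploymentType = "path" | "subdomain";

// Assumed values; anything other than "subdomain" falls back to path mode.
function parseDeploymentType(value: string | undefined): DeploymentType {
  return value === "subdomain" ? "subdomain" : "path";
}

// Hypothetical URL layouts for the two modes.
function previewUrl(type: DeploymentType, baseDomain: string, previewId: string): string {
  return type === "subdomain"
    ? `https://${previewId}.${baseDomain}/`
    : `https://${baseDomain}/previews/${previewId}/`;
}
```

A caller would pass the environment value through `parseDeploymentType` once at startup, so the rest of the code only deals with the two known modes.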

11.2 Why preview routing uses KV

Subdomain routing needs a fast edge lookup from subdomain -> storage prefix.

Cloudflare KV is a practical fit because it offers:

low-latency reads at the edge
a simple key-value mapping
decoupling from the main DB

Why not use Postgres for request-time routing:

worse latency profile for edge routing
unnecessary coupling between preview serving and primary transactional DB

11.3 Why preview URL is also stored on build records

Because the user cares about “what is the latest preview for this chat/build”.

Persisting preview URLs on builds means:

API can answer build status quickly
UI can bootstrap preview state without recomputing routing every time

12. GitHub integration architecture

Important files:

packages/octokit/index.ts
apps/api/services/github/sync.service.ts
apps/api/services/github/token.service.ts
apps/api/services/github/repoBinding.service.ts

12.1 Why repo binding is a first-class concept

Chats/projects can be linked to repos.

Why bind at chat level:

a chat represents a project
repo sync is a project concern, not a user-global concern

12.2 Why GitHub token handling is wrapped

token.service.ts decrypts and migrates token storage.

Why centralize this:

auth provider data should not be parsed ad hoc everywhere
token encryption migration needs one place

12.3 Why sync uses a manifest

packages/octokit/index.ts uses .edward-sync-manifest.json.

Why:

lets Edward track which files it manages
supports deletion of files removed locally
avoids blind destructive sync over unknown repo content

This is a very practical design choice.
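
The manifest logic can be sketched as a set difference. The manifest shape (`{ files: string[] }`) and `planSync` are assumptions for illustration. The real format of .edward-sync-manifest.json is not described in this document.

```typescript
interface SyncManifest { files: string[] } // hypothetical manifest shape
interface SyncPlan { upsert: string[]; remove: string[]; next: SyncManifest }

// Compares the previous manifest with the current local file list.
function planSync(previous: SyncManifest, localFiles: string[]): SyncPlan {
  const local = localFiles.slice().sort();
  // Only files Edward previously managed are eligible for deletion.
  const remove = previous.files.filter((f) => local.indexOf(f) === -1).sort();
  return { upsert: local, remove, next: { files: local } };
}
```

A file such as README.md that exists in the repo but never appeared in the manifest can never end up in `remove`, which is the "no blind destructive sync" property.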

13. Frontend state architecture

13.1 Server state

Handled mainly with React Query:

chat history
metadata
quotas
active run lookup
GitHub status

Why:

cache lifecycle
stale time
refetch policies
request deduplication

13.2 Stream state

Handled with a Zustand chat stream store.

Why:

append-heavy mutable event streams
low friction updates per chunk/frame
simpler than putting stream mutation logic into React component trees

13.3 Sandbox UI state

Handled with Zustand sandbox slices:

files
editor selection
preview URL
build status/errors
terminal output
open/close state

Why separate from chat stream state:

stream state represents live assistant output
sandbox state represents persistent project workspace UI

These are related but not identical concerns.

13.4 Why the chat route does server-side access probing

apps/web/app/chat/[id]/page.tsx checks access and metadata server-side.

Why:

avoids client-only “flash then deny”
supports route metadata generation
improves correctness for private chat pages

14. Security posture and why these controls exist

Important controls:

auth middleware for all protected API routes
rate limits backed by Redis
encrypted API keys
command allowlists
protected template files
Docker isolation
CSP, Helmet, and CORS on the API
request ids and security telemetry

Why the repo has many small security modules instead of one giant security file:

security concerns happen at different layers
auth, rate limit, encryption, runtime isolation, and telemetry are separate controls

This is the correct decomposition.

15. Reliability posture and why the repo feels “production-minded”

Signals of maturity:

graceful shutdown in API and worker
durable run events
queue-based long-running execution
resumable streams using Last-Event-ID
cancellation via pub/sub plus durable verification
DB-backed admission control
checkpointing of agent loop state
explicit terminal finalization logic
post-generation validators
quality gate scripts

These are not “extra code”. They are the difference between demo code and production-oriented code.

16. Testing and quality gates

Current rough shape:

API tests: about 91 files
Web tests: light, about 3 files
Shared package tests: light but present

Why API tests dominate:

most complexity and failure modes live in orchestration, streaming, sandboxing, and workers
UI is large, but much of it is composition/presentation on top of backend contracts

Quality scripts in apps/api/scripts/quality enforce:

architecture boundaries
duplication checks
coverage checks
function-length checks
file audit generation

Why these custom scripts exist:

generic linting does not enforce architecture well enough
this codebase has non-trivial layering rules

17. Key things a junior engineer must understand before making changes

17.1 Never confuse message history with run execution history

message is the conversation artifact
run_event is the execution/replay log

17.2 Never treat Redis state as the source of truth for business data

Redis is coordination state. Postgres is durable business truth.

17.3 Never assume the browser connection is the lifetime of the work

The worker owns durable execution. The browser only observes it.

17.4 Never casually edit protected sandbox template files or remove guardrails

Those protections exist because AI-generated config churn destroys stability.

17.5 Never bypass run admission and queueing for “quick fixes”

That breaks fairness, concurrency guarantees, and operational predictability.

17.6 Never add a new stream event without updating both sides

If you change stream contracts, review:

backend emitters
persistence/replay
frontend stream processor
shared type contracts

18. How I would explain the main architectural “WHY” in one paragraph

19. File map by major area

This section is the “how do I navigate the repo quickly” map.

19.1 Root and workspace

README.md: product + local setup overview
package.json: top-level scripts
turbo.json: build graph and env config
pnpm-workspace.yaml: workspace package boundaries
scripts/build-local-sandboxes.sh: local sandbox image prep

19.2 API composition and delivery

apps/api/server.http.ts: API bootstrap and shutdown
apps/api/queue.worker.ts: worker bootstrap and background loops
apps/api/server/http/app.factory.ts: Express app assembly
apps/api/routes/*.ts: route wiring
apps/api/controllers/chat/query/*.ts: read/query/build/run delivery controllers
apps/api/middleware/*.ts: auth, rate limiting, validation, telemetry

19.3 Runs and execution

apps/api/services/runs/messageOrchestrator.service.ts: main send-message entry
apps/api/services/runs/runAdmission.service.ts: load shedding and admission control
apps/api/services/runs/runMetadata.ts: durable metadata/checkpoint schema
apps/api/services/runs/runEvents.service.ts: publish/persist run events
apps/api/services/runs/agent-run-worker/*: worker execution engine
apps/api/services/run-event-stream-utils/service.ts: replay + live SSE bridge

19.4 Chat session runtime

apps/api/services/chat/session/orchestrator/runStreamSession.orchestrator.ts
apps/api/services/chat/session/loop/agentLoop.runner.ts
apps/api/services/chat/session/events/handler.ts
apps/api/services/chat/session/orchestrator/*
apps/api/services/chat/session/loop/*

19.5 Planning

apps/api/services/planning/schemas.ts
apps/api/services/planning/analyzers/intentAnalyzer.ts
apps/api/services/planning/resolvers/dependency.resolver.ts
apps/api/services/planning/validators/*
apps/api/services/planning/workflow/*

19.6 LLM abstraction

apps/api/lib/llm/provider.client.ts: provider-specific generation/streaming
apps/api/lib/llm/provider.helpers.ts: normalization and provider/model checks
apps/api/lib/llm/compose.ts: prompt assembly
apps/api/lib/llm/prompts/sections.ts: main system prompt contract
apps/api/lib/llm/parser*.ts: streaming parser
apps/api/lib/llm/tokens*.ts: token budgeting

19.7 Sandbox

apps/api/services/sandbox/lifecycle/provisioning.ts
apps/api/services/sandbox/lifecycle/cleanup.ts
apps/api/services/sandbox/docker.service.ts
apps/api/services/sandbox/state.service.ts
apps/api/services/sandbox/command.service.ts
apps/api/services/sandbox/write/*
apps/api/services/sandbox/read/*
apps/api/services/sandbox/builder/*
apps/api/services/sandbox/templates/*

19.8 Preview/storage/routing

apps/api/services/storage.service.ts
apps/api/services/storage/*
apps/api/services/preview.service.ts
apps/api/services/previewRouting/*

19.9 GitHub

apps/api/services/github/github.useCase.ts
apps/api/services/github/sync.service.ts
apps/api/services/github/token.service.ts
apps/api/services/github/repoBinding.service.ts
packages/octokit/index.ts

19.10 Frontend routing and app shell

apps/web/app/layout.tsx
apps/web/app/providers.tsx
apps/web/app/page.tsx
apps/web/app/chat/[id]/page.tsx
apps/web/app/changelog/page.tsx
apps/web/app/api/auth/[...all]/route.ts

19.11 Frontend chat workspace

apps/web/components/chat/chatPageClient.tsx
apps/web/components/chat/chatWorkspace.tsx
apps/web/components/chat/chatWorkspaceDesktop.tsx
apps/web/components/chat/chatWorkspaceMobile.tsx
apps/web/components/chat/messages/*
apps/web/components/chat/sandbox/*
apps/web/stores/chatStream/*
apps/web/stores/sandbox/*
apps/web/lib/streaming/processors/chatStreamProcessor.ts
apps/web/lib/parsing/*
apps/web/lib/api/*

19.12 Shared packages

packages/auth/lib/schema.ts
packages/auth/lib/auth.ts
packages/auth/lib/db.ts
packages/auth/lib/run.ts
packages/auth/lib/build.ts
packages/shared/src/constants.ts
packages/shared/src/schema.ts
packages/shared/src/streamEvents.ts
packages/shared/src/chat/types.ts
packages/shared/src/chat/streamActions.ts
packages/shared/src/api/contracts.ts
packages/ui/src/components/*

20. Directory-level inventory

This is not a prose description of every leaf UI component, because that would bury the actual KT. Instead, use this as the completeness map for where code lives.

20.1 `apps/api`

Major subareas:

controllers/
routes/
middleware/
lib/
services/
schemas/
utils/
tests/

High-density domains:

services/chat
services/runs
services/sandbox
services/planning
services/github
services/queue

20.2 `apps/web`

Major subareas:

app/
components/chat/
components/home/
components/changelog/
hooks/
lib/
stores/chatStream/
stores/sandbox/

High-density domains:

chat UI and workspace
SSE parsing and stream orchestration
sandbox/editor/preview UI

20.3 `packages/auth`

Major subareas:

lib/: auth, db, schema, build/run helpers
drizzle/: SQL migrations

20.4 `packages/shared`

Major subareas:

src/constants.ts
src/schema.ts
src/streamEvents.ts
src/chat/*
src/github/*
src/api/*
src/llm/*

20.5 `packages/ui`

Major subareas:

src/components/: reusable primitives
src/hooks/
src/lib/
src/styles/

20.6 `docker/templates`

Major subareas:

nextjs
vite-react
vanilla
base

These templates define the scaffold/runtime assumptions for generated projects and sandbox images.

21. Practical onboarding order for a new engineer

If I were onboarding someone senior-but-new, I would ask them to read in this exact order:

Root README.md
apps/api/README.md
packages/auth/lib/schema.ts
apps/api/server.http.ts
apps/api/server/http/app.factory.ts
apps/api/routes/chat.routes.ts
apps/api/services/runs/messageOrchestrator.service.ts
apps/api/services/runs/runAdmission.service.ts
apps/api/services/runs/agent-run-worker/processor.ts
apps/api/services/chat/session/orchestrator/runStreamSession.orchestrator.ts
apps/api/services/chat/session/loop/agentLoop.runner.ts
apps/api/services/chat/session/events/handler.ts
apps/api/services/sandbox/lifecycle/provisioning.ts
apps/api/services/sandbox/builder/unified-build/orchestrator.ts
apps/web/app/chat/[id]/page.tsx
apps/web/components/chat/chatPageClient.tsx
apps/web/stores/chatStream/controller.ts
apps/web/lib/streaming/processors/chatStreamProcessor.ts

If they understand those files, they understand the heart of Edward.

22. Final summary

Edward is a monorepo for a durable AI code-generation product, not a thin LLM wrapper. The architecture optimizes for correctness, replayability, operator safety, and product durability:

Postgres keeps durable truth.
Redis handles fast coordination.
BullMQ decouples request acceptance from long-running execution.
Docker sandboxes isolate generated code.
S3/CDN/KV separate preview hosting from execution.
SSE plus persisted run events make streaming resumable.
Shared packages prevent contract drift.
Planning, validation, autofix, and retry layers turn model output into something operationally usable.

That is the core “why” behind almost every serious architectural decision in this repo.
