Edward is an AI-assisted frontend application builder.
At the product level, the user experience is:
A user signs in with GitHub.
The user stores their own LLM API key inside the product.
The user asks Edward to create, edit, or fix a web app in chat.
Edward plans the work, streams progress, writes files into an isolated sandbox, installs dependencies, runs commands, builds the app, uploads preview assets, and optionally syncs the code to GitHub.
The user sees the generated app, file tree, terminal/build feedback, and live preview.
The system is intentionally split across:
apps/web: the product UI and auth surface
apps/api: the synchronous HTTP API, SSE delivery, orchestration, and infrastructure adapters
This split is not accidental. Edward is not just “a chat app”. It is a distributed system with:
durable state
transient coordination state
long-running execution
resumable streaming
isolated code execution
artifact publishing
external system integration
That is the fundamental lens juniors should use when reading this repo.
2. The highest-level architectural decisions and why they exist
2.1 Why a monorepo
Edward uses a pnpm + Turbo monorepo because the frontend, API, worker, DB schema, shared contracts, and UI primitives evolve together.
Why this was chosen instead of separate repos:
Shared contracts are first-class. Stream events, chat types, API response shapes, model catalogs, and rate-limit scopes must not drift.
Frontend and backend changes usually land together. A single PR often changes stream event shape, API behavior, and UI rendering.
Tooling is simpler. One workspace gives consistent TypeScript, ESLint, and build orchestration.
Local development is more realistic. pnpm dev starts the whole system shape, not disconnected fragments.
Tradeoff:
Monorepos become large and noisy.
Build graph discipline is required.
This repo counters that with:
separate packages for shared concerns
Turbo task boundaries
architecture boundary checks in apps/api/scripts/quality
2.2 Why Next.js on the frontend
apps/web is a Next.js 16 App Router app.
Why Next.js instead of plain Vite for the product UI:
server-rendered marketing/auth/changelog pages are easier to build
metadata, sitemap, robots, and SEO handling are built in
auth route handling integrates cleanly with Better Auth
product pages and landing pages can live in the same app
Important nuance:
Edward generates Vite/Next/Vanilla projects for users, but the Edward product itself uses Next.js. Those are different concerns.
2.3 Why Express for the API instead of putting everything inside Next route handlers
The repo deliberately keeps orchestration in apps/api as a separate Express app.
Why:
stream-oriented chat delivery is easier to reason about in a dedicated API
queue worker and API share backend services cleanly
long-running orchestration should not be tightly coupled to the frontend deployment shape
HTTP API can scale independently from the web UI
operational concerns like graceful shutdown, Redis connections, worker coordination, and SSE backpressure are clearer in a dedicated server
Why not “just use Next API routes”:
the runtime model becomes mixed and harder to reason about
long-running SSE and worker orchestration become more awkward
explicit process separation is valuable here
2.4 Why a dedicated worker process
The API process accepts requests. The worker process executes long-running jobs.
Why this is the correct split:
user HTTP connections are short-lived and unreliable
LLM execution, sandbox interaction, dependency install, build, and artifact publishing can outlive the browser connection
workers give retries, concurrency control, and isolation from request latency
the system can continue work even if the client disconnects
This is one of the most important architectural choices in the whole repo.
Without it, Edward would behave like a fragile synchronous demo.
With it, Edward behaves like a real product with durable execution.
2.5 Why Postgres
Postgres is the durable source of truth.
It stores:
users, sessions, accounts
chats and messages
runs and run events
builds
attachments
repo bindings and related product state
Why Postgres instead of Redis-only or document storage:
chat history and run history are durable product data, not cache
run events need ordering and replay
auth tables fit naturally into a relational model
ownership and filtering by user/chat/run are frequent and relational
transactional admission control is easier and safer
The strongest proof is createRunWithUserLimit in packages/auth/lib/run.ts. It uses DB transactions and advisory locks for safe concurrent run admission.
2.6 Why Redis
Redis is used throughout the backend, but only for fast, ephemeral, coordination-heavy workloads.
Redis responsibilities:
BullMQ queue backend
distributed locks
sandbox state and TTLs
write buffers for streamed file output
pub/sub for run cancellation and live status fanout
request rate-limit stores
Why Redis instead of doing all of this in Postgres:
queues and pub/sub need low-latency operational semantics
lock contention and TTL semantics are simpler in Redis
sandbox write buffering is a classic ephemeral buffering problem
rate limiting is a cache-like, time-windowed workload
This is a good architecture split:
Postgres = durable truth
Redis = fast coordination and ephemeral runtime state
2.7 Why BullMQ
BullMQ is the job orchestration layer on top of Redis.
Why BullMQ:
already fits the Redis runtime Edward needs
supports worker concurrency and queue separation
familiar operational model for TypeScript/Node systems
good fit for “enqueue durable work from API, execute elsewhere”
Why not build a custom queue:
queue correctness is hard
retries, visibility, backpressure, and operational observability are all non-trivial
Edward keeps two main job categories:
build jobs
agent run jobs
That separation matters because build workloads and agent-stream workloads have different runtime behavior and failure modes.
2.8 Why Docker-backed sandboxes
This is the most product-defining infrastructure choice.
Edward writes and executes generated code in Docker containers.
Why:
isolation from the host machine
deterministic environment for install/build/command execution
safer command execution boundary
framework-specific template images can be prewarmed
easier cleanup and lifecycle control
Why not run code directly on the API machine filesystem:
far less safe
dependency conflicts become unmanageable
cleanup is harder
one broken project can poison the environment for others
The sandbox is not just an implementation detail. It is the product’s execution boundary.
2.9 Why S3 + CloudFront + optional Cloudflare KV
Edward separates “code generation” from “preview hosting”.
S3 stores preview artifacts
CloudFront serves previews and can be invalidated
Cloudflare KV optionally maps subdomains to preview storage prefixes
Why this is better than serving builds directly from live containers:
stores/chatStream/useStartStream.ts: new message stream mutation flow
lib/streaming/processors/chatStreamProcessor.ts: SSE event consumption into UI state
stores/sandbox/*: file/build/terminal UI state
hooks/server-state/*: React Query data fetching hooks
lib/api/*: API client surface
Why this frontend is split between React Query and Zustand:
React Query handles server state: chat history, metadata, active runs, quotas
Zustand handles high-churn UI runtime state: live streaming text, file chunks, open panel state, build errors, terminal lines
That is the right split.
If streaming state lived fully inside React Query, it would be awkward and too mutation-heavy.
If server state lived fully inside Zustand, cache invalidation and stale-fetch logic would get worse.
it cannot depend on the request process keeping in-memory state alive
7.2 Worker subscribes to cancellation
It listens on Redis pub/sub channels like edward:run-cancel:<runId>.
Why pub/sub plus DB terminal checks both exist:
pub/sub gives low-latency cancellation
DB polling gives durable truth if a pub/sub signal is missed
This dual mechanism is deliberate defense-in-depth.
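A minimal sketch of the dual mechanism, assuming a pub/sub callback on the fast path and a throttled durable-status read as the fallback. CancellationGuard, the status names, and the injected readStatus/now functions are hypothetical; a real durable read would be asynchronous, and it is kept synchronous here only for brevity.

```typescript
// Hypothetical status values; the real run schema may use different names.
type DurableRunStatus = "queued" | "running" | "cancelled" | "completed" | "failed";

class CancellationGuard {
  private signalled = false;
  private lastDurableCheck = Number.NEGATIVE_INFINITY;
  private readonly readStatus: () => DurableRunStatus;
  private readonly pollIntervalMs: number;
  private readonly now: () => number;

  constructor(readStatus: () => DurableRunStatus, pollIntervalMs: number, now: () => number) {
    this.readStatus = readStatus;
    this.pollIntervalMs = pollIntervalMs;
    this.now = now;
  }

  // Fast path: call this from the pub/sub subscriber callback.
  onCancelMessage(): void {
    this.signalled = true;
  }

  // Call between units of work. Falls back to the durable store at most
  // once per poll interval, so a dropped pub/sub message is still noticed.
  shouldStop(): boolean {
    if (this.signalled) return true;
    const t = this.now();
    if (t - this.lastDurableCheck >= this.pollIntervalMs) {
      this.lastDurableCheck = t;
      if (this.readStatus() === "cancelled") this.signalled = true;
    }
    return this.signalled;
  }
}
```

The worker would call shouldStop() between turns or tool steps; the poll interval trades durable-store load against worst-case cancellation latency when a signal is missed.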
7.3 Worker marks run running
The worker updates durable run state to running.
Why not mark it when the API enqueues the job:
enqueued is not the same as actively executing
status must reflect reality, not intent
7.4 Worker captures run events through a fake response
This is a subtle but strong pattern.
createRunEventCaptureResponse lets the streaming session code write events as if it were writing to an HTTP response, while the worker intercepts those events and persists/publishes them.
Why this is good:
shared stream-session logic can be reused by API and worker paths
the event producer does not need to know whether the sink is a real socket or a persistence pipeline
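A minimal sketch of the capture idea, assuming the stream-session code writes SSE-style frames ("data: <json>" followed by a blank line). StreamSink, CapturedEvent, and createCaptureSink are hypothetical names for illustration; the real createRunEventCaptureResponse may expose a different surface.

```typescript
// Hypothetical minimal surface the stream-session code writes to.
interface StreamSink {
  write(chunk: string): boolean;
  end(): void;
}

interface CapturedEvent {
  seq: number;      // ordering key, usable later for replay
  payload: unknown; // parsed JSON body of one SSE data line
}

// Builds a sink that looks like a response but forwards parsed events to a
// callback (persist/publish in a worker) instead of writing to a socket.
function createCaptureSink(onEvent: (e: CapturedEvent) => void): StreamSink {
  let pending = "";
  let seq = 0;
  let ended = false;
  return {
    write(chunk: string): boolean {
      if (ended) return false;
      pending += chunk;
      // SSE frames are separated by a blank line; keep any partial frame.
      let boundary = pending.indexOf("\n\n");
      while (boundary !== -1) {
        const frame = pending.slice(0, boundary);
        pending = pending.slice(boundary + 2);
        for (const line of frame.split("\n")) {
          // Assumption: producers always emit "data: " with one space.
          if (line.startsWith("data: ")) {
            seq += 1;
            onEvent({ seq, payload: JSON.parse(line.slice(6)) });
          }
        }
        boundary = pending.indexOf("\n\n");
      }
      return true;
    },
    end(): void {
      ended = true;
    },
  };
}
```

Because the producer only sees write and end, the same session code can target a real HTTP response in one path and this sink in another.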
7.5 Worker finalizes success or failure
Success path:
update terminal run state
clear checkpoint
store duration/latency metadata
Failure path:
persist error and terminal completion events
mark run failed
Why explicit finalize helpers exist:
terminal transitions are high-risk correctness points
centralizing them reduces double-completion bugs
8. The stream runtime and why it is designed this way
Core files:
runStreamSession.orchestrator.ts
agentLoop.runner.ts
events/handler.ts
lib/llm/parser.ts
8.1 Why orchestration is separate from the raw LLM client
LLM API calls are the smallest part of the feature.
Edward also needs:
prompt composition
token budgeting
parser state handling
sandbox side effects
tool execution
validation/autofix/retry
persistence/finalization
That is why the orchestration layer exists above provider.client.ts.
8.2 Why the model outputs tagged markup
Edward instructs the model to output strict Edward tags:
thinking
response
sandbox
file
install
command
web search
done
Why tagged output instead of “just ask for code”:
the product needs machine-readable execution intent
file boundaries must be recoverable
installs and commands must be explicit
partial streaming must still be parseable
This is a classic “LLM as structured protocol emitter” design.
8.3 Why there is a streaming parser state machine
lib/llm/parser.ts is a state machine because streamed output arrives in incomplete chunks.
Why not parse with simple regex over full strings:
chunks can split tags across boundaries
file/install/sandbox sections can be nested, so the parser must track which section is currently open
incomplete output must still be handled safely
This parser is not overengineering. It is required for correctness in streamed generation.
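A minimal sketch of a chunk-safe tag parser, to illustrate why a state machine is needed. The tag syntax (<response>, <file> with no attributes), the event shapes, and TagStreamParser are assumptions for illustration and are not taken from lib/llm/parser.ts.

```typescript
// Hypothetical tag vocabulary and syntax; real Edward tags may carry attributes.
const KNOWN_TAGS = ["response", "file"];
const TAG_TOKENS: string[] = [];
for (const t of KNOWN_TAGS) TAG_TOKENS.push(`<${t}>`, `</${t}>`);

type ParserEvent =
  | { kind: "open"; tag: string }
  | { kind: "close"; tag: string }
  | { kind: "text"; tag: string | null; content: string };

class TagStreamParser {
  private buffer = "";
  private stack: string[] = [];

  feed(chunk: string): ParserEvent[] {
    const events: ParserEvent[] = [];
    this.buffer += chunk;
    while (this.buffer.length > 0) {
      const lt = this.buffer.indexOf("<");
      if (lt === -1) {
        events.push(this.text(this.buffer));
        this.buffer = "";
        break;
      }
      if (lt > 0) {
        events.push(this.text(this.buffer.slice(0, lt)));
        this.buffer = this.buffer.slice(lt);
      }
      const token = TAG_TOKENS.find((t) => this.buffer.startsWith(t));
      if (token !== undefined) {
        this.buffer = this.buffer.slice(token.length);
        if (token.charAt(1) === "/") {
          // Sketch only: assumes well-formed nesting.
          this.stack.pop();
          events.push({ kind: "close", tag: token.slice(2, -1) });
        } else {
          const tag = token.slice(1, -1);
          this.stack.push(tag);
          events.push({ kind: "open", tag });
        }
        continue;
      }
      // Possibly the start of a tag split across chunks: wait for more input.
      if (TAG_TOKENS.some((t) => t.startsWith(this.buffer))) break;
      // Not a tag (for example "<" inside generated code): emit as text.
      events.push(this.text("<"));
      this.buffer = this.buffer.slice(1);
    }
    return events;
  }

  private text(content: string): ParserEvent {
    const tag = this.stack.length > 0 ? this.stack[this.stack.length - 1] : null;
    return { kind: "text", tag, content };
  }
}
```

The held-back buffer is what a regex over each chunk cannot provide. A production parser would also need an end-of-stream step that flushes or rejects whatever is still buffered.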
8.4 Why there is an agent loop, not one LLM call
runAgentLoop supports multiple turns.
Why:
the model may need to inspect files, write files, install packages, run commands, and then continue
tool results need to feed back into later reasoning
retries/continuations need bounded turn accounting
Why hard budgets exist:
prevent runaway loops
bound cost
bound context growth
preserve operational predictability
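A minimal sketch of bounded turn accounting, assuming each turn reports whether it finished and how many tokens it consumed. runBoundedLoop and its types are hypothetical and synchronous for brevity; the real runAgentLoop is not shown in this document and will differ.

```typescript
interface TurnResult {
  done: boolean;      // the model signalled completion
  tokensUsed: number; // tokens consumed by this turn
}

interface LoopBudget {
  maxTurns: number;
  maxTokens: number;
}

type StopReason = "done" | "turn_budget" | "token_budget";

interface LoopOutcome {
  reason: StopReason;
  turns: number;
  tokens: number;
}

// Runs turns until the model is done or a hard budget is exhausted.
function runBoundedLoop(step: (turn: number) => TurnResult, budget: LoopBudget): LoopOutcome {
  let turns = 0;
  let tokens = 0;
  while (turns < budget.maxTurns) {
    const result = step(turns);
    turns += 1;
    tokens += result.tokensUsed;
    if (result.done) return { reason: "done", turns, tokens };
    if (tokens >= budget.maxTokens) return { reason: "token_budget", turns, tokens };
  }
  return { reason: "turn_budget", turns, tokens };
}
```

Returning an explicit stop reason lets the caller distinguish a finished run from one that was cut off by a budget.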
8.5 Why token usage is computed before and during execution
Edward computes provider-aware token usage because context exhaustion is one of the most common real failure modes in agent systems.
Why this is necessary:
different providers have different token windows
multimodal content changes token budgeting
reserving output tokens up front keeps a large prompt from crowding out the response budget
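The budgeting arithmetic can be sketched as follows, assuming the prompt token count has already been computed by a provider-aware tokenizer. checkTokenBudget and its field names are hypothetical, and the numbers in the usage note are illustrative rather than any provider's real limits.

```typescript
interface TokenBudgetInput {
  contextWindow: number;  // total tokens the model accepts (input + output)
  reservedOutput: number; // tokens held back for the response
  promptTokens: number;   // measured size of the composed prompt
}

interface TokenBudgetCheck {
  inputLimit: number; // what the prompt may use after the reservation
  remaining: number;  // negative when the prompt is over budget
  fits: boolean;
}

function checkTokenBudget(input: TokenBudgetInput): TokenBudgetCheck {
  const inputLimit = input.contextWindow - input.reservedOutput;
  const remaining = inputLimit - input.promptTokens;
  return { inputLimit, remaining, fits: remaining >= 0 };
}
```

For example, with a 128000-token window and 8000 tokens reserved for output, a 100000-token prompt fits with 20000 tokens to spare, while a 125000-token prompt is 5000 tokens over the input limit and should be trimmed before the call is made.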
8.6 Why post-generation validation, autofix, and strict retry exist
Generated code is probabilistic. Production systems must add deterministic safety rails.
Edward uses:
postgen validation
deterministic autofixes
strict retry
Why this layered approach is better than “just regenerate everything”:
deterministic fixes are cheaper and faster
validation localizes problems
retry is only used when the output contract is still violated
This is one of the strongest “productionized AI” patterns in the repo.
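A minimal sketch of the layering, assuming validation returns a list of violations and that only some of them are deterministically fixable. settleOutput, Violation, and the callback shapes are hypothetical and synchronous for brevity; they illustrate the ordering (validate, then autofix, then strict retry), not the repo's actual validator API.

```typescript
interface Violation {
  rule: string;
  fixable: boolean; // true if a deterministic autofix exists for this rule
}

type SettleOutcome = "valid" | "autofixed" | "retried" | "failed";

function settleOutput(
  initial: string,
  validate: (output: string) => Violation[],
  autofix: (output: string, fixable: Violation[]) => string,
  strictRetry: () => string,
): { outcome: SettleOutcome; output: string } {
  // Layer 1: accept output that already satisfies the contract.
  const violations = validate(initial);
  if (violations.length === 0) return { outcome: "valid", output: initial };

  // Layer 2: cheap deterministic fixes, then re-validate.
  const fixable = violations.filter((v) => v.fixable);
  if (fixable.length > 0) {
    const fixed = autofix(initial, fixable);
    if (validate(fixed).length === 0) return { outcome: "autofixed", output: fixed };
  }

  // Layer 3: one strict regeneration, only because the contract is still violated.
  const retried = strictRetry();
  if (validate(retried).length === 0) return { outcome: "retried", output: retried };
  return { outcome: "failed", output: retried };
}
```

The expensive model call sits in the last layer, so it is only paid for when validation and deterministic fixes cannot resolve the problem.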
9. Planning workflow and why it exists separately from stream execution
Planning modules:
services/planning/schemas.ts
workflow/engine.ts
analyzers/intentAnalyzer.ts
resolvers/dependency.resolver.ts
validators/postgenValidator.ts
9.1 Why planning is a workflow
Because planning has recoverable phases, not just a single pass.
Phases include:
ANALYZE
RESOLVE_PACKAGES
INSTALL_PACKAGES
GENERATE
BUILD
DEPLOY
RECOVER
Why explicit phase modeling matters:
allows retries with context
improves debuggability
lets the system fail in a known stage
reduces the amount of work shoved into one prompt
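A minimal sketch of explicit phase modeling, using the phase names listed above. The transition rule shown here (a failed phase enters RECOVER and is then retried, up to a bounded number of recoveries) is an assumption for illustration; the real workflow/engine.ts may route recovery differently.

```typescript
type WorkflowPhase =
  | "ANALYZE"
  | "RESOLVE_PACKAGES"
  | "INSTALL_PACKAGES"
  | "GENERATE"
  | "BUILD"
  | "DEPLOY"
  | "RECOVER";

const FORWARD_PHASES: WorkflowPhase[] = [
  "ANALYZE",
  "RESOLVE_PACKAGES",
  "INSTALL_PACKAGES",
  "GENERATE",
  "BUILD",
  "DEPLOY",
];

interface WorkflowResult {
  completed: boolean;
  failedAt: WorkflowPhase | null; // the known stage where the run gave up
  trace: WorkflowPhase[];         // every phase entered, in order
}

// runPhase returns true on success; recoveriesSoFar gives it retry context.
function runWorkflow(
  runPhase: (phase: WorkflowPhase, recoveriesSoFar: number) => boolean,
  maxRecoveries: number,
): WorkflowResult {
  const trace: WorkflowPhase[] = [];
  let recoveries = 0;
  let index = 0;
  while (index < FORWARD_PHASES.length) {
    const phase = FORWARD_PHASES[index];
    trace.push(phase);
    if (runPhase(phase, recoveries)) {
      index += 1;
      continue;
    }
    if (recoveries >= maxRecoveries) return { completed: false, failedAt: phase, trace };
    recoveries += 1;
    trace.push("RECOVER"); // then loop again and retry the same phase
  }
  return { completed: true, failedAt: null, trace };
}
```

The trace and failedAt fields are what make a failure debuggable: the run stops in a named stage instead of somewhere inside one large prompt.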
9.2 Why intent analysis uses the LLM but is schema-constrained
intentAnalyzer.ts asks the model for JSON and validates it with Zod.
Why:
intent classification is a fuzzy problem
but downstream code wants structured outputs
So the design is:
use LLM for ambiguity resolution
use schema validation for control
use fallback logic when classification fails
This is the right balance.
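A minimal sketch of the "LLM output, schema check, fallback" shape. To stay dependency-free it uses a hand-written type guard as a stand-in for the Zod schema that intentAnalyzer.ts actually uses. The IntentResult fields, the intent kinds (taken from the create/edit/fix wording earlier in this document), and the fallback value are assumptions.

```typescript
type IntentKind = "create" | "edit" | "fix";

interface IntentResult {
  kind: IntentKind;
  summary: string;
}

const INTENT_KINDS: IntentKind[] = ["create", "edit", "fix"];

// Hypothetical safe default used when classification cannot be trusted.
const FALLBACK_INTENT: IntentResult = { kind: "create", summary: "" };

// Stand-in for schema validation (the repo uses Zod for this step).
function isIntentResult(value: unknown): value is IntentResult {
  if (typeof value !== "object" || value === null) return false;
  const rec = value as Record<string, unknown>;
  const kind = rec["kind"];
  return (
    typeof kind === "string" &&
    INTENT_KINDS.indexOf(kind as IntentKind) !== -1 &&
    typeof rec["summary"] === "string"
  );
}

function parseIntent(rawModelOutput: string): { intent: IntentResult; usedFallback: boolean } {
  try {
    const parsed: unknown = JSON.parse(rawModelOutput);
    if (isIntentResult(parsed)) return { intent: parsed, usedFallback: false };
  } catch {
    // Malformed JSON falls through to the fallback below.
  }
  return { intent: FALLBACK_INTENT, usedFallback: true };
}
```

Downstream code only ever sees a value of type IntentResult, so a malformed or off-schema model reply degrades to a known default instead of propagating.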
9.3 Why dependency resolution exists
The model may recommend packages, but the runtime must filter/verify them.
Why:
package names can be wrong
peer conflicts matter
some packages are blocked for sandbox/runtime reasons
This is why the system does not blindly trust model-emitted package lists.
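A minimal sketch of the filtering step, assuming a blocklist and a conservative name check. filterPackages, the name pattern, and the rejection reasons are hypothetical; this sketch does not query a registry or check peer dependencies, which a real resolver would also need to do.

```typescript
// Conservative approximation of npm package names (optionally scoped).
const NPM_NAME_PATTERN = /^(?:@[a-z0-9][a-z0-9._-]*\/)?[a-z0-9][a-z0-9._-]*$/;
const MAX_NPM_NAME_LENGTH = 214;

interface PackageDecision {
  accepted: string[];
  rejected: { pkg: string; reason: "invalid_name" | "blocked" }[];
}

// Filters model-suggested package names before anything is installed.
function filterPackages(requested: string[], blocked: Set<string>): PackageDecision {
  const accepted: string[] = [];
  const rejected: PackageDecision["rejected"] = [];
  const seen = new Set<string>();
  for (const raw of requested) {
    const pkg = raw.trim();
    if (seen.has(pkg)) continue; // drop duplicates silently
    seen.add(pkg);
    if (pkg.length > MAX_NPM_NAME_LENGTH || !NPM_NAME_PATTERN.test(pkg)) {
      rejected.push({ pkg, reason: "invalid_name" });
    } else if (blocked.has(pkg)) {
      rejected.push({ pkg, reason: "blocked" });
    } else {
      accepted.push(pkg);
    }
  }
  return { accepted, rejected };
}
```

Note that a suggestion carrying a version suffix (such as a name followed by @ and a version) is rejected by this pattern; a fuller implementation would split the version off first.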
10. Sandbox architecture and why it is unusually important here
Key modules:
lifecycle/provisioning.ts
docker.service.ts
write/buffer.ts
write/flush.ts
command.service.ts
builder/unified-build/orchestrator.ts
state.service.ts
10.1 Why sandbox state is in Redis
Sandbox instances are ephemeral runtime resources with TTLs.
Why Redis instead of Postgres here:
sandbox liveness is operational state, not primary product truth
TTL refresh and quick lookup matter
container lifecycle reconciliation is fast-path coordination
10.2 Why sandbox writes are buffered
Edward streams file content incrementally. Writing every tiny chunk directly to disk/container would be noisy and slow.
So the system:
buffers file chunks in Redis
periodically flushes them to the container
uses distributed locks around flush
Why this is smart:
reduces write churn
handles chunked file streaming naturally
coordinates concurrent writes safely
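A minimal sketch of buffer-then-flush, with an in-memory Map standing in for the Redis buffer and an injected lock standing in for the distributed lock. FileWriteBuffer, FlushLock, and appendToFile are hypothetical names; the real write/buffer.ts and write/flush.ts are not shown in this document.

```typescript
// Stand-in for a distributed lock; a real one would be Redis-backed with a TTL.
interface FlushLock {
  tryAcquire(): boolean;
  release(): void;
}

class FileWriteBuffer {
  private pendingChunks = new Map<string, string[]>();
  private readonly lock: FlushLock;
  private readonly appendToFile: (path: string, content: string) => void;

  constructor(lock: FlushLock, appendToFile: (path: string, content: string) => void) {
    this.lock = lock;
    this.appendToFile = appendToFile;
  }

  // Called for every streamed chunk; cheap, no container I/O.
  append(path: string, chunk: string): void {
    const list = this.pendingChunks.get(path);
    if (list) list.push(chunk);
    else this.pendingChunks.set(path, [chunk]);
  }

  // Called periodically. Returns the number of files written, or 0 if
  // another flusher currently holds the lock (chunks stay buffered).
  flush(): number {
    if (!this.lock.tryAcquire()) return 0;
    try {
      const batch = this.pendingChunks;
      this.pendingChunks = new Map();
      batch.forEach((chunks, path) => this.appendToFile(path, chunks.join("")));
      return batch.size;
    } finally {
      this.lock.release();
    }
  }
}
```

Many small chunks for the same file collapse into one append per flush, which is where the reduction in write churn comes from.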
10.3 Why protected framework files exist
Template registry marks files like package.json, tsconfig, framework configs, and core CSS files as protected.
Why:
models are much less reliable when editing sensitive build/config files
most user value is in app code, not infra/config drift
protecting these files preserves build stability
This is a product safety rail, not an accidental limitation.
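A minimal sketch of how a protected-file check might gate model-emitted writes. The protected list below is illustrative only, and partitionWrites and normalizeSandboxPath are hypothetical helpers; the real template registry defines its own entries per framework.

```typescript
// Illustrative list only; not the registry's actual contents.
const PROTECTED_FILES = new Set(["package.json", "tsconfig.json", "vite.config.ts"]);

// Normalizes separators and leading "./" so equivalent paths compare equal.
function normalizeSandboxPath(p: string): string {
  return p.replace(/\\/g, "/").replace(/^(\.\/)+/, "").replace(/\/{2,}/g, "/");
}

// Splits requested writes into those that may proceed and those that are blocked.
function partitionWrites(
  paths: string[],
  protectedFiles: Set<string>,
): { allowed: string[]; blocked: string[] } {
  const allowed: string[] = [];
  const blocked: string[] = [];
  for (const p of paths) {
    if (protectedFiles.has(normalizeSandboxPath(p))) blocked.push(p);
    else allowed.push(p);
  }
  return { allowed, blocked };
}
```

Normalizing before the lookup matters: without it, "./package.json" would slip past a check that only knows "package.json".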
10.4 Why command execution is allowlisted
command.service.ts validates:
command name
argument count/length
path safety
dangerous patterns
Why:
the sandbox is isolated, but still not trusted blindly
guardrails reduce accidental destructive behavior
product behavior becomes auditable and predictable
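A minimal sketch of the four checks listed above (command name, argument count/length, path safety, dangerous patterns). validateCommand, CommandPolicy, the reason strings, and the metacharacter set are assumptions for illustration; command.service.ts may apply different rules.

```typescript
interface CommandPolicy {
  allowedCommands: Set<string>;
  maxArgs: number;
  maxArgLength: number;
}

type CommandVerdict = { ok: true } | { ok: false; reason: string };

// Characters that could chain or redirect commands if passed to a shell.
const SHELL_METACHARACTERS = /[;&|`$<>\n]/;

function validateCommand(command: string, args: string[], policy: CommandPolicy): CommandVerdict {
  if (!policy.allowedCommands.has(command)) return { ok: false, reason: "command_not_allowed" };
  if (args.length > policy.maxArgs) return { ok: false, reason: "too_many_args" };
  for (const arg of args) {
    if (arg.length > policy.maxArgLength) return { ok: false, reason: "arg_too_long" };
    if (SHELL_METACHARACTERS.test(arg)) return { ok: false, reason: "dangerous_pattern" };
    // Path safety: no absolute paths and no parent-directory traversal.
    if (arg.startsWith("/") || arg.split("/").some((seg) => seg === "..")) {
      return { ok: false, reason: "unsafe_path" };
    }
  }
  return { ok: true };
}
```

Returning a reason code rather than a bare boolean is what makes rejected commands auditable.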
10.5 Why builds happen after generation in a unified build orchestrator
buildAndUploadUnified handles:
dependency presence checks
framework detection
merge/install dependency logic
build execution
preview upload
cache invalidation
Why not “just run npm run build”:
different frameworks output differently
preview hosting needs path/base handling
dependencies may need reconciliation
upload and routing are part of the build product
11. Preview and deployment architecture
11.1 Path mode vs subdomain mode
Configured via EDWARD_DEPLOYMENT_TYPE.
Why two modes:
local/self-hosted environments often want simple path-based previews
production environments may want nicer subdomain-based previews
This avoids forcing one infrastructure assumption everywhere.
11.2 Why preview routing uses KV
Subdomain routing needs a fast edge lookup from subdomain -> storage prefix.
Cloudflare KV is a practical fit because:
low-latency reads at edge
simple key-value mapping
decoupled from the main DB
Why not use Postgres for request-time routing:
worse latency profile for edge routing
unnecessary coupling between preview serving and primary transactional DB
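A minimal sketch of the subdomain-to-prefix lookup, with a Map standing in for Cloudflare KV (whose reads are asynchronous in practice). resolvePreviewPrefix, the base domain, and the prefix format are hypothetical.

```typescript
// Resolves the storage prefix for a preview request from its Host header.
// Returns null when the host is not a single-label subdomain of baseDomain
// or when no mapping exists.
function resolvePreviewPrefix(
  hostHeader: string,
  baseDomain: string,
  routes: Map<string, string>,
): string | null {
  const host = hostHeader.toLowerCase().split(":")[0]; // drop any port
  const suffix = "." + baseDomain.toLowerCase();
  if (!host.endsWith(suffix)) return null;
  const subdomain = host.slice(0, host.length - suffix.length);
  if (subdomain.length === 0 || subdomain.indexOf(".") !== -1) return null;
  const prefix = routes.get(subdomain);
  return prefix === undefined ? null : prefix;
}
```

The lookup touches only the key-value mapping, which is the decoupling from the transactional database that the list above describes.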
11.3 Why preview URL is also stored on build records
Because the user cares about “what is the latest preview for this chat/build”.
Persisting preview URLs on builds means:
API can answer build status quickly
UI can bootstrap preview state without recomputing routing every time
12. GitHub integration architecture
Important files:
packages/octokit/index.ts
apps/api/services/github/sync.service.ts
apps/api/services/github/token.service.ts
apps/api/services/github/repoBinding.service.ts
12.1 Why repo binding is a first-class concept
Chats/projects can be linked to repos.
Why bind at chat level:
a chat represents a project
repo sync is a project concern, not a user-global concern
12.2 Why GitHub token handling is wrapped
token.service.ts decrypts and migrates token storage.
Why centralize this:
auth provider data should not be parsed ad hoc everywhere
avoids blind destructive sync over unknown repo content
This is a very practical design choice.
13. Frontend state architecture
13.1 Server state
Handled mainly with React Query:
chat history
metadata
quotas
active run lookup
GitHub status
Why:
cache lifecycle
stale time
refetch policies
request deduplication
13.2 Stream state
Handled with the Zustand chat stream store.
Why:
append-heavy mutable event streams
low friction updates per chunk/frame
simpler than putting stream mutation logic into React component trees
13.3 Sandbox UI state
Handled with Zustand sandbox slices:
files
editor selection
preview URL
build status/errors
terminal output
open/close state
Why separate from chat stream state:
stream state represents live assistant output
sandbox state represents persistent project workspace UI
These are related but not identical concerns.
13.4 Why the chat route does server-side access probing
apps/web/app/chat/[id]/page.tsx checks access and metadata server-side.
Why:
avoids client-only “flash then deny”
supports route metadata generation
improves correctness for private chat pages
14. Security posture and why these controls exist
Important controls:
auth middleware for all protected API routes
rate limits backed by Redis
encrypted API keys
command allowlists
protected template files
Docker isolation
CSP/helmet/cors on API
request ids and security telemetry
Why the repo has many small security modules instead of one giant security file:
security concerns happen at different layers
auth, rate limit, encryption, runtime isolation, and telemetry are separate controls
This is the correct decomposition.
15. Reliability posture and why the repo feels “production-minded”
Signals of maturity:
graceful shutdown in API and worker
durable run events
queue-based long-running execution
resumable streams using Last-Event-ID
cancellation via pub/sub plus durable verification
DB-backed admission control
checkpointing of agent loop state
explicit terminal finalization logic
post-generation validators
quality gate scripts
These are not “extra code”. They are the difference between demo code and production-oriented code.
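As one illustration of the resumable-stream item above, here is a minimal sketch of replaying persisted run events after a reconnect. It assumes each event row carries a monotonically increasing sequence number that is also used as the SSE id; RunEventRow and replayAfter are hypothetical names.

```typescript
interface RunEventRow {
  seq: number;  // assumed monotonically increasing per run
  data: string; // serialized event payload
}

// Returns SSE frames for every stored event newer than the client's
// Last-Event-ID. A missing or unparseable id replays from the start.
function replayAfter(events: RunEventRow[], lastEventId: string | undefined): string[] {
  const parsed = lastEventId === undefined ? 0 : Number(lastEventId);
  const after = Number.isInteger(parsed) && parsed >= 0 ? parsed : 0;
  return events
    .filter((e) => e.seq > after)
    .sort((a, b) => a.seq - b.seq)
    .map((e) => `id: ${e.seq}\ndata: ${e.data}\n\n`);
}
```

Because each frame carries its id, a browser that reconnects reports the last id it saw and the server can resume from durable storage rather than from process memory.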
16. Testing and quality gates
Current rough shape:
API tests: about 91 files
Web tests: light, about 3 files
Shared package tests: light but present
Why API tests dominate:
most complexity and failure modes live in orchestration, streaming, sandboxing, and workers
UI is large, but much of it is composition/presentation on top of backend contracts
Quality scripts in apps/api/scripts/quality enforce:
architecture boundaries
duplication checks
coverage checks
function-length checks
file audit generation
Why these custom scripts exist:
generic linting does not enforce architecture well enough
this codebase has non-trivial layering rules
17. Key things a junior engineer must understand before making changes
17.1 Never confuse message history with run execution history
message is the conversation artifact
run_event is the execution/replay log
17.2 Never treat Redis state as the source of truth for business data
Redis is coordination state.
Postgres is durable business truth.
17.3 Never assume the browser connection is the lifetime of the work
The worker owns durable execution.
The browser only observes it.
17.4 Never casually edit protected sandbox template files or remove guardrails
Those protections exist because AI-generated config churn destroys stability.
17.5 Never bypass run admission and queueing for “quick fixes”
That breaks fairness, concurrency guarantees, and operational predictability.
17.6 Never add a new stream event without updating both sides
If you change stream contracts, review:
backend emitters
persistence/replay
frontend stream processor
shared type contracts
18. How I would explain the main architectural “WHY” in one paragraph
Edward is built as a durable, queue-backed, sandboxed AI execution platform rather than a thin chat wrapper because real code generation is slow, stateful, failure-prone, and operationally dangerous. The architecture separates durable truth (Postgres), fast coordination (Redis), long-running work (BullMQ worker), isolated execution (Docker sandboxes), and progressive UX (SSE + persisted run events). The frontend is split between React Query for server truth and Zustand for live streaming/UI runtime state. The repo uses shared packages to keep contracts aligned and validators/guardrails to turn probabilistic model output into something closer to a deterministic product.
19. File map by major area
This section is the “how do I navigate the repo quickly” map.
19.1 Root and workspace
README.md: product + local setup overview
package.json: top-level scripts
turbo.json: build graph and env config
pnpm-workspace.yaml: workspace package boundaries
scripts/build-local-sandboxes.sh: local sandbox image prep
19.2 API composition and delivery
apps/api/server.http.ts: API bootstrap and shutdown
apps/api/queue.worker.ts: worker bootstrap and background loops
This is not a prose description of every leaf UI component, because that would bury the actual knowledge transfer (KT). Instead, use this as the completeness map for where code lives.
20.1 apps/api
Major subareas:
controllers/
routes/
middleware/
lib/
services/
schemas/
utils/
tests/
High-density domains:
services/chat
services/runs
services/sandbox
services/planning
services/github
services/queue
20.2 apps/web
Major subareas:
app/
components/chat/
components/home/
components/changelog/
hooks/
lib/
stores/chatStream/
stores/sandbox/
High-density domains:
chat UI and workspace
SSE parsing and stream orchestration
sandbox/editor/preview UI
20.3 packages/auth
Major subareas:
lib/: auth, db, schema, build/run helpers
drizzle/: SQL migrations
20.4 packages/shared
Major subareas:
src/constants.ts
src/schema.ts
src/streamEvents.ts
src/chat/*
src/github/*
src/api/*
src/llm/*
20.5 packages/ui
Major subareas:
src/components/: reusable primitives
src/hooks/
src/lib/
src/styles/
20.6 docker/templates
Major subareas:
nextjs
vite-react
vanilla
base
These templates define the scaffold/runtime assumptions for generated projects and sandbox images.
21. Practical onboarding order for a new engineer
If I were onboarding someone senior-but-new, I would ask them to read in this exact order:
If they understand those files, they understand the heart of Edward.
22. Final summary
Edward is a monorepo for a durable AI code-generation product, not a thin LLM wrapper. The architecture optimizes for correctness, replayability, operator safety, and product durability:
Postgres keeps durable truth.
Redis handles fast coordination.
BullMQ decouples request acceptance from long-running execution.
Docker sandboxes isolate generated code.
S3/CDN/KV separate preview hosting from execution.
SSE plus persisted run events make streaming resumable.
Shared packages prevent contract drift.
Planning, validation, autofix, and retry layers turn model output into something operationally usable.
That is the core “why” behind almost every serious architectural decision in this repo.