questions

Shared from "Interview Questions" on Inkdown

Edward / AI Systems Interview Question Bank

Prepared from the interview thread plus the extended Edward-specific question bank.

How To Use This

Use this as a deep interview/practice guide. The goal is not to memorize answers, but to explain exact mechanisms, invariants, failure modes, tradeoffs, and production debugging paths. Recommended format:

Ask one question at a time.
Candidate answers in concrete system detail, not resume language.
Interviewer follows up until the invariant, failure mode, and tradeoff are clear.
Passing is allowed, but repeated passes lower the leveling signal.
At the end, evaluate hire/no-hire, level, weaknesses, and expected compensation band.

Part 1: Questions Asked In The Thread

1. Edward End-To-End Architecture

For Edward, walk through the end-to-end architecture after a user types:

“Build me a todo app with deployable preview.”

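2. Per-User Container Admission Control

3. Backend Concurrency Guarantee When UI Is Bypassed

4. Streaming State-Machine Parser

5. JSX / HTML Safety In Tag-Based Parsing

Suppose the model output arrives in these chunks:

1. <file path="src/Ap
2. p.tsx">\nexport default function App() {
3. \n return <div>Hello</div>\n}
4. \n</file>

How does the parser avoid treating the JSX inside

export default function App() { return <div>Hello</div> }

as one of the control tags? Once the complete file tag is available: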

Do you write the whole file at once?
Or stream file content progressively into the sandbox?

6. Nested Control-Like Tags Inside File Content

If the parser is inside a <file> block, does it ignore all nested unknown tags until the matching </file>? Example:

return <file>not a control tag</file>

inside App.tsx. Would that prematurely close the file block? Or do you only treat </file> as a terminator when it matches the currently active control tag from parser state? Also explain whether file content is written only after the closing file tag or streamed progressively while the file block is open.
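One way to reason about the question is a small state machine that scans only for the close tag of the currently active block, so JSX such as <div> is plain content and tags split across chunks are handled by buffering. The sketch below is a minimal illustration, not Edward's actual parser: the ParserEvent shape, the TagStreamParser class, and the choice to emit a file only after its close tag are all assumptions.

```typescript
// Hypothetical event shape; the real Edward events are not shown in this document.
type ParserEvent =
  | { kind: "text"; text: string }
  | { kind: "file"; path: string; content: string };

class TagStreamParser {
  private buffer = "";
  // Non-null while the parser is inside a <file> block.
  private activePath: string | null = null;

  // Feed one streamed chunk; returns the events completed by this chunk.
  push(chunk: string): ParserEvent[] {
    this.buffer += chunk;
    const events: ParserEvent[] = [];
    for (;;) {
      if (this.activePath === null) {
        const open = /<file path="([^"]*)">/.exec(this.buffer);
        if (!open) break; // the open tag may still be split across chunks
        const before = this.buffer.slice(0, open.index);
        if (before.length > 0) events.push({ kind: "text", text: before });
        this.activePath = open[1] ?? "";
        this.buffer = this.buffer.slice(open.index + open[0].length);
      } else {
        // Inside a file block, only the matching close tag is control syntax.
        const close = this.buffer.indexOf("</file>");
        if (close === -1) break; // keep buffering; <div>, </div>, etc. are content
        events.push({
          kind: "file",
          path: this.activePath,
          content: this.buffer.slice(0, close),
        });
        this.activePath = null;
        this.buffer = this.buffer.slice(close + "</file>".length);
      }
    }
    return events;
  }

  // Call once the stream ends; an open file block here means a truncated stream.
  finish(): ParserEvent[] {
    if (this.activePath !== null) {
      throw new Error(`stream ended inside file block: ${this.activePath}`);
    }
    const rest = this.buffer;
    this.buffer = "";
    return rest.length > 0 ? [{ kind: "text", text: rest }] : [];
  }
}
```

Note the limitation this design keeps: a literal </file> inside file content still ends the block early, which is the ambiguity a more robust protocol has to remove.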

7. More Robust Streaming Protocol Design

Design a more robust version of the streaming structured event protocol. You still want streaming structured events from the model, but file contents may contain arbitrary JSX/HTML/XML-like text. Pick one protocol and explain how it streams, validates, and recovers from malformed chunks. Possible options:

Length-prefixed blocks
JSON Lines with base64 content
Tool/function calls
Multipart boundaries
Escaped CDATA-like sections
Custom grammar

Explain why your chosen protocol prevents file content from being confused with control syntax.
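As an illustration of the first option, length-prefixed blocks, the sketch below frames each file as a header line followed by a payload of exactly the declared length. The decoder never scans the payload for syntax, so content cannot be mistaken for control data. The FILE header format, the Frame type, and the FrameDecoder class are hypothetical, and lengths are counted in UTF-16 code units for simplicity (a wire format would more likely count bytes). It also assumes some intermediary produces the frames, since a model is unlikely to count lengths reliably on its own.

```typescript
// Hypothetical frame format: "FILE <length> <path>\n" followed by <length> characters.
type Frame = { path: string; content: string };

// Encode one file as a header line plus an opaque payload of the declared length.
function encodeFrame(frame: Frame): string {
  return `FILE ${frame.content.length} ${frame.path}\n${frame.content}`;
}

class FrameDecoder {
  private buffer = "";

  // Feed one chunk; returns every frame that is now complete.
  push(chunk: string): Frame[] {
    this.buffer += chunk;
    const frames: Frame[] = [];
    for (;;) {
      const newline = this.buffer.indexOf("\n");
      if (newline === -1) break; // header line not complete yet
      const headerLine = this.buffer.slice(0, newline);
      const header = /^FILE (\d+) (.+)$/.exec(headerLine);
      if (!header) throw new Error(`malformed header: ${headerLine}`);
      const length = Number(header[1]);
      const start = newline + 1;
      if (this.buffer.length < start + length) break; // payload not complete yet
      // The payload is sliced by length only; its contents are never inspected.
      frames.push({
        path: header[2] ?? "",
        content: this.buffer.slice(start, start + length),
      });
      this.buffer = this.buffer.slice(start + length);
    }
    return frames;
  }
}
```

A malformed header fails loudly instead of being silently treated as content, which gives the caller a clear point at which to abort or request a retry.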

8. Bonkers Pipeline Design

For Bonkers, you claimed you unified recreation, regeneration, and editing workflows into a single image-to-image pipeline, contributing to a 50% DAU increase. Walk through the pipeline design. Cover:

What the separate v2 flows were before
What abstraction was introduced in v3
How templates fit into the pipeline
How routing worked across Flux, Recraft, Ideogram, or other providers
How provider selection, retries, latency, and quality tradeoffs were handled

9. Agentic Chat Request Router

For Agentic Chat, you claimed:

LangGraph multi-step workflows
Intelligent tool selection
RAG with pgvector embeddings
Cohere reranking
Mem0 long-term memory
Semantic caching
Conversation branching

A user asks:

“Summarize the contract I uploaded last week and email the key risks to my manager.”

How does the system decide between RAG, memory, tools, and Google Workspace actions? Describe:

Graph nodes
Routing conditions
Permission boundaries
Failure handling
Whether email sending requires confirmation
How uploaded documents are located
How manager identity is resolved
What happens if required context is missing

Part 2: Extended Edward Question Bank

1. Product Boundary And Requirements

What exactly can Edward generate today: static frontend only, API calls, mock data, stateful forms, routing, external SDKs?
What is explicitly out of scope: auth flows, backend persistence, payments, server actions?
How do you stop the model from generating unsupported features?
When the user asks for unsupported functionality, do you reject, degrade, mock, or explain?
How are product constraints represented in the prompt, intent analyzer, and UI?
How do you test that unsupported features are handled consistently?

2. Prompt To Intent Analyzer

What are the intent categories?
How do you distinguish casual chat, edit request, new app generation, bug fix, preview request, and GitHub sync?
Is the classifier LLM-based, rules-based, or hybrid?
What happens on low confidence?
How do you evaluate false positives and false negatives?
What telemetry do you track for misclassified requests?
Can the user override intent classification?
How do you prevent casual conversation from triggering costly agent runs?

3. Orchestration Loop

What is one “turn” in the orchestrator?
What state is persisted between turns?
What makes the loop stop?
How do you prevent infinite continue loops?
How do you handle partial completion, such as app generated but build failed?
How do you replay or resume a failed run?
What is the exact run state machine?
Where are turn outputs stored?
How do you prevent duplicate handler execution on retry?
How do you decide whether the agent should continue, repair, or stop?
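A strong answer to the state-machine and loop-termination questions usually names the states and the legal transitions explicitly. The sketch below is one hypothetical shape: the queued, running, building, failed, and preview-ready states come from the UX section of this bank, while the repairing state, the cap of 3 repair attempts, and the function names are assumptions made for illustration.

```typescript
// "repairing" is an assumed state; the others are listed in the UX section.
type RunState = "queued" | "running" | "building" | "repairing" | "preview-ready" | "failed";

// Every legal edge is listed; anything else is rejected.
const TRANSITIONS: Record<RunState, readonly RunState[]> = {
  queued: ["running", "failed"],
  running: ["building", "failed"],
  building: ["preview-ready", "repairing", "failed"],
  repairing: ["building", "failed"],
  "preview-ready": [],
  failed: [],
};

// Assumed cap; it bounds the build/repair loop so it cannot run forever.
const MAX_REPAIR_ATTEMPTS = 3;

interface Run {
  state: RunState;
  repairAttempts: number;
}

// Returns the next run record, or throws on an illegal transition.
function transition(run: Run, next: RunState): Run {
  if (TRANSITIONS[run.state].indexOf(next) === -1) {
    throw new Error(`illegal transition ${run.state} -> ${next}`);
  }
  if (next === "repairing") {
    if (run.repairAttempts >= MAX_REPAIR_ATTEMPTS) {
      // Repair budget exhausted: fail the run instead of looping again.
      return { state: "failed", repairAttempts: run.repairAttempts };
    }
    return { state: "repairing", repairAttempts: run.repairAttempts + 1 };
  }
  return { state: next, repairAttempts: run.repairAttempts };
}
```

Because terminal states have no outgoing edges, a retried handler that tries to move a finished run fails fast rather than re-executing side effects.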

4. Streaming Parser

What exact tags/events exist?
What does raw model output look like?
How do you parse partial chunks?
How do you handle malformed tags?
How do you avoid JSX/HTML inside file content being parsed as control syntax?
Are handlers idempotent?
If the stream disconnects mid-file, what state is saved?
How would you redesign the protocol more safely?
Do you validate event schemas before dispatch?
Do you support nested events?
Do you emit partial UI events before full validation?
How do you test parser edge cases?

5. File Write Pipeline

Do you write files progressively or only after complete file blocks?
How do you ensure atomic file writes?
Do you write a temp file and then rename?
How do you handle concurrent edits to the same file?
How do you preserve user edits when AI regenerates?
How do you diff/patch instead of overwrite?
What happens if two agent steps write the same path?
How do you handle binary assets?
How do you validate file paths to prevent path traversal?
How do you prevent writes outside the workspace?
How is the file tree synchronized to the UI?
What happens if file write succeeds but DB update fails?
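For the path-traversal questions, a concrete answer normally includes a normalization step that runs before any write. The sketch below is a minimal, string-only version under stated assumptions: POSIX-style workspace roots, a hypothetical resolveInWorkspace helper, and no symlink handling (a real implementation would also need to check the resolved path on disk).

```typescript
// Resolve a model-supplied relative path against the workspace root,
// rejecting anything that would land outside it. Symlinks are not covered here.
function resolveInWorkspace(workspaceRoot: string, requested: string): string {
  if (requested.length === 0 || requested.indexOf("\0") !== -1) {
    throw new Error("invalid path");
  }
  // Reject POSIX absolute paths and Windows drive prefixes such as "C:".
  if (requested.startsWith("/") || /^[A-Za-z]:/.test(requested)) {
    throw new Error(`absolute paths are not allowed: ${requested}`);
  }
  const segments: string[] = [];
  // Backslashes are treated as separators so "..\\.." cannot slip through.
  for (const part of requested.split(/[\\/]+/)) {
    if (part === "" || part === ".") continue;
    if (part === "..") {
      if (segments.length === 0) {
        throw new Error(`path escapes workspace: ${requested}`);
      }
      segments.pop();
    } else {
      segments.push(part);
    }
  }
  if (segments.length === 0) throw new Error("path must name a file");
  return `${workspaceRoot.replace(/\/+$/, "")}/${segments.join("/")}`;
}
```

For example, resolveInWorkspace("/work/u1", "src/../src/./App.tsx") yields "/work/u1/src/App.tsx", while "../etc/passwd" and "/etc/passwd" are rejected.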

6. Sandbox Design

How do you create a sandbox?
What image is used for each framework?
How do you isolate filesystem, network, CPU, and memory?
How do you prevent malicious generated code from escaping?
Are containers long-lived or per-run?
How do you clean them up?
How do you handle dependency install time?
Do you cache node_modules or the package manager store?
How do you enforce timeout limits?
How do you prevent network abuse or SSRF?
What happens if the Docker daemon restarts?
How do you detect zombie containers?

7. Concurrency And Admission Control

Why 2 containers per user?
How is the cap enforced on the backend?
What Redis keys, locks, or state exist?
What happens when 3 requests hit within 50ms?
What happens if the worker crashes after acquiring a slot?
What TTL or heartbeat exists?
How do you reconcile Redis with real Docker state?
What metrics tell you slots are leaking?
Is the limit per user, per org, per browser session, or global?
What happens when a user opens multiple tabs?
Is the 3rd request rejected, queued, or delayed?
How do you avoid starvation?
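The slot questions above (cap enforcement, crash after acquire, TTL and heartbeat) can be made concrete with a lease table. The sketch below is an in-memory stand-in, not the Redis design the questions ask about: the SlotLimiter class, lease IDs, and explicit clock argument are hypothetical. In a shared store the check-and-add in acquire would have to be a single atomic operation (for example a server-side script), otherwise three requests arriving within 50ms could all pass the check.

```typescript
// In-memory stand-in for the shared store; names and limits are illustrative.
class SlotLimiter {
  // userId -> (leaseId -> expiry timestamp in ms)
  private leases = new Map<string, Map<string, number>>();
  private maxPerUser: number;
  private ttlMs: number;

  constructor(maxPerUser: number, ttlMs: number) {
    this.maxPerUser = maxPerUser;
    this.ttlMs = ttlMs;
  }

  // Drop leases whose TTL has passed, e.g. because the worker holding them crashed.
  private liveLeases(userId: string, now: number): Map<string, number> {
    const userLeases = this.leases.get(userId) ?? new Map<string, number>();
    for (const [id, expiresAt] of Array.from(userLeases)) {
      if (expiresAt <= now) userLeases.delete(id);
    }
    this.leases.set(userId, userLeases);
    return userLeases;
  }

  // Grants a slot if the user is under the cap. Re-acquiring an existing
  // lease ID just refreshes it, so a retried request does not take a second slot.
  acquire(userId: string, leaseId: string, now: number): boolean {
    const userLeases = this.liveLeases(userId, now);
    if (!userLeases.has(leaseId) && userLeases.size >= this.maxPerUser) return false;
    userLeases.set(leaseId, now + this.ttlMs);
    return true;
  }

  // Extends a live lease; returns false if it already expired or never existed.
  heartbeat(userId: string, leaseId: string, now: number): boolean {
    const userLeases = this.liveLeases(userId, now);
    if (!userLeases.has(leaseId)) return false;
    userLeases.set(leaseId, now + this.ttlMs);
    return true;
  }

  release(userId: string, leaseId: string): void {
    this.leases.get(userId)?.delete(leaseId);
  }
}
```

With a cap of 2, a third acquire for the same user is refused until a lease is released or expires. Whether that refusal becomes a rejection, a queue entry, or a delay is a separate product decision.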

8. BullMQ / Worker Architecture

What queues exist?
What is the job payload?
What retries are safe?
Which jobs are idempotent?
How do you dedupe repeated requests?
What is your backoff strategy?
How do you handle stalled jobs?
How do you separate agent jobs, build jobs, cleanup jobs, and preview jobs?
What do you persist before enqueueing vs inside the worker?
How do you avoid losing a job between DB write and queue publish?
How do you handle poison jobs?
How do you monitor queue depth and processing latency?

9. Build System

After code generation, how do you install dependencies?
How do you detect the package manager?
How do you prevent unsafe install scripts?
How do you handle build errors?
Do you ask the model to repair build errors automatically?
How many repair attempts?
How do you stream build logs to the UI?
How do you separate TypeScript errors, lint errors, and runtime errors?
How do you handle missing dependencies?
How do you prevent infinite install/build loops?
How do you cache builds?
How do you make builds reproducible?

10. Preview Pipeline

What exactly goes to S3?
What path structure do you use: user/chat/run/version?
How do you make preview URLs stable?
How does CloudFront rewrite work?
How do you invalidate cache?
How do you prevent one user from accessing another user’s preview?
How do you handle rollback to a previous preview version?
What happens if upload succeeds but DB update fails?
How do you handle partial upload failure?
How do you version previews?
How do you expire old previews?
How do you debug a blank preview?

11. Persistence Model

What tables exist for users, chats, runs, builds, files, and previews?
What is the source of truth for project state?
Do you store generated files in DB, S3, Git, or Docker filesystem?
How do you version file states?
How do you resume after API restart?
How do you migrate schema without breaking old projects?
How do you represent failed, partial, and completed runs?
How do you associate model events with persisted files?
How do you handle DB transaction boundaries?
What data is safe to delete during cleanup?

12. GitHub Sync

How do you authenticate GitHub?
How do you create repo, branch, and commit?
How do you handle conflicts?
Do you force push or open PRs?
How do you map generated file tree to Git blobs?
What happens if sync fails midway?
How do you avoid leaking API keys into commits?
How do you handle large files?
How do you handle existing repos?
How do you preserve commit history?
How do you map Edward project versions to Git commits?
How do you recover if GitHub API rate limits you?

13. Model Orchestration

OpenAI vs Gemini: how do you choose?
How do you handle model-specific streaming differences?
What is the system prompt structure?
How do you fit large projects into context?
Do you use file summaries, embeddings, or dependency graphs?
How do you validate generated code before applying it?
How do you prevent prompt injection from project files?
How do you handle tool/model failure?
How do you prevent the model from hallucinating unavailable libraries?
How do you track model cost per run?
How do you evaluate output quality?
How do you roll out prompt changes safely?

14. Unsupported Capability Handling

User asks for auth, database, payments, cron, or backend API. What happens?
Do you generate frontend mocks?
Do you explain the limitation?
Do you scaffold placeholders?
How does the intent analyzer know the frontend-only app builder constraints?
How do you stop generated UI from claiming real persistence exists?
How do you communicate unsupported features in the UI?
How do you avoid breaking trust while still being helpful?

15. Recovery And Reliability

API crashes mid-generation. What is recoverable?
Worker crashes mid-build. What is recoverable?
Docker daemon restarts. What do you do?
Redis restarts. What state is lost?
S3 upload partially fails. How do you detect it?
User refreshes browser during stream. Can they reconnect?
What is the recovery path for stuck “building” state?
How do you handle duplicate retries?
What cron/cleanup jobs exist?
How do you test failure recovery?

16. Observability

What logs do you emit per run?
What trace/span boundaries exist?
What metrics matter?
What alerts would you set?
How do you debug “preview stuck building”?
How do you debug “generated app is blank”?
How do you correlate user session → run → worker → container → S3 objects?
What dashboards would you build?
What are the top 5 production alerts?
How do you expose run-level debugging to users or support?

17. Security

How do you store user model API keys?
How are keys encrypted?
Who can access generated files?
How do you prevent SSRF/network abuse from generated code?
How do you prevent dependency supply-chain issues?
How do you scan for secrets before GitHub sync?
How do you isolate untrusted code execution?
How do you prevent path traversal in generated file paths?
How do you protect preview URLs?
How do you handle malicious prompts?
How do you handle generated code that tries to exfiltrate environment variables?
How do you rotate encryption keys?

18. Performance

Where is latency spent: model, install, build, upload, CDN?
How do you reduce time-to-first-file?
How do you reduce time-to-preview?
What do you stream to UI immediately?
What do you cache?
How do you prewarm containers?
How do you measure p50/p95 run duration?
How do you optimize dependency install?
How do you avoid cold-start spikes?
How do you prioritize interactive latency vs total completion time?

19. UX And State Management

How does the chat UI receive events?
SSE or WebSocket?
How do you handle reconnect?
How do you show file tree changes in real time?
How do you prevent UI from showing a file before write succeeds?
How do you represent run states: queued, running, building, failed, preview-ready?
How do you display partial generation?
How do you display repair attempts?
How do you let users inspect logs?
How do you avoid confusing the user when generation succeeds but preview fails?
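The reconnect question is easier to answer with a sequence-numbered event log: the server stores each run event with an increasing ID, and a reconnecting client asks for everything after the last ID it saw. With SSE this maps onto the standard Last-Event-ID request header that browsers send on reconnect when events carry an id field. The RunEventLog class and event shape below are hypothetical, and storage is in memory purely for illustration.

```typescript
// Hypothetical per-run event record; payload is kept as a string for simplicity.
interface RunEvent {
  seq: number;
  type: string;
  payload: string;
}

class RunEventLog {
  private events: RunEvent[] = [];

  // Append an event and assign it the next sequence number (starting at 1).
  append(type: string, payload: string): RunEvent {
    const event: RunEvent = { seq: this.events.length + 1, type, payload };
    this.events.push(event);
    return event;
  }

  // Events the client has not seen yet; lastSeenSeq = 0 replays everything.
  replayAfter(lastSeenSeq: number): RunEvent[] {
    return this.events.filter((e) => e.seq > lastSeenSeq);
  }
}
```

A client that refreshed after seeing event 1 would call replayAfter(1) and receive the events it missed in order, then continue on the live stream.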

20. Tradeoffs And Redesign

If rebuilding Edward today, what would you change?
Would you keep tag-based streaming or move to tool calls/JSONL?
Would you keep Docker per user or move to Firecracker/microVMs?
Would you store files in DB, S3, or Git from the start?
What was the hardest bug?
What was the biggest design mistake?
What did you deliberately not build?
What tradeoff are you still unsure about?
What part would fail first at 10x users?
What would you do differently for a team of 5 engineers maintaining it?

Part 3: Strong Interview Sequence

A strong Edward-focused deep interview could follow this sequence:

1. Architecture walkthrough

2. Product boundaries and unsupported capabilities

3. Intent classification

4. Orchestration loop and state machine

5. Streaming parser correctness

6. File write pipeline and idempotency

7. Sandbox isolation and security

8. Concurrency/admission control

9. BullMQ and worker failure recovery

10. Build failure repair loop

11. Preview routing and consistency

12. GitHub sync and conflict handling

13. Persistence model and recovery

14. Observability incident debugging

15. Redesign tradeoffs

Part 4: What Strong Answers Should Include

Strong answers should include:

Exact state machines, not just “orchestrator”
Redis key shapes, TTLs, locks, and cleanup behavior
DB source-of-truth decisions
Queue lifecycle and retry semantics
Idempotency guarantees
Failure-mode handling
Security boundaries
Tradeoffs and why alternatives were rejected
Observability/debugging strategy
Product constraints and graceful degradation

Weak answers usually rely on phrases like:

“We use Redis locking”
“The orchestrator handles it”
“Docker makes it isolated”
“The LLM gives tags”
“The UI prevents it”
“It works perfectly”

Those are starting points, not senior-level explanations.

Part 5: Practice Prompt

Use this prompt to start a real drill:

Interview me deeply on Edward as a senior/principal engineer. Ask one question at a time. Push on exact mechanisms, failure modes, tradeoffs, and production debugging. Do not break character until the interview is done. At the end, give honest hire/no-hire, level, weaknesses, and INR salary band.

For topic-specific practice:

Interview me deeply on <topic> as a senior/principal engineer. Ask one question at a time. Focus on exact mechanisms and failure cases. Do not break character until done. At the end, give honest hire/no-hire, level, weaknesses, and salary band.

Example topics:

Redis queues and worker failure recovery
React performance and streaming UI
Docker sandbox security
AI streaming parser design
RAG and reranking systems
GitHub sync and conflict handling
S3/CloudFront preview routing
Observability and incident debugging
