Shared from "Interview Questions" on Inkdown

Edward / AI Systems Interview Question Bank

Prepared from the interview thread plus the extended Edward-specific question bank.

How To Use This

Use this as a deep interview/practice guide. The goal is not to memorize answers, but to explain exact mechanisms, invariants, failure modes, tradeoffs, and production debugging paths. Recommended format:

  • Ask one question at a time.
  • Candidate answers in concrete system detail, not resume language.
  • Interviewer follows up until the invariant, failure mode, and tradeoff are clear.
  • Passing is allowed, but repeated passes lower the leveling signal.
  • At the end, evaluate hire/no-hire, level, weaknesses, and expected compensation band.

Part 1: Questions Asked In The Thread

1. Edward End-To-End Architecture

For Edward, walk through the end-to-end architecture after a user types:

“Build me a todo app with deployable preview.”

Explain the lifecycle from prompt request to preview:

  • Prompt request
  • Intent analysis
  • Orchestration
  • Code generation
  • File writes
  • Sandbox/build
  • Preview routing
  • Recovery/failure handling

Be specific. Avoid marketing language. Explain ownership-level system behavior.

2. Per-User Container Admission Control

Edward allows 2 running containers per user. User A submits 3 app-generation prompts rapidly from the same account/session. Walk through the exact admission-control flow for the 3rd request. Cover:

  • Where the per-user running-container count is stored
  • Whether the 3rd request queues, rejects, or waits
  • How Redis locking prevents races between two workers starting containers for the same user
  • What TTL/heartbeat/recovery exists if a worker crashes after incrementing the count but before cleanup
  • How you reconcile Redis state with actual Docker state if they diverge

Answer like you are debugging a real production incident, not describing the happy path.
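For the reconciliation bullet, one way to picture a strong answer is a pure diff between the slot leases recorded in Redis and the containers Docker reports as running. The sketch below is only an illustration under assumptions: the `Lease` and `Container` shapes, the `containerId` link between them, and the `reconcile` name are hypothetical and are not taken from Edward.

```typescript
// Hypothetical shapes: Edward's real schema is not described in this document.
interface Lease { userId: string; containerId: string; }
interface Container { id: string; userId: string; }

interface ReconcilePlan {
  leasesToRelease: Lease[];      // Redis says running, Docker disagrees
  containersToStop: Container[]; // Docker says running, Redis has no lease
}

// Compare recorded leases with observed containers and report both kinds of drift.
function reconcile(leases: Lease[], running: Container[]): ReconcilePlan {
  const runningIds = new Set(running.map((c) => c.id));
  const leasedIds = new Set(leases.map((l) => l.containerId));
  return {
    leasesToRelease: leases.filter((l) => !runningIds.has(l.containerId)),
    containersToStop: running.filter((c) => !leasedIds.has(c.id)),
  };
}
```

A periodic sweep could feed this function with a Redis scan and a container listing. A candidate should also address the race where a container has just started but its lease is not yet visible, for example with a grace period before acting on the plan.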

3. Backend Concurrency Guarantee When UI Is Bypassed

Assume the UI check is bypassed. Three HTTP requests from the same user hit your API within 50ms. What exact backend logic ensures that only two are accepted? Give either:

1. The actual mechanism used, or
2. The mechanism you would implement now.

Be concrete:

  • Redis command/script
  • BullMQ limiter/grouping
  • DB transaction
  • Unique row/lease table
  • Any other exact design

Also explain:

  • What Redis keys exist
  • What state is mutated
  • How stale locks are released
  • What happens if the process crashes after incrementing the count
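As a reference point for follow-ups, the sketch below models a lease-based answer in memory: each accepted request holds a lease with an expiry, and admission purges expired leases before counting. It stands in for logic that would run as one atomic step in Redis (for example a server-side script). The class name, the limit constant, and the 60-second TTL are assumptions for illustration, not Edward's actual values.

```typescript
// In-memory model of a per-user lease set. All names and numbers are hypothetical.
const MAX_SLOTS = 2;
const LEASE_TTL_MS = 60000;

class UserSlots {
  // leaseId -> expiry timestamp in ms (models a sorted set scored by expiry)
  private leases = new Map<string, number>();

  // Purge expired leases, then admit only if fewer than MAX_SLOTS remain.
  acquire(leaseId: string, now: number): boolean {
    this.leases.forEach((expiresAt, id) => {
      if (expiresAt <= now) this.leases.delete(id);
    });
    if (this.leases.has(leaseId)) return true; // retry of an already admitted request
    if (this.leases.size >= MAX_SLOTS) return false;
    this.leases.set(leaseId, now + LEASE_TTL_MS);
    return true;
  }

  // A live worker extends its lease; a crashed worker stops calling this,
  // so its slot frees itself once the TTL passes.
  heartbeat(leaseId: string, now: number): boolean {
    const expiresAt = this.leases.get(leaseId);
    if (expiresAt === undefined || expiresAt <= now) return false;
    this.leases.set(leaseId, now + LEASE_TTL_MS);
    return true;
  }

  release(leaseId: string): void {
    this.leases.delete(leaseId);
  }
}
```

The point to probe is why purge, count, and insert must happen as a single atomic unit: if they are separate round trips, two of the three requests arriving within 50ms can both observe one free slot.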

4. Streaming State-Machine Parser

Edward has a streaming state-machine parser that converts raw LLM chunks into structured events in real time. Walk through the parser. Cover:

  • What the raw model output looks like
  • What event types are emitted
  • How partial chunks are handled when they split JSON/tags/code fences
  • What happens when the model emits malformed or out-of-order data
  • How handlers are dispatched
  • Whether malformed output is skipped, repaired, or retried

5. JSX / HTML Safety In Tag-Based Parsing

Suppose the model generates:

<file path="src/App.tsx">
export default function App() {
  return <div>Hello</div>
}
</file>

But streaming chunks arrive like:

1. <file path="src/Ap
2. p.tsx">\nexport default function App() {
3. \n return <div>Hello</div>\n}
4. \n</file>

How does the parser avoid treating the React <div> as one of the control tags? Once the complete file tag is available:

  • Do you write the whole file at once?
  • Or stream file content progressively into the sandbox?

6. Nested Control-Like Tags Inside File Content

If the parser is inside a <file> block, does it ignore all nested unknown tags until the matching </file>? Example: content inside App.tsx that returns the text “not a control tag” wrapped in a control-like tag.

Would that prematurely close the file block? Or do you only treat a closing tag as a terminator when it matches the currently active control tag from parser state? Also explain whether file content is written only after the closing file tag or streamed progressively while the file block is open.
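To make questions 5 and 6 concrete, here is a minimal sketch of a two-state streaming parser. It assumes a single control tag written as `<file path="...">` and closed by `</file>`; the event names and the `FileBlockParser` class are hypothetical. While a file block is open, only the active terminator is recognized, so JSX such as `<div>` passes through as plain content, and a tail that could be the start of a split terminator is held back until the next chunk.

```typescript
type ParserEvent =
  | { type: "file_start"; path: string }
  | { type: "file_chunk"; text: string }
  | { type: "file_end" };

const OPEN = '<file path="';
const CLOSE = "</file>";

// Length of the longest proper prefix of `token` that `buf` ends with.
function partialSuffix(buf: string, token: string): number {
  const max = Math.min(buf.length, token.length - 1);
  for (let k = max; k > 0; k--) {
    if (buf.endsWith(token.slice(0, k))) return k;
  }
  return 0;
}

class FileBlockParser {
  private buf = "";
  private inFile = false;

  feed(chunk: string): ParserEvent[] {
    const events: ParserEvent[] = [];
    this.buf += chunk;
    for (;;) {
      if (!this.inFile) {
        const start = this.buf.indexOf(OPEN);
        if (start === -1) {
          // Keep only a tail that might be the beginning of an opening tag.
          this.buf = this.buf.slice(this.buf.length - partialSuffix(this.buf, OPEN));
          return events;
        }
        const end = this.buf.indexOf('">', start + OPEN.length);
        if (end === -1) {
          this.buf = this.buf.slice(start); // opening tag is split across chunks
          return events;
        }
        events.push({ type: "file_start", path: this.buf.slice(start + OPEN.length, end) });
        this.buf = this.buf.slice(end + 2);
        this.inFile = true;
      } else {
        const close = this.buf.indexOf(CLOSE);
        // Hold back a tail that could be the start of the terminator.
        const safeEnd = close !== -1 ? close : this.buf.length - partialSuffix(this.buf, CLOSE);
        if (safeEnd > 0) events.push({ type: "file_chunk", text: this.buf.slice(0, safeEnd) });
        if (close === -1) {
          this.buf = this.buf.slice(safeEnd);
          return events;
        }
        events.push({ type: "file_end" });
        this.buf = this.buf.slice(close + CLOSE.length);
        this.inFile = false;
      }
    }
  }
}
```

This sketch streams content progressively and is safe against ordinary JSX, but it still closes the block early if the file content itself contains the literal terminator. That remaining weakness is what question 7 is meant to expose.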

7. More Robust Streaming Protocol Design

Design a more robust version of the streaming structured event protocol. You still want streaming structured events from the model, but file contents may contain arbitrary JSX/HTML/XML-like text. Pick one protocol and explain how it streams, validates, and recovers from malformed chunks. Possible options:

  • Length-prefixed blocks
  • JSON Lines with base64 content
  • Tool/function calls
  • Multipart boundaries
  • Escaped CDATA-like sections
  • Custom grammar

Explain why your chosen protocol prevents file content from being confused with control syntax.
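For calibration, the sketch below shows the length-prefixed option. The wire format is an assumption made up for this example: a header line `FILE <path> <length>` followed by exactly `length` characters of content. A production design would more likely count bytes, and paths containing spaces are not handled. Because the decoder consumes content by count and never scans it for delimiters, control-like text inside a file cannot be mistaken for protocol syntax.

```typescript
interface FileFrame { path: string; content: string; }

// Hypothetical wire format: "FILE <path> <length>\n" followed by <length> characters.
function encodeFrame(frame: FileFrame): string {
  return "FILE " + frame.path + " " + frame.content.length + "\n" + frame.content;
}

class FrameDecoder {
  private buf = "";

  // Returns every frame completed by this chunk; incomplete data stays buffered.
  feed(chunk: string): FileFrame[] {
    const frames: FileFrame[] = [];
    this.buf += chunk;
    for (;;) {
      const nl = this.buf.indexOf("\n");
      if (nl === -1) return frames; // header line not complete yet
      const header = this.buf.slice(0, nl);
      const match = /^FILE (\S+) (\d+)$/.exec(header);
      if (!match) throw new Error("malformed frame header: " + header);
      const length = Number(match[2]);
      const bodyStart = nl + 1;
      if (this.buf.length < bodyStart + length) return frames; // body not complete yet
      frames.push({ path: match[1], content: this.buf.slice(bodyStart, bodyStart + length) });
      this.buf = this.buf.slice(bodyStart + length);
    }
  }
}
```

A good follow-up is who computes the length. Models are unreliable at counting, so this framing tends to fit best when a trusted layer, such as a tool-call handler, wraps the content rather than the model emitting the prefix itself.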

8. Bonkers Pipeline Design

For Bonkers, you claimed you unified recreation, regeneration, and editing workflows into a single image-to-image pipeline, contributing to a 50% DAU increase. Walk through the pipeline design. Cover:

  • What the separate v2 flows were before
  • What abstraction was introduced in v3
  • How templates fit into the pipeline
  • How routing worked across Flux, Recraft, Ideogram, or other providers
  • How provider selection, retries, latency, and quality tradeoffs were handled

9. Agentic Chat Request Router

For Agentic Chat, you claimed:

  • LangGraph multi-step workflows
  • Intelligent tool selection
  • RAG with pgvector embeddings
  • Cohere reranking
  • Mem0 long-term memory
  • Semantic caching
  • Conversation branching

A user asks:

“Summarize the contract I uploaded last week and email the key risks to my manager.”

How does the system decide between RAG, memory, tools, and Google Workspace actions? Describe:

  • Graph nodes
  • Routing conditions
  • Permission boundaries
  • Failure handling
  • Whether email sending requires confirmation
  • How uploaded documents are located
  • How manager identity is resolved
  • What happens if required context is missing

Part 2: Extended Edward Question Bank

1. Product Boundary And Requirements
  • What exactly can Edward generate today: static frontend only, API calls, mock data, stateful forms, routing, external SDKs?
  • What is explicitly out of scope: auth flows, backend persistence, payments, server actions?
  • How do you stop the model from generating unsupported features?
  • When the user asks for unsupported functionality, do you reject, degrade, mock, or explain?
  • How are product constraints represented in the prompt, intent analyzer, and UI?
  • How do you test that unsupported features are handled consistently?
2. Prompt To Intent Analyzer
  • What are the intent categories?
  • How do you distinguish casual chat, edit request, new app generation, bug fix, preview request, and GitHub sync?
  • Is the classifier LLM-based, rules-based, or hybrid?
  • What happens on low confidence?
  • How do you evaluate false positives and false negatives?
  • What telemetry do you track for misclassified requests?
  • Can the user override intent classification?
  • How do you prevent casual conversation from triggering costly agent runs?
3. Orchestration Loop
  • What is one “turn” in the orchestrator?
  • What state is persisted between turns?
  • What makes the loop stop?
  • How do you prevent infinite continue loops?
  • How do you handle partial completion, such as app generated but build failed?
  • How do you replay or resume a failed run?
  • What is the exact run state machine?
  • Where are turn outputs stored?
  • How do you prevent duplicate handler execution on retry?
  • How do you decide whether the agent should continue, repair, or stop?
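When asking for the exact run state machine, it helps to have a concrete shape in mind. The sketch below is an assumption, not Edward's real design: the `repairing` state, the transition table, and the limit of 3 repair attempts are invented for illustration. It shows the two properties worth pushing on: illegal transitions are rejected, and the repair loop is bounded.

```typescript
type RunState = "queued" | "running" | "building" | "repairing" | "preview_ready" | "failed";

// Hypothetical transition table; terminal states have no outgoing edges.
const TRANSITIONS: Record<RunState, RunState[]> = {
  queued: ["running", "failed"],
  running: ["building", "failed"],
  building: ["preview_ready", "repairing", "failed"],
  repairing: ["building", "failed"],
  preview_ready: [],
  failed: [],
};

const MAX_REPAIRS = 3;

interface Run { state: RunState; repairs: number; }

function transition(run: Run, next: RunState): Run {
  if (TRANSITIONS[run.state].indexOf(next) === -1) {
    throw new Error("illegal transition " + run.state + " -> " + next);
  }
  const repairs = next === "repairing" ? run.repairs + 1 : run.repairs;
  // Bounding repairs is what stops an endless build/repair loop.
  if (repairs > MAX_REPAIRS) return { state: "failed", repairs: run.repairs };
  return { state: next, repairs: repairs };
}
```

In a real system the transition would be persisted with a compare-and-set on the current state, so that a retried job cannot apply the same transition twice.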
4. Streaming Parser
  • What exact tags/events exist?
  • What does raw model output look like?
  • How do you parse partial chunks?
  • How do you handle malformed tags?
  • How do you avoid JSX/HTML inside file content being parsed as control syntax?
  • Are handlers idempotent?
  • If the stream disconnects mid-file, what state is saved?
  • How would you redesign the protocol more safely?
  • Do you validate event schemas before dispatch?
  • Do you support nested events?
  • Do you emit partial UI events before full validation?
  • How do you test parser edge cases?
5. File Write Pipeline
  • Do you write files progressively or only after complete file blocks?
  • How do you ensure atomic file writes?
  • Do you write a temp file and then rename?
  • How do you handle concurrent edits to the same file?
  • How do you preserve user edits when AI regenerates?
  • How do you diff/patch instead of overwrite?
  • What happens if two agent steps write the same path?
  • How do you handle binary assets?
  • How do you validate file paths to prevent path traversal?
  • How do you prevent writes outside the workspace?
  • How is the file tree synchronized to the UI?
  • What happens if file write succeeds but DB update fails?
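For the path traversal questions, a minimal sketch of lexical path validation is shown below. It assumes POSIX-style paths and a hypothetical `resolveInWorkspace` helper. It does not touch the filesystem, so it does not defend against symlinks that already exist inside the workspace; that gap is a fair follow-up.

```typescript
// Resolve a model-supplied relative path against the workspace root and
// reject anything that would land outside it. POSIX-style paths assumed.
function resolveInWorkspace(root: string, relPath: string): string {
  if (relPath.length === 0 || relPath.indexOf("\0") !== -1) {
    throw new Error("invalid path");
  }
  if (relPath.charAt(0) === "/" || relPath.indexOf("\\") !== -1) {
    throw new Error("absolute or backslash path rejected: " + relPath);
  }
  const parts: string[] = [];
  for (const segment of relPath.split("/")) {
    if (segment === "" || segment === ".") continue;
    if (segment === "..") {
      // Popping past the root means the path escapes the workspace.
      if (parts.length === 0) throw new Error("path escapes workspace: " + relPath);
      parts.pop();
    } else {
      parts.push(segment);
    }
  }
  if (parts.length === 0) throw new Error("path resolves to workspace root");
  return root.replace(/\/+$/, "") + "/" + parts.join("/");
}
```

For example, `resolveInWorkspace("/work/u1", "src/App.tsx")` yields `/work/u1/src/App.tsx`, while `../etc/passwd` and `/etc/passwd` are both rejected.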
6. Sandbox Design
  • How do you create a sandbox?
  • What image is used for each framework?
  • How do you isolate filesystem, network, CPU, and memory?
  • How do you prevent malicious generated code from escaping?
  • Are containers long-lived or per-run?
  • How do you clean them up?
  • How do you handle dependency install time?
  • Do you cache node_modules or the package manager store?
  • How do you enforce timeout limits?
  • How do you prevent network abuse or SSRF?
  • What happens if Docker daemon restarts?
  • How do you detect zombie containers?
7. Concurrency And Admission Control
  • Why 2 containers per user?
  • How is the cap enforced on the backend?
  • What Redis keys, locks, or state exist?
  • What happens when 3 requests hit within 50ms?
  • What happens if the worker crashes after acquiring a slot?
  • What TTL or heartbeat exists?
  • How do you reconcile Redis with real Docker state?
  • What metrics tell you slots are leaking?
  • Is the limit per user, per org, per browser session, or global?
  • What happens when a user opens multiple tabs?
  • Is the 3rd request rejected, queued, or delayed?
  • How do you avoid starvation?
8. BullMQ / Worker Architecture
  • What queues exist?
  • What is the job payload?
  • What retries are safe?
  • Which jobs are idempotent?
  • How do you dedupe repeated requests?
  • What is your backoff strategy?
  • How do you handle stalled jobs?
  • How do you separate agent jobs, build jobs, cleanup jobs, and preview jobs?
  • What do you persist before enqueueing vs inside the worker?
  • How do you avoid losing a job between DB write and queue publish?
  • How do you handle poison jobs?
  • How do you monitor queue depth and processing latency?
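On deduplication, one common answer is a deterministic job id derived from a client-supplied request id, so that a repeated submit maps to the same job. The sketch below models that with an in-memory queue. The `AgentJob` fields, the id format, and the `DedupingQueue` class are hypothetical. BullMQ accepts a custom `jobId` option when adding a job, but how duplicates and id characters are treated should be checked against the version in use.

```typescript
// Hypothetical payload: requestId is an idempotency key generated by the client.
interface AgentJob { userId: string; chatId: string; requestId: string; prompt: string; }

// Deterministic id: the same logical request always maps to the same job.
function jobIdFor(job: AgentJob): string {
  return ["agent", job.userId, job.chatId, job.requestId].join("_");
}

class DedupingQueue {
  private jobs = new Map<string, AgentJob>();

  // Returns true if enqueued, false if a job with this id already exists.
  add(job: AgentJob): boolean {
    const id = jobIdFor(job);
    if (this.jobs.has(id)) return false;
    this.jobs.set(id, job);
    return true;
  }

  size(): number {
    return this.jobs.size;
  }
}
```

Deduplication at enqueue time does not make the handler idempotent. A retry after a partial run still needs its own guard, which is what the retry and idempotency bullets are probing.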
9. Build System
  • After code generation, how do you install dependencies?
  • How do you detect package manager?
  • How do you prevent unsafe install scripts?
  • How do you handle build errors?
  • Do you ask the model to repair build errors automatically?
  • How many repair attempts?
  • How do you stream build logs to the UI?
  • How do you separate TypeScript errors, lint errors, and runtime errors?
  • How do you handle missing dependencies?
  • How do you prevent infinite install/build loops?
  • How do you cache builds?
  • How do you make builds reproducible?
10. Preview Pipeline
  • What exactly goes to S3?
  • What path structure do you use: user/chat/run/version?
  • How do you make preview URLs stable?
  • How does CloudFront rewrite work?
  • How do you invalidate cache?
  • How do you prevent one user accessing another user’s preview?
  • How do you handle rollback to a previous preview version?
  • What happens if upload succeeds but DB update fails?
  • How do you handle partial upload failure?
  • How do you version previews?
  • How do you expire old previews?
  • How do you debug a blank preview?
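For the versioning, stable URL, and rollback bullets, one design a candidate might describe is immutable versioned prefixes plus a small pointer that names the current version. The sketch below is an assumption for discussion only: the `previews/<user>/<chat>/v<n>/` layout and the `PreviewPointer` shape are hypothetical, not Edward's actual S3 structure.

```typescript
interface PreviewPointer { currentVersion: number; versions: number[]; }

// Each version gets its own prefix, and objects under it are never overwritten.
function previewPrefix(userId: string, chatId: string, version: number): string {
  return "previews/" + userId + "/" + chatId + "/v" + version + "/";
}

// Called only after every object of the new version has been uploaded.
function publish(pointer: PreviewPointer): PreviewPointer {
  const next = pointer.versions.length === 0 ? 1 : Math.max(...pointer.versions) + 1;
  return { currentVersion: next, versions: pointer.versions.concat(next) };
}

// Rollback is a pointer move; no objects are copied or deleted.
function rollback(pointer: PreviewPointer, to: number): PreviewPointer {
  if (pointer.versions.indexOf(to) === -1) throw new Error("unknown version " + to);
  return { currentVersion: to, versions: pointer.versions };
}
```

With this shape, a partial upload leaves the pointer untouched, so the stable URL keeps serving the previous version. The follow-up is where the pointer lives and how the CDN learns that it moved.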
11. Persistence Model
  • What tables exist for users, chats, runs, builds, files, and previews?
  • What is the source of truth for project state?
  • Do you store generated files in DB, S3, Git, or Docker filesystem?
  • How do you version file states?
  • How do you resume after API restart?
  • How do you migrate schema without breaking old projects?
  • How do you represent failed, partial, and completed runs?
  • How do you associate model events with persisted files?
  • How do you handle DB transaction boundaries?
  • What data is safe to delete during cleanup?
12. GitHub Sync
  • How do you authenticate GitHub?
  • How do you create repo, branch, and commit?
  • How do you handle conflicts?
  • Do you force push or open PRs?
  • How do you map generated file tree to Git blobs?
  • What happens if sync fails midway?
  • How do you avoid leaking API keys into commits?
  • How do you handle large files?
  • How do you handle existing repos?
  • How do you preserve commit history?
  • How do you map Edward project versions to Git commits?
  • How do you recover if GitHub API rate limits you?
13. Model Orchestration
  • OpenAI vs Gemini: how do you choose?
  • How do you handle model-specific streaming differences?
  • What is the system prompt structure?
  • How do you fit large projects into context?
  • Do you use file summaries, embeddings, or dependency graphs?
  • How do you validate generated code before applying it?
  • How do you prevent prompt injection from project files?
  • How do you handle tool/model failure?
  • How do you prevent the model from hallucinating unavailable libraries?
  • How do you track model cost per run?
  • How do you evaluate output quality?
  • How do you roll out prompt changes safely?
14. Unsupported Capability Handling
  • User asks for auth, database, payments, cron, or backend API. What happens?
  • Do you generate frontend mocks?
  • Do you explain limitation?
  • Do you scaffold placeholders?
  • How does the intent analyzer know the frontend-only app builder constraints?
  • How do you stop generated UI from claiming real persistence exists?
  • How do you communicate unsupported features in the UI?
  • How do you avoid breaking trust while still being helpful?
15. Recovery And Reliability
  • API crashes mid-generation. What is recoverable?
  • Worker crashes mid-build. What is recoverable?
  • Docker daemon restarts. What do you do?
  • Redis restarts. What state is lost?
  • S3 upload partially fails. How do you detect it?
  • User refreshes browser during stream. Can they reconnect?
  • What is the recovery path for stuck “building” state?
  • How do you handle duplicate retries?
  • What cron/cleanup jobs exist?
  • How do you test failure recovery?
16. Observability
  • What logs do you emit per run?
  • What trace/span boundaries exist?
  • What metrics matter?
  • What alerts would you set?
  • How do you debug “preview stuck building”?
  • How do you debug “generated app is blank”?
  • How do you correlate user session → run → worker → container → S3 objects?
  • What dashboards would you build?
  • What are the top 5 production alerts?
  • How do you expose run-level debugging to users or support?
17. Security
  • How do you store user model API keys?
  • How are keys encrypted?
  • Who can access generated files?
  • How do you prevent SSRF/network abuse from generated code?
  • How do you prevent dependency supply-chain issues?
  • How do you scan for secrets before GitHub sync?
  • How do you isolate untrusted code execution?
  • How do you prevent path traversal in generated file paths?
  • How do you protect preview URLs?
  • How do you handle malicious prompts?
  • How do you handle generated code that tries to exfiltrate environment variables?
  • How do you rotate encryption keys?
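For the secret-scanning bullet, a minimal sketch of a pre-sync scan is shown below. The three rules are illustrative assumptions only; a real scanner would rely on a maintained rule set and would have to deal with false positives. The `Finding` shape and `scanFiles` name are hypothetical.

```typescript
interface Finding { path: string; line: number; rule: string; }

// Illustrative patterns only, not a complete or authoritative rule set.
const RULES: { name: string; pattern: RegExp }[] = [
  { name: "aws-access-key-id", pattern: /\bAKIA[0-9A-Z]{16}\b/ },
  { name: "private-key-block", pattern: /-----BEGIN [A-Z ]*PRIVATE KEY-----/ },
  { name: "api-key-assignment", pattern: /api[_-]?key\s*[:=]\s*["'][A-Za-z0-9_\-]{20,}["']/i },
];

// Scan each file line by line and report every rule that matches.
function scanFiles(files: { path: string; content: string }[]): Finding[] {
  const findings: Finding[] = [];
  for (const file of files) {
    file.content.split("\n").forEach((text, index) => {
      for (const rule of RULES) {
        if (rule.pattern.test(text)) {
          findings.push({ path: file.path, line: index + 1, rule: rule.name });
        }
      }
    });
  }
  return findings;
}
```

A sync step could refuse to commit while `scanFiles` returns any findings and surface the path and line to the user. The interview follow-up is what happens when the scan itself is wrong in either direction.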
18. Performance
  • Where is latency spent: model, install, build, upload, CDN?
  • How do you reduce time-to-first-file?
  • How do you reduce time-to-preview?
  • What do you stream to UI immediately?
  • What do you cache?
  • How do you prewarm containers?
  • How do you measure p50/p95 run duration?
  • How do you optimize dependency install?
  • How do you avoid cold-start spikes?
  • How do you prioritize interactive latency vs total completion time?
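When a candidate quotes p50 or p95 numbers, it is worth checking that they can say how the figure is computed. The sketch below uses the nearest-rank definition over raw samples. The `percentile` helper is hypothetical, and production systems usually compute percentiles from histograms rather than by sorting every sample.

```typescript
// Nearest-rank percentile: the smallest sample such that at least p percent
// of all samples are less than or equal to it.
function percentile(samplesMs: number[], p: number): number {
  if (samplesMs.length === 0) throw new Error("no samples");
  if (p <= 0 || p > 100) throw new Error("p must be in (0, 100]");
  const sorted = samplesMs.slice().sort((a, b) => a - b);
  const rank = Math.ceil((p / 100) * sorted.length);
  return sorted[rank - 1];
}
```

With ten run durations of 10, 20, and so on up to 100 seconds, this definition gives a p50 of 50 and a p95 of 100, which shows how a single slow run dominates the tail at small sample sizes.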
19. UX And State Management
  • How does the chat UI receive events?
  • SSE or WebSocket?
  • How do you handle reconnect?
  • How do you show file tree changes in real time?
  • How do you prevent UI from showing a file before write succeeds?
  • How do you represent run states: queued, running, building, failed, preview-ready?
  • How do you display partial generation?
  • How do you display repair attempts?
  • How do you let users inspect logs?
  • How do you avoid confusing the user when generation succeeds but preview fails?
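For the reconnect questions, a minimal sketch of an SSE-style answer is shown below: every event gets a monotonically increasing id, and a reconnecting client reports the last id it saw so the server can replay what was missed. The `RunEventLog` class is hypothetical and in-memory; a real system would persist the log so that replay survives an API restart.

```typescript
interface RunEvent { id: number; type: string; data: string; }

class RunEventLog {
  private events: RunEvent[] = [];
  private nextId = 1;

  append(type: string, data: string): RunEvent {
    const event: RunEvent = { id: this.nextId++, type: type, data: data };
    this.events.push(event);
    return event;
  }

  // On reconnect the client reports the last id it saw; replay everything after it.
  replayAfter(lastSeenId: number): RunEvent[] {
    return this.events.filter((e) => e.id > lastSeenId);
  }
}

// Serialize one event in the text/event-stream wire format.
function toSse(event: RunEvent): string {
  const dataLines = event.data.split("\n").map((line) => "data: " + line).join("\n");
  return "id: " + event.id + "\nevent: " + event.type + "\n" + dataLines + "\n\n";
}
```

Browsers using `EventSource` send the last received id back in the `Last-Event-ID` request header on reconnect, which is what would feed `replayAfter` in this sketch.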
20. Tradeoffs And Redesign
  • If rebuilding Edward today, what would you change?
  • Would you keep tag-based streaming or move to tool calls/JSONL?
  • Would you keep Docker per user or move to Firecracker/microVMs?
  • Would you store files in DB, S3, or Git from the start?
  • What was the hardest bug?
  • What was the biggest design mistake?
  • What did you deliberately not build?
  • What tradeoff are you still unsure about?
  • What part would fail first at 10x users?
  • What would you do differently for a team of 5 engineers maintaining it?

Part 3: Strong Interview Sequence

A strong Edward-focused deep interview could follow this sequence:

1. Architecture walkthrough
2. Product boundaries and unsupported capabilities
3. Intent classification
4. Orchestration loop and state machine
5. Streaming parser correctness
6. File write pipeline and idempotency
7. Sandbox isolation and security
8. Concurrency/admission control
9. BullMQ and worker failure recovery
10. Build failure repair loop
11. Preview routing and consistency
12. GitHub sync and conflict handling
13. Persistence model and recovery
14. Observability incident debugging
15. Redesign tradeoffs

Part 4: What Strong Answers Should Include

Strong answers should include:

  • Exact state machines, not just “orchestrator”
  • Redis key shapes, TTLs, locks, and cleanup behavior
  • DB source-of-truth decisions
  • Queue lifecycle and retry semantics
  • Idempotency guarantees
  • Failure-mode handling
  • Security boundaries
  • Tradeoffs and why alternatives were rejected
  • Observability/debugging strategy
  • Product constraints and graceful degradation

Weak answers usually rely on phrases like:

  • “We use Redis locking”
  • “The orchestrator handles it”
  • “Docker makes it isolated”
  • “The LLM gives tags”
  • “The UI prevents it”
  • “It works perfectly”

Those are starting points, not senior-level explanations.

Part 5: Practice Prompt

Use this prompt to start a real drill:

Interview me deeply on Edward as a senior/principal engineer. Ask one question at a time. Push on exact mechanisms, failure modes, tradeoffs, and production debugging. Do not break character until the interview is done. At the end, give honest hire/no-hire, level, weaknesses, and INR salary band.

For topic-specific practice:

Interview me deeply on [topic] as a senior/principal engineer. Ask one question at a time. Focus on exact mechanisms and failure cases. Do not break character until done. At the end, give honest hire/no-hire, level, weaknesses, and salary band.

Example topics:

  • Redis queues and worker failure recovery
  • React performance and streaming UI
  • Docker sandbox security
  • AI streaming parser design
  • RAG and reranking systems
  • GitHub sync and conflict handling
  • S3/CloudFront preview routing
  • Observability and incident debugging