Edward Project - Comprehensive Interview Questions - Part 5

Table of Contents

General & Introductory Questions
Frontend Deep Dive
Backend API Patterns
Database Deep Dive
LLM Deep Dive
Docker Deep Dive
Redis Deep Dive
Security Deep Dive

General & Introductory Questions

Q: What is Edward and what problem does it solve?

Answer: Edward is an AI-powered full-stack development platform that allows users to describe what they want to build in natural language and have an AI agent generate, build, and deploy a working web application. The problem it solves is the complexity and time required to go from an idea to a deployed application. Users don't need to know frameworks, build tools, or deployment infrastructure - they simply describe their requirements, and Edward handles everything from code generation to preview hosting. It targets both developers who want to accelerate prototyping and non-technical users who want to build web applications without coding.

Q: What is the high-level architecture of Edward?

Answer: Edward consists of four main components: (1) Next.js frontend (apps/web) that provides the chat interface, file editor, and preview viewer, (2) Express API server (apps/api) that handles HTTP requests, orchestrates AI agents, and manages background jobs, (3) BullMQ workers that process build jobs and agent run jobs asynchronously, (4) Docker containers that provide isolated sandboxes for code execution and building. The system uses Postgres for persistent data, Redis for queues and pub/sub, S3 and CloudFront for asset storage and CDN, and integrates with OpenAI/Anthropic/Gemini for LLM capabilities. The architecture is designed as a monorepo with pnpm workspaces for code sharing.

Q: Who is the target user for Edward?

Answer: Edward targets two primary user segments: (1) Developers who want to rapidly prototype ideas without setting up boilerplate, choosing frameworks, or configuring build tools - they can describe what they want and get a working application they can then customize, (2) Non-technical users or business stakeholders who want to build web applications but lack coding skills - they can interact with Edward through natural language and get functional applications. The system is designed to be approachable for beginners while providing enough control and extensibility for experienced developers who want to export or customize the generated code.

Q: What makes Edward different from other AI coding assistants?

Answer: Edward differentiates itself through end-to-end automation - it doesn't just generate code snippets but handles the entire lifecycle from idea to deployed preview. Key differentiators include: (1) Built-in build and deployment pipeline with Docker sandboxes, (2) Real-time preview hosting with subdomain or path-based routing, (3) Multi-turn conversation with context awareness and follow-up support, (4) Framework-agnostic with auto-detection and template-based builds, (5) Streaming responses with file editor integration for real-time code viewing, (6) Persistent project state with version history and rollback capability. Unlike assistants that only generate code, Edward is a complete development environment.

Q: What is the technology stack and why were these choices made?

Answer: The technology stack includes: Next.js for the frontend (chosen for React ecosystem, SSR capabilities, and App Router), Express for the API (chosen for minimal overhead and middleware control), BullMQ for job queues (chosen for Redis integration and simplicity), Drizzle ORM for database (chosen for SQL-first approach and type safety), Docker for sandboxes (chosen for full Node.js ecosystem support), Postgres for data (chosen for ACID compliance and JSONB support), Redis for queues/pub/sub (chosen for speed and existing requirement), S3/CloudFront for storage (chosen for scalability and CDN). Each choice balances functionality, team expertise, and operational simplicity.

Q: How does Edward handle user data and privacy?

Answer: Edward handles user data with several privacy measures: (1) User API keys are encrypted at rest using AES encryption before storage, (2) Chat messages and generated code are stored in Postgres with user-scoped access controls, (3) Sandbox containers are ephemeral and cleaned up after TTL expiration, (4) Preview URLs can be configured with custom subdomains for isolation, (5) GitHub integration requires explicit user authorization via OAuth, (6) All API requests require authentication via Better Auth session cookies, (7) Security telemetry logs anomalies for audit trails. The system is designed to minimize data retention - sandbox files are not permanently stored, only the generated code in the chat context is persisted.

Frontend Deep Dive

Q: How does Next.js App Router integration work in Edward?

Answer: Edward uses Next.js 14+ with the App Router pattern. The app structure includes server components for data fetching and client components for interactivity. The chat page (app/chat/[chatId]/page.tsx) is a server component that fetches initial chat history and renders the ChatPageClient client component. This hybrid approach leverages server-side rendering for initial load and client-side streaming for real-time updates. The App Router's file-based routing matches the chat URL structure. The API routes are in app/api/ for serverless functions, though Edward primarily uses the Express API server. The frontend uses Next.js Image optimization for user-uploaded images.

Q: How does the streaming implementation work in the frontend?

Answer: The frontend streaming implementation uses the native EventSource API for SSE. The openRunEventsStream function in lib/api/chat creates an EventSource connection to the /chat/:chatId/runs/:runId/stream endpoint. The chatStreamProcessor.ts processes incoming SSE events using a custom parser that handles the SSE format (data: JSON\n\n). Events are dispatched to a Zustand store which triggers React re-renders. The processor implements automatic replay with exponential backoff if the connection drops. It also batches dispatches using requestAnimationFrame to optimize rendering performance. The stream processor handles different event types (TEXT, THINKING, FILE_START, FILE_CONTENT, etc.) and updates the appropriate state slices.
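A minimal sketch of the kind of parsing described above, assuming the stream is consumed as raw text chunks that may end mid-frame. The function name parseSseBuffer and the StreamEvent shape are illustrative assumptions, not Edward's actual code.

```typescript
// Hypothetical event shape; the real payload fields are not shown in the source.
interface StreamEvent {
  type: string;
  [key: string]: unknown;
}

interface ParseResult {
  events: StreamEvent[];
  // Trailing partial frame to prepend to the next chunk.
  remainder: string;
}

// Splits buffered SSE text into complete frames (each terminated by a blank
// line) and JSON-decodes the `data:` lines of every complete frame.
function parseSseBuffer(buffer: string): ParseResult {
  const frames = buffer.split("\n\n");
  // The last element is either "" (buffer ended on a frame boundary)
  // or an incomplete frame that must wait for more data.
  const remainder = frames.pop() ?? "";
  const events: StreamEvent[] = [];

  for (const frame of frames) {
    const data = frame
      .split("\n")
      .filter((line) => line.startsWith("data:"))
      .map((line) => line.slice(5).trimStart())
      .join("\n");
    if (data === "") continue; // comment or keep-alive frame
    try {
      events.push(JSON.parse(data) as StreamEvent);
    } catch {
      // Skip malformed payloads rather than aborting the whole stream.
    }
  }
  return { events, remainder };
}
```

The caller keeps `remainder` and prepends it to the next chunk, so an event split across two network reads is still decoded exactly once.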

Q: How does the file editor integrate with the streaming state?

Answer: The file editor (Monaco-based) integrates with streaming state through the sandbox store. When file events arrive (FILE_START, FILE_CONTENT, FILE_END), the stream processor updates the sandbox store's generatedFiles map. The file editor subscribes to this store and displays files as they're generated. Users can click on files in the file tree to open them in the editor. When users edit files in the editor, changes are sent to the API via the sandbox write endpoints (/chat/:chatId/sandbox/file). The editor also supports read-only mode and syntax highlighting based on file extensions. The integration ensures that AI-generated code and user edits are synchronized in real-time.
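The way file events accumulate into the generatedFiles map can be sketched as a small pure reducer. The event payload fields (path, content) and the isStreaming flag are assumptions made for illustration; the source only names the event types.

```typescript
// Hypothetical payloads: the source names the event types but not their fields.
type FileEvent =
  | { type: "FILE_START"; path: string }
  | { type: "FILE_CONTENT"; path: string; content: string }
  | { type: "FILE_END"; path: string };

interface GeneratedFile {
  content: string;
  // True while the LLM is still streaming this file.
  isStreaming: boolean;
}

type GeneratedFiles = Record<string, GeneratedFile>;

// Returns a new map instead of mutating, so store subscribers can detect changes.
function applyFileEvent(files: GeneratedFiles, event: FileEvent): GeneratedFiles {
  const current = files[event.path] ?? { content: "", isStreaming: false };
  switch (event.type) {
    case "FILE_START":
      // A new file (or a rewrite of an existing one) starts from empty content.
      return { ...files, [event.path]: { content: "", isStreaming: true } };
    case "FILE_CONTENT":
      // Append the streamed chunk to whatever has arrived so far.
      return {
        ...files,
        [event.path]: { content: current.content + event.content, isStreaming: true },
      };
    case "FILE_END":
      return { ...files, [event.path]: { ...current, isStreaming: false } };
  }
}
```

A store action would call this once per incoming event; the editor then re-renders whichever file entry changed.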

Q: How does the state management architecture work?

Answer: Edward uses Zustand for state management with a multi-store pattern. The main stores are: (1) chatStream store for streaming data keyed by chatId, (2) sandbox store for file editor state and sandbox operations, (3) rateLimit store for quota state, (4) chatHistory store for message persistence. Each store is a separate Zustand instance with its own state, actions, and selectors. The stores use React hooks (useChatStreamState, useSandbox, etc.) for component integration. State updates are optimistic - dispatched immediately when events arrive. The architecture allows multiple chats to be open simultaneously with independent state, and enables features like replay and resume without complex state synchronization.

Q: How does the frontend handle real-time updates across browser tabs?

Answer: The frontend uses the Broadcast Channel API to synchronize state across browser tabs. When a user sends a message in one tab, other tabs with the same chat open receive updates via broadcast events. The chatStreamContext and chatWorkspaceContext include broadcast channel listeners that update local state when remote changes occur. This ensures that if a user has multiple tabs open, they all stay in sync. The broadcast channel is also used for stop notice propagation - if a user stops a run in one tab, other tabs are notified to update their UI. This implementation provides a multi-tab experience without requiring server-side push for tab synchronization.

Q: How does the code splitting and lazy loading work?

Answer: Edward uses Next.js automatic code splitting based on routes. Each route is a separate chunk that's loaded on demand. The Monaco editor is loaded asynchronously via dynamic import to avoid blocking the initial page load. The file editor component uses next/dynamic with loading state to lazy load the heavy Monaco library. The streaming processor is also loaded on-demand. The build output is analyzed using the Next.js build analyzer to identify large chunks. The package.json includes sideEffects: false to enable tree shaking. This lazy loading strategy ensures the initial page load is fast even with heavy dependencies like Monaco Editor.

Q: How does the frontend handle image uploads?

Answer: Image uploads go through a dedicated endpoint /chat/image-upload with rate limiting. The frontend reads the file as an ArrayBuffer, sends it as raw binary in the request body with the appropriate Content-Type header. The API validates the file size (max 10MB) and MIME type (allowed types: image/jpeg, image/png, image/gif, image/webp). The uploaded image is stored in S3 with a unique key, and the URL is returned. The image URL is then included in the multimodal content sent to the LLM. The frontend displays uploaded images in the chat interface and includes them in the message context. Images are associated with the user message via the attachments table in the database.
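The size and MIME checks above could look roughly like the following. The function name and the specific HTTP status codes (400, 413, 415) are assumptions for the sketch; the source only states the 10MB limit and the allowed types.

```typescript
const MAX_IMAGE_BYTES = 10 * 1024 * 1024; // 10MB limit stated above
const ALLOWED_IMAGE_TYPES = new Set([
  "image/jpeg",
  "image/png",
  "image/gif",
  "image/webp",
]);

type UploadCheck =
  | { ok: true; mimeType: string }
  | { ok: false; status: number; error: string };

// Validates the Content-Type header and body size of a raw binary upload.
function checkImageUpload(contentType: string | undefined, byteLength: number): UploadCheck {
  // Drop parameters such as "; charset=binary" before comparing.
  const [rawType = ""] = (contentType ?? "").split(";");
  const mimeType = rawType.trim().toLowerCase();

  if (!ALLOWED_IMAGE_TYPES.has(mimeType)) {
    return { ok: false, status: 415, error: `Unsupported image type: ${mimeType || "unknown"}` };
  }
  if (byteLength <= 0) {
    return { ok: false, status: 400, error: "Empty upload" };
  }
  if (byteLength > MAX_IMAGE_BYTES) {
    return { ok: false, status: 413, error: "Image exceeds the 10MB limit" };
  }
  return { ok: true, mimeType };
}
```

Checking the declared Content-Type is only a first line of defense; a stricter implementation might also inspect the file's magic bytes.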

Q: How does the preview integration work in the frontend?

Answer: The preview integration allows users to view their built application in an iframe. The preview URL is received via PREVIEW_URL events during the build process. The frontend has a preview panel (in components/chat/sandbox/preview/) that loads the preview URL in an iframe. For path-based previews, the iframe loads the S3/CloudFront URL directly. For subdomain-based previews, the iframe loads the custom subdomain URL. The preview panel includes controls to refresh, open in new tab, and toggle full-screen mode. The preview also handles postMessage communication for cross-origin interactions if needed. The preview state is managed in the sandbox store and persists across page reloads.

Backend API Patterns

Q: How is the route organization structured?

Answer: The API routes are organized by domain in separate files under apps/api/routes/. The main route files are: chat.routes.ts for chat-related endpoints, api-key.routes.ts for API key management, github.routes.ts for GitHub integration, and index.ts which mounts all routers. Each router is an Express Router instance with grouped endpoints. Routes use middleware composition - rate limiters, validation, and authentication are applied per-route or per-router. This organization follows domain-driven design principles where each route file handles a specific business domain. The routes are mounted in app.factory.ts with path prefixes (e.g., /chat for chat routes).

Q: How does the controller layer work?

Answer: The controller layer in apps/api/controllers/ contains the HTTP delivery logic. Controllers are responsible for: (1) Request validation using Zod schemas, (2) Calling use cases or services, (3) Mapping domain responses to HTTP responses, (4) Setting appropriate HTTP status codes, (5) Handling errors and converting to error responses. Controllers are thin - they don't contain business logic, only HTTP concerns. For example, the chat controller (chat.controller.ts) handles HTTP concerns while delegating to services like unifiedSendMessage for the actual logic. This separation allows the same business logic to be reused across different delivery mechanisms (HTTP, CLI, etc.).

Q: How does the service layer work?

Answer: The service layer in apps/api/services/ contains the business logic. Services are organized by domain: chat/ for chat operations, runs/ for agent run management, sandbox/ for Docker operations, planning/ for workflow engine, queue/ for job queue management. Services are pure functions or classes that don't depend on HTTP. They handle: (1) Business rules and validation, (2) Database operations via Drizzle, (3) External API calls (LLM providers, GitHub, S3), (4) Complex orchestration across multiple services. Services are called by controllers and by background workers. This layer enables testing without HTTP and reuse of business logic.

Q: How does request validation work?

Answer: Request validation uses Zod schemas defined in apps/api/schemas/. The validateRequest middleware takes a Zod schema and validates the request body against it. If validation fails, it returns a 400 error with detailed validation errors. Schemas are defined for each endpoint - for example, UnifiedSendMessageRequestSchema for the chat message endpoint. Validation includes type checking, required field validation, string length limits, enum value validation, and custom validators. The validation happens before the controller logic, ensuring invalid requests don't reach business logic. This provides early error detection and clear error messages to clients.

Q: How are error responses structured?

Answer: Error responses follow a consistent structure defined in utils/response.ts. The sendError function takes an HTTP status code and a message, and returns a JSON response with { error: message }. For validation errors, the response includes field-level errors. For rate limit errors, the response includes retry-after information. Errors are also logged with context (request ID, user ID, IP) via security telemetry. The error handling middleware in app.factory.ts catches all errors and converts them to error responses. This consistent error format allows the frontend to parse and display errors appropriately. Fatal errors are also sent to Sentry for monitoring.
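As an illustration of the response shape described above, here is a sketch of a helper that builds the { error: message } body plus the optional extras. The field names fieldErrors and retryAfterSeconds are hypothetical; the source does not show the real sendError signature or payload keys.

```typescript
interface ErrorBody {
  error: string;
  // Hypothetical key for field-level validation errors.
  fieldErrors?: Record<string, string[]>;
  // Hypothetical key for rate-limit responses.
  retryAfterSeconds?: number;
}

interface ErrorResponse {
  status: number;
  headers: Record<string, string>;
  body: ErrorBody;
}

interface ErrorExtras {
  fieldErrors?: Record<string, string[]>;
  retryAfterSeconds?: number;
}

// Builds a transport-agnostic description of the error response; an Express
// handler would then apply status, headers, and JSON body to `res`.
function buildErrorResponse(status: number, message: string, extras: ErrorExtras = {}): ErrorResponse {
  const body: ErrorBody = { error: message };
  const headers: Record<string, string> = {};

  if (extras.fieldErrors) {
    body.fieldErrors = extras.fieldErrors;
  }
  if (extras.retryAfterSeconds !== undefined) {
    body.retryAfterSeconds = extras.retryAfterSeconds;
    // Retry-After is the standard HTTP header for this hint.
    headers["Retry-After"] = String(extras.retryAfterSeconds);
  }
  return { status, headers, body };
}
```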

Q: How does the SSE response streaming work on the server?

Answer: SSE streaming on the server uses Express's res.write() to send events. The response headers are set to Content-Type: text/event-stream, Cache-Control: no-cache, and Connection: keep-alive. Events are sent in the format data: JSON\n\n. The streamRunEventsFromPersistence function queries the database for events after a given sequence number and streams them to the client. It uses Redis pub/sub to get notified of new events and streams them as they arrive. The stream handles client disconnection gracefully via the close event. The server also supports Last-Event-ID for resuming streams from a specific point. The streaming implementation is in services/run-event-stream-utils/service.ts.
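The framing and resume logic above can be sketched with three small pure functions. The PersistedRunEvent shape is an assumption, as is emitting an `id:` line per frame (which is what lets a browser send Last-Event-ID on reconnect) and treating 0 as "replay from the start".

```typescript
// Hypothetical shape of a persisted run event.
interface PersistedRunEvent {
  seq: number;
  payload: unknown;
}

// Serializes one event as an SSE frame. JSON.stringify never emits raw
// newlines, so the payload always fits on a single `data:` line.
function formatSseFrame(event: PersistedRunEvent): string {
  return `id: ${event.seq}\ndata: ${JSON.stringify(event.payload)}\n\n`;
}

// Parses the Last-Event-ID header; anything missing or invalid means
// "start from the beginning" (assumes sequence numbers start at 1).
function parseLastEventId(header: string | undefined): number {
  const parsed = Number.parseInt(header ?? "", 10);
  return Number.isInteger(parsed) && parsed >= 0 ? parsed : 0;
}

// Selects the events a reconnecting client has not seen yet, in order.
function eventsToReplay(events: PersistedRunEvent[], lastSeenSeq: number): PersistedRunEvent[] {
  return events
    .filter((event) => event.seq > lastSeenSeq)
    .sort((a, b) => a.seq - b.seq);
}
```

In an Express handler, each frame returned by formatSseFrame would be passed to res.write() after the text/event-stream headers have been sent.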

Database Deep Dive

Q: How does the database schema migration work?

Answer: Edward uses Drizzle Kit for database migrations. The schema is defined in packages/auth/lib/schema.ts using Drizzle ORM's schema builder. Migrations are generated by running drizzle-kit generate, which diffs the current schema against the snapshot from the previous migration and creates migration SQL files. In development, schema changes can be applied directly with drizzle-kit push, which syncs the schema to the database without going through migration files; in production, the generated migrations are applied via a custom migration script. The schema includes tables for users, sessions, accounts (auth), chats, messages, attachments (chat), builds (build), runs, runEvents, runToolCalls (agent execution). The schema uses enums for status fields (runStatus, buildStatus, etc.) and JSONB for flexible metadata. Foreign key relationships are defined with proper cascading rules.

Q: What is the indexing strategy?

Answer: The database uses several indexes for query optimization: (1) Unique index on (runId, seq) in runEvent for event ordering, (2) Index on (chatId, createdAt DESC) in messages for chat history queries, (3) Index on (userId, status) in runs for user's active runs, (4) Index on (chatId, messageId, createdAt DESC) in builds for latest build lookup, (5) Index on (userId) in users for user lookups. The schema also uses unique constraints for (userId, provider) in accounts and (runId, idempotencyKey) in runToolCall. Indexes are added via Drizzle's .index() and .uniqueIndex() methods. The indexing strategy focuses on the most common query patterns.

Q: How does connection pooling work?

Answer: Database connection pooling is handled by the Postgres client (pg) configured in packages/auth/lib/db.ts. The connection string is built from the DATABASE_URL environment variable. Drizzle ORM uses this connection pool for all queries. The pool configuration includes parameters like max connections, connection timeout, and idle timeout. The pool is created once when the module loads and reused across all queries. In the API server, the pool is shared across all request handlers. In workers, each worker has its own connection pool. Connection pooling reduces the overhead of establishing new connections for each query and limits the maximum number of concurrent database connections.

Q: How does the transaction handling work?

Answer: Transactions are used for operations that require atomicity. Drizzle provides a db.transaction() method that wraps operations in a database transaction. The createRunWithUserLimit function uses transactions to ensure run admission checks and run creation happen atomically. The appendRunEvent function uses transactions to increment the sequence number and insert the event atomically. Transactions are also used for operations that modify multiple tables (e.g., creating a chat with initial messages). The transaction wraps multiple queries and commits only if all succeed, or rolls back if any fail. This ensures data consistency and prevents partial updates.

Q: How does the JSONB column usage work?

Answer: JSONB columns are used for flexible metadata storage. The users table has a JSONB column for custom settings. The runs table has a JSONB column for metadata (agent run metadata, checkpoint data, etc.). The build table has a JSONB column for error reports. JSONB allows storing structured data without requiring schema changes. Drizzle ORM provides type-safe access to JSONB fields using TypeScript types. Queries can filter on JSONB fields using Postgres's JSON operators (->>, @>). Compared with the plain JSON type, JSONB is slightly slower to write but faster to query, because it is stored in a decomposed binary format that does not need reparsing and that supports indexing (for example GIN indexes).

Q: How does the backup and restore strategy work?

Answer: Two kinds of state need backing up: sandbox files and the database. Sandbox backup is handled through the BullMQ backup job system. The processBackupJob function in queue.worker.ts calls backupSandboxInstance, which backs up sandbox state to S3. The backup job is enqueued after successful builds to preserve sandbox state. Restore happens during sandbox provisioning if shouldRestore is true - the system calls restoreSandboxInstance to restore files from S3 to the container. For database backups, the system relies on the infrastructure's backup strategy (e.g., AWS RDS automated backups). This backup/restore strategy ensures that sandbox state is preserved across container restarts and allows recovery from failures.

LLM Deep Dive

Q: How does the prompt engineering work?

Answer: Prompt engineering in Edward is implemented in lib/llm/prompts/sections.ts. The system uses modular prompt sections that are composed based on context. The composePrompt function takes parameters like framework, complexity, verifiedDependencies, mode, and profile. It builds a system prompt by combining sections: role definition, framework-specific instructions, complexity guidelines, tool usage instructions, output format requirements, and constraints. The prompt profile (COMPACT vs VERBOSE) controls verbosity. The prompt is dynamically generated based on the detected framework and the user's intent from the planning workflow. This modular approach allows easy adjustment of prompts without code changes.
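A simplified sketch of section-based composition follows. It reuses the parameter names listed above, but the function name composePromptSketch, the allowed values, and all section wording are invented for illustration and do not reflect the contents of sections.ts.

```typescript
type PromptProfile = "COMPACT" | "VERBOSE";

interface ComposeOptions {
  framework: string;
  complexity: "simple" | "moderate" | "complex"; // hypothetical values
  verifiedDependencies: string[];
  mode: string;
  profile: PromptProfile;
}

// Builds a system prompt by concatenating independent sections.
// Each section is included or skipped based on the options.
function composePromptSketch(options: ComposeOptions): string {
  const sections: string[] = [
    "You are an AI agent that generates, builds, and previews web applications.",
    `Target framework: ${options.framework}. Mode: ${options.mode}.`,
    `Project complexity: ${options.complexity}.`,
  ];

  if (options.verifiedDependencies.length > 0) {
    sections.push(
      `These dependencies are already verified; do not re-check them: ${options.verifiedDependencies.join(", ")}.`,
    );
  }

  // The verbose profile adds extra guidance; the compact one omits it
  // to save tokens.
  if (options.profile === "VERBOSE") {
    sections.push("Explain each tool call briefly before making it, and summarize the files you changed.");
  }

  sections.push("Output format: emit file operations inside the designated tool tags.");
  return sections.join("\n\n");
}
```

Because every section is a plain string, a section's wording can be edited without touching the composition logic.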

Q: How does the tool calling implementation work?

Answer: Tool calling is implemented through a custom parser that extracts tool calls from LLM responses. The parser in lib/llm/parser.ts detects tool call patterns in the stream (e.g., <edward_sandbox> tags for file operations). When a tool call is detected, the system executes it via the appropriate service (sandbox write, sandbox command, etc.). Tool results are fed back to the LLM as assistant messages in the conversation history. The system tracks tool calls in the runToolCall table with idempotency keys to prevent duplicate execution. Tool calls can be retried on failure. The system supports multiple tool types: file operations, command execution, package installation, web search, and URL scraping.
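To make the tag-detection step concrete, here is a sketch that pulls complete <edward_sandbox> blocks out of buffered stream text. It assumes attribute-free open tags and treats the tag body as opaque text; the real tag grammar in parser.ts is not shown in the source and may differ.

```typescript
const OPEN_TAG = "<edward_sandbox>";
const CLOSE_TAG = "</edward_sandbox>";

interface ExtractResult {
  // Bodies of fully closed tool blocks, in order of appearance.
  toolBodies: string[];
  // Plain text found outside tool blocks.
  text: string;
  // An opened-but-unclosed block to keep buffering until more data arrives.
  pending: string;
}

function extractToolCalls(buffer: string): ExtractResult {
  const toolBodies: string[] = [];
  let text = "";
  let cursor = 0;

  while (true) {
    const open = buffer.indexOf(OPEN_TAG, cursor);
    if (open === -1) {
      // No further tool blocks: the rest is plain text.
      text += buffer.slice(cursor);
      return { toolBodies, text, pending: "" };
    }
    text += buffer.slice(cursor, open);

    const close = buffer.indexOf(CLOSE_TAG, open + OPEN_TAG.length);
    if (close === -1) {
      // Tag opened but not yet closed: wait for more stream data.
      return { toolBodies, text, pending: buffer.slice(open) };
    }
    toolBodies.push(buffer.slice(open + OPEN_TAG.length, close).trim());
    cursor = close + CLOSE_TAG.length;
  }
}
```

One limitation of this sketch: an open tag that is itself split across chunks (for example a chunk ending in "<edward_sa") is returned as plain text, so a production parser would also need to hold back a possible partial tag.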

Q: How does the streaming implementation work for different LLM providers?

Answer: The streaming implementation in lib/llm/provider.client.ts handles provider-specific streaming APIs. For OpenAI, it uses the Responses API with stream: true and extracts text deltas from the stream. For Anthropic, it uses the messages API with streaming and extracts text from content_block_delta events. For Gemini, it uses the generateContentStream API and extracts text from chunks. Each provider has different event formats and requires different parsing logic. The streamResponse function is an async generator that yields text chunks uniformly across providers. The function also handles abort signals for cancellation and token usage tracking (for Anthropic). This abstraction allows switching providers without changing the calling code.
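The per-provider parsing can be illustrated with a function that pulls the text delta out of one raw stream event. The event shapes below are simplified from the providers' public streaming formats, and extractTextDelta is a hypothetical name; the real streamResponse generator would wrap logic like this around each SDK's stream.

```typescript
type Provider = "openai" | "anthropic" | "gemini";

function isRecord(value: unknown): value is Record<string, unknown> {
  return typeof value === "object" && value !== null;
}

// Returns the text carried by one provider stream event, or "" if the
// event carries no text (metadata, usage, stop events, and so on).
function extractTextDelta(provider: Provider, event: unknown): string {
  if (!isRecord(event)) return "";

  switch (provider) {
    case "openai": {
      // Responses API: text arrives in "response.output_text.delta" events.
      const delta = event["delta"];
      return event["type"] === "response.output_text.delta" && typeof delta === "string" ? delta : "";
    }
    case "anthropic": {
      // Messages API: text arrives in content_block_delta events of type text_delta.
      const delta = event["delta"];
      if (event["type"] !== "content_block_delta" || !isRecord(delta)) return "";
      const text = delta["text"];
      return delta["type"] === "text_delta" && typeof text === "string" ? text : "";
    }
    case "gemini": {
      // generateContentStream: each chunk holds candidates with content parts.
      const candidates = event["candidates"];
      const first: unknown = Array.isArray(candidates) ? candidates[0] : undefined;
      if (!isRecord(first)) return "";
      const content = first["content"];
      if (!isRecord(content)) return "";
      const parts = content["parts"];
      if (!Array.isArray(parts)) return "";
      return parts
        .map((part: unknown) => {
          const text = isRecord(part) ? part["text"] : undefined;
          return typeof text === "string" ? text : "";
        })
        .join("");
    }
  }
}
```

An async generator can then loop over a provider's stream and yield every non-empty result, which gives callers the same chunk type regardless of provider.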

Q: How does the cost optimization work?

Answer: Cost optimization happens through several strategies: (1) Context window validation prevents sending excessive tokens, (2) Compact prompt profiles reduce instruction overhead, (3) Skill compaction reduces history size, (4) Pre-verified dependencies avoid redundant tool calls, (5) Token counting before each LLM call helps users stay within limits, (6) The system tracks token usage per run and exposes it in the UI, (7) Users can select their preferred model to balance cost vs. quality. The system also caches prompt segments where possible. For vision models, images are resized to reduce image token costs. These optimizations help control LLM API costs while maintaining functionality.

Q: How does the fallback strategy work for LLM failures?

Answer: The system has multiple fallback strategies for LLM failures: (1) For OpenAI legacy models, it falls back to the completions endpoint if the Responses API fails, (2) For network errors, the system retries with exponential backoff, (3) For context limit errors, the system truncates history and retries, (4) For rate limit errors, the system waits and retries, (5) For authentication errors, the system prompts the user to update their API key, (6) For provider outages, the system can fall back to an alternative provider if configured. The fallback logic is implemented in the LLM client and in the agent loop runner. These fallbacks ensure resilience against transient failures and provider issues.
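The decision table above can be sketched as a pure function that maps a classified failure to a recovery action. The failure categories, action names, and default delays are assumptions chosen for the example; real code would usually add random jitter to the backoff.

```typescript
type FailureKind = "network" | "context_limit" | "rate_limit" | "auth" | "provider_outage";

type RecoveryAction =
  | { kind: "retry"; delayMs: number }
  | { kind: "truncate_history_and_retry" }
  | { kind: "ask_user_for_api_key" }
  | { kind: "switch_provider"; provider: string }
  | { kind: "give_up" };

interface RecoveryOptions {
  maxAttempts: number;
  // Server-provided hint for rate limits, if any.
  retryAfterMs?: number;
  // Alternative provider, if one is configured.
  fallbackProvider?: string;
}

// Exponential backoff: base * 2^attempt, capped at maxMs (no jitter here).
function backoffDelayMs(attempt: number, baseMs = 500, maxMs = 30000): number {
  return Math.min(maxMs, baseMs * 2 ** attempt);
}

function planRecovery(failure: FailureKind, attempt: number, options: RecoveryOptions): RecoveryAction {
  // Retrying cannot fix a bad key, so surface it to the user immediately.
  if (failure === "auth") return { kind: "ask_user_for_api_key" };
  if (attempt >= options.maxAttempts) return { kind: "give_up" };

  switch (failure) {
    case "network":
      return { kind: "retry", delayMs: backoffDelayMs(attempt) };
    case "rate_limit":
      return { kind: "retry", delayMs: options.retryAfterMs ?? backoffDelayMs(attempt) };
    case "context_limit":
      return { kind: "truncate_history_and_retry" };
    case "provider_outage":
      return options.fallbackProvider
        ? { kind: "switch_provider", provider: options.fallbackProvider }
        : { kind: "give_up" };
  }
}
```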

Q: How does the multi-modal content handling work?

Answer: Multi-modal content (text + images) is handled through the multimodal-utils/service.ts. The parseMultimodalContent function parses the content array which can contain text strings and image objects. Images are uploaded to S3 and the URLs are included in the LLM request. For OpenAI, images are formatted in the GPT-4 Vision format with image URLs. For Anthropic, images are base64-encoded in the messages. For Gemini, images are passed as inline data. The system validates that the selected model supports vision before including images. The buildMultimodalContentForLLM function formats the content appropriately for each provider. This allows users to include screenshots or reference images in their prompts.

Docker Deep Dive

Q: How does the container image strategy work?

Answer: The container image strategy uses a base Node.js image with framework-specific variations. The default image is a Node.js Alpine image with common build tools. Framework-specific images (Next.js, Vite, etc.) include framework-specific dependencies and build tools. Images are pulled from a Docker registry configured via DOCKER_REGISTRY_BASE. The system supports a prewarm strategy where the base image is pulled during server startup to reduce cold start latency. Images are tagged with version identifiers for reproducibility. The image selection happens in the template registry based on the detected or specified framework. This strategy balances image size (using Alpine) with functionality (including necessary tools).

Q: How does the multi-stage build work?

Answer: Multi-stage builds are not currently used in Edward's sandbox images, but could be implemented for optimization. The current approach uses single-stage images with all tools included. Multi-stage builds would involve: (1) A builder stage with build tools (gcc, make, Python) for compiling native modules, (2) A runtime stage with only the Node.js runtime and compiled artifacts, (3) Copying artifacts from builder to runtime. This would reduce final image size by excluding build tools. The system could implement this if image size becomes a concern. Currently, the priority is on having all tools available in the container for flexibility.

Q: How does the container networking work?

Answer: Container networking is managed dynamically based on operation needs. Containers are created without network access by default. When dependency installation or building is needed, the container is connected to a Docker network that provides internet access via the connectToNetwork function. After the operation completes, the container is disconnected via disconnectContainerFromNetwork. This just-in-time networking reduces security surface area by only granting network access when needed. The network configuration is handled by the Docker daemon and doesn't require complex network setup. The system uses Docker's default bridge network for simplicity.

Q: How does volume management work?

Answer: Edward does not use Docker bind mounts or named volumes for sandbox files. The container's working directory (/app) lives in the container's own writable filesystem layer, so files written there persist only for the lifetime of the container and are never mounted to the host. For backup, files are copied from the container to S3 via the Docker exec API, not via volume mounts. This approach avoids host filesystem dependencies and makes the system more portable. The system could be extended to use named volumes for persistent storage if needed, but currently uses ephemeral containers with backup to S3.

Q: How does the container resource limiting work?

Answer: Edward does not currently set per-container resource limits; resource control is left to the Docker daemon and host configuration. By default Docker places no memory or CPU cap on a container (containers simply receive equal CPU shares under contention), which means one runaway container could affect others. For production deployments, limits could be configured via Docker options or container orchestration (Kubernetes). This is a potential area for improvement - setting CPU and memory limits per sandbox would prevent resource contention between concurrent builds.

Redis Deep Dive

Q: What Redis data structures are used?

Answer: Edward uses several Redis data structures: (1) Strings for distributed locks (SET NX PX), (2) Strings for flush markers with TTL, (3) BullMQ-managed queue structures (lists for waiting jobs, plus the sorted sets and hashes BullMQ maintains internally for delayed jobs and job data), (4) Pub/Sub channels for build status and run event updates, (5) Hashes are not used directly by application code, though they could hold more complex state. The application itself primarily uses simple string operations for locks and markers, while BullMQ manages its own queue keys. Pub/Sub is used for real-time communication between the API server and workers. The key naming convention follows patterns like edward:locking:provision:{chatId} for locks and edward:flush:due:{sandboxId} for flush markers.

Q: How does the pub/sub implementation work?

Answer: The pub/sub implementation is in lib/redisPubSub.ts. The system creates two Redis clients: a subscriber client for receiving messages and a publisher client for sending messages. The subscriber client subscribes to channels like edward:build-status:{chatId} for build updates. The publisher client publishes messages to these channels when status changes occur. The API server uses pub/sub to notify clients of build status changes without polling. The workers use pub/sub to coordinate (e.g., signaling when a build completes). The pub/sub implementation handles reconnection on connection drops and includes error handling.

Q: How does the key naming convention work?

Answer: The Redis key naming convention uses a hierarchical pattern with prefixes: edward: for all Edward keys, followed by the subsystem (e.g., locking:, flush:, queue:), followed by the resource identifier (chatId, sandboxId, etc.). For example: edward:locking:provision:{chatId} for sandbox provisioning locks, edward:flush:due:{sandboxId} for flush markers, edward:build-status:{chatId} for pub/sub channels. BullMQ queues use keys like bull:build-queue and bull:agent-run-queue. This convention makes it easy to identify keys in Redis CLI and prevents key collisions.
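One way to keep such a convention consistent is to build every key through a single helper module. The sketch below uses the three key patterns quoted above; the redisKeys object and its method names are hypothetical.

```typescript
const KEY_PREFIX = "edward";

// Central place for key construction, so call sites never hand-assemble keys.
const redisKeys = {
  // Lock held while a sandbox is being provisioned for a chat.
  provisionLock: (chatId: string): string => `${KEY_PREFIX}:locking:provision:${chatId}`,
  // Marker indicating a sandbox flush is due.
  flushDue: (sandboxId: string): string => `${KEY_PREFIX}:flush:due:${sandboxId}`,
  // Pub/sub channel carrying build status updates for a chat.
  buildStatusChannel: (chatId: string): string => `${KEY_PREFIX}:build-status:${chatId}`,
} as const;
```

For example, `redisKeys.provisionLock("chat_123")` yields `edward:locking:provision:chat_123`, which can be located in the Redis CLI with a pattern such as `edward:locking:*`.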

Q: How does memory management work?

Answer: Edward bounds its own Redis memory usage mainly through key expiry. It sets TTL on keys that should expire (flush markers have TTL equal to sandbox TTL, locks have TTL based on operation duration). Queue data is managed by BullMQ, which has its own job retention policies. The system doesn't currently configure Redis maxmemory settings, relying on the Redis server's default configuration. For production, it would be important to set maxmemory, monitor memory usage, and choose the eviction policy deliberately: BullMQ's documentation calls for maxmemory-policy noeviction on the Redis instance backing its queues, because evicting queue keys can lose or corrupt jobs, so an LRU policy such as allkeys-lru is only appropriate on a separate cache-only instance. The system could also implement key expiration for any chat history or run event data cached in Redis to prevent unbounded memory growth.

Q: How does the persistence configuration work?

Answer: Redis persistence configuration is managed at the Redis server level, not by the application. Edward uses Redis's default persistence (typically RDB snapshots). For production, AOF (Append Only File) persistence might be preferred for durability. The application doesn't configure persistence - it relies on the Redis server configuration. This is acceptable because queue data (BullMQ) can be recovered from the database if Redis is lost, and transient data (locks, flush markers) can be recreated. For critical data that must survive Redis restarts, the system uses the database as the source of truth.

Security Deep Dive

Q: How does XSS prevention work?

Answer: XSS prevention is implemented through multiple layers: (1) React's automatic escaping of JSX content prevents most XSS, (2) The CSP (Content Security Policy) header restricts script sources to trusted domains and doesn't allow unsafe-inline or unsafe-eval, which blocks injected inline scripts, (3) User input is sanitized before being rendered in certain contexts, (4) The file editor uses Monaco, which displays file contents as text rather than interpreting them as HTML, (5) Preview iframes are loaded with the sandbox attribute to restrict their capabilities and limit their access to the parent page. For LLM-generated content in the chat, the system currently trusts the LLM output but could add sanitization if needed.

Q: How does CSRF protection work?

Answer: CSRF protection is implemented through SameSite cookie settings and the authentication flow. Better Auth sets cookies with SameSite=Lax by default, which prevents CSRF in most scenarios because browsers do not attach the cookie to cross-site POST, PUT, or DELETE requests. The API does not explicitly check the Origin header for state-changing requests in the current code, and it does not issue CSRF tokens; either could be added for extra protection. Currently, the reliance on SameSite cookies and the fact that the API doesn't accept cross-origin requests from untrusted domains is considered sufficient CSRF protection for the current threat model.
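If an explicit Origin check were added for state-changing requests, it might look like the sketch below. This is not part of Edward's current code; the function name, the allow-list, and the choice to reject requests with no Origin header are all assumptions.

```typescript
// Returns true only when the Origin header exactly matches an allowed origin.
function isTrustedOrigin(originHeader: string | undefined, allowedOrigins: readonly string[]): boolean {
  // Policy choice for this sketch: a state-changing request with no Origin
  // header is rejected rather than allowed.
  if (!originHeader) return false;
  try {
    // Normalizes the value and rejects junk such as the literal "null".
    const origin = new URL(originHeader).origin;
    return allowedOrigins.includes(origin);
  } catch {
    return false;
  }
}
```

A middleware could call this for POST, PUT, PATCH, and DELETE requests and respond with 403 when it returns false.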

Q: How does SQL injection prevention work?

Answer: SQL injection prevention is handled by Drizzle ORM's parameterized queries. All database queries use Drizzle's query builder which automatically parameterizes user input. Raw SQL is only used in specific cases (like advisory locks) with hardcoded values, not user input. The sql template literal in Drizzle is used for dynamic SQL but still uses parameterization. For example, the sequence increment uses sql`${run.nextEventSeq} + 1`, which is parameterized. The system never concatenates user input into SQL strings. This approach ensures that user input cannot influence the SQL query structure.
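To show why a tagged template keeps values out of the query text, here is a toy tag function that turns interpolations into numbered placeholders. It only illustrates the principle and is not Drizzle's implementation; sqlSketch and ParameterizedQuery are made-up names.

```typescript
interface ParameterizedQuery {
  // Query text with $1, $2, ... placeholders instead of values.
  text: string;
  // Values sent to the database separately from the text.
  params: unknown[];
}

// A tag function receives the literal chunks and the interpolated values
// separately, so values never get spliced into the SQL string.
function sqlSketch(strings: TemplateStringsArray, ...values: unknown[]): ParameterizedQuery {
  let text = "";
  strings.forEach((chunk, index) => {
    text += chunk;
    if (index < values.length) {
      text += `$${index + 1}`;
    }
  });
  return { text, params: values };
}
```

With a hostile input such as `x'; DROP TABLE users; --`, the query text stays `SELECT * FROM users WHERE name = $1` and the input travels only in `params`, so it cannot change the query structure.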

Q: How does secrets management work?

Answer: Secrets management relies on environment variables and encryption at rest. Sensitive data (API keys, encryption keys, AWS credentials) are stored as environment variables and never committed to git. The ENCRYPTION_KEY is used to encrypt user API keys before database storage. The encryption key itself is an environment variable that must be provided at deployment. In production, this would typically come from a secrets manager (AWS Secrets Manager, HashiCorp Vault, etc.). The system doesn't currently implement secret rotation, but this could be added by re-encrypting user API keys when the encryption key changes.
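Encrypting a user API key before storage could be sketched as follows with Node's built-in crypto module. The source only says "AES encryption", so the AES-256-GCM mode, the 12-byte random IV, and the dot-separated base64 storage format are assumptions of this example.

```typescript
import { createCipheriv, createDecipheriv, randomBytes } from "node:crypto";

// Encrypts with AES-256-GCM. `key` must be 32 bytes (for example derived
// from the ENCRYPTION_KEY environment variable).
function encryptSecret(plaintext: string, key: Buffer): string {
  const iv = randomBytes(12); // fresh IV per encryption
  const cipher = createCipheriv("aes-256-gcm", key, iv);
  const ciphertext = Buffer.concat([cipher.update(plaintext, "utf8"), cipher.final()]);
  const authTag = cipher.getAuthTag();
  // Store IV, auth tag, and ciphertext together; base64 never contains ".".
  return [iv, authTag, ciphertext].map((part) => part.toString("base64")).join(".");
}

// Reverses encryptSecret; throws if the data was tampered with or the key is wrong.
function decryptSecret(encoded: string, key: Buffer): string {
  const [ivB64, tagB64, dataB64] = encoded.split(".");
  if (!ivB64 || !tagB64 || dataB64 === undefined) {
    throw new Error("Malformed ciphertext");
  }
  const decipher = createDecipheriv("aes-256-gcm", key, Buffer.from(ivB64, "base64"));
  decipher.setAuthTag(Buffer.from(tagB64, "base64"));
  const plaintext = Buffer.concat([
    decipher.update(Buffer.from(dataB64, "base64")),
    decipher.final(),
  ]);
  return plaintext.toString("utf8");
}
```

An authenticated mode like GCM means a modified database value fails decryption instead of silently yielding garbage. Rotating the key would involve decrypting each stored value with the old key and re-encrypting it with the new one.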

Q: How does audit logging work?

Answer: Audit logging is implemented through the security telemetry middleware and Sentry integration. Security events (auth failures, rate limit violations, HTTP anomalies) are logged with context including request ID, user ID, IP, path, and timestamp. These logs are sent to the logging system (console, file, or external log aggregation). Sentry captures errors with user context for debugging. The system could be extended to write audit events to a dedicated audit log table in the database for long-term retention and compliance. Currently, audit logs are primarily for operational monitoring and security incident response.