CREATOR.md - Bonkers Monorepo Architecture Document
System: Bonkers Monorepo
Date: March 2026
Author: Principal Engineering Team
Purpose: Zero-compromise architecture and engineering knowledge transfer
1. SYSTEM OVERVIEW
1.1 What Is Bonkers
Bonkers is a TypeScript monorepo containing a multi-platform AI application stack.
Applications:
- Website (apps/website) - Next.js 14 web application (port 3001)
- Extension (apps/extension) - Chrome Extension (Manifest V3, Vite)
- Arcane (apps/arcane) - Express.js API server (port 8080)
- Session Manager (apps/session-manager) - Session state synchronization service
Shared Packages:
- packages/app-config - Configuration (models, prompts, feature flags)
- packages/components - Reusable React components
- packages/hooks - Custom React hooks
- packages/types - Shared TypeScript types
- packages/utils - Utility functions
- packages/config - ESLint, Prettier, TypeScript configs
- packages/assets - Static assets
Infrastructure:
- Package Manager: pnpm 9.15.5 (workspaces)
- Build System: Turborepo
- Backend Framework: Express-Zod-API
- Database: Firestore (GCP)
- Cache: Redis
- Deployment: Vercel (frontend), Cloud Run (backend)
1.2 Architecture Layers
┌─────────────────────────────────────────────────────────────┐
│ CLIENT LAYER │
│ ┌──────────────┐ ┌──────────────┐ │
│ │ Website │ │ Extension │ │
│ │ (Next.js) │ │ (Vite/CRX) │ │
│ └──────┬───────┘ └──────┬───────┘ │
└─────────┼─────────────────┼──────────────────────────────────┘
│ │
└─────────────────┼──────────────────────────────────┘
│
▼
┌─────────────────────────────────────────────────────────────┐
│ API GATEWAY / LOAD BALANCER │
│ (Firebase Auth + GCP Load Balancer) │
└─────────────────────────────────────────────────────────────┘
│
▼
┌─────────────────────────────────────────────────────────────┐
│ APPLICATION LAYER │
│ │
│ ┌───────────────────────────────────────────────────────┐ │
│ │ ARCANE BACKEND (Express + TypeScript) │ │
│ │ │ │
│ │ Middleware Chain: │ │
│ │ Auth → RateLimit → Context → Preware → Handler │ │
│ │ → Postware → Analytics │ │
│ └───────────────────────────────────────────────────────┘ │
│ ┌───────────────────────────────────────────────────────┐ │
│ │ SESSION MANAGER (Real-time State Sync) │ │
│ └───────────────────────────────────────────────────────┘ │
└─────────────────────────────────────────────────────────────┘
│
▼
┌─────────────────────────────────────────────────────────────┐
│ DATA LAYER │
│ ┌──────────────┐ ┌──────────────┐ ┌──────────────────┐ │
│ │ Firestore │ │ Redis │ │ GCP Storage │ │
│ │ (Primary) │ │ (Cache) │ │ (Files) │ │
│ └──────────────┘ └───────┬──────┘ └──────────────────┘ │
└────────────────────────────┼────────────────────────────────┘
│
▼
┌─────────────────────────────────────────────────────────────┐
│ EXTERNAL AI PROVIDERS │
│ OpenAI • Anthropic • Google • Meta • Mistral • Fal AI │
└─────────────────────────────────────────────────────────────┘
2. REPOSITORY STRUCTURE
2.1 Monorepo Layout
bonkers/
├── apps/
│ ├── website/ # Next.js web application
│ ├── extension/ # Chrome extension (Vite)
│ ├── arcane/ # Backend API server
│ └── session-manager/ # Session state service
│
├── packages/
│ ├── app-config/ # Shared configuration
│ ├── components/ # React components
│ ├── hooks/ # React hooks
│ ├── types/ # TypeScript types
│ ├── utils/ # Utilities
│ ├── config/ # ESLint/Prettier/TS configs
│ └── assets/ # Static assets
│
├── docs/ # Documentation
├── patches/ # npm patches
├── package.json # Root package.json
├── pnpm-workspace.yaml # Workspace config
├── turbo.json # Turborepo config
└── .github/ # GitHub configs
2.2 Key Configuration Files
Root package.json:
{
"name": "cauldron",
"packageManager": "pnpm@9.15.5",
"engines": { "node": "^20", "pnpm": ">=9" },
"scripts": {
"build": "turbo run build",
"dev": "turbo run dev",
"init": "git submodule update --init --recursive && pnpm i",
"lint": "turbo run lint"
}
}
turbo.json:
{
"tasks": {
"build": {
"dependsOn": ["^build"],
"outputs": [".next/**", "!.next/cache/**"]
},
"dev": { "cache": false, "persistent": true },
"lint": { "dependsOn": ["^lint"] }
}
}
pnpm-workspace.yaml:
packages:
  - "apps/*"
  - "packages/*"
3. APPLICATION ARCHITECTURE
3.1 Website (apps/website/)
Tech Stack: Next.js 14, React 18, TypeScript, Tailwind CSS, shadcn/ui
Key Files:
| File | Purpose |
|---|---|
| middleware.ts | Auth routing, locale prefixing, cookie management |
| navigation.ts | Custom navigation (replaces next/link) |
| auth/auth.config.ts | NextAuth configuration, token refresh |
| auth/auth.cookies.ts | Cookie configuration for production |
| next.config.js | Transpile packages, image domains |
| tailwind.config.ts | Theme, plugins, typography |
Middleware Flow:
// middleware.ts
1. Clear old auth cookies
2. Check authentication state
3. Block routes based on auth (guest, emailNotVerified, normal)
4. Apply locale prefix (i18n)
5. Redirect unauthorized access
Project Requirements:
- Do NOT use next/link - use the custom navigation.ts
- Do NOT use useRouter from next/navigation - use the wrapper
- All API calls use axios
- All data fetching is wrapped in react-query
3.2 Extension (apps/extension/)
Tech Stack: Vite, React 18, Manifest V3, Tailwind CSS
Key Files:
| File | Purpose |
|---|---|
| manifest.config.ts | Extension manifest (permissions, content scripts) |
| src/background/index.ts | Service worker entry point |
| src/background/background.messages.ts | Message handler |
| src/contents/index.ts | Content script (injected into pages) |
| src/lib/storage.ts | LocalStorageInstance (NOT chrome.storage) |
Architecture:
┌─────────────────┐
│ Background │ ← Service worker (event-driven; not persistent under Manifest V3)
│ Service Worker │
└────────┬────────┘
│ Messages
▼
┌─────────────────┐
│ Content │ ← Injected into all URLs
│ Scripts │
└────────┬────────┘
│ DOM Access
▼
┌─────────────────┐
│ Web Page │
│ (Any URL) │
└─────────────────┘
Project Requirements:
- Do NOT use chrome.storage - use LocalStorageInstance
- All API calls proxied through background script
- Content scripts run at document_end
3.3 Arcane Backend (apps/arcane/)
Tech Stack: Express, Express-Zod-API, TypeScript, Firebase Admin
Entry Point: src/index.ts
import { createServer } from "express-zod-api";
import { config } from "./config/config";
import { routing } from "./config/routing";
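// isDevEnv, generateTypes and generateDocumentation are project helpers (their imports are omitted in this excerpt)
if (isDevEnv) {
  await generateTypes();
  await generateDocumentation();
}
await createServer(config, routing);
Directory Structure: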
apps/arcane/src/
├── config/
│ ├── config.ts # Server config, CORS, Swagger
│ ├── logger.ts # Pino logger
│ └── routing.ts # ALL API routes
├── server/
│ ├── constantsSchemasAndTypes/ # Zod schemas, types
│ ├── endpoints/ # Route handlers
│ ├── factories/ # Endpoint factories
│ ├── middlewares/ # Request middlewares
│ ├── models/ # Business logic models
│ ├── repositories/ # Data access layer
│ ├── services/ # External integrations
│ └── utilities/ # Helpers
└── index.ts
Middleware Chain (execution order):
// apps/arcane/src/server/endpoints/unified/unified.ts
export const unified = timedAuthStreamEndpointsFactory
.addMiddleware(usageLimitsMiddleware) // 1. Check quotas
.addMiddleware(threadPreware) // 2. Setup context
.addMiddleware(providerConfigOverride) // 3. Model overrides
.addMiddleware(unifiedController) // 4. Main logic
.addMiddleware(threadPostware) // 5. Post-processing
.addMiddleware(usageAnalyticsMiddleware) // 6. Track usage
.build({...});
3.4 Session Manager (apps/session-manager/)
Tech Stack: Express, Express-Zod-API, Firebase Admin, jose
Purpose: Real-time session state synchronization across devices
Key Dependencies:
- express-zod-api - API framework
- firebase-admin - Auth verification
- jose - JWT handling
- @panva/hkdf - Key derivation
4. BACKEND DEEP DIVE
4.1 Middleware Architecture
Init Context (middlewares/initContext/initContext.ts):
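// Sets up AsyncLocalStorage for request context
import { AsyncLocalStorage } from "node:async_hooks";
import { randomUUID } from "node:crypto";
import type { NextFunction, Request, Response } from "express";
const storage = new AsyncLocalStorage<RequestContext>();
export const initContext = (req: Request, res: Response, next: NextFunction) => {
  const context = { requestId: randomUUID(), startTime: Date.now() };
  // Everything invoked from next() onward sees this store
  storage.run(context, () => next());
};
Auth (middlewares/auth/auth.ts):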
// 1. Extract JWT from Authorization header
// 2. Verify with Firebase Admin SDK
// 3. Create User instance, load from Firestore
// 4. Check/update user status
// 5. Store in request context
Usage Limits (middlewares/usageLimits/usageLimits.ts):
// 1. Check monthly usage < limit
// 2. Check daily usage < limit
// 3. Reset if reset time passed
// 4. Throw 429 if exceeded
Thread Preware (middlewares/threadPreware/threadPreware.ts):
// CRITICAL: Sets up entire request context
// 1. Determine API version (V1/V2)
// 2. Load user settings
// 3. Initialize/load thread
// 4. Load personalization
// 5. Start async side actions
// 6. Initialize SSE stream
// 7. Process attachments
// 8. Create message nodes
// 9. Start content moderation
// 10. Calculate context window
// 11. Load history with embeddings
// 12. Initialize Schema
// 13. Store in request context
// 14. Update thread
4.2 Request Context (AsyncLocalStorage)
interface RequestContext {
user: TUserDoc;
userInstance: User;
chatNode: Thread;
userMessageNode: Message;
assistantMessageNode: Message;
schema: Schema;
eventManager?: EventManager;
settingsV3: TSettings;
personalization: TPersonalization;
executionContext: 'USER' | 'TASK';
}
// Access anywhere after initContext
import { requestContext } from '../repositories/context/requestContext';
const { user, chatNode } = requestContext.get();
4.3 Models
User Model (models/user.ts):
class User {
decodedToken: DecodedIdToken;
user: TUserDoc | null;
uid: string | null;
shouldUpdate: boolean;
async init() {
// 1. Get user from Firebase Auth
// 2. Load user document from Firestore
// 3. Handle guest → free conversion
// 4. Check email verification
// 5. Handle temporary Pro rewards
}
async updateUser() {
// Save to Firestore if shouldUpdate
}
}
Thread Model (models/thread.ts):
class Thread implements TChatDB {
chatId: string;
threadMap: Map<string, Message>;
ref: DocumentReference;
static async init(chatId, createNew, config) {
// Load from Firestore, repair orphans, create if missing
}
async getActiveThreadWithEmbeddings(messageId, query, contextWindow) {
// Traverse backwards, accumulate messages, query embeddings
}
async update({ userMessage, assistantMessage }, options) {
// Save to Firestore with conflict retry logic
}
}
4.4 Repositories
Context (repositories/context/requestContext.ts):
const storage = new AsyncLocalStorage<RequestContext>();
export const requestContext = {
get() { return storage.getStore() || {}; },
set(context) {
const store = storage.getStore() || {};
Object.assign(store, context);
}
};
Schema (repositories/engine/schema.ts):
class Schema {
schema: TSchema;
setModel(model: TLLMModels, basedOnPlan: boolean) {
this.schema.model = model;
this.setMaxTokens(basedOnPlan);
}
injectPrompt({ message, messages, prompt, variables }) {
// Add system/user messages, validate placeholders
}
}
Side Actions (repositories/sideActions/sideActions.ts):
// Run async operation (non-blocking)
sideActions.run(ActionTypes.CONTENT_MODERATION);
// Run with custom handler
sideActions.runCustom(fn, args, actionType);
// Wait for result (blocking)
const result = await sideActions.wait(ActionTypes.SELECT_MODEL);
Streamer (repositories/streamer/streamer.ts):
// Initialize SSE
const init = (response: Response) => {
response.writeHead(200, {
'Content-Type': 'text/event-stream',
'Cache-Control': 'no-cache',
'Connection': 'keep-alive',
});
};
// Stream chunks
const stream = async (req, response, inputStream) => {
// Process chunks, send SSE events
};
// End stream
const end = async (response: Response) => {
response.write(getMessage(END_MESSAGE));
response.end();
};
Inter-Request Communication (repositories/irc/irc.ts):
// Redis pub/sub for cross-request communication
class InterRequestCommunicator {
static async addSubscription(ircId, callback) {
await this.redisSub.subscribe(channelId);
this.subList.set(channelId, callback);
}
static async sendMessage(ircId, message) {
await redisPub.publish(channelId, JSON.stringify(payload));
}
}
5. DATA MODELS
5.1 User Document
interface TUserDoc {
uid: string;
email: string;
role: 'guest' | 'free' | 'paid' | 'member' | 'owner' | 'tools';
userPlan: 'GUEST' | 'FREE' | 'PRO' | 'PRO_TEMP' | 'TOOLS';
provider: 'anonymous' | 'google' | 'microsoft' | 'apple';
photoUrl: string;
monthlyUsage: {
cost: number;
resetsAt: number;
models?: Record<string, number>;
};
dailyUsage: {
cost: number;
resetsAt: number;
models?: Record<string, number>;
};
settings?: TUserSettings;
preferences?: TUserPreferences;
connectedApps?: { gdrive?: {...} };
connectedComposioApps?: Record<string, any>;
notificationTokensByDeviceId?: Record<string, any>;
topUps?: Array<{ amount: number; expiresAt: number }>;
}
5.2 Thread Document
interface TChatDB {
chatId: string;
userId: string;
projectId?: string;
mode: TChatModes;
metadata: {
title?: string;
folderId?: string;
chatbot?: { id: string; name: string };
projectId?: string;
};
createdAt: Timestamp;
updatedAt: Timestamp;
}
5.3 Message Document (V2)
interface TMessageDBV2 {
id: string;
chatId: string;
parentId?: string;
childrenId: string[];
role: 'user' | 'assistant' | 'system';
contentV2: TMessageContentV2;
modelId?: TLLMModels;
timestamp: Timestamp;
tokens: number;
attachments?: TAttachment[];
isPending: boolean;
error?: { type: ErrorType; message: string };
}
type TMessageContentV2 = Array<
| { type: 'TEXT'; text: string; tokens: number; reasoning?: string }
| { type: 'TOOL_RESULT'; toolResults: Array<{ name: string; content: string; tokens: number }> }
| { type: 'PROGRESS'; name: string; icon: string; steps: TProgressStep[]; metadata?: any }
>;
6. API ROUTES
6.1 Public Routes
| Route | Method | Description |
|---|---|---|
| /v1/public/health | GET | Health check |
| /v1/rewards | GET | Get rewards |
| /v1/register-ads | POST | Register ad views |
6.2 Private Routes (Auth Required)
Thread:
| Route | Method | Description |
|---|---|---|
| /v1/thread/unified | POST | Main endpoint |
| /v1/thread/stop | POST | Stop generation |
| /v1/thread/message | POST | Send message |
Canvas:
| Route | Method | Description |
|---|---|---|
| /v1/user/canvas/:canvasId | GET | Get canvas content |
| /v1/user/canvas/:canvasId | POST | Update canvas content |
Canvas Architecture:
- Canvas content stored in GCP Storage (GCS) as JSON
- Path: {uid}/canvas/{canvasId}.json
- Structure: { values: TCanvasValues[], history: { undos: [], redos: [] } }
- Supports version history with undo/redo
- Content type: application/json (gzipped)
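The canvas structure above can be sketched as a pair of stacks. This is a minimal illustration, not the actual controller: `CanvasValue` stands in for `TCanvasValues` (whose fields are not shown in this document), and storing whole snapshots in `undos`/`redos` is an assumption about what those arrays hold.

```typescript
// Hypothetical element shape; the real TCanvasValues is not shown in this document.
type CanvasValue = { id: string; text: string };

interface CanvasDoc {
  values: CanvasValue[];
  // Assumption: each history entry is a full snapshot of `values`.
  history: { undos: CanvasValue[][]; redos: CanvasValue[][] };
}

// Apply an edit: push the current values onto the undo stack and clear redos.
function applyEdit(doc: CanvasDoc, next: CanvasValue[]): CanvasDoc {
  return {
    values: next,
    history: { undos: [...doc.history.undos, doc.values], redos: [] },
  };
}

// Undo: restore the latest snapshot and remember the current values for redo.
function undo(doc: CanvasDoc): CanvasDoc {
  const previous = doc.history.undos[doc.history.undos.length - 1];
  if (previous === undefined) return doc; // nothing to undo
  return {
    values: previous,
    history: {
      undos: doc.history.undos.slice(0, -1),
      redos: [...doc.history.redos, doc.values],
    },
  };
}

// Redo: the mirror image of undo.
function redo(doc: CanvasDoc): CanvasDoc {
  const following = doc.history.redos[doc.history.redos.length - 1];
  if (following === undefined) return doc; // nothing to redo
  return {
    values: following,
    history: {
      undos: [...doc.history.undos, doc.values],
      redos: doc.history.redos.slice(0, -1),
    },
  };
}

// Object path inside the bucket, following the {uid}/canvas/{canvasId}.json pattern.
function canvasPath(uid: string, canvasId: string): string {
  return `${uid}/canvas/${canvasId}.json`;
}
```

Keeping the functions pure means the whole document can be serialized to JSON and written back in one upload after each edit.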
User:
| Route | Method | Description |
|---|---|---|
| /v1/user/status | GET | Get user status |
| /v1/user/history | GET | List history |
| /v1/user/settings | GET/POST | Get/set settings |
| /v1/user/shareChat | POST | Share chat |
Projects:
| Route | Method | Description |
|---|---|---|
| /v1/projects | GET | List projects |
| /v1/projects/create | POST | Create project |
| /v1/projects/:id | GET/DELETE | Get/archive project |
Tools:
| Route | Method | Description |
|---|---|---|
| /v1/tools/text/:toolId | POST | Text tools |
| /v1/tools/image/:toolId | POST | Image tools |
| /v1/tools/ai-detector | POST | AI detection |
Wallflower (Image Generation):
| Route | Method | Description |
|---|---|---|
| /v1/wallflower/image-generation | POST | Generate images |
| /v1/wallflower/images | GET | Get image history |
| /v1/wallflower/like | POST | Like image |
| /v1/wallflower/pin-image | POST | Pin image |
Full Route List: apps/arcane/src/config/routing.ts
7. TEMPLATES (WALLFLOWER)
7.1 Available Templates
Templates are pre-configured image generation presets:
| Template ID | Name | Model | Description |
|---|---|---|---|
| ghibli-style | Ghiblify | gpt-image-1-medium | Convert to Studio Ghibli style |
| watermark-remover | Watermark Remover | gemini-2.0-flash-exp | Remove watermarks |
| product-photography | Product Photography | gpt-image-1-high | Professional product shots |
| make-me-bald | Make Me Bald | gemini-2.0-flash-exp | Bald transformation |
| minecraft-style | Minecraft Style | gpt-image-1-medium | Minecraft block style |
| simpson-style | Simpson Style | gpt-image-1-high | Simpsons cartoon style |
| pixar-style | Pixar Style | gpt-image-1-high | Pixar 3D animation style |
| humanize-my-pet | Humanize My Pet | gpt-image-1-medium | Pet to human transformation |
7.2 Template Structure
// apps/arcane/src/server/constantsSchemasAndTypes/wallflower/templates.constants.ts
export const TEMPLATES = {
"ghibli-style": {
id: "ghibli-style",
name: "Ghiblify",
preset: {
prompt: "turn this into ghibli style",
style: "Auto",
modelConfig: {
modelId: "gpt-image-1-medium",
numberOfImages: 1,
},
},
},
// ... more templates
} as const;
7.3 Template Processing
Controller: endpoints/wallflower/unified-generation.controller.ts
case "TEMPLATE":
switch (feature.templateId) {
case "product-photography":
formattedImage = await handleProductPhotographyTemplate({...});
break;
// ... other templates
}
Usage Limits: Templates inherit their model config from the preset; usage is calculated from the model and numberOfImages.
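As a rough sketch of that calculation: the per-image costs below are invented placeholders (the real cost table is not part of this document), and `templateUsageCost` is a hypothetical helper name.

```typescript
// Hypothetical per-image costs in usage units; the real values are not in this document.
const COST_PER_IMAGE: Record<string, number> = {
  "gpt-image-1-medium": 4,
  "gpt-image-1-high": 10,
  "gemini-2.0-flash-exp": 1,
};

interface TemplateModelConfig {
  modelId: string;
  numberOfImages: number;
}

// Usage for a template run = cost of the preset's model x number of images requested.
function templateUsageCost(config: TemplateModelConfig): number {
  const perImage = COST_PER_IMAGE[config.modelId];
  if (perImage === undefined) {
    throw new Error(`No cost configured for model ${config.modelId}`);
  }
  return perImage * config.numberOfImages;
}
```

Failing loudly on an unknown model avoids silently charging zero when a new template is added without a cost entry.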
8. FALLBACK STRATEGIES
8.1 Generic Fallback Pattern
Utility: utilities/call-function-with-fallback.ts
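export async function callWithFallback<
  T extends () => unknown,
  U extends () => unknown,
>(
  primaryFunc: T,
  fallbackFunc: U,
): Promise<Awaited<ReturnType<T> | ReturnType<U>>> {
  type Result = Awaited<ReturnType<T> | ReturnType<U>>;
  try {
    return (await primaryFunc()) as Result;
  } catch (error) {
    // Log the primary failure, then let any fallback error propagate to the caller
    logger.error(error, "ERROR/PRIMARY_FUNCTION_FAILED");
    return (await fallbackFunc()) as Result;
  }
}
8.2 Image Generation Fallbacks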
Fal.ai ↔ Replicate Fallback:
// endpoints/wallflower/helpers/fal-ai.ts
export async function handleFalWithReplicateFallback({...}) {
return await callWithFallback(
async () => await callFalAI({...}),
async () => await callReplicate({
modelId: FALLBACK_MODELS_MAP[feature.modelConfig.modelId],
...
})
);
}
// endpoints/wallflower/helpers/replicate.ts
export async function handleReplicateWithFalAIFallback({...}) {
return await callWithFallback(
async () => await callReplicate({...}),
async () => await callFalAI({
modelId: FALLBACK_MODELS_MAP[feature.modelConfig.modelId],
...
})
);
}
Fallback Models Map (constantsSchemasAndTypes/wallflower/unified-generation.constants.ts):
export const FALLBACK_MODELS_MAP = {
// Maps primary models to fallback equivalents
};
8.3 Deep Research Fallbacks
Serp Query Fallback (features/deepResearch/firecrawlSerp.ts):
- Primary: Bing-based scraping
- Fallback 1: Firecrawl scraping
- Fallback 2: Basic URL fetch
// Helper function with fallback chain
async function scrapeWithFallbackMethod(urls: string[]) {
try {
// Try Firecrawl first
const result = await firecrawl.scrape(urls);
return result;
} catch (error) {
logger.error(error, "ERROR/DEEP_RESEARCH/FALLBACK_SCRAPING");
// Fall back to basic fetch
return await basicFetch(urls);
}
}
Google Search Fallback (features/deepResearch/generateSerpQueries.ts):
// Parallel Google search fallback if Bing fails
let results;
try {
  results = await bingSearch(queries);
} catch (error) {
  logger.error(error, "ERROR/DEEP_RESEARCH/GOOGLE_SEARCH_FALLBACK");
  results = await googleSearch(queries);
}
8.4 AI Detection Fallback
Service: services/aiDetection.ts
async detectWithFallback(text: string): Promise<{...}> {
try {
return await primaryDetectionAPI(text);
} catch (error) {
return await secondaryDetectionAPI(text);
}
}
Usage:
- endpoints/tools/aiDetection.ts
- endpoints/tools/public/aiDetectionPublic.ts
- endpoints/tools/aiEssayMetricsGenerator.ts
8.5 RAG Embeddings Fallback
File: endpoints/unified/features/rag.ts
const FALLBACK_ATTACHMENT_KEY = "DOCUMENT_CHUNK";
const getChunksFromFallbackLIResults = (embeddings: TLIResult[]): string[] => {
// Fallback for items without attachmentId/nodeIndex (old embeddings)
const fallbackKey = `${item.attributes?.name ?? FALLBACK_ATTACHMENT_KEY}`;
// Group and return chunks
};
// Used when:
// 1. New embedding format not available
// 2. attachmentId missing
// 3. nodeIndex missing (legacy embeddings)
8.6 Model Selection Fallback (Merlin Magic)
File: endpoints/unified/features/merlinMagic.ts
const inferenceFallbackOutput =
// Fallback model selection logic when primary inference fails
return inferenceFallbackOutput;
logger.error("ERROR/IMAGE_INFERENCE_FALLBACK");
8.7 YouTube Transcription Fallback
File: utilities/youtube/youtube.ts
// Commented out but documented fallback strategy
// const CACHE_FALLBACK_QUERY_CONSUMED = 2;
// const CACHE_FALLBACK_TOKENS_CONSUMED = 1000;
// Fallback when primary transcription method fails
8.8 MCP Tool Result Fallback
File: utilities/mcp/functions/zapMCPToolResult.ts
const MAX_FALLBACK_CHAR_LIMIT = 20000; // approx. 5k tokens
// Truncate tool results to fit within limit
if (result.length > MAX_FALLBACK_CHAR_LIMIT) {
result = result.substring(0, MAX_FALLBACK_CHAR_LIMIT);
}
8.9 Progress Event Fallback Index
File: constantsSchemasAndTypes/streamer/streamer.constants.ts
// Default fallback index for tool results as progress events
export const DEFAULT_FALLBACK_INDEX = 0;
Usage: When the tool result index is not specified, it defaults to 0.
9. CRITICAL ARCHITECTURE DECISIONS
9.1 Why Express-Zod-API?
Decision: Use express-zod-api over raw Express, NestJS, Fastify
Rationale:
- Type safety with Zod schemas
- Auto-generated OpenAPI documentation
- Type-safe API client generation
- Clean middleware composition
- Built-in error serialization
Trade-offs:
- ✅ Pros: Type safety, auto-docs, less boilerplate
- ❌ Cons: Learning curve, vendor lock-in
9.2 Why AsyncLocalStorage?
Decision: Use Node.js AsyncLocalStorage for request context
Rationale:
- No prop drilling across 6+ middleware layers
- Global access without parameters
- Request isolation
- Minimal overhead
Risk: Single point of failure - if initContext fails, all context access returns empty object
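One way to surface that risk early is a strict accessor that throws when no store is active, instead of silently returning an empty object. The sketch below is an assumption-laden illustration, not the project's `requestContext`: `getContextOrThrow` and `runWithContext` are hypothetical names, and `RequestContext` is reduced to a single field.

```typescript
import { AsyncLocalStorage } from "node:async_hooks";

// Reduced context for illustration; the real RequestContext has many more fields.
interface RequestContext {
  requestId: string;
}

const contextStorage = new AsyncLocalStorage<RequestContext>();

// Strict accessor: fails loudly when called outside an active request context.
function getContextOrThrow(): RequestContext {
  const store = contextStorage.getStore();
  if (store === undefined) {
    throw new Error("Request context accessed outside of initContext");
  }
  return store;
}

// Runs `fn` with `context` visible to everything it calls, sync or async.
function runWithContext<T>(context: RequestContext, fn: () => T): T {
  return contextStorage.run(context, fn);
}
```

With this shape, a missing `initContext` shows up as one clear error at the first access rather than as `undefined` fields deep inside a handler.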
9.3 Why Firestore?
Decision: Use Firestore (NoSQL) over PostgreSQL/MySQL
Rationale:
- Flexible schema for varying message structures
- Auto-scaling without sharding
- Nested data model (Thread → Messages)
- Firebase Auth integration
Limitations:
- No SQL joins (must denormalize)
- Transactions and batched writes are size-limited (historically capped at 500 writes per commit)
- Eventual consistency
9.4 Why SSE Over WebSockets?
Decision: Use Server-Sent Events for streaming
Rationale:
- HTTP-based, no upgrade handshake
- Auto-reconnect built-in
- Firewall friendly
- One-way is sufficient for streaming
Trade-offs:
- ✅ Pros: Simple, low overhead, auto-reconnect
- ❌ Cons: One-way only, no binary data
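For reference, the SSE wire format is plain text: optional `event:` lines, one `data:` line per line of payload, and a blank line to terminate the event. A minimal formatter might look like the following; `formatSseEvent` is a hypothetical helper and not the project's `getMessage`.

```typescript
// Formats one Server-Sent Event frame.
// Multi-line payloads are split into one `data:` line each, as the format requires.
function formatSseEvent(data: string, event?: string): string {
  const lines: string[] = [];
  if (event !== undefined) lines.push(`event: ${event}`);
  for (const line of data.split("\n")) {
    lines.push(`data: ${line}`);
  }
  // A blank line marks the end of the event.
  return lines.join("\n") + "\n\n";
}
```

Because every frame is text, binary payloads would have to be encoded (for example as base64) before being sent, which is the "no binary data" trade-off listed above.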
9.5 Why Monorepo?
Decision: Use pnpm monorepo with Turborepo
Rationale:
- Code sharing across apps
- Atomic commits
- Consistent tooling
- Efficient builds (caching, parallelization)
Trade-offs:
- ✅ Pros: Code sharing, atomic commits, efficient builds
- ❌ Cons: Larger repo, coupled deployments
10. SECURITY
10.1 Authentication Flow
Client → Firebase Auth → JWT → Backend verifies → Firestore user doc
Security Measures:
- JWT verification (Firebase Admin SDK)
- Custom claims for RBAC
- User document for additional permissions
- Token refresh via NextAuth
Vulnerabilities:
| Vulnerability | Risk | Status |
|---|---|---|
| JWT token theft | High | ✅ Mitigated (short expiry, HttpOnly cookies) |
| Custom claims tampering | Critical | ✅ Mitigated (server-side only) |
| Firestore rule bypass | Critical | ✅ Mitigated (all queries through backend) |
| CSRF | Medium | ⚠️ Needs review |
10.2 Rate Limiting
Current: Only guest users rate limited (50 requests / 15 min via Redis)
Gaps:
- No rate limiting for authenticated users
- IP-based (bypassable with rotating IPs)
- No endpoint-specific limits
10.3 Input Validation
Layers:
- Zod schemas (all API inputs)
- Content moderation (async side action)
- Token limits (query size validation)
- File type validation (MIME type)
11. SCALABILITY
11.1 Current Scaling
Horizontal Scaling (Cloud Run):
- arcane (primary)
- arcane-copy (failover)
- arcane-deepresearch (specialized)
Bottlenecks:
| Component | Current | At Scale | Solution |
|---|---|---|---|
| Firestore writes | ~1K/sec | Sharding needed | Shard by user ID |
| Redis | Single instance | Connection pool exhausted | Redis Cluster |
| LLM APIs | Rate limited per key | Multiple keys | Round-robin keys |
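The round-robin key solution in the last row could be sketched as below. This is an illustrative in-memory rotation only: `RoundRobinKeys` is a hypothetical class, and a production version would likely also track per-key rate-limit state.

```typescript
// Cycles through a fixed list of API keys so consecutive calls use different keys.
class RoundRobinKeys {
  private readonly keys: readonly string[];
  private nextIndex = 0;

  constructor(keys: readonly string[]) {
    if (keys.length === 0) throw new Error("At least one API key is required");
    this.keys = keys;
  }

  // Returns the next key and advances the cursor, wrapping at the end of the list.
  take(): string {
    const key = this.keys[this.nextIndex];
    if (key === undefined) throw new Error("Key index out of range");
    this.nextIndex = (this.nextIndex + 1) % this.keys.length;
    return key;
  }
}
```

Since each Cloud Run instance would hold its own cursor, the rotation is only approximately even across instances; that is usually acceptable for spreading load over keys.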
11.2 Performance Optimizations
Implemented:
- Async side actions (non-blocking)
- Redis caching (user settings, model configs)
- SSE streaming (reduced time-to-first-token)
- Skip embeddings for large contexts (>6000 tokens)
- Pre-calculated token counts
Opportunities:
- Response caching (identical queries)
- Embedding caching (RAG queries)
- Database indexing
- Connection pooling
11.3 Memory Management
Cloud Run Limits: 16GB max, 60min timeout, 4 vCPU
Memory Leaks to Watch:
- AsyncLocalStorage context not cleaned up on error
- Redis IRC subscriptions not cleaned up
- PassThrough streams not destroyed on error
Monitoring:
process.on("uncaughtException", (err) => {
logger.error(err, "ERROR/COMPLETE_INSTANCE_FAILURE");
setTimeout(() => process.exit(1), 20000);
});
12. FAILURE MODES
12.1 Single Points of Failure
| Component | Impact | Recovery |
|---|---|---|
| Firebase Auth | Complete auth failure | 5-10 min |
| Firestore | All data operations fail | 10-30 min |
| Redis | Rate limiting, IRC fails | Immediate (bypass) |
| OpenAI API | GPT models unavailable | Immediate (fallback) |
12.2 Error Handling
Current Pattern:
try {
await riskyOperation();
} catch (error) {
if (error instanceof ClientError) throw error;
logger.error(error, "ERROR/OPERATION_FAILED");
throw new ServerError(500, ErrorType.INTERNAL_SERVER_ERROR);
}
Issues:
- No retry logic for transient failures
- No circuit breakers
- No graceful degradation
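The first gap, missing retry logic for transient failures, could be closed with a small helper along these lines. It is a sketch under assumptions: `retryWithBackoff` is a hypothetical name, the default attempt count and delay are illustrative, and the `sleep` parameter exists only so the delay can be replaced in tests.

```typescript
type Sleep = (ms: number) => Promise<void>;

const realSleep: Sleep = (ms) => new Promise<void>((resolve) => setTimeout(resolve, ms));

// Retries `operation` up to `maxAttempts` times, doubling the delay after each failure.
async function retryWithBackoff<T>(
  operation: () => Promise<T>,
  maxAttempts = 3,
  baseDelayMs = 100,
  sleep: Sleep = realSleep,
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    try {
      return await operation();
    } catch (error) {
      lastError = error;
      // Wait baseDelayMs, 2x, 4x, ... but do not sleep after the final attempt.
      if (attempt < maxAttempts - 1) {
        await sleep(baseDelayMs * 2 ** attempt);
      }
    }
  }
  throw lastError;
}
```

In the pattern shown earlier, client errors are rethrown immediately, so a helper like this would wrap only the operations where a retry is safe (idempotent reads, provider calls), not the whole handler.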
12.3 Database Conflict Resolution
Retry Logic (Firestore document conflicts):
let messageIncrement = 0;
let trying = true;
while (trying) {
  try {
    const batch = db.batch();
    this.setMessage(batch, userMessage, this.totalDocs + messageIncrement);
    await batch.commit();
    trying = false;
  } catch (err) {
    // Give up after three index bumps; otherwise retry with the next index
    if (messageIncrement === 3) throw MAX_RETRY_ERROR;
    messageIncrement++;
  }
}
13. DEPLOYMENT
13.1 Pipelines
Backend (Cloud Run via cloudbuild.yaml):
steps:
- Build Docker image with Kaniko
- Push to GCR
- Deploy to Cloud Run (arcane, arcane-copy, arcane-deepresearch)
Website (Vercel):
- Auto-deploy on push to develop/review branches
- Environment variables in Vercel dashboard
13.2 Environment Variables
Backend:
FIREBASE_PROJECT_ID=foyer-work
OPENAI_API_KEY=sk-...
REDIS_HOST=localhost
Website:
NEXT_PUBLIC_API_URL=http://localhost:8080
NEXTAUTH_SECRET=...
Critical: Environment variables are NOT validated at startup - missing vars cause runtime errors.
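A startup check is a small amount of code. The sketch below is one possible shape, not existing project code: `findMissingEnv` and `assertEnv` are hypothetical helpers, and the required list only contains the backend variables named above (the real list is presumably longer).

```typescript
// Variable names taken from the backend example above; extend as needed.
const REQUIRED_BACKEND_VARS: readonly string[] = [
  "FIREBASE_PROJECT_ID",
  "OPENAI_API_KEY",
  "REDIS_HOST",
];

// Returns the names of required variables that are unset or blank.
function findMissingEnv(
  env: Record<string, string | undefined>,
  required: readonly string[],
): string[] {
  return required.filter((name) => {
    const value = env[name];
    return value === undefined || value.trim() === "";
  });
}

// Throws a single error listing every missing variable, so all gaps show up at once.
function assertEnv(
  env: Record<string, string | undefined>,
  required: readonly string[],
): void {
  const missing = findMissingEnv(env, required);
  if (missing.length > 0) {
    throw new Error(`Missing required environment variables: ${missing.join(", ")}`);
  }
}
```

Calling `assertEnv(process.env, REQUIRED_BACKEND_VARS)` before the server is created would turn a late runtime failure into an immediate, readable startup failure.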
13.3 Rollback
# Backend: route 100% of traffic back to a known-good revision
gcloud run services update-traffic arcane --to-revisions=arcane-abc123=100
# Website
vercel rollback [deployment-url]
14. PRINCIPAL ENGINEER INTERVIEW Q&A
Q1: How do you ensure atomic writes for related documents?
Problem: Two Firestore writes can result in orphaned documents if the second fails.
Current Solution: Retry with document index increment
let messageIncrement = 0;
let trying = true;
while (trying) {
  try {
    const batch = db.batch();
    this.setMessage(batch, msg1, this.totalDocs + messageIncrement);
    this.setMessage(batch, msg2, this.totalDocs + messageIncrement);
    await batch.commit();
    trying = false;
  } catch (err) {
    if (messageIncrement === 3) throw err;
    messageIncrement++;
  }
}
Better Solutions:
- Firestore Transactions: Atomic, but subject to per-commit size limits (historically 500 writes per transaction)
- Outbox Pattern: Write to outbox, process async
- Event Sourcing: Store changes as events
Key Insight: Retry-with-increment is pragmatic for Firestore contention. Not truly atomic but achieves eventual consistency.
Q2: How would you scale to 100K concurrent users?
Current Bottlenecks:
| Component | Current | Solution |
|---|---|---|
| Firestore writes | ~1K/sec | Shard by user ID |
| Redis | Single instance | Redis Cluster |
| LLM APIs | Rate limited | Multiple keys + round-robin |
Architecture Changes:
- Database Sharding:
const shardId = hash(userId) % NUM_SHARDS;
const db = getFirestoreShard(shardId);
- Request Queue:
const queue = new Queue('llm-calls', {
  redis: redisCluster,
  // delay is an illustrative starting value for the exponential backoff
  defaultJobOptions: { attempts: 3, backoff: { type: 'exponential', delay: 1000 } }
});
- Response Caching:
// Serialize the parts so that object-valued settings contribute to the key
const cacheKey = hash(JSON.stringify([prompt, model, settings]));
const cached = await redis.get(cacheKey);
if (cached) return cached;
Key Insight: LLM API rate limits are the biggest bottleneck, not infrastructure. Solution: multi-key rotation + caching.
Q3: How do you handle slow LLM providers?
Current: No timeout, no fallback - request hangs.
Better: Circuit Breaker + Timeout + Fallback
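// Option names follow the opossum CircuitBreaker API (assumed library)
const breakerOptions = {
  timeout: 30000,
  errorThresholdPercentage: 50,
  resetTimeout: 60000
};
const openaiBreaker = new CircuitBreaker(callOpenAI, breakerOptions);
// callAnthropic is the assumed counterpart of callOpenAI
const anthropicBreaker = new CircuitBreaker(callAnthropic, breakerOptions);
async function callLLM(prompt, model) {
  try {
    return await openaiBreaker.fire(prompt);
  } catch (error) {
    if (openaiBreaker.opened) {
      return await anthropicBreaker.fire(prompt); // Fallback
    }
    throw error;
  }
}
Key Insight: Use circuit breakers to fail fast, not just timeouts. Prevents cascading failures.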
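To make the fail-fast behaviour concrete, here is a hand-rolled breaker state machine. It is deliberately simpler than a library breaker: it trips on a count of consecutive failures rather than an error percentage, and the caller passes in the current time. `SimpleBreaker` and its method names are hypothetical.

```typescript
type BreakerState = "CLOSED" | "OPEN" | "HALF_OPEN";

class SimpleBreaker {
  private readonly failureThreshold: number;
  private readonly resetTimeoutMs: number;
  private failures = 0;
  private openedAt = 0;
  private state: BreakerState = "CLOSED";

  constructor(failureThreshold: number, resetTimeoutMs: number) {
    this.failureThreshold = failureThreshold;
    this.resetTimeoutMs = resetTimeoutMs;
  }

  // True when a call may be attempted at time `now` (milliseconds).
  canRequest(now: number): boolean {
    if (this.state === "OPEN" && now - this.openedAt >= this.resetTimeoutMs) {
      // Reset timeout elapsed: allow trial calls until a result is recorded.
      this.state = "HALF_OPEN";
    }
    return this.state !== "OPEN";
  }

  recordSuccess(): void {
    this.failures = 0;
    this.state = "CLOSED";
  }

  // A failed trial call, or too many consecutive failures, opens the breaker.
  recordFailure(now: number): void {
    this.failures += 1;
    if (this.state === "HALF_OPEN" || this.failures >= this.failureThreshold) {
      this.state = "OPEN";
      this.openedAt = now;
    }
  }

  currentState(): BreakerState {
    return this.state;
  }
}
```

While the breaker is open, `canRequest` returns false immediately, so callers can switch to a fallback provider without waiting for another timeout.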
Q4: How would you implement fair rate limiting?
Current: Only guest users limited (50/15min).
Fair Design:
const rateLimits = {
GUEST: { points: 50, duration: 900 },
FREE: { points: 100, duration: 60 },
PRO: { points: 500, duration: 60 },
};
const endpointLimits = {
'/v1/thread/unified': { multiplier: 1 },
'/v1/tools/image': { multiplier: 5 },
'/v1/deep-research': { multiplier: 10 }
};
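async function rateLimit(req, user) {
  // rateLimiters: one limiter per plan, built from rateLimits (assumed to be created at startup)
  const limiter = rateLimiters[user.userPlan];
  // A request costs its endpoint multiplier in points, not the whole plan budget
  const cost = endpointLimits[req.path]?.multiplier ?? 1;
  await limiter.consume(user.uid, cost);
}
Key Insight: Rate limit by user ID (not IP), apply endpoint-specific costs.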
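The per-user, per-cost idea can be demonstrated with an in-memory fixed-window limiter. This sketch is for illustration only: a real deployment would keep the counters in Redis so that all instances share them, and `FixedWindowLimiter` is a hypothetical class. The plan budgets mirror the `rateLimits` table above.

```typescript
type Plan = "GUEST" | "FREE" | "PRO";

// Points available per window, mirroring the rateLimits table above.
const PLAN_LIMITS: Record<Plan, { points: number; durationSec: number }> = {
  GUEST: { points: 50, durationSec: 900 },
  FREE: { points: 100, durationSec: 60 },
  PRO: { points: 500, durationSec: 60 },
};

interface UsageWindow {
  startedAt: number; // window start, in milliseconds
  used: number; // points consumed in this window
}

class FixedWindowLimiter {
  private readonly windows = new Map<string, UsageWindow>();

  // Records `cost` points for `uid` and returns true if the plan budget allows it.
  tryConsume(uid: string, plan: Plan, cost: number, nowMs: number): boolean {
    const limit = PLAN_LIMITS[plan];
    let usage = this.windows.get(uid);
    // Start a fresh window on first use or once the previous window has expired.
    if (usage === undefined || nowMs - usage.startedAt >= limit.durationSec * 1000) {
      usage = { startedAt: nowMs, used: 0 };
      this.windows.set(uid, usage);
    }
    if (usage.used + cost > limit.points) return false;
    usage.used += cost;
    return true;
  }
}
```

With a cost of 10 per deep-research call, a FREE user gets ten such calls per minute, while the same budget covers a hundred calls to a cost-1 endpoint.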
Q5: How do you prevent prompt injection?
Current: Basic content moderation, no specific injection detection.
Defense Layers:
- Input Sanitization:
const sanitized = input
  .replace(/ignore previous instructions/gi, '')
  .replace(/system prompt/gi, '');
- Prompt Structure:
const prompt = `
<system>You are a helpful assistant</system>
<user-input>${escapeXml(userInput)}</user-input>
<instructions>Only respond to user-input above</instructions>
`;
- Output Validation:
if (output.includes('system prompt')) {
  throw new ClientError(400, ErrorType.INPUT_NOT_MODERATED);
}
Key Insight: Prompt injection is an input validation problem. Defense in depth: sanitize, structure, validate.
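The prompt-structure layer relies on an `escapeXml` helper that is not shown. A minimal version might look like this; it is a sketch, and `wrapUserInput` is a hypothetical wrapper added only to show the effect.

```typescript
// Escapes the five XML special characters so user text cannot close or open tags.
function escapeXml(input: string): string {
  return input
    .replace(/&/g, "&amp;") // must run first so later entities are not double-escaped
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&apos;");
}

// Wraps untrusted text in the <user-input> element used by the prompt template.
function wrapUserInput(userInput: string): string {
  return `<user-input>${escapeXml(userInput)}</user-input>`;
}
```

Escaping keeps the delimiters intact, so input such as `</user-input><system>` stays inside the user block as literal text. It does not stop the model from following instructions written in plain prose, which is why the other layers remain necessary.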
Q6: How would you optimize history retrieval from O(n) to O(1)?
Current: Linear traversal through threadMap
getMessages() {
  let parent = this.threadMap.get('root');
  const messages = [];
  while (parent) {
    // Follow the first child; stop at a leaf or a dangling child ID
    const child = this.threadMap.get(parent.childrenId[0]);
    if (!child) break;
    messages.push(child.getMessage());
    parent = child;
  }
  return messages;
}
Optimized:
- Denormalized Last N Messages:
interface TChatDB {
  recentMessages: TMessageDBV2[]; // Last 10
  totalMessages: number;
}
- Message Index with Pointers:
interface TChatDB {
  firstMessageId: string;
  lastMessageId: string;
  // Plain object rather than a JS Map, so it can be stored in a Firestore document
  messageIndex: Record<string, { prev: string; next: string }>;
}
Key Insight: For chat, you almost always need recent messages first. Denormalize last N, lazy load older.
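Maintaining the denormalized list is a one-line operation per new message. The sketch below uses simplified stand-in types (`MessageSummary`, `ChatSummary`) rather than `TMessageDBV2` and `TChatDB`, and `appendMessage` is a hypothetical helper.

```typescript
// Simplified stand-ins for TMessageDBV2 / TChatDB.
interface MessageSummary {
  id: string;
  text: string;
}

interface ChatSummary {
  recentMessages: MessageSummary[]; // oldest first, capped at maxRecent entries
  totalMessages: number;
}

// Appends a message, keeping only the newest `maxRecent` entries on the chat document.
function appendMessage(
  chat: ChatSummary,
  message: MessageSummary,
  maxRecent = 10,
): ChatSummary {
  const recentMessages = [...chat.recentMessages, message].slice(-maxRecent);
  return { recentMessages, totalMessages: chat.totalMessages + 1 };
}
```

Opening a chat then needs a single document read for the latest messages; older ones are fetched only when `totalMessages` exceeds the cached count and the user scrolls back.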
Q7: How do you handle concurrent edits from multiple devices?
Current: Last-write-wins with Firestore server timestamp. No conflict resolution.
Solutions:
- Optimistic Concurrency Control:
await db.runTransaction(async (transaction) => {
  const chat = await transaction.get(chatRef);
  if (chat.data().version !== clientVersion) {
    throw new ConflictError('Chat modified by another device');
  }
  transaction.update(chatRef, { version: clientVersion + 1 });
});
- Queue-Based Serialization:
await chatQueue.add(chatId, { message, userId });
// Processed sequentially per chatId
Key Insight: For chat, optimistic concurrency + client-side merge is sufficient.
Q8: What's the cost structure per request?
GPT-4o (15x query cost):
| Component | Cost |
|---|---|
| Input tokens (1K) | $0.0075 |
| Output tokens (500) | $0.01125 |
| Embeddings (RAG) | $0.0001 |
| Firestore writes | $0.00002 |
| Cloud Run | $0.00001 |
| Total | ~$0.02 |
Deep Research (Claude 3 Opus, 50x): ~$1.00 per request
Key Insight: LLM API costs dominate (99%+). Optimize: reduce tokens, cache responses, use cheaper models.
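The 99%+ figure follows directly from the table. A quick check of the arithmetic, using the table's own numbers (the variable and function names are only for illustration):

```typescript
interface RequestCostBreakdown {
  inputTokens: number;
  outputTokens: number;
  embeddings: number;
  firestoreWrites: number;
  cloudRun: number;
}

// Values copied from the GPT-4o table above (USD per request).
const gpt4oRequest: RequestCostBreakdown = {
  inputTokens: 0.0075,
  outputTokens: 0.01125,
  embeddings: 0.0001,
  firestoreWrites: 0.00002,
  cloudRun: 0.00001,
};

// Sum of all components.
function totalCost(c: RequestCostBreakdown): number {
  return c.inputTokens + c.outputTokens + c.embeddings + c.firestoreWrites + c.cloudRun;
}

// Fraction of the total spent on LLM input and output tokens.
function llmShare(c: RequestCostBreakdown): number {
  return (c.inputTokens + c.outputTokens) / totalCost(c);
}
```

The components sum to $0.01888, which rounds to the ~$0.02 in the table, and token costs account for about 99.3% of it.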
Q9: What would cause catastrophic failure?
Answer: Firebase Auth + Firestore simultaneous outage.
Why:
- No auth fallback → all requests rejected
- No database fallback → can't read any data
- No offline mode → complete failure
Mitigation:
- Session Cache (Immediate):
const cached = await redis.get(`session:${token}`);
if (cached) return JSON.parse(cached);
const user = await verifyToken(token);
await redis.setex(`session:${token}`, 300, JSON.stringify(user));
return user;
- Read-Only Fallback:
try {
  return await db.collection('chats').doc(id).get();
} catch (error) {
  return cachedChats.get(userId);
}
Key Insight: System has no graceful degradation. Any Firebase failure causes complete outage.
15. QUICK REFERENCE
Commands
# Local Development
pnpm dev # Start all apps
pnpm build # Build everything
# Backend (port 8080)
cd apps/arcane && pnpm dev
# Website (port 3001)
cd apps/website && pnpm dev
# Extension
cd apps/extension && pnpm dev
Key Files
| Purpose | File |
|---|---|
| API Routes | apps/arcane/src/config/routing.ts |
| Main Endpoint | apps/arcane/src/server/endpoints/unified/unified.ts |
| Auth Middleware | apps/arcane/src/server/middlewares/auth/auth.ts |
| User Model | apps/arcane/src/server/models/user.ts |
| Thread Model | apps/arcane/src/server/models/thread.ts |
| Schema Builder | apps/arcane/src/server/repositories/engine/schema.ts |
| Website Middleware | apps/website/middleware.ts |
Troubleshooting
| Symptom | Fix |
|---|---|
| 401 Unauthorized | Check Authorization header |
| 429 Rate Limited | Wait 15 minutes or upgrade |
| 500 Internal Error | Check Cloud Logging |
| Streaming fails | Check network, retry |
Last Updated: March 27, 2026
Version: 6.0.0 (Complete Architecture + Canvas + Templates + Fallbacks)
Document Status: ✅ Complete - Bonkers monorepo architecture with all critical systems