CREATOR.md - Bonkers Monorepo Architecture Document

System: Bonkers Monorepo
Date: March 2026
Author: Principal Engineering Team
Purpose: Zero-compromise architecture and engineering knowledge transfer


1. SYSTEM OVERVIEW

1.1 What Is Bonkers

Bonkers is a TypeScript monorepo containing a multi-platform AI application stack.

Applications:

  • Website (apps/website) - Next.js 14 web application (port 3001)
  • Extension (apps/extension) - Chrome Extension (Manifest V3, Vite)
  • Arcane (apps/arcane) - Express.js API server (port 8080)
  • Session Manager (apps/session-manager) - Session state synchronization service

Shared Packages:

  • packages/app-config - Configuration (models, prompts, feature flags)
  • packages/components - Reusable React components
  • packages/hooks - Custom React hooks
  • packages/types - Shared TypeScript types
  • packages/utils - Utility functions
  • packages/config - ESLint, Prettier, TypeScript configs
  • packages/assets - Static assets

Infrastructure:

  • Package Manager: pnpm 9.15.5 (workspaces)
  • Build System: Turborepo
  • Backend Framework: Express-Zod-API
  • Database: Firestore (GCP)
  • Cache: Redis
  • Deployment: Vercel (frontend), Cloud Run (backend)

1.2 Architecture Layers
    Plain text
    ┌─────────────────────────────────────────────────────────────┐
    │                    CLIENT LAYER                             │
    │  ┌──────────────┐  ┌──────────────┐                         │
    │  │   Website    │  │   Extension  │                         │
    │  │  (Next.js)   │  │  (Vite/CRX)  │                         │
    │  └──────┬───────┘  └──────┬───────┘                         │
    └─────────┼─────────────────┼──────────────────────────────────┘
              │                 │
              └─────────────────┤
                                │
                                ▼
    ┌─────────────────────────────────────────────────────────────┐
    │                 API GATEWAY / LOAD BALANCER                 │
    │              (Firebase Auth + GCP Load Balancer)            │
    └─────────────────────────────────────────────────────────────┘
                                │
                                ▼
    ┌─────────────────────────────────────────────────────────────┐
    │                  APPLICATION LAYER                          │
    │                                                             │
    │  ┌───────────────────────────────────────────────────────┐ │
    │  │           ARCANE BACKEND (Express + TypeScript)       │ │
    │  │                                                       │ │
    │  │  Middleware Chain:                                    │ │
    │  │  Auth → RateLimit → Context → Preware → Handler       │ │
    │  │                      → Postware → Analytics           │ │
    │  └───────────────────────────────────────────────────────┘ │
    │  ┌───────────────────────────────────────────────────────┐ │
    │  │        SESSION MANAGER (Real-time State Sync)         │ │
    │  └───────────────────────────────────────────────────────┘ │
    └─────────────────────────────────────────────────────────────┘
                                │
                                ▼
    ┌─────────────────────────────────────────────────────────────┐
    │                     DATA LAYER                              │
    │  ┌──────────────┐  ┌──────────────┐  ┌──────────────────┐  │
    │  │  Firestore   │  │    Redis     │  │   GCP Storage    │  │
    │  │  (Primary)   │  │   (Cache)    │  │   (Files)        │  │
    │  └──────────────┘  └───────┬──────┘  └──────────────────┘  │
    └────────────────────────────┼────────────────────────────────┘
                                 │
                                 ▼
    ┌─────────────────────────────────────────────────────────────┐
    │                  EXTERNAL AI PROVIDERS                      │
    │  OpenAI • Anthropic • Google • Meta • Mistral • Fal AI    │
    └─────────────────────────────────────────────────────────────┘

    2. REPOSITORY STRUCTURE

    2.1 Monorepo Layout
    Plain text
    bonkers/
    ├── apps/
    │   ├── website/           # Next.js web application
    │   ├── extension/         # Chrome extension (Vite)
    │   ├── arcane/            # Backend API server
    │   └── session-manager/   # Session state service
    │
    ├── packages/
    │   ├── app-config/        # Shared configuration
    │   ├── components/        # React components
    │   ├── hooks/             # React hooks
    │   ├── types/             # TypeScript types
    │   ├── utils/             # Utilities
    │   ├── config/            # ESLint/Prettier/TS configs
    │   └── assets/            # Static assets
    │
    ├── docs/                  # Documentation
    ├── patches/               # npm patches
    ├── package.json           # Root package.json
    ├── pnpm-workspace.yaml    # Workspace config
    ├── turbo.json            # Turborepo config
    └── .github/               # GitHub configs
    2.2 Key Configuration Files

    Root package.json:

    JSON
    {
      "name": "cauldron",
      "packageManager": "pnpm@9.15.5",
      "engines": { "node": "^20", "pnpm": ">=9" },
      "scripts": {
        "build": "turbo run build",
        "dev": "turbo run dev",
        "init": "git submodule update --init --recursive && pnpm i",
        "lint": "turbo run lint"
      }
    }

    turbo.json:

    JSON
    {
      "tasks": {
        "build": {
          "dependsOn": ["^build"],
          "outputs": [".next/**", "!.next/cache/**"]
        },
        "dev": { "cache": false, "persistent": true },
        "lint": { "dependsOn": ["^lint"] }
      }
    }

    pnpm-workspace.yaml:

    YAML
    packages:
      - "apps/*"
      - "packages/*"

    3. APPLICATION ARCHITECTURE

    3.1 Website (apps/website/)

    Tech Stack: Next.js 14, React 18, TypeScript, Tailwind CSS, shadcn/ui

    Key Files:

    | File | Purpose |
    | --- | --- |
    | middleware.ts | Auth routing, locale prefixing, cookie management |
    | navigation.ts | Custom navigation (replaces next/link) |
    | auth/auth.config.ts | NextAuth configuration, token refresh |
    | auth/auth.cookies.ts | Cookie configuration for production |
    | next.config.js | Transpile packages, image domains |
    | tailwind.config.ts | Theme, plugins, typography |

    Middleware Flow:

    TypeScript
    // middleware.ts
    // 1. Clear old auth cookies
    // 2. Check authentication state
    // 3. Block routes based on auth (guest, emailNotVerified, normal)
    // 4. Apply locale prefix (i18n)
    // 5. Redirect unauthorized access

    Project Requirements:

    • Do NOT use next/link - use custom navigation.ts
    • Do NOT use useRouter from next/navigation - use wrapper
    • All API calls use axios
    • All React queries wrapped in react-query
    3.2 Extension (apps/extension/)

    Tech Stack: Vite, React 18, Manifest V3, Tailwind CSS

    Key Files:

    | File | Purpose |
    | --- | --- |
    | manifest.config.ts | Extension manifest (permissions, content scripts) |
    | src/background/index.ts | Service worker entry point |
    | src/background/background.messages.ts | Message handler |
    | src/contents/index.ts | Content script (injected into pages) |
    | src/lib/storage.ts | LocalStorageInstance (NOT chrome.storage) |

    Architecture:

    Plain text
    ┌─────────────────┐
    │  Background     │ ← Service worker (event-driven, not persistent)
    │  Service Worker │
    └────────┬────────┘
             │ Messages
             ▼
    ┌─────────────────┐
    │  Content        │ ← Injected into all URLs
    │  Scripts        │
    └────────┬────────┘
             │ DOM Access
             ▼
    ┌─────────────────┐
    │  Web Page       │
    │  (Any URL)      │
    └─────────────────┘

    Project Requirements:

    • Do NOT use chrome.storage - use LocalStorageInstance
    • All API calls proxied through background script
    • Content scripts run at document_end
    3.3 Arcane Backend (apps/arcane/)

    Tech Stack: Express, Express-Zod-API, TypeScript, Firebase Admin

    Entry Point: src/index.ts

    TypeScript
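    import { createServer } from "express-zod-api";
    import { config } from "./config/config";
    import { routing } from "./config/routing";
    // isDevEnv, generateTypes and generateDocumentation are project helpers;
    // their imports are omitted in this excerpt.
    
    if (isDevEnv) {
      // Development only: regenerate types and API documentation
      await generateTypes();
      await generateDocumentation();
    }
    
    // Start the HTTP server with the configured routes
    await createServer(config, routing);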

    Directory Structure:

    Plain text
    apps/arcane/src/
    ├── config/
    │   ├── config.ts           # Server config, CORS, Swagger
    │   ├── logger.ts           # Pino logger
    │   └── routing.ts          # ALL API routes
    ├── server/
    │   ├── constantsSchemasAndTypes/  # Zod schemas, types
    │   ├── endpoints/          # Route handlers
    │   ├── factories/          # Endpoint factories
    │   ├── middlewares/        # Request middlewares
    │   ├── models/             # Business logic models
    │   ├── repositories/       # Data access layer
    │   ├── services/           # External integrations
    │   └── utilities/          # Helpers
    └── index.ts

    Middleware Chain (execution order):

    TypeScript
    // apps/arcane/src/server/endpoints/unified/unified.ts
    export const unified = timedAuthStreamEndpointsFactory
      .addMiddleware(usageLimitsMiddleware)      // 1. Check quotas
      .addMiddleware(threadPreware)              // 2. Setup context
      .addMiddleware(providerConfigOverride)     // 3. Model overrides
      .addMiddleware(unifiedController)          // 4. Main logic
      .addMiddleware(threadPostware)             // 5. Post-processing
      .addMiddleware(usageAnalyticsMiddleware)   // 6. Track usage
      .build({...});
    3.4 Session Manager (apps/session-manager/)

    Tech Stack: Express, Express-Zod-API, Firebase Admin, jose

    Purpose: Real-time session state synchronization across devices

    Key Dependencies:

    • express-zod-api - API framework
    • firebase-admin - Auth verification
    • jose - JWT handling
    • @panva/hkdf - Key derivation

    4. BACKEND DEEP DIVE

    4.1 Middleware Architecture

    Init Context (middlewares/initContext/initContext.ts):

    TypeScript
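    import { AsyncLocalStorage } from "node:async_hooks";
    import { randomUUID } from "node:crypto";
    import type { NextFunction, Request, Response } from "express";
    
    // Sets up AsyncLocalStorage for request context. This has to be the same
    // AsyncLocalStorage instance that requestContext (section 4.4) reads from.
    const storage = new AsyncLocalStorage<RequestContext>();
    
    export const initContext = (_req: Request, _res: Response, next: NextFunction) => {
      // Only the request id and start time are known here; later middlewares add the rest
      const context = { requestId: randomUUID(), startTime: Date.now() };
      storage.run(context, () => next());
    };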

    Auth (middlewares/auth/auth.ts):

    TypeScript
    // 1. Extract JWT from Authorization header
    // 2. Verify with Firebase Admin SDK
    // 3. Create User instance, load from Firestore
    // 4. Check/update user status
    // 5. Store in request context

    Usage Limits (middlewares/usageLimits/usageLimits.ts):

    TypeScript
    // 1. Check monthly usage < limit
    // 2. Check daily usage < limit
    // 3. Reset if reset time passed
    // 4. Throw 429 if exceeded
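
The four checks above could be sketched as follows. This is an illustrative sketch, not the code in usageLimits.ts: the limit values, the 30-day window, and the `UsageLimitError` class are assumptions, and only the `cost`/`resetsAt` shape is taken from the user document in section 5.1. The sketch applies the reset before the comparison so that an expired window is not counted against the user.

```typescript
// Usage window as stored on the user document: cost so far and reset time (ms).
interface UsageWindow {
  cost: number;
  resetsAt: number;
}

// Hypothetical error type; the real middleware presumably throws its own 429 error.
class UsageLimitError extends Error {
  readonly status = 429;
  readonly scope: "monthly" | "daily";
  constructor(scope: "monthly" | "daily") {
    super(`${scope} usage limit exceeded`);
    this.scope = scope;
  }
}

// Return a fresh window if the reset time has passed, otherwise the same window.
function resetIfExpired(usage: UsageWindow, now: number, nextReset: number): UsageWindow {
  return now >= usage.resetsAt ? { cost: 0, resetsAt: nextReset } : usage;
}

// Throws UsageLimitError when either window has reached its limit.
function checkUsage(
  monthly: UsageWindow,
  daily: UsageWindow,
  limits: { monthly: number; daily: number },
  now: number,
): { monthly: UsageWindow; daily: UsageWindow } {
  const DAY_MS = 24 * 60 * 60 * 1000;
  const m = resetIfExpired(monthly, now, now + 30 * DAY_MS); // assumed 30-day window
  const d = resetIfExpired(daily, now, now + DAY_MS);
  if (m.cost >= limits.monthly) throw new UsageLimitError("monthly");
  if (d.cost >= limits.daily) throw new UsageLimitError("daily");
  return { monthly: m, daily: d };
}
```

Passing `now` in explicitly keeps the check deterministic in tests.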

    Thread Preware (middlewares/threadPreware/threadPreware.ts):

    TypeScript
    // CRITICAL: Sets up entire request context
    // 1. Determine API version (V1/V2)
    // 2. Load user settings
    // 3. Initialize/load thread
    // 4. Load personalization
    // 5. Start async side actions
    // 6. Initialize SSE stream
    // 7. Process attachments
    // 8. Create message nodes
    // 9. Start content moderation
    // 10. Calculate context window
    // 11. Load history with embeddings
    // 12. Initialize Schema
    // 13. Store in request context
    // 14. Update thread
    4.2 Request Context (AsyncLocalStorage)
    TypeScript
    interface RequestContext {
      user: TUserDoc;
      userInstance: User;
      chatNode: Thread;
      userMessageNode: Message;
      assistantMessageNode: Message;
      schema: Schema;
      eventManager?: EventManager;
      settingsV3: TSettings;
      personalization: TPersonalization;
      executionContext: 'USER' | 'TASK';
    }
    
    // Access anywhere after initContext
    import { requestContext } from '../repositories/context/requestContext';
    const { user, chatNode } = requestContext.get();
    4.3 Models

    User Model (models/user.ts):

    TypeScript
    class User {
      decodedToken: DecodedIdToken;
      user: TUserDoc | null;
      uid: string | null;
      shouldUpdate: boolean;
    
      async init() {
        // 1. Get user from Firebase Auth
        // 2. Load user document from Firestore
        // 3. Handle guest → free conversion
        // 4. Check email verification
        // 5. Handle temporary Pro rewards
      }
    
      async updateUser() {
        // Save to Firestore if shouldUpdate
      }
    }

    Thread Model (models/thread.ts):

    TypeScript
    class Thread implements TChatDB {
      chatId: string;
      threadMap: Map<string, Message>;
      ref: DocumentReference;
    
      static async init(chatId, createNew, config) {
        // Load from Firestore, repair orphans, create if missing
      }
    
      async getActiveThreadWithEmbeddings(messageId, query, contextWindow) {
        // Traverse backwards, accumulate messages, query embeddings
      }
    
      async update({ userMessage, assistantMessage }, options) {
        // Save to Firestore with conflict retry logic
      }
    }
    4.4 Repositories

    Context (repositories/context/requestContext.ts):

    TypeScript
    const storage = new AsyncLocalStorage<RequestContext>();
    
    export const requestContext = {
      get() { return storage.getStore() || {}; },
      set(context) {
        const store = storage.getStore() || {};
        Object.assign(store, context);
      }
    };
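
The behaviour of `get()` outside an active store can be seen in a small standalone sketch. `DemoContext`, `demoStorage`, and `getOrThrow` are hypothetical names introduced here for illustration; `getOrThrow` is one possible stricter accessor, not something the repository is known to provide.

```typescript
import { AsyncLocalStorage } from "node:async_hooks";

// Minimal stand-in for the real RequestContext (assumption for this sketch).
interface DemoContext {
  requestId?: string;
  userId?: string;
}

const demoStorage = new AsyncLocalStorage<DemoContext>();

const demoContext = {
  // Mirrors the documented behaviour: an empty object when no store is active.
  get(): DemoContext {
    return demoStorage.getStore() ?? {};
  },
  // Hypothetical stricter variant: fail loudly instead of returning {}.
  getOrThrow(): DemoContext {
    const store = demoStorage.getStore();
    if (!store) throw new Error("request context accessed outside storage.run()");
    return store;
  },
  // Merges the patch into the active store; does nothing without one.
  set(patch: Partial<DemoContext>): void {
    const store = demoStorage.getStore();
    if (store) Object.assign(store, patch);
  },
};

// Inside run(): set() mutates the shared store and get() sees the change.
const insideRun = demoStorage.run({ requestId: "req-1" }, () => {
  demoContext.set({ userId: "u-1" });
  return demoContext.get();
});

// Outside run(): get() silently returns an empty object.
const outsideRun = demoContext.get();
```

A throwing accessor trades convenience for earlier detection of code paths that run before or outside `initContext`.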

    Schema (repositories/engine/schema.ts):

    TypeScript
    class Schema {
      schema: TSchema;
    
      setModel(model: TLLMModels, basedOnPlan: boolean) {
        this.schema.model = model;
        this.setMaxTokens(basedOnPlan);
      }
    
      injectPrompt({ message, messages, prompt, variables }) {
        // Add system/user messages, validate placeholders
      }
    }

    Side Actions (repositories/sideActions/sideActions.ts):

    TypeScript
    // Run async operation (non-blocking)
    sideActions.run(ActionTypes.CONTENT_MODERATION);
    
    // Run with custom handler
    sideActions.runCustom(fn, args, actionType);
    
    // Wait for result (blocking)
    const result = await sideActions.wait(ActionTypes.SELECT_MODEL);

    Streamer (repositories/streamer/streamer.ts):

    TypeScript
    // Initialize SSE
    const init = (response: Response) => {
      response.writeHead(200, {
        'Content-Type': 'text/event-stream',
        'Cache-Control': 'no-cache',
        'Connection': 'keep-alive',
      });
    };
    
    // Stream chunks
    const stream = async (req, response, inputStream) => {
      // Process chunks, send SSE events
    };
    
    // End stream
    const end = async (response: Response) => {
      response.write(getMessage(END_MESSAGE));
      response.end();
    };
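
For readers unfamiliar with the wire format, a helper in the spirit of `getMessage` might look like the sketch below. `formatSseEvent` is a hypothetical name and the real `getMessage` may format frames differently; the `event:`/`data:` fields and the terminating blank line are standard Server-Sent Events framing.

```typescript
// Format one Server-Sent Events frame from a string or JSON-serializable value.
function formatSseEvent(data: unknown, event?: string): string {
  const payload = typeof data === "string" ? data : JSON.stringify(data) ?? "null";
  const lines: string[] = [];
  if (event) lines.push(`event: ${event}`);
  // Each line of the payload needs its own "data:" prefix.
  for (const line of payload.split("\n")) lines.push(`data: ${line}`);
  // A blank line terminates the frame.
  return lines.join("\n") + "\n\n";
}
```

Usage inside a stream loop would be along the lines of `response.write(formatSseEvent(chunk, "chunk"))`.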

    Inter-Request Communication (repositories/irc/irc.ts):

    TypeScript
    // Redis pub/sub for cross-request communication.
    // channelId (derived from ircId) and payload (wrapping message) are elided in this excerpt.
    class InterRequestCommunicator {
      static async addSubscription(ircId, callback) {
        await this.redisSub.subscribe(channelId);
        this.subList.set(channelId, callback);
      }
    
      static async sendMessage(ircId, message) {
        await redisPub.publish(channelId, JSON.stringify(payload));
      }
    }

    5. DATA MODELS

    5.1 User Document
    TypeScript
    interface TUserDoc {
      uid: string;
      email: string;
      role: 'guest' | 'free' | 'paid' | 'member' | 'owner' | 'tools';
      userPlan: 'GUEST' | 'FREE' | 'PRO' | 'PRO_TEMP' | 'TOOLS';
      provider: 'anonymous' | 'google' | 'microsoft' | 'apple';
      photoUrl: string;
    
      monthlyUsage: {
        cost: number;
        resetsAt: number;
        models?: Record<string, number>;
      };
      dailyUsage: {
        cost: number;
        resetsAt: number;
        models?: Record<string, number>;
      };
    
      settings?: TUserSettings;
      preferences?: TUserPreferences;
      connectedApps?: { gdrive?: {...} };
      connectedComposioApps?: Record<string, any>;
      notificationTokensByDeviceId?: Record<string, any>;
      topUps?: Array<{ amount: number; expiresAt: number }>;
    }
    5.2 Thread Document
    TypeScript
    interface TChatDB {
      chatId: string;
      userId: string;
      projectId?: string;
      mode: TChatModes;
      metadata: {
        title?: string;
        folderId?: string;
        chatbot?: { id: string; name: string };
        projectId?: string;
      };
      createdAt: Timestamp;
      updatedAt: Timestamp;
    }
    5.3 Message Document (V2)
    TypeScript
    interface TMessageDBV2 {
      id: string;
      chatId: string;
      parentId?: string;
      childrenId: string[];
      role: 'user' | 'assistant' | 'system';
      contentV2: TMessageContentV2;
      modelId?: TLLMModels;
      timestamp: Timestamp;
      tokens: number;
      attachments?: TAttachment[];
      isPending: boolean;
      error?: { type: ErrorType; message: string };
    }
    
    type TMessageContentV2 = Array<
      | { type: 'TEXT'; text: string; tokens: number; reasoning?: string }
      | { type: 'TOOL_RESULT'; toolResults: Array<{ name: string; content: string; tokens: number }> }
      | { type: 'PROGRESS'; name: string; icon: string; steps: TProgressStep[]; metadata?: any }
    >;
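
Because the content is a discriminated union on `type`, consumers can switch on it exhaustively. The sketch below totals tokens across parts; `ContentPart` is a reduced copy of the documented shape (the PROGRESS fields are trimmed) and `countContentTokens` is a hypothetical helper, not a function from the repository.

```typescript
// Reduced copy of the documented content-part shapes.
type ContentPart =
  | { type: "TEXT"; text: string; tokens: number; reasoning?: string }
  | { type: "TOOL_RESULT"; toolResults: Array<{ name: string; content: string; tokens: number }> }
  | { type: "PROGRESS"; name: string; icon: string };

// Sum the token counts carried by TEXT parts and individual tool results.
function countContentTokens(content: ContentPart[]): number {
  let total = 0;
  for (const part of content) {
    switch (part.type) {
      case "TEXT":
        total += part.tokens;
        break;
      case "TOOL_RESULT":
        for (const result of part.toolResults) total += result.tokens;
        break;
      case "PROGRESS":
        break; // carries no token count in the documented shape
    }
  }
  return total;
}
```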

    6. API ROUTES

    6.1 Public Routes
    | Route | Method | Description |
    | --- | --- | --- |
    | /v1/public/health | GET | Health check |
    | /v1/rewards | GET | Get rewards |
    | /v1/register-ads | POST | Register ad views |

    6.2 Private Routes (Auth Required)

    Thread:

    | Route | Method | Description |
    | --- | --- | --- |
    | /v1/thread/unified | POST | Main endpoint |
    | /v1/thread/stop | POST | Stop generation |
    | /v1/thread/message | POST | Send message |

    Canvas:

    | Route | Method | Description |
    | --- | --- | --- |
    | /v1/user/canvas/:canvasId | GET | Get canvas content |
    | /v1/user/canvas/:canvasId | POST | Update canvas content |

    Canvas Architecture:

    • Canvas content stored in GCP Storage (GCS) as JSON
    • Path: {uid}/canvas/{canvasId}.json
    • Structure: { values: TCanvasValues[], history: { undos: [], redos: [] } }
    • Supports version history with undo/redo
    • Content type: application/json (gzipped)
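
The storage path and the undo/redo structure above can be sketched as follows. The assumption that each history entry is a full snapshot of `values` is mine; the document only gives the outer shape, and `applyEdit`, `undo`, and `redo` are hypothetical helper names.

```typescript
// Canvas document shape from the list above, generic over the value type.
interface CanvasDoc<V> {
  values: V[];
  history: { undos: V[][]; redos: V[][] };
}

// Object path inside the bucket: {uid}/canvas/{canvasId}.json
function canvasPath(uid: string, canvasId: string): string {
  return `${uid}/canvas/${canvasId}.json`;
}

// Apply an edit: remember the old values for undo and clear the redo stack.
function applyEdit<V>(doc: CanvasDoc<V>, next: V[]): CanvasDoc<V> {
  return { values: next, history: { undos: [...doc.history.undos, doc.values], redos: [] } };
}

// Step back one snapshot; returns the same document if there is nothing to undo.
function undo<V>(doc: CanvasDoc<V>): CanvasDoc<V> {
  const previous = doc.history.undos[doc.history.undos.length - 1];
  if (previous === undefined) return doc;
  return {
    values: previous,
    history: { undos: doc.history.undos.slice(0, -1), redos: [...doc.history.redos, doc.values] },
  };
}

// Step forward one snapshot; returns the same document if there is nothing to redo.
function redo<V>(doc: CanvasDoc<V>): CanvasDoc<V> {
  const next = doc.history.redos[doc.history.redos.length - 1];
  if (next === undefined) return doc;
  return {
    values: next,
    history: { undos: [...doc.history.undos, doc.values], redos: doc.history.redos.slice(0, -1) },
  };
}
```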

    User:

    | Route | Method | Description |
    | --- | --- | --- |
    | /v1/user/status | GET | Get user status |
    | /v1/user/history | GET | List history |
    | /v1/user/settings | GET/POST | Get/set settings |
    | /v1/user/shareChat | POST | Share chat |

    Projects:

    | Route | Method | Description |
    | --- | --- | --- |
    | /v1/projects | GET | List projects |
    | /v1/projects/create | POST | Create project |
    | /v1/projects/:id | GET/DELETE | Get/archive project |

    Tools:

    | Route | Method | Description |
    | --- | --- | --- |
    | /v1/tools/text/:toolId | POST | Text tools |
    | /v1/tools/image/:toolId | POST | Image tools |
    | /v1/tools/ai-detector | POST | AI detection |

    Wallflower (Image Generation):

    | Route | Method | Description |
    | --- | --- | --- |
    | /v1/wallflower/image-generation | POST | Generate images |
    | /v1/wallflower/images | GET | Get image history |
    | /v1/wallflower/like | POST | Like image |
    | /v1/wallflower/pin-image | POST | Pin image |

    Full Route List: apps/arcane/src/config/routing.ts


    7. TEMPLATES (WALLFLOWER)

    7.1 Available Templates

    Templates are pre-configured image generation presets:

    | Template ID | Name | Model | Description |
    | --- | --- | --- | --- |
    | ghibli-style | Ghiblify | gpt-image-1-medium | Convert to Studio Ghibli style |
    | watermark-remover | Watermark Remover | gemini-2.0-flash-exp | Remove watermarks |
    | product-photography | Product Photography | gpt-image-1-high | Professional product shots |
    | make-me-bald | Make Me Bald | gemini-2.0-flash-exp | Bald transformation |
    | minecraft-style | Minecraft Style | gpt-image-1-medium | Minecraft block style |
    | simpson-style | Simpson Style | gpt-image-1-high | Simpsons cartoon style |
    | pixar-style | Pixar Style | gpt-image-1-high | Pixar 3D animation style |
    | humanize-my-pet | Humanize My Pet | gpt-image-1-medium | Pet to human transformation |

    7.2 Template Structure
    TypeScript
    // apps/arcane/src/server/constantsSchemasAndTypes/wallflower/templates.constants.ts
    export const TEMPLATES = {
      "ghibli-style": {
        id: "ghibli-style",
        name: "Ghiblify",
        preset: {
          prompt: "turn this into ghibli style",
          style: "Auto",
          modelConfig: {
            modelId: "gpt-image-1-medium",
            numberOfImages: 1,
          },
        },
      },
      // ... more templates
    } as const;
    7.3 Template Processing

    Controller: endpoints/wallflower/unified-generation.controller.ts

    TypeScript
    case "TEMPLATE":
      switch (feature.templateId) {
        case "product-photography":
          formattedImage = await handleProductPhotographyTemplate({...});
          break;
        // ... other templates
      }

    Usage Limits: Templates inherit their model config from the preset; usage is calculated from the preset's model and numberOfImages.
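
As a rough illustration of "model + numberOfImages", the cost of a template run could be computed as below. The per-image costs in `COST_PER_IMAGE` are made-up placeholder numbers and `templateUsageCost` is a hypothetical helper; the document does not state the real pricing or the real function.

```typescript
// Hypothetical per-image costs; the real values are not given in this document.
const COST_PER_IMAGE: Record<string, number> = {
  "gpt-image-1-medium": 4,
  "gpt-image-1-high": 8,
};

// Only the part of a template preset that matters for usage accounting.
interface TemplatePreset {
  modelConfig: { modelId: string; numberOfImages: number };
}

// Usage = cost of the preset's model multiplied by the number of images requested.
function templateUsageCost(preset: TemplatePreset): number {
  const perImage = COST_PER_IMAGE[preset.modelConfig.modelId];
  if (perImage === undefined) {
    throw new Error(`no cost configured for model ${preset.modelConfig.modelId}`);
  }
  return perImage * preset.modelConfig.numberOfImages;
}
```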


    8. FALLBACK STRATEGIES

    8.1 Generic Fallback Pattern

    Utility: utilities/call-function-with-fallback.ts

    TypeScript
    // Runs primaryFunc; if it throws, logs the error and runs fallbackFunc instead.
    export async function callWithFallback<T, U>(
      primaryFunc: () => Promise<T>,
      fallbackFunc: () => Promise<U>,
    ): Promise<T | U> {
      try {
        return await primaryFunc();
      } catch (error) {
        logger.error(error, "ERROR/PRIMARY_FUNCTION_FAILED");
        // Errors from the fallback itself propagate to the caller
        return await fallbackFunc();
      }
    }
    8.2 Image Generation Fallbacks

    Fal.ai ↔ Replicate Fallback:

    TypeScript
    // endpoints/wallflower/helpers/fal-ai.ts
    export async function handleFalWithReplicateFallback({...}) {
      return await callWithFallback(
        async () => await callFalAI({...}),
        async () => await callReplicate({
          modelId: FALLBACK_MODELS_MAP[feature.modelConfig.modelId],
          ...
        })
      );
    }
    
    // endpoints/wallflower/helpers/replicate.ts
    export async function handleReplicateWithFalAIFallback({...}) {
      return await callWithFallback(
        async () => await callReplicate({...}),
        async () => await callFalAI({
          modelId: FALLBACK_MODELS_MAP[feature.modelConfig.modelId],
          ...
        })
      );
    }

    Fallback Models Map (constantsSchemasAndTypes/wallflower/unified-generation.constants.ts):

    TypeScript
    export const FALLBACK_MODELS_MAP = {
      // Maps primary models to fallback equivalents
    };
    8.3 Deep Research Fallbacks

    Serp Query Fallback (features/deepResearch/firecrawlSerp.ts):

    1. Primary: Bing-based scraping
    2. Fallback 1: Firecrawl scraping
    3. Fallback 2: Basic URL fetch
    TypeScript
    // Helper covering the last two steps of the chain: Firecrawl, then basic fetch
    async function scrapeWithFallbackMethod(urls: string[]) {
      try {
        // Try Firecrawl first
        const result = await firecrawl.scrape(urls);
        return result;
      } catch (error) {
        logger.error(error, "ERROR/DEEP_RESEARCH/FALLBACK_SCRAPING");
        // Fall back to basic fetch
        return await basicFetch(urls);
      }
    }

    Google Search Fallback (features/deepResearch/generateSerpQueries.ts):

    TypeScript
    // Google search fallback, used when the Bing search throws
    let results;
    try {
      results = await bingSearch(queries);
    } catch (error) {
      logger.error(error, "ERROR/DEEP_RESEARCH/GOOGLE_SEARCH_FALLBACK");
      results = await googleSearch(queries);
    }
    8.4 AI Detection Fallback

    Service: services/aiDetection.ts

    TypeScript
    async detectWithFallback(text: string): Promise<{...}> {
      try {
        return await primaryDetectionAPI(text);
      } catch (error) {
        return await secondaryDetectionAPI(text);
      }
    }

    Usage:

    • endpoints/tools/aiDetection.ts
    • endpoints/tools/public/aiDetectionPublic.ts
    • endpoints/tools/aiEssayMetricsGenerator.ts
    8.5 RAG Embeddings Fallback

    File: endpoints/unified/features/rag.ts

    TypeScript
    const FALLBACK_ATTACHMENT_KEY = "DOCUMENT_CHUNK";
    
    const getChunksFromFallbackLIResults = (embeddings: TLIResult[]): string[] => {
      // Fallback for items without attachmentId/nodeIndex (old embeddings)
      const fallbackKey = `${item.attributes?.name ?? FALLBACK_ATTACHMENT_KEY}`;
      // Group and return chunks
    };
    
    // Used when:
    // 1. New embedding format not available
    // 2. attachmentId missing
    // 3. nodeIndex missing (legacy embeddings)
    8.6 Model Selection Fallback (Merlin Magic)

    File: endpoints/unified/features/merlinMagic.ts

    TypeScript
    // Excerpt (logic elided): when primary inference fails, the code logs
    // "ERROR/IMAGE_INFERENCE_FALLBACK", builds inferenceFallbackOutput with the
    // fallback model selection logic, and returns inferenceFallbackOutput.
    8.7 YouTube Transcription Fallback

    File: utilities/youtube/youtube.ts

    TypeScript
    // Commented out but documented fallback strategy
    // const CACHE_FALLBACK_QUERY_CONSUMED = 2;
    // const CACHE_FALLBACK_TOKENS_CONSUMED = 1000;
    // Fallback when primary transcription method fails
    8.8 MCP Tool Result Fallback

    File: utilities/mcp/functions/zapMCPToolResult.ts

    TypeScript
    const MAX_FALLBACK_CHAR_LIMIT = 20000; // approx. 5k tokens
    
    // Truncate tool results to fit within limit
    if (result.length > MAX_FALLBACK_CHAR_LIMIT) {
      result = result.substring(0, MAX_FALLBACK_CHAR_LIMIT);
    }
    8.9 Progress Event Fallback Index

    File: constantsSchemasAndTypes/streamer/streamer.constants.ts

    TypeScript
    // Default fallback index for tool results as progress events
    export const DEFAULT_FALLBACK_INDEX = 0;

    Usage: When tool result index is not specified, defaults to 0.


    9. CRITICAL ARCHITECTURE DECISIONS

    9.1 Why Express-Zod-API?

    Decision: Use express-zod-api over raw Express, NestJS, Fastify

    Rationale:

    • Type safety with Zod schemas
    • Auto-generated OpenAPI documentation
    • Type-safe API client generation
    • Clean middleware composition
    • Built-in error serialization

    Trade-offs:

    • ✅ Pros: Type safety, auto-docs, less boilerplate
    • ❌ Cons: Learning curve, vendor lock-in
    9.2 Why AsyncLocalStorage?

    Decision: Use Node.js AsyncLocalStorage for request context

    Rationale:

    • No prop drilling across 6+ middleware layers
    • Global access without parameters
    • Request isolation
    • Minimal overhead

    Risk: Single point of failure - if initContext fails, all context access returns empty object

    9.3 Why Firestore?

    Decision: Use Firestore (NoSQL) over PostgreSQL/MySQL

    Rationale:

    • Flexible schema for varying message structures
    • Auto-scaling without sharding
    • Nested data model (Thread → Messages)
    • Firebase Auth integration

    Limitations:

    • No SQL joins (must denormalize)
    • Transactions are subject to per-commit limits (500 writes per commit in older Firestore quotas)
    • Eventual consistency
    9.4 Why SSE Over WebSockets?

    Decision: Use Server-Sent Events for streaming

    Rationale:

    • HTTP-based, no upgrade handshake
    • Auto-reconnect built-in
    • Firewall friendly
    • One-way is sufficient for streaming

    Trade-offs:

    • ✅ Pros: Simple, low overhead, auto-reconnect
    • ❌ Cons: One-way only, no binary data
    9.5 Why Monorepo?

    Decision: Use pnpm monorepo with Turborepo

    Rationale:

    • Code sharing across apps
    • Atomic commits
    • Consistent tooling
    • Efficient builds (caching, parallelization)

    Trade-offs:

    • ✅ Pros: Code sharing, atomic commits, efficient builds
    • ❌ Cons: Larger repo, coupled deployments

    10. SECURITY

    10.1 Authentication Flow
    Plain text
    Client → Firebase Auth → JWT → Backend verifies → Firestore user doc

    Security Measures:

    1. JWT verification (Firebase Admin SDK)
    2. Custom claims for RBAC
    3. User document for additional permissions
    4. Token refresh via NextAuth

    Vulnerabilities:

    | Vulnerability | Risk | Status |
    | --- | --- | --- |
    | JWT token theft | High | ✅ Mitigated (short expiry, HttpOnly cookies) |
    | Custom claims tampering | Critical | ✅ Mitigated (server-side only) |
    | Firestore rule bypass | Critical | ✅ Mitigated (all queries through backend) |
    | CSRF | Medium | ⚠️ Needs review |

    10.2 Rate Limiting

    Current: Only guest users rate limited (50 requests / 15 min via Redis)

    Gaps:

    • No rate limiting for authenticated users
    • IP-based (bypassable with rotating IPs)
    • No endpoint-specific limits
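
One way to close the first and third gaps is to key the limiter on something other than the IP address, for example `uid` or `uid:endpoint`. The sketch below is an in-memory fixed-window limiter standing in for the Redis-backed one; the class name and keying scheme are assumptions, and a production version would need shared state across instances.

```typescript
interface WindowState {
  count: number;
  windowStart: number;
}

// Fixed-window counter: at most `limit` requests per `windowMs` for each key.
class FixedWindowLimiter {
  private readonly windows = new Map<string, WindowState>();
  private readonly limit: number;
  private readonly windowMs: number;

  constructor(limit: number, windowMs: number) {
    this.limit = limit;
    this.windowMs = windowMs;
  }

  // Returns true if the request is allowed, false if the key is over its limit.
  allow(key: string, now: number): boolean {
    const state = this.windows.get(key);
    if (!state || now - state.windowStart >= this.windowMs) {
      // First request for this key, or the previous window has expired.
      this.windows.set(key, { count: 1, windowStart: now });
      return true;
    }
    if (state.count >= this.limit) return false;
    state.count += 1;
    return true;
  }
}

// The documented guest limit: 50 requests per 15 minutes.
const guestLimiter = new FixedWindowLimiter(50, 15 * 60 * 1000);
```

A key such as `${uid}:/v1/thread/unified` would give per-user, per-endpoint limits with the same class.
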
    10.3 Input Validation

    Layers:

    1. Zod schemas (all API inputs)
    2. Content moderation (async side action)
    3. Token limits (query size validation)
    4. File type validation (MIME type)

    11. SCALABILITY

    11.1 Current Scaling

    Horizontal Scaling (Cloud Run):

    • arcane (primary)
    • arcane-copy (failover)
    • arcane-deepresearch (specialized)

    Bottlenecks:

    | Component | Current | At Scale | Solution |
    | --- | --- | --- | --- |
    | Firestore writes | ~1K/sec | Sharding needed | Shard by user ID |
    | Redis | Single instance | Connection pool exhausted | Redis Cluster |
    | LLM APIs | Rate limited per key | Multiple keys | Round-robin keys |
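
"Shard by user ID" needs a stable mapping from a user ID to a shard index. A minimal sketch, assuming a 32-bit FNV-1a hash over the string's UTF-16 code units (the document does not say which hash is intended, and `shardFor` is a hypothetical name):

```typescript
// 32-bit FNV-1a over UTF-16 code units; returns an unsigned 32-bit integer.
function fnv1a(input: string): number {
  let hash = 0x811c9dc5; // FNV offset basis
  for (let i = 0; i < input.length; i++) {
    hash ^= input.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193); // multiply by the FNV prime, modulo 2^32
  }
  return hash >>> 0;
}

// Map a user ID to a shard index in [0, numShards).
function shardFor(userId: string, numShards: number): number {
  if (!Number.isInteger(numShards) || numShards <= 0) {
    throw new RangeError("numShards must be a positive integer");
  }
  return fnv1a(userId) % numShards;
}
```

Plain modulo sharding remaps most users when the shard count changes, so the count should be chosen with headroom or replaced by consistent hashing.
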
    11.2 Performance Optimizations

    Implemented:

    • Async side actions (non-blocking)
    • Redis caching (user settings, model configs)
    • SSE streaming (reduced time-to-first-token)
    • Skip embeddings for large contexts (>6000 tokens)
    • Pre-calculated token counts

    Opportunities:

    • Response caching (identical queries)
    • Embedding caching (RAG queries)
    • Database indexing
    • Connection pooling
    11.3 Memory Management

    Cloud Run Limits: 16GB max, 60min timeout, 4 vCPU

    Memory Leaks to Watch:

    • AsyncLocalStorage context not cleaned up on error
    • Redis IRC subscriptions not cleaned up
    • PassThrough streams not destroyed on error

    Monitoring:

    TypeScript
    process.on("uncaughtException", (err) => {
      logger.error(err, "ERROR/COMPLETE_INSTANCE_FAILURE");
      setTimeout(() => process.exit(1), 20000);
    });

    12. FAILURE MODES

    12.1 Single Points of Failure
    | Component | Impact | Recovery |
    | --- | --- | --- |
    | Firebase Auth | Complete auth failure | 5-10 min |
    | Firestore | All data operations fail | 10-30 min |
    | Redis | Rate limiting, IRC fails | Immediate (bypass) |
    | OpenAI API | GPT models unavailable | Immediate (fallback) |

    12.2 Error Handling

    Current Pattern:

    TypeScript
    try {
      await riskyOperation();
    } catch (error) {
      if (error instanceof ClientError) throw error;
      logger.error(error, "ERROR/OPERATION_FAILED");
      throw new ServerError(500, ErrorType.INTERNAL_SERVER_ERROR);
    }

    Issues:

    • No retry logic for transient failures
    • No circuit breakers
    • No graceful degradation
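
To make the missing circuit breaker concrete, here is a minimal state-machine sketch. It is not code from the repository: the class, its thresholds, and the explicit `now` parameter are assumptions chosen to keep the example deterministic.

```typescript
type BreakerState = "CLOSED" | "OPEN" | "HALF_OPEN";

// Opens after `failureThreshold` consecutive failures, then allows trial
// calls again once `cooldownMs` has elapsed.
class CircuitBreaker {
  private state: BreakerState = "CLOSED";
  private failures = 0;
  private openedAt = 0;
  private readonly failureThreshold: number;
  private readonly cooldownMs: number;

  constructor(failureThreshold: number, cooldownMs: number) {
    this.failureThreshold = failureThreshold;
    this.cooldownMs = cooldownMs;
  }

  // Whether a call may be attempted at time `now`.
  canAttempt(now: number): boolean {
    if (this.state === "OPEN" && now - this.openedAt >= this.cooldownMs) {
      this.state = "HALF_OPEN"; // cooldown over: let trial calls through
    }
    return this.state !== "OPEN";
  }

  // A successful call closes the breaker and clears the failure count.
  recordSuccess(): void {
    this.failures = 0;
    this.state = "CLOSED";
  }

  // A failed trial call, or too many consecutive failures, opens the breaker.
  recordFailure(now: number): void {
    this.failures += 1;
    if (this.state === "HALF_OPEN" || this.failures >= this.failureThreshold) {
      this.state = "OPEN";
      this.openedAt = now;
    }
  }

  current(): BreakerState {
    return this.state;
  }
}
```

A caller would check `canAttempt(Date.now())` before each provider call, skip straight to the fallback when it returns false, and report the outcome with `recordSuccess()` or `recordFailure(Date.now())`.
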
    12.3 Database Conflict Resolution

    Retry Logic (Firestore document conflicts):

    TypeScript
    let messageIncrement = 0;
    let trying = true;
    while (trying) {
      try {
        const batch = db.batch();
        // On conflict, retry with the next document index
        this.setMessage(batch, userMessage, this.totalDocs + messageIncrement);
        await batch.commit();
        trying = false;
      } catch (err) {
        // Give up after the initial attempt plus three retries
        if (messageIncrement === 3) throw MAX_RETRY_ERROR;
        messageIncrement++;
      }
    }
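
The control flow of the retry loop can be exercised without Firestore. In the sketch below `makeStore` is a hypothetical in-memory stand-in whose `commitAt` fails when an index is already taken; it only simulates a conflict and says nothing about how Firestore itself reports one.

```typescript
// Hypothetical stand-in for a batch commit: fails if the index is already taken.
function makeStore(taken: number[]) {
  const used = new Set<number>(taken);
  return {
    commitAt(index: number): void {
      if (used.has(index)) throw new Error(`conflict at index ${index}`);
      used.add(index);
    },
    has: (index: number): boolean => used.has(index),
  };
}

const MAX_RETRIES = 3;

// Try totalDocs, totalDocs + 1, ... and return the index that was finally written.
// Mirrors the loop above: one initial attempt plus up to MAX_RETRIES retries.
function writeWithIncrement(commitAt: (index: number) => void, totalDocs: number): number {
  for (let increment = 0; ; increment++) {
    try {
      commitAt(totalDocs + increment);
      return totalDocs + increment;
    } catch (err) {
      if (increment === MAX_RETRIES) throw err;
    }
  }
}
```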

    13. DEPLOYMENT

    13.1 Pipelines

    Backend (Cloud Run via cloudbuild.yaml):

    YAML
    steps:
      - Build Docker image with Kaniko
      - Push to GCR
      - Deploy to Cloud Run (arcane, arcane-copy, arcane-deepresearch)

    Website (Vercel):

    • Auto-deploy on push to develop/review branches
    • Environment variables in Vercel dashboard
    13.2 Environment Variables

    Backend:

    Bash
    FIREBASE_PROJECT_ID=foyer-work
    OPENAI_API_KEY=sk-...
    REDIS_HOST=localhost

    Website:

    Bash
    NEXT_PUBLIC_API_URL=http://localhost:8080
    NEXTAUTH_SECRET=...

    Critical: Environment variables NOT validated at startup - missing vars cause runtime errors.
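
A startup check would turn these runtime errors into an immediate, readable failure. A minimal sketch, assuming the three backend variables shown above are all required; `missingEnvVars` and `assertEnv` are hypothetical helpers, not existing project code.

```typescript
type Env = Record<string, string | undefined>;

// Names of required variables that are unset or blank in `env`.
function missingEnvVars(env: Env, required: readonly string[]): string[] {
  return required.filter((name) => {
    const value = env[name];
    return value === undefined || value.trim() === "";
  });
}

// Throw a single error listing every missing variable.
function assertEnv(env: Env, required: readonly string[]): void {
  const missing = missingEnvVars(env, required);
  if (missing.length > 0) {
    throw new Error(`Missing required environment variables: ${missing.join(", ")}`);
  }
}

// Variable names taken from the backend example above.
const BACKEND_REQUIRED = ["FIREBASE_PROJECT_ID", "OPENAI_API_KEY", "REDIS_HOST"] as const;
```

Calling `assertEnv(process.env, BACKEND_REQUIRED)` at the top of the entry point, before the server is created, would make a misconfigured deployment fail at boot.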

    13.3 Rollback
    Bash
    # Backend
    gcloud run services update-traffic arcane --to-revisions=arcane-abc123=100
    
    # Website
    vercel rollback [deployment-url]

    14. PRINCIPAL ENGINEER INTERVIEW Q&A

    Q1: How do you ensure atomic writes for related documents?

    Problem: Two Firestore writes can result in orphaned documents if the second fails.

    Current Solution: Retry with document index increment

    TypeScript
    let messageIncrement = 0;
    let trying = true;
    while (trying) {
      try {
        const batch = db.batch();
        this.setMessage(batch, msg1, this.totalDocs + messageIncrement);
        this.setMessage(batch, msg2, this.totalDocs + messageIncrement);
        await batch.commit();
        trying = false;
      } catch (err) {
        if (messageIncrement === 3) throw err; // rethrow after the final retry
        messageIncrement++;
      }
    }

    Better Solutions:

    1. Firestore Transactions: Atomic reads and writes, but bounded by Firestore's per-transaction limits (request size, and historically 500 document writes)
    2. Outbox Pattern: Write to outbox, process async
    3. Event Sourcing: Store changes as events

    Key Insight: Retry-with-increment is pragmatic for Firestore contention. Not truly atomic but achieves eventual consistency.


    Q2: How would you scale to 100K concurrent users?

    Current Bottlenecks:

    | Component | Current | Solution |
    | --- | --- | --- |
    | Firestore writes | ~1K/sec | Shard by user ID |
    | Redis | Single instance | Redis Cluster |
    | LLM APIs | Rate limited | Multiple keys + round-robin |

    Architecture Changes:

    1. Database Sharding:
    TypeScript
    const shardId = hash(userId) % NUM_SHARDS;
    const db = getFirestoreShard(shardId);
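    2. Request Queue:
    TypeScript
    const queue = new Queue('llm-calls', {
      redis: redisCluster,
      // Bull-style queues take a backoff object (or a fixed delay in ms); the 1s base delay is illustrative.
      defaultJobOptions: { attempts: 3, backoff: { type: 'exponential', delay: 1000 } }
    });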
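    3. Response Caching:
    TypeScript
    // Serialize the inputs first: concatenating a settings object would yield "[object Object]".
    const cacheKey = hash(JSON.stringify({ prompt, model, settings }));
    const cached = await redis.get(cacheKey);
    if (cached) return cached;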

    Key Insight: LLM API rate limits are the biggest bottleneck, not infrastructure. Solution: multi-key rotation + caching.
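
The two halves of that insight, multi-key rotation and caching, can be sketched without any infrastructure. Everything below is illustrative: `KeyRotator` and `cacheKeyFor` are hypothetical names, and a real deployment would likely hash the key and store cached responses in Redis rather than use the raw string.

```typescript
// Hypothetical round-robin selector over several provider API keys.
class KeyRotator {
  private readonly keys: readonly string[];
  private next = 0;

  constructor(keys: readonly string[]) {
    if (keys.length === 0) throw new Error("At least one API key is required");
    this.keys = keys;
  }

  // Returns the keys in order and wraps around after the last one.
  nextKey(): string {
    const key = this.keys[this.next];
    this.next = (this.next + 1) % this.keys.length;
    return key;
  }
}

// Builds a deterministic cache key: settings are sorted by name so that
// { a: 1, b: 2 } and { b: 2, a: 1 } map to the same cached response.
function cacheKeyFor(
  prompt: string,
  model: string,
  settings: Record<string, unknown>,
): string {
  const sortedSettings = Object.keys(settings)
    .sort()
    .map((name) => [name, settings[name]]);
  return JSON.stringify([model, prompt, sortedSettings]);
}
```

Plain round-robin spreads load evenly but ignores which key is currently throttled; tracking per-key 429 responses would be a natural refinement.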


    Q3: How do you handle slow LLM providers?

    Current: No timeout, no fallback - request hangs.

    Better: Circuit Breaker + Timeout + Fallback

    TypeScript
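    import CircuitBreaker from 'opossum'; // assumed library; the options below match its API
    
    const breakerOptions = {
      timeout: 30000,               // fail calls that take longer than 30s
      errorThresholdPercentage: 50, // open the circuit at a 50% failure rate
      resetTimeout: 60000           // allow a trial call after 60s
    };
    const openaiBreaker = new CircuitBreaker(callOpenAI, breakerOptions);
    const anthropicBreaker = new CircuitBreaker(callAnthropic, breakerOptions); // callAnthropic is hypothetical
    
    async function callLLM(prompt, model) {
      try {
        return await openaiBreaker.fire(prompt);
      } catch (error) {
        if (openaiBreaker.opened) {
          return await anthropicBreaker.fire(prompt); // Fallback
        }
        throw error;
      }
    }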

    Key Insight: Use circuit breakers to fail fast, not just timeouts. Prevents cascading failures.
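
To make "fail fast" concrete, the sketch below shows the state machine a circuit breaker runs internally. It is a deliberately simplified illustration, not a replacement for a library: `SimpleBreaker` is a hypothetical class, and it counts consecutive failures rather than the rolling error percentage used in the configuration above.

```typescript
type BreakerState = "closed" | "open" | "half-open";

// Simplified breaker: opens after N consecutive failures, rejects calls
// immediately while open, and allows one trial call after the reset timeout.
class SimpleBreaker {
  private state: BreakerState = "closed";
  private failures = 0;
  private openedAt = 0;
  private readonly failureThreshold: number;
  private readonly resetTimeoutMs: number;
  private readonly now: () => number;

  constructor(
    failureThreshold: number,
    resetTimeoutMs: number,
    now: () => number = Date.now, // injectable clock for tests
  ) {
    this.failureThreshold = failureThreshold;
    this.resetTimeoutMs = resetTimeoutMs;
    this.now = now;
  }

  get isOpen(): boolean {
    return this.state === "open";
  }

  async fire<T>(action: () => Promise<T>): Promise<T> {
    if (this.state === "open") {
      if (this.now() - this.openedAt < this.resetTimeoutMs) {
        throw new Error("Circuit open: failing fast"); // provider is not called
      }
      this.state = "half-open"; // timeout elapsed: allow one trial call
    }
    try {
      const result = await action();
      this.failures = 0;
      this.state = "closed";
      return result;
    } catch (error) {
      this.failures++;
      if (this.state === "half-open" || this.failures >= this.failureThreshold) {
        this.state = "open";
        this.openedAt = this.now();
      }
      throw error;
    }
  }
}
```

While the circuit is open, callers get an error in microseconds instead of waiting out a 30-second timeout, which is what keeps a slow provider from tying up every request.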


    Q4: How would you implement fair rate limiting?

    Current: Only guest users limited (50/15min).

    Fair Design:

    TypeScript
    const rateLimits = {
      GUEST: { points: 50, duration: 900 },
      FREE: { points: 100, duration: 60 },
      PRO: { points: 500, duration: 60 },
    };
    
    const endpointLimits = {
      '/v1/thread/unified': { multiplier: 1 },
      '/v1/tools/image': { multiplier: 5 },
      '/v1/deep-research': { multiplier: 10 }
    };
    
    async function rateLimit(req, user) {
      // One limiter per plan, each configured from rateLimits (rateLimiters is assumed).
      const limiter = rateLimiters[user.userPlan];
      // The endpoint multiplier is the number of points a request costs.
      const cost = endpointLimits[req.path]?.multiplier ?? 1;
      await limiter.consume(user.uid, cost);
    }

    Key Insight: Rate limit by user ID (not IP), apply endpoint-specific costs.
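
The accounting behind that design can be shown with an in-memory fixed-window counter. This is a sketch under stated assumptions: `tryConsume`, `planLimits`, and `endpointCost` are hypothetical names that mirror the figures above, and a production version would keep the counters in Redis so all instances share them.

```typescript
type Plan = "GUEST" | "FREE" | "PRO";

const planLimits: Record<Plan, { points: number; durationSec: number }> = {
  GUEST: { points: 50, durationSec: 900 },
  FREE: { points: 100, durationSec: 60 },
  PRO: { points: 500, durationSec: 60 },
};

const endpointCost: Record<string, number> = {
  "/v1/thread/unified": 1,
  "/v1/tools/image": 5,
  "/v1/deep-research": 10,
};

// Per-user window state, keyed by user ID rather than IP address.
const windows = new Map<string, { windowStart: number; used: number }>();

// Returns true if the request fits in the user's current window.
function tryConsume(uid: string, plan: Plan, path: string, nowMs: number): boolean {
  const { points, durationSec } = planLimits[plan];
  const cost = endpointCost[path] ?? 1; // unknown endpoints cost one point
  let entry = windows.get(uid);
  if (!entry || nowMs - entry.windowStart >= durationSec * 1000) {
    entry = { windowStart: nowMs, used: 0 }; // start a fresh window
    windows.set(uid, entry);
  }
  if (entry.used + cost > points) return false; // over budget: reject
  entry.used += cost;
  return true;
}
```

With these numbers a guest can afford five deep-research calls (10 points each) or fifty unified-thread calls per 15-minute window, which is the fairness property the endpoint multipliers are meant to provide.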


    Q5: How do you prevent prompt injection?

    Current: Basic content moderation, no specific injection detection.

    Defense Layers:

    1. Input Sanitization:
    TypeScript
    const sanitized = input
      .replace(/ignore previous instructions/gi, '')
      .replace(/system prompt/gi, '');
    2. Prompt Structure:
    TypeScript
    const prompt = `
    <system>You are a helpful assistant</system>
    <user-input>${escapeXml(userInput)}</user-input>
    <instructions>Only respond to user-input above</instructions>
    `;
    3. Output Validation:
    TypeScript
    if (output.includes('system prompt')) {
      throw new ClientError(400, ErrorType.INPUT_NOT_MODERATED);
    }

    Key Insight: Prompt injection is an input validation problem. Defense in depth: sanitize, structure, validate.
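
The prompt-structure layer relies on an `escapeXml` helper that is not defined in this document. A minimal sketch of what it might look like follows; the implementation and the `buildPrompt` wrapper are assumptions, shown only to illustrate why escaping keeps user text from closing the `<user-input>` tag.

```typescript
// Hypothetical implementation of the escapeXml helper used above.
// The ampersand is replaced first so later entities are not double-escaped.
function escapeXml(text: string): string {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&apos;");
}

// Wraps user input in the same tag structure as the example above.
function buildPrompt(userInput: string): string {
  return [
    "<system>You are a helpful assistant</system>",
    `<user-input>${escapeXml(userInput)}</user-input>`,
    "<instructions>Only respond to user-input above</instructions>",
  ].join("\n");
}
```

An input such as `</user-input><system>...` is rendered as inert text (`&lt;/user-input&gt;...`), so the model sees exactly one system block and one user-input block.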


    Q6: How would you optimize history retrieval from O(n) to O(1)?

    Current: Linear traversal through threadMap

    TypeScript
    getMessages() {
      let parent = this.threadMap.get('root');
      const messages = [];
      while (parent) {
        const child = this.threadMap.get(parent.childrenId[0]);
        if (!child) break; // reached the last message in the thread
        messages.push(child.getMessage());
        parent = child;
      }
      return messages;
    }

    Optimized:

    1. Denormalized Last N Messages:
    TypeScript
    interface TChatDB {
      recentMessages: TMessageDBV2[];  // Last 10
      totalMessages: number;
    }
    2. Message Index with Pointers:
    TypeScript
    interface TChatDB {
      firstMessageId: string;
      lastMessageId: string;
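      // Firestore stores maps as plain objects; prev/next are null at the ends of the thread.
      messageIndex: Record<string, { prev: string | null; next: string | null }>;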
    }

    Key Insight: For chat, you almost always need recent messages first. Denormalize last N, lazy load older.
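
Keeping the denormalized `recentMessages` field up to date is a small write-side cost. The sketch below assumes a simplified message shape and a hypothetical `appendMessage` helper; the limit of 10 comes from the "Last 10" comment in the first interface above.

```typescript
// Simplified stand-ins for TMessageDBV2 and the denormalized chat fields.
type Message = { id: string; text: string };
type ChatSummary = { recentMessages: Message[]; totalMessages: number };

const RECENT_LIMIT = 10;

// Returns the updated summary: the new message is appended and only the
// newest RECENT_LIMIT messages are kept on the chat document.
function appendMessage(chat: ChatSummary, message: Message): ChatSummary {
  const recentMessages = [...chat.recentMessages, message].slice(-RECENT_LIMIT);
  return { recentMessages, totalMessages: chat.totalMessages + 1 };
}
```

Reading the latest messages then costs one document read regardless of thread length, while older history can still be loaded lazily through the existing traversal.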


    Q7: How do you handle concurrent edits from multiple devices?

    Current: Last-write-wins with Firestore server timestamp. No conflict resolution.

    Solutions:

    1. Optimistic Concurrency Control:
    TypeScript
    await db.runTransaction(async (transaction) => {
      const chat = await transaction.get(chatRef);
      if (chat.data().version !== clientVersion) {
        throw new ConflictError('Chat modified by another device');
      }
      transaction.update(chatRef, { version: clientVersion + 1 });
    });
    2. Queue-Based Serialization:
    TypeScript
    await chatQueue.add(chatId, { message, userId });
    // Processed sequentially per chatId

    Key Insight: For chat, optimistic concurrency + client-side merge is sufficient.


    Q8: What's the cost structure per request?

    GPT-4o (15x query cost):

    | Component | Cost |
    | --- | --- |
    | Input tokens (1K) | $0.0075 |
    | Output tokens (500) | $0.01125 |
    | Embeddings (RAG) | $0.0001 |
    | Firestore writes | $0.00002 |
    | Cloud Run | $0.00001 |
    | Total | ~$0.02 |

    Deep Research (Claude 3 Opus, 50x): ~$1.00 per request

    Key Insight: LLM API costs dominate (99%+). Optimize: reduce tokens, cache responses, use cheaper models.
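
The 99%+ figure can be checked directly from the GPT-4o table. The snippet below only re-adds the per-request numbers listed above; the variable names are illustrative.

```typescript
// Per-request figures from the GPT-4o table above, in USD.
const inputTokens = 0.0075;
const outputTokens = 0.01125;
const embeddings = 0.0001;
const firestoreWrites = 0.00002;
const cloudRun = 0.00001;

const total =
  inputTokens + outputTokens + embeddings + firestoreWrites + cloudRun;

// Share of the total spent on LLM input and output tokens alone.
const llmShare = (inputTokens + outputTokens) / total;

console.log(total.toFixed(5)); // 0.01888
console.log((llmShare * 100).toFixed(1)); // 99.3
```

Infrastructure (Firestore plus Cloud Run) accounts for $0.00003 of roughly $0.019, so token reduction and caching are the only optimizations that move the total.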


    Q9: What would cause catastrophic failure?

    Answer: Firebase Auth + Firestore simultaneous outage.

    Why:

    • No auth fallback → all requests rejected
    • No database fallback → can't read any data
    • No offline mode → complete failure

    Mitigation:

    1. Session Cache (Immediate):
    TypeScript
    const cached = await redis.get(`session:${token}`);
    if (cached) return JSON.parse(cached);
    const user = await verifyToken(token);
    await redis.setex(`session:${token}`, 300, JSON.stringify(user));
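    2. Read-Only Fallback:
    TypeScript
    try {
      const snapshot = await db.collection('chats').doc(id).get();
      return snapshot.data();
    } catch (error) {
      // Firestore unavailable: serve the last cached copy (read-only).
      return cachedChats.get(userId);
    }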

    Key Insight: System has no graceful degradation. Any Firebase failure causes complete outage.
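
Both mitigations follow the same shape: try the primary source, and serve a cached copy when it fails. A generic version of that pattern might look like the sketch below; `readWithFallback` and `ReadResult` are hypothetical names, and the in-memory `Map` stands in for whatever cache (for example Redis) the service would actually use.

```typescript
// Result of a degraded-capable read; `stale` tells the caller to go read-only.
type ReadResult<T> = { value: T; stale: boolean };

// Hypothetical helper: read from the primary source, fall back to the cache.
async function readWithFallback<T>(
  primary: () => Promise<T>,
  cache: Map<string, T>,
  key: string,
): Promise<ReadResult<T>> {
  try {
    const value = await primary();
    cache.set(key, value); // refresh the fallback copy on every success
    return { value, stale: false };
  } catch (error) {
    const cached = cache.get(key);
    if (cached === undefined) throw error; // nothing to degrade to
    return { value: cached, stale: true };
  }
}
```

Returning the `stale` flag lets the caller disable writes and show a degraded-mode notice instead of failing the whole request.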


    15. QUICK REFERENCE

    Commands
    Bash
    # Local Development
    pnpm dev                    # Start all apps
    pnpm build                  # Build everything
    
    # Backend (port 8080)
    cd apps/arcane && pnpm dev
    
    # Website (port 3001)
    cd apps/website && pnpm dev
    
    # Extension
    cd apps/extension && pnpm dev
    Key Files
    | Purpose | File |
    | --- | --- |
    | API Routes | apps/arcane/src/config/routing.ts |
    | Main Endpoint | apps/arcane/src/server/endpoints/unified/unified.ts |
    | Auth Middleware | apps/arcane/src/server/middlewares/auth/auth.ts |
    | User Model | apps/arcane/src/server/models/user.ts |
    | Thread Model | apps/arcane/src/server/models/thread.ts |
    | Schema Builder | apps/arcane/src/server/repositories/engine/schema.ts |
    | Website Middleware | apps/website/middleware.ts |
    Troubleshooting
    | Symptom | Fix |
    | --- | --- |
    | 401 Unauthorized | Check Authorization header |
    | 429 Rate Limited | Wait 15 minutes or upgrade |
    | 500 Internal Error | Check Cloud Logging |
    | Streaming fails | Check network, retry |

    Last Updated: March 27, 2026
    Version: 6.0.0 (Complete Architecture + Canvas + Templates + Fallbacks)
    Document Status: ✅ Complete (Bonkers monorepo architecture with all critical systems)