Bonkers — End-to-End System Guide (Knowledge Base)
Confidential Internal Document — Preserves complete architectural context of the Bonkers image generation platform and its backend (Merlin Arcane / Cauldron monorepo).
Bonkers (bonkers/ — codename "cauldron"):
Merlin Arcane (merlin-arcane/):
The website uses Next.js 14 App Router with:
- next-intl with [lang] prefix (27 languages)
- next-auth v5 (beta) with Credentials provider
- @tanstack/react-query v5 with async persistence (24h gc)
- Zustand v5 (auth UI, SSE, attachments)
- class strategy)

| Route | Purpose |
|---|---|
/{lang}/bonkers | Bonkers image generation — canvas, model selection, generation |
/{lang}/old-bonkers | Legacy Bonkers |
/{lang}/chat/[[...chatId]] | Main chat interface |
/{lang}/creations/[iid] | Shared creations (public gallery) |
/{lang}/pricing | Subscription plans |
/{lang}/old-profile | User profile, settings, subscription |
/{lang}/old-vault | Knowledge vault (file management) |
/{lang}/updates | Changelog |
/{lang}/user/[id] | Public user profile |
/{lang}/templates/[id] | Templates |
/{lang}/ai-tools | AI tools directory |
/{lang}/ai-detection | AI content detection |
/{lang}/plagiarism-checker | Plagiarism checker |
/{lang}/ai-humanizer | AI text humanizer |
/{lang}/new-old-chat/history | Chat history |
/{lang}/new-old-chat/projects | Projects workspace |
/{lang}/new-old-chat/share/[chatId] | Shared chat view |
Multi-layer authentication:
- __Secure-merlin-session_0..3)
- merlin-auth): cross-tab sync
- chrome.runtime.sendMessage

Zustand session store (userSessionStore.ts):
- isFree, isPaid, isOwner, isPro, isBonkersBasic, isBonkersPro, isAppSumo, etc.

React Query setup:
- QueryClient with 24-hour garbage collection
- v2.99) forces cache reset on deploy
- waitForAuthInit)

Axios instances: Two typed instances with interceptors:
- ArcaneAxiosInstance → .../arcane/api (attaches Firebase token)
- UAMAxiosInstance → uam.getmerlin.in
- x-merlin-version header

The frontend uses @microsoft/fetch-event-source for SSE connections. The sseBaseSecureStore (Zustand) handles:
- message (text/reasoning/progress), attachments, references, usage, metadata

Express 5 + express-zod-api:
- authEndpointsFactory, authStreamEndpointsFactory, usageLimitsStreamEndpointsFactory

The end-to-end flow for a chat message:
The entire request shares a requestContext, an AsyncLocalStorage-based store that carries:
- user (plan, uid, email, usage)
- chatNode (Thread model instance)
- userMessageNode, assistantMessageNode
- schema (Schema model; controls prompt/messages/model)
- chatStateManager (mode resolution)
- eventManager (SSE progress events)
- response (raw Express res for SSE streaming)
- executionContext (USER vs TASK)
- decisionLog (tool orchestration debug log)
- logFields (structured logging context)

- firebase.auth().currentUser.getIdToken())
- Authorization: Bearer <token>
- authMiddleware verifies via firebase.auth().verifyIdToken(token)
- User model instance with computed properties

models/user.ts)
- customers/{uid} Firestore document
- userPlan (free, pro, teams, bonkers_basic, bonkers_pro, apprentice_sumo, etc.)
- userType (owner, member, etc.)

The user.features map contains per-feature objects:
repositories/engine/)

The engine manages the LLM context window, the critical part that decides what fits in the prompt:

Context trimming strategies:
- FULL: include everything (when context window is large enough)
- TOOL_PROVIDED_SUMMARY_IF_POSSIBLE: use tool-provided summaries for tool results
- SUMMARY: LLM-generated summary of history

Three sections of the context window:
- HISTORY: past conversation turns
- IN_LOOP: current tool call iteration messages
- CURRENT_MESSAGE: the latest assistant response + tool results

Layout optimization:
- chooseOptimalLayout picks cheapest valid layout within context limit
- PREFERRED_TRIMMING_LAYOUTS until one fits

repositories/provider/)
- rune.siddhartha-5c5.workers.dev

Payload structure sent to Rune:

repositories/streamer/)

Two streaming versions:
- streamAsToolResult flag

SSE Event types:
| Event | Purpose |
|---|---|
message | Text content, reasoning content, tool results |
progress | Progress step lifecycle (init → in_progress → done) |
attachments | Generated images, web search links, citations |
references | Citation references with document index mapping |
usage | Token usage after completion |
metadata | Model info, MCP context |
init_message_content | Signals start of content streaming |
features | Identifies active features (e.g., "AGENTIC_RESEARCH") |
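
The event names in the table above can be used to type the client-side stream. The following is a minimal sketch, assuming the stream uses the standard SSE wire format (`event:` and `data:` fields, frames separated by a blank line) and that payloads are JSON when possible. `parseSseFrames` and `SseFrame` are hypothetical names, and payload shapes are left as `unknown` because the source does not specify them.

```typescript
// Event names are taken from the table; everything else here is illustrative.
const SSE_EVENT_NAMES = [
  "message", "progress", "attachments", "references",
  "usage", "metadata", "init_message_content", "features",
] as const;

type SseEventName = (typeof SSE_EVENT_NAMES)[number];

interface SseFrame {
  event: SseEventName;
  data: unknown; // payload shape is not documented, so it stays untyped
}

function isSseEventName(name: string): name is SseEventName {
  return (SSE_EVENT_NAMES as readonly string[]).indexOf(name) !== -1;
}

// Parses a complete SSE text buffer into typed frames.
// Frames with unknown event names or without data are skipped.
function parseSseFrames(buffer: string): SseFrame[] {
  const frames: SseFrame[] = [];
  for (const block of buffer.split(/\r?\n\r?\n/)) {
    let event = "message"; // SSE default when no event field is present
    const dataLines: string[] = [];
    for (const line of block.split(/\r?\n/)) {
      if (line.startsWith("event:")) {
        event = line.slice(6).trim();
      } else if (line.startsWith("data:")) {
        dataLines.push(line.slice(5).replace(/^ /, ""));
      }
    }
    if (dataLines.length === 0 || !isSseEventName(event)) continue;
    const raw = dataLines.join("\n");
    let data: unknown = raw;
    try {
      data = JSON.parse(raw);
    } catch {
      // Not JSON: keep the raw string.
    }
    frames.push({ event, data });
  }
  return frames;
}
```

In practice @microsoft/fetch-event-source performs the frame splitting itself; the sketch only shows how the eight event names could be narrowed to a union type before dispatching to a store.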
Content Indexing (V2):
Each tool result in the response has a unique contentIndex. The streamer maps text, reasoning, and progress events to the correct index so the frontend can render them in the right position within the message tree.
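A minimal sketch of how a frontend might route streamed chunks by contentIndex, assuming each event carries the index, a kind, and a text chunk. The `IndexedEvent` and `ContentSlot` shapes and both function names are hypothetical; the source only states that text, reasoning, and progress events are mapped to an index.

```typescript
// Hypothetical event shape: only contentIndex is named in the source.
type StreamKind = "text" | "reasoning" | "progress";

interface IndexedEvent {
  contentIndex: number;
  kind: StreamKind;
  chunk: string;
}

interface ContentSlot {
  text: string;
  reasoning: string;
  progress: string[];
}

// Appends a chunk to the slot for its contentIndex, creating the slot on first use.
function applyIndexedEvent(slots: Map<number, ContentSlot>, ev: IndexedEvent): void {
  let slot = slots.get(ev.contentIndex);
  if (!slot) {
    slot = { text: "", reasoning: "", progress: [] };
    slots.set(ev.contentIndex, slot);
  }
  if (ev.kind === "progress") {
    slot.progress.push(ev.chunk);
  } else {
    slot[ev.kind] += ev.chunk;
  }
}

// Returns slots sorted by contentIndex, so render order does not depend on arrival order.
function slotsInRenderOrder(slots: Map<number, ContentSlot>): ContentSlot[] {
  return Array.from(slots.entries())
    .sort((a, b) => a[0] - b[0])
    .map((entry) => entry[1]);
}
```

Keying by index rather than arrival order means interleaved chunks from parallel tool results still land in the right part of the message.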
The ChatStateManager is the central decision-maker for a chat request. It:
- ToolRegistry tools to enable

Mode compatibility rules:
- IMAGE mode cannot combine with RAG, DEEP_RESEARCH, or MCP
- MERLIN_MAGIC auto-selects best model
- DEEP_RESEARCH takes priority and runs agent sub-pipeline
- MCP injects plugin results before LLM call
- LARGE_CONTEXT gives full context window (no trimming)

repositories/engine/schema.ts)

The Schema is a prompt builder that:

Frontend (bonkers/website):
- /bonkers route with full image generation canvas

Backend (Wallflower System, /v1/wallflower/):
| Model ID | Provider | Type |
|---|---|---|
black-forest-labs/flux-schnell | Replicate / FAL | Text-to-Image |
black-forest-labs/flux-1.1-pro | Replicate / FAL | Text-to-Image |
black-forest-labs/flux-1.1-pro-ultra | Replicate / FAL | Text-to-Image |
black-forest-labs/flux-pro | Replicate / FAL | Text-to-Image |
recraft-ai/recraft-v3 | Replicate / FAL | Text-to-Image |
ideogram-ai/ideogram-v2-turbo | Replicate | Text-to-Image |
ideogram-ai/ideogram-v3-turbo | Replicate / FAL | Text-to-Image |
prunaai/hidream-l1-fast | Replicate / FAL | Text-to-Image |
fal-ai/flux-pro/v1/fill | FAL | Inpainting |
fal-ai/bria/eraser | FAL | Erasing |
fal-ai/clarity-upscaler | FAL | Upscaling |
fal-ai/bria/background/replace | FAL | Background Edit |
fal-ai/ghiblify | FAL | Style Transfer |
851-labs/background-remover | Replicate | Background Removal |
google/imagen-4 | Replicate / FAL | Text-to-Image |
gemini-2.0-flash-exp | Google Vertex AI | Text-to-Image |
gpt-image-1-high/medium/low | OpenAI | Text-to-Image |
midjourney-v6.1-relax | GoAPI | Text-to-Image |
fal-ai/bytedance/seedream/v3 | FAL | Text-to-Image |
Bonkers presents simplified model names to users, mapped to concrete providers:
| Abstract Name | Maps To | Type |
|---|---|---|
bonkers-lite | prunaai/hidream-l1-fast | Fast generation |
bonkers-advance | fal-ai/ideogram/v3 | High quality |
bonkers-magic-fill | fal-ai/ideogram/v3/edit | Inpainting |
bonkers-remix | gpt-image-1-medium | Image remixing |
bonkers-upscale | fal-ai/clarity-upscaler | Upscaling |
bonkers-magic-erase | fal-ai/bria/eraser | Object removal |
bonkers-bg-edit | fal-ai/bria/background/replace | Background change |
bonkers-bg-erase | 851-labs/background-remover | Background removal |
bonkers-omni-edit | fal-ai/flux-pro/kontext/multi | Multi-image editing |
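
The mapping in the table above could be expressed as a typed lookup. This is a sketch only: the entries are copied from the table, while `BONKERS_MODEL_MAP` and `resolveBonkersModel` are hypothetical names, not identifiers confirmed by the codebase.

```typescript
// Abstract Bonkers names mapped to concrete provider model IDs (from the table).
const BONKERS_MODEL_MAP = {
  "bonkers-lite": "prunaai/hidream-l1-fast",
  "bonkers-advance": "fal-ai/ideogram/v3",
  "bonkers-magic-fill": "fal-ai/ideogram/v3/edit",
  "bonkers-remix": "gpt-image-1-medium",
  "bonkers-upscale": "fal-ai/clarity-upscaler",
  "bonkers-magic-erase": "fal-ai/bria/eraser",
  "bonkers-bg-edit": "fal-ai/bria/background/replace",
  "bonkers-bg-erase": "851-labs/background-remover",
  "bonkers-omni-edit": "fal-ai/flux-pro/kontext/multi",
} as const;

type BonkersModelName = keyof typeof BONKERS_MODEL_MAP;

// Resolves an abstract name to its provider model ID, rejecting unknown names.
function resolveBonkersModel(name: string): string {
  // hasOwnProperty avoids matching inherited keys such as "toString".
  if (!Object.prototype.hasOwnProperty.call(BONKERS_MODEL_MAP, name)) {
    throw new Error(`Unknown Bonkers model: ${name}`);
  }
  return BONKERS_MODEL_MAP[name as BonkersModelName];
}
```

Keeping the map in one place lets the concrete provider behind an abstract name change without touching the frontend.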
The Wallflower system supports 9 image editing features:
Each feature type can use a Magic Prompt system:
- MAGIC_PROMPT_MODELS map
- wallflower-images
- wallflower/{userId}/images/ (user's image documents)
- /v1/wallflower/images endpoint

Available tools are registered in a ToolRegistry instance, keyed by function name:
| Tool | Description |
|---|---|
webSearch | Internet search via Tavily/SerpAPI/Firecrawl with academic/social/YouTube focus modes |
rag | Knowledge vault retrieval (LlamaIndex) |
imageGen | Image generation via Wallflower |
dataAnalysis | Code execution in E2B sandbox |
mcp | Model Context Protocol (external plugins) |
memory | User memory (mem0) |
think | Internal chain-of-thought reasoning |
craft | Canvas/craft creation |
chatbot | Marketplace chatbot queries |
deepResearchWebSearch | Web search for deep research |
createTodo / getTodo / updateTodo / markTodo / dumpFinding / feedbackQuestions / getSearchHistory / reportGeneration / researcherAgent | Deep research agent tools |
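
A registry keyed by function name, as described above, might look like the following sketch. The class is named `ToolRegistrySketch` to make clear it is not the real ToolRegistry; the `ToolDefinition` shape and the `execute` signature are assumptions made for illustration.

```typescript
interface ToolDefinition {
  name: string; // function name exposed to the LLM; used as the registry key
  description: string;
  execute: (args: Record<string, unknown>) => unknown; // hypothetical signature
}

class ToolRegistrySketch {
  private readonly tools = new Map<string, ToolDefinition>();

  // Registers a tool under its function name; duplicate names are rejected.
  register(tool: ToolDefinition): this {
    if (this.tools.has(tool.name)) {
      throw new Error(`Tool already registered: ${tool.name}`);
    }
    this.tools.set(tool.name, tool);
    return this;
  }

  get(name: string): ToolDefinition | undefined {
    return this.tools.get(name);
  }

  // Returns the requested tools that exist, preserving the requested order.
  select(names: string[]): ToolDefinition[] {
    const picked: ToolDefinition[] = [];
    for (const name of names) {
      const tool = this.tools.get(name);
      if (tool) picked.push(tool);
    }
    return picked;
  }
}

// Usage: register two tools from the table with placeholder implementations.
const toolRegistry = new ToolRegistrySketch()
  .register({
    name: "webSearch",
    description: "Internet search",
    execute: (args) => ({ query: args.query, results: [] }),
  })
  .register({
    name: "think",
    description: "Internal chain-of-thought reasoning",
    execute: (args) => args.thought,
  });
```

A component such as the ChatStateManager could then call `select` with the tool names that the resolved mode allows.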
Different agent configurations determine behavior:
- TOOL_RESULTS_CONTEXT_SUMMARY_TOKENS

Database: Firebase project foyer-work, database merlindb
Collections:
| Collection | Path | Purpose |
|---|---|---|
customers/{uid} | customers/{uid} | User profile, plan, features, usage, settings |
customers/{uid}/chats/{chatId} | Chats per user | Chat metadata |
customers/{uid}/chats/{chatId}/thread/{docId} | Thread messages | Individual messages |
global_chats/{chatId} | global_chats/{chatId} | Project chats (global namespace) |
sharedChatsV2/{chatId} | sharedChatsV2/{chatId} | Publicly shared chats |
projects/{projectId} | projects/{projectId} | Project workspace |
projects/{projectId}/members/{uid} | Membership | Project members + roles |
wallflower/{userId}/images/{imageId} | Images | Generated image posts |
chatbots/{chatbotId} | chatbots/{chatbotId} | Marketplace chatbot definitions |
notifications/{uid}/tokens/{token} | Push tokens | FCM notification tokens |
vault/{uid}/items/{itemId} | vault/{uid}/items/{itemId} | Knowledge vault items |
attachments/{attachmentId} | attachments/{attachmentId} | File attachments metadata |
canvas/{canvasId} | canvas/{canvasId} | Craft canvas content |
surveys/{surveyId} | surveys/{surveyId} | User survey responses |
connectedApps/{appId} | connectedApps/{appId} | OAuth connected apps |
mcp/{connectionId} | mcp/{connectionId} | MCP server connections |
memories/{uid}/memories/{memoryId} | Memories | User memories (mem0 backup) |
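
The nested paths in the table above are easy to mistype when built by hand. The sketch below centralizes a few of them in small builder functions. The path templates come from the table; `firestorePaths` and `pathSegment` are hypothetical helper names.

```typescript
// Rejects empty IDs and IDs containing "/", which would change the path depth.
function pathSegment(id: string): string {
  if (id.length === 0 || id.indexOf("/") !== -1) {
    throw new Error(`Invalid Firestore ID: "${id}"`);
  }
  return id;
}

// Builders for a subset of the collections listed in the table.
const firestorePaths = {
  customer: (uid: string) => `customers/${pathSegment(uid)}`,
  chat: (uid: string, chatId: string) =>
    `customers/${pathSegment(uid)}/chats/${pathSegment(chatId)}`,
  threadMessage: (uid: string, chatId: string, docId: string) =>
    `customers/${pathSegment(uid)}/chats/${pathSegment(chatId)}/thread/${pathSegment(docId)}`,
  projectMember: (projectId: string, uid: string) =>
    `projects/${pathSegment(projectId)}/members/${pathSegment(uid)}`,
  wallflowerImage: (userId: string, imageId: string) =>
    `wallflower/${pathSegment(userId)}/images/${pathSegment(imageId)}`,
};
```

For example, `firestorePaths.threadMessage("u1", "c1", "m1")` yields `customers/u1/chats/c1/thread/m1`, matching the chat, thread, message hierarchy.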
Purpose: Caching, pub/sub inter-request communication, rate limiting
Channels:
- STOP_GENERATING - stop generation signal (chatId + messageId)
- ARCANE_MCP_CHANNEL:{ircId} - MCP inter-request communication
- IMPORT_CHATS - chat import queue
- ATTACHMENT_QUEUE - attachment processing queue
- GOOGLE_API_KEY - Google API key pool

Usage:
- rate-limiter-flexible

Buckets:
- wallflower-images

Dedicated Cloud Run service for vector embeddings and semantic search:
- text-embedding-3-small and text-embedding-3-large

Usage analytics logging:
| Provider | Services Used | Auth Method |
|---|---|---|
| OpenAI | GPT-4, GPT-4o, GPT-4o-mini, o1, o3, DALL-E, GPT Image 1, Embeddings | API key (embedded) |
| Anthropic | Claude 3.5 Sonnet, Claude 3.7 Sonnet | API key via Rune |
| Google AI | Gemini 2.0 Flash, Gemini 2.5 Pro, Vertex AI Imagen | API key pool |
| Fireworks AI | DeepSeek V3, Llama 3, Qwen, Mistral | API key via Rune |
| FAL AI | Flux, Ideogram, Recraft, Bria, HiDream, Seedream, Ghiblify | API key |
| Replicate | Flux, Ideogram, Recraft, HiDream, Background Remover | API key |
| Ideogram | Ideogram V2/V3, inpainting | API key |
| GoAPI | Midjourney v6.1 | API key |
| Azure | Merlin Magic (custom ML models for image/web classification) | API key |
| Service | Usage |
|---|---|
| Firebase Auth | User authentication (email, Google, anonymous) |
| Firebase Cloud Messaging | Web push notifications |
| Firebase Cloud Functions | Stripe portal, reCAPTCHA, email update |
| Redis (ioredis) | Cache, pub/sub, rate limiting |
| SendGrid | Transactional emails |
| Mailchimp | Email marketing + transactional |
| Stripe | Subscription billing |
| TawkTo | Live chat support |
| PostHog | Product analytics (opt-in) |
| Google Tag Manager | GA4 events, TikTok/Facebook pixels |
| Sentry | Error tracking |
| BigQuery | Usage analytics warehouse |
| Google Cloud Tasks | Async task queue |
| Firecrawl | Web scraping for deep research |
| SerpAPI | Google search results API |
| Tavily | AI-optimized web search API |
| E2B | Code interpreter sandbox (data analysis) |
| mem0 | User memory/profile |
| Composio | External app integrations (2-way sync) |
| Raindrop AI | Analytics/signals platform |
| Undetectable AI | AI text humanization |
| Copyleaks | Plagiarism detection |
| ImageKit | Image CDN optimization |
| Photon | Image metadata service |
Several companion services run alongside Arcane:
| Service | URL | Purpose |
|---|---|---|
| LlamaIndex | merlin-llama-index-*.run.app | Vector embeddings + RAG |
| Tokenizer | merlin-tokenizer-*.run.app | Token counting |
| Session Manager | session.getmerlin.in | Cookie chunked sessions |
| UAM | uam.getmerlin.in | User account management |
| File Processor | file-processor-*.run.app | Document text extraction |
| Readable Text | merlin-readable-text-*.run.app | HTML→clean text |
| Scribe | scribe.siddhartha-5c5.workers.dev | HTML parsing |
| Whisper | merlin-backend-whisper-*.run.app | Speech-to-text |
| MCP Servers | mcp-servers-*.run.app | MCP server registry |
| Spells | spells-*.run.app | Spell/utility service |
| Rune | rune.siddhartha-5c5.workers.dev | LLM proxy (Cloudflare Worker) |
| Plan | Key Features |
|---|---|
| Free | Limited chat, basic models, web search, 102 queries/month |
| Pro | Full access, advanced models, deep research, attachments |
| Teams | Pro features + team collaboration, admin controls |
| Bonkers Basic | Image generation only (limited) |
| Bonkers Pro | Image generation full access |
| AppSumo | Lifetime deal variants |
| Apprentice Sumo | Limited lifetime |
| Friend/Family | Discounted internal plans |
| Starter | Entry-level paid |
Each user has a features map in Firestore with per-feature usage objects:
- usage: current count
- limit: max allowed
- resetsAt: timestamp for reset
- resetInterval: daily/monthly
- topUps: temporary bonus allocations with expiry

Usage limits middleware checks:

LLM costs calculated from MODEL_COSTS in LLMConstants.ts:
- dollarCost (input/output per token), queryCost (abstract units)
- queryCost * numberOfImages
- incrementUserUsage() in utilities/usage.ts (818 lines of business logic)

MCP is the plugin system that allows external tools to be used within Merlin:

InterRequestCommunicator class:
- addSubscription(ircId, callback): subscribe to Redis channel
- sendMessage(ircId, message): publish to Redis channel
- sendError(ircId, error): publish error back
- ARCANE_MCP_CHANNEL:{ircId}

Projects are workspaces that group chats and members:
- projects/{projectId}: Project document with name, visibility, settings
- projects/{projectId}/members/{uid}: Member role (owner, admin, member, viewer)
- global_chats collection with projectId metadata

Enforced via projectPermissionService:
- CREATE_CHATS: member+
- VIEW_CHATS: all members
- EDIT_CHATS: admin+
- DELETE_CHATS: admin+
- INVITE_MEMBERS: admin+
- MANAGE_PROJECT: owner only
- getProjectUsingSecret)

Deep Research is a multi-agent research system:
Supervisor Agent (deepResearchAgent):
Researcher Agents (researcherAgent):
Tool Set:
- createTodo / getTodo / updateTodo / markTodo: task tracking
- dumpFinding: stores research findings
- feedbackQuestions: clarification questions
- getSearchHistory: avoid duplicate searches
- reportGeneration: final report compilation
- researcherAgent: spawn sub-researchers

- deploy.sh: only develop and review branches deploy
- turbo run build (Turborepo orchestration)
- main
- us-west1
- gcloud run deploy arcane --source .
- arcane-dev for staging
- --max-old-space-size=6144)

Critical vars (40+ total):
- NEXT_PUBLIC_ARCANE_BASE_URL: API endpoint
- NEXT_PUBLIC_UAM_BASE_URL: User account management
- NEXT_PUBLIC_CLOUD_FUNCTION_BASE: Firebase functions
- NEXT_PUBLIC_SESSION_MANAGER: Session cookie server
- NEXT_PUBLIC_GTM_ID: Google Tag Manager
- NEXT_PUBLIC_POSTHOG_KEY: PostHog analytics
- SENTRY_AUTH_TOKEN / SENTRY_DSN: Error tracking
- NEXTAUTH_SECRET / NEXTAUTH_URL: Auth.js config
- FIREBASE_*: Firebase config
- ARCANE_*: Arcane backend config
- CMS_*: CMS (Strapi) config
- REDISHOST / REDISPORT: Redis connection
- K_REVISION: Cloud Run revision identifier

- @antfu/eslint-config + perfectionist sort-imports
- @trivago/prettier-plugin-sort-imports + Tailwind plugin
- shamefully-hoist=true

express-zod-api over tRPC/GraphQL: Auto-generated OpenAPI docs + TypeScript types from Zod schemas. Enables both internal use and external API clients.
SSE over WebSockets: Simpler infrastructure (no sticky sessions needed on Cloud Run), HTTP/2 compatible, works through proxies. Reconnection handled client-side via @microsoft/fetch-event-source.
AsyncLocalStorage for request context: Avoids passing req/res through every function. Each request gets its own isolated store across async calls. Downside: makes code harder to unit test.
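A minimal sketch of the pattern using Node's `AsyncLocalStorage` from `node:async_hooks`. The `RequestContextSketch` fields are a reduced, hypothetical subset of the real requestContext, and `runWithContext`, `getContext`, and `describeCaller` are illustrative names.

```typescript
import { AsyncLocalStorage } from "node:async_hooks";

// Reduced, hypothetical subset of the fields carried by requestContext.
interface RequestContextSketch {
  uid: string;
  plan: string;
  logFields: Record<string, string>;
}

const requestContextStore = new AsyncLocalStorage<RequestContextSketch>();

// Runs fn with ctx available to every function called beneath it.
function runWithContext<T>(ctx: RequestContextSketch, fn: () => T): T {
  return requestContextStore.run(ctx, fn);
}

// Reads the current context; fails loudly when called outside a request.
function getContext(): RequestContextSketch {
  const ctx = requestContextStore.getStore();
  if (!ctx) {
    throw new Error("No request context: call inside runWithContext");
  }
  return ctx;
}

// A deeply nested helper can read the user without receiving req/res.
function describeCaller(): string {
  const { uid, plan } = getContext();
  return `${uid}:${plan}`;
}
```

The testing downside is visible here: `describeCaller` throws unless the test wraps the call in `runWithContext`, so every unit test has to set up a context first.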
Rune (Cloudflare Worker) as LLM proxy: Centralizes API key management, enables consistent streaming format across providers, handles rate limiting and fallbacks.
Firestore over PostgreSQL/other relational: Serverless DB with real-time sync capabilities. Schema-on-read design enables rapid iteration. Subcollections provide natural hierarchy (chat→thread→messages).
LlamaIndex as separate microservice: Dedicated RAG service with GPU/TPU availability for embeddings. Decouples vector infrastructure from main API.
Redis pub/sub for IRC: Cloud Run instances can't communicate directly. Redis pub/sub provides cross-instance messaging for MCP responses without sticky sessions.
Cookie chunking: Browsers limit cookies to ~4KB. Merlin's JWT exceeds this, so it's split across 4 __Secure-merlin-session cookies.
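The split-and-rejoin step can be sketched as below. The cookie names follow the `__Secure-merlin-session_0..3` pattern mentioned earlier in this guide; the 3800-character chunk size and both function names are assumptions, and string length is used as a stand-in for byte size because a JWT is ASCII.

```typescript
const SESSION_COOKIE_PREFIX = "__Secure-merlin-session_";
const SESSION_CHUNK_COUNT = 4;
const ASSUMED_MAX_CHUNK_SIZE = 3800; // assumption: leaves headroom under the ~4KB limit

// Splits a token into exactly four cookie values; trailing chunks may be empty.
function chunkSession(
  token: string,
  maxChunk: number = ASSUMED_MAX_CHUNK_SIZE,
): Record<string, string> {
  if (token.length > maxChunk * SESSION_CHUNK_COUNT) {
    throw new Error("Session token does not fit in the available cookies");
  }
  const cookies: Record<string, string> = {};
  for (let i = 0; i < SESSION_CHUNK_COUNT; i++) {
    cookies[`${SESSION_COOKIE_PREFIX}${i}`] = token.slice(i * maxChunk, (i + 1) * maxChunk);
  }
  return cookies;
}

// Reassembles the token by concatenating chunks in index order.
function joinSession(cookies: Record<string, string | undefined>): string {
  let token = "";
  for (let i = 0; i < SESSION_CHUNK_COUNT; i++) {
    token += cookies[`${SESSION_COOKIE_PREFIX}${i}`] ?? "";
  }
  return token;
}
```

Writing all four cookies on every update, including empty ones, prevents a stale chunk from an earlier, longer token from being concatenated onto a shorter one.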
Context window engine: Proactive context management (trimming + summarization) rather than reactive truncation. Ensures optimal LLM response quality within token limits.
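One way to picture proactive layout selection: walk a preference-ordered list of layouts and take the first whose estimated token count fits the limit. This is a sketch under stated assumptions, not the actual chooseOptimalLayout logic. The strategy and section names come from this guide, while `EXAMPLE_LAYOUTS`, the token estimates, and `firstFittingLayout` are hypothetical.

```typescript
type Strategy = "FULL" | "TOOL_PROVIDED_SUMMARY_IF_POSSIBLE" | "SUMMARY";
type Section = "HISTORY" | "IN_LOOP" | "CURRENT_MESSAGE";
type Layout = Record<Section, Strategy>;
// Hypothetical input: estimated tokens for each section under each strategy.
type TokenEstimates = Record<Section, Record<Strategy, number>>;

const SECTIONS: Section[] = ["HISTORY", "IN_LOOP", "CURRENT_MESSAGE"];

// Illustrative preference order, from least to most trimming (not the real list).
const EXAMPLE_LAYOUTS: Layout[] = [
  { HISTORY: "FULL", IN_LOOP: "FULL", CURRENT_MESSAGE: "FULL" },
  { HISTORY: "SUMMARY", IN_LOOP: "FULL", CURRENT_MESSAGE: "FULL" },
  { HISTORY: "SUMMARY", IN_LOOP: "TOOL_PROVIDED_SUMMARY_IF_POSSIBLE", CURRENT_MESSAGE: "FULL" },
  { HISTORY: "SUMMARY", IN_LOOP: "SUMMARY", CURRENT_MESSAGE: "TOOL_PROVIDED_SUMMARY_IF_POSSIBLE" },
];

// Sums the estimated tokens of all three sections for a given layout.
function layoutTokens(layout: Layout, estimates: TokenEstimates): number {
  return SECTIONS.reduce((sum, section) => sum + estimates[section][layout[section]], 0);
}

// Returns the first layout in preference order that fits, or undefined if none does.
function firstFittingLayout(
  estimates: TokenEstimates,
  contextLimit: number,
  layouts: Layout[] = EXAMPLE_LAYOUTS,
): Layout | undefined {
  for (const layout of layouts) {
    if (layoutTokens(layout, estimates) <= contextLimit) return layout;
  }
  return undefined;
}
```

Because the list is ordered from least to most trimming, the first fit is also the layout that preserves the most original context.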
Tool orchestrator pattern: Separates LLM calling from tool execution, enabling recursive tool use, parallel execution, and agent behavior without framework lock-in.