Rate Limiting Architecture

Overview

The rate limiting system prevents abuse and ensures fair resource allocation across users. It uses Redis-backed counters that reset through key expiry (fixed time windows), combined with plan-based limits.


Architecture

Plain text
Request → Check User Plan → Apply Limits
                │
    ┌───────────┼───────────┐
    │           │           │
    ▼           ▼           ▼
┌────────┐ ┌──────────┐ ┌──────────┐
│ GUEST  │ │   PRO    │ │  ULTRA   │
│ 50/15m │ │  500/1h  │ │ 2000/1h  │
│IP-based│ │User-based│ │User-based│
└────────┘ └──────────┘ └──────────┘

Guest Rate Limiting

File: src/server/middlewares/rateLimiter/rateLimiter.ts

TypeScript
import { RateLimiterRedis, RateLimiterRes } from "rate-limiter-flexible";

// Redis-based rate limiter
const rateLimiterRedis = new RateLimiterRedis({
	storeClient: redis, // Redis connection
	keyPrefix: "arcane_guest", // Key prefix for guest users
	points: 50, // 50 requests
	duration: 900, // per 15 minutes (900 seconds)
});

export async function rateLimitGuest(request: Request, user: TUserDoc) {
	let ip = "UNKNOWN";

	if (isDevEnv) {
		ip = request.ip ?? ip;
	} else {
		// Production: take the first (client) entry of the proxy's X-Forwarded-For list
		ip = request.header("x-forwarded-for")?.split(",")[0]?.trim() || ip;
	}

	// Only apply to guest users
	if (user.userPlan === "GUEST") {
		try {
			await rateLimiterRedis.consume(ip, 1); // Consume 1 point
		} catch (error) {
			// consume() rejects with a RateLimiterRes when the limit is exceeded
			if (error instanceof RateLimiterRes) {
				logger.info({ ip }, "ERROR/RATE_LIMITER");
				throw new ClientError(429, ErrorType.RATE_LIMIT_EXCEEDED);
			}
			// Any other rejection (e.g. a Redis error) is ignored, so the limiter fails open
		}
	}
}

How It Works:

  1. Identify guest users (userPlan === "GUEST")
  2. Extract IP address (dev vs production handling)
  3. Consume 1 point from Redis counter
  4. If points exhausted (0 remaining), throw 429 error
  5. Redis key: arcane_guest:{ip}
  6. The key expires 15 minutes after the first request in a window, which resets the counter (a fixed window rather than a sliding one)
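
The counting behaviour in the steps above can be sketched without Redis. This is a minimal in-memory stand-in, not the library's implementation: `FixedWindowLimiter`, `WindowState`, and the sample IP are hypothetical names used only for illustration. It assumes the window starts at the first request for a key and that every request, allowed or not, adds one point.

```typescript
// In-memory stand-in for the Redis counter (illustrative names only).
type WindowState = { consumed: number; expiresAt: number };

class FixedWindowLimiter {
	private readonly points: number; // max requests per window
	private readonly durationMs: number; // window length in milliseconds
	private readonly windows = new Map<string, WindowState>();

	constructor(points: number, durationMs: number) {
		this.points = points;
		this.durationMs = durationMs;
	}

	// Returns true if the request is allowed, false if the limit is exceeded.
	consume(key: string, nowMs: number): boolean {
		const current = this.windows.get(key);
		// Start a fresh window when none exists or the old one has expired.
		const state: WindowState =
			current !== undefined && current.expiresAt > nowMs
				? current
				: { consumed: 0, expiresAt: nowMs + this.durationMs };
		state.consumed += 1;
		this.windows.set(key, state);
		return state.consumed <= this.points;
	}
}

// 50 requests per 15 minutes, keyed the same way as the guest limiter.
const guestLimiter = new FixedWindowLimiter(50, 900_000);
const guestKey = "arcane_guest:203.0.113.7";

const guestResults: boolean[] = [];
for (let i = 0; i < 51; i++) {
	guestResults.push(guestLimiter.consume(guestKey, i * 1000));
}
// Requests 1 to 50 pass, request 51 is rejected.
// Once the window has expired, the counter starts again from zero.
const allowedAfterReset = guestLimiter.consume(guestKey, 900_000);
```

Because the window is anchored to the first request, a client can send 50 requests at the end of one window and 50 more at the start of the next. If that matters, a sliding-window or token-bucket scheme would be the usual alternative.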

Plan-Based Usage Limits

File: src/server/middlewares/usageLimits/usageLimits.ts

TypeScript
export const checkUsageLimits = async (user: TUserDoc, usage: TUsageConfig) => {
	const planLimits = {
		GUEST: { queriesPerDay: 100, queriesPerMonth: 500 },
		FREE: { queriesPerDay: 500, queriesPerMonth: 5000 },
		PRO: { queriesPerDay: 2000, queriesPerMonth: 20000 },
		ULTRA: { queriesPerDay: 5000, queriesPerMonth: 50000 },
	};

	const limits = planLimits[user.userPlan] || planLimits.FREE;

	// Check daily limit
	const dailyUsage = await getDailyUsage(user.uid);
	if (dailyUsage + usage.queries > limits.queriesPerDay) {
		throw new ClientError(
			429,
			ErrorType.DAILY_LIMIT_EXCEEDED,
			Severity.WARNING,
			{
				limit: limits.queriesPerDay,
				used: dailyUsage,
				requested: usage.queries,
			},
		);
	}

	// Check monthly limit
	const monthlyUsage = await getMonthlyUsage(user.uid);
	if (monthlyUsage + usage.queries > limits.queriesPerMonth) {
		throw new ClientError(
			429,
			ErrorType.MONTHLY_LIMIT_EXCEEDED,
			Severity.WARNING,
			{
				limit: limits.queriesPerMonth,
				used: monthlyUsage,
				requested: usage.queries,
			},
		);
	}

	// Record usage
	await recordUsage(user.uid, usage);
};

Plan Limits:

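| Plan  | Daily Queries | Monthly Queries | Concurrent Requests |
|-------|---------------|-----------------|---------------------|
| GUEST | 100           | 500             | 1                   |
| FREE  | 500           | 5,000           | 2                   |
| PRO   | 2,000         | 20,000          | 5                   |
| ULTRA | 5,000         | 50,000          | 10                  |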
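
The boundary behaviour of the daily check is easy to misread, so here is a small worked example. It mirrors the `used + requested > limit` comparison from `checkUsageLimits` as a pure function; `exceedsDailyLimit` and `dailyLimits` are hypothetical names, and the values are the daily limits listed above.

```typescript
type Plan = "GUEST" | "FREE" | "PRO" | "ULTRA";

// Daily query limits per plan, copied from the plan table.
const dailyLimits: Record<Plan, number> = {
	GUEST: 100,
	FREE: 500,
	PRO: 2000,
	ULTRA: 5000,
};

// Rejects only when the request would push usage past the limit.
function exceedsDailyLimit(plan: Plan, used: number, requested: number): boolean {
	return used + requested > dailyLimits[plan];
}

// A request that lands exactly on the limit is still allowed.
const landsOnLimit = exceedsDailyLimit("FREE", 499, 1); // false
// Once the limit is reached, any further query is rejected.
const pastLimit = exceedsDailyLimit("FREE", 500, 1); // true
// A multi-query request is rejected as a whole, even with budget left.
const largeRequest = exceedsDailyLimit("GUEST", 90, 20); // true
```

The last case means a user with 10 queries remaining cannot spend them on a 20-query operation, since the check is all or nothing.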

Usage Tracking

File: src/server/middlewares/usageLimits/functions/saveDailyUsage.ts

TypeScript
export const recordUsage = async (userId: string, usage: TUsageConfig) => {
	const today = new Date().toISOString().split("T")[0]; // YYYY-MM-DD
	const month = today.substring(0, 7); // YYYY-MM

	const dailyKey = `usage:daily:${userId}:${today}`;
	const monthlyKey = `usage:monthly:${userId}:${month}`;

	// Increment counters in Redis
	await Promise.all([
		redis.incrby(dailyKey, usage.queries),
		redis.incrby(monthlyKey, usage.queries),
		redis.expire(dailyKey, 86400 * 2), // 2 days TTL
		redis.expire(monthlyKey, 86400 * 33), // 33 days TTL
	]);

	// Save detailed usage to Firestore
	await db.collection("usage").add({
		userId,
		timestamp: Timestamp.now(),
		queries: usage.queries,
		tokens: usage.tokens,
		model: usage.model,
		date: today,
	});
};
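
One consequence of deriving the keys from `toISOString()` is that daily and monthly counters roll over at midnight UTC, not at the user's local midnight. The sketch below isolates the key derivation to show this; `usageKeys` and the user id `u123` are hypothetical and used only for illustration.

```typescript
// Builds the Redis keys used for usage counters from a user id and a timestamp.
function usageKeys(userId: string, at: Date): { dailyKey: string; monthlyKey: string } {
	const day = at.toISOString().slice(0, 10); // YYYY-MM-DD, always in UTC
	const month = day.slice(0, 7); // YYYY-MM
	return {
		dailyKey: `usage:daily:${userId}:${day}`,
		monthlyKey: `usage:monthly:${userId}:${month}`,
	};
}

// Two requests two seconds apart, on either side of midnight UTC.
const beforeMidnight = usageKeys("u123", new Date("2024-01-31T23:59:59Z"));
const afterMidnight = usageKeys("u123", new Date("2024-02-01T00:00:01Z"));

// beforeMidnight.dailyKey   === "usage:daily:u123:2024-01-31"
// beforeMidnight.monthlyKey === "usage:monthly:u123:2024-01"
// afterMidnight.dailyKey    === "usage:daily:u123:2024-02-01"
// afterMidnight.monthlyKey  === "usage:monthly:u123:2024-02"
```

Since each period gets its own key, the TTLs (2 days and 33 days) only clean up old keys; they do not define when a counter resets.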

Usage Schema:

TypeScript
type TUsageConfig = {
	queries: number; // Cost in query units
	tokens: {
		input: number; // Prompt tokens
		output: number; // Completion tokens
		cached: number; // Cached prompt tokens
		reasoning: number; // Reasoning/thinking tokens
	};
	model: TLLMModels; // Which model was used
};

Token-Based Limits

Some operations count tokens instead of queries:

TypeScript
const checkTokenLimits = async (user: TUserDoc, tokens: number) => {
	const tokenLimits = {
		GUEST: 100_000, // 100k tokens/day
		FREE: 500_000, // 500k tokens/day
		PRO: 2_000_000, // 2M tokens/day
		ULTRA: 5_000_000, // 5M tokens/day
	};

	const limit = tokenLimits[user.userPlan];
	const used = await getDailyTokenUsage(user.uid);

	if (used + tokens > limit) {
		throw new ClientError(
			429,
			ErrorType.TOKEN_LIMIT_EXCEEDED,
			Severity.WARNING,
			{
				limit,
				used,
				requested: tokens,
			},
		);
	}
};

AI Tools Usage Limits

File: src/server/middlewares/aiToolsUsageLimits/aiToolsUsageLimits.ts

Special limits for expensive AI features:

TypeScript
export const checkAiToolLimits = async (user: TUserDoc, toolName: string) => {
	const toolLimits = {
		image_generation: {
			FREE: 10, // 10 images/day
			PRO: 50, // 50 images/day
			ULTRA: 100, // 100 images/day
		},
		deep_research: {
			FREE: 3, // 3 research sessions/day
			PRO: 10, // 10 research sessions/day
			ULTRA: 30, // 30 research sessions/day
		},
		data_analysis: {
			FREE: 5, // 5 analyses/day
			PRO: 20, // 20 analyses/day
			ULTRA: 50, // 50 analyses/day
		},
	};

	const limit = toolLimits[toolName]?.[user.userPlan];
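	if (!limit) return; // No limit configured for this tool and plan (GUEST has no entries above)

	const today = new Date().toISOString().split("T")[0]; // YYYY-MM-DD (UTC)
	const key = `ai_tool_usage:${user.uid}:${toolName}:${today}`;
	const used = parseInt((await redis.get(key)) || "0", 10);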

	if (used >= limit) {
		throw new ClientError(
			429,
			ErrorType.AI_TOOL_LIMIT_EXCEEDED,
			Severity.WARNING,
			{
				tool: toolName,
				limit,
				used,
			},
		);
	}

	await redis.incr(key);
	await redis.expire(key, 86400); // 24 hours
};
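
The check above reads the counter and increments it in separate steps, so two concurrent requests could read the same value and both pass. One common way to close that gap is to increment first and compare the returned value, since an increment that returns the new count is a single atomic step in Redis. The sketch below shows the idea with an in-memory stand-in; `CounterStore`, `tryUseTool`, and the sample key are hypothetical and not part of the codebase.

```typescript
// In-memory stand-in for a store with atomic increment/decrement.
class CounterStore {
	private readonly counts = new Map<string, number>();

	// Increments and returns the new value in one step.
	incr(key: string): number {
		const next = (this.counts.get(key) ?? 0) + 1;
		this.counts.set(key, next);
		return next;
	}

	// Decrements and returns the new value in one step.
	decr(key: string): number {
		const next = (this.counts.get(key) ?? 0) - 1;
		this.counts.set(key, next);
		return next;
	}

	get(key: string): number {
		return this.counts.get(key) ?? 0;
	}
}

// Reserves a slot first, then rolls back if the limit was already reached.
function tryUseTool(store: CounterStore, key: string, limit: number): boolean {
	const used = store.incr(key);
	if (used > limit) {
		store.decr(key); // rejected calls should not inflate the count
		return false;
	}
	return true;
}

const toolStore = new CounterStore();
const toolKey = "ai_tool_usage:u123:deep_research:2024-02-01";

// FREE plan: 3 research sessions per day, 5 attempts.
const attempts = [1, 2, 3, 4, 5].map(() => tryUseTool(toolStore, toolKey, 3));
// attempts is [true, true, true, false, false], and the stored count stays at 3.
const finalCount = toolStore.get(toolKey);
```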

Middleware Integration

File: src/server/middlewares/usageLimits/usageLimits.ts

TypeScript
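export const usageLimitsMiddleware = async ({
	input,
	request, // Assumption: the framework passes the incoming request to middleware handlers
}: {
	input: z.infer<typeof usageLimitSchema>;
	request: Request;
}) => {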
	const { user }: TAuthenticatedRequestContext = requestContext.get();

	// Check guest rate limit
	if (user.userPlan === "GUEST") {
		await rateLimitGuest(request, user);
	}

	// Check if user has exceeded plan limits
	// (Actual check happens post-request based on usage)

	return {};
};

Applied in routing:

TypeScript
// In routing.ts
endpoint({
    method: "post",
    path: "/v1/chat/completions",
    middlewares: [authMiddleware, usageLimitsMiddleware],
    ...
});

Burst Handling

TypeScript
// Allow short bursts for better UX
const rateLimiterWithBurst = new RateLimiterRedis({
	storeClient: redis,
	keyPrefix: "arcane_burst",
	points: 10, // 10 burst requests
	duration: 1, // per 1 second
});

// Then fall back to standard limiter
const rateLimiterStandard = new RateLimiterRedis({
	storeClient: redis,
	keyPrefix: "arcane_standard",
	points: 50, // 50 requests
	duration: 900, // per 15 minutes
});
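
The two limiters above are defined but not wired together. One plausible arrangement, sketched here under assumptions, is to require every request to pass both: the burst limiter caps short spikes and the standard limiter caps sustained volume. The code uses an in-memory fixed-window stand-in in place of `RateLimiterRedis`; `makeLimiter` and `allowRequest` are hypothetical names.

```typescript
type Limiter = (key: string, nowMs: number) => boolean;

// Builds an in-memory fixed-window limiter (a stand-in for RateLimiterRedis).
function makeLimiter(points: number, durationMs: number): Limiter {
	const windows = new Map<string, { count: number; resetAt: number }>();
	return (key, nowMs) => {
		let w = windows.get(key);
		if (w === undefined || w.resetAt <= nowMs) {
			w = { count: 0, resetAt: nowMs + durationMs };
			windows.set(key, w);
		}
		w.count += 1;
		return w.count <= points;
	};
}

const burstLimiter = makeLimiter(10, 1_000); // 10 requests per second
const standardLimiter = makeLimiter(50, 900_000); // 50 requests per 15 minutes

// A request passes only if both limiters accept it.
// Both are always consumed, so a burst-rejected request still costs a standard point.
function allowRequest(userId: string, nowMs: number): boolean {
	const burstOk = burstLimiter(userId, nowMs);
	const standardOk = standardLimiter(userId, nowMs);
	return burstOk && standardOk;
}

// 11 requests in the same instant: the burst limiter rejects the 11th.
const sameInstant = Array.from({ length: 11 }, () => allowRequest("u1", 0));

// Then one request per second for a minute: the burst limiter never trips,
// but the standard limiter cuts off once 50 points are used in total.
let allowedLater = 0;
for (let s = 1; s <= 60; s++) {
	if (allowRequest("u1", s * 1000)) allowedLater += 1;
}
// 11 points were spent in the first instant, so 39 of these 60 requests pass.
```

Whether a rejected request should still consume points from the other limiter is a design choice; skipping the second limiter on rejection would be more lenient to clients that retry.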

Redis Key Structure

Plain text
# Rate limiting
arcane_guest:{ip}                      → Points consumed in the current window
arcane_burst:{userId}                  → Burst points consumed
arcane_standard:{userId}               → Standard points consumed

# Usage tracking
usage:daily:{userId}:{YYYY-MM-DD}      → Daily query count
usage:monthly:{userId}:{YYYY-MM}       → Monthly query count
usage:tokens:daily:{userId}:{date}     → Daily token count

# AI tool limits
ai_tool_usage:{userId}:{toolName}:{date} → Tool usage count

Error Responses

TypeScript
// Rate limit exceeded
{
    "error": {
        "type": "RATE_LIMIT_EXCEEDED",
        "message": "Too many requests. Please slow down.",
        "details": {
            "limit": 50,
            "window": "15 minutes",
            "retry_after": 847  // seconds
        }
    }
}

// Plan limit exceeded
{
    "error": {
        "type": "DAILY_LIMIT_EXCEEDED",
        "message": "Daily query limit exceeded",
        "details": {
            "plan": "FREE",
            "daily_limit": 500,
            "used": 500,
            "upgrade_url": "/upgrade"
        }
    }
}
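
The `retry_after` value in the first response can be derived from the time left in the current window. The sketch below assumes that time is available in milliseconds, as in the `msBeforeNext` property of the `RateLimiterRes` object that rate-limiter-flexible rejects with. `buildRateLimitResponse` is a hypothetical helper, and returning a `Retry-After` header alongside the body is a suggestion, not something the source confirms.

```typescript
type RateLimitErrorBody = {
	error: {
		type: string;
		message: string;
		details: { limit: number; window: string; retry_after: number };
	};
};

type RateLimitResponse = {
	status: number;
	headers: Record<string, string>;
	body: RateLimitErrorBody;
};

// Builds a 429 response from the milliseconds left until the window resets.
function buildRateLimitResponse(msBeforeNext: number): RateLimitResponse {
	// Round up so clients never retry slightly too early.
	const retryAfterSeconds = Math.ceil(msBeforeNext / 1000);
	return {
		status: 429,
		// Retry-After is the standard HTTP header for this hint, in seconds.
		headers: { "Retry-After": String(retryAfterSeconds) },
		body: {
			error: {
				type: "RATE_LIMIT_EXCEEDED",
				message: "Too many requests. Please slow down.",
				details: {
					limit: 50,
					window: "15 minutes",
					retry_after: retryAfterSeconds,
				},
			},
		},
	};
}

// 846.2 seconds left in the window rounds up to 847.
const rateLimitResponse = buildRateLimitResponse(846_200);
```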

Summary

The rate limiting system:

  1. Guest Limiting: IP-based, 50 req/15min via Redis
  2. Plan Tiers: Daily/monthly query limits per plan
  3. Token Tracking: Separate token limits for large operations
  4. AI Tool Limits: Per-tool daily limits (images, research, etc.)
  5. Fixed Windows: Redis TTL for automatic reset
  6. Burst Handling: Short-term higher limits for UX
  7. Middleware Integration: Applied to all endpoints
  8. Clear Errors: Retry-after hints, upgrade CTAs

Key Principle: Protect resources while providing clear upgrade paths. Never fail silently: always tell users which limit they hit.