10 - Caching Strategies

Why Cache?

Database queries are expensive. Caching stores frequently accessed data in faster storage (memory) to reduce database load and improve response times.

Plain text
┌─────────────────────────────────────────────────────────────┐
│              Without Cache                                   │
├─────────────────────────────────────────────────────────────┤
│                                                              │
│  Client ──► App ──► Database ──► Disk I/O ──► CPU/Memory    │
│         50ms      200ms       10ms         50ms              │
│                    ▲                                        │
│                    │                                        │
│         Every request hits database                         │
│         Even for same query!                                │
│                                                              │
│  Total: ~310ms per request                                 │
│                                                              │
└─────────────────────────────────────────────────────────────┘

┌─────────────────────────────────────────────────────────────┐
│              With Cache                                      │
├─────────────────────────────────────────────────────────────┤
│                                                              │
│  Client ──► App ──► Cache (Redis/Memcached)                 │
│         50ms      1-5ms                                    │
│                    │                                        │
│                    │ Cache Miss                              │
│                    ▼                                        │
│              Database (only on miss)                        │
│                                                              │
│  Cache Hit: ~55ms (95% of requests)                         │
│  Cache Miss: ~310ms (5% of requests)                        │
│  Effective average: ~68ms                                   │
│                                                              │
│  Plus: Database load reduced by 95%                         │
│                                                              │
└─────────────────────────────────────────────────────────────┘
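
The effective average in the diagram is a weighted mean of the hit and miss latencies. A minimal sketch of that arithmetic follows; the function name is hypothetical, and the numbers are the diagram's illustrative figures rather than measurements.

```python
def effective_latency_ms(hit_ratio: float, hit_ms: float, miss_ms: float) -> float:
    """Expected per-request latency for a given cache hit ratio."""
    if not 0.0 <= hit_ratio <= 1.0:
        raise ValueError("hit_ratio must be between 0 and 1")
    return hit_ratio * hit_ms + (1.0 - hit_ratio) * miss_ms


# Figures from the diagram: 95% hits at ~55ms, 5% misses at ~310ms.
average = effective_latency_ms(0.95, 55, 310)
print(f"{average:.2f} ms")
```

Because the miss term is weighted by `1 - hit_ratio`, even a few points of hit ratio move the average noticeably when misses are several times slower than hits.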

Cache Types

1. In-Memory Cache (Application Level)
Python
# Simple in-memory cache (Python)
from functools import lru_cache

@lru_cache(maxsize=1000)
def get_user(user_id):
    return db.query("SELECT * FROM users WHERE id = %s", user_id)

# Manual cache with TTL
import time

class SimpleCache:
    def __init__(self):
        self._cache = {}
    
    def get(self, key):
        if key in self._cache:
            value, expiry = self._cache[key]
            if time.time() < expiry:
                return value
            del self._cache[key]
        return None
    
    def set(self, key, value, ttl_seconds=300):
        self._cache[key] = (value, time.time() + ttl_seconds)
2. Distributed Cache (Redis/Memcached)
Python
# Redis cache example
import redis
import json

r = redis.Redis(host='localhost', port=6379, db=0)

def get_user(user_id):
    # Try cache first
    cache_key = f"user:{user_id}"
    cached = r.get(cache_key)
    
    if cached:
        return json.loads(cached)
    
    # Cache miss - fetch from DB
    user = db.query("SELECT * FROM users WHERE id = %s", user_id)
    
    # Store in cache (TTL = 5 minutes)
    r.setex(cache_key, 300, json.dumps(user))
    
    return user
3. Database Query Cache
SQL
-- MySQL query cache (deprecated in 5.7.20, removed in 8.0)
-- PostgreSQL doesn't have built-in query cache (by design)

-- But you can use prepared statements with plan cache
PREPARE get_user (INT) AS
    SELECT * FROM users WHERE id = $1;
4. CDN Cache (For Static Content)
Plain text
┌─────────────────────────────────────────────────────────────┐
│              CDN Cache                                       │
├─────────────────────────────────────────────────────────────┤
│                                                              │
│  User in India ──► CDN Edge (Mumbai) ──► Cache Hit          │
│  User in Brazil ──► CDN Edge (São Paulo) ──► Fetch Origin   │
│  User in Japan ──► CDN Edge (Tokyo) ──► Cache Hit           │
│                                                              │
│  Static assets: Images, CSS, JS, Videos                     │
│  Also: API responses with proper Cache-Control headers        │
│                                                              │
└─────────────────────────────────────────────────────────────┘
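
Whether a CDN may cache an API response is decided by the `Cache-Control` header the origin sends. The helper below is a hypothetical sketch (its name and parameters are not from any framework) that assembles a header value from standard directives.

```python
def build_cache_control(max_age: int, public: bool = True, immutable: bool = False) -> str:
    """Build a Cache-Control header value for a cacheable response."""
    if max_age < 0:
        raise ValueError("max_age must be non-negative")
    directives = ["public" if public else "private", f"max-age={max_age}"]
    if immutable:
        directives.append("immutable")
    return ", ".join(directives)


# Fingerprinted static asset: cacheable for a year by CDNs and browsers.
static_headers = {"Cache-Control": build_cache_control(31536000, immutable=True)}
# Per-user API response: the browser may cache briefly, shared caches must not.
api_headers = {"Cache-Control": build_cache_control(60, public=False)}
```

Marking per-user responses `private` keeps them out of shared caches such as CDN edges, which is usually the safer default for anything tied to a session.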

Caching Patterns

1. Cache-Aside (Lazy Loading)
Python
# Most common pattern
# Application manages cache

def get_data(key):
    # 1. Check cache
    value = cache.get(key)
    if value is not None:
        return value  # Cache hit
    
    # 2. Cache miss - load from database
    value = db.query(key)
    
    # 3. Store in cache
    cache.set(key, value, ttl=300)
    
    return value

def update_data(key, value):
    # 1. Update database first
    db.update(key, value)
    
    # 2. Invalidate cache
    cache.delete(key)
    # Or: cache.set(key, value, ttl=300)  # Update cache
Plain text
Flow:
┌─────────┐     ┌─────────┐     ┌─────────┐
│ Client  │────►│   App   │────►│  Cache  │
└─────────┘     └────┬────┘     └────┬────┘
                     │              │
                     │ Miss         │
                     │              │
                     └──────────────►│
                     │              │
                     │ Fetch        │
                     ▼              │
              ┌─────────────┐       │
              │  Database   │       │
              └─────────────┘       │
                     │              │
                     │ Return       │
                     └──────────────►│
                     │              │
                     │ Store        │
                     │              │
                     ▼              ▼
              Return to Client
2. Read-Through Cache
Plain text
┌─────────────────────────────────────────────────────────────┐
│              Read-Through Cache                              │
├─────────────────────────────────────────────────────────────┤
│                                                              │
│  App ──► Cache ──► (Cache handles DB lookup if miss)        │
│                                                              │
│  Application doesn't know about database                     │
│  Cache is responsible for loading                          │
│                                                              │
│  Used by: ORMs, some cache libraries                        │
│                                                              │
└─────────────────────────────────────────────────────────────┘
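
One way to picture read-through is a cache object that owns a loader callback, so callers only ever talk to the cache. This is a minimal in-memory sketch: `ReadThroughCache`, `load_from_db` and `fake_db` are illustrative names, and TTL handling is left out for brevity.

```python
from typing import Any, Callable, Dict


class ReadThroughCache:
    """Cache that loads missing keys itself via a loader callback."""

    def __init__(self, loader: Callable[[str], Any]):
        self._loader = loader              # hypothetical: wraps the real DB lookup
        self._store: Dict[str, Any] = {}

    def get(self, key: str) -> Any:
        if key not in self._store:
            # Miss: the cache, not the caller, goes to the backing store.
            self._store[key] = self._loader(key)
        return self._store[key]


# Stand-in for a database table.
fake_db = {"user:1": {"name": "Ada"}}
db_reads = []


def load_from_db(key: str) -> Any:
    db_reads.append(key)
    return fake_db.get(key)


cache = ReadThroughCache(load_from_db)
first = cache.get("user:1")    # miss: loader runs
second = cache.get("user:1")   # hit: loader is not called again
```

Compared with cache-aside, the miss-handling logic lives in one place instead of being repeated in every caller.
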
3. Write-Through Cache
Plain text
┌─────────────────────────────────────────────────────────────┐
│              Write-Through Cache                             │
├─────────────────────────────────────────────────────────────┤
│                                                              │
│  App ──► Cache ──► Database                                  │
│         (Synchronous write to both)                        │
│                                                              │
│  Pros: Cache always consistent                              │
│  Cons: Slower writes (must update both)                     │
│                                                              │
└─────────────────────────────────────────────────────────────┘
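
A write-through cache can be sketched as a wrapper whose `set` updates both layers before returning. The class below is an assumption-laden illustration: a plain dict stands in for the database, and writing the store before the cache is one reasonable ordering, not the only one.

```python
from typing import Any, Dict, Optional


class WriteThroughCache:
    """Writes go to the backing store and the cache in the same call."""

    def __init__(self, backing_store: Dict[str, Any]):
        self._db = backing_store           # hypothetical stand-in for a database
        self._cache: Dict[str, Any] = {}

    def set(self, key: str, value: Any) -> None:
        # Write the store first so a failed DB write never leaves
        # a value in the cache that was not persisted.
        self._db[key] = value
        self._cache[key] = value

    def get(self, key: str) -> Optional[Any]:
        return self._cache.get(key)


database: Dict[str, Any] = {}
cache = WriteThroughCache(database)
cache.set("user:1", {"name": "Ada"})
```

The caller pays for both writes on every `set`, which is the "slower writes" cost listed in the diagram.
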
4. Write-Behind (Write-Back) Cache
Plain text
┌─────────────────────────────────────────────────────────────┐
│              Write-Behind Cache                              │
├─────────────────────────────────────────────────────────────┤
│                                                              │
│  App ──► Cache ──► (Async) ──► Database                     │
│                                                              │
│  Fast writes (to cache only)                                 │
│  Async background process flushes to DB                      │
│                                                              │
│  Risk: Data loss if cache fails before flush                │
│  Use: High write throughput, some loss acceptable           │
│                                                              │
└─────────────────────────────────────────────────────────────┘
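
The data-loss risk of write-behind is easier to see in code. In this hypothetical sketch, writes are buffered in a dirty map and only reach the backing dict when `flush()` runs; a real system would call the flush from a background worker.

```python
from typing import Any, Dict


class WriteBehindCache:
    """Writes land in the cache immediately and reach the store on flush()."""

    def __init__(self, backing_store: Dict[str, Any]):
        self._db = backing_store               # hypothetical stand-in for a database
        self._cache: Dict[str, Any] = {}
        self._dirty: Dict[str, Any] = {}       # pending writes, latest value per key

    def set(self, key: str, value: Any) -> None:
        self._cache[key] = value
        self._dirty[key] = value               # not durable until flushed

    def get(self, key: str) -> Any:
        return self._cache.get(key)

    def flush(self) -> int:
        """Persist pending writes and return how many keys were written."""
        pending, self._dirty = self._dirty, {}
        self._db.update(pending)
        return len(pending)


database: Dict[str, Any] = {}
cache = WriteBehindCache(database)
cache.set("counter", 1)
cache.set("counter", 2)                # coalesced with the previous write
visible_before_flush = "counter" in database
flushed = cache.flush()
```

Anything still in the dirty map when the process dies is lost, which is why this pattern suits counters and metrics better than orders or payments.
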
5. Refresh-Ahead Cache
Python
# Proactively refresh cache before expiration
def background_refresh():
    for key in cache.get_expiring_soon(threshold=60):
        # Refresh if still frequently accessed
        if cache.get_access_count(key) > 100:
            value = db.query(key)
            cache.set(key, value)

Cache Invalidation Strategies

The Hard Problem

"There are only two hard things in Computer Science: cache invalidation and naming things." - Phil Karlton

Python
# 1. Time-Based Expiration (TTL)
cache.set(key, value, ttl=300)  # Expires in 5 minutes

# 2. Explicit Invalidation
def update_user(user_id, data):
    db.update(user_id, data)
    cache.delete(f"user:{user_id}")
    cache.delete(f"user:{user_id}:orders")  # Related cache

# 3. Version-Based (Cache Key with Version)
version = get_schema_version()  # Increment on schema change
cache_key = f"user:{user_id}:v{version}"

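# 4. Event-Based Invalidation
# When order placed, publish event (Pub/Sub payloads must be str or bytes)
def place_order(user_id, order):
    db.insert(order)
    r.publish("order_placed", json.dumps({"user_id": user_id}))
    
# Subscriber invalidates cache; the handler receives the message dict
pubsub = r.pubsub()
pubsub.subscribe(order_placed=invalidate_user_cache)
pubsub.run_in_thread(sleep_time=0.1)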

# 5. Write-Time Invalidation (Transactional)
def update_with_invalidation(user_id, data):
    with db.transaction() as tx:
        tx.update(user_id, data)
        tx.execute("SELECT pg_notify('cache_invalidate', %s)", f"user:{user_id}")

Common Caching Strategies

1. Object Caching
Python
# Cache database rows as objects
def get_user(user_id):
    return cache.get_or_set(
        f"user:{user_id}",
        lambda: db.users.find_by_id(user_id),
        ttl=600
    )

def get_product(product_id):
    return cache.get_or_set(
        f"product:{product_id}",
        lambda: db.products.find_by_id(product_id),
        ttl=3600  # Products change less often
    )
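
The `cache.get_or_set` helper used in these examples is not defined in the notes. A minimal in-memory sketch of what such a helper might look like follows; `TTLCache` and its signature are assumptions chosen to match the calls above, not the API of a specific library.

```python
import time
from typing import Any, Callable, Dict, Tuple


class TTLCache:
    """In-memory cache with a get_or_set helper (hypothetical interface)."""

    def __init__(self) -> None:
        self._data: Dict[str, Tuple[Any, float]] = {}

    def get_or_set(self, key: str, loader: Callable[[], Any], ttl: int = 300) -> Any:
        entry = self._data.get(key)
        if entry is not None:
            value, expires_at = entry
            if time.monotonic() < expires_at:
                return value                   # fresh hit
        value = loader()                       # miss or expired: recompute
        self._data[key] = (value, time.monotonic() + ttl)
        return value


cache = TTLCache()
calls = []


def load_user() -> dict:
    calls.append("db")
    return {"id": 42, "name": "Ada"}


user = cache.get_or_set("user:42", load_user, ttl=600)
user_again = cache.get_or_set("user:42", load_user, ttl=600)
```

Passing the loader as a callable means the expensive query only runs on a miss, which is the whole point of the pattern.
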
2. Query Result Caching
Python
# Cache expensive query results
def get_top_products(category, limit=10):
    cache_key = f"top_products:{category}:{limit}"
    return cache.get_or_set(
        cache_key,
        lambda: db.query("""
            SELECT p.*, COUNT(o.id) as order_count
            FROM products p
            JOIN order_items o ON p.id = o.product_id
            WHERE p.category = %s
            GROUP BY p.id
            ORDER BY order_count DESC
            LIMIT %s
        """, category, limit),
        ttl=300
    )
3. Page/Fragment Caching
Python
# Cache entire rendered pages or fragments
def get_homepage():
    return cache.get_or_set(
        "page:home",
        lambda: render_template("home.html"),
        ttl=60
    )

def get_product_list(category):
    # Cache just the product list fragment
    products = cache.get_or_set(
        f"fragment:products:{category}",
        lambda: render_product_list(category),
        ttl=300
    )
    return render_template("category.html", products=products)
4. Session Caching
Python
# Store user sessions in Redis
# Fast lookup, distributed across app servers

class SessionStore:
    def __init__(self, redis_client):
        self.r = redis_client
    
    def get(self, session_id):
        data = self.r.get(f"session:{session_id}")
        return json.loads(data) if data else None
    
    def set(self, session_id, data, ttl=3600):
        self.r.setex(
            f"session:{session_id}",
            ttl,
            json.dumps(data)
        )

Cache Warming

Python
# Pre-populate cache before it expires or on startup

# 1. On startup
async def warm_cache_on_startup():
    # Load most popular items
    popular = await db.query("SELECT id FROM products ORDER BY views DESC LIMIT 1000")
    for product in popular:
        await cache_product(product.id)

# 2. Scheduled warming
async def scheduled_warm():
    # Refresh cache for items expiring soon
    expiring = await cache.get_keys_expiring_in(minutes=5)
    for key in expiring:
        if await is_still_popular(key):
            await refresh_cache(key)

# 3. Event-driven warming
# When new product added, warm its cache
async def on_product_created(product):
    await cache_product(product.id)
    # Also invalidate related caches so category listings include the new product
    await invalidate_category_cache(product.category)

Handling Cache Stampede

Plain text
┌─────────────────────────────────────────────────────────────┐
│              Cache Stampede Problem                          │
├─────────────────────────────────────────────────────────────┤
│                                                              │
│  1. Popular cache entry expires                              │
│  2. 1000 requests arrive simultaneously                     │
│  3. All 1000 hit database (cache miss)                     │
│  4. Database overload!                                       │
│                                                              │
└─────────────────────────────────────────────────────────────┘
Python
# Solutions:

# 1. Lock/Mutex (Only one process rebuilds)
import asyncio

async def get_with_lock(key, fetch_func, ttl=300, retries=50):
    value = await cache.get(key)
    if value is not None:
        return value
    
    # Try to acquire lock
    lock_key = f"lock:{key}"
    if await cache.acquire_lock(lock_key, ttl=10):
        try:
            value = await fetch_func()
            await cache.set(key, value, ttl)
            return value
        finally:
            await cache.release_lock(lock_key)
    
    # Another process holds the lock: poll until it populates the cache
    for _ in range(retries):
        await asyncio.sleep(0.1)
        value = await cache.get(key)
        if value is not None:
            return value
    
    # Lock holder never finished: fall back to the source
    return await fetch_func()

# 2. Probabilistic Early Expiration
import random
import time

async def get_with_early_refresh(key, fetch_func, ttl=300):
    # Assumes get_with_meta returns (None, None) on a miss
    value, expiry_time = await cache.get_with_meta(key)
    
    if value is not None:
        # Refresh if close to expiry (probabilistic)
        time_remaining = expiry_time - time.time()
        if time_remaining < ttl * 0.1:  # Last 10% of TTL
            if random.random() < 0.1:  # 10% chance
                asyncio.create_task(refresh_cache(key, fetch_func))
        
        return value
    
    # Cache miss: load and store so later requests hit the cache
    value = await fetch_func()
    await cache.set(key, value, ttl)
    return value

# 3. Always-Refresh Pattern (Never let it expire)
# Background job continuously refreshes popular items
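
The lock-based approach (solution 1) can be tried in a single process without Redis. The sketch below uses one `asyncio.Lock` per key and a re-check after acquiring it; `slow_fetch` and `get_single_flight` are hypothetical names, and a multi-server deployment would need a distributed lock instead.

```python
import asyncio
from collections import defaultdict
from typing import Any, Awaitable, Callable, Dict

cache: Dict[str, Any] = {}
locks: Dict[str, asyncio.Lock] = defaultdict(asyncio.Lock)
db_calls = 0


async def slow_fetch() -> str:
    """Stand-in for an expensive database query."""
    global db_calls
    db_calls += 1
    await asyncio.sleep(0.05)
    return "report"


async def get_single_flight(key: str, fetch: Callable[[], Awaitable[Any]]) -> Any:
    if key in cache:
        return cache[key]
    async with locks[key]:
        # Re-check: another task may have filled the cache while we waited.
        if key not in cache:
            cache[key] = await fetch()
        return cache[key]


async def main() -> list:
    # 100 concurrent requests for the same cold key.
    return await asyncio.gather(
        *(get_single_flight("report", slow_fetch) for _ in range(100))
    )


results = asyncio.run(main())
```

All 100 callers receive the value, but only the first one to take the lock runs the fetch; the rest find the cache filled when their turn comes.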

Redis Specific Patterns

Python
# 1. Hash for structured data
# Store user object as hash instead of JSON string
r.hset(f"user:{user_id}", mapping={
    "name": user.name,
    "email": user.email,
    "age": user.age
})

# Get single field (efficient!)
email = r.hget(f"user:{user_id}", "email")

# Get all fields
user_data = r.hgetall(f"user:{user_id}")

# 2. Pipeline for batch operations
pipe = r.pipeline()
for user_id in user_ids:
    pipe.get(f"user:{user_id}")
results = pipe.execute()  # Single round-trip

# 3. Sorted Sets for leaderboards/rankings
# Add score
r.zadd("leaderboard", {user_id: score})

# Get top 10
r.zrevrange("leaderboard", 0, 9, withscores=True)

# Get rank of user
rank = r.zrevrank("leaderboard", user_id)

# 4. Sets for relationships
# Followers
r.sadd(f"followers:{user_id}", follower_id)
r.srem(f"followers:{user_id}", follower_id)
followers = r.smembers(f"followers:{user_id}")

# Check if following
is_following = r.sismember(f"followers:{user_id}", follower_id)

# Common followers (intersection)
common = r.sinter(f"followers:{user_a}", f"followers:{user_b}")

# 5. Rate Limiting (Fixed Window Counter)
def check_rate_limit(user_id, max_requests=100, window=60):
    key = f"rate_limit:{user_id}"
    
    # INCR is atomic, so concurrent requests cannot slip past the limit
    current = r.incr(key)
    if current == 1:
        # First request in this window: start the expiry clock once,
        # rather than resetting it on every request
        r.expire(key, window)
    
    return current <= max_requests

# 6. Distributed Lock
import uuid

def acquire_lock(lock_name, acquire_time=10):
    identifier = str(uuid.uuid4())
    lock_key = f"lock:{lock_name}"
    
    # NX = Only if Not eXists
    if r.set(lock_key, identifier, nx=True, ex=acquire_time):
        return identifier
    return None

def release_lock(lock_name, identifier):
    lock_key = f"lock:{lock_name}"
    # Only release if we own it
    with r.pipeline() as pipe:
        while True:
            try:
                pipe.watch(lock_key)
                # redis-py returns bytes unless decode_responses=True
                if pipe.get(lock_key) in (identifier, identifier.encode()):
                    pipe.multi()
                    pipe.delete(lock_key)
                    pipe.execute()
                    return True
                pipe.unwatch()
                break
            except redis.WatchError:
                continue
    return False
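
For the rate-limiting pattern (item 5), a counter with an expiry only approximates a moving window. A sliding-window log keeps one timestamp per request and discards those older than the window. The sketch below is an in-memory illustration with hypothetical names; a Redis version would typically keep the timestamps in a sorted set.

```python
import time
from collections import defaultdict, deque
from typing import Deque, Dict, Optional


class SlidingWindowLimiter:
    """Sliding-window log: allow at most max_requests in any `window` seconds."""

    def __init__(self, max_requests: int = 100, window: float = 60.0) -> None:
        self.max_requests = max_requests
        self.window = window
        self._hits: Dict[str, Deque[float]] = defaultdict(deque)

    def allow(self, user_id: str, now: Optional[float] = None) -> bool:
        now = time.monotonic() if now is None else now
        hits = self._hits[user_id]
        # Drop timestamps that have slid out of the window.
        while hits and hits[0] <= now - self.window:
            hits.popleft()
        if len(hits) >= self.max_requests:
            return False
        hits.append(now)
        return True


limiter = SlidingWindowLimiter(max_requests=2, window=10.0)
decisions = [
    limiter.allow("u1", now=0.0),     # allowed
    limiter.allow("u1", now=1.0),     # allowed
    limiter.allow("u1", now=2.0),     # rejected: two hits in the last 10s
    limiter.allow("u1", now=10.5),    # allowed: the hit at t=0 has aged out
]
```

The explicit `now` parameter exists only to make the example deterministic; in normal use it is omitted and the monotonic clock is read.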

Cache Configuration Guidelines

Plain text
TTL Guidelines:
┌─────────────────────────────────────────────────────────────┐
│  Data Type              │ Suggested TTL                     │
├─────────────────────────────────────────────────────────────┤
│  User sessions          │ 1-24 hours (match session expiry)  │
│  User profiles          │ 5-60 minutes                      │
│  Product catalogs       │ 1-24 hours                        │
│  Inventory counts       │ 1-5 minutes (stale OK)            │
│  Pricing                │ 1-60 minutes                      │
│  Search results         │ 5-30 minutes                      │
│  API responses          │ 1-60 minutes                      │
│  Configuration          │ 1-24 hours                        │
└─────────────────────────────────────────────────────────────┘

Cache Size Guidelines:
- Cache hot data (20% of data likely serves 80% of requests)
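- Monitor cache hit ratio (a common target is > 90%; the right figure depends on the workload)
- Eviction policy: LRU (Least Recently Used) suits most read-heavy caches
- Set maxmemory and an explicit eviction policy (in Redis: maxmemory-policy, e.g. allkeys-lru)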
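
To make the LRU eviction policy concrete, here is a small sketch built on `collections.OrderedDict`. The `LRUCache` class is illustrative only; Redis itself approximates LRU by sampling keys rather than tracking exact order.

```python
from collections import OrderedDict
from typing import Any, Hashable, Optional


class LRUCache:
    """Fixed-capacity cache that evicts the least recently used key."""

    def __init__(self, capacity: int) -> None:
        if capacity <= 0:
            raise ValueError("capacity must be positive")
        self.capacity = capacity
        self._data: "OrderedDict[Hashable, Any]" = OrderedDict()

    def get(self, key: Hashable) -> Optional[Any]:
        if key not in self._data:
            return None
        self._data.move_to_end(key)            # mark as most recently used
        return self._data[key]

    def set(self, key: Hashable, value: Any) -> None:
        self._data[key] = value
        self._data.move_to_end(key)
        if len(self._data) > self.capacity:
            self._data.popitem(last=False)     # evict the oldest entry


cache = LRUCache(capacity=2)
cache.set("a", 1)
cache.set("b", 2)
cache.get("a")         # "a" is now more recent than "b"
cache.set("c", 3)      # over capacity: "b" is evicted
```

Reads count as use, so a key that keeps getting hit stays resident while rarely read keys fall out first.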

Monitoring Cache

Python
# Key metrics to track:
info = r.info()  # One INFO call; r is the redis.Redis client

# 1. Hit Ratio
hits = info['keyspace_hits']
misses = info['keyspace_misses']
total = hits + misses
hit_ratio = hits / total if total else 0.0
# Target: > 90%

# 2. Latency
# p50, p95, p99 for cache operations

# 3. Memory usage
used_memory = info['used_memory']
maxmemory = info['maxmemory']  # 0 means no limit is set
usage_percent = (used_memory / maxmemory) * 100 if maxmemory else None

# 4. Evictions
evicted = info['evicted_keys']
# Should be low; if high, increase memory or reduce TTL

# 5. Connection count
connected_clients = info['connected_clients']
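
The metric calculations above can be wrapped in a pure function so they are testable without a running server. This sketch takes an INFO-style dict as input; the function name and the sample values are hypothetical, and it guards the two divisions that fail on a fresh or unlimited instance.

```python
from typing import Any, Dict, Optional


def summarize_cache_stats(info: Dict[str, Any]) -> Dict[str, Optional[float]]:
    """Derive hit ratio and memory usage from a Redis INFO-style dict."""
    hits = info.get("keyspace_hits", 0)
    misses = info.get("keyspace_misses", 0)
    total = hits + misses
    maxmemory = info.get("maxmemory", 0)       # 0 means no limit is configured
    return {
        "hit_ratio": hits / total if total else None,
        "memory_usage_percent": (
            info.get("used_memory", 0) / maxmemory * 100 if maxmemory else None
        ),
        "evicted_keys": info.get("evicted_keys", 0),
    }


# Sample values for illustration only, not real measurements.
sample_info = {
    "keyspace_hits": 950,
    "keyspace_misses": 50,
    "used_memory": 512,
    "maxmemory": 1024,
    "evicted_keys": 3,
}
stats = summarize_cache_stats(sample_info)
```

Returning `None` instead of a number when there is no traffic or no memory limit keeps dashboards from showing a misleading 0%.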

Next: CAP Theorem & Distributed Systems