Databases — A Senior Engineer's End-to-End Field Guide
A practical, in-depth walkthrough of database fundamentals → design trade-offs → how a senior engineer actually chooses. Written to be read top-to-bottom, with simplified Mermaid diagrams for the concepts that are easier seen than read.
How to read this: Each section builds on the previous one. The vocabulary (Part 1) feeds the trade-offs (Part 2), which feed the decision framework (Part 3). Real-world examples are called out in Example blocks.
The single most important idea in this whole document:
You don't pick a database. You describe a workload, attach real numbers to it, and the right database falls out.
The flow a senior engineer follows:
Rendering diagram…
Everything below is the toolkit that makes each box in that diagram concrete.
1. Foundations
A relational database stores data in tables — think spreadsheets with strict rules.
Row = one record (one user, one order).
Column = one field (email, created_at).
Schema = the defined structure: which tables exist, their columns, and each column's type. A strong schema (Postgres, MySQL) enforces the rules at write time; a schemaless store (MongoDB) lets each record differ.
Primary key (PK) = a column that uniquely identifies each row (e.g. user_id). No two rows share it.
Foreign key (FK) = a column that points to another table's PK. This is how tables relate.
Rendering diagram…
Example — An e-commerce app stores customers in USERS and purchases in ORDERS. Each order carries a user_id FK so you always know who bought what, without copying the customer's name and address onto every single order.
2. Joins
Because relational data is split across tables, you often need to recombine it. A join stitches rows from two tables together using a matching column.
Rendering diagram…
Sql
Why it matters: joins avoid duplicated data, but they are computationally expensive at scale — the engine matches rows across potentially huge tables. That cost is exactly why, at scale, people sometimes duplicate data to avoid joins (next section).
3. Normalization vs. Denormalization
This is the classic write-simplicity vs. read-speed trade-off.
Normalization
Denormalization
Idea
Each fact lives in exactly one place
Deliberately duplicate data
Reads
Slower (need joins)
Faster (everything in one place)
Writes
Clean (update once)
Painful (update many copies)
Storage
Less
More
Risk
—
Data drifts out of sync
Rendering diagram…
Example — A banking system normalizes aggressively (correctness is sacred; an address must never disagree with itself). A product catalog read 10,000×/sec denormalizes the product name and price onto the listing row so a page load doesn't trigger five joins.
4. Indexing (deep dive)
4.1 The problem
Without an index, finding a row = full table scan: read row 1, row 2, … all 10 million, checking each. Painfully slow.
An index is a separate, pre-sorted lookup structure — like the index at the back of a textbook.
4.2 What an index actually is
A mini-table holding only (the indexed value, kept sorted) → (a pointer to the full row):
Indexed value (sorted)
Pointer
aaron@x.com
→ row @ block 91
beth@x.com
→ row @ block 12
keto@shubho.tech
→ row @ block 57
zara@x.com
→ row @ block 33
Two properties make it powerful: it's small (one column + a pointer) and it's sorted.
4.3 Why sorted wins — binary search
Rendering diagram…
Linear scan: up to 10,000,000 steps. Binary search: ~24 steps. That's the difference between "slow" and "instant."
4.4 The real structure — a B-tree
Real databases use a B-tree (balanced tree), because inserting into the middle of a giant flat list is expensive.
Rendering diagram…
Root / branch nodes are signposts ("before M go left").
Leaf nodes hold the actual sorted values + row pointers, and are linked in order (so range scans just walk sideways).
Stays balanced: every lookup costs roughly the tree's depth — a handful of steps even at billions of rows. Inserts slot into the right leaf and rebalance locally.
4.5 The trade-off
Rendering diagram…
Rule of thumb: index the columns you frequently search by, join on, or sort by — and resist indexing everything. A write-heavy table with too many indexes grinds.
4.6 Flavors of indexes
Type
What it is
Use when
Clustered
The table itself is physically sorted by this key. Only one per table (usually the PK); the leaf is the row.
Primary key
Non-clustered / secondary
Separate structure with pointers back to the row. Many allowed.
Any other searched column
Composite
Multi-column, e.g. (last_name, first_name). Sorted by first, then second — order matters (like a phone book).
Multi-column filters
Unique
Also forbids duplicates
Emails, usernames
Partial
Indexes only some rows, e.g. WHERE status='active'
You only query a subset
Covering
Includes the columns a query needs, so it answers from the index alone (skips the row fetch)
Hot queries
Example — A phone book is a composite index on (last_name, first_name). Finding "Smith, John" is instant. Finding everyone named "John" (any last name) is useless — the book isn't sorted that way. Same rule applies to composite DB indexes: leading column first.
5. ACID & Transactions
A transaction groups operations into one unit. ACID are its guarantees:
Rendering diagram…
Example — A money transfer: "debit A by $100" and "credit B by $100" must both happen or neither (Atomicity). Two transfers hitting the same account at once must not corrupt the balance (Isolation). Once the bank says "done," a power cut can't undo it (Durability).
Strong ACID is why you pick a relational DB for a banking ledger. Many NoSQL systems relax some of these for speed/scale.
6. CAP & PACELC
For distributed databases (data spread across machines): during a network partition (nodes can't talk), you can guarantee only two of three — so really you choose between C and A.
Rendering diagram…
C — Consistency: every read sees the latest write.
A — Availability: every request gets a response.
P — Partition tolerance: keeps working despite network breaks (non-negotiable in distributed systems).
PACELC extends it: if Partition → choose A or C; Else (normal operation) → trade Latency vs. Consistency. Keeping replicas perfectly in sync costs time.
Eventual consistency is the AP compromise: replicas may briefly disagree but converge.
Example — A YouTube view count stuck at "301" for a while is eventual consistency in the wild: the system chose availability + low latency over perfectly synchronized counters.
7. Scaling
Rendering diagram…
7.1 Replication — copies of the same data
Great for read scaling and fault tolerance.
Rendering diagram…
Key question: synchronous replicas (safe, slower) or asynchronous (fast, can be briefly stale)?
7.2 Sharding — split different data across machines
This is how you scale writes — the genuinely hard problem.
Rendering diagram…
The shard key is the highest-stakes, hardest-to-reverse decision in a distributed DB. A bad key creates a hotspot — e.g. sharding by country when 80% of users are in one country dumps the load on a single shard.
8. The Measurements
These are the numbers a senior attaches to a workload.
Metric
What it means
Why it matters
QPS / TPS
Queries / transactions per second
Raw load; watch the peak-to-average ratio
p50 / p95 / p99 latency
Median vs. the slow tail
Optimize p99 — the tail is what users feel
Data volume & growth
10 GB vs 10 TB vs 10 PB
Picks the whole tool class; project 1–3 yrs out
Working set
Hot data that must fit in RAM/cache
Drives memory sizing
RPO
How much data you can afford to lose
Backup/replication strategy
RTO
How fast you must recover
Failover design
SLA
Uptime promise
99.9% ≈ 8.7 h/yr down; 99.99% ≈ 52 min/yr
Cost
$ / query, / GB stored, / GB transferred + ops
The bill and the human cost
Rendering diagram…
Example — An API averaging "30 ms" can still feel broken if its p99 is 3 seconds: 1 in 100 requests stalls, and a page making 100 calls will almost always hit one. Averages lie; percentiles tell the truth.
9. OLTP vs. OLAP
Rendering diagram…
OLTP: "place an order," "update a profile." Row-oriented (a whole row sits together).
OLAP: "total revenue by region last quarter." Column-oriented — only reads the columns it needs, so it crushes scan-heavy aggregation.
Example — Your app writes orders to Postgres (OLTP) all day. Overnight, those orders stream into BigQuery (OLAP) where analysts run "revenue by region" without ever slowing down the live store.
10. Storage Engines
Rendering diagram…
B-tree: updates data in place; balanced read/write; the default for SQL databases.
LSM-tree (Log-Structured Merge): buffers writes in memory, flushes sequential immutable files, merges them in the background (compaction). Very fast writes; reads may check several files. Used by Cassandra, RocksDB, ScyllaDB.
Example — An IoT fleet writing 500k sensor readings/sec leans LSM (Cassandra): writes are sequential and cheap. A CRM doing balanced reads/writes with rich queries leans B-tree (Postgres).
11. The Decision Framework
Rendering diagram…
Steps:
Characterize the workload — read/write ratio, access patterns, query predictability, data shape, consistency needs.
Attach real numbers — QPS, data size + growth, p99 target, SLA, budget.
Let the numbers eliminate options — don't argue taste; let constraints decide.
Default to boring — Postgres until measured evidence says otherwise.
Scale the bottleneck — reads → replicas/cache; writes → shard or specialized store.
12. Worked Case Studies
Case A — Analytics dashboard
50k writes/sec of event data · queries are big time-range aggregations · p95 read < 2s · grows ~1 TB/month · stale-by-a-minute is fine.
Rendering diagram…
Case B — Banking ledger
2k TPS · every transaction must be correct & durable · ad-hoc queries by support staff.
Rendering diagram…
Same engineer, opposite decisions — because the measurements differed. That's the entire point.
13. Senior-Level Wisdom
Boring is good. Default to Postgres until you have measured evidence it can't keep up. It scales much further than people assume.
Scaling reads ≠ scaling writes. Reads scale easily (replicas, caching). Writes are the hard problem (sharding is mostly a one-way door).
The shard key is the most consequential, least reversible choice in a distributed DB. Choose to spread load evenly; avoid hotspots.
Polyglot persistence is normal. Postgres (transactions) + Redis (cache) + Elasticsearch (search) + warehouse (analytics). The right answer is often "more than one."
Operational maturity matters. "Can my team run this at 3am?" A theoretically-better DB nobody can operate is the wrong choice.
Design for 10×, not 1000×. Architecting today for traffic you won't see for five years is over-engineering. Leave room to evolve; don't build the final form on day one.
Measure before you migrate. "We need NoSQL for scale" is, more often than not, a missing index or an N+1 query.
14. Glossary
Term
One-line meaning
Schema
The defined structure & types of your tables
Primary key
Unique identifier for a row
Foreign key
A pointer to another table's primary key
Join
Recombine rows across tables by a matching column
Normalization
Store each fact once (clean writes, slower reads)
Denormalization
Duplicate data (fast reads, messy writes)
Index
Sorted lookup structure for fast finds
B-tree
Balanced tree index; read-friendly, in-place
LSM-tree
Write-friendly index; buffer + flush + compaction
Clustered index
Table physically sorted by this key (one per table)
Covering index
Index that answers a query without fetching the row
ACID
Atomicity, Consistency, Isolation, Durability
Transaction
A group of operations treated as one unit
CAP
Under a partition, pick Consistency or Availability
PACELC
Even without a partition, trade Latency vs. Consistency
Eventual consistency
Replicas briefly disagree, then converge
Replication
Copies of the same data on multiple machines (read scaling)
Sharding
Different data split across machines (write scaling)
Shard key
The column that decides how data is partitioned
Hotspot
One shard overloaded due to a bad shard key
QPS / TPS
Queries / transactions per second
p99
The latency 99% of requests beat (the slow tail)
RPO / RTO
How much data you can lose / how fast you recover
SLA
Uptime promise (e.g. 99.99%)
OLTP
Operational, row-oriented, many small ops
OLAP
Analytical, column-oriented, big aggregations
Working set
Hot data that should fit in memory
Polyglot persistence
Using several databases, each for its strength
Built as a study reference — pair it with the in-workspace Notion page for quick lookups, and use this file for the diagrams and offline reading.