InkdownInkdown
Start writing

Study

59 files·8 subfolders

Shared Workspace

Study
core

05-transactions-acid

Shared from "Study" on Inkdown

05 - Transactions & ACID

What is a Transaction?

A transaction is a logical unit of work that consists of one or more database operations. It represents a complete business operation that must succeed or fail as a whole.

Real-World Analogy: Bank Transfer
Plain text
Transfer $100 from Alice to Bob:

┌─────────────────────────────────────────┐
│ Step 1: Check Alice has $100           │
│ Step 2: Deduct $100 from Alice           │
│ Step 3: Add $100 to Bob                  │
│ Step 4: Record the transaction           │
└─────────────────────────────────────────┘

This MUST be atomic - either all steps complete, or none do.

Scenario without transactions:
- Step 2 completes (Alice -$100)
- System crashes before Step 3
- Result: $100 disappeared! 💸

With transactions:
- All steps succeed → Commit (permanent)
- Any step fails → Rollback (undo everything)

ACID Properties

programming-language-concepts.md
zero-language-explanation.md
DB
01-introduction.md
02-relational-databases.md
03-database-design.md
04-indexing.md
05-transactions-acid.md
06-nosql-databases.md
07-query-optimization.md
08-replication-ha.md
09-sharding-partitioning.md
10-caching-strategies.md
11-cap-theorem.md
12-connection-pooling.md
13-backup-recovery.md
14-monitoring.md
15-database-selection.md
README.md
JS
Event loop
Merlin Backend
01-Orchestration.md
02-DeepResearch.md
03-Search.md
04-Scraping.md
05-Streaming.md
06-MultiProviderLLM.md
07-MemoryAndContext.md
08-ErrorHandling.md
09-RateLimiting.md
10-TaskQueue.md
11-SecurityAndAuth.md
Orchestration-2nd-draft
OpenAI Agents Python
00_OVERVIEW.md
01_AGENT_SYSTEM.md
02_RUNNER_SYSTEM.md
03_TOOL_SYSTEM.md
04_ITEMS_SYSTEM.md
05_GUARDRAILS.md
06_HANDOFFS.md
07_MEMORY_SESSIONS.md
08_MODEL_PROVIDERS.md
09_SANDBOX_SYSTEM.md
10_TRACING.md
11_RUN_STATE.md
12_CONTEXT.md
13_LIFECYCLE_HOOKS.md
14_CONFIGURATION.md
15_ERROR_HANDLING.md
16_STREAMING.md
17_EXTENSIONS.md
18_MCP_INTEGRATION.md
19_BEST_PRACTICES.md
20_ARCHITECTURE_PATTERNS.md
opencode-study
context-handling
core
Python
Alembic
Basics
sqlalchemy - fastapi
SQLAlchemy overview
tweets
system_design_for_agentic_apps.md

ACID guarantees data integrity even in the face of errors, power failures, and concurrent access.

Plain text
┌─────────────────────────────────────────────────────┐
│                    ACID                              │
├─────────────────────────────────────────────────────┤
│                                                     │
│   A ──► Atomicity    ──► All or Nothing           │
│   C ──► Consistency  ──► Valid State to Valid State │
│   I ──► Isolation    ──► Concurrent = Serial        │
│   D ──► Durability   ──► Committed = Permanent      │
│                                                     │
└─────────────────────────────────────────────────────┘
A - Atomicity

Definition: A transaction is treated as a single unit. Either all operations complete successfully, or all are rolled back.

Sql
BEGIN;  -- Start transaction

UPDATE accounts SET balance = balance - 100 WHERE id = 'alice';
UPDATE accounts SET balance = balance + 100 WHERE id = 'bob';
INSERT INTO transaction_log (from, to, amount) VALUES ('alice', 'bob', 100);

-- If all succeed:
COMMIT;  -- Permanent save

-- If any fails:
ROLLBACK;  -- Undo all changes

Implementation:

  • Write-Ahead Log (WAL): Changes recorded in log before applying
  • If crash occurs, database replays log to complete or rollback
Plain text
┌─────────────────────────────────────────┐
│           Transaction Flow              │
├─────────────────────────────────────────┤
│                                         │
│  1. BEGIN                               │
│     └── Mark transaction start          │
│                                         │
│  2. Operations                          │
│     └── Write to WAL first (durability) │
│     └── Modify data in memory           │
│                                         │
│  3. COMMIT / ROLLBACK                   │
│     └── COMMIT: Flush WAL to disk       │
│     └── ROLLBACK: Discard changes       │
│                                         │
└─────────────────────────────────────────┘
C - Consistency

Definition: A transaction brings the database from one valid state to another, maintaining all defined rules (constraints, triggers, cascades).

Sql
-- Database rules:
-- 1. Account balance >= 0 (CHECK constraint)
-- 2. Total money in system stays constant

BEGIN;
UPDATE accounts SET balance = balance - 100 WHERE id = 'alice';  -- OK
UPDATE accounts SET balance = balance + 100 WHERE id = 'bob';      -- OK
COMMIT;
-- Result: Valid state (rules still hold)

BEGIN;
UPDATE accounts SET balance = balance - 100 WHERE id = 'alice';  -- OK
UPDATE accounts SET balance = balance + 50 WHERE id = 'bob';     -- OK (but wrong!)
-- This violates business rule but not DB constraint
-- Application must enforce this!
COMMIT;

Important Distinction:

  • Database Consistency: Enforced by constraints, triggers
  • Application Consistency: Enforced by business logic
I - Isolation

Definition: Concurrent transactions execute independently. The result is as if transactions ran serially, one after another.

Without Isolation (Problems):

Plain text
┌─────────────────────────────────────────────────────────┐
│            Lost Update (Concurrent Writes)              │
├─────────────────────────────────────────────────────────┤
│                                                         │
│  Time │ Transaction A          │ Transaction B          │
│  ─────┼────────────────────────┼────────────────────────│
│   T1  │ READ balance = 100     │                        │
│   T2  │                        │ READ balance = 100     │
│   T3  │ balance = 100 + 50     │                        │
│   T4  │                        │ balance = 100 + 30     │
│   T5  │ WRITE balance = 150    │                        │
│   T6  │                        │ WRITE balance = 130    │
│       │                        │ ← A's update lost!     │
│                                                         │
└─────────────────────────────────────────────────────────┘

┌─────────────────────────────────────────────────────────┐
│              Dirty Read (Uncommitted Data)              │
├─────────────────────────────────────────────────────────┤
│                                                         │
│  Time │ Transaction A          │ Transaction B          │
│  ─────┼────────────────────────┼────────────────────────│
│   T1  │ BEGIN                  │                        │
│   T2  │ UPDATE x = 100         │                        │
│   T3  │                        │ READ x = 100           │
│   T4  │ ROLLBACK               │                        │
│   T5  │                        │ Using x = 100 (wrong!) │
│                                                         │
│  B read data that was never committed!                  │
│                                                         │
└─────────────────────────────────────────────────────────┘

┌─────────────────────────────────────────────────────────┐
│              Non-Repeatable Read                          │
├─────────────────────────────────────────────────────────┤
│                                                         │
│  Time │ Transaction A          │ Transaction B          │
│  ─────┼────────────────────────┼────────────────────────│
│   T1  │ BEGIN                  │                        │
│   T2  │ READ x = 100           │                        │
│   T3  │                        │ BEGIN                  │
│   T4  │                        │ UPDATE x = 200         │
│   T5  │                        │ COMMIT                 │
│   T6  │ READ x = 200           │                        │
│       │  ↑ Different value!                            │
│                                                         │
│  Same transaction, same query, different result!        │
│                                                         │
└─────────────────────────────────────────────────────────┘

┌─────────────────────────────────────────────────────────┐
│              Phantom Read                               │
├─────────────────────────────────────────────────────────┤
│                                                         │
│  Time │ Transaction A          │ Transaction B          │
│  ─────┼────────────────────────┼────────────────────────│
│   T1  │ BEGIN                  │                        │
│   T2  │ SELECT * WHERE age > 30│                        │
│       │ → Returns 5 rows       │                        │
│   T3  │                        │ INSERT user age=35     │
│   T4  │                        │ COMMIT                 │
│   T5  │ SELECT * WHERE age > 30│                        │
│       │ → Returns 6 rows       │                        │
│       │  ↑ New "phantom" row appeared!                 │
│                                                         │
└─────────────────────────────────────────────────────────┘
Isolation Levels

SQL standard defines 4 isolation levels. Higher level = more isolation = less concurrency.

Plain text
┌─────────────────────────────────────────────────────────────┐
│  Isolation Level        │ Dirty Read │ Non-Repeatable │ Phantom │
├─────────────────────────────────────────────────────────────┤
│  READ UNCOMMITTED       │   YES      │     YES        │  YES    │
│  READ COMMITTED         │   NO       │     YES        │  YES    │
│  REPEATABLE READ        │   NO       │     NO         │  YES    │
│  SERIALIZABLE           │   NO       │     NO         │  NO     │
└─────────────────────────────────────────────────────────────┘

Default by Database:
- PostgreSQL: READ COMMITTED
- MySQL (InnoDB): REPEATABLE READ
- Oracle: READ COMMITTED
- SQL Server: READ COMMITTED
Sql
-- Set isolation level
SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;

-- In PostgreSQL
BEGIN TRANSACTION ISOLATION LEVEL REPEATABLE READ;

READ UNCOMMITTED:

Sql
-- Can see uncommitted changes (rarely used)
-- Basically no isolation
-- Some databases treat it as READ COMMITTED

READ COMMITTED (Default in PostgreSQL, Oracle):

Sql
-- Can only see committed data
-- But: Non-repeatable reads and phantoms possible

-- Example:
BEGIN;
SELECT balance FROM accounts WHERE id = 1;  -- $100
-- ... another transaction commits, changing balance to $200
SELECT balance FROM accounts WHERE id = 1;  -- $200 (different!)
COMMIT;

REPEATABLE READ (Default in MySQL):

Sql
-- Same query returns same results within transaction
-- But: Phantom reads still possible (new rows can appear)

-- PostgreSQL: Actually prevents phantoms too (using MVCC)

SERIALIZABLE:

Sql
-- Complete isolation, as if transactions ran one after another
-- Most restrictive, can cause more conflicts/retries
-- PostgreSQL implements this without locking (using SSI - Serializable Snapshot Isolation)
D - Durability

Definition: Once a transaction is committed, it remains committed even in case of system failure (power loss, crash).

Implementation:

  • Write-Ahead Logging (WAL)
  • fsync() to ensure data reaches disk
  • Some databases offer relaxed durability for performance
Plain text
┌─────────────────────────────────────────┐
│         Durability Mechanism            │
├─────────────────────────────────────────┤
│                                         │
│  1. Transaction commits                 │
│                                         │
│  2. Write to WAL (sequential, fast)     │
│     └── fsync() to disk                │
│                                         │
│  3. Return success to application       │
│                                         │
│  4. Later: Write to actual data files   │
│     └── Can be async, batched          │
│                                         │
│  If crash after step 3:                 │
│  └── Recovery replays WAL               │
│                                         │
└─────────────────────────────────────────┘

Transaction Commands

Sql
-- Start transaction
BEGIN;
-- or
START TRANSACTION;

-- Savepoint (partial rollback)
BEGIN;
UPDATE accounts SET balance = balance - 100 WHERE id = 1;
SAVEPOINT before_bonus;
UPDATE accounts SET balance = balance + 50 WHERE id = 2;
-- Oops, wrong bonus
ROLLBACK TO before_bonus;
-- Now only first update remains
COMMIT;

-- Autocommit (default in many clients)
SET autocommit = ON;  -- Each statement is a transaction
SET autocommit = OFF; -- Must explicitly COMMIT

Implementation: MVCC (Multi-Version Concurrency Control)

PostgreSQL and MySQL (InnoDB) use MVCC instead of locking for isolation.

Plain text
┌─────────────────────────────────────────────────────────┐
│                   MVCC Concept                          │
├─────────────────────────────────────────────────────────┤
│                                                         │
│  Instead of locking rows, each transaction sees:       │
│  - A snapshot of data as of transaction start          │
│  - Multiple versions of same row can exist             │
│                                                         │
│  How it works:                                          │
│  ┌─────────────┐        ┌─────────────┐               │
│  │ Row Version │   ┌───►│ Row Version │               │
│  │ xmin: 100   │   │    │ xmin: 200   │               │
│  │ xmax: 200   │───┘    │ xmax: null  │               │
│  │ data: "A"  │         │ data: "B"  │               │
│  └─────────────┘         └─────────────┘               │
│       (old)                  (current)                │
│                                                         │
│  xmin = transaction that created this version          │
│  xmax = transaction that deleted/obsoleted this        │
│                                                         │
│  Transaction 150 sees "A" (snapshot at start)          │
│  Transaction 250 sees "B" (created before 250)        │
│                                                         │
└─────────────────────────────────────────────────────────┘

MVCC Benefits:

  • Readers don't block writers
  • Writers don't block readers
  • No read locks needed
  • Better concurrency

MVCC Costs:

  • Storage for multiple versions
  • Need vacuuming to clean old versions

Locking

When conflicts occur, databases use locks.

Lock Types
Plain text
┌─────────────────────────────────────────────────────────┐
│                   Lock Hierarchy                        │
├─────────────────────────────────────────────────────────┤
│                                                         │
│  TABLE LOCKS (coarse, high conflict)                   │
│  ├── ACCESS SHARE          (SELECT)                     │
│  ├── ROW SHARE             (SELECT FOR UPDATE)          │
│  ├── ROW EXCLUSIVE         (INSERT/UPDATE/DELETE)       │
│  ├── SHARE UPDATE EXCLUSIVE (VACUUM)                   │
│  ├── SHARE                 (CREATE INDEX)               │
│  ├── SHARE ROW EXCLUSIVE   (??? - rare)                │
│  ├── EXCLUSIVE             (REFRESH MATERIALIZED VIEW)│
│  └── ACCESS EXCLUSIVE      (DROP TABLE, ALTER TABLE)  │
│                                                         │
│  ROW LOCKS (fine-grained, less conflict)               │
│  ├── FOR SHARE            (read lock, share with others)│
│  ├── FOR UPDATE           (write lock, exclusive)       │
│  ├── FOR NO KEY UPDATE    (weaker than FOR UPDATE)     │
│  └── FOR KEY SHARE        (weaker than FOR SHARE)      │
│                                                         │
│  Lock compatibility:                                     │
│  - Share locks are compatible with other share locks   │
│  - Exclusive locks block everyone                       │
│                                                         │
└─────────────────────────────────────────────────────────┘
Explicit Locking
Sql
-- Pessimistic locking
BEGIN;
SELECT * FROM accounts WHERE id = 1 FOR UPDATE;
-- Now row 1 is locked until COMMIT/ROLLBACK
UPDATE accounts SET balance = balance - 100 WHERE id = 1;
COMMIT;

-- Advisory locks (application-level)
SELECT pg_advisory_lock(123);  -- Lock integer key
-- Do work...
SELECT pg_advisory_unlock(123);

-- Skip locked rows (good for queues)
SELECT * FROM jobs WHERE status = 'pending' 
FOR UPDATE SKIP LOCKED
LIMIT 1;
Deadlocks
Plain text
┌─────────────────────────────────────────────────────────┐
│                    DEADLOCK                            │
├─────────────────────────────────────────────────────────┤
│                                                         │
│  Time │ Transaction A          │ Transaction B          │
│  ─────┼────────────────────────┼────────────────────────│
│   T1  │ Lock Row 1             │                        │
│   T2  │                        │ Lock Row 2             │
│   T3  │ Request Lock Row 2     │                        │
│       │  ↓ Waiting for B...    │                        │
│   T4  │                        │ Request Lock Row 1     │
│       │                        │  ↓ Waiting for A...    │
│       │                        │                        │
│       │  ◄─── CIRCULAR WAIT ──►│                        │
│       │        DEADLOCK!                              │
│                                                         │
│  Database detects and aborts one transaction            │
│                                                         │
└─────────────────────────────────────────────────────────┘

Avoiding Deadlocks:

  1. Always acquire locks in the same order
  2. Keep transactions short
  3. Use lower isolation levels when possible
  4. Use advisory locks for complex operations

Transaction Best Practices

Sql
-- ✓ Good: Short transactions
BEGIN;
-- Do minimal work
UPDATE accounts SET balance = balance - 100 WHERE id = 1;
UPDATE accounts SET balance = balance + 100 WHERE id = 2;
COMMIT;

-- ✗ Bad: Long transaction with user interaction
BEGIN;
SELECT * FROM products;
-- ... wait for user to decide ... (minutes!)
UPDATE products SET ...;
COMMIT;

-- ✓ Good: Batch work in reasonable chunks
-- Process 1000 rows at a time, commit, repeat
FOR batch IN (SELECT * FROM large_table LIMIT 1000 OFFSET n*1000) LOOP
    -- Process
    COMMIT;
END LOOP;

-- ✓ Good: Use appropriate isolation level
-- Don't use SERIALIZABLE for everything
-- Use READ COMMITTED when possible

-- ✓ Good: Handle deadlocks in application
BEGIN;
-- ... work ...
COMMIT;
-- If deadlock error, retry the transaction

Next: NoSQL Databases