Low Level Design: MVCC — Multiversion Concurrency Control

Multiversion Concurrency Control (MVCC) is a database concurrency mechanism that allows readers and writers to operate simultaneously without blocking each other. Instead of locking rows for reads, the database maintains multiple versions of each row and gives each transaction a consistent snapshot of the database at a point in time. MVCC is used by PostgreSQL, MySQL InnoDB, Oracle, and most modern databases. Understanding MVCC is essential for explaining transaction isolation levels, phantom reads, and performance characteristics in system design interviews.

Core Idea: Versions Instead of Locks

In a lock-based system, a writer locks a row and readers must wait. With MVCC, every write creates a new version of the row (with a transaction ID and timestamp) while the old version remains. Readers see the version that was current at their transaction start time — they never see partial writes or uncommitted data. Writers create new versions without disturbing existing ones. Readers and writers never block each other. Only write-write conflicts (two transactions updating the same row) require conflict resolution.

-- PostgreSQL MVCC internals: each row has system columns
-- xmin: transaction ID that created this version
-- xmax: transaction ID that deleted/updated this version (0 if current)

SELECT xmin, xmax, id, name, balance FROM accounts WHERE id = 1;
-- xmin=100, xmax=0, id=1, name=Alice, balance=1000  (current version)

-- Transaction 105 updates balance:
BEGIN;  -- gets snapshot: sees all committed txns  103 snapshot boundary

Transaction Isolation Levels and MVCC

MVCC implements isolation levels through snapshot selection: Read Committed: each statement gets a new snapshot (latest committed data). May see different data within a transaction. Repeatable Read: snapshot taken at transaction start, held for the entire transaction. Same row always returns same data within the transaction. Snapshot Isolation: like Repeatable Read but also prevents phantom reads (new rows added by other transactions are not visible). PostgreSQL Repeatable Read is actually Snapshot Isolation. Serializable Snapshot Isolation (SSI): detects serialization anomalies at commit time, rejecting transactions that would violate serializability.

Vacuum: Cleaning Up Old Versions

MVCC creates old row versions on every update. Old versions accumulate and bloat tables unless cleaned up. PostgreSQL VACUUM reclaims space from dead rows (versions no longer visible to any active transaction). AUTOVACUUM runs in the background, triggered by the number of dead tuples. A critical MVCC issue: long-running transactions prevent VACUUM from reclaiming old versions (other transactions might still need to see them). A 10-hour transaction freezes table bloat for 10 hours. Monitor pg_stat_user_tables.n_dead_tup and long-running transactions.

Write Skew: The MVCC Blind Spot

Write skew is an anomaly possible under snapshot isolation: two transactions each read data, make decisions based on what they read, and write to disjoint sets of rows — yet the combined result violates a constraint that both transactions checked. Classic example: two doctors simultaneously check if there is at least one doctor on call (both see yes), both decide to go off call, and now zero doctors are on call. MVCC snapshot isolation cannot detect this because neither transaction overwrites the other rows. Prevention: SELECT FOR UPDATE to acquire locks on the rows read, or use Serializable isolation level (SSI).

Key Interview Discussion Points

  • MVCC vs. locking: MVCC eliminates read-write lock contention (critical for read-heavy workloads); pure locking has lower write overhead on write-heavy workloads
  • Transaction ID wraparound: PostgreSQL uses 32-bit transaction IDs — after 2 billion transactions, IDs wrap and old data looks “in the future.” VACUUM FREEZE prevents this by assigning a frozen transaction ID to old rows
  • MySQL InnoDB MVCC: uses undo logs (previous versions stored separately from the main table) instead of inline row versions like PostgreSQL
  • Optimistic vs. pessimistic concurrency: MVCC is optimistic for reads; writers still need to detect conflicts at commit time
  • Snapshot isolation vs. serializability: snapshot isolation allows write skew; SSI detects and prevents it with minimal overhead added in PostgreSQL 9.1
Scroll to Top