Low Level Design: Database Transaction Isolation Levels

Why Isolation Levels Exist

A database must handle many transactions concurrently. The naive solution — run one transaction at a time — is correct but crushes throughput. Isolation levels define the tradeoff: how much concurrency you allow versus how many anomalies you expose to application code. Serializability is the gold standard for correctness, but it means more locking, more contention, and slower reads. Most production systems compromise by choosing a weaker level and designing application logic to tolerate the specific anomalies that level permits.

The SQL Standard Isolation Levels

SQL-92 defines four isolation levels, ordered from weakest to strongest: Read Uncommitted, Read Committed, Repeatable Read, and Serializable. Each level is defined by which anomalies it prevents. Higher levels prevent more anomalies but reduce concurrency. The standard does not mandate a specific implementation — locking, MVCC, and optimistic concurrency control are all valid approaches.

Read Uncommitted

At Read Uncommitted, a transaction can read rows that another transaction has modified but not yet committed. This is called a dirty read. If the writing transaction rolls back, the reader has consumed data that never officially existed. In practice, no serious OLTP system uses Read Uncommitted for anything beyond approximate aggregate queries (e.g., rough row counts where a small error is acceptable). Some databases, PostgreSQL among them, accept the level syntactically but actually run it as Read Committed.

Read Committed

Read Committed ensures a transaction only sees data that has been durably committed. Dirty reads are impossible. However, non-repeatable reads are allowed: if you read the same row twice within a transaction, a concurrent commit between the two reads can change the value you see. PostgreSQL uses Read Committed as its default. It is sufficient for most web application workloads where individual statements are short and the application does not depend on two reads of the same row returning identical values.

Repeatable Read

Repeatable Read prevents non-repeatable reads by taking a consistent snapshot at the start of the transaction. Any row you read at time T will return the same value if you read it again later in the same transaction, regardless of concurrent commits. MySQL InnoDB uses Repeatable Read as its default. However, phantom reads remain possible under the strict SQL-92 definition: a range query issued twice in the same transaction might return different sets of rows if concurrent inserts commit in between. InnoDB goes further than the standard requires: consistent (non-locking) reads avoid phantoms via the MVCC snapshot, and locking reads use gap and next-key locks, but both are implementation choices beyond the standard definition.

Serializable

Serializable is the strongest level. The outcome of any set of concurrent serializable transactions must be equivalent to some serial (one-at-a-time) execution order. This prevents dirty reads, non-repeatable reads, phantom reads, and write skew. Traditional implementations use two-phase locking (2PL): shared locks on reads, exclusive locks on writes, held until commit. 2PL is correct but can cause significant blocking and deadlocks under contention. Some modern databases, notably PostgreSQL, instead implement Serializable via SSI (discussed below).
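The locking discipline above can be sketched as a toy lock table. This is a minimal illustration, not any real engine's code: a real database would block waiters and run deadlock detection, while this sketch just raises on conflict. All names here are invented for the example.

```python
class LockConflict(Exception):
    pass

class LockTable:
    """Toy 2PL lock manager: locks are acquired on demand (growing phase)
    and released only at commit/abort via release_all (shrinking phase)."""

    def __init__(self):
        self.locks = {}  # key -> [mode, set of holder txn ids]

    def acquire(self, txn, key, mode):  # mode is "S" (shared) or "X" (exclusive)
        if key not in self.locks:
            self.locks[key] = [mode, {txn}]
            return
        held_mode, holders = self.locks[key]
        if holders == {txn}:
            # Sole holder may upgrade S -> X in place.
            if "X" in (held_mode, mode):
                self.locks[key][0] = "X"
            return
        if held_mode == "S" and mode == "S":
            holders.add(txn)  # shared locks are mutually compatible
            return
        # A real engine would block here (and risk deadlock); we just signal.
        raise LockConflict(f"txn {txn} blocked on {key}")

    def release_all(self, txn):
        # Phase two: drop every lock this transaction holds, all at once.
        for key in list(self.locks):
            self.locks[key][1].discard(txn)
            if not self.locks[key][1]:
                del self.locks[key]
```

Two readers can share a row, but a writer conflicts with both until they commit, which is exactly the blocking behavior the paragraph describes.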

Anomalies in Detail

Dirty read: Transaction A reads a row modified by Transaction B before B commits. If B rolls back, A has consumed data that never officially existed.

Non-repeatable read: Transaction A reads row R. Transaction B updates and commits R. Transaction A reads R again — different value. The same row read twice yields two different results within the same transaction.

Phantom read: Transaction A runs SELECT * FROM orders WHERE amount > 1000. Transaction B inserts a new order with amount 2000 and commits. Transaction A runs the same query again — the result set now includes B’s row. The range result is not stable.

Write skew: A subtler anomaly not covered by the original SQL-92 definitions. Transactions A and B both read overlapping data and each writes to a non-overlapping subset based on what they read. Example: a hospital scheduling system requires at least one doctor on call. A and B both read that two doctors are on call. A removes doctor 1. B removes doctor 2. Both commit — now zero doctors on call, violating the constraint. Neither transaction individually caused a conflict, yet the concurrent execution produced an invalid state. Write skew is only prevented at Serializable.
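The on-call scenario can be simulated with plain dictionaries. This is a sketch of snapshot-isolation behavior, not a real database: each transaction evaluates the invariant against its own stale snapshot, the write sets touch different rows, so no write-write conflict is detected and both commits succeed.

```python
def write_skew_demo():
    db = {"doctor_1": "on_call", "doctor_2": "on_call"}

    # Both transactions take their snapshot before either commits.
    snapshot_a = dict(db)
    snapshot_b = dict(db)

    def try_go_off_call(snapshot, me):
        # Check-then-act: the invariant "at least one doctor on call"
        # is checked against the snapshot, not the current state.
        on_call = [d for d, s in snapshot.items() if s == "on_call"]
        if len(on_call) >= 2:
            return {me: "off_call"}  # this transaction's write set
        return {}

    # Disjoint write sets: snapshot isolation commits both without conflict.
    db.update(try_go_off_call(snapshot_a, "doctor_1"))
    db.update(try_go_off_call(snapshot_b, "doctor_2"))
    return db
```

After both commits, every doctor is off call: each transaction was individually correct against its snapshot, yet the combined result violates the constraint.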

MVCC Implementation

Multi-Version Concurrency Control (MVCC) is the dominant implementation strategy for Read Committed and Repeatable Read in modern databases. Instead of locking rows on read, the database maintains multiple versions of each row. In PostgreSQL, every row tuple has two system fields: xmin (the transaction ID that created this version) and xmax (the transaction ID that deleted or updated this version, or zero if still live).

When a transaction starts, it takes a snapshot: a record of which transactions had committed at that moment. A row version is visible to a snapshot if its xmin belongs to a transaction that had committed when the snapshot was taken, and its xmax is either zero or belongs to a transaction that had not yet committed at that point. Writes create new row versions rather than updating in place; old versions are left for concurrent readers and cleaned up later by VACUUM. This means readers never block writers and writers never block readers, a major advantage over pure locking approaches.
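The visibility rule can be written out directly. This is a deliberately simplified sketch: real PostgreSQL snapshots also track in-progress transaction IDs, subtransactions, and hint bits, which this model ignores. Here a snapshot is just the set of transaction IDs that had committed when it was taken.

```python
def is_visible(version, snapshot_committed):
    # The creating transaction must have committed before our snapshot...
    if version["xmin"] not in snapshot_committed:
        return False
    # ...and the deleting/updating transaction, if any, must not have.
    xmax = version["xmax"]
    return xmax == 0 or xmax not in snapshot_committed

# Two versions of one row: txn 100 created it, txn 200 later updated it.
old = {"xmin": 100, "xmax": 200, "value": "draft"}
new = {"xmin": 200, "xmax": 0,   "value": "final"}

early = {100}       # snapshot taken before txn 200 committed
late  = {100, 200}  # snapshot taken after txn 200 committed
```

A transaction with the early snapshot still sees the "draft" version even after txn 200 commits, while a later transaction sees only "final" — both readers proceed without blocking the writer.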

Serializable Snapshot Isolation (SSI)

SSI, introduced in PostgreSQL 9.1, achieves true Serializable isolation without global locking. It builds on MVCC snapshots and adds tracking of rw-antidependencies: if Transaction A reads a version that Transaction B later overwrites (or vice versa), that is a potential serialization conflict. SSI tracks these dependencies using SIREAD locks — lightweight locks that do not block but record that "A read data that B modified."

In practice, SSI looks for a "dangerous structure": a pivot transaction with both an incoming and an outgoing rw-antidependency. Every non-serializable execution contains such a structure, but not every structure is part of a real cycle, so the check is conservative and occasionally aborts a transaction that would have been safe. The key insight: most workloads rarely form these patterns, so SSI aborts transactions far less often than 2PL blocks them. Throughput under SSI is close to Repeatable Read for typical OLTP workloads, while providing full Serializable correctness guarantees. Write skew, which only Serializable prevents, is handled automatically by SSI without the application needing to add explicit locking hints.
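The conservative check — find a transaction with both an incoming and an outgoing rw-antidependency — reduces to a small set computation. This is a hypothetical helper for illustration, not PostgreSQL's actual implementation; an edge (a, b) here means "a read a version that b overwrote".

```python
def find_pivot(rw_edges):
    """Return a transaction with both an incoming and an outgoing
    rw-antidependency (a 'pivot'), or None if the graph is safe."""
    has_incoming = {b for _, b in rw_edges}
    has_outgoing = {a for a, _ in rw_edges}
    pivots = has_incoming & has_outgoing
    # Which transaction a real engine aborts varies; we just report one.
    return min(pivots) if pivots else None

# T1 read what T2 overwrote, and T2 read what T3 overwrote:
# T1 -rw-> T2 -rw-> T3 makes T2 the pivot, so SSI would abort one
# of the transactions involved. A single rw edge alone is harmless.
edges = [("T1", "T2"), ("T2", "T3")]
```

With only one rw-antidependency in flight, `find_pivot` returns None and everyone commits, which is why typical OLTP workloads see few SSI aborts.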

Choosing the Right Level in Practice

Most PostgreSQL applications run at Read Committed. It is fast, correct for workloads where each statement is atomic, and the default. If your application reads a value and then makes a decision based on it within the same transaction (e.g., check-then-act patterns), you need at least Repeatable Read to avoid non-repeatable reads invalidating your logic. If you have any constraint that spans multiple rows or tables and must hold after concurrent updates — like the on-call doctor example — you need Serializable or must use explicit SELECT FOR UPDATE locking at lower levels. SSI in PostgreSQL makes Serializable viable without the operational pain of 2PL deadlocks.
