What Is Repeatable Read Isolation?
Repeatable read is an isolation level that guarantees a transaction always sees the same data when it reads the same row multiple times. The snapshot is taken at the start of the transaction, and all reads within the transaction use that snapshot. This prevents dirty reads and non-repeatable reads. Phantom prevention varies by implementation.
Repeatable read is the default isolation level in MySQL InnoDB.
Transaction-Level Snapshot
When a transaction begins under repeatable read, the database captures the current committed state as the transaction's snapshot. All subsequent reads within the transaction see only the data visible at that snapshot point, regardless of changes committed by other transactions afterward.
This contrasts with read committed, where each statement gets a fresh snapshot. Under repeatable read, a second read of the same row always returns the same value — even if another transaction has since updated and committed that row.
Preventing Non-Repeatable Reads
The transaction-level snapshot makes non-repeatable reads impossible by definition: since all reads use the snapshot from transaction start, committed changes by other transactions during the current transaction's lifetime are invisible. Re-reading the same row always returns the same version.
Phantom Reads and the Problem They Cause
A phantom read occurs when a range query returns a different set of rows across two executions because another transaction inserted or deleted rows matching the range predicate. Even with row-level snapshot isolation, a new row inserted into the range would be visible on the second read if only existing rows are locked.
Example:
- T1: SELECT * FROM orders WHERE amount > 100; — returns 5 rows
- T2: INSERT INTO orders (amount) VALUES (200); COMMIT;
- T1: SELECT * FROM orders WHERE amount > 100; — would return 6 rows without phantom prevention
Gap Locks and Next-Key Locks (InnoDB)
InnoDB prevents phantoms using gap locks and next-key locks on index ranges:
- Record lock: Lock on a specific index entry (a specific row).
- Gap lock: Lock on the gap before an index entry, preventing inserts into that gap. Gap locks do not block reads or writes on existing rows — only inserts into the gap.
- Next-key lock: Combination of a record lock and a gap lock on the gap preceding that record. Prevents both modification of the record and insertion into the preceding gap.
When T1 executes a range scan (e.g., WHERE amount > 100), InnoDB acquires next-key locks on all index entries in the range, including a gap lock at the high end. Any attempt by T2 to insert a row in the locked range blocks until T1 commits or rolls back.
Next-key locks can cause deadlocks when two transactions acquire gap locks that overlap, then each tries to insert into the other's locked gap.
Predicate Locking
Predicate locking is a more general approach: instead of locking specific index ranges, the system locks the predicate itself (e.g., the condition amount > 100). Any insert that would satisfy that predicate is blocked. PostgreSQL uses predicate locks in serializable isolation (SSI) but not in repeatable read — it relies on MVCC snapshots for repeatable read and detects phantoms only at serializable level.
PostgreSQL Repeatable Read: MVCC Without Gap Locks
PostgreSQL's repeatable read uses pure MVCC with a transaction-level snapshot. Phantom reads in PostgreSQL repeatable read are prevented by the snapshot itself: newly inserted rows have a higher XID than the transaction's snapshot XID and are therefore invisible. However, this does not prevent write-write conflicts or all serialization anomalies — that requires SSI (serializable snapshot isolation).
SQL Schema
-- Tracks rows read under a repeatable read snapshot
CREATE TABLE SnapshotRead (
id BIGSERIAL PRIMARY KEY,
txn_id UUID NOT NULL,
snapshot_xid BIGINT NOT NULL, -- XID at which snapshot was taken
key VARCHAR(256) NOT NULL,
value JSONB,
read_at TIMESTAMPTZ NOT NULL DEFAULT now()
);
CREATE INDEX idx_snapshotread_txn ON SnapshotRead(txn_id);
-- Tracks gap locks held by transactions (InnoDB-style)
CREATE TABLE GapLock (
id BIGSERIAL PRIMARY KEY,
txn_id UUID NOT NULL,
index_name VARCHAR(128) NOT NULL,
lower_bound JSONB NOT NULL, -- inclusive or exclusive start of locked gap
upper_bound JSONB NOT NULL, -- inclusive or exclusive end of locked gap
locked_at TIMESTAMPTZ NOT NULL DEFAULT now(),
released_at TIMESTAMPTZ
);
CREATE INDEX idx_gaplock_txn ON GapLock(txn_id);
CREATE INDEX idx_gaplock_index ON GapLock(index_name, released_at);
Python Implementation
import uuid
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional, Tuple
@dataclass
class GapLockRange:
txn_id: str
index_name: str
lower: Any
upper: Any
inclusive_lower: bool = True
inclusive_upper: bool = False
class RepeatableReadEngine:
def __init__(self):
self.current_xid: int = 0
self.txn_snapshots: Dict[str, int] = {} # txn_id -> snapshot_xid
self.committed_data: Dict[str, List[dict]] = {} # row_id -> [{xid, data}]
self.gap_locks: List[GapLockRange] = []
def _next_xid(self) -> int:
self.current_xid += 1
return self.current_xid
def begin_transaction(self) -> str:
txn_id = str(uuid.uuid4())
snapshot_xid = self.current_xid # Snapshot at BEGIN
self.txn_snapshots[txn_id] = snapshot_xid
return txn_id
def acquire_gap_lock(self, txn_id: str, index_name: str,
lower: Any, upper: Any) -> bool:
"""
Acquire a gap lock on [lower, upper) in index_name.
Returns False if conflicting gap lock held by another transaction.
"""
for existing in self.gap_locks:
if (existing.index_name == index_name
and existing.txn_id != txn_id
and self._ranges_overlap(existing, lower, upper)):
return False # Conflict: block or raise
self.gap_locks.append(GapLockRange(
txn_id=txn_id, index_name=index_name, lower=lower, upper=upper
))
return True
def _ranges_overlap(self, lock: GapLockRange, lower: Any, upper: Any) -> bool:
return not (upper = lock.upper)
def read_with_snapshot(self, txn_id: str, key: str) -> Optional[dict]:
"""
Read the most recent committed version of key visible to txn's snapshot.
"""
snapshot_xid = self.txn_snapshots.get(txn_id)
if snapshot_xid is None:
raise ValueError(f"No transaction {txn_id}")
versions = self.committed_data.get(key, [])
# Find latest version committed at or before snapshot_xid
visible = [v for v in versions if v["xid"] None:
"""Stage a write; committed on commit()."""
pass # Placeholder: store in txn write buffer
def commit(self, txn_id: str) -> None:
"""Apply staged writes with new XID; release gap locks."""
xid = self._next_xid()
# Apply staged writes with this XID to committed_data
self.gap_locks = [g for g in self.gap_locks if g.txn_id != txn_id]
del self.txn_snapshots[txn_id]
FAQ
- Non-repeatable vs phantom read: Non-repeatable reads are about a single row changing; phantom reads are about a range returning different row sets.
- Gap lock vs predicate lock: Gap lock is index-range-specific; predicate lock is logical and covers any write satisfying the predicate.
- InnoDB next-key lock: Record lock + gap lock on preceding gap; prevents both row modification and gap insertion.
- MVCC vs gap locks for phantom prevention: PostgreSQL MVCC hides newer-XID rows via snapshot; no gap locks needed for repeatable read but serializable requires SSI.
See also: Scale AI Interview Guide 2026: Data Infrastructure, RLHF Pipelines, and ML Engineering
See also: Meta Interview Guide 2026: Facebook, Instagram, WhatsApp Engineering