Snapshot isolation (SI) is the transaction isolation level used by PostgreSQL (as its default), Oracle, SQL Server, and many distributed databases. It prevents dirty reads and non-repeatable reads while allowing higher concurrency than serializable isolation. Understanding its mechanics — read view construction, write-write conflict detection, and MVCC integration — is essential for database system design interviews.
Snapshot Isolation Guarantees
Under snapshot isolation:
- No dirty reads: a transaction never sees uncommitted data from other transactions
- No non-repeatable reads: re-reading the same row within a transaction always returns the same version
- No phantom reads: the snapshot is fixed at transaction start, so new rows inserted by concurrent transactions are invisible
- Write skew allowed: two concurrent transactions can each read a consistent snapshot and make updates that are individually valid but collectively inconsistent (more on this below)
Snapshot isolation is weaker than serializable (which prevents write skew). PostgreSQL's Serializable Snapshot Isolation (SSI) adds anti-dependency tracking to detect and prevent write skew, achieving full serializability.
Read View Construction
When a transaction begins, the database captures a read view (snapshot):
ReadView = {
min_active_xid: smallest transaction ID currently active,
active_xids_set: set of all currently active (uncommitted) transaction IDs,
snapshot_xid: the next transaction ID that will be assigned (exclusive upper bound)
}
A row version created by transaction x is visible to this snapshot if:
x < min_active_xid— committed before any active transaction started, ORx < snapshot_xidANDxis NOT inactive_xids_set— committed before snapshot was taken and not one of the currently active transactions
In other words: the transaction sees all committed state as of its start time, and nothing that was not yet committed.
Write-Write Conflict Detection
Snapshot isolation uses the first-writer-wins rule for concurrent writes to the same row:
- Transaction A writes row R (creates a new version with
xmin = A.xid) - Transaction B tries to write row R while A is still active
- B detects that R's current version has
xminof an active transaction - B is aborted with a serialization failure
This prevents lost updates — the scenario where B overwrites A's change without seeing it.
MVCC Integration
Snapshot isolation is implemented on top of MVCC (Multi-Version Concurrency Control). Each row has:
xmin: transaction ID that created this versionxmax: transaction ID that deleted or superseded this version (0 = still current)
Old row versions are retained in the heap. The visibility rule determines which version a given transaction sees. When xmax is set (by UPDATE or DELETE), the old version remains visible to transactions whose snapshot predates the deletion.
Write Skew: The Classic Example
Two doctors share on-call coverage. Hospital policy: at least one doctor must always be on call. Both check the rule at the same snapshot: Doctor A sees Doctor B is on call (true), Doctor B sees Doctor A is on call (true). Both decide to take off. Both commits succeed — there is no write-write conflict because they modified different rows. Result: no doctors on call. This is write skew.
Preventing write skew requires either:
- Serializable Snapshot Isolation (SSI): track read-write anti-dependencies; abort if a dangerous cycle is detected
- Explicit locking:
SELECT FOR UPDATEon the rows being checked forces serialization
Vacuum and Garbage Collection
Old row versions accumulate as updates and deletes happen. Vacuum removes row versions where xmax < oldest_active_xid — no running transaction can possibly need to see that old version. Without vacuum, tables bloat with dead tuples and index entries pointing to them.
The oldest_active_xid is the minimum transaction ID still running. Long-running transactions prevent vacuum from reclaiming space — a major operational concern in high-write PostgreSQL deployments.
SQL Schema
-- Transaction lifecycle tracking
CREATE TABLE Transaction (
id BIGSERIAL PRIMARY KEY,
status TEXT NOT NULL DEFAULT 'active', -- active, committed, aborted
started_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
committed_at TIMESTAMPTZ,
CONSTRAINT chk_tx_status CHECK (status IN ('active','committed','aborted'))
);
-- Row versions (heap representation)
CREATE TABLE RowVersion (
id BIGSERIAL PRIMARY KEY,
table_name TEXT NOT NULL,
row_id BIGINT NOT NULL,
xmin BIGINT NOT NULL, -- creating transaction ID
xmax BIGINT, -- deleting transaction ID (NULL = current version)
data JSONB NOT NULL
);
CREATE INDEX idx_rv_row ON RowVersion(table_name, row_id, xmin);
CREATE INDEX idx_rv_xmax ON RowVersion(xmax) WHERE xmax IS NOT NULL;
-- Snapshot captures at transaction start
CREATE TABLE ActiveSnapshot (
xid BIGINT PRIMARY KEY, -- transaction ID
min_active_xid BIGINT NOT NULL,
active_xids JSONB NOT NULL, -- array of active XIDs at snapshot time
snapshot_at TIMESTAMPTZ NOT NULL DEFAULT NOW()
);
Python Implementation
from dataclasses import dataclass
from typing import Optional
import threading
@dataclass
class RowVersion:
row_id: int
xmin: int # creating transaction
xmax: Optional[int] # deleting/updating transaction; None = current
data: dict
@dataclass
class Snapshot:
xid: int
min_active_xid: int
active_xids: frozenset[int]
snapshot_xid: int # next XID at snapshot time
class MVCCStorage:
"""Simplified in-memory MVCC storage for snapshot isolation demo."""
def __init__(self):
self._versions: dict[int, list[RowVersion]] = {}
self._next_xid = 1
self._active_xids: set[int] = set()
self._committed_xids: set[int] = set()
self._lock = threading.Lock()
def begin_transaction(self) -> Snapshot:
with self._lock:
xid = self._next_xid
self._next_xid += 1
snapshot = Snapshot(
xid=xid,
min_active_xid=min(self._active_xids) if self._active_xids else xid,
active_xids=frozenset(self._active_xids),
snapshot_xid=xid
)
self._active_xids.add(xid)
return snapshot
def commit(self, snapshot: Snapshot) -> None:
with self._lock:
self._active_xids.discard(snapshot.xid)
self._committed_xids.add(snapshot.xid)
def abort(self, snapshot: Snapshot) -> None:
with self._lock:
self._active_xids.discard(snapshot.xid)
# Roll back versions created by this transaction
for versions in self._versions.values():
to_remove = [v for v in versions if v.xmin == snapshot.xid]
for v in to_remove:
versions.remove(v)
# Restore xmax nulled by this transaction
for v in versions:
if v.xmax == snapshot.xid:
v.xmax = None
def get_visible_version(self, row_id: int, snapshot: Snapshot) -> Optional[RowVersion]:
"""Return the row version visible to the given snapshot."""
versions = self._versions.get(row_id, [])
for v in reversed(versions):
if self._is_visible(v, snapshot):
return v
return None
def _is_visible(self, v: RowVersion, snap: Snapshot) -> bool:
# xmin must be committed and predate snapshot
if not self._was_committed_before(v.xmin, snap):
return False
# xmax must not be committed before snapshot (row not yet deleted)
if v.xmax is not None and self._was_committed_before(v.xmax, snap):
return False
return True
def _was_committed_before(self, xid: int, snap: Snapshot) -> bool:
if xid == snap.xid:
return True # own writes are visible
if xid in snap.active_xids:
return False # was active at snapshot time
if xid >= snap.snapshot_xid:
return False # started after snapshot
return xid in self._committed_xids
def detect_write_conflict(self, row_id: int, writing_xid: int) -> bool:
"""Return True if another active transaction already modified this row."""
versions = self._versions.get(row_id, [])
for v in versions:
if v.xmin in self._active_xids and v.xmin != writing_xid:
return True # First-writer-wins: abort writing_xid
return False
def write_version(self, row_id: int, data: dict, xid: int) -> None:
"""Create a new row version. Mark old current version with xmax."""
versions = self._versions.setdefault(row_id, [])
for v in versions:
if v.xmax is None:
v.xmax = xid
versions.append(RowVersion(row_id=row_id, xmin=xid, xmax=None, data=data))
def vacuum_old_versions(self, oldest_active_xid: int) -> int:
"""Remove row versions that are no longer visible to any active transaction."""
removed = 0
for versions in self._versions.values():
dead = [v for v in versions if v.xmax is not None and v.xmax < oldest_active_xid]
for v in dead:
versions.remove(v)
removed += 1
return removed
Snapshot Isolation vs Serializable
Snapshot isolation prevents the most common anomalies with low overhead — no lock contention on reads. Serializable isolation adds tracking of read-write conflicts (anti-dependencies) and aborts transactions when a dangerous cycle is detected. The abort rate under SSI is low for most workloads, making SSI practical. If your application uses SELECT FOR UPDATE to explicitly lock rows before making decisions, snapshot isolation is effectively equivalent to serializable for those operations.
See also: Scale AI Interview Guide 2026: Data Infrastructure, RLHF Pipelines, and ML Engineering
See also: Meta Interview Guide 2026: Facebook, Instagram, WhatsApp Engineering