Serializable Isolation Low-Level Design: 2PL, SSI, Conflict Cycles, and Anomaly Prevention

What Is Serializable Isolation?

Serializable isolation is the strongest standard isolation level. It guarantees that the result of executing a set of concurrent transactions is equivalent to some serial (one-at-a-time) execution of those same transactions. All anomalies — dirty reads, non-repeatable reads, phantom reads, and write skew — are prevented.

There are two primary implementation strategies: two-phase locking (2PL) and serializable snapshot isolation (SSI).

Two-Phase Locking (2PL)

2PL is the classical approach to serializability. It divides a transaction's lifetime into two phases:

  • Growing phase: The transaction acquires locks (shared and exclusive) as it accesses data. No locks are released during this phase.
  • Shrinking phase: Once any lock is released, the transaction may only release locks — it cannot acquire new ones.

Strict 2PL (S2PL) holds all locks — both shared and exclusive — until the transaction commits or rolls back. This is the most common variant and fully prevents all anomalies.

The cost of 2PL is high lock contention. Every read acquires a shared lock, blocking writers. Heavy workloads suffer significant throughput degradation under strict 2PL.

Serializable Snapshot Isolation (SSI)

SSI is an optimistic approach that builds on MVCC snapshot isolation. Transactions read from a consistent snapshot (like repeatable read) but additionally track anti-dependencies to detect when the execution is non-serializable.

SSI allows concurrent reads without locking, dramatically reducing contention. Conflicts are detected at commit time (or lazily during execution) by analyzing the pattern of read-write dependencies.

Anti-Dependencies

An anti-dependency (RW-antidependency) from T1 to T2 exists when:

  • T1 reads a version of some data item X.
  • T2 writes a newer version of X.
  • T1 and T2 are concurrent (T2's write is not visible to T1's snapshot).

T1 has an anti-dependency on T2 because if T2 had committed first, T1 should have seen T2's write. The anti-dependency represents a “should have seen” relationship that was violated by the snapshot read.

Dangerous Structures and Cycle Detection

A single anti-dependency does not indicate a serializability violation. The violation occurs when there is a cycle of RW-antidependencies forming a “pivot” structure. Specifically:

  • T1 has an RW-antidependency on T2 (T1 read data that T2 wrote).
  • T2 has an RW-antidependency on T1 (T2 read data that T1 wrote, or T2 read data that T3 wrote and T3 read data that T1 wrote).

A cycle in the RW-antidependency graph means the concurrent execution is not equivalent to any serial order. SSI aborts one transaction in the cycle (the victim) to break the cycle.

The conflict graph has nodes representing transactions and directed edges representing RW-antidependencies. Cycle detection runs via DFS. A cycle found means non-serializable execution.

PostgreSQL SSI Implementation

PostgreSQL implements SSI using:

  • SIREAD locks: Predicate read locks that record what ranges each transaction has read. Unlike regular locks, SIREAD locks do not block other transactions — they are used only for conflict tracking.
  • Write conflict detection: When a transaction writes, PostgreSQL checks whether any committed or concurrent transaction holds a SIREAD lock that conflicts with the write (i.e., the write would have been visible to that transaction if committed earlier).
  • Cycle detection: PostgreSQL tracks RW-antidependency edges and aborts a transaction when it detects a dangerous structure.

SSI vs 2PL Performance

SSI significantly outperforms 2PL on read-heavy workloads because readers do not acquire blocking locks. Writers still conflict with each other (exclusive row locks), but readers and writers do not block each other. The cost of SSI is abort rate: under high contention, more transactions are aborted due to cycle detection, requiring retries. 2PL has lower abort rate but higher blocking latency.

SQL Schema

-- Tracks serializable transactions and their snapshot XID
CREATE TABLE SSITransaction (
    id           UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    status       VARCHAR(16) NOT NULL DEFAULT 'ACTIVE',  -- ACTIVE, COMMITTED, ABORTED
    started_at   TIMESTAMPTZ NOT NULL DEFAULT now(),
    snapshot_xid BIGINT NOT NULL,
    completed_at TIMESTAMPTZ
);

-- Tracks RW-antidependency edges between transactions
CREATE TABLE AntiDependency (
    id            BIGSERIAL PRIMARY KEY,
    reader_txn_id UUID NOT NULL REFERENCES SSITransaction(id),
    writer_txn_id UUID NOT NULL REFERENCES SSITransaction(id),
    key           VARCHAR(256) NOT NULL,
    detected_at   TIMESTAMPTZ NOT NULL DEFAULT now()
);

CREATE INDEX idx_antidep_reader ON AntiDependency(reader_txn_id);
CREATE INDEX idx_antidep_writer ON AntiDependency(writer_txn_id);

-- Records detected conflict cycles and the aborted victim
CREATE TABLE ConflictCycle (
    id             BIGSERIAL PRIMARY KEY,
    txn_id         UUID NOT NULL,               -- transaction that triggered detection
    cycle_members  JSONB NOT NULL,              -- list of txn_ids in the cycle
    aborted_at     TIMESTAMPTZ NOT NULL DEFAULT now()
);

Python Implementation

import uuid
from collections import defaultdict
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Set, Tuple

@dataclass
class SSITxn:
    txn_id: str
    snapshot_xid: int
    status: str = "ACTIVE"
    reads: Set[str] = field(default_factory=set)
    writes: Set[str] = field(default_factory=set)

class SSIEngine:
    def __init__(self):
        self.current_xid: int = 0
        self.transactions: Dict[str, SSITxn] = {}
        # Anti-dependency graph: reader -> set of writers
        self.rw_edges: Dict[str, Set[str]] = defaultdict(set)

    def begin_serializable_transaction(self) -> str:
        txn_id = str(uuid.uuid4())
        self.transactions[txn_id] = SSITxn(
            txn_id=txn_id, snapshot_xid=self.current_xid
        )
        self.rw_edges[txn_id] = set()
        return txn_id

    def record_anti_dependency(self, reader: str, writer: str, key: str) -> None:
        """
        Record that reader read a version of key that writer subsequently wrote.
        This creates an RW-antidependency edge: reader -> writer.
        """
        self.rw_edges[reader].add(writer)

    def detect_cycle(self, txn_id: str) -> Optional[List[str]]:
        """
        DFS from txn_id in the RW-antidependency graph.
        Returns the cycle as a list of txn_ids if found, else None.
        """
        visited: Set[str] = set()
        rec_stack: List[str] = []

        def dfs(node: str) -> Optional[List[str]]:
            visited.add(node)
            rec_stack.append(node)
            for neighbor in self.rw_edges.get(node, []):
                if neighbor not in visited:
                    result = dfs(neighbor)
                    if result is not None:
                        return result
                elif neighbor in rec_stack:
                    # Cycle found: extract cycle from rec_stack
                    idx = rec_stack.index(neighbor)
                    return rec_stack[idx:]
            rec_stack.pop()
            return None

        return dfs(txn_id)

    def select_victim(self, cycle: List[str]) -> str:
        """
        Select which transaction to abort. Prefer the one with least work done
        (fewest writes, youngest start time).
        """
        def work_score(txn_id: str) -> int:
            txn = self.transactions.get(txn_id)
            return len(txn.writes) if txn else 0

        return min(cycle, key=work_score)

    def abort_cycle_member(self, txn_id: str) -> None:
        """Abort the victim transaction and clean up its edges."""
        txn = self.transactions.get(txn_id)
        if txn:
            txn.status = "ABORTED"
        # Remove outgoing edges from victim
        self.rw_edges.pop(txn_id, None)
        # Remove incoming edges to victim
        for edges in self.rw_edges.values():
            edges.discard(txn_id)

    def commit(self, txn_id: str) -> bool:
        """
        Attempt to commit. Check for cycles involving this transaction first.
        Returns True on success, False if aborted due to cycle.
        """
        cycle = self.detect_cycle(txn_id)
        if cycle:
            victim = self.select_victim(cycle)
            self.abort_cycle_member(victim)
            if victim == txn_id:
                return False  # Caller must retry
        self.current_xid += 1
        txn = self.transactions[txn_id]
        txn.status = "COMMITTED"
        self.rw_edges.pop(txn_id, None)
        return True

FAQ

{
“@context”: “https://schema.org”,
“@type”: “FAQPage”,
“mainEntity”: [
{
“@type”: “Question”,
“name”: “How does SSI compare to 2PL in performance?”,
“acceptedAnswer”: {
“@type”: “Answer”,
“text”: “SSI significantly outperforms 2PL on read-heavy workloads because readers do not acquire blocking locks. Under 2PL, every read acquires a shared lock that blocks concurrent writers. Under SSI, reads are non-blocking and conflicts are detected lazily. The tradeoff is that SSI has a higher abort (retry) rate under contention, while 2PL has higher blocking latency but fewer aborts.”
}
},
{
“@type”: “Question”,
“name”: “How is an anti-dependency detected in SSI?”,
“acceptedAnswer”: {
“@type”: “Answer”,
“text”: “An anti-dependency from T1 to T2 is detected when T2 writes a version of key X and T1 holds a SIREAD lock (predicate read lock) on X from a snapshot that predates T2's write. The write by T2 triggers a scan of SIREAD locks to find conflicting readers, recording an RW-antidependency edge from each reader to T2.”
}
},
{
“@type”: “Question”,
“name”: “When should a transaction be aborted in a conflict cycle?”,
“acceptedAnswer”: {
“@type”: “Answer”,
“text”: “The victim is chosen to minimize wasted work. Common heuristics: abort the youngest transaction (least time invested), the transaction with fewest writes (least rollback cost), or the transaction not yet holding critical locks. PostgreSQL aborts the transaction that would cause the most forward progress — typically the one detected last in the cycle.”
}
},
{
“@type”: “Question”,
“name”: “What is the difference between serializable 2PL and SSI?”,
“acceptedAnswer”: {
“@type”: “Answer”,
“text”: “2PL prevents anomalies by blocking: transactions must wait for locks held by conflicting transactions. SSI prevents anomalies by detecting: transactions proceed optimistically and are aborted if a non-serializable pattern is detected at commit time. 2PL has deterministic behavior and lower abort rate; SSI has higher throughput on read-heavy workloads and avoids reader-writer blocking.”
}
}
]
}

  • 2PL vs SSI performance: SSI wins on read-heavy workloads (non-blocking reads); 2PL wins on contention-heavy workloads (lower abort rate).
  • Anti-dependency detection: Write triggers scan of SIREAD predicate read locks; each conflicting reader gets an RW-antidependency edge to the writer.
  • When to abort in cycle: Choose victim with least work done (fewest writes, youngest XID) to minimize rollback cost.
  • SSI vs serializable 2PL: SSI is optimistic (detect on commit), 2PL is pessimistic (block on conflict); SSI avoids reader-writer blocking entirely.

{
“@context”: “https://schema.org”,
“@type”: “FAQPage”,
“mainEntity”: [
{
“@type”: “Question”,
“name”: “How does 2PL achieve serializability?”,
“acceptedAnswer”: {
“@type”: “Answer”,
“text”: “Two-phase locking divides a transaction's life into a growing phase (locks are acquired but never released) and a shrinking phase (locks are released but no new ones acquired); because every conflicting pair of transactions must acquire locks in some order, the lock acquisition order defines a conflict-serializable schedule. Strict 2PL holds all locks until commit, preventing cascading aborts.”
}
},
{
“@type”: “Question”,
“name”: “What is Serializable Snapshot Isolation (SSI) and how does it detect conflicts?”,
“acceptedAnswer”: {
“@type”: “Answer”,
“text”: “SSI runs transactions on MVCC snapshots (like repeatable read) but additionally tracks rw-antidependencies — cases where transaction A reads a version that transaction B later overwrites. When the runtime detects a cycle of rw-antidependencies (a dangerous structure), it aborts one transaction to break the cycle, allowing the rest to commit in a serializable order without holding locks.”
}
},
{
“@type”: “Question”,
“name”: “How are write-write conflicts detected in SSI?”,
“acceptedAnswer”: {
“@type”: “Answer”,
“text”: “SSI detects write-write conflicts the same way repeatable read does: when a transaction attempts to update a row, it checks whether a concurrent committed transaction has already updated that row since the snapshot was taken, and if so aborts with a serialization failure. This combines with rw-antidependency tracking to catch all anomalies without requiring explicit write locks.”
}
},
{
“@type”: “Question”,
“name”: “What is the performance trade-off of serializable isolation?”,
“acceptedAnswer”: {
“@type”: “Answer”,
“text”: “Serializable isolation increases abort rates and latency because conflicting transactions must be retried, and lock-based approaches (2PL) add contention that reduces throughput under high concurrency. SSI reduces blocking compared to 2PL but still incurs overhead from tracking antidependencies and occasional false-positive aborts, making it best suited for workloads with moderate conflict rates.”
}
}
]
}

See also: Scale AI Interview Guide 2026: Data Infrastructure, RLHF Pipelines, and ML Engineering

See also: Netflix Interview Guide 2026: Streaming Architecture, Recommendation Systems, and Engineering Excellence

See also: Stripe Interview Guide 2026: Process, Bug Bash Round, and Payment Systems

Scroll to Top