What Is a Distributed Transaction Manager?
A Distributed Transaction Manager (DTM) coordinates operations that span multiple databases, services, or nodes so that all participants either commit or roll back together. Unlike a local ACID transaction confined to one database, a distributed transaction must guarantee atomicity across heterogeneous resources, each of which may fail independently. The DTM acts as the single source of truth for transaction state, orchestrating the prepare and commit phases and persisting enough metadata to recover after crashes.
Data Model
-- Central coordinator store
CREATE TABLE distributed_txn (
    txn_id       UUID PRIMARY KEY,
    status       VARCHAR(20) CHECK (status IN ('INITIATED','PREPARING','PREPARED','COMMITTING','COMMITTED','ABORTING','ABORTED')),
    initiated_at TIMESTAMP NOT NULL DEFAULT NOW(),
    updated_at   TIMESTAMP NOT NULL DEFAULT NOW(),
    timeout_ms   INT NOT NULL DEFAULT 30000,
    coordinator  VARCHAR(128) NOT NULL
);

-- One row per enlisted participant
CREATE TABLE txn_participant (
    txn_id        UUID REFERENCES distributed_txn(txn_id),
    participant   VARCHAR(128) NOT NULL,
    resource_type VARCHAR(64),          -- e.g. postgres, kafka, redis
    vote          VARCHAR(10),          -- YES | NO | NULL (not yet voted)
    ack           BOOLEAN DEFAULT FALSE,
    PRIMARY KEY (txn_id, participant)
);

-- Append-only event log used for recovery
CREATE TABLE txn_log (
    log_id    BIGSERIAL PRIMARY KEY,
    txn_id    UUID NOT NULL,
    event     VARCHAR(64) NOT NULL,
    detail    TEXT,
    logged_at TIMESTAMP NOT NULL DEFAULT NOW()
);
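The status values in the CHECK constraint imply a small state machine. The following is a minimal sketch of the legal transitions, assuming the flow described in this article; the names ALLOWED, TERMINAL, and can_transition are hypothetical and chosen only for illustration.

```python
# Hypothetical transition map derived from the status CHECK constraint.
# Assumption: a transaction can still abort until PREPARED is logged;
# after that the only way forward is commit.
ALLOWED = {
    "INITIATED": {"PREPARING", "ABORTING"},
    "PREPARING": {"PREPARED", "ABORTING"},
    "PREPARED": {"COMMITTING"},
    "COMMITTING": {"COMMITTED"},
    "ABORTING": {"ABORTED"},
    "COMMITTED": set(),
    "ABORTED": set(),
}

# Terminal states have no outgoing transitions.
TERMINAL = {"COMMITTED", "ABORTED"}


def can_transition(current: str, target: str) -> bool:
    """Return True if moving from `current` to `target` is a legal step."""
    return target in ALLOWED.get(current, set())
```

A coordinator could call can_transition before every status update so that a bug or a replayed message cannot move a transaction backwards, for example from COMMITTED to ABORTING.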
Core Algorithm
The DTM implements Two-Phase Commit (2PC) as its base protocol, extended with a write-ahead log for durability:
- Begin: Client calls beginTransaction(). DTM inserts a row in distributed_txn with status INITIATED and returns a txn_id.
- Enlist: Each participating service registers itself via enlistParticipant(txn_id, resourceDSN). DTM inserts a row in txn_participant.
- Phase 1 (Prepare): DTM sets status to PREPARING, then sends a PREPARE message to every participant in parallel. Each participant durably writes its prepared state and replies YES or NO.
- Decision: If all votes are YES, DTM writes PREPARED then COMMITTING to the log. Any NO vote triggers ABORTING.
- Phase 2 (Commit/Abort): DTM broadcasts COMMIT or ROLLBACK to all participants, collecting ACKs. Once all ACKs arrive, status advances to COMMITTED or ABORTED.
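The protocol steps can be sketched in a few dozen lines. This is an illustrative in-memory model, not a production coordinator: InMemoryParticipant and Coordinator are hypothetical classes, plain Python containers stand in for the three tables, and network calls, retries, and real durability are left out.

```python
import uuid
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass, field


class InMemoryParticipant:
    """Hypothetical stand-in for a resource manager (not a real client API)."""

    def __init__(self, name, will_vote_yes=True):
        self.name = name
        self.will_vote_yes = will_vote_yes
        self.state = "IDLE"

    def prepare(self, txn_id):
        # A real participant would durably write its prepared state here.
        self.state = "PREPARED" if self.will_vote_yes else "ABORTED"
        return "YES" if self.will_vote_yes else "NO"

    def commit(self, txn_id):
        self.state = "COMMITTED"  # repeating the call changes nothing
        return True               # the return value plays the role of the ACK

    def rollback(self, txn_id):
        self.state = "ABORTED"
        return True


@dataclass
class Coordinator:
    log: list = field(default_factory=list)           # stands in for txn_log
    status: dict = field(default_factory=dict)        # stands in for distributed_txn
    participants: dict = field(default_factory=dict)  # stands in for txn_participant

    def _set(self, txn_id, status):
        # Append to the log before updating state, mimicking write-ahead order.
        self.log.append((txn_id, status))
        self.status[txn_id] = status

    def begin_transaction(self):
        txn_id = str(uuid.uuid4())
        self.participants[txn_id] = []
        self._set(txn_id, "INITIATED")
        return txn_id

    def enlist_participant(self, txn_id, participant):
        self.participants[txn_id].append(participant)

    def run(self, txn_id):
        members = self.participants[txn_id]
        # Phase 1: send PREPARE to every participant in parallel.
        self._set(txn_id, "PREPARING")
        with ThreadPoolExecutor(max_workers=max(1, len(members))) as pool:
            votes = list(pool.map(lambda p: p.prepare(txn_id), members))
        # Decision: commit only if every vote is YES.
        if all(v == "YES" for v in votes):
            self._set(txn_id, "PREPARED")
            self._set(txn_id, "COMMITTING")
            acks = [p.commit(txn_id) for p in members]
            final = "COMMITTED"
        else:
            self._set(txn_id, "ABORTING")
            acks = [p.rollback(txn_id) for p in members]
            final = "ABORTED"
        # Phase 2 completes only once every participant has acknowledged.
        if all(acks):
            self._set(txn_id, final)
        return self.status[txn_id]
```

Running the sketch with two participants that both vote YES ends in COMMITTED; flipping will_vote_yes to False on either one ends in ABORTED and rolls back the other.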
Failure Handling and Recovery
The hardest problem in distributed transactions is surviving coordinator or participant crashes mid-protocol. The DTM addresses this with:
- Coordinator crash after PREPARED: On restart, a recovery thread scans for transactions stuck in PREPARED or COMMITTING and re-drives Phase 2. Because the commit decision is already durable, re-sending COMMIT is safe as long as participants handle it idempotently. Transactions found in ABORTING have ROLLBACK re-sent in the same way; those still in PREPARING have no logged decision and fall under the timeout rule below.
- Participant crash after voting YES: The participant must replay its prepared log on restart and await the coordinator’s Phase 2 message. It must not unilaterally abort once it has voted YES.
- Timeout: Transactions stuck in PREPARING beyond timeout_ms are aborted by a background sweeper. Participants receive an explicit ROLLBACK.
- Heuristic decisions: If a participant cannot reach the coordinator for an extended period, it may make a heuristic commit or abort and record the fact for later reconciliation. Because a heuristic outcome can contradict the coordinator's decision and break atomicity, treat it as a last resort.
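The recovery and timeout rules can be expressed as one decision function over a distributed_txn row. This is a sketch under stated assumptions: recovery_action and its return labels are hypothetical, only states with a logged decision are re-driven (PREPARING has none, so it is handled by the timeout), the timeout is measured from initiated_at, and INITIATED rows are swept the same way as PREPARING ones.

```python
from datetime import datetime, timedelta, timezone


def recovery_action(status, initiated_at, timeout_ms, now):
    """Return what a recovery pass or sweeper would do for one transaction row."""
    if status in ("PREPARED", "COMMITTING"):
        # The commit decision is already in the log, so re-sending is safe.
        return "RESEND_COMMIT"
    if status == "ABORTING":
        # The abort decision is already in the log.
        return "RESEND_ROLLBACK"
    if status in ("INITIATED", "PREPARING"):
        # No decision yet: abort only once the transaction has timed out.
        # Assumption: INITIATED is swept too; the text only names PREPARING.
        expired = now - initiated_at > timedelta(milliseconds=timeout_ms)
        return "ABORT" if expired else "WAIT"
    # COMMITTED and ABORTED are terminal; nothing to do.
    return "NONE"
```

A recovery thread would apply this to every non-terminal row on startup, and the background sweeper would apply it periodically. Both paths rely on participants treating repeated COMMIT and ROLLBACK messages as no-ops.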
Scalability Considerations
- Coordinator bottleneck: The coordinator is stateful, so scale it as an active-passive pair with shared durable storage (e.g., PostgreSQL or etcd) rather than active-active.
- Latency: 2PC adds at least two network round-trips. Keep participants co-located in the same region when possible; use async commit ACK collection in Phase 2 to reduce tail latency.
- Throughput: Shard the coordinator by txn_id prefix or business domain so multiple coordinator instances handle disjoint transaction sets.
- Observability: Expose transaction state via a metrics endpoint; alert on any transaction older than 2× timeout_ms that has not reached a terminal state.
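Two of these points reduce to small pure functions. The sketch below is illustrative only: shard_for and is_stale are hypothetical helpers, the "prefix" is taken to be the leading 32 bits of the UUID, and transaction age is measured from initiated_at.

```python
import uuid
from datetime import datetime, timedelta, timezone


def shard_for(txn_id: uuid.UUID, num_shards: int) -> int:
    """Map a transaction to a coordinator shard using its leading 32 bits."""
    prefix = txn_id.int >> 96  # a UUID is 128 bits; keep the top 32
    return prefix % num_shards


def is_stale(status, initiated_at, timeout_ms, now) -> bool:
    """True if a non-terminal transaction is older than 2x its timeout."""
    terminal = status in ("COMMITTED", "ABORTED")
    too_old = now - initiated_at > timedelta(milliseconds=2 * timeout_ms)
    return not terminal and too_old
```

Plain modulo sharding remaps most transactions when num_shards changes, so a real deployment would likely need a fixed shard count or a consistent-hashing scheme. The is_stale check is the predicate an alerting job could evaluate against each row in distributed_txn.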
Summary
A Distributed Transaction Manager provides atomic commit across service boundaries by persisting coordinator state, running a two-phase protocol, and recovering deterministically from failures. The key design trade-off is added latency and reduced availability: participants that have voted YES stay blocked, holding their prepared state, for as long as the coordinator is down. Evaluate whether saga-based eventual consistency is acceptable before committing to a DTM.