Saga Pattern Low-Level Design: Distributed Transactions Without 2PC

The saga pattern manages distributed transactions across multiple services by decomposing them into a sequence of local transactions, each publishing an event or message. When a step fails, compensating transactions undo the preceding steps. Without sagas, a distributed transaction that touches three services has no atomic rollback — you must build it explicitly.

Core Data Model

CREATE TABLE Saga (
    saga_id        UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    saga_type      VARCHAR(50) NOT NULL,         -- 'order_fulfillment', 'user_registration'
    correlation_id UUID NOT NULL,                -- ID of the business object (order_id, etc.)
    current_step   VARCHAR(50) NOT NULL,
    status         VARCHAR(20) NOT NULL DEFAULT 'RUNNING', -- RUNNING, COMPLETED, COMPENSATING, FAILED
    payload        JSONB NOT NULL,               -- initial input, shared across steps
    created_at     TIMESTAMPTZ DEFAULT NOW(),
    updated_at     TIMESTAMPTZ DEFAULT NOW()
);

CREATE TABLE SagaStep (
    step_id        UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    saga_id        UUID NOT NULL REFERENCES Saga(saga_id),
    step_name      VARCHAR(50) NOT NULL,
    status         VARCHAR(20) NOT NULL DEFAULT 'PENDING', -- PENDING, COMPLETED, COMPENSATED, FAILED
    result         JSONB,                        -- output stored for compensation
    compensated_at TIMESTAMPTZ,
    completed_at   TIMESTAMPTZ,
    created_at     TIMESTAMPTZ DEFAULT NOW()
);

CREATE INDEX idx_saga_correlation ON Saga(correlation_id);
CREATE INDEX idx_sagastep_saga ON SagaStep(saga_id, step_name);

Choreography vs Orchestration

Choreography: Each service listens for events and reacts. OrderService publishes OrderCreated; PaymentService listens, charges the card, publishes PaymentCompleted; InventoryService listens, reserves stock, publishes StockReserved. No central coordinator. Scales well; hard to debug and trace.

Orchestration: A central saga orchestrator drives the workflow. It sends commands to each service and waits for replies. Easier to monitor (the saga state machine is visible in one place), easier to compensate (orchestrator knows which steps completed), preferred for complex flows with many steps or non-trivial compensation logic. Use orchestration for anything with more than 3 steps.

Orchestrated Saga Implementation

class OrderFulfillmentSaga:
    STEPS = ['reserve_inventory', 'charge_payment', 'schedule_shipping', 'send_confirmation']

    def start(self, order_id: str, payload: dict) -> str:
        saga_id = str(uuid4())
        db.execute("""
            INSERT INTO Saga (saga_id, saga_type, correlation_id, current_step, payload)
            VALUES (%s, 'order_fulfillment', %s, 'reserve_inventory', %s)
        """, [saga_id, order_id, json.dumps(payload)])
        self._execute_step(saga_id, 'reserve_inventory', payload)
        return saga_id

    def _execute_step(self, saga_id: str, step: str, payload: dict):
        try:
            result = self._call_service(step, payload)
            db.execute("""
                INSERT INTO SagaStep (saga_id, step_name, status, result)
                VALUES (%s, %s, 'COMPLETED', %s)
            """, [saga_id, step, json.dumps(result)])
            next_step = self._next_step(step)
            if next_step:
                db.execute("UPDATE Saga SET current_step=%s WHERE saga_id=%s", [next_step, saga_id])
                self._execute_step(saga_id, next_step, {**payload, **result})
            else:
                db.execute("UPDATE Saga SET status='COMPLETED' WHERE saga_id=%s", [saga_id])
        except ServiceException as e:
            self._compensate(saga_id, step, payload)

    def _compensate(self, saga_id: str, failed_step: str, payload: dict):
        db.execute("UPDATE Saga SET status='COMPENSATING' WHERE saga_id=%s", [saga_id])
        # Get all completed steps before the failed one, in reverse order
        completed = db.fetchall("""
            SELECT step_name, result FROM SagaStep
            WHERE saga_id=%s AND status='COMPLETED'
            ORDER BY completed_at DESC
        """, [saga_id])
        for step_row in completed:
            try:
                self._call_compensation(step_row['step_name'], payload, step_row['result'])
                db.execute("""
                    UPDATE SagaStep SET status='COMPENSATED', compensated_at=NOW()
                    WHERE saga_id=%s AND step_name=%s
                """, [saga_id, step_row['step_name']])
            except Exception:
                # Compensation failure is critical — alert and require manual intervention
                alert_ops(f"Compensation failed for saga {saga_id} step {step_row['step_name']}")
        db.execute("UPDATE Saga SET status='FAILED' WHERE saga_id=%s", [saga_id])

Idempotent Steps

Every saga step must be idempotent. The orchestrator may retry a step after a network timeout — the service may have already executed it. Solution: pass the saga_id + step_name as an idempotency key to each service call. The receiving service checks if it already processed this key: if yes, return the cached result; if no, execute and store.

-- In PaymentService
CREATE TABLE IdempotencyKey (
    key        VARCHAR(100) PRIMARY KEY,  -- '{saga_id}:{step_name}'
    result     JSONB NOT NULL,
    created_at TIMESTAMPTZ DEFAULT NOW()
);

def charge_payment(saga_id, step_name, amount, card_token):
    idem_key = f"{saga_id}:{step_name}"
    existing = db.fetchone("SELECT result FROM IdempotencyKey WHERE key=%s", [idem_key])
    if existing:
        return existing['result']  # already charged — return cached result
    result = payment_processor.charge(amount, card_token)
    db.execute("INSERT INTO IdempotencyKey (key, result) VALUES (%s, %s)",
               [idem_key, json.dumps(result)])
    return result

Key Interview Points

Sagas trade ACID isolation for availability — two concurrent sagas can see each other’s intermediate states. Use semantic locks or reservation patterns to prevent conflicts.
Compensating transactions are not rollbacks — they are new forward transactions that undo business effects. A charge compensation is a refund, not a DB rollback.
Compensation failures require manual intervention or a dead-letter queue — you cannot automatically recover from a failed compensation.
Prefer orchestration over choreography for complex flows — the state machine is testable and traceable.
The saga log (Saga + SagaStep tables) is your audit trail and recovery mechanism — if the orchestrator crashes mid-saga, replay from the last recorded step on restart.
At-least-once delivery + idempotent steps = exactly-once business semantics.

{“@context”:”https://schema.org”,”@type”:”FAQPage”,”mainEntity”:[{“@type”:”Question”,”name”:”Why can’t you use a database transaction for a distributed operation across three microservices?”,”acceptedAnswer”:{“@type”:”Answer”,”text”:”A database transaction (BEGIN/COMMIT) works within a single database instance. When an operation spans three separate services — each with its own database — there is no shared transaction manager. Two-phase commit (2PC) can coordinate across databases but requires all participants to implement the XA protocol, adds two round-trips of latency, and fails catastrophically if the coordinator crashes between the prepare and commit phases (leaving participants in a locked state indefinitely). In practice, 2PC is too fragile for high-throughput microservices. The saga pattern replaces atomic transactions with a sequence of local transactions plus compensating transactions for rollback — trading strict atomicity for availability and partition tolerance.”}},{“@type”:”Question”,”name”:”What makes a compensating transaction different from a database rollback?”,”acceptedAnswer”:{“@type”:”Answer”,”text”:”A database rollback undoes uncommitted changes — the changes were never visible to anyone outside the transaction. A compensating transaction undoes committed changes that were already visible and may have triggered downstream effects. If the payment step in an order saga has already charged the customer’s card (committed), a compensation must issue a refund — a new positive transaction, not an undo. The original charge record remains in the payment history. This is why saga compensation is more complex than rollback: the compensation must undo the business effect (refund), handle the case where compensation itself fails (double-refund, lost refund), and create an audit trail showing both the original action and its compensation.”}},{“@type”:”Question”,”name”:”How do you recover a saga if the orchestrator crashes mid-execution?”,”acceptedAnswer”:{“@type”:”Answer”,”text”:”The orchestrator persists every step completion to the Saga and SagaStep tables before executing the next step. On restart, the orchestrator queries: SELECT * FROM Saga WHERE status IN (‘RUNNING’, ‘COMPENSATING’) and replays from the last recorded step. Because each step is idempotent (uses saga_id + step_name as an idempotency key), replaying a step that already completed returns the cached result without side effects. This makes saga recovery safe: the orchestrator can crash at any point, and on restart it resumes exactly where it left off. The saga log is both the execution record and the recovery mechanism — never execute a step without first persisting the intent to the log.”}},{“@type”:”Question”,”name”:”When should you use choreography vs orchestration for sagas?”,”acceptedAnswer”:{“@type”:”Answer”,”text”:”Choreography: each service reacts to events and publishes its own events. OrderService publishes OrderCreated; PaymentService listens and publishes PaymentCompleted; InventoryService listens and publishes StockReserved. No central coordinator. Good for: simple 2-3 step flows, teams that own their own service boundaries and don’t want coupling to a central orchestrator. Problems at scale: the overall flow is implicit — you must read all service codebases to understand the complete saga. Compensations are triggered by failure events and are hard to trace. Orchestration: one service owns the saga state machine and explicitly commands each step. Better for: complex flows (5+ steps), flows requiring conditional branching, flows that need a clear owner and audit trail. Use orchestration for any saga that would require a flowchart to explain.”}},{“@type”:”Question”,”name”:”How do you prevent two concurrent sagas from conflicting on the same resource?”,”acceptedAnswer”:{“@type”:”Answer”,”text”:”Sagas lack the isolation of ACID transactions — two concurrent order sagas can both see the same inventory count and both succeed the reservation step, overselling stock. Solution: semantic locks. Before executing the reservation step, insert a lock record: INSERT INTO InventoryLock (product_id, saga_id) VALUES (:pid, :sid). If this fails (unique constraint on product_id), the saga retries later — the resource is being modified by another saga. Release the lock when the step completes or compensates. This is an application-level advisory lock, not a DB row lock. Alternative: use optimistic concurrency on the resource (e.g., compare-and-swap on the inventory count with a version number), and retry the saga step if the CAS fails due to a concurrent update.”}}]}

Saga pattern and distributed transaction design is discussed in Uber system design interview questions.

Saga pattern and payment flow reliability design is covered in Stripe system design interview preparation.

Saga pattern and order fulfillment workflow design is discussed in Amazon system design interview guide.