What is Event Sourcing?
In a traditional system, the database stores the current state: “Account balance = $500.” In event sourcing, the database stores the history of events that led to the current state: “AccountCreated → Deposited($300) → Deposited($400) → Withdrawn($200).” The current state is derived by replaying the event log. This provides a complete audit trail, enables temporal queries (“what was the balance on March 1?”), and makes it easy to build new projections from historical data.
Core Concepts
- Event: an immutable record of something that happened. Never mutated or deleted. Events are the source of truth.
- Aggregate: a domain object (e.g., BankAccount, Order) whose state is derived by replaying its events.
- Event Store: append-only storage for events, ordered per aggregate by sequence number.
- Projection (Read Model): a materialized view derived from the event stream for efficient querying.
- Command: a request to change state (DepositFunds, PlaceOrder). Commands produce events.
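The concepts above can be made concrete with a minimal sketch. The `BankAccount` aggregate and `DepositedEvent` match the names used later in this article; the exact field layout and the `WithdrawnEvent` class are illustrative assumptions.

```python
from dataclasses import dataclass

@dataclass(frozen=True)  # frozen: events are immutable, never mutated
class DepositedEvent:
    account_id: str
    amount: int
    sequence_number: int

@dataclass(frozen=True)
class WithdrawnEvent:
    account_id: str
    amount: int
    sequence_number: int

class BankAccount:
    """Aggregate: state is derived purely by replaying events."""
    def __init__(self):
        self.balance = 0

    def apply(self, event):
        if isinstance(event, DepositedEvent):
            self.balance += event.amount
        elif isinstance(event, WithdrawnEvent):
            self.balance -= event.amount

# Replaying the history from the introduction: +300, +400, -200
account = BankAccount()
for e in [DepositedEvent("a1", 300, 1),
          DepositedEvent("a1", 400, 2),
          WithdrawnEvent("a1", 200, 3)]:
    account.apply(e)
print(account.balance)  # 500
```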
Data Model
Event(event_id UUID, aggregate_type VARCHAR, aggregate_id UUID,
event_type VARCHAR, payload JSONB, sequence_number BIGINT,
created_at TIMESTAMP, metadata JSONB)
-- UNIQUE constraint on (aggregate_id, sequence_number) prevents concurrent writes
Snapshot(aggregate_id UUID, aggregate_type, sequence_number,
state JSONB, created_at)
-- Optimization: avoid replaying all events from the beginning
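The effect of the UNIQUE constraint can be demonstrated with an in-memory SQLite database (a sketch: the article's schema assumes PostgreSQL types such as JSONB, simplified here to TEXT, and the table is trimmed to the columns that matter for the constraint).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE events (
        event_id        TEXT PRIMARY KEY,
        aggregate_id    TEXT NOT NULL,
        event_type      TEXT NOT NULL,
        payload         TEXT NOT NULL,   -- JSONB in PostgreSQL
        sequence_number INTEGER NOT NULL,
        UNIQUE (aggregate_id, sequence_number)
    )
""")
conn.execute("INSERT INTO events VALUES (?, ?, ?, ?, ?)",
             ("e1", "acct-1", "Deposited", '{"amount": 300}', 1))

# A second write at the same (aggregate_id, sequence_number) is rejected:
# this is what makes optimistic concurrency safe.
try:
    conn.execute("INSERT INTO events VALUES (?, ?, ?, ?, ?)",
                 ("e2", "acct-1", "Deposited", '{"amount": 400}', 1))
    conflict = False
except sqlite3.IntegrityError:
    conflict = True
print(conflict)  # True
```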
Command Handler Pattern
def handle_deposit(account_id, amount):
    # 1. Load current state by replaying events
    events = event_store.load(aggregate_id=account_id)
    account = BankAccount()
    for event in events:
        account.apply(event)  # mutate state based on event type

    # 2. Validate the command
    if amount <= 0:
        raise ValueError("Amount must be positive")

    # 3. Produce new event
    new_event = DepositedEvent(account_id=account_id, amount=amount,
                               sequence_number=len(events) + 1)

    # 4. Append to event store (optimistic concurrency via sequence_number)
    event_store.append(new_event, expected_version=len(events))

    # 5. Publish to event bus for projections
    event_bus.publish(new_event)
Optimistic Concurrency Control
Two concurrent commands on the same aggregate must not overwrite each other. Solution: the UNIQUE constraint on (aggregate_id, sequence_number) prevents two writes at the same sequence position. The command handler reads the current version (N), writes at version N+1. If another command already wrote N+1, the INSERT fails with a unique constraint violation — retry with the latest version. This is optimistic locking without explicit locks.
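The read-then-retry loop described above can be sketched with an in-memory stand-in for the event store. `InMemoryEventStore`, `ConcurrencyError`, and `with_retry` are illustrative assumptions, not part of the article's storage layer; the version check plays the same role as the database's unique constraint.

```python
class ConcurrencyError(Exception):
    pass

class InMemoryEventStore:
    """Illustrative store: append fails if expected_version is stale."""
    def __init__(self):
        self.streams = {}  # aggregate_id -> ordered list of events

    def load(self, aggregate_id):
        return list(self.streams.get(aggregate_id, []))

    def append(self, aggregate_id, event, expected_version):
        stream = self.streams.setdefault(aggregate_id, [])
        if len(stream) != expected_version:
            # Same role as the UNIQUE (aggregate_id, sequence_number) violation
            raise ConcurrencyError(f"expected {expected_version}, at {len(stream)}")
        stream.append(event)

def with_retry(store, aggregate_id, make_event, max_attempts=3):
    for _ in range(max_attempts):
        events = store.load(aggregate_id)  # read current version N
        try:
            store.append(aggregate_id, make_event(len(events) + 1),
                         expected_version=len(events))
            return
        except ConcurrencyError:
            continue  # reload and retry with the latest version
    raise ConcurrencyError("gave up after retries")

store = InMemoryEventStore()
with_retry(store, "acct-1", lambda seq: {"type": "Deposited", "seq": seq})
with_retry(store, "acct-1", lambda seq: {"type": "Deposited", "seq": seq})
print(len(store.load("acct-1")))  # 2
```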
Projections and Read Models
The event store is the source of truth, but it is poorly suited to ad-hoc queries. Projections consume the event stream and build read-optimized views:
class AccountBalanceProjection:
    def on_event(self, event):
        if event.type == 'Deposited':
            db.execute("UPDATE account_balances SET balance = balance + %s "
                       "WHERE account_id = %s", (event.amount, event.account_id))
        elif event.type == 'Withdrawn':
            db.execute("UPDATE account_balances SET balance = balance - %s "
                       "WHERE account_id = %s", (event.amount, event.account_id))
Projections can be rebuilt from scratch by replaying all events. This enables: fixing bugs in projection logic, creating new read models for new features, auditing — all without touching the source of truth.
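Rebuilding is just a full replay into a fresh read model. A minimal sketch, using plain dicts in place of a database table; the event shape is an assumption.

```python
def rebuild_balances(events):
    """Replay the full event stream into a fresh read model."""
    balances = {}
    for event in events:
        balances.setdefault(event["account_id"], 0)
        if event["type"] == "Deposited":
            balances[event["account_id"]] += event["amount"]
        elif event["type"] == "Withdrawn":
            balances[event["account_id"]] -= event["amount"]
    return balances

history = [
    {"type": "Deposited", "account_id": "a1", "amount": 300},
    {"type": "Deposited", "account_id": "a1", "amount": 400},
    {"type": "Withdrawn", "account_id": "a1", "amount": 200},
]
print(rebuild_balances(history))  # {'a1': 500}
```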
Snapshots
An aggregate with 10,000 events requires replaying all 10,000 to reconstruct its current state. Snapshots periodically save the materialized state: every 100 events, serialize the current aggregate state as a Snapshot. On load: find the latest snapshot (if any), load state from the snapshot, then replay only events after the snapshot's sequence_number. This reduces replay cost from O(all events) to O(events since last snapshot).
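The load path can be sketched as follows. In production you would query only events with sequence_number greater than the snapshot's; here the filter is done in Python for brevity, and the tuple-shaped snapshot is an illustrative assumption.

```python
def load_with_snapshot(snapshot, events):
    """snapshot: (sequence_number, balance) or None; events: full ordered log."""
    if snapshot is not None:
        last_seq, balance = snapshot
    else:
        last_seq, balance = 0, 0
    # Replay only events after the snapshot's sequence_number
    for event in events:
        if event["seq"] <= last_seq:
            continue
        if event["type"] == "Deposited":
            balance += event["amount"]
        else:
            balance -= event["amount"]
    return balance

events = [{"seq": i, "type": "Deposited", "amount": 10} for i in range(1, 201)]
snapshot = (100, 1000)  # state after the first 100 events: 100 * 10
print(load_with_snapshot(snapshot, events))  # 2000 (replays only events 101-200)
print(load_with_snapshot(None, events))      # 2000 (full replay, same result)
```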
When to Use Event Sourcing
- Audit trail is a hard requirement (financial, compliance, legal)
- Need temporal queries (“what was the state at time T?”)
- Need to rebuild projections when requirements change
- Complex domain logic where events carry business meaning
- Avoid when: simple CRUD, small team, read-heavy (projections add complexity)
Frequently Asked Questions

What is the difference between event sourcing and traditional CRUD?

Traditional CRUD stores only the current state: you see the account balance is $500 but have no record of how it got there. Event sourcing stores the complete history of state changes as an immutable sequence of events: AccountCreated, Deposited($300), Deposited($400), Withdrawn($200). The current state is derived by replaying events. Benefits: (1) a complete audit trail: every change is recorded with who, what, and when, as required for financial, compliance, and legal systems; (2) temporal queries: reconstruct state at any point in time by replaying events up to that timestamp; (3) rebuildable projections: create new read models from historical events without losing data; (4) debugging: replay events to reproduce bugs. Trade-offs: more complex implementation, eventual consistency between the event store and projections, and event schemas that must be versioned carefully (you cannot change past events).

What is CQRS and how does it relate to event sourcing?

CQRS (Command Query Responsibility Segregation) separates write operations (commands) from read operations (queries). The write side handles commands that produce events. The read side consists of projections: materialized views optimized for specific queries. CQRS is often used with event sourcing, but they are independent: you can have CQRS without event sourcing (a separate write DB and read DB), and event sourcing without strict CQRS. In a full event sourcing + CQRS system: commands → aggregate validation → events written to the event store → event bus publishes events → projections consume events and update read models → queries read from projections. Read models can use any storage: PostgreSQL for relational queries, Elasticsearch for full-text search, Redis for cached lookups. Each projection can be rebuilt independently by replaying the event stream.

How do you handle event schema evolution in event sourcing?

Events are immutable and stored forever, so when the business logic changes you must handle events written under the old schema. Strategies: (1) Event upcasting: when loading old events, transform them to the new schema before passing them to the aggregate. Upcast functions are versioned: if event.version == 1, add the new required field with a default value. (2) New event versions: instead of modifying OrderPlaced, add OrderPlacedV2 with the new fields; old aggregates understand V1, new aggregates understand both. (3) Weak schema: use a JSONB payload with optional fields; new code checks whether a field exists before using it. (4) Snapshot migration: when a snapshot is taken, serialize using the current schema; on load, use the snapshot (current schema) plus only the events since it. Recommendation: include an event version field from day one, write explicit upcasters for each version transition, and test that events from two years ago still replay correctly.

How do snapshots work and when should you use them?

An aggregate reconstructed by replaying all its events is correct but slow for long-lived aggregates with many events: an order processed 10,000 times would require 10,000 event replays. Snapshots solve this. Every N events (e.g., every 50), serialize the aggregate's current state to a Snapshot record: (aggregate_id, sequence_number, state_json, created_at). On load, query for the most recent snapshot, load the state from it, then replay only events with sequence_number > snapshot.sequence_number. Worst case, you replay N-1 events between snapshots instead of the full history. Use snapshots when aggregates have more than about 100 events, replay time is measurable (more than a few ms), or aggregates are accessed frequently. Avoid them when aggregates are short-lived (orders completed in minutes), event counts are small, or snapshot storage adds complexity you don't need yet.

How does optimistic concurrency control work in an event store?

Multiple concurrent commands targeting the same aggregate could create conflicting events: for example, two transfers from the same account running simultaneously, both seeing balance = $100, both succeeding, producing an overdraft. With optimistic concurrency, appending a new event specifies the expected version (the last known sequence_number), and the event store inserts at expected_version + 1. If another command already wrote that sequence_number, the unique constraint on (aggregate_id, sequence_number) makes the INSERT fail. The application catches the conflict, reloads the aggregate from current state (replaying the new events), re-runs the command validation, and retries. This is optimistic locking without explicit row locks: low cost under low contention, correct under concurrent writes. If the retry also fails (a hot aggregate under high contention), back off and retry with exponential jitter; for very hot aggregates, consider partitioning or actor-based serialization.
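The event upcasting strategy described for schema evolution can be sketched as a chain of per-version transforms. The version numbers and the added currency field are illustrative assumptions, not from this article.

```python
CURRENT_VERSION = 2

def upcast_v1_to_v2(event):
    # Hypothetical migration: v1 Deposited events had no currency field,
    # so default it when loading old events.
    event = dict(event, version=2)
    event.setdefault("currency", "USD")
    return event

UPCASTERS = {1: upcast_v1_to_v2}  # version -> function producing version + 1

def upcast(event):
    """Apply upcasters in sequence until the event reaches the current version."""
    while event["version"] < CURRENT_VERSION:
        event = UPCASTERS[event["version"]](event)
    return event

old_event = {"type": "Deposited", "amount": 300, "version": 1}
new_event = upcast(old_event)
print(new_event["version"], new_event["currency"])  # 2 USD
```

Each upcaster handles exactly one version transition, so adding a v3 later means writing one new function rather than touching existing migrations.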