Change Data Capture (CDC) streams every insert, update, and delete from a database as an ordered event log. Instead of polling for changes or adding triggers to every table, CDC taps the database’s replication log — the same mechanism used for database replicas. This gives you a real-time, complete, ordered stream of mutations with near-zero overhead on the source database.
How CDC Works at the Database Level
PostgreSQL writes every committed change to its Write-Ahead Log (WAL) before applying it to the data files. CDC tools like Debezium connect as a PostgreSQL replication client, read the WAL stream, decode row-level changes, and publish them to Kafka. Each decoded change carries: the operation (INSERT/UPDATE/DELETE), the table name, the before and after state of the row (the full before-image requires REPLICA IDENTITY FULL on the table), the LSN (log sequence number — a monotonic position in the log), and the transaction ID.
-- Enable logical replication in PostgreSQL
-- postgresql.conf:
wal_level = logical
max_replication_slots = 4
-- Create a replication slot for Debezium
SELECT pg_create_logical_replication_slot('debezium_slot', 'pgoutput');
-- Create publication for tables you want to stream
CREATE PUBLICATION app_publication FOR TABLE orders, payments, users;
Debezium CDC Event Schema
// Kafka message produced by Debezium for an UPDATE to orders table
{
  "before": {
    "order_id": "abc123",
    "status": "pending",
    "updated_at": 1700000000
  },
  "after": {
    "order_id": "abc123",
    "status": "shipped",
    "updated_at": 1700001000
  },
  "source": {
    "db": "appdb",
    "table": "orders",
    "lsn": 12345678,        // log sequence number — position in WAL
    "txId": 9876,
    "ts_ms": 1700001000000
  },
  "op": "u",                // i=insert, u=update, d=delete, r=read (snapshot)
  "ts_ms": 1700001001234    // Debezium processing timestamp
}
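The op code and the before/after images are all a consumer needs to route an event. A minimal dispatcher sketch (field names follow the Debezium envelope above; the handler callbacks are illustrative):

```python
def key_for(event):
    """Return the row identifier for any op: deletes carry only a before-image."""
    image = event["before"] if event["op"] == "d" else event["after"]
    return image["order_id"]

def dispatch(event, on_upsert, on_delete):
    """Route a Debezium change event by its op code.

    'r' (snapshot read), 'i' (insert), and 'u' (update) all carry a usable
    after-image, so they are treated as upserts; 'd' is a delete.
    """
    if event["op"] == "d":
        on_delete(key_for(event))
    else:
        on_upsert(key_for(event), event["after"])

# Demo: an insert, an update, then a delete for the same row
state = {}
events = [
    {"op": "i", "after": {"order_id": "a1", "status": "pending"}},
    {"op": "u", "before": {"order_id": "a1", "status": "pending"},
     "after": {"order_id": "a1", "status": "shipped"}},
    {"op": "d", "before": {"order_id": "a1", "status": "shipped"}},
]
for e in events:
    dispatch(e, on_upsert=state.__setitem__, on_delete=lambda k: state.pop(k, None))
# state is {} after the final delete
```

Treating snapshot reads identically to inserts is what lets the same consumer handle both the initial snapshot and live streaming.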
Common CDC Use Cases
Cache invalidation: Consume CDC events and delete or update the corresponding Redis key. Eliminates cache staleness from direct DB writes that bypass the application cache layer.
def handle_order_change(event):
    order_id = event['after']['order_id'] if event['op'] != 'd' else event['before']['order_id']
    redis.delete(f"order:{order_id}")
    if event['op'] != 'd':
        redis.setex(f"order:{order_id}", 300, json.dumps(event['after']))
Search index sync: Consume CDC events and update Elasticsearch. Avoids dual-write race conditions where DB write succeeds but ES update fails.
def handle_product_change(event):
    if event['op'] == 'd':
        es.delete(index='products', id=event['before']['product_id'])
    else:
        es.index(index='products', id=event['after']['product_id'],
                 document=map_to_search_doc(event['after']))
Event sourcing / audit log: Every CDC event is a timestamped audit record. Route CDC events to an append-only store (S3, BigQuery) for compliance and historical queries.
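Routing events to an append-only store is mostly a matter of flattening the Debezium envelope into one immutable record per change. A sketch of that mapping (the `to_audit_record` helper is illustrative, not part of Debezium):

```python
import json

def to_audit_record(event):
    """Flatten a Debezium event into one append-only audit row.

    Keeps both images so the record answers "what changed" without a
    join back to the source table; (lsn, tx_id) identifies the change.
    """
    src = event["source"]
    return {
        "table": src["table"],
        "op": event["op"],
        "lsn": src["lsn"],
        "tx_id": src["txId"],
        "committed_at_ms": src["ts_ms"],
        # Serialize images so the row fits a flat columnar/JSON store
        "before": json.dumps(event.get("before")),
        "after": json.dumps(event.get("after")),
    }

# Using the UPDATE event from the schema example above
event = {
    "op": "u",
    "before": {"order_id": "abc123", "status": "pending"},
    "after": {"order_id": "abc123", "status": "shipped"},
    "source": {"db": "appdb", "table": "orders", "lsn": 12345678,
               "txId": 9876, "ts_ms": 1700001000000},
}
record = to_audit_record(event)
```

Each record is immutable, so the destination needs only append and scan, never update.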
Replication to read replicas or data warehouses: Stream CDC events to a read-optimized store (Redshift, BigQuery, ClickHouse) for analytics without loading the OLTP database.
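For analytical stores the usual pattern is to micro-batch events and collapse them to the latest state per key before loading, since most warehouses prefer periodic MERGEs over row-at-a-time writes. A hedged sketch of the compaction step (function name is illustrative):

```python
def compact_batch(events, key_field):
    """Collapse a batch of CDC events to at most one action per key.

    Later events win (the batch is assumed to be LSN-ordered). A key
    whose last event is a delete produces a delete; anything else an
    upsert of its last after-image. Returns (upserts, deleted_keys)
    ready to feed into a warehouse MERGE statement.
    """
    latest = {}
    for e in events:
        image = e["before"] if e["op"] == "d" else e["after"]
        latest[image[key_field]] = e
    upserts, deletes = [], []
    for key, e in latest.items():
        if e["op"] == "d":
            deletes.append(key)
        else:
            upserts.append(e["after"])
    return upserts, deletes

batch = [
    {"op": "i", "after": {"order_id": "a1", "status": "pending"}},
    {"op": "u", "before": {"order_id": "a1", "status": "pending"},
     "after": {"order_id": "a1", "status": "shipped"}},
    {"op": "i", "after": {"order_id": "b2", "status": "pending"}},
    {"op": "d", "before": {"order_id": "b2", "status": "pending"}},
]
upserts, deletes = compact_batch(batch, "order_id")
```

Compacting before load also bounds write amplification: a row updated a thousand times in one window still costs one warehouse write.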
Handling Ordering and Exactly-Once Semantics
CDC events for a single row are delivered in LSN order: Debezium keys each Kafka message by the row's primary key, so all changes to the same row land in the same partition, in order. Ordering across tables is not guaranteed — each table streams to its own topic — so consumers that need to regroup events by transaction should enable Debezium's transaction metadata. Consumers must also track the last processed LSN to resume correctly after a restart.
CREATE TABLE cdc_consumer_offset (
    consumer_group VARCHAR(100) PRIMARY KEY,
    last_lsn BIGINT NOT NULL,
    updated_at TIMESTAMPTZ DEFAULT NOW()
);

def consume_with_checkpoint(consumer_group):
    row = db.fetchone("SELECT last_lsn FROM cdc_consumer_offset WHERE consumer_group=%s",
                      [consumer_group])
    last_lsn = row['last_lsn'] if row else 0
    for event in kafka.consume('orders.changes'):
        if event['source']['lsn'] <= last_lsn:
            continue  # already processed — skip
        process_event(event)
        # Checkpoint after processing; to make this atomic with the business
        # operation, run it in the same transaction as process_event's writes
        db.execute("""
            INSERT INTO cdc_consumer_offset (consumer_group, last_lsn)
            VALUES (%s, %s)
            ON CONFLICT (consumer_group) DO UPDATE SET last_lsn=EXCLUDED.last_lsn
        """, [consumer_group, event['source']['lsn']])
        last_lsn = event['source']['lsn']
Key Interview Points
- CDC reads the replication log — overhead on the source DB is minimal (no triggers, no polling queries). The WAL is written anyway for replication; CDC just reads it.
- The before-image is critical for cache invalidation and downstream deletes — set REPLICA IDENTITY FULL on tables where you need the pre-change state.
- CDC is at-least-once delivery — consumers must be idempotent. Use the LSN as an idempotency key.
- WAL retention: PostgreSQL retains WAL until all replication slots have consumed it. A stalled CDC consumer will cause WAL accumulation and disk exhaustion. Set a replication slot lag limit and alert on it.
- Initial snapshot: When first enabling CDC, Debezium takes a consistent snapshot of existing rows (emitting "op": "r" read events) before streaming live changes. This ensures downstream systems start with complete data.
- Schema evolution: CDC events embed the schema at the time of the change. Use Confluent Schema Registry with Avro to manage schema evolution across producers and consumers.
Frequently Asked Questions

How does Debezium read database changes without impacting the source database?
Debezium connects to PostgreSQL as a replication client using the PostgreSQL streaming replication protocol. PostgreSQL writes every committed change to its Write-Ahead Log (WAL) before applying it to the data files — this happens regardless of whether any replication clients are connected. Debezium reads the WAL stream using a logical replication slot (which decodes raw WAL bytes into row-level change events), then publishes those events to Kafka. The WAL is written anyway for crash recovery and streaming replication; Debezium simply reads it. Overhead on the source database: minimal CPU for decoding, plus WAL must be retained until Debezium's slot has consumed it (risk: disk fill if Debezium falls behind). There are no triggers, no polling queries, no additional writes to the source database.

What is the difference between CDC and the outbox pattern for event publishing?
The outbox pattern writes events to an outbox table in the same DB transaction as the business data, then a relay reads the outbox table and publishes to Kafka. The application explicitly controls what events are published and their format. CDC reads every change to every configured table from the WAL — you get a stream of all inserts, updates, and deletes, including changes made by migrations, admin scripts, or other services. CDC is more powerful (captures everything) but noisier (you must filter relevant events); outbox is more targeted (only events you explicitly write). Combined approach (recommended): use the outbox pattern to write intent events, then use Debezium's outbox event router to publish them — you get explicit event control with zero-overhead delivery.

How do you handle schema changes in CDC event streams?
When you add or rename a column in the source table, the CDC event schema changes, and downstream consumers that expect the old schema will fail. Solution: use Confluent Schema Registry with Avro serialization. Every CDC event includes a schema ID; the consumer fetches the schema by ID and deserializes accordingly. Schema Registry enforces compatibility rules: BACKWARD compatibility means new schemas can read old events (new optional fields only); FORWARD means old schemas can read new events. Use BACKWARD_TRANSITIVE for most cases. For breaking changes (removing a column, changing a type), version the topic (orders.v2) and migrate consumers before decommissioning the old topic. Never make breaking schema changes without a migration plan for all consumers.

How do you prevent WAL accumulation from causing disk exhaustion?
PostgreSQL retains WAL segments until all replication slots have consumed them. If Debezium falls behind (slow consumer, Kafka backpressure, Debezium downtime), WAL grows unboundedly. Prevention: (1) monitor replication slot lag with SELECT slot_name, pg_size_pretty(pg_wal_lsn_diff(pg_current_wal_lsn(), confirmed_flush_lsn)) AS lag FROM pg_replication_slots, and alert if lag exceeds 1 GB; (2) set wal_keep_size (PostgreSQL 13+) as a minimum WAL retention floor, separate from slot retention; (3) set max_slot_wal_keep_size — if a slot falls more than this far behind, PostgreSQL invalidates the slot automatically (preventing disk exhaustion at the cost of a full resnapshot); (4) monitor Debezium consumer lag in Kafka consumer groups.

How does the initial snapshot work when first enabling CDC?
When Debezium first starts for a table, it cannot just read the WAL from the beginning — the WAL does not retain full history. Debezium performs a consistent snapshot: it opens a transaction with REPEATABLE READ isolation, reads all existing rows and emits them as CDC events with "op": "r" (read), records the WAL LSN at the snapshot point, then transitions to streaming live changes from that LSN forward. This ensures no gap between the snapshot and live changes. For large tables (100M+ rows) the snapshot can take hours. The table is not locked during the snapshot (REPEATABLE READ, not an exclusive lock), but replication slot lag accumulates. Schedule initial snapshots during low-traffic windows.