Requirements and Constraints
CQRS (Command Query Responsibility Segregation) separates write operations (commands) from read operations (queries) into distinct models, each optimized for its workload. Functional requirements: commands mutate state and emit domain events; queries read from denormalized, pre-joined read models optimized for specific access patterns; the read model is updated asynchronously from domain events; and consumers can query stale-but-fast read models or pay for strong consistency when needed.
Key constraints: write throughput and read throughput must scale independently, the read model must eventually converge with the write model within a bounded lag (target: under 500ms at p99), and the system must handle read model rebuilds without write-side downtime.
Core Data Model
Write Side (Command Model)
The command model uses a normalized relational schema optimized for integrity and consistency. Example for an e-commerce domain:
- orders: order_id, customer_id, status, created_at, updated_at
- order_items: item_id, order_id, product_id, quantity, unit_price
- payments: payment_id, order_id, amount, status, processed_at
Each command handler wraps its mutations in a transaction and publishes domain events to an outbox table within the same transaction (transactional outbox pattern) to guarantee event emission without two-phase commit.
Outbox Table
- outbox_id (UUID)
- aggregate_type (varchar)
- aggregate_id (UUID)
- event_type (varchar)
- payload (jsonb)
- published_at (timestamptz, nullable) — null means unpublished
Read Side (Query Model)
Read models are denormalized projections stored in the most query-friendly store for each use case:
- Order summary view: a Redis hash keyed by
order:{id}containing pre-joined customer name, item count, total amount, and status — serves dashboard widgets. - Customer order history: an Elasticsearch index with one document per order, indexed by customer_id and created_at — serves paginated history with full-text search on item names.
- Reporting aggregates: a ClickHouse table with daily revenue rollups — serves finance dashboards.
Key Algorithms and Logic
Outbox Relay
A relay process (or Debezium CDC connector) polls the outbox table for rows where published_at IS NULL, publishes each event to Kafka, then marks the row as published. To avoid reprocessing on relay restart, Kafka producers use idempotent mode and each event carries a stable outbox_id as the Kafka message key for deduplication.
Read Model Synchronization
Kafka consumers (one per read model type) process domain events and apply projections:
- On OrderPlaced: create the Redis hash and Elasticsearch document.
- On PaymentConfirmed: update status field in both stores atomically (Redis
HSET, ESupdatewithretry_on_conflict=3). - On OrderCancelled: mark records as cancelled and decrement daily ClickHouse revenue via a compensating event insert.
Each consumer commits Kafka offsets only after successfully applying the projection, ensuring at-least-once delivery. Idempotent projection handlers (check-and-skip by event sequence) prevent double-application.
Handling Eventual Consistency
Most reads tolerate the sub-500ms lag. For operations requiring read-your-writes consistency — e.g., immediately showing a newly placed order — the API layer uses a version token strategy: after a command, return the command's sequence number. The subsequent query includes this token, and the query handler waits up to 1 second for the read model to advance past that sequence before serving results.
API Design
Command API
- POST /orders — PlaceOrder command; returns
{ order_id, sequence_number } - POST /orders/{id}/cancel — CancelOrder command; returns updated sequence
- POST /payments — ConfirmPayment command
Query API
- GET /orders/{id}?min_seq= — reads from Redis; waits for read model to reach min_seq
- GET /customers/{id}/orders?page=&size= — reads from Elasticsearch
- GET /reports/revenue?from=&to=&granularity= — reads from ClickHouse
Scalability Considerations
- Independent scaling: command handlers and query handlers are separate services, allowing CPU/memory allocation tuned to each workload.
- Read model fan-out: a single domain event can update multiple read models in parallel by having multiple Kafka consumer groups, each owning one projection type.
- Read model rebuild: reset a consumer group offset to the beginning of the Kafka topic and replay into a shadow projection; swap traffic after convergence. Write side is unaffected.
- Write throughput: batch outbox polling intervals and Kafka producer batching keep write-side overhead under 5ms per command at steady state.
- Consistency monitoring: a lag monitor tracks the difference between the latest outbox event timestamp and the latest processed event timestamp per consumer group; alerts fire when lag exceeds 1 second.
See also: Scale AI Interview Guide 2026: Data Infrastructure, RLHF Pipelines, and ML Engineering
See also: Atlassian Interview Guide