Message queues like Kafka, RabbitMQ, and SQS show up in both system design and low-level design interviews. Understanding the internals helps you answer follow-up questions about ordering, durability, and exactly-once delivery.
Why Message Queues
Message queues solve four problems:
- Async decoupling: The order service does not wait for the email service to respond. It publishes an event and moves on. If the email service is down, the message waits in the queue.
- Traffic spike absorption: Black Friday sends 100x normal order volume. The queue buffers the spike; downstream services process at their own pace without falling over.
- Fan-out to multiple consumers: One order event triggers fulfillment, email notification, analytics update, and fraud check simultaneously – each consumer reads independently.
- Durability: Messages are persisted to disk. If the consumer crashes mid-processing, the message is not lost.
Core Concepts
- Producer: Any service that publishes messages. Producers send to a topic, not directly to a consumer. They do not know who consumes the message.
- Topic: A named category or feed for messages. Think of it as a table in a database or a channel in Slack. Topics are logical; partitions are physical.
- Partition: A topic is divided into N partitions. Each partition is an ordered, immutable sequence of records. Parallelism within a topic is bounded by partition count.
- Consumer Group: A set of consumers that jointly consume a topic. Each partition is assigned to exactly one consumer in the group at a time – this is how Kafka achieves parallel consumption without duplicate processing within a group.
- Offset: A monotonically increasing integer identifying a message’s position within a partition. Consumers track their offset to know where to resume after restart.
Partition Strategy
How you assign messages to partitions determines ordering guarantees and parallelism:
- Key-based partitioning: hash(message_key) % num_partitions. All messages with the same key go to the same partition, guaranteeing order within that key. Use when order matters per entity – e.g., all events for user_id=123 must be processed in order.
- Round-robin: Messages distributed evenly across partitions. Maximum parallelism, but no ordering guarantee across messages. Use for independent events like log lines or metrics.
- Custom partition function: Producer-side logic to route to a specific partition. Use when you have business logic that determines co-location – e.g., VIP customers get their own partition for priority processing.
Design rule: if a consumer needs to see all events for a given entity in order, make sure all those events share the same partition key.
Storage Layer
Kafka’s storage design is why it is so fast:
- Append-only log: Each partition is stored as a sequence of segment files on disk. New messages are always appended to the active segment – no random writes, no indexing overhead. Sequential disk writes saturate disk bandwidth (500+ MB/s on spinning disk, far above random write throughput).
- Segment rotation: When the active segment reaches a size limit (default 1GB) or time limit (default 7 days), it is closed and a new segment opens. Closed segments are immutable.
- Sparse index: Each segment has an accompanying .index file mapping offsets to byte positions. Kafka does not index every message – it uses binary search within a segment and then scans forward. This keeps index size small.
- Zero-copy transfers: When a consumer reads, Kafka uses sendfile() to copy data from disk directly to the network socket – bypassing user space entirely. This is why Kafka can serve millions of reads/sec without saturating CPU.
Consumer Groups
Consumer groups are how Kafka enables parallel consumption:
Topic: orders (4 partitions)
P0 P1 P2 P3
Group A: C1 C2 C1 C2 (2 consumers, each handles 2 partitions)
Group B: C3 C3 C4 C4 (independent read of same topic)
Rules:
- Each partition is assigned to at most one consumer per group at a time.
- If you have more consumers than partitions, some consumers sit idle. Adding partitions later allows more parallelism.
- Different consumer groups are completely independent – they each maintain their own offsets and can be at different positions in the log.
- Offsets are committed to a special internal topic called __consumer_offsets. This makes offset tracking durable and removes the need for ZooKeeper in newer Kafka versions.
Rebalancing happens when a consumer joins or leaves a group. During rebalance, partition assignments are redistributed and consumption pauses briefly. Sticky assignor and cooperative rebalancing reduce this disruption.
At-Least-Once vs Exactly-Once Delivery
At-most-once: Commit offset before processing. If consumer crashes after commit but before processing, message is lost. Acceptable for metrics and analytics where some data loss is tolerable.
At-least-once: Commit offset after processing. If consumer crashes after processing but before committing, it re-processes the message on restart. This is the default and most common mode. Your consumer must be idempotent – processing the same message twice should be safe.
Exactly-once: The hard problem. Kafka achieves this via:
- Idempotent producer: Each message gets a sequence number. The broker deduplicates retries within a session. Enabled with enable.idempotence=true.
- Transactional producer: Atomic write spanning multiple partitions. Either all messages in the transaction are visible, or none are. Enables read-process-write patterns where you atomically consume from one topic and produce to another.
- Idempotent consumer: Even with transactional producers, consumers should track processed message IDs (e.g., in a database) to handle edge cases.
In practice: use at-least-once with idempotent consumers for 90% of use cases. Reserve exactly-once for financial transactions and audit logs where duplicates cause real harm.
Dead Letter Queue
Some messages cannot be processed – malformed data, upstream dependency unavailable, or a bug in consumer code. Without a DLQ, failed messages block processing or get silently dropped.
DLQ pattern:
- Consumer tries to process message. On failure, increments retry count (stored in message header or a Redis key).
- After N retries (typically 3-5), consumer publishes the message to a dead-letter topic: orders-dlq.
- Consumer commits the original offset and continues processing.
- An ops team or separate process monitors the DLQ, inspects failures, fixes the bug or data issue, and republishes messages to the original topic.
Important: include the original topic name, partition, offset, timestamp, and error message in the DLQ record. Without this context, debugging is difficult.
Replication
Kafka replicates each partition across multiple brokers for fault tolerance:
- Leader: One replica is elected leader. All produce and consume requests go to the leader. The leader tracks which followers are in-sync.
- In-Sync Replicas (ISR): The set of replicas that are fully caught up with the leader. A follower is removed from ISR if it falls behind by more than replica.lag.time.max.ms (default 10 seconds).
- Acknowledgment modes (acks):
- acks=0: Fire and forget. Producer does not wait for broker acknowledgment. Fastest, but no durability guarantee.
- acks=1: Leader writes to disk, acknowledges. Follower lag means data loss if leader fails before replication.
- acks=all (acks=-1): Leader waits for all ISR to acknowledge. Strongest durability. Use min.insync.replicas=2 to require at least 2 replicas in ISR.
- Leader election: When the leader fails, the controller (another broker) elects a new leader from the ISR. Unclean leader election (electing an out-of-sync replica) risks data loss and is disabled by default.
Push vs Pull
- Pull (Kafka, Kinesis): Consumer requests messages from broker at its own pace. Advantages: consumer controls batch size and fetch rate; no backpressure needed; consumer can replay from any offset. Disadvantage: consumer must poll even when no messages are available (mitigated by long polling with fetch.max.wait.ms).
- Push (SQS, RabbitMQ): Broker pushes messages to consumer. Advantages: lower latency for single-message processing; simpler consumer code. Disadvantages: broker must track per-consumer rate limits; slow consumers can be overwhelmed.
SQS uses a visibility timeout model: consumer receives message, message becomes invisible to other consumers for N seconds. Consumer must delete the message before timeout expires, or it becomes visible again for retry. This enables at-least-once delivery without explicit consumer groups.
Retention and Compaction
How long do you keep messages? Kafka supports two policies:
Time/size-based retention (log.retention.hours, log.retention.bytes): Delete segments older than N hours or when the topic exceeds M bytes. Use for event streams where historical data beyond a window is not needed – application logs, metrics, clickstreams.
Log compaction (log.cleanup.policy=compact): For each unique key, retain only the most recent message. Older messages for the same key are garbage collected. The compacted log is like a snapshot of the latest state for every key. Use cases:
- Database change data capture (CDC) – latest row state per primary key
- User profile updates – latest profile version per user ID
- Configuration changes – latest config value per config key
Log compaction allows a new consumer to bootstrap current state by reading from offset 0 without needing infinite retention of all historical events.
Scale Numbers
Kafka’s published benchmarks (LinkedIn, 2014 era hardware):
- Producer throughput: 786,980 messages/second for a single producer (200MB/s)
- Consumer throughput: 940,521 messages/second (over 400MB/s, benefits from zero-copy)
- LinkedIn production (2023): 7 trillion messages per day, 2.8 million messages per second peak
- Typical latency: sub-10ms end-to-end at p99 for acks=1, sub-100ms for acks=all with replication
These numbers come from sequential I/O, batching (producers buffer messages before sending), and zero-copy network transfers. The architecture is designed around disk, not around RAM – Kafka intentionally writes everything to disk and relies on the OS page cache for read performance.
LinkedIn created Kafka and deeply tests message queue design. See system design questions for LinkedIn interview: Kafka and message queue system design.
Netflix uses Kafka for event streaming at massive scale. See system design patterns for Netflix interview: message queue and event streaming design.
Datadog ingests metrics via message queues. See system design patterns for Datadog interview: event pipeline and message queue design.
See also: Scale AI Interview Guide 2026: Data Infrastructure, RLHF Pipelines, and ML Engineering
See also: Databricks Interview Guide 2026: Spark Internals, Delta Lake, and Lakehouse Architecture