A publish-subscribe (pub/sub) messaging system decouples producers (publishers) from consumers (subscribers) through a message broker. Publishers emit events to topics without knowing who consumes them; subscribers register interest in topics and receive messages asynchronously. This pattern powers event-driven architectures, real-time notifications, and microservice communication. Systems like Kafka, Google Pub/Sub, AWS SNS/SQS, and RabbitMQ implement variations of pub/sub with different consistency and delivery guarantees.
Core Components
A pub/sub system has five core components: Topics (logical channels named by subject, e.g., order.created), Publishers (produce messages to a topic without knowing consumers), Subscriptions (a named durable association between a topic and a consumer group), Messages (payload + metadata: message ID, publish timestamp, ordering key, attributes), and Brokers (store messages, match subscriptions, deliver to consumers). The broker must handle fan-out: if 10 subscriptions exist on a topic, each published message must be delivered to all 10.
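Fan-out can be sketched with a toy in-memory broker: every message published to a topic is delivered to every subscription registered on that topic. The Broker, Subscribe, and Publish names here are illustrative, not any specific system's API.

```go
package main

import "fmt"

// Broker is a toy in-memory broker illustrating fan-out: every message
// published to a topic is delivered to all subscriptions on that topic.
type Broker struct {
	subs map[string][]chan string // topic -> subscription channels
}

func NewBroker() *Broker {
	return &Broker{subs: make(map[string][]chan string)}
}

// Subscribe registers a new subscription on a topic and returns its channel.
func (b *Broker) Subscribe(topic string) <-chan string {
	ch := make(chan string, 16)
	b.subs[topic] = append(b.subs[topic], ch)
	return ch
}

// Publish delivers the payload to every subscription on the topic.
func (b *Broker) Publish(topic, payload string) {
	for _, ch := range b.subs[topic] {
		ch <- payload
	}
}

func main() {
	b := NewBroker()
	s1 := b.Subscribe("order.created")
	s2 := b.Subscribe("order.created")
	b.Publish("order.created", "order#1")
	fmt.Println(<-s1, <-s2) // both subscriptions receive the same message
}
```

A real broker adds durability, delivery retries, and per-subscription cursors on top of this basic matching step.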
Message Storage and Partitioning
Messages are stored in partitioned logs (Kafka-style) or per-subscription queues. Partition-based storage: each topic has N partitions; messages are assigned to partitions by a routing key hash (e.g., user_id). Within a partition, messages are ordered and assigned sequential offsets. Each subscription tracks its offset per partition. Benefits: horizontal scalability (add partitions), ordering guarantees within a partition, and efficient replay (seek to offset). Trade-off: messages with different partition keys are unordered relative to each other.
// Message structure (requires "time")
type Message struct {
	ID          string // globally unique
	Topic       string
	Partition   int
	Offset      int64 // monotonically increasing per partition
	Key         string // routing key (determines partition)
	Payload     []byte
	Attributes  map[string]string
	PublishedAt time.Time
}

// Partition assignment: hash the routing key so the same key always
// maps to the same partition (requires "hash/fnv"). Taking the modulo
// in uint32 before converting avoids a negative result on platforms
// where int is 32 bits.
func partition(key string, numPartitions int) int {
	h := fnv.New32a()
	h.Write([]byte(key))
	return int(h.Sum32() % uint32(numPartitions))
}
Delivery Semantics
Three delivery semantics exist: At-most-once: acknowledge before processing; if consumer crashes, message is lost. Lowest latency. At-least-once: acknowledge after processing; if consumer crashes before ack, message redelivers. Consumer must be idempotent. Exactly-once: requires transactional producers + idempotent consumers + offset commit in the same transaction. Kafka supports this with transactional APIs, but at higher latency cost. Most systems use at-least-once with idempotent consumers — the pragmatic default.
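The pragmatic at-least-once default can be sketched as an idempotent consumer: deduplicate on message ID before processing, and acknowledge only after processing succeeds. The handle, process, and ack names here are illustrative, not a real client API, and the in-memory dedupe map stands in for a durable store.

```go
package main

import "fmt"

// seen records processed message IDs so redeliveries become no-ops.
// In production this would be a durable store keyed by message ID,
// not an in-memory map.
var seen = make(map[string]bool)

// handle processes a message idempotently under at-least-once delivery:
// a redelivered message with a known ID is skipped, and the ack happens
// only after successful processing.
func handle(id, payload string, process func(string) error, ack func(string)) error {
	if seen[id] {
		ack(id) // already processed: ack so the broker stops redelivering
		return nil
	}
	if err := process(payload); err != nil {
		return err // no ack: the broker will redeliver
	}
	seen[id] = true
	ack(id) // ack after processing, never before
	return nil
}

func main() {
	processed := 0
	process := func(p string) error { processed++; return nil }
	ack := func(id string) {}
	handle("m1", "hello", process, ack)
	handle("m1", "hello", process, ack) // simulated redelivery: deduplicated
	fmt.Println(processed)              // 1
}
```

Acking before the process call instead would turn this into at-most-once: a crash mid-processing loses the message.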
Consumer Groups and Load Balancing
A consumer group is a set of consumer instances that collectively consume a subscription. Each partition is assigned to exactly one consumer in the group at a time (exclusive ownership). The broker maintains partition assignments and handles rebalancing when consumers join or leave. With N partitions and M consumers (M ≤ N), each consumer owns roughly N/M partitions. This provides parallel processing without duplicate delivery within the group. Multiple independent consumer groups on the same topic each receive all messages (fan-out).
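The assignment step can be sketched as a round-robin spread of partitions over consumers, so every partition has exactly one owner and loads differ by at most one partition. The assignPartitions name is illustrative; real brokers such as Kafka use pluggable assignor strategies.

```go
package main

import "fmt"

// assignPartitions spreads numPartitions across consumers so that each
// partition has exactly one owner (exclusive ownership) and per-consumer
// loads differ by at most one partition.
func assignPartitions(consumers []string, numPartitions int) map[string][]int {
	out := make(map[string][]int, len(consumers))
	for p := 0; p < numPartitions; p++ {
		owner := consumers[p%len(consumers)] // round-robin over the group
		out[owner] = append(out[owner], p)
	}
	return out
}

func main() {
	a := assignPartitions([]string{"c1", "c2", "c3"}, 8)
	fmt.Println(a["c1"], a["c2"], a["c3"]) // [0 3 6] [1 4 7] [2 5]
}
```

Rerunning this function with the new membership list is, in essence, what a rebalance does when a consumer joins or leaves.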
Dead Letter Topics
When a consumer fails to process a message after max retries, move it to a dead letter topic (DLT): order.created.dlq. A DLT preserves the original message with metadata: failure reason, attempt count, original topic, original timestamp. A separate consumer processes DLT messages for alerting, manual review, or replay after fixing the bug. Configure DLT routing per subscription: Google Pub/Sub and SQS support this natively. Without DLTs, poison pill messages (malformed payloads) block partition processing indefinitely.
Key Interview Discussion Points
- Push vs. pull delivery: push (broker sends to consumer endpoint) can overwhelm slow consumers, so the broker needs flow control; pull (consumer polls broker) gives consumers natural backpressure by fetching at their own pace, at the cost of polling overhead
- Message ordering: ordering guaranteed within a partition key; across partitions, use a sequencer service or accept out-of-order processing
- Backpressure: consumer groups track lag (current offset vs. latest offset); alert when lag grows, scale consumers when lag persists
- Schema registry: enforce Avro/Protobuf schemas on publish to prevent malformed messages from entering the pipeline
- Compacted topics: retain only the latest message per key — useful for changelog topics (database change events, config updates)
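The backpressure point above can be sketched as a lag computation: lag is the gap between the latest offset the broker has assigned and the offset the consumer group has committed, summed over partitions. The type and field names here are illustrative.

```go
package main

import "fmt"

// PartitionState tracks the head of a partition's log and a consumer
// group's committed position within it.
type PartitionState struct {
	LatestOffset    int64 // next offset the broker will assign
	CommittedOffset int64 // next offset the group will consume
}

// totalLag sums per-partition lag for a consumer group: alert when it
// exceeds a threshold, scale out consumers when it keeps growing.
func totalLag(parts []PartitionState) int64 {
	var lag int64
	for _, p := range parts {
		lag += p.LatestOffset - p.CommittedOffset
	}
	return lag
}

func main() {
	parts := []PartitionState{
		{LatestOffset: 1200, CommittedOffset: 1100}, // 100 behind
		{LatestOffset: 540, CommittedOffset: 540},   // caught up
	}
	fmt.Println(totalLag(parts)) // 100
}
```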