Notification System Requirements
A notification system routes messages from producers (application events) to consumers (users) via multiple channels: push notifications (iOS/Android), email, SMS, in-app notifications, and Slack/Teams webhooks. Scale requirements for a large platform (100M users): tens of millions of notifications per hour during peak events (breaking news, sports scores, marketing campaigns), delivery latency < 5 seconds for real-time notifications, high deliverability (98%+ for email, 95%+ for push), and per-user preference management (users opt out of specific notification types).
Notification Channels
- Push notifications (iOS): sent via Apple Push Notification service (APNs). The server authenticates with APNs using a JWT or certificate, sends a JSON payload with the device token, and APNs delivers to the device. Device tokens are per-app-install and change when the app is reinstalled. Token churn must be handled: with the HTTP/2 API, APNs reports invalid tokens via error responses (410 Gone / BadDeviceToken), which should trigger removal from the database. (The legacy binary-protocol feedback service is deprecated.)
- Push notifications (Android): sent via Firebase Cloud Messaging (FCM). Similar to APNs; registration tokens replace device tokens. FCM also supports topic subscriptions (broadcast to all subscribers of “sports:nba”) and device groups.
- Email: sent via SMTP or transactional email APIs (SendGrid, Mailgun, SES). Deliverability is complex: SPF/DKIM/DMARC records, sender reputation, bounce handling, unsubscribe management (CAN-SPAM, GDPR requirements).
- SMS: sent via Twilio, AWS SNS, or direct carrier connections. Most expensive per-message. Use for critical alerts (security codes, two-factor authentication, payment confirmations) where push/email may not be seen quickly.
Architecture: Fanout Service
# Notification flow:
1. Event producer publishes event to Kafka topic (e.g., "user.liked.your.post")
2. Notification Service consumes event, evaluates:
   - Does user X have notifications enabled for this event type?
   - Is user X in the rate limit window? (max N notifications/hour)
   - Is user X in a quiet hours window? (no notifications 11 PM - 7 AM)
3. If eligible, look up user's device tokens, email, phone
4. Fan out to delivery workers via channel-specific queues:
   Kafka topic "notifications.push"  → APNs/FCM worker
   Kafka topic "notifications.email" → SendGrid worker
   Kafka topic "notifications.sms"   → Twilio worker
5. Workers deliver and record status in notifications table

# For large-scale fanout (e.g., broadcast to 50M users):
# Do NOT generate 50M individual notification records synchronously
# Instead: create one "broadcast" record with a template + recipient criteria
# Workers query the database: SELECT device_tokens FROM users WHERE ... LIMIT 1000
# Process in batches with parallel workers; use FCM multicast and multiplexed APNs connections
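The batched fanout loop above can be sketched as follows. This is a sketch, not the system's actual implementation: `fetch_user_batch` and `publish_batch` are hypothetical stand-ins for the cursor-paginated database query and the Kafka producer.

```python
def fan_out_broadcast(fetch_user_batch, publish_batch, batch_size=1000):
    """Walk the user table with a cursor and publish one message per batch.

    fetch_user_batch(last_id, limit) stands in for e.g.
      SELECT id, device_token FROM users WHERE id > %s ORDER BY id LIMIT %s
    publish_batch(tokens) stands in for the Kafka producer.
    """
    last_id = 0
    batches = 0
    while True:
        rows = fetch_user_batch(last_id, batch_size)
        if not rows:
            break  # cursor exhausted: every matching user has been enqueued
        publish_batch([token for _, token in rows])
        last_id = rows[-1][0]  # advance the cursor past the last seen id
        batches += 1
    return batches
```

Cursor pagination (`WHERE id > last_id`) rather than `OFFSET` keeps each batch query an index seek, so batch N is as cheap as batch 1.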
Delivery Tracking and Receipts
-- Notifications table
CREATE TABLE notifications (
    id UUID PRIMARY KEY,
    user_id BIGINT NOT NULL,
    type VARCHAR(50) NOT NULL,     -- 'like', 'comment', 'marketing'
    channel VARCHAR(20) NOT NULL,  -- 'push', 'email', 'sms'
    payload JSONB,
    status VARCHAR(20) DEFAULT 'pending',  -- pending/sent/delivered/failed
    sent_at TIMESTAMPTZ,
    delivered_at TIMESTAMPTZ,
    failed_reason TEXT,
    created_at TIMESTAMPTZ DEFAULT NOW()
);
-- Index for user notification history:
CREATE INDEX idx_notifications_user ON notifications(user_id, created_at DESC);
-- Delivery receipt handling:
-- APNs: the HTTP/2 response status confirms acceptance by APNs, not delivery to the device
-- FCM: the legacy API reported canonical_ids; the HTTP v1 API returns per-token error codes
-- Email: track opens (1px tracking pixel) and clicks (link redirect)
-- SMS: Twilio sends delivery status webhooks (delivered/undelivered/failed)
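Each provider reports status in its own vocabulary, so workers typically normalize receipts into the notifications table's pending/sent/delivered/failed states. The mapping below is illustrative; the status names are assumed from Twilio and SendGrid webhook conventions, not a complete list.

```python
# Hypothetical mapping from provider callback statuses to our internal states.
PROVIDER_STATUS = {
    # Twilio SMS status-callback values
    "delivered": "delivered",
    "undelivered": "failed",
    "failed": "failed",
    "sent": "sent",
    # SendGrid-style email event names
    "bounce": "failed",
    "dropped": "failed",
    "open": "delivered",  # an open implies the mail reached the inbox
}

def normalize_receipt(provider_status: str) -> str:
    """Map a provider callback status to an internal state; unknown -> 'sent'."""
    return PROVIDER_STATUS.get(provider_status, "sent")
```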
User Preferences and Rate Limiting
-- User notification preferences
CREATE TABLE notification_preferences (
    user_id BIGINT NOT NULL,
    notification_type VARCHAR(50) NOT NULL,  -- 'social', 'marketing', 'security'
    channel VARCHAR(20) NOT NULL,            -- 'push', 'email', 'sms'
    enabled BOOLEAN DEFAULT TRUE,
    quiet_hours_start TIME,  -- local time
    quiet_hours_end TIME,
    timezone VARCHAR(50),
    PRIMARY KEY (user_id, notification_type, channel)
);
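Most users never change the defaults, so one common refinement is to store only per-user overrides and fall back to organizational defaults for everyone else. The sketch below illustrates the lookup order; `is_enabled`, the default values, and the override shape are all hypothetical.

```python
# Organizational defaults per (notification_type, channel); illustrative values.
DEFAULTS = {
    ("security", "push"): True,
    ("security", "sms"): True,
    ("social", "push"): True,
    ("marketing", "email"): True,
    ("marketing", "push"): False,
}

def is_enabled(user_overrides: dict, notification_type: str, channel: str) -> bool:
    """Resolve a preference: explicit user override wins, then org default."""
    key = (notification_type, channel)
    if key in user_overrides:
        return user_overrides[key]      # the user made an explicit choice
    return DEFAULTS.get(key, False)     # unknown type/channel pairs default off
```

Only storing overrides shrinks the preference table dramatically and keeps the cache small.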
# Rate limiting per user per channel:
# Security alerts: unlimited (always send)
# Social (likes, comments): max 10/hour, max 50/day
# Marketing: max 1/day, max 5/week
# System: max 3/hour
import datetime

def should_send(user_id, notification_type, channel):
    prefs = db.get_preferences(user_id, notification_type, channel)
    if not prefs.enabled:
        return False, "user_opt_out"
    # Check quiet hours
    user_local_time = get_local_time(prefs.timezone)
    if is_quiet_hours(user_local_time, prefs.quiet_hours_start, prefs.quiet_hours_end):
        return False, "quiet_hours"  # or: schedule for after quiet hours
    # Check rate limits (Redis counters); the key rotates every hour
    current_hour = datetime.datetime.now(datetime.timezone.utc).strftime("%Y%m%d%H")
    hourly_key = f"notif_rate:{user_id}:{channel}:{current_hour}"
    count = redis.incr(hourly_key)
    if count == 1:
        redis.expire(hourly_key, 3600)  # set TTL when the window opens, not only on overflow
    if count > prefs.max_per_hour:
        return False, "rate_limited"
    return True, None
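The `is_quiet_hours` helper used above is left undefined; the subtle case is a window like 11 PM - 7 AM that crosses midnight, where a naive `start <= now < end` check fails. One possible implementation:

```python
from datetime import time

def is_quiet_hours(now: time, start: time, end: time) -> bool:
    """True if `now` falls inside the quiet window.

    Handles both same-day windows (13:00-15:00) and windows that
    cross midnight (23:00-07:00).
    """
    if start is None or end is None:
        return False  # no quiet hours configured
    if start <= end:
        return start <= now < end          # same-day window
    return now >= start or now < end       # window wraps past midnight
```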
Deduplication
Duplicate notifications happen when the producer publishes the same event multiple times (at-least-once Kafka delivery) or when retries send the same notification twice. Deduplication: generate a deterministic idempotency key for each notification (e.g., hash of event_id + user_id + channel + hour). Before sending, check if this key exists in Redis. If it does, skip sending. After sending, set the key in Redis with a TTL of 24 hours.
import hashlib
import redis

r = redis.Redis()

def send_notification_deduped(event_id, user_id, channel, payload):
    # Deterministic key: the same input always produces the same key
    dedup_key = hashlib.sha256(
        f"{event_id}:{user_id}:{channel}".encode()
    ).hexdigest()
    # SET NX: only set if the key doesn't exist (atomic check-and-set)
    if not r.set(f"notif_dedup:{dedup_key}", 1, nx=True, ex=86400):
        return "duplicate_skipped"
    # Send the notification
    deliver(user_id, channel, payload)
    return "sent"
Handling Push Token Invalidation
Device tokens become invalid when users reinstall the app, clear app data, or disable notifications. Sending to invalid tokens wastes resources and can damage sender reputation with APNs/FCM.
# APNs HTTP/2 response handling:
# 410 Gone: token is permanently invalid — delete from database immediately
# 400 Bad device token: invalid format — remove from database
# 429 Too Many Requests: back off exponentially
# FCM error handling:
# UNREGISTERED: token invalid — delete from database
# INVALID_ARGUMENT: token malformed — delete
# SENDER_ID_MISMATCH: token belongs to different project
import time

def process_apns_response(device_token, status_code, error_code):
    if status_code == 410 or error_code in ("BadDeviceToken", "Unregistered"):
        db.execute("DELETE FROM device_tokens WHERE token = %s", (device_token,))
        return "token_removed"
    if status_code == 429:
        time.sleep(exponential_backoff())  # exponential_backoff() per your retry policy
        return "rate_limited"
    if status_code == 200:
        return "delivered"
    return "retry_later"  # other 4xx/5xx: requeue for retry
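A companion handler for the FCM HTTP v1 error names listed above might look like the following. This is a sketch: `db.delete_token` is a hypothetical helper, and the transient-error list is not exhaustive.

```python
# Error names from the FCM HTTP v1 API that mean the token is dead.
FCM_DEAD_TOKEN_ERRORS = {"UNREGISTERED", "INVALID_ARGUMENT"}

def process_fcm_response(device_token, error_code, db):
    if error_code is None:
        return "delivered"  # FCM accepted the message
    if error_code in FCM_DEAD_TOKEN_ERRORS:
        db.delete_token(device_token)  # remove the dead token immediately
        return "token_removed"
    if error_code == "SENDER_ID_MISMATCH":
        # Token was issued for a different Firebase project; retrying won't help
        db.delete_token(device_token)
        return "wrong_project"
    return "retry_later"  # transient errors (e.g. QUOTA_EXCEEDED, UNAVAILABLE)
```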
Email Deliverability
- SPF (Sender Policy Framework): DNS TXT record listing IP addresses authorized to send email for your domain. Receiving mail servers may reject or spam-folder mail from unauthorized IPs, especially when your DMARC policy enforces it.
- DKIM (DomainKeys Identified Mail): cryptographic signature in email headers, verified using a public key in DNS. Proves the email was not tampered with in transit.
- DMARC: policy that tells ISPs what to do with emails failing SPF/DKIM (reject, quarantine, or none). Prevents spoofing of your domain in phishing attacks.
- Bounce handling: hard bounces (invalid address — 5xx) must be removed immediately. Soft bounces (mailbox full, server busy — 4xx) can be retried. High bounce rate damages sender reputation.
- Unsubscribe: CAN-SPAM requires a working unsubscribe link in marketing emails. Honor opt-outs within 10 business days (legal requirement). List-Unsubscribe header enables one-click unsubscribe in email clients.
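The bounce rules above can be sketched as a classifier over SMTP reply codes. This is a simplification: real providers also emit enhanced status codes (e.g. 5.1.1) and webhook event types that give finer-grained reasons.

```python
def classify_bounce(smtp_code: int) -> str:
    """Classify an SMTP reply code per the hard/soft bounce rules above."""
    if 500 <= smtp_code <= 599:
        return "hard"   # permanent failure (invalid address): suppress immediately
    if 400 <= smtp_code <= 499:
        return "soft"   # transient failure (mailbox full, server busy): retry with backoff
    return "none"       # 2xx/3xx: accepted, not a bounce
```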
Interview Questions
- Design a notification system for a social media platform with 500M users
- How do you handle sending a marketing blast to 50M users within 1 hour?
- A user receives the same notification 5 times — debug and fix the root cause
- How do you implement notification grouping (bundle 5 “like” notifications into one)?
- Design the preference system so users can control notification frequency without database bottlenecks
Frequently Asked Questions
How do you design a notification fanout system for 50 million users?
Sending a notification to 50 million users (a marketing campaign, breaking news alert) requires a distributed fanout architecture: you cannot generate 50M individual messages synchronously. The pipeline: (1) Create a single notification campaign record in the database: {campaign_id, template, target_criteria, scheduled_at}. (2) A campaign orchestrator queries users matching the target criteria in batches of 10,000-50,000 users using cursor-based pagination (WHERE user_id > last_processed_id ORDER BY user_id LIMIT 50000). For each batch, it publishes one "batch notification" message to a Kafka topic per channel (push, email, SMS). (3) Channel-specific fanout workers consume from Kafka. Each worker takes a batch of user IDs, fetches their device tokens/email addresses from the database (or a dedicated device registry service), and calls the delivery API efficiently: APNs accepts one device token per request, but HTTP/2 multiplexing allows many concurrent streams per connection (Apple permits up to 1,000), so a small connection pool sustains a high request rate; Firebase FCM offers a multicast send (up to 500 tokens per call); SendGrid supports batch email sends. (4) Delivery status is tracked asynchronously via webhooks and stored in a partitioned notifications_status table. (5) Throttling: most platforms limit your send rate. APNs throttles based on app traffic patterns; email providers rate-limit by IP reputation. Use token bucket rate limiting at the worker level to stay within limits. At 50M device tokens and a sustained aggregate rate of 50,000 APNs requests/second across the worker fleet, the push leg takes roughly 1,000 seconds (about 17 minutes), acceptable for non-time-sensitive campaigns.
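The worker-level token bucket mentioned in step (5) can be sketched in-process as below. A production worker fleet would typically share the bucket state via Redis rather than keep it per-process; the class and parameter names are illustrative.

```python
import time

class TokenBucket:
    """Classic token bucket: refills at `rate` tokens/second up to `capacity`."""

    def __init__(self, rate: float, capacity: float):
        self.rate = rate            # sustained sends per second
        self.capacity = capacity    # maximum burst size
        self.tokens = capacity
        self.last = time.monotonic()

    def try_acquire(self, n: float = 1.0) -> bool:
        """Take n tokens if available; return False (caller should wait) otherwise."""
        now = time.monotonic()
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= n:
            self.tokens -= n
            return True
        return False
```

A worker calls `try_acquire()` before each delivery request and sleeps briefly on `False`, which caps its sustained rate at `rate` while still allowing short bursts up to `capacity`.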
How do you handle notification deduplication and prevent users from receiving duplicate alerts?
Duplicate notifications happen when: Kafka at-least-once delivery re-delivers a message; a worker times out and the job is retried; a race condition causes two workers to process the same event; or a bug in the producer publishes the same event twice. Prevention architecture: (1) Idempotency key generation: create a deterministic key for each notification based on immutable properties: sha256(event_id + user_id + channel + notification_type). This ensures the same event always maps to the same key, regardless of how many times the event is delivered from Kafka. (2) Redis deduplication check: before sending, SET notification_dedup:{key} 1 NX EX 86400 (SET if Not eXists, expire in 24 hours). If the SET returns nil (key already exists), skip sending and acknowledge the Kafka message. The NX flag makes this check atomic — two concurrent workers trying to process the same event will both attempt the SET; only one will succeed, and the other will see the key already exists. (3) Database-level idempotency: for financial notifications where Redis may not be enough, insert into a notifications_sent table with a UNIQUE constraint on (user_id, event_id, channel). A duplicate attempt will fail the UNIQUE constraint rather than send twice. (4) Time-based deduplication window: set the TTL on the deduplication key to match the maximum retry window (24 hours for most systems). Keys expire automatically, avoiding unbounded Redis growth. Monitor duplicate rates in your metrics — a spike indicates a bug in the producer or Kafka consumer.
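Point (3) above, database-level idempotency, can be demonstrated end to end with SQLite standing in for the production database; the table and function names are illustrative.

```python
import sqlite3

def record_send(conn, user_id, event_id, channel):
    """Return True if this is the first send; False if the UNIQUE constraint
    rejects it as a duplicate."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO notifications_sent (user_id, event_id, channel) "
                "VALUES (?, ?, ?)",
                (user_id, event_id, channel),
            )
        return True
    except sqlite3.IntegrityError:
        return False  # duplicate: the row already exists

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE notifications_sent ("
    " user_id INTEGER, event_id TEXT, channel TEXT,"
    " UNIQUE (user_id, event_id, channel))"
)
```

Unlike the Redis check, the constraint holds even if the cache is flushed, which is why it suits financial or security notifications.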
How do you implement notification preference management without bottlenecking on a shared database?
User notification preferences are read on every notification send (high read volume) and written infrequently (users change preferences rarely). The naive approach — a database query per user per notification — becomes a bottleneck at scale. Scalable architecture: (1) Preference caching: store user preferences in Redis with a 1-hour TTL. On cache miss, read from the database and populate the cache. Cache invalidation: when a user updates preferences, delete their cache key immediately. With 10M daily active users sending 100M notifications/day, a 1-hour cache TTL reduces database reads by ~99% (most users do not change preferences within an hour). (2) Preference service: a dedicated microservice owns preference data. This service maintains its own in-memory cache (LRU cache per instance) and a Redis shared cache. The notification service calls the preference service via gRPC — the preference service batches lookups (fetch preferences for 1,000 users in one call) rather than per-user queries. (3) Preference event log: preferences are stored as an event log (user_id, event_type, preferences, timestamp). The preference service computes current preferences by replaying the event log, cached in memory per user. Event sourcing means updates are always appends — no locking, highly concurrent writes. (4) Default preferences: define organizational defaults for each notification type. Most users never change defaults — only store and look up overrides. This reduces the preference dataset by 90%+ and simplifies the cache design.
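The read-through cache in point (1) can be sketched with an in-process dict standing in for Redis; `load_from_db` is a hypothetical loader and the class name is illustrative.

```python
import time

class PreferenceCache:
    """Read-through cache with TTL expiry and explicit invalidation on writes."""

    def __init__(self, load_from_db, ttl_seconds=3600):
        self.load = load_from_db
        self.ttl = ttl_seconds
        self.store = {}      # user_id -> (expires_at, prefs)
        self.db_reads = 0    # instrumentation: count cache misses

    def get(self, user_id):
        entry = self.store.get(user_id)
        if entry and entry[0] > time.monotonic():
            return entry[1]                # cache hit: no database round trip
        prefs = self.load(user_id)         # cache miss: read from the database
        self.db_reads += 1
        self.store[user_id] = (time.monotonic() + self.ttl, prefs)
        return prefs

    def invalidate(self, user_id):
        # Call when the user updates preferences so the next read is fresh
        self.store.pop(user_id, None)
```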