Question 1

What is fan-out on write vs. fan-out on read for notifications?

Accepted Answer

Fan-out on write (push model): when an event occurs, immediately enqueue a notification task for every recipient. A post by a user with 1M followers generates 1M tasks at publish time. Pros: low delivery latency for every recipient. Cons: expensive for high-follower accounts (1M tasks per post) and wasted work for recipients who never open the app. Fan-out on read (pull model): store the event once; when a user opens the app, their feed queries for the relevant events. Pros: one write per event and no wasted work for inactive users. Cons: nothing is delivered until the user comes back, and the read-time query is more complex and more expensive. Hybrid (the approach commonly attributed to large social platforms such as Twitter and Instagram): fan out immediately to active users (online or recently logged in), and store events for inactive users to deliver on their next login. This trades delivery speed against write amplification: active users get push-model latency, while each inactive user costs nothing beyond the single stored event.
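The hybrid strategy can be sketched in a few lines of Python. This is a minimal in-memory illustration, not a production design: FanOutRouter, push_queue, and event_log are hypothetical names, the list stands in for a real task queue, and the "active" set is assumed to be maintained elsewhere (for example from recent logins).

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set, Tuple


@dataclass
class FanOutRouter:
    """Hypothetical hybrid fan-out: push to active followers, store once for the rest."""

    followers: Dict[str, List[str]]   # author_id -> follower ids
    active_users: Set[str]            # assumed to be maintained elsewhere
    push_queue: List[Tuple[str, str]] = field(default_factory=list)  # stand-in for a task queue
    event_log: Dict[str, List[str]] = field(default_factory=dict)    # author_id -> event ids

    def publish(self, author_id: str, event_id: str) -> int:
        """Store the event once, then enqueue one task per active follower."""
        self.event_log.setdefault(author_id, []).append(event_id)
        pushed = 0
        for follower in self.followers.get(author_id, []):
            if follower in self.active_users:
                # Fan-out on write, but only for users likely to see it soon.
                self.push_queue.append((follower, event_id))
                pushed += 1
        return pushed

    def pull(self, following: List[str]) -> List[str]:
        """Fan-out on read: gather stored events when an inactive user logs in.

        A real system would also track a per-user cursor so that only
        events newer than the last visit are returned.
        """
        events: List[str] = []
        for author in following:
            events.extend(self.event_log.get(author, []))
        return events


router = FanOutRouter(
    followers={"alice": ["bob", "carol", "dave"]},
    active_users={"bob"},
)
pushed_count = router.publish("alice", "post-1")   # only bob gets a push task
carol_feed = router.pull(["alice"])                # carol reads it on login
```

With three followers and one active user, the publish step enqueues one task instead of three; the other two followers cost nothing until they log in.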

Question 2

How do you prevent duplicate notification delivery?

Accepted Answer

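At-least-once delivery (Kafka, SQS) produces duplicate notifications when a consumer crashes after sending but before committing the offset (Kafka) or deleting the message (SQS). Prevent duplicates with notification IDs: assign a UUID to each notification at creation. Before calling the channel API (APNs, SendGrid), atomically claim the ID in Redis with SET notification_id NX EX ttl, where the TTL covers the delivery window plus the retry window (e.g., 24 hours). If the claim fails, another attempt already owns this notification, so skip it. A separate existence check followed by a later write is not enough, because two consumers can both pass the check before either records the send. If the channel call fails, delete the key so that the retry is not mistaken for a duplicate. For third-party APIs: pass the notification ID as an idempotency key where the provider supports one; support varies by provider, so confirm it in the provider's documentation. For push notifications: APNs (apns-collapse-id header) and FCM (collapse_key) do not deduplicate as such, but a message with the same collapse identifier replaces an earlier undelivered one, which prevents N identical push notifications from queueing up for an offline device.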
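The claim-before-send pattern can be sketched as follows. To keep the example runnable without a Redis server, InMemoryDedupStore is a hypothetical stand-in that mimics SET key value NX EX ttl; with redis-py the claim would correspond to r.set(key, "1", nx=True, ex=ttl). The function name deliver_once and the key prefix are assumptions made for illustration.

```python
import time
from typing import Callable, Dict


class InMemoryDedupStore:
    """Hypothetical stand-in for Redis 'SET key value NX EX ttl'."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._expiry: Dict[str, float] = {}
        self._clock = clock

    def claim(self, key: str, ttl_seconds: float) -> bool:
        """Return True only for the first caller within the TTL window."""
        now = self._clock()
        expires_at = self._expiry.get(key)
        if expires_at is not None and expires_at > now:
            return False  # someone already claimed this notification
        self._expiry[key] = now + ttl_seconds
        return True

    def release(self, key: str) -> None:
        """Drop the claim so a retry can proceed after a failed send."""
        self._expiry.pop(key, None)


def deliver_once(
    store: InMemoryDedupStore,
    notification_id: str,
    send: Callable[[], None],
    ttl_seconds: float = 24 * 3600,
) -> str:
    key = f"notif:sent:{notification_id}"  # key naming is an assumption
    if not store.claim(key, ttl_seconds):
        return "skipped"
    try:
        send()  # call the channel API (push, email, SMS)
    except Exception:
        store.release(key)  # let the queue's retry attempt the send again
        raise
    return "sent"


store = InMemoryDedupStore()
sent_log = []
first = deliver_once(store, "n-1", lambda: sent_log.append("n-1"))
second = deliver_once(store, "n-1", lambda: sent_log.append("n-1"))
```

Claiming before sending has a cost: if the process dies between the claim and the send, the notification is suppressed until the key expires. Systems that prefer a rare duplicate over a rare loss can record the ID only after a successful send and accept the small race that this reopens.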

Question 3

How should a notification system handle APNs device token invalidation?

Accepted Answer

APNs returns specific errors when a device token can no longer be used: BadDeviceToken (HTTP 400: the token is malformed or does not match the environment, for example a sandbox token sent to production) and Unregistered (HTTP 410: the token is no longer active for the topic, typically because the user uninstalled the app). On receiving either error, remove the invalid token from the user's stored push tokens and do not retry with it. If the user has multiple devices, remove only the failing token. Continuing to send to invalid tokens wastes capacity and may get the provider throttled by APNs, so clean up promptly. The standalone feedback service belonged to the legacy binary protocol, which Apple has retired; with the HTTP/2 provider API, the 410 response carries a timestamp of when APNs confirmed the token was inactive. Keep the token only if the device re-registered after that timestamp. A periodic job that expires tokens the app has not re-registered recently can catch whatever real-time handling missed. After removing a token, check whether the user has any push token left; if not, route the notification to another enabled channel (email, SMS).
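The handling rules above can be sketched as a small decision function. This is an illustration under stated assumptions: handle_push_failure, the token map, and the fallback map are hypothetical, and only the two error reasons named in the answer are treated as permanent; any other reason is assumed to be retryable.

```python
from typing import Dict, List, Set

# APNs error reasons that mean the token itself must be discarded.
PERMANENT_TOKEN_ERRORS = {"BadDeviceToken", "Unregistered"}


def handle_push_failure(
    user_id: str,
    token: str,
    reason: str,
    tokens_by_user: Dict[str, Set[str]],
    fallback_channels: Dict[str, List[str]],
) -> str:
    """Decide what to do after APNs rejects a send (hypothetical helper)."""
    if reason not in PERMANENT_TOKEN_ERRORS:
        return "retry"  # assumed transient: leave the token in place

    # Remove only the failing token; other devices stay registered.
    remaining = tokens_by_user.get(user_id, set())
    remaining.discard(token)
    if remaining:
        return "removed"

    # No push token left: fall back to another enabled channel, if any.
    channels = fallback_channels.get(user_id, [])
    return f"fallback:{channels[0]}" if channels else "undeliverable"


tokens = {"u1": {"tok-phone", "tok-tablet"}, "u2": {"tok-only"}}
fallbacks = {"u2": ["email", "sms"]}

multi_device = handle_push_failure("u1", "tok-phone", "Unregistered", tokens, fallbacks)
last_token = handle_push_failure("u2", "tok-only", "BadDeviceToken", tokens, fallbacks)
```

Here the first user keeps the tablet token, while the second user has no push token left and is routed to email.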

Question 4

How do you implement per-user notification throttling?

Accepted Answer

Per-user throttling prevents notification fatigue (for example, 50 notifications in an hour). Implement it with a token bucket per user per channel: each bucket has capacity C and refills at R tokens per hour. When a notification is about to be sent, check the bucket: if a token is available, deduct one and send; if the bucket is empty, hold the notification back. Keep the bucket state in a Redis hash: HMGET user:{id}:push_bucket tokens last_refill; compute the tokens accrued since last_refill; cap the total at C; deduct one if at least one token is available; write the new tokens and last_refill back. Run this read-compute-write sequence atomically (for example in a Lua script); otherwise concurrent senders can both spend the same token. Apply stricter limits (smaller C and R) to marketing and promotional notifications than to transactional ones; order confirmations and security alerts bypass throttling entirely. Rather than dropping throttled notifications, store them and deliver an aggregated digest ('You have 5 new comments') hourly or daily.
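The bucket arithmetic can be sketched in plain Python. This is an in-memory illustration with an explicit clock so the behavior is easy to follow; UserThrottle and Bucket are hypothetical names, and in a Redis-backed version the same refill, cap, and deduct steps would have to run atomically.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass
class Bucket:
    tokens: float
    last_refill: float  # timestamp in seconds


class UserThrottle:
    """Hypothetical per-user, per-channel token bucket."""

    def __init__(self, capacity: int, refill_per_hour: float) -> None:
        self.capacity = capacity
        self.refill_per_hour = refill_per_hour
        self._buckets: Dict[Tuple[str, str], Bucket] = {}

    def allow(self, user_id: str, channel: str, now: float) -> bool:
        """Return True and spend a token if the notification may be sent."""
        key = (user_id, channel)
        bucket = self._buckets.get(key)
        if bucket is None:
            # New users start with a full bucket.
            bucket = Bucket(tokens=float(self.capacity), last_refill=now)
            self._buckets[key] = bucket

        # Add the tokens accrued since the last refill, capped at capacity.
        elapsed = max(0.0, now - bucket.last_refill)
        accrued = elapsed * self.refill_per_hour / 3600.0
        bucket.tokens = min(float(self.capacity), bucket.tokens + accrued)
        bucket.last_refill = now

        if bucket.tokens >= 1.0:
            bucket.tokens -= 1.0
            return True
        return False  # caller should queue the notification for a digest


throttle = UserThrottle(capacity=2, refill_per_hour=1)
burst = [throttle.allow("u1", "push", now=0.0) for _ in range(3)]
after_one_hour = throttle.allow("u1", "push", now=3600.0)
```

With a capacity of 2 and a refill of 1 token per hour, the first two sends in a burst pass, the third is held back, and one more send becomes available an hour later.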

Notification System: Low-Level Design

Architecture Overview

Fan-Out Strategies

User Preference Management

Delivery Failure and Retry

Deduplication

Throttling and Rate Limiting