Question 1

What is the difference between fan-out on write and fan-out on read for social feeds?

Accepted Answer

Fan-out on write (push model): when a user posts, the post is written to every follower's feed immediately. Reads are fast because the feed is precomputed, but each post costs O(followers) writes, which makes the model a poor fit for celebrities with millions of followers.

Fan-out on read (pull model): when a user loads their feed, the system queries recent posts from every account they follow, then merges and sorts them. Writes are a single insert, but each read costs O(followees) queries, which makes the model a poor fit for users who follow thousands of accounts.

Hybrid (reportedly used by Instagram and Twitter): fan-out on write for regular users, fan-out on read for celebrities (for example, accounts with more than 10K followers). At read time, the precomputed feed is merged with recent posts pulled live from the celebrities the viewer follows.
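The hybrid model can be sketched with plain in-memory structures. This is a minimal illustration, not a production design: the HybridFeed class, its method names, and the dict-based storage are all assumptions made for the example, and the threshold is a parameter rather than a fixed 10K.

```python
from collections import defaultdict


class HybridFeed:
    """Toy hybrid fan-out: push for regular authors, pull for celebrities."""

    def __init__(self, celebrity_threshold=10_000):
        self.celebrity_threshold = celebrity_threshold
        self.followers = defaultdict(set)         # author -> users following them
        self.following = defaultdict(set)         # user -> authors they follow
        self.posts_by_author = defaultdict(list)  # author -> [(timestamp, post_id)]
        self.feeds = defaultdict(list)            # user -> precomputed [(timestamp, post_id)]

    def follow(self, user, author):
        self.followers[author].add(user)
        self.following[user].add(author)

    def is_celebrity(self, author):
        return len(self.followers[author]) > self.celebrity_threshold

    def publish(self, author, post_id, timestamp):
        """Store the post; return how many follower feeds were written."""
        self.posts_by_author[author].append((timestamp, post_id))
        if self.is_celebrity(author):
            return 0  # pull model: followers fetch this post at read time
        for follower in self.followers[author]:
            self.feeds[follower].append((timestamp, post_id))  # push model
        return len(self.followers[author])

    def read_feed(self, user, limit=20):
        """Merge the precomputed feed with live posts from followed celebrities."""
        candidates = set(self.feeds[user])
        for author in self.following[user]:
            if self.is_celebrity(author):
                candidates.update(self.posts_by_author[author])
        # The set removes duplicates for authors who crossed the threshold
        # after some of their posts had already been pushed.
        newest_first = sorted(candidates, reverse=True)
        return [post_id for _, post_id in newest_first[:limit]]
```

In this sketch publish() does O(followers) work only for non-celebrity authors, while read_feed() does extra work only for the celebrities a viewer follows, which is the trade-off the hybrid model is built around.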

Question 2

How do you store and serve user feeds efficiently at scale?

Accepted Answer

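Store each user's feed as a Redis sorted set: key=feed:{user_id}, member=post_id, score=ranking_score. Cap each feed at 1000 items by trimming after every write: ZREMRANGEBYRANK feed:{user_id} 0 -1001 removes everything below the 1000 highest scores. To read, use ZREVRANGE (or ZRANGE with the REV option on Redis 6.2 and later, where ZREVRANGE is deprecated) to get the top N post IDs by score, then batch-fetch the post content, for example with MGET, from a Redis post cache (key=post:{post_id}, TTL 1h). This gives a two-level cache: the feed sorted set holds post order and the post cache holds post data, so a post that appears in many feeds is stored once. The database is hit only on a cache miss. For new users with empty feeds, fall back to a pull-model query against the DB.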
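The write and read paths above might look roughly like this. It is a sketch under assumptions: r is taken to be a redis-py client created with decode_responses=True (so IDs come back as str), posts are assumed to be cached as JSON strings, and load_post_from_db is a hypothetical callable supplied by the caller.

```python
import json

FEED_CAP = 1000        # max entries kept per feed
POST_TTL_SECONDS = 3600  # post cache TTL of 1h


def feed_key(user_id):
    return f"feed:{user_id}"


def post_key(post_id):
    return f"post:{post_id}"


def push_to_feed(r, user_id, post_id, score, cap=FEED_CAP):
    """Add one post to a user's feed, then trim the lowest-scored overflow."""
    key = feed_key(user_id)
    r.zadd(key, {post_id: score})
    # Ranks ascend by score, so 0..-(cap+1) is everything below the top `cap`.
    r.zremrangebyrank(key, 0, -(cap + 1))


def read_feed(r, user_id, n, load_post_from_db):
    """Return the top-n posts as dicts, filling post-cache misses from the DB."""
    post_ids = r.zrevrange(feed_key(user_id), 0, n - 1)
    if not post_ids:
        return []  # caller can fall back to a pull-model query here
    cached = r.mget([post_key(pid) for pid in post_ids])
    posts = []
    for pid, raw in zip(post_ids, cached):
        if raw is None:
            post = load_post_from_db(pid)  # hypothetical DB loader
            if post is None:
                continue  # post no longer exists
            r.setex(post_key(pid), POST_TTL_SECONDS, json.dumps(post))
        else:
            post = json.loads(raw)
        posts.append(post)
    return posts
```

A real deployment would probably pipeline the ZADD and ZREMRANGEBYRANK calls to save a round trip; that is left out here to keep the sketch short.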

Question 3

How do you implement feed ranking beyond chronological order?

Accepted Answer

Engagement-weighted score: score = created_at_unix + (log(1+likes)*W_l + log(1+comments)*W_c + log(1+shares)*W_s). Because created_at_unix is in seconds, each weight is effectively the number of seconds of recency that one log-unit of engagement is worth. Weights are tuned empirically (comments are worth more than likes). The log-scale dampening makes the engagement bonus grow slowly, so the steadily increasing timestamps of newer posts eventually overtake even a viral post and it cannot dominate permanently. For personalized ranking, add a bonus based on author relationship strength (how often the viewer has interacted with this author) and content affinity (the ML-predicted probability that the viewer engages with this content type). Scores are computed at fan-out time and stored in the sorted set, so they reflect engagement as of that moment. If needed, re-rank the top 50 posts at read time using real-time personalization signals.
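As a worked instance of the formula, the function below computes the score directly. The three weight values are placeholders chosen only for illustration (the answer says they are tuned empirically), and the function and parameter names are assumptions.

```python
import math

# Hypothetical weights: seconds of recency per log-unit of engagement.
W_LIKE = 1800.0
W_COMMENT = 3600.0
W_SHARE = 5400.0


def engagement_score(created_at_unix, likes, comments, shares,
                     w_like=W_LIKE, w_comment=W_COMMENT, w_share=W_SHARE):
    """score = created_at_unix + sum of log-dampened, weighted engagement."""
    bonus = (
        math.log1p(likes) * w_like          # log1p(x) == log(1 + x)
        + math.log1p(comments) * w_comment
        + math.log1p(shares) * w_share
    )
    return created_at_unix + bonus
```

With these placeholder weights, 1000 likes are worth log(1001) * 1800 seconds, roughly 3.5 hours of recency, so a brand-new post with no engagement outranks that post once it is about 3.5 hours newer.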

Question 4

How do you handle post deletion from feeds at scale?

Accepted Answer

Hard deletion from all follower feeds means removing the post ID from N Redis sorted sets, which is O(followers) operations and too expensive for high-follower accounts. Use tombstones instead: mark the post as deleted in the DB and in the cache (post:{post_id} -> {deleted: true}). The post_id may remain in follower sorted sets, but on feed read tombstoned posts are filtered out of the returned list. A periodic cleanup job scans Redis feeds and removes tombstoned post IDs. This trades a slight memory overhead for fast deletes. Alternatively, store a user-specific deleted_posts set and exclude its members on read.
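The tombstone flow can be illustrated without Redis. In this sketch the post cache is a plain dict keyed by post_id and a feed is a dict of post_id -> score; both, along with the function names, are stand-ins chosen for the example.

```python
def delete_post(post_id, post_cache):
    """O(1) delete: write a tombstone instead of touching follower feeds."""
    post_cache[post_id] = {"deleted": True}


def filter_tombstoned(post_ids, post_cache):
    """Drop IDs whose cached record is a tombstone, preserving feed order."""
    visible = []
    for pid in post_ids:
        record = post_cache.get(pid)
        if record is not None and record.get("deleted"):
            continue
        # Cache misses are kept; the caller would load them from the DB.
        visible.append(pid)
    return visible


def cleanup_feed(feed, post_cache):
    """Periodic job body: remove tombstoned IDs from one feed, return the count."""
    dead = [pid for pid in feed if post_cache.get(pid, {}).get("deleted")]
    for pid in dead:
        del feed[pid]
    return len(dead)
```

Since filtering can leave fewer than the requested number of posts, a reader might fetch a few extra IDs from the feed before filtering; how many extra is a tuning choice this sketch does not address.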

Question 5

How do you handle like counts without a database write per like?

Accepted Answer

Use a Redis counter: INCR like_count:{post_id} on each like and DECR on each unlike. Read the counter from Redis for display, with no DB hit. Periodically (for example every 60 seconds), flush the counters to the DB in a batch: run UPDATE posts SET like_count = %s WHERE post_id = %s for every post whose counter is dirty (changed since the last flush), then mark those counters clean. If Redis fails and restarts, rebuild the counters from the DB; likes recorded since the last flush may be lost. This approach handles viral posts with thousands of likes per second without DB write amplification. The same pattern applies to comment counts and view counts.
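One way to track dirty counters is a Redis set of post IDs that the flush job drains. The sketch below assumes r is a redis-py client created with decode_responses=True and cursor is a DB-API cursor whose driver uses %s placeholders; the dirty_like_counts key and the function names are invented for the example.

```python
DIRTY_KEY = "dirty_like_counts"  # hypothetical set of post IDs changed since last flush


def counter_key(post_id):
    return f"like_count:{post_id}"


def like(r, post_id):
    r.incr(counter_key(post_id))
    r.sadd(DIRTY_KEY, post_id)


def unlike(r, post_id):
    r.decr(counter_key(post_id))
    r.sadd(DIRTY_KEY, post_id)


def flush_like_counts(r, cursor, batch_size=1000):
    """Write up to batch_size dirty counters to the DB; return rows written."""
    # SPOP removes the IDs as it returns them, which marks them clean. A like
    # that arrives after this point re-adds the ID for the next flush.
    dirty = r.spop(DIRTY_KEY, batch_size) or []
    if not dirty:
        return 0
    counts = r.mget([counter_key(pid) for pid in dirty])
    rows = [(int(count or 0), pid) for pid, count in zip(dirty, counts)]
    try:
        cursor.executemany(
            "UPDATE posts SET like_count = %s WHERE post_id = %s", rows
        )
    except Exception:
        r.sadd(DIRTY_KEY, *dirty)  # put the IDs back so the flush is retried
        raise
    return len(rows)
```

A scheduler would call flush_like_counts on the chosen interval and commit the transaction afterwards. The flush writes absolute counts rather than deltas, so repeating it after a partial failure is harmless.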

Low-Level Design: Social Feed System — Fan-Out, Ranking, and Personalization

Core Entities and Fan-Out Strategies

Feed Generation

Feed Ranking

Feed Caching and Invalidation