Design a News Feed (Facebook-style)

Design a news feed system like Facebook’s. This is a richer problem than the Twitter feed design — Facebook’s feed involves mixed media (photos, videos, links), privacy filtering, and a ranking model that goes far beyond chronological ordering.

Requirements Clarification

Scale: 3 billion users, 1.5 billion DAU. Average user has 500 friends.
Feed content: Text posts, photos, videos, shared links, event RSVPs, life events.
Ranking: Not chronological — ranked by relevance (EdgeRank/ML model). User can switch to “Most Recent.”
Privacy: Posts have visibility settings: Public, Friends, Friends of Friends, Custom, Only Me.
Read/write ratio: Heavily read-biased. 1.5B daily users each refresh the feed several times a day, against ~500M posts/day (roughly 30 feed reads per post at 10 loads per user).
Latency: Feed load < 200ms p99.

Back-of-Envelope

500M posts/day → ~6,000 posts/sec
1.5B DAU × 10 feed loads/day = 15B feed views/day → ~175,000 feed reads/sec
Average feed: 20 posts per load. Each post: ~2KB metadata + media URLs. Feed payload: ~40KB before media.
Average user: 500 friends. A post fans out to 500 feeds → 6,000 × 500 = 3M feed insertions/sec on average (peak traffic is higher still)

3M insertions/sec is too expensive for pure fanout-on-write. Facebook uses a hybrid fanout approach.
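The arithmetic above can be reproduced in a few lines. This is only a sanity check of the estimates; the variable names are illustrative, and the inputs are the assumptions already stated in the requirements.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400

# Inputs taken from the requirements above (all are rough estimates).
posts_per_day = 500_000_000
dau = 1_500_000_000
feed_loads_per_user = 10
avg_friends = 500
posts_per_load = 20
bytes_per_post = 2_000  # ~2KB metadata + media URLs

posts_per_sec = posts_per_day / SECONDS_PER_DAY
feed_reads_per_sec = dau * feed_loads_per_user / SECONDS_PER_DAY
fanout_writes_per_sec = posts_per_sec * avg_friends
payload_bytes = posts_per_load * bytes_per_post

print(f"posts/sec:          {posts_per_sec:,.0f}")
print(f"feed reads/sec:     {feed_reads_per_sec:,.0f}")
print(f"fanout writes/sec:  {fanout_writes_per_sec:,.0f}")
print(f"feed payload bytes: {payload_bytes:,}")
```

The unrounded values (about 5,800 posts/sec and 2.9M insertions/sec) round to the figures quoted above. These are daily averages, so capacity planning would add headroom for peaks.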

Fanout Strategy: The Hybrid Approach

Pure fanout-on-write (push model): when Alice posts, immediately write to all 500 friends’ feed caches. Fast reads. Expensive for high-follower accounts: a celebrity or page with 5M followers generates 5M cache writes per post.

Pure fanout-on-read (pull model): don’t precompute. When Bob loads his feed, fetch posts from all 500 friends and merge/rank. No write amplification. Very expensive reads — 500 DB queries per feed load.

Facebook’s hybrid:

Regular users (<5K friends): fanout-on-write. Posts pushed to friends’ feed caches immediately.
High-follower users (celebrities, pages): fanout-on-read. Their posts are not pre-pushed. When you load your feed, the feed service fetches their recent posts and merges.
Cutoff is configurable, typically 5K–10K friends/followers.
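The write-side decision described above can be sketched as follows. This is a minimal in-memory illustration, not Facebook's implementation: `FanoutService`, `FANOUT_THRESHOLD`, and the dictionaries standing in for the feed cache and the pull store are all hypothetical names.

```python
from collections import defaultdict

FANOUT_THRESHOLD = 5_000  # assumed cutoff; the text suggests 5K-10K


class FanoutService:
    def __init__(self, followers):
        self.followers = followers            # author_id -> list of follower ids
        self.feed_cache = defaultdict(list)   # user_id -> post_ids, newest first
        self.pull_posts = defaultdict(list)   # high-follower author -> post_ids

    def publish(self, author_id, post_id):
        audience = self.followers.get(author_id, [])
        if len(audience) >= FANOUT_THRESHOLD:
            # High-follower author: store once, merge at read time.
            self.pull_posts[author_id].append(post_id)
            return "pull"
        # Regular author: push into every follower's precomputed feed.
        for user_id in audience:
            self.feed_cache[user_id].insert(0, post_id)
        return "push"


followers = {
    "alice": ["bob", "carol"],
    "celeb": [f"u{i}" for i in range(5_000)],
}
service = FanoutService(followers)
alice_mode = service.publish("alice", "p1")
celeb_mode = service.publish("celeb", "p2")
print(alice_mode, celeb_mode)
```

Keeping the threshold as a single constant makes the cutoff easy to tune, which matters because the right value depends on cache write capacity.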

Architecture

Post Service → [Kafka] → Fanout Service → Feed Cache (Redis)
                                ↓
                         Privacy Filter
                                ↓
                    Feed Cache: user_id → [post_id list]

Feed Read Path:
User Request → Feed Service → Redis (precomputed feed)
                           → Merge with celebrity posts (pull)
                           → Ranking Service (ML model)
                           → Post Details Service (hydrate metadata)
                           → CDN (media URLs)
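The merge step of the read path can be sketched as below: the precomputed feed and each followed celebrity's recent posts are combined into one candidate list before ranking. This is a simplified illustration that merges by timestamp only; `build_candidates` and its parameters are hypothetical, and a real system would pass the merged candidates to the ranking service.

```python
import heapq
from itertools import islice


def build_candidates(precomputed, celebrity_posts, followed_celebrities, limit=20):
    """Merge pushed and pulled posts, newest first.

    precomputed: list of (timestamp, post_id), already sorted newest first.
    celebrity_posts: author_id -> list of (timestamp, post_id), newest first.
    """
    streams = [precomputed]
    for author_id in followed_celebrities:
        streams.append(celebrity_posts.get(author_id, []))
    # heapq.merge lazily merges already-sorted inputs; reverse=True expects
    # each input to be sorted from largest to smallest.
    merged = heapq.merge(*streams, key=lambda item: item[0], reverse=True)
    return [post_id for _, post_id in islice(merged, limit)]


precomputed = [(105, "a3"), (101, "a1")]
celebrity_posts = {"c1": [(110, "c9"), (103, "c5")]}
candidates = build_candidates(precomputed, celebrity_posts, ["c1"], limit=3)
print(candidates)
```

Because the merge is lazy, only `limit` items are consumed even when the input lists are long.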

Privacy Filtering

Privacy is the hardest part of Facebook’s feed. A post with visibility “Friends of Friends” must be excluded from your feed if the mutual friend relationship was broken since the post was created.

Options:

Pre-filter at fanout time: When writing to feed cache, check privacy at insertion. Stale if Alice changes her post visibility later — cached feeds won’t update.
Post-filter at read time: Always re-check privacy when loading feed. Correct, but expensive — N posts × privacy check per feed load.
Facebook’s approach: Pre-filter at fanout, with invalidation events. When post visibility changes, send an invalidation to affected caches.
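A minimal sketch of the third option (pre-filter at fanout, then invalidate on visibility change) is shown below. The class name, the reverse index from post to viewers, and the `can_view` callback are assumptions made for illustration; a production system would deliver invalidations as events rather than as a direct method call.

```python
from collections import defaultdict


class FeedCacheWithInvalidation:
    def __init__(self):
        self.feeds = defaultdict(list)         # viewer_id -> [post_id]
        self.post_viewers = defaultdict(set)   # post_id -> viewers caching it

    def fanout(self, post_id, candidates, can_view):
        # Pre-filter: only write to feeds of viewers allowed to see the post.
        for viewer in candidates:
            if can_view(viewer):
                self.feeds[viewer].append(post_id)
                self.post_viewers[post_id].add(viewer)

    def on_visibility_change(self, post_id, can_view):
        # Invalidation: re-check only the viewers that hold the post.
        removed = []
        for viewer in sorted(self.post_viewers[post_id]):
            if not can_view(viewer):
                self.feeds[viewer].remove(post_id)
                removed.append(viewer)
        self.post_viewers[post_id].difference_update(removed)
        return removed


cache = FeedCacheWithInvalidation()
cache.fanout("p1", ["bob", "carol", "dave"], can_view=lambda v: v != "dave")
# Alice narrows the post so that only bob may see it.
removed = cache.on_visibility_change("p1", can_view=lambda v: v == "bob")
print(removed, dict(cache.feeds))
```

This sketch only handles narrowing. If visibility is widened, the post would need a fresh fanout to the newly eligible viewers.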

Feed Ranking

Facebook’s original EdgeRank (2010) ranked posts by: Affinity (how close you are to the poster) × Weight (post type: video > photo > link > text) × Time decay. Modern feed ranking is a full ML model — a multi-task neural network that predicts engagement probability (like, comment, share, click) for each post-user pair.
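The EdgeRank-style product above can be written as a small scoring function. The type weights and the exponential half-life decay are assumptions chosen for illustration; the source only specifies the ordering video > photo > link > text and that the score decays with time.

```python
# Assumed weights; only their ordering comes from the description above.
TYPE_WEIGHT = {"video": 4.0, "photo": 3.0, "link": 2.0, "text": 1.0}


def edgerank_score(affinity, post_type, age_hours, half_life_hours=24.0):
    """Affinity x type weight x time decay (exponential decay is an assumption)."""
    decay = 0.5 ** (age_hours / half_life_hours)
    return affinity * TYPE_WEIGHT[post_type] * decay


fresh_text = edgerank_score(affinity=1.0, post_type="text", age_hours=0)
day_old_photo = edgerank_score(affinity=0.5, post_type="photo", age_hours=24)
print(fresh_text, day_old_photo)
```

With a 24-hour half-life, a day-old photo from a moderately close friend scores slightly below a brand-new text post from a very close one, which shows how the three factors trade off.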

In an interview, you don’t need to design the full ML system. Say: “We run a ranking service that scores each candidate post against the user’s engagement history. Top-K posts are returned. The ranking model is trained offline on engagement labels and served via a low-latency feature store + model serving layer.”

Key ranking signals:

Affinity score between user and poster (interaction frequency, mutual friends)
Post type and recency
Predicted engagement probability (from ML model)
Content diversity (don’t show 5 posts from the same friend in a row)
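The content-diversity signal is often applied as a post-processing pass over the ranked list. The greedy pass below is one simple way to do it, offered as a sketch: `diversify` and `max_run` are hypothetical names, and real systems may instead fold diversity into the model's objective.

```python
def diversify(ranked, max_run=2):
    """Reorder (post_id, author_id) pairs, best first, so that no author
    appears more than max_run times in a row when an alternative exists."""
    remaining = list(ranked)
    result = []
    while remaining:
        # Measure how many trailing posts share the same author.
        run_author = result[-1][1] if result else None
        run_len = 0
        for _, author in reversed(result):
            if author != run_author:
                break
            run_len += 1
        pick = 0  # default: take the highest-ranked remaining post
        if run_len >= max_run:
            # Promote the best remaining post from a different author.
            for i, (_, author) in enumerate(remaining):
                if author != run_author:
                    pick = i
                    break
        result.append(remaining.pop(pick))
    return result


ranked = [("p1", "a"), ("p2", "a"), ("p3", "a"), ("p4", "b"), ("p5", "a")]
diverse = diversify(ranked, max_run=2)
print([post_id for post_id, _ in diverse])
```

If every remaining post is from the same author, the pass falls back to the original order rather than dropping posts.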

Feed Cache Design

Redis sorted set per user:
  Key: feed:{user_id}
  Members: post_id (encoded as string)
  Score: ranking_score (float, computed at fanout time for fast reads)

Feed size: cap at 1,000 posts per user in cache (pagination handles older)
TTL: 7 days (inactive users' feeds expire; rebuilt on next load)

On feed read: fetch top-N post_ids from Redis → batch-fetch post metadata from Post Service → filter privacy → re-rank if using dynamic ranking signals → paginate.
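The cache layout and read steps above can be modelled without a Redis server. `InMemoryFeed` below is a hypothetical stand-in for one sorted set; its comments note the Redis commands each method mirrors (ZADD, ZREMRANGEBYRANK, ZREVRANGE). `fetch_posts` and `can_view` are assumed callbacks standing in for the Post Service and the privacy check.

```python
class InMemoryFeed:
    """Stand-in for one Redis sorted set keyed as feed:{user_id}."""

    def __init__(self, max_size=1000):
        self.max_size = max_size
        self.scores = {}  # post_id -> ranking_score

    def add(self, post_id, score):
        # Mirrors ZADD, then ZREMRANGEBYRANK to trim the lowest-scored entry.
        self.scores[post_id] = score
        if len(self.scores) > self.max_size:
            lowest = min(self.scores, key=self.scores.get)
            del self.scores[lowest]

    def top(self, n):
        # Mirrors ZREVRANGE feed:{user_id} 0 n-1 (highest score first).
        return sorted(self.scores, key=self.scores.get, reverse=True)[:n]


def read_feed(feed, fetch_posts, can_view, n=20):
    post_ids = feed.top(n)
    posts = fetch_posts(post_ids)              # one batched metadata fetch
    return [p for p in posts if can_view(p)]   # read-time privacy check


feed = InMemoryFeed(max_size=3)
for post_id, score in [("p1", 0.1), ("p2", 0.9), ("p3", 0.5), ("p4", 0.7)]:
    feed.add(post_id, score)


def fetch_posts(ids):
    hidden = {"p4"}  # pretend p4 was switched to "Only Me" after fanout
    return [{"id": i, "visibility": "only_me" if i in hidden else "friends"} for i in ids]


visible = read_feed(feed, fetch_posts, lambda p: p["visibility"] != "only_me", n=3)
print(feed.top(2), [p["id"] for p in visible])
```

Batching the metadata fetch keeps the read path to one cache call plus one service call per page, rather than one call per post.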

Media Handling

Photos and videos are not stored in the feed cache — only their IDs. At read time, the feed service returns media URLs pointing to CDN (Akamai, Facebook’s own CDN). Video thumbnails are pre-generated at multiple resolutions. Video streaming uses HLS/DASH from Facebook’s video processing pipeline (similar to YouTube’s transcoding pipeline).

Feed Pagination

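Cursor-based pagination, not offset. Why: offset pagination requires scanning and skipping rows, which is O(n), and it can skip or repeat posts when new items land above the offset between requests. Cursor-based: store the last-seen ranking_score (plus the post_id as a tie-breaker) as the cursor. “Give me posts with score less than {cursor}.” In a Redis sorted set this is a range-by-score query, O(log N + M) for M returned posts, so the cost does not grow with how far the user has scrolled.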
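A small sketch of cursor-based paging over a score-ordered feed follows. It uses a sorted Python list and `bisect` in place of a Redis sorted set, and the cursor is the pair (negated score, post_id) so that posts with equal scores are not skipped. The function names are hypothetical.

```python
import bisect


def make_index(entries):
    """entries: iterable of (post_id, score). Negating the score lets an
    ascending sort put the highest-scored posts first; post_id breaks ties."""
    return sorted((-score, post_id) for post_id, score in entries)


def page_after(index, cursor=None, limit=2):
    # Binary search for the first entry strictly after the cursor.
    start = 0 if cursor is None else bisect.bisect_right(index, cursor)
    page = index[start:start + limit]
    next_cursor = page[-1] if page else None
    return [post_id for _, post_id in page], next_cursor


index = make_index([("p1", 0.9), ("p2", 0.7), ("p3", 0.7), ("p4", 0.2)])
page1, cursor1 = page_after(index)
page2, cursor2 = page_after(index, cursor1)
page3, cursor3 = page_after(index, cursor2)
print(page1, page2, page3)
```

Here p2 and p3 share a score of 0.7. A cursor holding only the score would resume at "score less than 0.7" and silently drop p3, which is why the post_id is part of the cursor.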

Key Differences from Twitter Feed Design

Aspect	Twitter/X	Facebook News Feed
Default sort	Chronological + algorithmic	Algorithmic (EdgeRank/ML)
Privacy	Public by default	Complex visibility rules per post
Content types	Tweets (text/media)	Posts, events, life events, pages
Celebrity problem	Hybrid (push for most accounts, pull for very large ones)	Hybrid by follower count
Engagement	Likes/RTs/replies	Reactions, comments, shares, saves

Interview Follow-ups

How do you handle feed freshness — a user posts and expects to see it in their own feed immediately?
How would you design the “Memories” feature (posts from N years ago)?
How do you prevent the ranking model from creating filter bubbles?
Your fanout service falls behind during a viral event. How do you handle backpressure?
How do you handle a user who suddenly gains 10M followers (goes viral)?
