Design a news feed system like Facebook’s. This is a richer problem than the Twitter feed design — Facebook’s feed involves mixed media (photos, videos, links), privacy filtering, and a ranking model that goes far beyond chronological ordering.
Requirements Clarification
- Scale: 3 billion users, 1.5 billion DAU. Average user has 500 friends.
- Feed content: Text posts, photos, videos, shared links, event RSVPs, life events.
- Ranking: Not chronological — ranked by relevance (EdgeRank/ML model). User can switch to “Most Recent.”
- Privacy: Posts have visibility settings: Public, Friends, Friends of Friends, Custom, Only Me.
- Read/write ratio: Heavily read-biased. 1.5B DAU refresh their feeds multiple times a day, versus ~500M new posts/day.
- Latency: Feed load < 200ms p99.
Back-of-Envelope
- 500M posts/day → ~6,000 posts/sec
- 1.5B DAU × 10 feed loads/day = 15B feed views/day → ~175,000 feed reads/sec
- Average feed: 20 posts per load. Each post: ~2KB metadata + media URLs. Feed payload: ~40KB before media.
- Average user: 500 friends. A post fans out to 500 feeds → 6,000 × 500 = 3M feed insertions/sec on average, and more at peak
3M insertions/sec is too expensive for pure fanout-on-write. Facebook uses a hybrid fanout approach.
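The estimates above are easy to re-derive. The following Python sketch recomputes them from the stated inputs. The variable names are illustrative, and the exact quotients land slightly under the rounded figures used in the bullets.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400

# Inputs taken from the requirements above.
posts_per_day = 500_000_000
daily_active_users = 1_500_000_000
feed_loads_per_user = 10
avg_friends = 500
posts_per_load = 20
bytes_per_post = 2_000  # ~2KB of metadata and media URLs

posts_per_sec = posts_per_day / SECONDS_PER_DAY
feed_reads_per_sec = daily_active_users * feed_loads_per_user / SECONDS_PER_DAY
fanout_writes_per_sec = posts_per_sec * avg_friends
payload_kb = posts_per_load * bytes_per_post / 1_000
reads_per_post = feed_reads_per_sec / posts_per_sec

print(f"posts/sec:          {posts_per_sec:,.0f}")
print(f"feed reads/sec:     {feed_reads_per_sec:,.0f}")
print(f"fanout writes/sec:  {fanout_writes_per_sec:,.0f}")
print(f"feed payload (KB):  {payload_kb:.0f}")
print(f"feed reads per post: {reads_per_post:.0f}")
```

The last figure is the useful one for the fanout discussion: each post is read far more often than it is written, but pushing it to every friend multiplies the write rate by the average friend count.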
Fanout Strategy: The Hybrid Approach
Pure fanout-on-write (push model): when Alice posts, immediately write to all 500 friends’ feed caches. Fast reads. Expensive for high-follower users: a celebrity or page with 5M followers generates 5M cache writes per post.
Pure fanout-on-read (pull model): don’t precompute. When Bob loads his feed, fetch posts from all 500 friends and merge/rank. No write amplification. Very expensive reads — 500 DB queries per feed load.
Facebook’s hybrid:
- Regular users (<5K friends): fanout-on-write. Posts pushed to friends’ feed caches immediately.
- High-follower users (celebrities, pages): fanout-on-read. Their posts are not pre-pushed. When you load your feed, the feed service fetches their recent posts and merges.
- Cutoff is configurable, typically 5K–10K friends/followers.
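The push-or-pull decision described in the list above can be sketched in a few lines. This is an in-memory illustration, not Facebook's implementation: the class name `HybridFanout`, its attributes, and the default threshold of 5,000 are assumptions chosen to mirror the text.

```python
from collections import defaultdict

FANOUT_THRESHOLD = 5_000  # assumed cutoff; the text gives 5K-10K as typical


class HybridFanout:
    """Toy model of hybrid fanout: push for small audiences, pull for large ones."""

    def __init__(self, threshold=FANOUT_THRESHOLD):
        self.threshold = threshold
        self.followers = defaultdict(set)    # author_id -> follower ids
        self.feed_cache = defaultdict(list)  # user_id -> pushed post ids
        self.pull_posts = defaultdict(list)  # high-follower author -> own post ids

    def follow(self, follower_id, author_id):
        self.followers[author_id].add(follower_id)

    def publish(self, author_id, post_id):
        audience = self.followers[author_id]
        if len(audience) < self.threshold:
            # Fanout-on-write: one cache insertion per follower.
            for user_id in audience:
                self.feed_cache[user_id].append(post_id)
            return "push"
        # Fanout-on-read: store once; readers fetch and merge at load time.
        self.pull_posts[author_id].append(post_id)
        return "pull"
```

With a lowered threshold for demonstration, `HybridFanout(threshold=3)` pushes a post from an author with two followers and stores a post from an author with three followers for read-time merging.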
Architecture
Feed Write Path:
Post Service → [Kafka] → Fanout Service → Privacy Filter → Feed Cache (Redis)
Feed Cache: user_id → [post_id list]
Feed Read Path:
User Request → Feed Service → Redis (precomputed feed)
    → Merge with celebrity posts (pull)
    → Ranking Service (ML model)
    → Post Details Service (hydrate metadata)
    → CDN (media URLs)
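The read path can be sketched as a single function that follows the same stages: precomputed candidates, pulled celebrity posts, ranking, and hydration. Every name here (`load_feed`, the dictionary-shaped stores, the `score` callable) is a hypothetical stand-in for the services in the diagram.

```python
import heapq


def load_feed(user_id, feed_cache, celebrity_follows, recent_posts_by_author,
              score, post_store, limit=20):
    """Return the top `limit` hydrated posts for one user.

    feed_cache:             user_id -> post ids pushed at write time
    celebrity_follows:      user_id -> high-follower authors the user follows
    recent_posts_by_author: author_id -> that author's recent post ids
    score:                  callable (user_id, post_id) -> float
    post_store:             post_id -> post metadata
    """
    # 1. Precomputed candidates from the feed cache.
    candidate_ids = set(feed_cache.get(user_id, []))

    # 2. Pull step: merge in recent posts from followed high-follower accounts.
    for author_id in celebrity_follows.get(user_id, []):
        candidate_ids.update(recent_posts_by_author.get(author_id, []))

    # 3. Ranking: keep only the best `limit` candidates.
    top_ids = heapq.nlargest(limit, candidate_ids,
                             key=lambda post_id: score(user_id, post_id))

    # 4. Hydration: swap ids for full post metadata (media URLs live here).
    return [post_store[post_id] for post_id in top_ids]
```

Using a set for the candidates deduplicates posts that arrive through both the push and pull routes, and `heapq.nlargest` avoids sorting the whole candidate list when only one page is needed.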
Privacy Filtering
Privacy is the hardest part of Facebook’s feed. A post with visibility “Friends of Friends” must be excluded from your feed if the mutual friend relationship was broken since the post was created.
Options:
- Pre-filter at fanout time: When writing to feed cache, check privacy at insertion. Stale if Alice changes her post visibility later — cached feeds won’t update.
- Post-filter at read time: Always re-check privacy when loading feed. Correct, but expensive — N posts × privacy check per feed load.
- Facebook’s approach: Pre-filter at fanout, with invalidation events. When post visibility changes, send an invalidation to affected caches.
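A minimal sketch of pre-filtering with invalidation, under stated assumptions: the class `FeedCacheWithInvalidation`, the reverse index `recipients`, and the `can_view` callable are all hypothetical, and a real system would deliver invalidations asynchronously rather than by direct method call.

```python
from collections import defaultdict


class FeedCacheWithInvalidation:
    """Pre-filter at fanout time, then evict on visibility changes."""

    def __init__(self):
        self.feeds = defaultdict(set)       # user_id -> cached post ids
        self.recipients = defaultdict(set)  # post_id -> users holding it in cache

    def fanout(self, post_id, audience, can_view):
        # Privacy is checked once per recipient, at insertion.
        for user_id in audience:
            if can_view(user_id, post_id):
                self.feeds[user_id].add(post_id)
                self.recipients[post_id].add(user_id)

    def on_visibility_change(self, post_id, can_view):
        # Invalidation event: re-check only the caches that hold this post.
        for user_id in list(self.recipients[post_id]):
            if not can_view(user_id, post_id):
                self.feeds[user_id].discard(post_id)
                self.recipients[post_id].discard(user_id)
```

The reverse index is what keeps invalidation cheap, since only caches known to contain the post are touched. This sketch handles narrowing of visibility only; widening it (for example, Friends to Public) would need a fresh fanout to the newly eligible users.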
Feed Ranking
Facebook’s original EdgeRank (2010) ranked posts by: Affinity (how close you are to the poster) × Weight (post type: video > photo > link > text) × Time decay. Modern feed ranking is a full ML model — a multi-task neural network that predicts engagement probability (like, comment, share, click) for each post-user pair.
In an interview, you don’t need to design the full ML system. Say: “We run a ranking service that scores each candidate post against the user’s engagement history. Top-K posts are returned. The ranking model is trained offline on engagement labels and served via a low-latency feature store + model serving layer.”
Key ranking signals:
- Affinity score between user and poster (interaction frequency, mutual friends)
- Post type and recency
- Predicted engagement probability (from ML model)
- Content diversity (don’t show 5 posts from the same friend in a row)
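To make the signals concrete, here is a small sketch of an EdgeRank-style score plus a greedy diversity pass. The type weights, the exponential decay with a 24-hour half-life, and the `max_run` limit are illustrative assumptions, not Facebook's actual values or formula.

```python
# Illustrative weights only; the text gives the ordering, not the numbers.
TYPE_WEIGHT = {"video": 4.0, "photo": 3.0, "link": 2.0, "text": 1.0}
HALF_LIFE_HOURS = 24.0  # assumed: score halves every 24 hours


def edge_score(affinity, post_type, age_hours):
    """Affinity x Weight x Time decay, in the spirit of EdgeRank."""
    decay = 0.5 ** (age_hours / HALF_LIFE_HOURS)
    return affinity * TYPE_WEIGHT[post_type] * decay


def diversify(ranked, max_run=2):
    """Reorder (author, post_id) pairs so no author fills more than
    `max_run` consecutive slots, when an alternative exists."""
    remaining = list(ranked)
    result = []
    while remaining:
        tail = [author for author, _ in result[-max_run:]]
        # An author is blocked if they already own the last `max_run` slots.
        blocked = tail[0] if len(tail) == max_run and len(set(tail)) == 1 else None
        # Take the highest-ranked post by a non-blocked author, else the top post.
        pick = next((i for i, (author, _) in enumerate(remaining)
                     if author != blocked), 0)
        result.append(remaining.pop(pick))
    return result
```

For example, `edge_score(0.5, "video", 24)` evaluates to 1.0 under these assumed weights: an affinity of 0.5, times a video weight of 4.0, times one half-life of decay.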
Feed Cache Design
Redis sorted set per user:
Key: feed:{user_id}
Members: post_id (encoded as string)
Score: ranking_score (float, computed at fanout time for fast reads)
Feed size: cap at 1,000 posts per user in cache (pagination handles older)
TTL: 7 days (inactive users' feeds expire; rebuilt on next load)
On feed read: fetch top-N post_ids from Redis → batch-fetch post metadata from Post Service → filter privacy → re-rank if using dynamic ranking signals → paginate.
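The cache layout above maps onto a handful of sorted-set commands. The sketch below assumes `client` is a redis-py (3.0 or later) `redis.Redis` instance, or anything exposing the same `zadd`, `zremrangebyrank`, `zrevrange`, `ttl`, and `expire` methods. The function names are hypothetical, and refreshing the TTL on reads is one possible way to make only inactive users' feeds expire.

```python
FEED_CAP = 1_000                     # max posts kept per user
FEED_TTL_SECONDS = 7 * 24 * 60 * 60  # 7 days


def feed_key(user_id):
    return f"feed:{user_id}"


def push_to_feed(client, user_id, post_id, ranking_score):
    """Insert one post into a user's feed and trim it to FEED_CAP entries."""
    key = feed_key(user_id)
    client.zadd(key, {str(post_id): ranking_score})
    # Ranks are ascending by score, so ranks 0 .. -(FEED_CAP + 1) are the
    # lowest-scored overflow. This removes nothing while the set is under the cap.
    client.zremrangebyrank(key, 0, -(FEED_CAP + 1))
    # ttl() returns -1 when the key exists without an expiry (a new feed).
    if client.ttl(key) == -1:
        client.expire(key, FEED_TTL_SECONDS)


def top_post_ids(client, user_id, n=20):
    """Return the n highest-scored post ids and mark the feed as active."""
    key = feed_key(user_id)
    post_ids = client.zrevrange(key, 0, n - 1)
    client.expire(key, FEED_TTL_SECONDS)  # a read restarts the 7-day clock
    return post_ids
```

The write path is three round trips here for readability. A production version would likely batch them in a pipeline or a Lua script so that the insert, trim, and expiry check happen together.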
Media Handling
Photos and videos are not stored in the feed cache — only their IDs. At read time, the feed service returns media URLs pointing to CDN (Akamai, Facebook’s own CDN). Video thumbnails are pre-generated at multiple resolutions. Video streaming uses HLS/DASH from Facebook’s video processing pipeline (similar to YouTube’s transcoding pipeline).
Feed Pagination
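Cursor-based pagination, not offset. Why: offset pagination requires scanning and skipping rows, which is O(n), and it shows duplicates or gaps when posts are inserted between page loads. Cursor-based: store the last-seen ranking_score as the cursor, plus the post_id as a tie-breaker if scores can collide. “Give me posts with score less than {cursor}.” In a Redis sorted set this is a range-by-score query costing O(log N + M) for M returned posts, not O(1), but the cost does not grow with page depth.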
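A small in-memory sketch of cursor pagination over a score-ordered feed. It uses a sorted Python list and `bisect` as a stand-in for the sorted set, and the cursor is a `(score, post_id)` pair so that equal scores still page deterministically. The function name `page` and the tuple layout are assumptions for illustration.

```python
import bisect


def page(feed, cursor=None, limit=20):
    """Return (items, next_cursor) for one page, highest score first.

    feed:   list of (score, post_id) tuples sorted ascending
    cursor: the last (score, post_id) the client saw, or None for page one
    """
    # Everything strictly below the cursor is still unseen.
    end = len(feed) if cursor is None else bisect.bisect_left(feed, cursor)
    start = max(end - limit, 0)
    items = feed[start:end][::-1]
    # No cursor is returned once the oldest entry has been served.
    next_cursor = items[-1] if start > 0 else None
    return items, next_cursor
```

Because the cursor names a position in score space rather than a row count, a post inserted at the top of the feed between two requests does not shift the next page, which is the failure mode of offset pagination.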
Key Differences from Twitter Feed Design
| Aspect | Twitter/X | Facebook News Feed |
|---|---|---|
| Default sort | Chronological + algorithmic | Algorithmic (EdgeRank/ML) |
| Privacy | Public by default | Complex visibility rules per post |
| Content types | Tweets (text/media) | Posts, events, life events, pages |
| Celebrity problem | Fan-out on write, with high-follower accounts merged at read time | Hybrid by follower count |
| Engagement | Likes/RTs/replies | Reactions, comments, shares, saves |
Interview Follow-ups
- How do you handle feed freshness — a user posts and expects to see it in their own feed immediately?
- How would you design the “Memories” feature (posts from N years ago)?
- How do you prevent the ranking model from creating filter bubbles?
- Your fanout service falls behind during a viral event. How do you handle backpressure?
- How do you handle a user who suddenly gains 10M followers (goes viral)?
Related System Design Topics
- Caching Strategies — the Redis sorted set feed cache is the read path’s foundation
- Message Queues — Kafka fan-out pipeline from post creation to feed cache insertion
- Database Sharding — sharding user and post data across the feed graph
- Design Twitter / Social Feed — compare fanout-on-write vs read; Facebook’s hybrid approach
- Load Balancing — distributing feed read requests across feed service nodes