System Design: Design a Social Media Feed (Twitter/Instagram)
News feed design is one of the most comprehensive system design problems — it combines social graph traversal, content ranking, fan-out, caching, and real-time updates. Understanding the push vs. pull vs. hybrid approaches is key to a strong answer.
Requirements
Functional: user follows others, sees recent posts from followed users, ranked by recency or relevance, infinite scroll, support for photos/video/text.
Non-functional: serve feeds in <200ms, 500M DAU, 500K posts per second at peak, eventual consistency acceptable, different SLA for celebrities vs. regular users.
Two Core Approaches
Fan-Out on Write (Push Model)
When user A posts, immediately write to the feed list of every follower. Feed reads are O(1) — just read user’s pre-built feed list from cache.
- Pros: instant feed reads, simple read path
- Cons: write amplification — a celebrity with 50M followers generates 50M writes per post. Unusable for high-follower accounts.
Fan-Out on Read (Pull Model)
When user opens feed, query recent posts from all followed users and merge-sort them. No write amplification.
- Pros: no write amplification, storage-efficient
- Cons: high read latency — user follows 500 people, must query 500 users’ post lists and merge. Unusable for high fan-out readers.
Hybrid (Twitter/Instagram approach)
Fan-out on write for regular users (≤ 1000 followers). Pull on read for celebrities (> 1000 followers). On feed read, merge pre-built feed with real-time pull from followed celebrities.
def get_feed(user_id, page):
# 1. Get pre-built feed from Redis (fan-out on write from regular users)
feed = redis.lrange(f"feed:{user_id}", 0, 99)
# 2. Get followed celebrities
celebrities = get_followed_celebrities(user_id) # followers > threshold
# 3. Pull recent posts from celebrities
celeb_posts = []
for celeb_id in celebrities:
posts = get_recent_posts(celeb_id, limit=20)
celeb_posts.extend(posts)
# 4. Merge and rank
all_posts = feed + celeb_posts
return sorted(all_posts, key=lambda p: p.timestamp, reverse=True)[:20]
Data Model
-- Users and social graph
users(id, username, follower_count, following_count, ...)
follows(follower_id, followee_id, created_at)
-- Posts
posts(id, user_id, content, media_url, created_at, like_count, comment_count)
-- Feed cache (Redis)
feed:{user_id} → sorted list of post IDs (newest first)
post:{post_id} → cached post object (hash)
user:{user_id} → cached user object (hash)
Fan-Out Implementation
# When user posts
def on_new_post(post_id, author_id):
post = get_post(post_id)
# Cache the post
redis.hmset(f"post:{post_id}", post.__dict__)
redis.expire(f"post:{post_id}", 86400)
if get_follower_count(author_id) > CELEB_THRESHOLD:
return # celebrities: pull-based, skip fan-out
# Fan-out to followers (async via Kafka)
followers = get_followers(author_id) # paginated
for follower_id in followers:
kafka.publish("fanout", {
"post_id": post_id,
"user_id": follower_id
})
# Fanout worker
def process_fanout(post_id, user_id):
redis.lpush(f"feed:{user_id}", post_id)
redis.ltrim(f"feed:{user_id}", 0, 999) # keep latest 1000
Feed Ranking
Pure recency is simple but engagement-based ranking increases time-on-platform:
- Recency: sort by post timestamp. Predictable, transparent to users.
- Engagement: score = f(recency, likes, comments, shares, author relationship). Facebook/Instagram use ML models trained on click-through rate.
- Interest graph: boost posts from users you interact with frequently (reply, like) vs. just follow.
Scale Numbers
- Twitter: ~500M tweets/day, ~200M DAU, Elon Musk (~100M followers) generates 100M fan-out writes per tweet → pull-based for celebrities
- Instagram: 100M photos/day, ~2B MAU
- Facebook: 100B feed impressions/day
Interview Checklist
- Start with the write path: post → fan-out workers → Redis feed lists
- Explain the celebrity problem and the hybrid push+pull solution
- Draw the read path: Redis feed + real-time celebrity posts → merge → rank
- Discuss feed ranking: recency vs. engagement scoring
- Address storage: Redis for feeds (1000 posts max), CDN for media
{
“@context”: “https://schema.org”,
“@type”: “FAQPage”,
“mainEntity”: [
{
“@type”: “Question”,
“name”: “What is fan-out on write and when does it fail?”,
“acceptedAnswer”: { “@type”: “Answer”, “text”: “Fan-out on write (push model) means when a user posts, the system immediately writes that post to every follower's feed list in cache. Feed reads become O(1). It fails when users have massive follower counts (celebrities): a single post by a user with 100M followers triggers 100M cache writes, causing severe write amplification and latency spikes. Twitter, Instagram, and Facebook use a hybrid approach: push for regular users (<1000-10000 followers) and pull for celebrities.” }
},
{
“@type”: “Question”,
“name”: “How does the hybrid push-pull feed model work?”,
“acceptedAnswer”: { “@type”: “Answer”, “text”: “Regular users' posts fan out to followers' pre-built feed lists in Redis. Celebrity posts (above a follower threshold) are not fanned out. When a user reads their feed, the system merges: (1) their pre-built feed from Redis (from regular users they follow) and (2) real-time pull of recent posts from celebrities they follow. The two lists are merged by timestamp and returned to the client. This eliminates write amplification for celebrities while keeping read latency low for the common case.” }
},
{
“@type”: “Question”,
“name”: “How should news feed ranking work beyond simple recency?”,
“acceptedAnswer”: { “@type”: “Answer”, “text”: “Engagement-based ranking trains an ML model on user interactions (click, like, comment, share, time spent) to predict the probability a user engages with each candidate post. Features include: author relationship strength (direct message vs. follow), post recency, post format (video outperforms text), historical engagement with similar content, and diversity (avoid showing 10 consecutive posts from the same author). Facebook's EdgeRank and Instagram's feed algorithm use this approach. For interview purposes, explain the signals and scoring formula; exact ML details are not expected.” }
}
]
}