Designing a social media feed like Twitter is one of the most comprehensive system design problems. The core challenge: how do you efficiently deliver a personalized timeline for 300M users when anyone can have 100M followers?
Core Feed Delivery Approaches
Option A: Pull-based (Fan-out on Read)
When user opens app:
SELECT tweets FROM tweets WHERE author_id IN (following_list)
ORDER BY created_at DESC LIMIT 20
Problem: a user following 1,000 accounts forces the DB to scan 1,000 authors' tweet lists on every timeline load (and N separate queries once tweets are sharded by author)
Reads are slow, especially for users following many accounts
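The read path above amounts to a k-way merge of each followed author's recent tweets. A minimal sketch, assuming the per-author lists are already in memory (in the real pull model each list is a DB read, which is exactly the per-load cost described above):

```python
import heapq
from itertools import islice

def pull_timeline(tweets_by_author, limit=20):
    """k-way merge of each followed author's newest-first tweet list."""
    merged = heapq.merge(
        *tweets_by_author.values(),
        key=lambda t: t["created_at"],
        reverse=True,  # inputs are newest-first, output stays newest-first
    )
    return list(islice(merged, limit))

# One list per followed account, newest first (created_at as epoch seconds).
feeds = {
    "alice": [{"id": 3, "created_at": 300}],
    "bob":   [{"id": 2, "created_at": 200}, {"id": 1, "created_at": 100}],
}
timeline = pull_timeline(feeds, limit=2)
# → tweets 3 and 2, the two newest across both authors
```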
Option B: Push-based (Fan-out on Write)
When user posts a tweet:
For each of their N followers: INSERT tweet_id into follower's timeline cache
User timeline = pre-built list in Redis → O(1) read
Problem: celebrity with 100M followers posts → 100M inserts → latency spike
Option C: Hybrid (Twitter's actual approach)
Regular users (< 1M followers): fan-out on write → tweets pushed into followers' timeline caches
Celebrities (> 1M followers): fan-out on read → their tweets fetched at read time
User timeline = pre-built cache + live merge of celebrity tweets
Fan-out on Write: Implementation
Post tweet flow:
1. Store tweet in tweets DB: {id, author_id, content, created_at}
2. Publish to Kafka: "tweet.created" event
3. Fan-out worker consumes event:
- Fetch follower list (from follower graph store)
- For each follower (if not celebrity): LPUSH timeline:{follower_id} tweet_id
- Redis: list of tweet_ids, max 800 per user (trim older entries)
4. For celebrity accounts: skip fan-out, handle at read time
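The worker's core loop from steps 3–4 can be sketched as follows, with a plain dict standing in for the Redis lists (the 1M threshold and 800-entry cap come from the text; the function name is illustrative):

```python
from collections import defaultdict

TIMELINE_CAP = 800            # max tweet_ids kept per user (step 3)
CELEBRITY_THRESHOLD = 1_000_000

# timelines[user_id] is newest-first, like a Redis list written with LPUSH
timelines = defaultdict(list)

def handle_tweet_created(tweet_id, author_follower_count, followers):
    """Consume one "tweet.created" event and fan the tweet out."""
    if author_follower_count >= CELEBRITY_THRESHOLD:
        return  # step 4: celebrity, skip fan-out, merged at read time instead
    for follower_id in followers:
        tl = timelines[follower_id]
        tl.insert(0, tweet_id)       # LPUSH timeline:{follower_id} tweet_id
        del tl[TIMELINE_CAP:]        # trim older entries past 800

handle_tweet_created(42, author_follower_count=3, followers=[1, 2, 3])
handle_tweet_created(99, author_follower_count=100_000_000, followers=[1])
# follower 1's timeline holds only tweet 42: the celebrity tweet was skipped
```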
Timeline read:
LRANGE timeline:{user_id} 0 19 → 20 tweet_ids (cheap constant-size read from the head of the list)
Fetch tweet content: MGET tweet:{id} for each tweet_id
Merge celebrity tweets: fetch from celebrities' own tweet lists, sort by time
Return merged + ranked timeline
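The merge step above leans on Snowflake IDs being time-sortable: sorting merged IDs descending is the same as sorting by time. A sketch of the read path, assuming the pre-built timeline and the celebrities' recent tweet IDs are already fetched:

```python
def read_timeline(prebuilt_ids, celebrity_feeds, page_size=20):
    """Merge the user's pre-built timeline with live celebrity tweets.

    Snowflake IDs are time-sortable, so sorting by ID sorts by time."""
    merged = set(prebuilt_ids)           # dedupe across sources
    for feed in celebrity_feeds:
        merged.update(feed[:10])         # e.g. LRANGE celebrity:tweets:{id} 0 9
    return sorted(merged, reverse=True)[:page_size]

# Pre-built cache has IDs 105, 103, 101; one followed celebrity posted 104, 102.
page = read_timeline([105, 103, 101], [[104, 102]], page_size=4)
# → [105, 104, 103, 102], newest first across both sources
```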
Tweet Storage
tweets table (PostgreSQL / DynamoDB):
id BIGINT PRIMARY KEY -- Snowflake ID (time-sortable)
author_id BIGINT
content TEXT (280 chars)
media_ids UUID[] -- array of media attachments
reply_to_id BIGINT NULL
retweet_id BIGINT NULL
like_count BIGINT DEFAULT 0
retweet_count BIGINT DEFAULT 0
reply_count BIGINT DEFAULT 0
created_at TIMESTAMPTZ
Snowflake ID (Twitter's approach):
64-bit integer:
41 bits: milliseconds since epoch (69-year range)
5 bits: datacenter ID
5 bits: worker ID
12 bits: sequence number (4096 per millisecond per worker)
Benefits: time-sortable, unique globally, no central coordinator
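A minimal single-process generator for the 41/5/5/12 layout above (the epoch constant is Twitter's documented custom epoch; production generators also handle clock rollback, which this sketch omits):

```python
import threading
import time

class Snowflake:
    """64-bit ID: 41-bit ms timestamp | 5-bit datacenter | 5-bit worker | 12-bit seq."""

    EPOCH = 1288834974657  # Twitter's custom epoch (Nov 2010), in ms

    def __init__(self, datacenter_id, worker_id):
        assert 0 <= datacenter_id < 32 and 0 <= worker_id < 32  # 5 bits each
        self.datacenter_id, self.worker_id = datacenter_id, worker_id
        self.last_ms, self.seq = -1, 0
        self.lock = threading.Lock()

    def next_id(self):
        with self.lock:
            ms = int(time.time() * 1000) - self.EPOCH
            if ms == self.last_ms:
                self.seq = (self.seq + 1) & 0xFFF   # 4096 IDs per ms per worker
                if self.seq == 0:                    # sequence exhausted:
                    while ms <= self.last_ms:        # spin to the next millisecond
                        ms = int(time.time() * 1000) - self.EPOCH
            else:
                self.seq = 0
            self.last_ms = ms
            return (ms << 22) | (self.datacenter_id << 17) | (self.worker_id << 12) | self.seq

gen = Snowflake(datacenter_id=1, worker_id=1)
a, b = gen.next_id(), gen.next_id()
# IDs are strictly increasing, so sorting by ID sorts by creation time
```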
Tweet cache (Redis Hash):
Key: tweet:{id}
Fields: author_id, content, created_at, like_count, retweet_count
TTL: 24 hours for recent tweets; no cache for old tweets
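Timeline reads hydrate tweet IDs through this cache with an MGET-style batch lookup, falling back to the DB on misses. A sketch with a dict standing in for Redis (`hydrate` and `fetch_from_db` are illustrative names):

```python
def hydrate(tweet_ids, cache, fetch_from_db):
    """Batch-read tweets: cache hits first, one DB fetch for all misses."""
    tweets, misses = {}, []
    for tid in tweet_ids:
        hit = cache.get(f"tweet:{tid}")
        if hit is not None:
            tweets[tid] = hit
        else:
            misses.append(tid)
    if misses:
        for tid, row in fetch_from_db(misses).items():
            cache[f"tweet:{tid}"] = row  # the real Redis write would set a 24h TTL
            tweets[tid] = row
    return [tweets[tid] for tid in tweet_ids]

cache = {"tweet:1": {"content": "hi"}}
db = {2: {"content": "old"}}
result = hydrate([1, 2], cache, lambda ids: {i: db[i] for i in ids})
# tweet 1 came from cache, tweet 2 from the DB and is now backfilled
```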
Social Graph Storage
Following/follower relationships:
Option A: Relational DB
follows table: (follower_id, followee_id, created_at)
"Who does user A follow?" → SELECT followee_id WHERE follower_id = A (indexed)
"Who follows user A?" → SELECT follower_id WHERE followee_id = A (indexed)
Sharding: by follower_id (write), replicate for followee_id queries
Option B: Graph DB (Neo4j) — better for complex queries
MATCH (u:User {id: A})-[:FOLLOWS]->(friends)
MATCH (friends)-[:FOLLOWS]->(fof) -- friends of friends
Overkill for simple follow/following; useful for recommendations
Follower count caching:
Redis: INCR follower_count:{user_id} on follow
DECR follower_count:{user_id} on unfollow
Sync to DB every 60 seconds (eventual consistency for counts)
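The counter pattern above can be sketched as a small class, with dicts standing in for Redis and the DB (the class and method names are assumptions; only the INCR/DECR-plus-periodic-sync shape comes from the text):

```python
from collections import defaultdict

class FollowerCounts:
    """Redis-style counters with a periodic write-through to the DB.

    Reads from the DB may lag the cache by up to one sync interval:
    eventual consistency, as described above."""

    def __init__(self):
        self.cache = defaultdict(int)  # stands in for Redis counters
        self.db = defaultdict(int)

    def follow(self, user_id):
        self.cache[user_id] += 1       # INCR follower_count:{user_id}

    def unfollow(self, user_id):
        self.cache[user_id] -= 1       # DECR follower_count:{user_id}

    def sync(self):
        """Run every ~60 seconds: flush cached counts to the DB."""
        for user_id, n in self.cache.items():
            self.db[user_id] = n

counts = FollowerCounts()
counts.follow(7); counts.follow(7); counts.unfollow(7)
counts.sync()
# counts.db[7] is now 1, matching the cache
```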
Large follower lists:
A celebrity's 100M followers in a Redis Set: 100M × 8 bytes ≈ 800MB of raw IDs per celebrity (more in practice with Redis's per-member overhead)
Use Redis Cluster; the fan-out worker streams through followers with SSCAN rather than loading the whole set
Batch size: 1000 followers per Kafka message → parallelizes fan-out
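The batching step can be sketched as a generator that turns the SSCAN stream into fixed-size fan-out messages:

```python
def batches(followers, size=1000):
    """Chunk a follower stream into fan-out batches of `size`,
    mirroring SSCAN pages turned into Kafka messages."""
    batch = []
    for follower_id in followers:
        batch.append(follower_id)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:            # flush the final partial batch
        yield batch

msgs = list(batches(range(2500), size=1000))
# → 3 messages: 1000, 1000, and 500 followers
```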
Trending Topics
Trending = hashtags/topics with spike in volume vs baseline
Pipeline:
Tweets → Kafka → Flink (sliding 5-minute window):
COUNT(hashtag) per window
Compare to 7-day average for same hour
Spike score = current / baseline
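The spike score itself is a one-liner; the only subtlety is guarding against a near-zero baseline for brand-new hashtags (the `min_baseline` smoothing below is an assumption, not from the text):

```python
def spike_score(current_count, baseline_count, min_baseline=10):
    """Spike score = current 5-min volume / 7-day hourly baseline.

    min_baseline keeps hashtags with no history from producing
    unbounded scores."""
    return current_count / max(baseline_count, min_baseline)

spike_score(500, 50)  # ten times the usual volume → strongly trending
spike_score(500, 2)   # baseline clamped to 10, score capped at 50
```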
Trending list:
Redis sorted set: ZADD trending score:hashtag (per region)
Updated every 5 minutes
Top 10 trending returned to client
Personalized trending:
Filter global trending by user's language, location, and interests
"Trending in United States" vs "Worldwide"
Timeline Ranking (Beyond Chronological)
Raw chronological timeline → ranked timeline
Ranking factors:
Recency: recent tweets score higher
Engagement: tweets with high like/retweet velocity score higher
Author affinity: tweets from frequently-interacted authors score higher
Media: photos/videos get mild boost
Diversity: cap same-author at 2 consecutive tweets
Model: two-tower retrieval → lightweight ranker (gradient boosted trees)
Features: (tweet, user) pairs → rank score
Latency budget: < 50ms for ranking 200 candidate tweets
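To make the ranking factors concrete, here is an illustrative hand-weighted linear score; the real ranker described above is a gradient-boosted-tree model, so the weights and half-life below are assumptions for demonstration only:

```python
import math

# Hand-picked illustrative weights, not the production model's.
W_ENGAGEMENT, W_AFFINITY, HALF_LIFE_S = 0.3, 0.5, 3600.0

def score(tweet, author_affinity, now):
    """Combine recency, engagement, affinity, and a mild media boost."""
    recency = math.exp(-(now - tweet["created_at"]) / HALF_LIFE_S)  # decays ~hourly
    engagement = math.log1p(tweet["likes"] + tweet["retweets"])     # dampens big counts
    media_boost = 0.1 if tweet.get("has_media") else 0.0            # mild boost
    return recency + W_ENGAGEMENT * engagement + W_AFFINITY * author_affinity + media_boost

now = 10_000
fresh = {"created_at": 10_000, "likes": 0, "retweets": 0}    # just posted, no engagement
viral = {"created_at": 2_800, "likes": 500, "retweets": 100}  # 2 hours old, high velocity
# the older tweet can outrank the fresh one through engagement alone
```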
Twitter's "For You" feed:
Real-time relevance model on candidate tweets
Personalized ranking using interaction history
Balances recency + relevance + diversity
Scale Numbers
- 300M active users, 500M tweets/day (~5800 tweets/sec)
- Average follows: 200 users; celebrity: 100M followers
- Timeline reads: 300M users × 5 opens/day = 1.5B reads/day (~17K reads/sec)
- Redis for 300M user timelines: 300M × 800 tweet_ids × 8 bytes = 1.92TB — needs Redis Cluster
- Fan-out worker: 5800 tweets/sec × avg 200 followers = 1.16M Redis writes/sec
Interview Discussion Points
- Why Snowflake IDs instead of UUIDs? Snowflake IDs are time-sortable — tweet_id ORDER BY is equivalent to ORDER BY created_at without a secondary sort. Random UUIDs create random B-tree inserts (causing page fragmentation); Snowflake IDs append monotonically. Twitter processes 6000 tweet_id generations/sec — all within the 4096/ms capacity.
- How to handle the celebrity problem? At the time of read, fetch the 50 accounts the user follows with > 1M followers. For each, fetch their latest tweet IDs (LRANGE celebrity:tweets:{id} 0 9). Merge these 50×10=500 tweet IDs with the user’s pre-built timeline of 800 tweet IDs. Sort by Snowflake ID (time-ordered), take top 20. This adds ~10ms to read time for users following celebrities.
- Write amplification: A celebrity with 100M followers posting one tweet creates 100M Redis writes. Mitigation: batch writes (1000 followers per batch worker), dedicate fan-out worker pools to celebrity accounts, use separate Redis cluster for celebrity timelines, apply back-pressure if fan-out queue grows too large.