Designing a social media feed is one of the most nuanced system design problems you’ll encounter in interviews. It looks simple — “show a user the recent posts from people they follow” — but at scale it surfaces fundamental trade-offs that junior engineers miss entirely. Twitter, Instagram, and Facebook all have engineering blog posts describing how they solved this. Knowing those solutions cold is a significant advantage.
Step 1: Clarify Requirements
- Scale: How many users? How many follows? Average posts per day?
- Feed freshness: How quickly must a new tweet appear in followers’ feeds?
- Feed ordering: Reverse chronological, or ranked by relevance/engagement?
- Media: Text only, or images/video too?
- Read/write ratio: Typically very read-heavy (100:1 or higher).
Assume: 300M daily active users, 500M tweets/day (~5,800/sec), each user follows ~200 accounts, each user checks feed ~10 times/day → 3B feed reads/day (~35,000 reads/sec). Reverse-chronological feed, text + images.
Step 2: Back-of-Envelope
Writes: 500M tweets/day = ~5,800/sec (peak ~20,000/sec)
Reads: 3B feed loads/day = ~35,000/sec (peak ~100,000/sec)
Read:write ratio ≈ 100:1
Tweet storage:
tweet_id (8B) + user_id (8B) + text (280 chars = 280B)
+ created_at (8B) + media_url (100B) ≈ ~400 bytes/tweet
500M tweets/day × 400B = 200GB/day
5 years = ~365TB raw tweet storage
Timeline storage (if we precompute — see below):
Each tweet fans out to an average of 200 followers = 100B timeline writes/day
Each entry: tweet_id (8B) + user_id (8B) = 16B
16B × 100B entries/day = 1.6TB/day → expensive
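These estimates can be sanity-checked with a few lines of arithmetic (all inputs are the assumptions stated above):

```python
# Back-of-envelope numbers for the feed design (values from the assumptions above).
SECONDS_PER_DAY = 86_400

tweets_per_day = 500_000_000
feed_reads_per_day = 3_000_000_000
bytes_per_tweet = 8 + 8 + 280 + 8 + 100          # id + user + text + ts + media_url

write_qps = tweets_per_day / SECONDS_PER_DAY      # ≈ 5,800/sec
read_qps = feed_reads_per_day / SECONDS_PER_DAY   # ≈ 35,000/sec

storage_per_day_gb = tweets_per_day * bytes_per_tweet / 1e9   # ≈ 200 GB/day
five_year_tb = storage_per_day_gb * 365 * 5 / 1000            # ≈ 365 TB

fanout_entries_per_day = tweets_per_day * 200     # 100B timeline writes/day
fanout_gb_per_day = fanout_entries_per_day * 16 / 1e9         # 1,600 GB ≈ 1.6 TB
```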
Step 3: The Core Problem — Feed Generation
This is the crux. When user A opens the app and requests their feed, you need to return the N most recent tweets from all accounts they follow. There are two fundamental approaches:
Fanout-on-Read (Pull Model)
When a user requests their feed, look up all accounts they follow, fetch recent tweets from each, merge and sort by time, return the top N.
def get_feed(user_id, page_size=20):
    following = follow_db.get_following(user_id)         # e.g., 200 accounts
    all_tweets = []
    for account in following:
        tweets = tweet_db.get_recent(account, limit=20)  # 200 DB reads
        all_tweets.extend(tweets)
    return sorted(all_tweets, key=lambda t: t.created_at, reverse=True)[:page_size]
Pros: Writes are cheap — posting a tweet just inserts one row. No precomputation. Always fresh.
Cons: Feed reads are expensive — 200 DB lookups per feed load, merged and sorted in memory. At 35K reads/sec this is 7M DB queries/sec. Doesn’t scale. Response latency is high.
Fanout-on-Write (Push Model)
When user A posts a tweet, push it into the feed (timeline) of every follower immediately.
def post_tweet(author_id, content):
    tweet_id = tweet_db.insert(author_id, content)       # 1 write
    followers = follow_db.get_followers(author_id)       # e.g., 200 followers
    for follower_id in followers:
        timeline_cache.prepend(follower_id, tweet_id)    # 200 cache writes
Now a feed load is a single cache lookup per user:
def get_feed(user_id, page_size=20):
    tweet_ids = timeline_cache.get_range(user_id, 0, page_size)  # 1 cache read
    return tweet_db.batch_get(tweet_ids)                         # 1 batch DB read
Pros: Feed reads are O(1) — single cache lookup. Extremely fast. Scales to massive read traffic.
Cons: Writes are expensive. A user with 10M followers posting one tweet triggers 10M cache writes. This is the celebrity problem.
Step 4: Solving the Celebrity Problem
Pure fanout-on-write breaks when a celebrity (Elon Musk, Taylor Swift) posts a tweet. Fanning out to 100M followers synchronously would take minutes and overwhelm the system.
Hybrid approach (what Twitter actually uses):
- Regular users (< ~10,000 followers): Fanout-on-write. Tweets are pushed to followers' timeline caches immediately.
- Celebrity accounts (> ~10,000 followers, or flagged): Not fanned out. Instead, when any user loads their feed, the system checks if they follow any celebrities and fetches recent celebrity tweets on-the-fly (fanout-on-read for celebrities only), then merges with their precomputed timeline.
def get_feed(user_id, page_size=20):
    # 1. Get precomputed timeline (regular follows, from cache)
    timeline = timeline_cache.get_range(user_id, 0, 100)
    # 2. Get followed celebrities
    celebrities = follow_db.get_followed_celebrities(user_id)
    # 3. Fetch recent celebrity tweets (cache per celebrity, not per follower)
    celeb_tweets = []
    for celeb_id in celebrities:
        celeb_tweets += tweet_cache.get_recent(celeb_id, limit=20)
    # 4. Merge, sort, return top N
    all_tweets = timeline + celeb_tweets
    return sorted(all_tweets, key=lambda t: t.created_at, reverse=True)[:page_size]
Celebrity tweets are cached once per celebrity (not replicated per follower), so serving 100M followers reading a celebrity’s tweet costs one cache read per user request against a shared cache entry — not 100M cache writes.
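The write side of the hybrid makes the same split: at post time, fan out for regular authors, write once to a shared cache for celebrities. A minimal sketch with in-memory dicts standing in for the real stores (the threshold and all store names are illustrative):

```python
# Hybrid write path: fanout-on-write for regular users, a single shared
# cache write for celebrities. Plain dicts stand in for Redis/MySQL here.
CELEBRITY_THRESHOLD = 10_000

followers = {
    "alice": ["bob", "carol"],                       # regular user
    "celeb": [f"fan{i}" for i in range(20_000)],     # above the threshold
}
timelines = {}      # follower_id -> tweet_ids, newest first
celeb_tweets = {}   # celebrity_id -> recent tweet_ids (shared cache)

def post_tweet(author_id, tweet_id):
    author_followers = followers[author_id]
    if len(author_followers) < CELEBRITY_THRESHOLD:
        # Regular author: push into every follower's timeline.
        for follower_id in author_followers:
            timelines.setdefault(follower_id, []).insert(0, tweet_id)
    else:
        # Celebrity: one write to a per-celebrity cache, no fanout.
        celeb_tweets.setdefault(author_id, []).insert(0, tweet_id)

post_tweet("alice", "t1")   # fanned out to bob and carol
post_tweet("celeb", "t2")   # one cache write, zero timeline writes
```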
Step 5: Data Model
-- Tweets
tweets (
    tweet_id       BIGINT PRIMARY KEY,  -- Snowflake ID (sortable by time)
    user_id        BIGINT,
    content        VARCHAR(280),
    media_url      TEXT,
    created_at     TIMESTAMP,
    like_count     INT DEFAULT 0,
    retweet_count  INT DEFAULT 0
)

-- Social graph
follows (
    follower_id  BIGINT,
    followee_id  BIGINT,
    created_at   TIMESTAMP,
    PRIMARY KEY (follower_id, followee_id)
)
-- Timeline cache (Redis sorted set, score = timestamp)
-- Key: timeline:{user_id}
-- Members: tweet_ids, scored by created_at unix timestamp
-- Keep only most recent ~800 tweet_ids per user
Tweet IDs: Use a Snowflake-style distributed ID (timestamp + datacenter + sequence number). This gives you time-ordered IDs without a centralized counter, which means sorting by ID is equivalent to sorting by time — no separate ORDER BY created_at needed.
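A minimal generator sketch, using the original Snowflake field widths (41-bit millisecond timestamp, 10-bit machine ID, 12-bit per-millisecond sequence); the class and method names are illustrative:

```python
import threading
import time

EPOCH_MS = 1288834974657  # Twitter's Snowflake epoch (Nov 2010)

class SnowflakeGenerator:
    """Sketch of a Snowflake-style ID generator: IDs are unique per
    machine and sort by creation time, with no central counter."""

    def __init__(self, machine_id):
        self.machine_id = machine_id & 0x3FF   # 10 bits
        self.sequence = 0
        self.last_ms = -1
        self.lock = threading.Lock()

    def next_id(self):
        with self.lock:
            now_ms = int(time.time() * 1000)
            if now_ms == self.last_ms:
                self.sequence = (self.sequence + 1) & 0xFFF   # 12 bits
                if self.sequence == 0:
                    # Sequence exhausted this millisecond: wait for the next one.
                    while now_ms <= self.last_ms:
                        now_ms = int(time.time() * 1000)
            else:
                self.sequence = 0
            self.last_ms = now_ms
            # timestamp | machine | sequence, packed high to low.
            return ((now_ms - EPOCH_MS) << 22) | (self.machine_id << 12) | self.sequence

gen = SnowflakeGenerator(machine_id=1)
ids = [gen.next_id() for _ in range(1000)]   # strictly increasing
```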
Database choice: Tweets → MySQL or Postgres, sharded by tweet_id. Follow graph → MySQL (small rows, relational). Timeline cache → Redis sorted sets (fast range queries by score/timestamp).
Step 6: Feed Storage in Redis
Redis sorted sets are ideal for timelines. The score is the tweet’s creation timestamp; members are tweet IDs.
ZADD timeline:{user_id} {timestamp} {tweet_id} # O(log N) insert
ZREVRANGE timeline:{user_id} 0 19 # O(log N + 20) fetch top 20
Keep only the latest ~800 tweets in each user’s timeline. Older posts are fetched from the DB on scroll (most users never paginate past 50 items). Trim the sorted set on write: ZREMRANGEBYRANK timeline:{user_id} 0 -801.
Step 7: High-Level Architecture
Post tweet:
Client → API Gateway → Tweet Service
  ├─ Insert tweet into MySQL (sharded)
  └─ Publish to Kafka "new_tweet" topic
       └─ Fanout Workers (consumer group)
            ├─ For regular follows: ZADD to each follower's Redis timeline
            └─ For celebrities: update celebrity tweet cache only

Load feed:
Client → API Gateway → Feed Service
  ├─ ZREVRANGE from Redis timeline (precomputed, regular follows)
  ├─ GET recent tweets for followed celebrities (Redis per-celeb cache)
  ├─ Merge + sort
  └─ Batch fetch tweet details from MySQL/tweet cache
       └─ Return to client
Follow-up Questions
Q: How do you handle ranked/algorithmic feeds instead of chronological?
Replace the sorted set score with an engagement score (likes × recency × relationship strength). A separate ranking service re-scores timeline entries periodically and writes new scores to Redis. Real Twitter uses ML-based ranking.
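One illustrative scoring function (the weights and the half-life are made up for the example, not Twitter's real model) combining engagement, relationship strength, and exponential time decay:

```python
import time

def engagement_score(likes, retweets, created_at, affinity=1.0,
                     half_life_hours=6.0, now=None):
    """Toy ranking score: engagement weighted by follow-relationship
    affinity, decayed so a tweet loses half its score every 6 hours.
    All weights here are illustrative."""
    now = now if now is not None else time.time()
    age_hours = (now - created_at) / 3600
    engagement = 1 + likes + 2 * retweets    # retweets weighted heavier
    decay = 0.5 ** (age_hours / half_life_hours)
    return affinity * engagement * decay

now = 1_000_000.0
fresh = engagement_score(10, 0, created_at=now, now=now)            # full score
stale = engagement_score(10, 0, created_at=now - 6 * 3600, now=now) # half score
```

The ranking service would compute this score for each timeline entry and write it back as the sorted-set score in place of the timestamp.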
Q: What if a user hasn’t been active for 30 days?
Don’t maintain a timeline cache for inactive users — let it expire. On first login after inactivity, rebuild the timeline from the DB (fanout-on-read for that one cold-start load), then resume fanout-on-write going forward.
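The cold-start path might look like the following sketch, with tiny in-memory stubs standing in for the Redis cache and the tweet/follow stores (all names are illustrative):

```python
from collections import namedtuple

Tweet = namedtuple("Tweet", "tweet_id created_at")

# Stub stores standing in for Redis / MySQL.
timeline_cache = {}                         # user_id -> cached tweet_id list
following = {"dormant_user": ["a", "b"]}
recent_tweets = {"a": [Tweet("t3", 3), Tweet("t1", 1)],
                 "b": [Tweet("t2", 2)]}

def get_feed(user_id, page_size=20):
    timeline = timeline_cache.get(user_id)
    if timeline is None:                    # cache expired: cold start
        timeline = rebuild_timeline(user_id)     # one fanout-on-read pass
        timeline_cache[user_id] = timeline       # resume push model afterward
    return timeline[:page_size]

def rebuild_timeline(user_id, keep=800):
    tweets = []
    for account in following[user_id]:
        tweets += recent_tweets[account]
    tweets.sort(key=lambda t: t.created_at, reverse=True)
    return [t.tweet_id for t in tweets[:keep]]
```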
Q: How do you handle retweets?
A retweet is a new row in the tweets table with a retweet_of foreign key. The fanout worker treats it identically to a regular tweet. On display, the client renders it as a retweet.
Q: How do you prevent a single viral tweet from causing a thundering herd on the tweet DB?
Cache individual tweet objects in Redis with a short TTL (60–300s). A single viral tweet gets one DB read and N cache hits, not N DB reads.
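A minimal cache-aside sketch showing the effect (the counter and stub DB fetch are for illustration only):

```python
import time

TTL_SECONDS = 120
tweet_cache = {}        # tweet_id -> (expires_at, tweet object)
db_reads = 0            # instrumentation for the demo

def db_get_tweet(tweet_id):
    global db_reads
    db_reads += 1       # count how often the DB is actually hit
    return {"tweet_id": tweet_id, "content": "hello"}

def get_tweet(tweet_id, now=None):
    """Cache-aside read: serve from cache while fresh, otherwise do one
    DB read and repopulate with a short TTL."""
    now = now if now is not None else time.time()
    entry = tweet_cache.get(tweet_id)
    if entry and entry[0] > now:           # cache hit, still fresh
        return entry[1]
    tweet = db_get_tweet(tweet_id)         # miss: single DB read
    tweet_cache[tweet_id] = (now + TTL_SECONDS, tweet)
    return tweet

for _ in range(1000):                      # 1,000 requests for a viral tweet
    get_tweet("viral", now=0)              # within one TTL window: 1 DB read
```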
Summary
Fanout-on-write (push) makes reads fast but writes expensive. Fanout-on-read (pull) makes writes cheap but reads slow. The hybrid — push for regular users, pull for celebrities — is how Twitter, Instagram, and Facebook all solved this. Store timelines as Redis sorted sets scored by timestamp. Use Snowflake IDs so tweet ordering is built into the ID. Use Kafka to decouple fanout from the write path.
Related System Design Topics
- Caching Strategies — Redis sorted sets store precomputed timelines; cache-aside caches individual tweets.
- Message Queues — Kafka decouples the fanout workers from the tweet write path.
- Database Sharding — the tweet table is sharded by tweet_id at scale; the follow graph has its own sharding strategy.
- Consistent Hashing — routes tweet lookups to the correct shard.
- Design a Chat System — another real-time delivery problem with different trade-offs.
- Design a URL Shortener — a simpler SD problem to warm up with before tackling feed systems.
See also: Design a News Feed (Facebook-style) — how Facebook’s hybrid fanout, EdgeRank, and privacy filtering differ from Twitter’s simpler chronological model.