Low Level Design: News Feed Service – Tech Interview Dot Org

A news feed service delivers a ranked, personalized stream of posts to each user. The central design question is delivery model: fan-out on write vs fan-out on read. At scale, the answer is usually a hybrid of both.

Fan-Out on Write (Push Model)

When a user creates a post, the system immediately writes the post_id into the feed of every follower. Feed reads are fast: just fetch the pre-built rows for that user from the feed table. The cost is write amplification: one post by a user with 1 million followers requires 1 million feed writes. For celebrities, this makes fan-out prohibitively slow and expensive.

Feed table schema: user_id, post_id, score, created_at. Index on (user_id, score DESC) for fast feed reads. Writes are handled asynchronously by a fan-out worker pool consuming from a post-created event queue (Kafka).
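A minimal sketch of the fan-out worker described above, using in-memory dictionaries in place of Kafka and the feed table. The names followers, feed_table, fan_out, and read_feed are illustrative assumptions, and the score is simply the creation timestamp (a chronological feed).

```python
from collections import defaultdict

# Hypothetical in-memory stand-ins for the follower graph and the feed table.
followers = {"alice": ["bob", "carol"], "dave": ["bob"]}
feed_table = defaultdict(list)  # user_id -> list of (score, post_id)


def fan_out(event):
    """Handle one post-created event: write the post_id to each follower's feed."""
    rows = 0
    for follower_id in followers.get(event["author_id"], []):
        # Assumption: score is the creation timestamp, i.e. a chronological feed.
        feed_table[follower_id].append((event["created_at"], event["post_id"]))
        rows += 1
    return rows  # number of feed writes, i.e. the write amplification


def read_feed(user_id, limit=10):
    """Mimic a read through the (user_id, score DESC) index: highest score first."""
    return [pid for _, pid in sorted(feed_table[user_id], reverse=True)[:limit]]


written = fan_out({"post_id": "p1", "author_id": "alice", "created_at": 100})
fan_out({"post_id": "p2", "author_id": "dave", "created_at": 200})
```

The return value of fan_out makes the cost visible: the number of feed writes equals the author's follower count.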

Fan-Out on Read (Pull Model)

At read time, fetch posts from all followees, merge, and rank. No write amplification. The read cost is proportional to the number of followees and the volume of their recent posts. For users following thousands of accounts, merging and ranking in real time is too slow for a sub-200ms feed load.
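The pull model can be sketched as a k-way merge over per-author timelines. This is an illustration under assumptions: timelines and following are hypothetical in-memory structures, each timeline is already sorted newest first, and ranking is plain reverse-chronological order.

```python
import heapq
from itertools import islice

# Hypothetical per-author timelines, each sorted newest first
# as (created_at, post_id) tuples.
timelines = {
    "alice": [(300, "a2"), (100, "a1")],
    "dave": [(250, "d2"), (200, "d1")],
}
following = {"bob": ["alice", "dave"]}


def pull_feed(user_id, limit=10):
    """Merge the followees' timelines at read time, newest first."""
    streams = [timelines.get(author, []) for author in following.get(user_id, [])]
    # heapq.merge with reverse=True expects inputs sorted largest to smallest.
    merged = heapq.merge(*streams, reverse=True)
    return [post_id for _, post_id in islice(merged, limit)]
```

Nothing is written at post time, but every read touches one stream per followee, which is why the read cost grows with the number of accounts followed.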

Hybrid Delivery

Push for regular users (fewer than ~10,000 followers), pull for celebrities (above the threshold). At read time, the feed service fetches the user's pre-built feed from the feed table (push portion) and merges it with a real-time pull of recent posts from the celebrities the user follows. The merge is a simple sorted merge of two streams, bounded by a small window of recent celebrity posts.
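A rough sketch of the hybrid read path, assuming in-memory stand-ins for the feed table and the celebrity timelines. CELEBRITY_THRESHOLD uses the ~10,000 figure from the text; the window parameter and all data structure names are hypothetical.

```python
import heapq
from itertools import islice

CELEBRITY_THRESHOLD = 10_000  # follower cutoff from the text; tunable

follower_counts = {"alice": 120, "star": 2_000_000}
following = {"bob": ["alice", "star"]}
# Push portion: bob's pre-built feed, newest first, as (score, post_id).
feed_table = {"bob": [(300, "a2"), (100, "a1")]}
# Pull portion: each celebrity's recent posts, newest first.
celebrity_posts = {"star": [(350, "s2"), (150, "s1")]}


def is_celebrity(author_id):
    return follower_counts.get(author_id, 0) >= CELEBRITY_THRESHOLD


def hybrid_feed(user_id, limit=10, window=50):
    """Merge the pre-built feed with a bounded pull from followed celebrities."""
    pushed = feed_table.get(user_id, [])
    pulled = [
        celebrity_posts.get(author, [])[:window]  # only the most recent posts
        for author in following.get(user_id, [])
        if is_celebrity(author)
    ]
    merged = heapq.merge(pushed, *pulled, reverse=True)
    return [post_id for _, post_id in islice(merged, limit)]
```

Because star is above the threshold, its posts never enter bob's feed table; they appear only through the read-time pull.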

Post Storage

Posts are stored in a posts table: post_id (UUID or snowflake ID), author_id, content, media_urls, created_at. The feed table stores only post_id references; post content is fetched separately and can be served from a post cache (Redis or Memcached) keyed by post_id. This avoids duplicating post content across millions of feed rows.
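One way to picture the separation of feed references from post content is a cache-aside lookup keyed by post_id. In this sketch, plain dictionaries stand in for the posts table and for Redis or Memcached; hydrate and stats are hypothetical names.

```python
# Hypothetical stand-ins: a dict for the posts table and a dict for the cache.
posts_table = {
    "p1": {"author_id": "alice", "content": "hello"},
    "p2": {"author_id": "dave", "content": "world"},
}
post_cache = {}
stats = {"db_reads": 0}


def hydrate(post_ids):
    """Resolve feed post_ids to post bodies, checking the cache first."""
    result = []
    for post_id in post_ids:
        post = post_cache.get(post_id)
        if post is None:
            stats["db_reads"] += 1  # cache miss: read the posts table
            post = posts_table.get(post_id)
            if post is None:
                continue  # the post no longer exists; skip it
            post_cache[post_id] = post
        result.append(post)
    return result
```

A popular post is read from the posts table once and then served from the cache to every feed that references it.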

Feed Ranking

A chronological feed is the simplest ranking. A ranked feed uses an ML model trained on engagement signals: likes, comments, shares, dwell time. Features include recency, relationship strength (how often the viewer interacts with the author), post type (video vs image vs text), and viewer device/context. The ranking model runs at read time on the candidate set pulled from the feed table.
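To make the feature list concrete, here is a hand-weighted scoring function standing in for the trained model. It is not an ML model: the weights, the six-hour half-life, and the multiplicative form are all assumptions chosen for illustration.

```python
import math

# Hypothetical hand-picked weights standing in for a trained model.
TYPE_WEIGHT = {"video": 1.2, "image": 1.0, "text": 0.8}
HALF_LIFE_SECONDS = 6 * 3600  # assumed recency half-life


def score(post, affinity, now):
    """Combine recency decay, viewer-author affinity, and post type."""
    age = max(0, now - post["created_at"])
    recency = math.exp(-math.log(2) * age / HALF_LIFE_SECONDS)
    relationship = affinity.get(post["author_id"], 0.1)  # default for weak ties
    return recency * relationship * TYPE_WEIGHT.get(post["type"], 1.0)


def rank(candidates, affinity, now):
    """Order the candidate set pulled from the feed table by descending score."""
    return sorted(candidates, key=lambda p: score(p, affinity, now), reverse=True)
```

A real ranking model would replace score with a learned function, but the read-time flow stays the same: fetch candidates, score each one, sort.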

Cursor-Based Pagination

Offset-based pagination (LIMIT x OFFSET y) is fragile for a ranked feed: new posts inserted at the top shift offsets and cause duplicates or gaps. Use a composite cursor of (score, post_id) instead, with the feed ordered by score and then post_id, both descending. The client sends the cursor from the last seen item; the server fetches posts with score less than the cursor score (or equal score with post_id less than the cursor post_id). The post_id tiebreaker makes the ordering total, so posts with equal scores are neither skipped nor repeated. This gives stable, consistent pagination even as new posts arrive.
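The cursor condition can be sketched against an in-memory SQLite table. The table and column names follow the feed schema above; fetch_page and the sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE feed (user_id TEXT, post_id TEXT, score REAL)")
conn.execute("CREATE INDEX feed_by_score ON feed (user_id, score DESC, post_id DESC)")
conn.executemany(
    "INSERT INTO feed VALUES (?, ?, ?)",
    [("bob", "p1", 10), ("bob", "p2", 20), ("bob", "p3", 20), ("bob", "p4", 30)],
)


def fetch_page(user_id, limit, cursor=None):
    """Return (post_ids, next_cursor); cursor is (score, post_id) of the last item."""
    sql = "SELECT score, post_id FROM feed WHERE user_id = ?"
    params = [user_id]
    if cursor is not None:
        # Strictly after the cursor in (score DESC, post_id DESC) order.
        sql += " AND (score < ? OR (score = ? AND post_id < ?))"
        params += [cursor[0], cursor[0], cursor[1]]
    sql += " ORDER BY score DESC, post_id DESC LIMIT ?"
    params.append(limit)
    rows = conn.execute(sql, params).fetchall()
    next_cursor = rows[-1] if len(rows) == limit else None
    return [post_id for _, post_id in rows], next_cursor
```

A row inserted with a higher score between two requests does not disturb later pages, because each page is defined relative to the last item seen rather than to a row count.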

Feed Cache and Cache Warming

Cache the first page of each active user feed in Redis with a TTL (e.g., 5 minutes). On cache miss, build from the feed table and repopulate. For high-traffic users, warm the cache proactively after each fan-out write. This keeps p99 feed load times low even during traffic spikes.
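The cache-miss path and proactive warming might look like the sketch below. FirstPageCache is a hypothetical in-memory stand-in for Redis with a per-key TTL, and the clock is injectable so expiry can be exercised without waiting.

```python
import time


class FirstPageCache:
    """Tiny in-memory stand-in for a Redis cache with a per-key TTL."""

    def __init__(self, ttl_seconds=300, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock
        self.entries = {}  # user_id -> (expires_at, post_ids)

    def get(self, user_id):
        entry = self.entries.get(user_id)
        if entry is None or entry[0] <= self.clock():
            self.entries.pop(user_id, None)  # drop missing or expired entry
            return None
        return entry[1]

    def put(self, user_id, post_ids):
        self.entries[user_id] = (self.clock() + self.ttl, list(post_ids))


def load_first_page(cache, user_id, build_from_feed_table):
    """Cache-aside read: serve from the cache, else rebuild and repopulate."""
    page = cache.get(user_id)
    if page is None:
        page = build_from_feed_table(user_id)
        cache.put(user_id, page)
    return page
```

Proactive warming for high-traffic users would amount to calling cache.put from the fan-out worker after it writes new feed rows, so the next read is a hit.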

Unseen Counter

Track the number of new posts since the user last viewed their feed. Store last_seen_timestamp per user. On feed load, count feed entries with created_at greater than last_seen_timestamp and display the count as a badge. Update last_seen_timestamp on each feed open. The same counter can drive re-engagement notifications (push or email) when the unseen count exceeds a threshold.
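A small sketch of the counter logic, with dictionaries in place of the feed table and the per-user timestamp store. NOTIFY_THRESHOLD is an assumed value; the text does not specify one.

```python
# Hypothetical in-memory state: feed entries as (created_at, post_id) per user.
feed_entries = {"bob": [(100, "p1"), (200, "p2"), (300, "p3")]}
last_seen = {"bob": 150}  # last_seen_timestamp per user
NOTIFY_THRESHOLD = 2  # assumed value; not given in the text


def unseen_count(user_id):
    """Count feed entries created after the user's last feed open."""
    seen_at = last_seen.get(user_id, 0)
    entries = feed_entries.get(user_id, [])
    return sum(1 for created_at, _ in entries if created_at > seen_at)


def should_notify(user_id):
    return unseen_count(user_id) > NOTIFY_THRESHOLD


def open_feed(user_id, now):
    """Return the badge count shown on open, then mark the feed as seen."""
    count = unseen_count(user_id)
    last_seen[user_id] = now
    return count
```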

Handling Deletes and Edits

When a post is deleted, there are two options: have the fan-out worker remove its post_id from every follower's feed, or mark the post deleted in the posts table and filter it out at read time. Filtering at read time is simpler and avoids expensive distributed deletes, at the cost of slightly inflating the candidate set before ranking. Edits update only the posts table; feed entries are unchanged since they reference post_id.
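The read-time filtering option can be sketched as a soft delete. The deleted flag and the function names here are assumptions for illustration; dictionaries stand in for the posts and feed tables.

```python
posts_table = {
    "p1": {"content": "first", "deleted": False},
    "p2": {"content": "second", "deleted": False},
}
feed_table = {"bob": ["p2", "p1"]}  # post_id references only, newest first


def delete_post(post_id):
    """Soft delete: one write to the posts table, no fan-out to feeds."""
    posts_table[post_id]["deleted"] = True


def edit_post(post_id, content):
    """Edits touch only the posts table; feed rows keep the same post_id."""
    posts_table[post_id]["content"] = content


def read_feed(user_id):
    """Hydrate feed references and drop posts that are missing or deleted."""
    visible = []
    for post_id in feed_table.get(user_id, []):
        post = posts_table.get(post_id)
        if post is not None and not post["deleted"]:
            visible.append(post["content"])
    return visible
```

The deleted post_id stays in the feed table as a dead reference, which is the candidate-set inflation mentioned above.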

Frequently Asked Questions

What is a news feed service in system design?

A news feed service aggregates content from the accounts a user follows and presents it in a ranked, personalized stream. When a user opens the app, the feed service must quickly assemble a list of recent posts from potentially thousands of followed accounts, apply a ranking model, and return the top N results — all within a few hundred milliseconds. The core design challenge is balancing freshness, relevance, and latency at massive scale, where millions of users may be loading their feeds simultaneously and popular accounts may have millions of followers.

What is the difference between fan-out on write and fan-out on read?

Fan-out on write (push model): when a user creates a post, the system immediately writes a copy of the post ID into the feed inbox of every follower. Feed reads are fast — just read the pre-built inbox — but writes are expensive for accounts with many followers, as one post triggers millions of inbox writes. Fan-out on read (pull model): when a user requests their feed, the system queries the social graph for followed accounts, fetches their recent posts, merges, and ranks in real time. Writes are cheap but reads are slow and compute-intensive, especially for users who follow thousands of accounts. The push model is better for read-heavy feeds with moderate follower counts; the pull model is better when write cost is the bottleneck.

How does the hybrid feed delivery model work for celebrity accounts?

The hybrid model applies fan-out on write for ordinary users (typically defined as accounts with fewer than a threshold number of followers, e.g., 10,000) and fan-out on read for high-follower accounts (celebrities, brands). When a user loads their feed, the service reads their pre-built push inbox for ordinary followed accounts and then separately pulls recent posts from any high-follower accounts they follow, merging the two result sets before ranking. This bounds the worst-case write amplification: a celebrity posting to 50 million followers does not trigger 50 million inbox writes. The threshold and logic for classifying accounts as high-follower are tunable parameters that can be adjusted based on observed write throughput and latency.

How do you rank posts in a personalized news feed?

Ranking transforms the chronological candidate set into a relevance-ordered list tailored to the individual user. Features used include: engagement signals on the post (likes, comments, shares, view time) weighted by recency; relationship strength between the viewer and the author (interaction frequency, mutual friends); content type preferences derived from the user’s historical engagement; and diversity constraints to avoid showing too many consecutive posts from one author or of one content type. A gradient-boosted tree or a neural ranking model scores each candidate post, and the top-K are returned. The model is retrained continuously on implicit feedback (clicks, shares, time spent) using a streaming training pipeline. A/B testing infrastructure evaluates ranking model changes against engagement and retention metrics before full rollout.