What is an Activity Feed?
An activity feed shows users a chronological stream of events from people and things they follow: “Alice liked your photo”, “Bob published a new article”, “Your order shipped”. LinkedIn’s news feed, GitHub’s activity feed, and Shopify’s store activity feed are all examples. The core design question is the fan-out strategy: do you write events to each follower’s feed at write time (push/fan-out-on-write), or compile the feed at read time (pull/fan-out-on-read)? Each trades write amplification against read latency.
Requirements
- Show activities from followed users/entities in reverse chronological order
- Activity types: follow, like, comment, publish, purchase, status change
- Feed latency: new activity appears in follower feeds within 5 seconds
- Read feed: <50ms for first page of 20 items
- 100M users, average 500 followers per user, 10K events/second
- Pagination: infinite scroll with cursor
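These numbers imply substantial write amplification under pure fan-out-on-write; a quick back-of-envelope check (assuming every event fans out to the average follower count):

```python
# Back-of-envelope write amplification for pure fan-out-on-write,
# using the stated requirements (10K events/s, 500 followers on average).
events_per_sec = 10_000
avg_followers = 500

feed_writes_per_sec = events_per_sec * avg_followers
print(feed_writes_per_sec)  # 5M UserFeed rows per second at peak
```

Five million row inserts per second is why the write path needs batching, async workers, and a celebrity escape hatch.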
Data Model
Activity(
    activity_id UUID PRIMARY KEY,
    actor_id    UUID NOT NULL,        -- who performed the action
    verb        VARCHAR NOT NULL,     -- 'liked', 'commented', 'published', 'followed'
    object_type VARCHAR,              -- 'post', 'photo', 'order'
    object_id   UUID,
    target_id   UUID,                 -- who/what was acted upon (nullable)
    metadata    JSONB,                -- extra context: {'post_title': '...'}
    created_at  TIMESTAMPTZ NOT NULL
)

UserFeed(
    user_id     UUID NOT NULL,
    activity_id UUID NOT NULL,
    created_at  TIMESTAMPTZ NOT NULL, -- denormalized for sort
    PRIMARY KEY (user_id, activity_id)
)
-- UserFeed is the fan-out table: one row per follower per activity
Fan-out on Write (Push)
def on_new_activity(activity):
    # Store the activity itself
    db.insert(activity)
    # Fan out to all followers
    followers = get_followers(activity.actor_id)  # could be millions
    if len(followers) <= 1000:
        # Small follower counts: synchronous bulk insert
        db.bulk_insert([
            UserFeed(user_id=f, activity_id=activity.activity_id,
                     created_at=activity.created_at)
            for f in followers
        ])
    else:
        # Large follower counts (celebrities): async via Kafka
        kafka.produce('feed-fanout', {
            'activity_id': activity.activity_id,
            'actor_id': activity.actor_id,
            'created_at': activity.created_at.isoformat()
        })
        # Workers consume and fan out in parallel batches
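What a worker might do with a 'feed-fanout' message: look up the followers, split them into fixed-size chunks, and run one bulk insert per chunk. The batching helper below is a pure, illustrative sketch (BATCH_SIZE, the row-dict shape, and the function name are assumptions, not a specific Kafka client API):

```python
BATCH_SIZE = 10_000  # illustrative; tune to the DB's bulk-insert sweet spot

def fanout_batches(follower_ids, activity_id, created_at,
                   batch_size=BATCH_SIZE):
    """Yield one list of UserFeed row dicts per bulk insert.

    A worker consuming 'feed-fanout' would call this with the follower
    list for the message's actor_id, then db.bulk_insert each batch.
    """
    for i in range(0, len(follower_ids), batch_size):
        chunk = follower_ids[i:i + batch_size]
        yield [{'user_id': f,
                'activity_id': activity_id,
                'created_at': created_at}
               for f in chunk]
```

Because each batch is independent, chunks can be distributed across many workers (e.g. keyed by batch index) and retried individually on failure.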
def read_feed(user_id, cursor=None, limit=20):
    # Single fast indexed query on UserFeed
    query = '''
        SELECT a.* FROM UserFeed uf
        JOIN Activity a ON uf.activity_id = a.activity_id
        WHERE uf.user_id = :uid
    '''
    if cursor:
        query += ' AND uf.created_at < :cursor'
    query += ' ORDER BY uf.created_at DESC LIMIT :limit'
    return db.query(query, uid=user_id, cursor=cursor, limit=limit)
Hybrid: Redis Feed Cache
For hot users who read their feed frequently, cache the feed in a Redis sorted set:
# Score = Unix timestamp; member = activity_id
def add_to_feed_cache(user_id, activity_id, timestamp):
    key = f'feed:{user_id}'
    redis.zadd(key, {activity_id: timestamp})
    redis.zremrangebyrank(key, 0, -1001)  # keep only the latest 1000
    redis.expire(key, 86400)              # TTL 24h; inactive feeds expire
def read_feed_cached(user_id, before_ts=None, limit=20):
    key = f'feed:{user_id}'
    # Exclusive bound '(' so the cursor item itself isn't returned again
    max_score = f'({before_ts}' if before_ts else '+inf'
    ids = redis.zrevrangebyscore(key, max_score, '-inf',
                                 start=0, num=limit)
    if len(ids) < limit:
        # Cache miss, expired, or trimmed past the cursor: fall back to DB
        return read_feed(user_id, before_ts, limit)
    return get_activities_by_ids(ids)  # batch fetch activity details
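get_activities_by_ids should hydrate all ids in a single batch query (avoiding N+1 round trips) and then restore the recency order that came out of Redis, since the database will not return rows in that order. A minimal reordering helper (names are illustrative):

```python
def hydrate_in_feed_order(activity_ids, rows):
    """Reorder batch-fetched Activity rows to match the Redis feed order.

    activity_ids: ids in recency order from the sorted set.
    rows: dict-like Activity rows fetched in one query, in any order.
    Activities deleted since fan-out simply drop out of the result.
    """
    by_id = {row['activity_id']: row for row in rows}
    return [by_id[aid] for aid in activity_ids if aid in by_id]
```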
Celebrity Problem
A user with 50M followers posts once — 50M fan-out writes would take hours. Solutions:
- Async fan-out via Kafka with many parallel workers
- Pull for celebrities: regular users get fan-out-on-write; celebrity posts are fetched at read time and merged into the feed
- Cap fan-out: only fan out to users who have been active in the last 30 days
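The pull-for-celebrities option needs a read-time merge: the follower's pre-computed feed and each followed celebrity's recent activities are all sorted newest-first, so they can be combined with a k-way merge. A sketch over (created_at, activity_id) tuples (the tuple shape is an assumption):

```python
import heapq
import itertools

def merge_feeds(precomputed, celebrity_feeds, limit=20):
    """Merge newest-first streams of (created_at, activity_id) tuples.

    precomputed: the user's fanned-out feed, sorted DESC by timestamp.
    celebrity_feeds: one recent-activity list per followed celebrity,
    each also sorted DESC. Returns the top `limit` items overall.
    """
    merged = heapq.merge(precomputed, *celebrity_feeds,
                         key=lambda item: item[0], reverse=True)
    return list(itertools.islice(merged, limit))
```

heapq.merge is lazy, so only the first `limit` items are actually pulled from each stream.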
Key Design Decisions
- Fan-out on write for most users — fast reads; feed query is a simple indexed scan
- Async fan-out for high-follower accounts — prevents write spikes blocking the DB
- Redis sorted set cache — feeds can be served entirely from cache for active users
- Activity object model (actor, verb, object) — extensible; new activity types don’t require schema changes
- Cursor pagination on created_at — stable under new inserts, and query cost doesn't grow with page depth (unlike OFFSET)
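One refinement to the cursor: encode both created_at and activity_id into an opaque token, so items that share a timestamp are neither skipped nor duplicated across pages. The encoding below is one common convention, not a fixed standard:

```python
import base64
import json

def encode_cursor(created_at_ts, activity_id):
    """Pack (timestamp, id) into an opaque, URL-safe cursor string."""
    raw = json.dumps([created_at_ts, activity_id]).encode()
    return base64.urlsafe_b64encode(raw).decode()

def decode_cursor(cursor):
    """Recover the (timestamp, id) pair the next page should start below."""
    created_at_ts, activity_id = json.loads(base64.urlsafe_b64decode(cursor))
    return created_at_ts, activity_id
```

The id acts as a tie-breaker: the feed query then filters on (created_at, activity_id) < (cursor_ts, cursor_id) instead of created_at alone.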