What is an Activity Feed?
An activity feed shows users a chronological stream of events from people and things they follow: “Alice liked your photo”, “Bob published a new article”, “Your order shipped”. LinkedIn’s news feed, GitHub’s activity feed, and Shopify’s store activity feed are all examples. The core design question is the fan-out strategy: do you write events to each follower’s feed at write time (push/fan-out-on-write), or compile the feed at read time (pull/fan-out-on-read)? Each trades write amplification against read latency.
Requirements
- Show activities from followed users/entities in reverse chronological order
- Activity types: follow, like, comment, publish, purchase, status change
- Feed latency: new activity appears in follower feeds within 5 seconds
- Read feed: <50ms for first page of 20 items
- 100M users, average 500 followers per user, 10K events/second
- Pagination: infinite scroll with cursor
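These numbers imply substantial write amplification under pure fan-out-on-write; a quick back-of-envelope check (assuming every event fans out to the average follower count):

```python
# Back-of-envelope write amplification for pure fan-out-on-write,
# using the stated requirements (10K events/s, 500 followers on average).
events_per_sec = 10_000
avg_followers = 500

feed_writes_per_sec = events_per_sec * avg_followers
print(feed_writes_per_sec)  # 5M UserFeed rows per second at peak
```

Five million row inserts per second is why the write path needs batching, async workers, and a celebrity escape hatch.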
Data Model
Activity(
    activity_id UUID PRIMARY KEY,
    actor_id    UUID NOT NULL,        -- who performed the action
    verb        VARCHAR NOT NULL,     -- 'liked', 'commented', 'published', 'followed'
    object_type VARCHAR,              -- 'post', 'photo', 'order'
    object_id   UUID,
    target_id   UUID,                 -- who/what was acted upon (nullable)
    metadata    JSONB,                -- extra context: {'post_title': '...'}
    created_at  TIMESTAMPTZ NOT NULL
)

UserFeed(
    user_id     UUID NOT NULL,
    activity_id UUID NOT NULL,
    created_at  TIMESTAMPTZ NOT NULL, -- denormalized for sort
    PRIMARY KEY (user_id, activity_id)
)
-- UserFeed is the fan-out table: one row per follower per activity
Fan-out on Write (Push)
def on_new_activity(activity):
    # Store the activity itself
    db.insert(activity)
    # Fan out to all followers
    followers = get_followers(activity.actor_id)  # could be millions
    if len(followers) <= 1000:
        # Small follower counts: synchronous bulk insert
        db.bulk_insert([
            UserFeed(user_id=f, activity_id=activity.activity_id,
                     created_at=activity.created_at)
            for f in followers
        ])
    else:
        # Large follower counts (celebrities): async via Kafka
        kafka.produce('feed-fanout', {
            'activity_id': activity.activity_id,
            'actor_id': activity.actor_id,
            'created_at': activity.created_at.isoformat()
        })
        # Workers consume and fan out in parallel batches
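What a worker might do with a 'feed-fanout' message: look up the followers, split them into fixed-size chunks, and run one bulk insert per chunk. The batching helper below is a pure, illustrative sketch (BATCH_SIZE, the row-dict shape, and the function name are assumptions, not a specific Kafka client API):

```python
BATCH_SIZE = 10_000  # illustrative; tune to the DB's bulk-insert sweet spot

def fanout_batches(follower_ids, activity_id, created_at,
                   batch_size=BATCH_SIZE):
    """Yield one list of UserFeed row dicts per bulk insert.

    A worker consuming 'feed-fanout' would call this with the follower
    list for the message's actor_id, then db.bulk_insert each batch.
    """
    for i in range(0, len(follower_ids), batch_size):
        chunk = follower_ids[i:i + batch_size]
        yield [{'user_id': f,
                'activity_id': activity_id,
                'created_at': created_at}
               for f in chunk]
```

Because each batch is independent, chunks can be distributed across many workers (e.g. keyed by batch index) and retried individually on failure.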
def read_feed(user_id, cursor=None, limit=20):
    # Single fast indexed query on UserFeed
    query = '''
        SELECT a.* FROM UserFeed uf
        JOIN Activity a ON uf.activity_id = a.activity_id
        WHERE uf.user_id = :uid
    '''
    if cursor:
        query += ' AND uf.created_at < :cursor'
    query += ' ORDER BY uf.created_at DESC LIMIT :limit'
    return db.query(query, uid=user_id, cursor=cursor, limit=limit)
Hybrid: Redis Feed Cache
For hot users who read their feed frequently, cache the feed in a Redis sorted set:
# Score = Unix timestamp; member = activity_id
def add_to_feed_cache(user_id, activity_id, timestamp):
    key = f'feed:{user_id}'
    redis.zadd(key, {activity_id: timestamp})
    redis.zremrangebyrank(key, 0, -1001)  # keep only the latest 1000
    redis.expire(key, 86400)              # TTL 24h; inactive feeds expire
def read_feed_cached(user_id, before_ts=None, limit=20):
    key = f'feed:{user_id}'
    # Exclusive bound '(' so the cursor item itself isn't returned again
    max_score = f'({before_ts}' if before_ts else '+inf'
    ids = redis.zrevrangebyscore(key, max_score, '-inf',
                                 start=0, num=limit)
    if len(ids) < limit:
        # Cache miss, expired, or trimmed past the cursor: fall back to DB
        return read_feed(user_id, before_ts, limit)
    return get_activities_by_ids(ids)  # batch fetch activity details
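get_activities_by_ids should hydrate all ids in a single batch query (avoiding N+1 round trips) and then restore the recency order that came out of Redis, since the database will not return rows in that order. A minimal reordering helper (names are illustrative):

```python
def hydrate_in_feed_order(activity_ids, rows):
    """Reorder batch-fetched Activity rows to match the Redis feed order.

    activity_ids: ids in recency order from the sorted set.
    rows: dict-like Activity rows fetched in one query, in any order.
    Activities deleted since fan-out simply drop out of the result.
    """
    by_id = {row['activity_id']: row for row in rows}
    return [by_id[aid] for aid in activity_ids if aid in by_id]
```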
Celebrity Problem
A user with 50M followers posts once — 50M fan-out writes would take hours. Solutions:
- Async fan-out via Kafka with many parallel workers
- Pull for celebrities: regular users get fan-out-on-write; celebrity posts are fetched at read time and merged into the feed
- Cap fan-out: only fan out to users who have been active in the last 30 days
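The pull-for-celebrities option needs a read-time merge: the follower's pre-computed feed and each followed celebrity's recent activities are all sorted newest-first, so they can be combined with a k-way merge. A sketch over (created_at, activity_id) tuples (the tuple shape is an assumption):

```python
import heapq
import itertools

def merge_feeds(precomputed, celebrity_feeds, limit=20):
    """Merge newest-first streams of (created_at, activity_id) tuples.

    precomputed: the user's fanned-out feed, sorted DESC by timestamp.
    celebrity_feeds: one recent-activity list per followed celebrity,
    each also sorted DESC. Returns the top `limit` items overall.
    """
    merged = heapq.merge(precomputed, *celebrity_feeds,
                         key=lambda item: item[0], reverse=True)
    return list(itertools.islice(merged, limit))
```

heapq.merge is lazy, so only the first `limit` items are actually pulled from each stream.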
Key Design Decisions
- Fan-out on write for most users — fast reads; feed query is a simple indexed scan
- Async fan-out for high-follower accounts — prevents write spikes blocking the DB
- Redis sorted set cache — feeds can be served entirely from cache for active users
- Activity object model (actor, verb, object) — extensible; new activity types don’t require schema changes
- Cursor pagination on created_at — stable under new inserts, and query cost doesn't grow with page depth (unlike OFFSET)
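One refinement to the cursor: encode both created_at and activity_id into an opaque token, so items that share a timestamp are neither skipped nor duplicated across pages. The encoding below is one common convention, not a fixed standard:

```python
import base64
import json

def encode_cursor(created_at_ts, activity_id):
    """Pack (timestamp, id) into an opaque, URL-safe cursor string."""
    raw = json.dumps([created_at_ts, activity_id]).encode()
    return base64.urlsafe_b64encode(raw).decode()

def decode_cursor(cursor):
    """Recover the (timestamp, id) pair the next page should start below."""
    created_at_ts, activity_id = json.loads(base64.urlsafe_b64decode(cursor))
    return created_at_ts, activity_id
```

The id acts as a tie-breaker: the feed query then filters on (created_at, activity_id) < (cursor_ts, cursor_id) instead of created_at alone.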