Activity Feed Aggregator: Low-Level Design
An activity feed aggregator collects events from multiple services — post created, user followed, payment completed, comment added — and assembles them into a personalized, time-ordered stream for each user. The central design challenge is fan-out: when a user with 1 million followers posts, should you write to 1 million feeds immediately (fan-out on write) or compute the feed at read time (fan-out on read)? This design covers the hybrid approach used by large social platforms, feed storage in Redis sorted sets, and pagination.
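The trade-off above can be made concrete with some rough arithmetic. The sketch below is illustrative only: the function names are hypothetical and the throughput figure is an assumed value, not a measurement of any real Redis deployment.

```python
def push_fanout_seconds(follower_count: int, writes_per_second: float) -> float:
    """Time to fan one post out to every follower at a sustained write rate."""
    return follower_count / writes_per_second


def pull_reads_per_load(followee_count: int) -> int:
    """Timeline reads needed to build one feed page with pure fan-out on read."""
    return followee_count


# Assumed throughput (illustrative, not measured): 100,000 feed writes per second.
ASSUMED_WRITES_PER_SECOND = 100_000

print(push_fanout_seconds(1_000_000, ASSUMED_WRITES_PER_SECOND))  # 10.0
print(push_fanout_seconds(500, ASSUMED_WRITES_PER_SECOND))        # 0.005
print(pull_reads_per_load(1_000))                                 # 1000
```

Under that assumed rate, a post from an account with 1 million followers occupies the write path for about ten seconds, while a post from an account with 500 followers is effectively instant. This asymmetry is what motivates treating the two kinds of account differently.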
Core Data Model
CREATE TABLE ActivityEvent (
    event_id    BIGSERIAL   NOT NULL,
    actor_id    BIGINT      NOT NULL,              -- user who performed the action
    verb        VARCHAR(50) NOT NULL,              -- 'posted', 'followed', 'liked', 'commented'
    object_type VARCHAR(50) NOT NULL,              -- 'post', 'user', 'comment', 'payment'
    object_id   BIGINT      NOT NULL,
    target_id   BIGINT,                            -- e.g. the post being commented on
    metadata    JSONB       NOT NULL DEFAULT '{}',
    created_at  TIMESTAMPTZ NOT NULL DEFAULT NOW(),
    -- On a partitioned table, PostgreSQL requires the primary key to include the partition key
    PRIMARY KEY (event_id, created_at)
) PARTITION BY RANGE (created_at);

-- Monthly partitions
CREATE TABLE ActivityEvent_2026_04 PARTITION OF ActivityEvent
    FOR VALUES FROM ('2026-04-01') TO ('2026-05-01');

CREATE INDEX ON ActivityEvent(actor_id, created_at DESC);
CREATE INDEX ON ActivityEvent(object_type, object_id, created_at DESC);
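Monthly partitions have to exist before rows for that month arrive, so they are usually created ahead of time by a scheduled job. The following is a minimal sketch of a helper that generates the DDL for one month; the function name `monthly_partition_ddl` and the idea of running it from a scheduler are assumptions for illustration, not part of the original design.

```python
from datetime import date


def monthly_partition_ddl(year: int, month: int, parent: str = "ActivityEvent") -> str:
    """Build the CREATE TABLE statement for one monthly range partition."""
    start = date(year, month, 1)
    # The upper bound is exclusive: the first day of the following month.
    end = date(year + 1, 1, 1) if month == 12 else date(year, month + 1, 1)
    name = f"{parent}_{start:%Y_%m}"
    return (
        f"CREATE TABLE IF NOT EXISTS {name} PARTITION OF {parent}\n"
        f"    FOR VALUES FROM ('{start.isoformat()}') TO ('{end.isoformat()}');"
    )


print(monthly_partition_ddl(2026, 4))
print(monthly_partition_ddl(2026, 12))  # upper bound rolls over to 2027-01-01
```

The first call produces a statement equivalent to the ActivityEvent_2026_04 partition shown above. Because range bounds are inclusive at the start and exclusive at the end, adjacent months never overlap.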
Fan-Out Strategy: Hybrid Push/Pull
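import json

import redis

# `db` below is assumed to be an application-level database helper that exposes
# fetchall() and execute(); it is not defined in this design.
r = redis.Redis(decode_responses=True)

CELEBRITY_THRESHOLD = 10_000      # followers; above this, use pull instead of push
FEED_MAX_SIZE = 1_000             # max events per user feed in Redis
FEED_TTL_SECONDS = 86400 * 7      # 7-day feed retention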

def publish_event(event: dict):
    """
    Called by any service when a notable action occurs.
    Decides fan-out strategy based on actor's follower count.
    """
    # Persist first so the durable store has the event even if fan-out fails
    _persist_event(event)

    actor_id = event['actor_id']
    # _get_follower_count is assumed to return a cached follower total
    follower_count = _get_follower_count(actor_id)
    if follower_count <= CELEBRITY_THRESHOLD:
        # Push to all followers' feeds immediately (fan-out on write)
        _fan_out_to_followers(event, actor_id)
    else:
        # Celebrity: only write to own timeline (fan-out on read)
        # Followers' feeds will merge this at read time
        _write_to_actor_timeline(event, actor_id)

def _fan_out_to_followers(event: dict, actor_id: int):
    """Push event to every follower's feed sorted set."""
    follower_ids = db.fetchall(
        "SELECT follower_id FROM Follow WHERE followee_id=%s", (actor_id,)
    )
    event_json = json.dumps(event)
    score = event['created_at']  # Unix timestamp as sorted set score
    pipe = r.pipeline(transaction=False)
    for row in follower_ids:
        feed_key = f"feed:{row['follower_id']}"
        pipe.zadd(feed_key, {event_json: score})
        # Drop the lowest-scored (oldest) entries, keeping the newest FEED_MAX_SIZE
        pipe.zremrangebyrank(feed_key, 0, -(FEED_MAX_SIZE + 1))
        pipe.expire(feed_key, FEED_TTL_SECONDS)
    pipe.execute()

def _write_to_actor_timeline(event: dict, actor_id: int):
    """For celebrity accounts, write only to the actor's own sorted set."""
    tl_key = f"timeline:{actor_id}"
    r.zadd(tl_key, {json.dumps(event): event['created_at']})
    r.zremrangebyrank(tl_key, 0, -(FEED_MAX_SIZE + 1))
    r.expire(tl_key, FEED_TTL_SECONDS)

def _persist_event(event: dict):
    db.execute("""
        INSERT INTO ActivityEvent
            (actor_id, verb, object_type, object_id, target_id, metadata, created_at)
        VALUES (%s,%s,%s,%s,%s,%s, to_timestamp(%s))
    """, (event['actor_id'], event['verb'], event['object_type'],
          event['object_id'], event.get('target_id'),
          json.dumps(event.get('metadata', {})), event['created_at']))
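For an account just under the threshold, the fan-out above queues roughly 30,000 commands (three per follower) into a single pipeline before sending anything. One possible refinement, sketched here under assumptions, is to flush the pipeline in fixed-size batches. The names `chunked`, `fan_out_batched`, and `BATCH_SIZE` are hypothetical, and the batch size of 500 is an arbitrary starting point rather than a tuned value. The `r` argument is expected to behave like a `redis.Redis` client.

```python
from itertools import islice
from typing import Iterable, Iterator

FEED_MAX_SIZE = 1_000
FEED_TTL_SECONDS = 86400 * 7
BATCH_SIZE = 500  # hypothetical value; tune against measured latency


def chunked(ids: Iterable[int], size: int) -> Iterator[list[int]]:
    """Yield successive lists of at most `size` ids."""
    it = iter(ids)
    while batch := list(islice(it, size)):
        yield batch


def fan_out_batched(r, follower_ids: Iterable[int], event_json: str,
                    score: float, batch_size: int = BATCH_SIZE) -> int:
    """Push one event to each follower feed, sending one pipeline per batch.

    Returns the number of batches sent.
    """
    batches = 0
    for batch in chunked(follower_ids, batch_size):
        pipe = r.pipeline(transaction=False)
        for follower_id in batch:
            feed_key = f"feed:{follower_id}"
            pipe.zadd(feed_key, {event_json: score})
            pipe.zremrangebyrank(feed_key, 0, -(FEED_MAX_SIZE + 1))
            pipe.expire(feed_key, FEED_TTL_SECONDS)
        pipe.execute()
        batches += 1
    return batches
```

Batching caps the memory held by any one pipeline and lets a failure be retried from the last completed batch instead of from the start. In a production system this loop would more likely run in a background worker fed by a queue than inline in the publishing service.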
Feed Read: Merge Push Feed with Celebrity Timelines
def get_feed(user_id: int, limit: int = 20, before_score: float | None = None) -> list:
    """
    Retrieve one page of a user's feed.
    Merges: (1) pre-computed push feed + (2) celebrity timelines pulled on read.
    """
    # A leading "(" makes the upper bound exclusive, so the last item of the
    # previous page is not returned again on the next one.
    max_score = f"({before_score}" if before_score is not None else '+inf'
    min_score = '-inf'

    # 1. Fetch from user's push feed
    feed_key = f"feed:{user_id}"
    push_items = r.zrevrangebyscore(
        feed_key, max_score, min_score,
        start=0, num=limit * 2,  # fetch extra to allow merging
        withscores=True
    )

    # 2. Fetch from celebrity timelines followed by this user
    celebrity_ids = db.fetchall("""
        SELECT f.followee_id FROM Follow f
        WHERE f.follower_id=%s
          AND (SELECT COUNT(*) FROM Follow c WHERE c.followee_id=f.followee_id) > %s
    """, (user_id, CELEBRITY_THRESHOLD))
    celebrity_items = []
    for row in celebrity_ids[:20]:  # cap celebrity timelines merged per request
        tl_key = f"timeline:{row['followee_id']}"
        items = r.zrevrangebyscore(tl_key, max_score, min_score,
                                   start=0, num=limit, withscores=True)
        celebrity_items.extend(items)

    # 3. Merge all items, sort by score (timestamp) descending, deduplicate
    all_items = push_items + celebrity_items
    all_items.sort(key=lambda x: x[1], reverse=True)
    seen_ids = set()
    results = []
    for event_json, score in all_items:
        event = json.loads(event_json)
        # object_type is part of the key because object_id is only unique per type
        event_key = (event['actor_id'], event['verb'],
                     event['object_type'], event['object_id'])
        if event_key not in seen_ids:
            seen_ids.add(event_key)
            event['_score'] = score
            results.append(event)
        if len(results) >= limit:
            break
    return results

def get_feed_cursor(results: list) -> float | None:
    """Returns the score of the last item for next-page pagination, or None if the page is empty."""
    return results[-1]['_score'] if results else None
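The merge and cursor logic can be exercised without Redis. The sketch below is an in-memory illustration, not the implementation above: it assumes each source is already sorted by score in descending order (as ZREVRANGEBYSCORE returns it), uses `heapq.merge` instead of a full sort, and treats the cursor as an exclusive upper bound. The names `merge_page` and `Item` are hypothetical.

```python
import heapq
from typing import Optional

Item = tuple[str, float]  # (member, score), mirroring a sorted-set entry


def merge_page(sources: list[list[Item]], limit: int,
               before_score: Optional[float] = None
               ) -> tuple[list[Item], Optional[float]]:
    """Merge score-descending sources into one page strictly older than the cursor."""
    merged = heapq.merge(*sources, key=lambda item: item[1], reverse=True)
    page: list[Item] = []
    seen: set[str] = set()
    for member, score in merged:
        if before_score is not None and score >= before_score:
            continue  # exclusive bound: skip anything already served
        if member in seen:
            continue  # the same event can appear in more than one source
        seen.add(member)
        page.append((member, score))
        if len(page) == limit:
            break
    next_cursor = page[-1][1] if page else None
    return page, next_cursor


push_feed = [("e5", 50.0), ("e3", 30.0), ("e1", 10.0)]
celebrity_timeline = [("e4", 40.0), ("e3", 30.0), ("e2", 20.0)]

page1, cursor = merge_page([push_feed, celebrity_timeline], limit=2)
page2, cursor = merge_page([push_feed, celebrity_timeline], limit=2, before_score=cursor)
print(page1)  # [('e5', 50.0), ('e4', 40.0)]
print(page2)  # [('e3', 30.0), ('e2', 20.0)]
```

One limitation of a score-only cursor is visible here: if two distinct events shared the score at a page boundary, the exclusive bound would skip the second one. A composite cursor of score plus a unique event identifier is a common way to avoid that, at the cost of a slightly more complex query.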
Key Design Decisions
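- Hybrid fan-out mitigates the celebrity problem: pure fan-out-on-write for a user with 50M followers means 50M Redis writes per post; even at 5 million writes per second, that is a 10-second operation for a single post. Pure fan-out-on-read means merging every followed account’s timeline on every feed load, so a user who follows 1,000 accounts costs 1,000 Redis reads per page. The hybrid uses fan-out-on-write for normal users (up to 10K followers) and fan-out-on-read for celebrities, so a read merges one pre-computed feed with at most 20 celebrity timelines (the cap in get_feed). The 10K threshold balances write amplification against read-time merge cost and should be tuned to the follower distribution of the platform.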
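- Redis sorted set with timestamp score: ZREVRANGEBYSCORE provides cursor-based pagination over time. Pass the last seen score as before_score for the next page; the bound should be exclusive (Redis syntax "(score") so the boundary item is not repeated. Each call is O(log N + M), where N is the size of the sorted set and M is the number of items returned, so the cost grows only logarithmically with total feed size. Sorted sets suit time-ordered feeds better than lists because they support range queries by score and ignore duplicate members. Two caveats: events that share the exact same timestamp at a page boundary can be skipped by a score-only cursor, and since Redis 6.2 ZREVRANGEBYSCORE is deprecated in favor of ZRANGE with the BYSCORE and REV options.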
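- Feed TTL of 7 days: every fan-out write resets the key’s TTL, so a feed expires only after 7 days with no new events pushed to it. To expire the feeds of inactive readers instead, refresh the TTL in get_feed rather than on write. An expired feed is rebuilt from the database (ActivityEvent table) on the user’s next read: the cold-load rebuild queries the (actor_id, created_at) index on ActivityEvent for the user’s followed accounts and re-populates the Redis feed. Together with the FEED_MAX_SIZE trim, this keeps Redis memory usage bounded.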
- Monthly partitions on ActivityEvent: the durable store accumulates events indefinitely. Monthly partitions allow old data to be detached and archived to cold storage without impacting the current month’s partition, which is the hot read path for cold-feed rebuilds.
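The cold-load rebuild mentioned in the list above is not shown in the code, so here is one way it might look. This is a sketch under several assumptions: `db` is the same application-level helper used earlier (with a `fetchall` that returns dict-like rows and a driver that adapts a Python list to a SQL array for `ANY(%s)`), `r` behaves like a `redis.Redis` client, and the names `REBUILD_SQL`, `rows_to_feed_mapping`, and `rebuild_feed` are hypothetical.

```python
import json

FEED_MAX_SIZE = 1_000
FEED_TTL_SECONDS = 86400 * 7

# Reads recent events for a set of actors via the (actor_id, created_at) index.
REBUILD_SQL = """
    SELECT actor_id, verb, object_type, object_id, target_id, metadata,
           EXTRACT(EPOCH FROM created_at) AS created_at_epoch
    FROM ActivityEvent
    WHERE actor_id = ANY(%s)
      AND created_at >= NOW() - INTERVAL '7 days'
    ORDER BY created_at_epoch DESC
    LIMIT %s
"""


def rows_to_feed_mapping(rows: list[dict]) -> dict[str, float]:
    """Convert event rows into the {member: score} mapping that ZADD expects."""
    mapping: dict[str, float] = {}
    for row in rows:
        event = {
            "actor_id": row["actor_id"],
            "verb": row["verb"],
            "object_type": row["object_type"],
            "object_id": row["object_id"],
            "target_id": row["target_id"],
            "metadata": row["metadata"],
            "created_at": float(row["created_at_epoch"]),
        }
        # sort_keys gives a stable member string for the same event
        mapping[json.dumps(event, sort_keys=True)] = event["created_at"]
    return mapping


def rebuild_feed(r, db, user_id: int, followee_ids: list[int]) -> int:
    """Repopulate feed:{user_id} from the durable store; returns events written.

    followee_ids should contain only non-celebrity accounts, since celebrity
    timelines are merged at read time.
    """
    rows = db.fetchall(REBUILD_SQL, (followee_ids, FEED_MAX_SIZE))
    mapping = rows_to_feed_mapping(rows)
    if mapping:
        feed_key = f"feed:{user_id}"
        pipe = r.pipeline(transaction=False)
        pipe.zadd(feed_key, mapping)
        pipe.expire(feed_key, FEED_TTL_SECONDS)
        pipe.execute()
    return len(mapping)
```

A caller would invoke `rebuild_feed` when the feed key is missing at read time. Note that the rebuilt member strings only match those written by the fan-out path if both serialize events identically; otherwise the read-time deduplication by event key is what prevents the same event from appearing twice.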