Social Graph System Low-Level Design

Two Graph Models

The first design decision is the relationship model:

  • Directed graph (Twitter/Instagram follow): A follows B does not imply B follows A. Schema: Follows(follower_id, followee_id, created_at). Index on both follower_id and followee_id to support “who does X follow” and “who follows X” queries.
  • Undirected graph (Facebook friend): Friendship is mutual. Schema: Friendships(user_id_1, user_id_2, created_at) with the constraint user_id_1 < user_id_2 to avoid duplicate rows. A friend request flow requires a FriendRequest(from_id, to_id, status) table for the pending state.

Scale Challenge

At 1 billion users with an average of 300 follows each, the Follows table has 300 billion rows. A single MySQL instance holds roughly 1–2 billion rows comfortably — so you need sharding.

Sharding Strategy

Shard the Follows table by follower_id mod N. This co-locates all of a user’s outgoing follows on one shard, making “get following list for user X” a single-shard query. The trade-off: “get all followers of X” (fan-in query) must scatter to all shards. For most use cases, the following list is queried far more often, so this trade-off is correct.

For celebrities with 100M+ followers, fan-in queries are prohibitively expensive. Solution: maintain a separate FollowerCount Redis counter rather than counting rows, and avoid materializing the full follower list in real-time.

Cache Layer

Adjacency lists live in Redis as sets:

SADD follow:{user_id} {followee_id_1} {followee_id_2} ...
TTL: 1 hour on last access

On a follow action: update MySQL, then SADD follow:{follower_id} {followee_id} and invalidate or update any cached follower list for the followee.

Celebrity problem: A user with 50M followers would produce a Redis set of 50M members (~400 MB). Do not cache the follower list for celebrities. Only cache their following list (typically small). Follower queries for celebrities are served from DB with pagination.

Follower and Following Counts

Never compute counts with SELECT COUNT(*) on a 300B-row table. Use Redis counters:

INCR follower_count:{user_id}
INCR following_count:{user_id}

Periodically flush to a UserStats table for durability. On cache miss, recompute from DB and re-warm Redis.

Mutual Connections

To find mutual friends between users A and B:

  • Small followings (both <10K): load both sets into Redis and run SINTER follow:A follow:B. Fast in-memory set intersection.
  • Large followings: store sorted adjacency lists in DB, use merge-join on sorted followee IDs — O(|A| + |B|).
  • Approximate: use MinHash sketches to estimate Jaccard similarity without full intersection. Useful for “you might know” scoring at scale.

People You May Know (PYMK)

The classic algorithm is friend-of-friend BFS:

  1. For user X, get their following set (depth 1).
  2. For each node at depth 1, get their following set (depth 2).
  3. Count how many depth-1 nodes point to each depth-2 node — this is the mutual connection score.
  4. Rank depth-2 nodes by score, exclude existing follows and X themselves.

This runs offline (nightly batch on Spark or Flink), results stored in a PYMK(user_id, recommended_user_id, score, updated_at) table, served from cache.

Graph Traversal at Scale

For deeper BFS (e.g., six-degrees queries or influence propagation):

  • Use a worker pool reading BFS frontier from a Kafka topic. Each worker processes one user node, publishes their neighbors to the next-level topic.
  • Track visited nodes with a Redis Bloom filter (space-efficient, probabilistic — false positives just skip a node, which is acceptable).
  • Parallelize across workers; each worker is stateless, reads from its partition of the Kafka topic.

Key APIs

POST /follow/{target_id}                  → 200 OK
DELETE /follow/{target_id}                → 200 OK
GET  /users/{id}/following?cursor=&limit= → paginated list
GET  /users/{id}/followers?cursor=&limit= → paginated list
GET  /users/{id}/mutual/{other_id}        → mutual connection list
GET  /users/{id}/pymk                     → People You May Know list

Interview Tips

  • Always clarify directed vs undirected — it changes schema, shard key choice, and cache strategy.
  • The hardest scaling problem is the celebrity fan-out on write (feed fanout). That is a separate topic from graph storage but often comes up in the same interview.
  • Bloom filters for BFS visited sets: explain the false positive implication clearly — you might revisit a node, wasting a little work, but you never miss one.

Meta system design interviews cover social graph and friend recommendations. See common questions for Meta interview: social graph and friend recommendations system design.

LinkedIn system design covers professional social graph and connections. Review patterns for LinkedIn interview: professional social graph system design.

Twitter system design covers follow graphs and social networks. See design patterns for Twitter/X interview: follow graph and social network system design.

See also: Snap Interview Guide

Scroll to Top