Two Graph Models
The first design decision is the relationship model:
- Directed graph (Twitter/Instagram follow): A follows B does not imply B follows A. Schema:
Follows(follower_id, followee_id, created_at). Index on bothfollower_idandfollowee_idto support “who does X follow” and “who follows X” queries. - Undirected graph (Facebook friend): Friendship is mutual. Schema:
Friendships(user_id_1, user_id_2, created_at)with the constraintuser_id_1 < user_id_2to avoid duplicate rows. A friend request flow requires aFriendRequest(from_id, to_id, status)table for the pending state.
Scale Challenge
At 1 billion users with an average of 300 follows each, the Follows table has 300 billion rows. A single MySQL instance holds roughly 1–2 billion rows comfortably — so you need sharding.
Sharding Strategy
Shard the Follows table by follower_id mod N. This co-locates all of a user’s outgoing follows on one shard, making “get following list for user X” a single-shard query. The trade-off: “get all followers of X” (fan-in query) must scatter to all shards. For most use cases, the following list is queried far more often, so this trade-off is correct.
For celebrities with 100M+ followers, fan-in queries are prohibitively expensive. Solution: maintain a separate FollowerCount Redis counter rather than counting rows, and avoid materializing the full follower list in real-time.
Cache Layer
Adjacency lists live in Redis as sets:
SADD follow:{user_id} {followee_id_1} {followee_id_2} ...
TTL: 1 hour on last access
On a follow action: update MySQL, then SADD follow:{follower_id} {followee_id} and invalidate or update any cached follower list for the followee.
Celebrity problem: A user with 50M followers would produce a Redis set of 50M members (~400 MB). Do not cache the follower list for celebrities. Only cache their following list (typically small). Follower queries for celebrities are served from DB with pagination.
Follower and Following Counts
Never compute counts with SELECT COUNT(*) on a 300B-row table. Use Redis counters:
INCR follower_count:{user_id}
INCR following_count:{user_id}
Periodically flush to a UserStats table for durability. On cache miss, recompute from DB and re-warm Redis.
Mutual Connections
To find mutual friends between users A and B:
- Small followings (both <10K): load both sets into Redis and run
SINTER follow:A follow:B. Fast in-memory set intersection. - Large followings: store sorted adjacency lists in DB, use merge-join on sorted followee IDs — O(|A| + |B|).
- Approximate: use MinHash sketches to estimate Jaccard similarity without full intersection. Useful for “you might know” scoring at scale.
People You May Know (PYMK)
The classic algorithm is friend-of-friend BFS:
- For user X, get their following set (depth 1).
- For each node at depth 1, get their following set (depth 2).
- Count how many depth-1 nodes point to each depth-2 node — this is the mutual connection score.
- Rank depth-2 nodes by score, exclude existing follows and X themselves.
This runs offline (nightly batch on Spark or Flink), results stored in a PYMK(user_id, recommended_user_id, score, updated_at) table, served from cache.
Graph Traversal at Scale
For deeper BFS (e.g., six-degrees queries or influence propagation):
- Use a worker pool reading BFS frontier from a Kafka topic. Each worker processes one user node, publishes their neighbors to the next-level topic.
- Track visited nodes with a Redis Bloom filter (space-efficient, probabilistic — false positives just skip a node, which is acceptable).
- Parallelize across workers; each worker is stateless, reads from its partition of the Kafka topic.
Key APIs
POST /follow/{target_id} → 200 OK
DELETE /follow/{target_id} → 200 OK
GET /users/{id}/following?cursor=&limit= → paginated list
GET /users/{id}/followers?cursor=&limit= → paginated list
GET /users/{id}/mutual/{other_id} → mutual connection list
GET /users/{id}/pymk → People You May Know list
Interview Tips
- Always clarify directed vs undirected — it changes schema, shard key choice, and cache strategy.
- The hardest scaling problem is the celebrity fan-out on write (feed fanout). That is a separate topic from graph storage but often comes up in the same interview.
- Bloom filters for BFS visited sets: explain the false positive implication clearly — you might revisit a node, wasting a little work, but you never miss one.
Meta system design interviews cover social graph and friend recommendations. See common questions for Meta interview: social graph and friend recommendations system design.
LinkedIn system design covers professional social graph and connections. Review patterns for LinkedIn interview: professional social graph system design.
Twitter system design covers follow graphs and social networks. See design patterns for Twitter/X interview: follow graph and social network system design.
See also: Snap Interview Guide