Matchmaking Service Low-Level Design: Skill-Based Matching, Queue Management, and Lobby System

Problem Overview

A matchmaking service groups players into balanced game sessions based on skill, latency, and preference. The system must minimize wait time while maximizing match quality. Poor matchmaking produces one-sided games; slow matchmaking drives players away — the design must balance both continuously.

Requirements and Constraints

Functional Requirements

Players enqueue individually or as a pre-formed party
Match players by skill rating (ELO or MMR) with configurable tolerance windows
Prefer players with low mutual latency (ping-based region affinity)
Create a lobby when a valid group is found; notify all matched players
Allow players to cancel while in queue

Non-Functional Requirements

Median wait time under 30 seconds for popular game modes
Match quality: average ELO delta within a match under 150 points at steady state
Handle 100,000 concurrent players in queue
Lobby creation latency under 200 ms after a match is found

Core Data Model

Queue Entry

CREATE TABLE matchmaking_queue (
  ticket_id       UUID PRIMARY KEY,
  player_id       BIGINT NOT NULL,
  party_id        UUID,               -- null for solo
  game_mode       VARCHAR(64) NOT NULL,
  mmr             INT NOT NULL,
  region          VARCHAR(32) NOT NULL,
  enqueued_at     TIMESTAMP NOT NULL DEFAULT NOW(),
  status          ENUM('waiting','matched','cancelled') DEFAULT 'waiting',
  tolerance       INT NOT NULL DEFAULT 50  -- current MMR tolerance window
);

Lobby

CREATE TABLE lobbies (
  lobby_id        UUID PRIMARY KEY,
  game_mode       VARCHAR(64) NOT NULL,
  region          VARCHAR(32) NOT NULL,
  server_ip       VARCHAR(45),
  status          ENUM('forming','ready','started','expired') DEFAULT 'forming',
  created_at      TIMESTAMP DEFAULT NOW()
);

CREATE TABLE lobby_players (
  lobby_id        UUID NOT NULL,
  player_id       BIGINT NOT NULL,
  team            TINYINT,
  PRIMARY KEY (lobby_id, player_id)
);

Key Algorithms and Logic

Skill-Based Matching with Tolerance Expansion

Each ticket starts with a tight MMR window (±50). A background job runs every 2 seconds and widens the tolerance by 25 points per iteration up to a cap (±300). This ensures no player waits indefinitely. The algorithm iterates over waiting tickets sorted by enqueue time and attempts to form a complete group (e.g., 5v5 = 10 players) where all players fall within the current maximum pairwise MMR delta threshold.

Greedy group formation: sort tickets by MMR ascending. Use a sliding window to find the smallest MMR range that fits exactly N players. If multiple windows qualify, prefer the one with the lowest average wait time.

Party Handling

A party is treated as a single atomic unit with an aggregate MMR equal to the average member MMR. The party occupies N slots when matched, where N is party size. The remaining slots are filled with solo players whose MMR falls within tolerance of the party average. Teams are balanced by distributing the party to one side and filling the other side to match total team MMR within a threshold.

Region and Latency Affinity

Players report their region on enqueue. The matchmaker first attempts same-region matches. After 15 seconds without a match, it expands to adjacent regions. Predicted ping is estimated from region pairs using a pre-built latency matrix; matches exceeding 80 ms predicted ping are only allowed after full tolerance expansion.

Matching Worker Architecture

Queue state lives in Redis as a sorted set per (game_mode, region) keyed by MMR: queue:{mode}:{region}. Matching workers (stateless) use a distributed lock per (mode, region) to prevent duplicate match attempts. Workers use ZRANGEBYSCORE to extract candidates in the current MMR window, attempt group formation in memory, and atomically remove matched tickets with a Lua script that checks status before deletion.

API Design

POST /v1/matchmaking/tickets
Body: { "player_id": "...", "game_mode": "ranked_5v5", "region": "us-east", "mmr": 1420 }
Response: { "ticket_id": "uuid", "status": "waiting", "estimated_wait_sec": 18 }

DELETE /v1/matchmaking/tickets/{ticket_id}
Response: 204 No Content

GET /v1/matchmaking/tickets/{ticket_id}/status
Response: { "status": "matched", "lobby_id": "uuid" }

POST /v1/lobbies/{lobby_id}/ready
Body: { "player_id": "..." }
Response: { "lobby_status": "forming", "ready_count": 7, "total": 10 }

Lobby Lifecycle

Once a match is formed, a lobby record is created and all matched players receive a WebSocket push notification. Players have 30 seconds to click “Accept.” If any player declines or times out, the lobby is dissolved and non-offending players are re-queued with a priority boost (they retain their original enqueue timestamp). Players who repeatedly decline receive a short matchmaking cooldown.

Scalability Considerations

Horizontal matching workers: Shard queues by (mode, region) — each shard has exactly one active worker holding a Redis lock, with hot standby for failover.
Queue depth monitoring: Alert when median wait time exceeds 45 seconds; auto-scale matching worker frequency (run every 1 s instead of 2 s) to clear backlog.
Game server provisioning: Pre-warm server fleet based on current queue depth and historical match rate; call a game server allocation API asynchronously as soon as a match is found.
State durability: Redis queue is the hot path, but all ticket state is also written to the relational DB for auditability and replay on Redis failure.

Failure Modes and Mitigations

Worker crash mid-match: Lua script atomicity ensures tickets are either all removed or none are. Orphaned “matched” tickets have a 60-second TTL watchdog that returns them to “waiting.”
Redis restart: Rebuild queue from DB rows where status = 'waiting' on startup; use AOF persistence to reduce rebuild scope.
MMR manipulation: Rate-limit ticket submissions per player; validate MMR server-side against the ranked profile service rather than trusting client-supplied values.

{
“@context”: “https://schema.org”,
“@type”: “FAQPage”,
“mainEntity”: [
{
“@type”: “Question”,
“name”: “How does ELO/MMR skill matching work in a matchmaking system?”,
“acceptedAnswer”: {
“@type”: “Answer”,
“text”: “Each player has a numeric rating (ELO or MMR). The matchmaker finds opponents whose ratings fall within an acceptable delta of the queuing player's rating. Ratings are stored per game mode and updated after each match using a K-factor weighted formula, so recent performance influences future match quality.”
}
},
{
“@type”: “Question”,
“name”: “What is queue expansion policy and when should it apply?”,
“acceptedAnswer”: {
“@type”: “Answer”,
“text”: “Queue expansion widens the acceptable MMR search range over time to prevent players from waiting indefinitely. A tiered policy starts with a tight delta (e.g. ±50 MMR), then expands in steps every 30–60 seconds. After a configurable ceiling (e.g. ±300 MMR) the player is placed in a fill match or shown an estimated wait time.”
}
},
{
“@type”: “Question”,
“name”: “How is aggregate MMR calculated for a party queuing together?”,
“acceptedAnswer”: {
“@type”: “Answer”,
“text”: “Party MMR is typically computed as the weighted average of individual member ratings, sometimes boosted by the highest member's MMR to account for smurf protection. The resulting aggregate is then used as if it were a single player's rating when searching for opponents, ensuring fair team composition.”
}
},
{
“@type”: “Question”,
“name”: “What are the key states in a lobby lifecycle?”,
“acceptedAnswer”: {
“@type”: “Answer”,
“text”: “A lobby progresses through: QUEUING (players added to pool), MATCHED (group assembled, lock acquired), ACCEPTING (each player confirms readiness with a timeout), LOADING (game server allocated, map loaded), IN_GAME (active session), and CLOSED (results persisted, MMR updated). Any acceptance timeout or disconnect before IN_GAME returns players to QUEUING.”
}
}
]
}