Designing a ride-sharing app is a classic system design interview question, appearing frequently at Uber, Lyft, DoorDash, and Meta. It combines geospatial indexing, real-time data streams, distributed systems, and marketplace dynamics, and senior-level interviews typically probe all four areas.
Functional Requirements
- Rider requests a ride from location A to B
- System matches the rider with the nearest available driver
- Driver and rider can track each other in real time during the trip
- Trip pricing (upfront fare with surge multiplier)
- Payment processing on trip completion
- Rating system for drivers and riders
Non-Functional Requirements and Scale
| Metric | Target |
|---|---|
| Active drivers (peak) | 1 million globally |
| Active riders (peak) | 5 million globally |
| Location update rate | Every 4 seconds per driver |
| Location update QPS | ~250,000/s (1M drivers / 4s) |
| Match latency | < 1 second (rider to matched driver) |
| Availability | 99.99% |
High-Level Architecture
[Rider App]          [Driver App]
      \                   /
       [API Gateway + Auth]
      /          |         \
[Ride Service] [Location Service] [Dispatch Service]
      |              |                  |
[Trip State DB] [Redis Geo Index] [Matching Engine]
      |                                 |
[Payment Service]                 [ETA Service]
      |
[Notification Service (SMS/Push)]
Real-Time Driver Location Tracking
Driver Location Updates
Each driver app sends GPS coordinates every 4 seconds via WebSocket or HTTP POST. At 1 million active drivers, this is 250,000 writes/second — too high for a traditional relational DB.
Geospatial Index with Redis
Store driver locations in Redis using the GEOADD/GEORADIUS commands (Redis 6.2+ deprecates GEORADIUS in favor of GEOSEARCH, but the semantics are equivalent for this discussion). Redis Geo uses a sorted set with geohash-encoded scores:
# Driver sends location update (note: GEOADD takes longitude before latitude)
GEOADD drivers:available -122.4194 37.7749 "driver:12345"
# Rider requests a ride at (37.778, -122.415) — find drivers within 2km
GEORADIUS drivers:available -122.415 37.778 2 km ASC COUNT 10 WITHDIST
# Returns:
# 1) driver:12345, 0.43 km
# 2) driver:67890, 1.21 km
# ...
When a driver goes offline or accepts a ride, remove them from the available set:
ZREM drivers:available "driver:12345"
Geohash for Sharding
For global scale, partition the Redis Geo index by geohash prefix (first 4 characters = ~40km x 20km cells). Each cell is served by a separate Redis instance, allowing horizontal scaling of location writes. Note that a radius query near a cell boundary must fan out to the neighboring cells as well.
def get_redis_shard(lat: float, lon: float) -> Redis:
    # First 4 chars of the geohash identify the ~40km cell
    cell = geohash.encode(lat, lon)[:4]
    # Use a stable hash (e.g. CRC32): Python's built-in hash() is
    # randomized per process, so it would route the same cell to
    # different shards on different servers
    shard_index = zlib.crc32(cell.encode()) % NUM_SHARDS
    return redis_pool[shard_index]
Dispatch and Driver Matching
Matching Algorithm
Simple approach: nearest available driver. Production systems also consider:
- ETA to pickup (not just straight-line distance — uses road network)
- Driver acceptance rate (avoid sending to drivers likely to decline)
- Driver heading (driver heading toward rider is preferable)
- Batch matching (batching multiple nearby requests to optimize global assignment)
def match_driver(rider_lat, rider_lon, radius_km=2.0):
    # withdist=True so each candidate comes back as (driver_id, distance)
    candidates = redis.georadius(
        "drivers:available", rider_lon, rider_lat,
        radius_km, unit="km", withdist=True, sort="ASC", count=10
    )
    best_driver = None
    best_eta = float("inf")
    for driver_id, distance in candidates:
        # Get driver position from location service
        driver_pos = location_service.get(driver_id)
        # Calculate ETA using road network (cached for common routes)
        eta = eta_service.estimate(driver_pos, (rider_lat, rider_lon))
        if eta < best_eta:
            best_eta = eta
            best_driver = driver_id
    return best_driver, best_eta
Offer-Accept Flow
RIDER requests ride
↓
DISPATCH SERVICE finds top 3 nearby drivers
↓
Send offer to Driver #1 (15-second timeout)
↓ accept
LOCK driver (set status=matched in Redis atomically with SET NX + expiry)
↓
NOTIFY rider: driver found, ETA X minutes
↓ if timeout/decline
Try Driver #2, then Driver #3
↓ if all decline
Expand search radius and retry
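The offer loop above can be condensed into a small function. This is a sketch with hypothetical injected callables: send_offer blocks until the driver responds or the 15-second timeout fires, and try_lock is the atomic assignment described in the next section.

```python
def run_offer_loop(candidates, send_offer, try_lock, timeout_sec=15):
    """Offer the ride to each candidate driver in order.

    candidates  -- driver ids sorted best-first (e.g. by pickup ETA)
    send_offer  -- blocks up to timeout_sec; returns True on accept
    try_lock    -- atomically claims the driver; False if already taken
    Returns the matched driver id, or None so the caller can widen
    the search radius and retry.
    """
    for driver_id in candidates:
        if send_offer(driver_id, timeout_sec) and try_lock(driver_id):
            return driver_id
    return None
```

Because try_lock runs after acceptance, a driver who was claimed by a concurrent dispatch server in the meantime is simply skipped and the next candidate is tried.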
Preventing Double Assignment
Race condition: two dispatch servers simultaneously offer the same driver to two different riders. Prevent with Redis atomic lock:
def try_assign_driver(driver_id: str, trip_id: str) -> bool:
    lock_key = f"driver:lock:{driver_id}"
    # NX = set only if not exists, EX = expire in 30 seconds
    acquired = redis.set(lock_key, trip_id, nx=True, ex=30)
    if acquired:
        # Remove from available set
        redis.zrem("drivers:available", driver_id)
        return True
    return False  # driver already assigned
Trip State Machine
REQUESTED
|
[Driver matched]
|
ACCEPTED
|
[Driver picks up rider]
|
IN_TRIP
|
[Driver reaches destination]
|
COMPLETED
|
[Payment processed, rating prompted]
|
CLOSED
Side transitions:
REQUESTED → CANCELLED (rider cancels, or no driver is found within 3 minutes)
ACCEPTED → CANCELLED (driver cancels; a new match begins)
IN_TRIP → COMPLETED (the only exit from in-trip; an in-progress trip cannot be cancelled)
Store trip state in PostgreSQL with an events table for audit trail. Use optimistic locking (version column) to prevent concurrent state transitions.
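The optimistic-locking transition is a compare-and-swap UPDATE on (status, version). A minimal sketch, using SQLite as a stand-in for PostgreSQL and a table trimmed to the relevant columns:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE trips (id TEXT PRIMARY KEY,"
    " status TEXT NOT NULL, version INT NOT NULL DEFAULT 0)"
)
conn.execute("INSERT INTO trips VALUES ('trip-1', 'ACCEPTED', 0)")

def transition(trip_id: str, from_status: str, to_status: str,
               expected_version: int) -> bool:
    # The UPDATE only matches if no other writer has transitioned the
    # trip since we read it, i.e. the version column is unchanged
    cur = conn.execute(
        "UPDATE trips SET status = ?, version = version + 1"
        " WHERE id = ? AND status = ? AND version = ?",
        (to_status, trip_id, from_status, expected_version),
    )
    conn.commit()
    return cur.rowcount == 1
```

The first writer wins and bumps the version; a stale writer presenting the old version gets rowcount 0 and must re-read the trip before retrying.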
ETA Estimation
ETA needs two components:
- Driver pickup ETA: time from driver current location to rider pickup point
- Trip ETA: time from pickup to destination
Road network routing uses a precomputed graph (OSM data) with Dijkstra or A* over contraction hierarchies. Real-time traffic data (from anonymized GPS traces of all active drivers) modifies edge weights. Cache common route ETAs in Redis.
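The routing core reduces to shortest-path over travel times. A minimal Dijkstra sketch, assuming edge weights are travel seconds already scaled by a traffic multiplier (contraction hierarchies and caching omitted):

```python
import heapq

def eta_seconds(graph, src, dst):
    """Dijkstra over a road graph: graph[u] = [(v, travel_sec), ...].
    Returns the minimum travel time in seconds, or inf if unreachable."""
    dist = {src: 0}
    pq = [(0, src)]
    while pq:
        d, u = heapq.heappop(pq)
        if u == dst:
            return d
        if d > dist.get(u, float("inf")):
            continue  # stale queue entry
        for v, w in graph.get(u, []):
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(pq, (nd, v))
    return float("inf")
```

With live traffic, edge weights change continuously, which is why production systems precompute hierarchies and cache popular origin-destination pairs rather than running plain Dijkstra per request.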
Surge Pricing
def calculate_surge_multiplier(pickup_geohash: str) -> float:
    cell = pickup_geohash[:5]  # ~5km cell
    available_drivers = count_available_drivers(cell)
    pending_requests = count_pending_requests(cell, window_sec=300)
    # Supply/demand ratio
    ratio = available_drivers / max(pending_requests, 1)
    if ratio > 1.5: return 1.0    # plenty of drivers
    elif ratio > 0.8: return 1.2
    elif ratio > 0.5: return 1.5
    elif ratio > 0.2: return 2.0
    else: return 3.0              # severe shortage
The raw multiplier is then smoothed before being shown to riders, so prices do not oscillate from one pricing tick to the next:
surge = alpha * current_surge + (1 - alpha) * new_surge  # EMA smoothing
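The EMA step can be sketched as a small function; the alpha value and the numbers below are illustrative, not production-tuned:

```python
def smooth_surge(current: float, raw: float, alpha: float = 0.7) -> float:
    # Move only part-way from the currently displayed multiplier toward
    # the freshly computed raw one, damping tick-to-tick oscillation
    return round(alpha * current + (1 - alpha) * raw, 2)
```

If demand spikes and the raw multiplier jumps from 1.0 to 3.0, riders see a ramp rather than a cliff: 1.0 → 1.6 → 2.02 → ... converging toward 3.0 over several ticks.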
Database Design
-- Trips (PostgreSQL — OLTP, transactional)
CREATE TABLE trips (
id UUID PRIMARY KEY,
rider_id BIGINT NOT NULL,
driver_id BIGINT,
status VARCHAR(20) NOT NULL, -- state machine
pickup_lat DOUBLE PRECISION,
pickup_lon DOUBLE PRECISION,
dropoff_lat DOUBLE PRECISION,
dropoff_lon DOUBLE PRECISION,
fare_cents INT,
surge_multiplier DECIMAL(4,2),
version INT DEFAULT 0, -- optimistic lock
created_at TIMESTAMP NOT NULL,
completed_at TIMESTAMP
);
-- Location history (Cassandra — high-write time series)
CREATE TABLE driver_locations (
driver_id BIGINT,
recorded_at TIMESTAMP,
lat DOUBLE,
lon DOUBLE,
speed_kmh FLOAT,
PRIMARY KEY (driver_id, recorded_at)
) WITH CLUSTERING ORDER BY (recorded_at DESC);
Interview Discussion Points
- Why Redis for location instead of PostGIS? Write throughput (250K/s), O(log n) GEORADIUS, in-memory speed; stale entries are swept by last-update timestamp (Redis TTLs apply per key, not per set member)
- How do you handle driver going through a tunnel (GPS loss)? Use dead reckoning (last known position + heading + speed); if gap > 30s, remove from available index
- How do you prevent surge gaming? Smooth price changes with EMA, show surge to rider upfront, require explicit acceptance
- How do you handle city-level peak demand (NYE, stadium events)? Pre-scale dispatch servers, pre-warm Redis, incentive pricing to attract more drivers to hot zones
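The dead-reckoning fallback mentioned above can be sketched as a flat-earth projection of the last known fix, which is adequate for the few hundred meters a short GPS outage covers (the 30-second cutoff stays as described):

```python
import math

EARTH_RADIUS_M = 6_371_000

def dead_reckon(lat, lon, heading_deg, speed_kmh, elapsed_sec):
    """Project a driver's last known fix forward along its heading.
    heading_deg: 0 = north, 90 = east. Flat-earth approximation."""
    dist_m = speed_kmh / 3.6 * elapsed_sec
    theta = math.radians(heading_deg)
    dlat_rad = dist_m * math.cos(theta) / EARTH_RADIUS_M
    dlon_rad = dist_m * math.sin(theta) / (
        EARTH_RADIUS_M * math.cos(math.radians(lat))
    )
    return lat + math.degrees(dlat_rad), lon + math.degrees(dlon_rad)
```

A driver heading north at 36 km/h covers 300 m in 30 seconds, roughly 0.0027 degrees of latitude; past the 30-second gap the estimate degrades quickly, which is why the driver is then dropped from the available index.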
Frequently Asked Questions
How does Uber match drivers to riders?
Uber uses a two-step process. First, geospatial retrieval: GEORADIUS in Redis finds all available drivers within 2km of the rider in O(log n) time. Drivers are stored in a Redis Geo sorted set with their last known GPS coordinates, updated every 4 seconds. Second, optimization: from the top 10 candidates, the dispatch service estimates ETA using a road-network routing engine (not straight-line distance) and selects the driver with the lowest ETA. The offer is sent to that driver with a 15-second acceptance timeout; if declined, the next candidate is tried. An atomic Redis SETNX lock prevents two dispatch servers from assigning the same driver to two different riders simultaneously.
How does surge pricing work in a ride-sharing system?
Surge pricing uses a supply-demand ratio computed per geohash cell (roughly 5km area). The system counts available drivers and pending ride requests in the cell over a rolling 5-minute window. When demand greatly exceeds supply (ratio below 0.2), a multiplier of 2x-3x is applied to the base fare. The multiplier uses exponential moving average smoothing to prevent rapid price oscillations that would confuse riders. Riders must explicitly accept the surge price before the request is submitted. Surge pricing serves as a market signal — higher earnings attract more drivers to the area, naturally rebalancing supply and demand.
Why use Redis instead of a relational database for driver locations?
Driver location storage has extreme write requirements: 1 million active drivers sending GPS updates every 4 seconds = 250,000 writes per second. A traditional relational database cannot sustain this write throughput with acceptable latency. Redis provides O(log n) GEOADD writes and O(log n) GEORADIUS range queries, serving millions of operations per second with sub-millisecond latency. Redis Geo uses a sorted set internally, encoding geographic coordinates as a geohash score, so no schema is needed; offline or stale drivers are removed explicitly with ZREM or swept by last-update timestamp. Additionally, Redis keeps data in memory, eliminating the disk I/O bottleneck.