System Design Interview: Ride-Sharing App (Uber / Lyft)

Uber operates in 70+ countries with 5M+ trips per day and 4M+ drivers. Designing a ride-sharing app covers real-time geospatial systems, driver-rider matching, surge pricing, and distributed trip state management. This is a standard senior system design question at Uber, Lyft, DoorDash, and any company with marketplace or logistics systems.

Requirements

Functional: Riders request a ride. The system finds the nearest available driver and notifies them. Driver accepts or declines. Trip tracking with live location updates. Estimated time of arrival (ETA). Dynamic pricing (surge). Payment on trip completion.

Non-functional: Driver location updated every 4 seconds. Match found in <2 seconds of rider request. ETA accurate within 2 minutes. Scale: 5M concurrent drivers sending location updates = 1.25M writes/second. 1M concurrent riders. 99.99% uptime — being unable to book a ride during an emergency is critical.

Driver Location Service

Every 4 seconds, each active driver sends their GPS coordinates to the location service. At 5M drivers, that is 1.25M writes/second. Architecture:

  • Drivers connect via WebSocket or HTTP long-polling to a geolocation gateway
  • Location updates are written to Redis Geo (GEOADD command): stores coordinates in a sorted set using geohash encoding
  • Redis GEORADIUS queries return all drivers within a radius of a given point in O(N+log(M)) time
  • Driver location data is short-lived — TTL of 30 seconds. A driver not updating for 30 seconds is considered offline

At 1.25M writes/second, a single Redis instance cannot handle all writes. Shard by city: each city has its own Redis cluster. Uber has 10K+ cities — most have fewer than 1K active drivers; only megacities (NYC, London, Mumbai) need dedicated clusters.

Driver Matching Algorithm

When a rider requests a trip, the matching service:

  1. Queries the location service for all available drivers within 5km of the pickup point
  2. Scores each candidate driver: score = f(distance, driver_rating, car_type_preference, estimated_pickup_time)
  3. Attempts to match with the top-scored driver: sends a push notification (Firebase/APNs) to their phone
  4. The driver has 15 seconds to accept. If declined or no response, the next driver in the ranked list is tried
  5. If no driver accepts after 5 attempts, expand the search radius to 10km and retry

ETA calculation: use a pre-computed road graph with edge weights representing travel time (updated with real-time traffic). Run A* or Contraction Hierarchies shortest-path from driver location to pickup, then from pickup to destination.

Trip State Machine

Each trip is a state machine stored in PostgreSQL:

States: REQUESTED → ACCEPTED → DRIVER_ARRIVED → IN_PROGRESS → COMPLETED
        REQUESTED → CANCELLED (rider cancels before match)
        ACCEPTED  → CANCELLED (driver cancels or rider cancels)

trip_id, rider_id, driver_id, status, pickup_lat, pickup_lng,
dest_lat, dest_lng, fare_estimate, actual_fare, started_at, completed_at

State transitions are enforced with database constraints and optimistic locking (version column). A trip in ACCEPTED state cannot transition to IN_PROGRESS unless the driver has arrived (GPS proximity check). Events are published to Kafka on each transition — other services (payment, analytics, notifications) consume these events.

Real-Time Location Sharing During Trip

Once a driver is matched, both rider and driver share real-time location. Architecture: driver sends location every 2 seconds to a location streaming service. The streaming service publishes to a Redis Pub/Sub channel keyed by trip_id. The rider app subscribes to this channel via a WebSocket server and receives driver location updates in real-time. At 1M concurrent active trips, Redis Pub/Sub channels handle the fan-out — each trip has at most 2-3 subscribers (rider, sometimes shared passengers).

Dynamic Pricing (Surge)

When demand exceeds supply in an area, surge pricing multiplies the base fare to attract more drivers and reduce demand. Surge calculation:

# Simplified surge multiplier per geographic cell (geohash level 6, ~1km^2)
def surge_multiplier(cell: str) -> float:
    demand = open_requests[cell]          # active ride requests
    supply = available_drivers[cell]      # idle drivers
    if supply == 0:
        return 3.0  # max surge
    ratio = demand / supply
    if ratio < 1.0: return 1.0
    if ratio < 1.5: return 1.2
    if ratio < 2.0: return 1.5
    if ratio < 3.0: return 2.0
    return 3.0

Demand and supply counts are maintained in Redis per geohash cell, updated in real time as requests arrive and drivers go online/offline. Surge multipliers are re-computed every 60 seconds per cell. The rider sees the current multiplier before confirming a booking.

Payment Processing

Fare is calculated at trip completion: base_fare + per_km_rate × distance + per_minute_rate × duration, multiplied by the surge that was applied at booking time. Payment is triggered by the COMPLETED state transition. The payment service charges the pre-authorized payment method (card or Uber Cash). Idempotency key = trip_id ensures the charge runs exactly once. The driver receives their share (typically 75-80% of fare) after platform fee deduction, paid weekly via ACH bank transfer.

Maps and ETA

Uber builds its own mapping stack (Uber Movement data). For smaller companies, Google Maps Platform or HERE provides: geocoding (address → coordinates), routing (A→B with traffic), ETA, and street view. Google Maps Directions API costs ~$0.005 per request — at 5M trips/day, that is $25K/day. At scale, building an in-house routing engine (or licensing HERE) becomes cost-effective. Road network is stored as a graph with time-dependent edge weights updated from traffic sensors and historical data.

Frequently Asked Questions

How does Uber store and query driver locations in real time?

Uber uses Redis Geo, which stores geographic coordinates as geohash values in a sorted set. Each driver location update calls GEOADD drivers:{city_id} {lng} {lat} {driver_id}. Finding drivers within 5km of a pickup point calls GEORADIUS drivers:{city_id} {lng} {lat} 5 km ASC. Redis executes this in O(N+log(M)) time where N is the number of returned points and M is the total set size. Location updates arrive at 1.25 million per second (5M drivers × every 4 seconds). This exceeds single-instance Redis capacity, so sharding by city is essential — each major city has its own Redis cluster. Driver location TTL is 30 seconds: if a driver does not send an update, they are considered offline and removed from the available pool. Redis GEO also supports GEOSEARCH with radius, polygon, or bounding box, enabling efficient queries for drivers in specific areas.

How does surge pricing work in a ride-sharing system?

Surge pricing adjusts fares dynamically when demand exceeds supply in a geographic area. The calculation divides the city into geohash cells (approximately 1km²). For each cell, two counters are maintained in Redis: active ride requests (demand) and available idle drivers (supply). A pricing service recomputes the surge multiplier every 60 seconds per cell using the demand/supply ratio. If demand equals supply, the multiplier is 1.0 (no surge). As demand grows relative to supply, the multiplier increases in steps (1.2, 1.5, 2.0, 3.0). The multiplier is shown to riders before they confirm a booking, and the fare at the time of booking is locked in — even if surge changes during the trip. The goal of surge pricing is market clearing: higher prices reduce rider demand (some choose alternatives) and attract more drivers to the high-demand area, restoring equilibrium. The multiplier and the surge response curve are tuned empirically per city based on price elasticity data.

How do you design the driver-rider matching algorithm?

Driver-rider matching is an online assignment problem solved greedily in real time. When a ride request arrives, the matching service queries all available drivers within a configurable radius (default 5km) from the pickup point. Each candidate driver is scored using a weighted formula: score = w1 × (1/distance) + w2 × driver_rating + w3 × (1/estimated_pickup_time) + w4 × preference_match (car type, accessibility). The top-scored driver receives a push notification (Firebase Cloud Messaging on Android, APNs on iOS). The driver has 15 seconds to accept. If they decline or do not respond, the next driver in the ranked list is offered the trip. If no driver accepts after 5 attempts, the search radius expands to 10km. ETA is computed using a pre-computed road graph with time-dependent edge weights (traffic from historical data and real-time sensors), running A* or Contraction Hierarchies for fast shortest-path computation. The matching decision happens in under 2 seconds, which requires the driver location cache in Redis to respond in under 10ms and the scoring to run in-memory without additional I/O.

{
“@context”: “https://schema.org”,
“@type”: “FAQPage”,
“mainEntity”: [
{
“@type”: “Question”,
“name”: “How does Uber store and query driver locations in real time?”,
“acceptedAnswer”: {
“@type”: “Answer”,
“text”: “Uber uses Redis Geo, which stores geographic coordinates as geohash values in a sorted set. Each driver location update calls GEOADD drivers:{city_id} {lng} {lat} {driver_id}. Finding drivers within 5km of a pickup point calls GEORADIUS drivers:{city_id} {lng} {lat} 5 km ASC. Redis executes this in O(N+log(M)) time where N is the number of returned points and M is the total set size. Location updates arrive at 1.25 million per second (5M drivers × every 4 seconds). This exceeds single-instance Redis capacity, so sharding by city is essential — each major city has its own Redis cluster. Driver location TTL is 30 seconds: if a driver does not send an update, they are considered offline and removed from the available pool. Redis GEO also supports GEOSEARCH with radius, polygon, or bounding box, enabling efficient queries for drivers in specific areas.”
}
},
{
“@type”: “Question”,
“name”: “How does surge pricing work in a ride-sharing system?”,
“acceptedAnswer”: {
“@type”: “Answer”,
“text”: “Surge pricing adjusts fares dynamically when demand exceeds supply in a geographic area. The calculation divides the city into geohash cells (approximately 1km²). For each cell, two counters are maintained in Redis: active ride requests (demand) and available idle drivers (supply). A pricing service recomputes the surge multiplier every 60 seconds per cell using the demand/supply ratio. If demand equals supply, the multiplier is 1.0 (no surge). As demand grows relative to supply, the multiplier increases in steps (1.2, 1.5, 2.0, 3.0). The multiplier is shown to riders before they confirm a booking, and the fare at the time of booking is locked in — even if surge changes during the trip. The goal of surge pricing is market clearing: higher prices reduce rider demand (some choose alternatives) and attract more drivers to the high-demand area, restoring equilibrium. The multiplier and the surge response curve are tuned empirically per city based on price elasticity data.”
}
},
{
“@type”: “Question”,
“name”: “How do you design the driver-rider matching algorithm?”,
“acceptedAnswer”: {
“@type”: “Answer”,
“text”: “Driver-rider matching is an online assignment problem solved greedily in real time. When a ride request arrives, the matching service queries all available drivers within a configurable radius (default 5km) from the pickup point. Each candidate driver is scored using a weighted formula: score = w1 × (1/distance) + w2 × driver_rating + w3 × (1/estimated_pickup_time) + w4 × preference_match (car type, accessibility). The top-scored driver receives a push notification (Firebase Cloud Messaging on Android, APNs on iOS). The driver has 15 seconds to accept. If they decline or do not respond, the next driver in the ranked list is offered the trip. If no driver accepts after 5 attempts, the search radius expands to 10km. ETA is computed using a pre-computed road graph with time-dependent edge weights (traffic from historical data and real-time sensors), running A* or Contraction Hierarchies for fast shortest-path computation. The matching decision happens in under 2 seconds, which requires the driver location cache in Redis to respond in under 10ms and the scoring to run in-memory without additional I/O.”
}
}
]
}

Companies That Ask This Question

Scroll to Top