Designing a ride-sharing app is a rich system design problem that tests your ability to handle real-time location data, geospatial queries, state machines, and dynamic pricing simultaneously. Uber processes millions of trips daily across 70+ countries — and the core architecture is publicly documented in their engineering blog.
Step 1: Clarify Requirements
- Core flows: Rider requests trip → driver match → pickup → trip → payment.
- Scale: 5M trips/day, 3M active drivers globally at peak.
- Location updates: How frequently do drivers report their GPS position?
- Matching: Nearest available driver? Or optimize for ETA?
- Real-time tracking: Rider sees driver moving on map during pickup and trip.
- Pricing: Static fare? Dynamic surge pricing?
Assume: 5M trips/day, 3M active drivers sending location every 4 seconds, match by lowest ETA, real-time map tracking, surge pricing.
Step 2: Back-of-Envelope
Location updates:
3M active drivers × 1 update/4s = 750,000 location writes/sec
Each update: driver_id (8B) + lat (8B) + lng (8B) + timestamp (8B) = 32 bytes
750K × 32B = 24MB/sec of location data
Trip requests:
5M trips/day = ~58 trips/sec (peak: ~500/sec)
Active trips to track:
Avg trip duration 15 min → 58/sec × 900s = ~52,000 concurrent active trips
Each trip: 2 users tracking in real-time
The dominant load is location writes — 750K/sec. Everything else is comparatively light.
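The arithmetic above can be sketched directly; the constants are the assumptions from Step 1 (3M drivers, one update every 4 seconds, 5M trips/day, 15-minute average trip):

```python
# Back-of-envelope estimates from the stated assumptions.
DRIVERS = 3_000_000
UPDATE_INTERVAL_S = 4
BYTES_PER_UPDATE = 32            # driver_id + lat + lng + timestamp, 8 B each
TRIPS_PER_DAY = 5_000_000
AVG_TRIP_S = 15 * 60

location_writes_per_s = DRIVERS // UPDATE_INTERVAL_S                 # 750,000
location_mb_per_s = location_writes_per_s * BYTES_PER_UPDATE / 1e6   # 24.0 MB/s
trips_per_s = TRIPS_PER_DAY / 86_400                                 # ~58
concurrent_trips = trips_per_s * AVG_TRIP_S                          # ~52,000
```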
Step 3: Location Service
The location service ingests driver GPS updates, stores current positions, and answers “find N closest available drivers to this point.”
Storage: Redis Geospatial Index
Redis has native geospatial commands using a sorted set with geohash-encoded scores:
GEOADD drivers_available {lng} {lat} {driver_id}
# Updates position in O(log N)
GEORADIUS drivers_available {rider_lng} {rider_lat} 5 km COUNT 10 ASC
# Returns the 10 closest available drivers within 5 km, sorted by distance
# O(N + log M) where N = elements in the search area, M = total drivers
# (On Redis 6.2+, GEOSEARCH ... BYRADIUS is the non-deprecated equivalent)
A single Redis node cannot comfortably absorb 3M drivers' writes, so shard by geographic region (one Redis shard per major metro area). Each shard then holds a city-sized index and sees only its local read and write load.
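One way to sketch region sharding is a lookup from coordinates to a per-metro Redis key; the metro names and bounding boxes below are illustrative placeholders, and a production system would use H3 cells or geofence polygons instead:

```python
# Minimal sketch of region-based sharding: each metro area gets its own
# Redis shard and its own geo key. Bounding boxes here are made up.
METRO_BOUNDS = {
    "sf":  (37.2, 38.0, -122.6, -121.7),   # (lat_min, lat_max, lng_min, lng_max)
    "nyc": (40.4, 41.0, -74.3, -73.6),
}

def shard_key_for(lat: float, lng: float) -> str:
    """Map a coordinate to the geo key on the shard that owns its metro."""
    for metro, (lat_lo, lat_hi, lng_lo, lng_hi) in METRO_BOUNDS.items():
        if lat_lo <= lat <= lat_hi and lng_lo <= lng <= lng_hi:
            return f"drivers_available:{metro}"
    return "drivers_available:default"   # fallback bucket for uncovered areas
```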
Location Update Pipeline
750K writes/sec spread across region shards is manageable: a single Redis node sustains on the order of hundreds of thousands of simple operations per second, and sharding divides the load. But we also need location history (for ETA prediction, routing, dispute resolution).
Driver app
↓ gRPC every 4 seconds
Location Service
├─ GEOADD to Redis (for real-time proximity queries)
└─ Publish to Kafka "driver_locations"
└─ Location History Consumer
→ Cassandra (time-series location log per driver)
→ ETA Model updater
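The pipeline above reduces to one handler that fans each GPS update out twice: once to the geo index for proximity queries, once to a stream for history. A minimal sketch, with `RedisGeo` and `KafkaBus` as in-memory stand-ins for real redis-py and kafka-python clients:

```python
import json
import time

class RedisGeo:                      # stand-in for a redis.Redis client
    def __init__(self):
        self.positions = {}
    def geoadd(self, key, lng, lat, member):
        self.positions[member] = (lng, lat)

class KafkaBus:                      # stand-in for a kafka.KafkaProducer
    def __init__(self):
        self.messages = []
    def send(self, topic, value):
        self.messages.append((topic, value))

def ingest_location(geo, bus, driver_id, lat, lng):
    """Fan one driver GPS update out to the geo index and the history stream."""
    geo.geoadd("drivers_available", lng, lat, driver_id)   # real-time index
    bus.send("driver_locations", json.dumps(
        {"driver_id": driver_id, "lat": lat, "lng": lng, "ts": time.time()}))
```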
Step 4: Driver Matching
When a rider requests a trip:
- Geocode the pickup location.
- Query Redis: find the 10 closest available drivers within 5km.
- For each candidate driver, compute estimated time of arrival (ETA) using road network routing (not straight-line distance — a driver 800m away but on the other side of a river may have a worse ETA than one 1.2km away).
- Select the driver with the lowest ETA.
- Send a trip request to that driver. If they decline or don’t respond within 15 seconds, try the next candidate.
- Once a driver accepts, atomically mark them “on trip” in Redis: GEOADD them to drivers_on_trip and remove them from drivers_available (Redis has no GEOMOVE command, so wrap both steps in a Lua script or MULTI/EXEC transaction).
def match_driver(rider, pickup):
    # 10 closest available drivers within 5 km, returned with coordinates
    candidates = redis.georadius(
        'drivers_available', pickup.lng, pickup.lat,
        radius=5, unit='km', count=10, sort='ASC', withcoord=True
    )
    # Rank candidates by road-network ETA, not straight-line distance
    etas = sorted(
        ((d, coord, compute_eta(coord, pickup)) for d, coord in candidates),
        key=lambda x: x[2]
    )
    for driver_id, coord, eta in etas:
        if request_driver(driver_id, rider, eta) == 'accepted':
            # Redis has no GEOMOVE: remove and re-add in one transaction
            pipe = redis.pipeline(transaction=True)
            pipe.zrem('drivers_available', driver_id)
            pipe.geoadd('drivers_on_trip', (coord[0], coord[1], driver_id))
            pipe.execute()
            return driver_id
        # else: driver declined or timed out within 15s -- try next candidate
    return None  # no drivers available
ETA Computation
ETA = time from driver’s current location to pickup, via road network. Options:
- Google Maps / Mapbox API: Accurate but costs money per request and adds latency. Fine for a low-volume system.
- Internal routing engine: Uber partitions space with H3, its hexagonal geospatial indexing library, and stores the road network as a graph; Dijkstra or A* with live traffic data computes ETA in milliseconds. (H3 is the indexing layer, not a router in itself.)
- Precomputed ETA grid: For matching speed, use a rough precomputed grid: “from hex X to hex Y ≈ N seconds given current traffic.” Accurate enough for ranking candidates; use a precise routing call for the final ETA shown to the rider.
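The precomputed-grid option can be sketched as a plain lookup table keyed by (origin cell, destination cell); the cell IDs, travel times, and defaults below are made up for illustration, and a real system would key on H3 cell indexes refreshed by a traffic pipeline:

```python
# Coarse hex-to-hex travel times, good enough for ranking candidates.
ETA_GRID = {("hex_a", "hex_b"): 240, ("hex_b", "hex_c"): 180}

def rough_eta(from_hex: str, to_hex: str, default_s: int = 600) -> int:
    """Approximate travel time in seconds between two grid cells."""
    if from_hex == to_hex:
        return 60                    # same cell: assume about a minute
    return ETA_GRID.get((from_hex, to_hex), default_s)   # pessimistic fallback
```

The precise routing call is reserved for the single ETA shown to the rider after matching, which keeps the expensive path off the candidate-ranking hot loop.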
Step 5: Trip State Machine
A trip moves through well-defined states. State transitions trigger side effects (notifications, billing).
REQUESTED
  ↓ driver accepts
DRIVER_ASSIGNED
  ↓ driver arrives at pickup
DRIVER_ARRIVED (sends push notification to rider)
  ↓ rider enters car
IN_PROGRESS
  ↓ driver ends trip
COMPLETED (triggers billing, receipt email)

From any pre-trip state:
  rider cancels → CANCELLED (cancellation fee logic)
  driver cancels / no response → DRIVER_CANCELLED → re-match to new driver
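The diagram above can be encoded as an explicit transition table so that an illegal move (say, billing a COMPLETED trip again) fails loudly; a minimal sketch:

```python
# Legal transitions from the state machine above; terminal states have none.
VALID = {
    "REQUESTED":       {"DRIVER_ASSIGNED", "CANCELLED"},
    "DRIVER_ASSIGNED": {"DRIVER_ARRIVED", "CANCELLED", "DRIVER_CANCELLED"},
    "DRIVER_ARRIVED":  {"IN_PROGRESS", "CANCELLED", "DRIVER_CANCELLED"},
    "IN_PROGRESS":     {"COMPLETED"},
}

def transition(current: str, new: str) -> str:
    """Return the new state, or raise if the transition is not allowed."""
    if new not in VALID.get(current, set()):
        raise ValueError(f"illegal transition {current} -> {new}")
    return new
```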
State is stored in the Trip Service (Postgres). State transitions are events published to Kafka, consumed by downstream services (notification service, billing service, analytics).
trips (
trip_id UUID PRIMARY KEY,
rider_id BIGINT,
driver_id BIGINT,
status VARCHAR(20),
pickup_lat DOUBLE PRECISION,
pickup_lng DOUBLE PRECISION,
dropoff_lat DOUBLE PRECISION,
dropoff_lng DOUBLE PRECISION,
requested_at TIMESTAMP,
matched_at TIMESTAMP,
started_at TIMESTAMP,
completed_at TIMESTAMP,
fare_cents INTEGER,
surge_multiplier DECIMAL(3,1)
)
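Writing state to Postgres and publishing to Kafka are two systems, so a crash between them can lose an event. One common remedy is the transactional-outbox pattern: commit the trip update and an event row in the same transaction, then relay the outbox to Kafka. A sketch with an in-memory stand-in for the database:

```python
class Db:                            # stand-in for a Postgres connection
    def __init__(self):
        self.trips, self.outbox = {}, []
    def commit_transition(self, trip_id, new_status):
        # In Postgres: UPDATE trips ... and INSERT INTO outbox ...
        # inside a single transaction, so both happen or neither does.
        self.trips[trip_id] = new_status
        self.outbox.append({"trip_id": trip_id, "status": new_status})

def relay_outbox(db, kafka_send):
    """Ship committed events to Kafka; safe to rerun after a crash."""
    while db.outbox:
        kafka_send("trip_events", db.outbox.pop(0))
```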
Step 6: Real-Time Tracking (Rider Watches Driver on Map)
During pickup and trip, the rider’s app shows the driver moving in real-time. The driver’s app shows the rider’s current position.
Mechanism: WebSocket (or long-polling) between the rider’s app and the Trip Service. The driver’s location updates flow:
Driver app → Location Service → Redis GEOADD
→ Kafka "driver_locations"
↓
Trip Service subscribes to location updates
for active trips
↓
WebSocket push to rider app
Only broadcast location updates to the rider for their specific driver — not a global broadcast. The Trip Service maintains a mapping of trip_id → rider WebSocket connection.
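The targeted fan-out can be sketched as a map from driver to rider connection: each consumed location message is pushed to at most one socket. `FakeSocket` stands in for a real WebSocket connection:

```python
class FakeSocket:                    # stand-in for a WebSocket connection
    def __init__(self):
        self.sent = []
    def push(self, msg):
        self.sent.append(msg)

rider_sockets = {}                   # driver_id -> the tracking rider's socket

def on_location_message(msg):
    """Route one Kafka location message to exactly one rider, or drop it."""
    sock = rider_sockets.get(msg["driver_id"])
    if sock is not None:             # only riders mid-trip receive updates
        sock.push({"lat": msg["lat"], "lng": msg["lng"]})
```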
Step 7: Surge Pricing
When demand exceeds supply in an area, prices increase to attract more drivers.
surge_multiplier = f(demand, supply)
= 1.0 + k × max(0, (active_requests - available_drivers) / available_drivers)
where k is a tuning constant
Implementation:
- Divide the city into H3 hexagons (Uber’s approach) or geohash cells.
- Every 60 seconds, compute demand (trip requests in last 5 min) and supply (available drivers) per cell.
- Store surge multipliers per cell in Redis with a 60-second TTL.
- Fare calculation at trip request:
base_fare × surge_multiplier × (distance + time).
- Surge is shown to the rider before they confirm. They must explicitly accept.
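The surge formula above translates directly to code; the constant k and the 3.0 cap are illustrative tuning knobs, not values from any production system:

```python
def surge_multiplier(active_requests: int, available_drivers: int,
                     k: float = 0.5, cap: float = 3.0) -> float:
    """1.0 + k * relative excess demand, capped; run per cell every 60s."""
    if available_drivers == 0:
        return cap                   # no supply at all: maximum surge
    excess = max(0, active_requests - available_drivers) / available_drivers
    return round(min(1.0 + k * excess, cap), 1)
```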
High-Level Architecture
Driver App ──→ Location Service ──→ Redis Geo Index
└──→ Kafka → Location History (Cassandra)
Rider App ──→ Trip Service ──→ Matching Service
├─ Redis Geo (nearby drivers)
├─ ETA Engine
└─ Driver Notification (push)
│
↓ state changes
Kafka ──→ Billing Service
└──→ Notification Service
└──→ Analytics
Real-time tracking:
Driver Location → Trip Service → WebSocket → Rider App
Follow-up Questions
Q: How do you handle a driver going offline mid-trip?
Location updates stop. Trip Service detects no update in >30 seconds, marks driver as “location_unknown.” Trip continues — the route is known; billing uses elapsed time. After 5 minutes offline, escalate to support.
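The detection logic is a heartbeat check against the thresholds above; a minimal sketch with timestamps as seconds since epoch:

```python
def driver_status(last_seen_ts: float, now_ts: float) -> str:
    """Classify a driver by how long their GPS updates have been silent."""
    silence = now_ts - last_seen_ts
    if silence > 300:
        return "escalate_to_support"   # offline more than 5 minutes
    if silence > 30:
        return "location_unknown"      # missed heartbeats for 30+ seconds
    return "online"
```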
Q: How do you prevent the same driver from being matched to two riders simultaneously?
Redis atomic operations: use SET with the NX flag (or a Lua script) to atomically claim the driver before assigning the trip. Only one matching request can win the claim; the loser moves on to its next candidate.
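The claim can be sketched with SET NX semantics on a per-driver lock key; an in-memory dict stands in for Redis here:

```python
store = {}                           # stand-in for a Redis keyspace

def set_nx(key: str, value: str) -> bool:
    """Mirror Redis `SET key value NX`: set only if the key is absent."""
    if key in store:
        return False
    store[key] = value
    return True

def claim_driver(driver_id: str, trip_id: str) -> bool:
    """Exactly one trip can claim a driver; everyone else gets False."""
    return set_nx(f"driver_lock:{driver_id}", trip_id)
```

In real Redis the lock key would also carry a TTL so a crashed matcher cannot strand a driver in a locked state forever.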
Q: How does payment work?
When a trip completes, the Billing Service calculates fare (distance × time × surge). It calls a payment processor (Stripe, Braintree) asynchronously. On success, updates trip status and emails receipt. On failure, retries with exponential backoff, then surfaces to support if unresolvable.
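The retry policy described above can be sketched as exponential backoff with an attempt cap; `charge` is any callable returning True on success, and the sleep function is injected so the sketch stays testable:

```python
def charge_with_retry(charge, max_attempts=5, base_delay_s=1.0,
                      sleep=lambda s: None):
    """Retry a payment call with exponential backoff, then hand to support."""
    for attempt in range(max_attempts):
        if charge():
            return "charged"
        sleep(base_delay_s * 2 ** attempt)   # 1s, 2s, 4s, ...
    return "escalate_to_support"
```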
Q: How do you handle the matching service at 500 requests/sec?
The matching service is stateless (reads from Redis, writes trip state to Postgres). Horizontal scaling behind a load balancer. Redis geospatial queries are O(N + log M) and handle thousands of queries/sec per node. Postgres handles trip writes with connection pooling (PgBouncer).
Summary
A ride-sharing app is a real-time geospatial matching system. Location updates (750K/sec) flow into a Redis geospatial index for fast proximity queries. Driver matching selects the minimum-ETA candidate from nearby available drivers using a road-network routing engine. Trip state is a state machine in Postgres, with transitions published to Kafka for downstream services. Real-time tracking uses WebSockets between the Trip Service and rider/driver apps. Surge pricing is computed per geospatial cell every 60 seconds and cached in Redis.
Related System Design Topics
- Consistent Hashing — partitioning the geo-index across location servers
- Caching Strategies — caching driver location and ETA estimates
- Load Balancing — routing WebSocket connections to the right matching server
- Message Queues — trip event stream for analytics and billing
- Database Sharding — sharding trip history by city or user
See also: Design a Proximity Service (Yelp / Nearby Search) — static business location indexing with Geohash, complementing the real-time driver location tracking in ride-sharing.