Ride-Sharing App (Uber/Lyft) High-Level System Design

Core Services

A ride-sharing platform like Uber or Lyft decomposes into focused microservices:

  • Location Service: ingests driver GPS pings, maintains real-time positions
  • Dispatch Service: matches ride requests to nearby available drivers
  • Trip Service: owns the trip state machine and trip records
  • Payment Service: fare calculation, charge processing, driver payouts
  • Surge Pricing Service: computes real-time multipliers by geographic cell
  • Notification Service: push notifications and SMS for both rider and driver

Location Service

Each active driver's app sends its GPS coordinates every 4 seconds. At 1 million active drivers, that is 1,000,000 / 4 = 250,000 writes per second. The design:

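  • Driver app sends UDP packets to a fleet of location ingest servers. UDP is acceptable because losing one ping is harmless: the next one arrives 4 seconds later.
  • Ingest servers write to Redis using GEOADD drivers {lng} {lat} {driver_id}. Redis Geo stores members in a sorted set with geohash-derived scores, enabling radius queries in O(N + log M), where N is the number of members inside the bounding box around the search area and M is the total number of members in the set.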
  • Ingest servers also publish to Kafka for downstream analytics, trip event history, and persistence to Cassandra.
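
The write and query semantics above can be illustrated without a Redis server. The sketch below is a hypothetical in-memory stand-in (the class InMemoryDriverIndex and its method names are assumptions, not part of Redis or of the design above). It uses a linear scan with the haversine formula rather than a geohash index, so it shows the behavior of GEOADD and a radius query, not their performance.

```python
import math

EARTH_RADIUS_KM = 6371.0


def haversine_km(lng1, lat1, lng2, lat2):
    """Great-circle distance between two (lng, lat) points in kilometres."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lng2 - lng1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


class InMemoryDriverIndex:
    """Hypothetical stand-in for the Redis geo set (linear scan, no geohash)."""

    def __init__(self):
        self._positions = {}  # driver_id -> (lng, lat)

    def add(self, lng, lat, driver_id):
        # Same argument order as GEOADD: longitude first, then latitude.
        # A repeated ping from the same driver overwrites the old position.
        self._positions[driver_id] = (lng, lat)

    def radius(self, lng, lat, radius_km, count):
        # Like a radius query with ASC COUNT n: nearest first, at most `count`.
        hits = []
        for driver_id, (d_lng, d_lat) in self._positions.items():
            dist = haversine_km(lng, lat, d_lng, d_lat)
            if dist <= radius_km:
                hits.append((dist, driver_id))
        hits.sort()
        return [(driver_id, dist) for dist, driver_id in hits[:count]]


index = InMemoryDriverIndex()
index.add(-73.9855, 40.7580, "driver_1")   # Midtown Manhattan
index.add(-73.9680, 40.7851, "driver_2")   # further uptown
index.add(-118.2437, 34.0522, "driver_3")  # Los Angeles, outside any NYC radius

# Rider a little south of driver_1; search 5 km, nearest first.
nearby = index.radius(-73.9857, 40.7484, radius_km=5, count=20)
```

In production the same two operations would go to Redis; the in-memory version is only useful for unit tests and for reasoning about the query contract.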

Dispatch Service

When a rider requests a trip:

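  1. Call GEORADIUS drivers {rider_lng} {rider_lat} 5 km ASC COUNT 20 to get up to 20 of the nearest drivers within 5 km. (Redis 6.2 and later deprecate GEORADIUS in favor of GEOSEARCH with BYRADIUS.)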
  2. Filter by driver status = AVAILABLE.
  3. Score candidates by estimated ETA (from routing service) and historical acceptance rate.
  4. Offer the trip to the top-scored driver with a 15-second accept timeout. If the driver declines or the offer times out, fall through to the next candidate.
  5. Once accepted, update driver status to DISPATCHED and create a trip record.
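
Steps 2 to 5 can be sketched as a small function. This is a minimal illustration under stated assumptions: the Candidate type, the score weights, and the offer callback are all hypothetical, and the real scoring function is not specified in this design. The offer callback is assumed to return False both on an explicit decline and when the timeout elapses.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Candidate:
    driver_id: str
    status: str             # e.g. "AVAILABLE" or "DISPATCHED"
    eta_minutes: float      # would come from the routing service
    acceptance_rate: float  # historical rate in [0, 1]


def score(c: Candidate) -> float:
    # Assumed scoring: lower is better. The 5-minute penalty weight is
    # illustrative only, not a value from the design.
    return c.eta_minutes + 5.0 * (1.0 - c.acceptance_rate)


def dispatch(
    candidates: List[Candidate],
    offer: Callable[[str, float], bool],
    timeout_s: float = 15.0,
) -> Optional[str]:
    # Step 2: keep only available drivers.
    available = [c for c in candidates if c.status == "AVAILABLE"]
    # Steps 3 and 4: try candidates from best to worst score.
    for c in sorted(available, key=score):
        if offer(c.driver_id, timeout_s):
            # Step 5: mark the driver as dispatched (trip creation omitted).
            c.status = "DISPATCHED"
            return c.driver_id
    return None  # nobody accepted; the caller may widen the search radius


candidates = [
    Candidate("d1", "AVAILABLE", eta_minutes=4.0, acceptance_rate=0.90),
    Candidate("d2", "DISPATCHED", eta_minutes=1.0, acceptance_rate=0.99),
    Candidate("d3", "AVAILABLE", eta_minutes=6.0, acceptance_rate=0.95),
]
offers_made = []


def fake_offer(driver_id: str, timeout_s: float) -> bool:
    # Test double: d1 declines, everyone else accepts.
    offers_made.append(driver_id)
    return driver_id != "d1"


assigned = dispatch(candidates, fake_offer)
```

Here d2 is skipped because it is not available, d1 is offered first and declines, and d3 ends up assigned.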

Trip State Machine

REQUESTED → DRIVER_ASSIGNED → DRIVER_EN_ROUTE → ARRIVED → IN_PROGRESS → COMPLETED
               ↓                    ↓              ↓           ↓
            CANCELLED           CANCELLED      CANCELLED   CANCELLED

State transitions are persisted to MySQL and published to Kafka so all downstream services (notifications, payments) react to events rather than polling.
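
One way to enforce the diagram is a transition table checked before every write. The sketch below follows the arrows as drawn (cancellation is allowed from DRIVER_ASSIGNED through IN_PROGRESS); the names ALLOWED_TRANSITIONS and transition are assumptions made for illustration, and persistence and Kafka publishing are left out.

```python
from enum import Enum


class TripState(Enum):
    REQUESTED = "REQUESTED"
    DRIVER_ASSIGNED = "DRIVER_ASSIGNED"
    DRIVER_EN_ROUTE = "DRIVER_EN_ROUTE"
    ARRIVED = "ARRIVED"
    IN_PROGRESS = "IN_PROGRESS"
    COMPLETED = "COMPLETED"
    CANCELLED = "CANCELLED"


S = TripState

# Each state maps to the set of states it may move to, as in the diagram.
ALLOWED_TRANSITIONS = {
    S.REQUESTED: {S.DRIVER_ASSIGNED},
    S.DRIVER_ASSIGNED: {S.DRIVER_EN_ROUTE, S.CANCELLED},
    S.DRIVER_EN_ROUTE: {S.ARRIVED, S.CANCELLED},
    S.ARRIVED: {S.IN_PROGRESS, S.CANCELLED},
    S.IN_PROGRESS: {S.COMPLETED, S.CANCELLED},
    S.COMPLETED: set(),   # terminal
    S.CANCELLED: set(),   # terminal
}


def transition(current: TripState, target: TripState) -> TripState:
    """Return the new state, or raise ValueError for an illegal move."""
    if target not in ALLOWED_TRANSITIONS[current]:
        raise ValueError(f"illegal transition {current.name} -> {target.name}")
    return target
```

Rejecting illegal moves at this layer means a duplicated or out-of-order event cannot, for example, reopen a completed trip.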

Surge Pricing

The city is divided into H3 hexagonal grid cells (~1 km diameter). The Surge Pricing Service computes, every 60 seconds:

surge_multiplier = f(open_requests / available_drivers in cell)

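Multipliers are stored in Redis with a 90-second TTL, which is longer than the 60-second recompute interval, so a slightly late refresh does not leave a cell without a value. The fare estimate API reads the multiplier from Redis, so no database query is needed.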
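
The function f is not specified here, so the sketch below picks one plausible shape purely for illustration: no surge while supply covers demand, then linear growth, capped at an assumed maximum of 3x. The TTLCache class is a hypothetical in-memory stand-in for Redis keys with an expiry, and the cell key "cell_a" is a placeholder rather than a real H3 index.

```python
import time

MAX_MULTIPLIER = 3.0  # assumed cap, a policy choice


def surge_multiplier(open_requests: int, available_drivers: int) -> float:
    if available_drivers <= 0:
        # No supply: surge fully if anyone is waiting, otherwise no surge.
        return MAX_MULTIPLIER if open_requests > 0 else 1.0
    ratio = open_requests / available_drivers
    # Assumed shape for f: 1.0 at ratio <= 1, then +0.5x per unit of ratio.
    raw = 1.0 + 0.5 * (ratio - 1.0)
    return round(min(MAX_MULTIPLIER, max(1.0, raw)), 2)


class TTLCache:
    """Hypothetical stand-in for Redis keys written with an expiry."""

    def __init__(self, ttl_s: float, clock=time.monotonic):
        self._ttl_s = ttl_s
        self._clock = clock
        self._data = {}  # key -> (value, expires_at)

    def set(self, key, value):
        self._data[key] = (value, self._clock() + self._ttl_s)

    def get(self, key, default=None):
        entry = self._data.get(key)
        if entry is None or self._clock() >= entry[1]:
            self._data.pop(key, None)  # drop expired entries lazily
            return default
        return entry[0]


# A fake clock makes the expiry behavior easy to follow.
now = [0.0]
cache = TTLCache(ttl_s=90, clock=lambda: now[0])
cache.set("cell_a", surge_multiplier(open_requests=20, available_drivers=10))

now[0] = 60.0
fresh = cache.get("cell_a", default=1.0)    # still cached at t = 60 s
now[0] = 91.0
expired = cache.get("cell_a", default=1.0)  # TTL passed, falls back to 1.0
```

Falling back to 1.0 when a cell has no cached value is itself an assumption; a real service might prefer the last known value or a neighboring cell.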

Payment Service

Fare is computed at trip completion:

fare = (base_fare + distance_rate * km + time_rate * minutes) * surge_multiplier
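
A worked example of the formula, with made-up rates chosen only for illustration. Decimal is used instead of float because binary floating point is a poor fit for currency; the rounding mode is an assumption.

```python
from decimal import Decimal, ROUND_HALF_UP


def compute_fare(base_fare, distance_rate, km, time_rate, minutes, surge_multiplier):
    """Apply the fare formula and round to cents (rounding mode is assumed)."""
    subtotal = base_fare + distance_rate * km + time_rate * minutes
    return (subtotal * surge_multiplier).quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)


# Illustrative inputs: 8 km, 20 minutes, 1.5x surge.
fare = compute_fare(
    base_fare=Decimal("2.50"),
    distance_rate=Decimal("1.20"),
    km=Decimal("8"),
    time_rate=Decimal("0.30"),
    minutes=Decimal("20"),
    surge_multiplier=Decimal("1.5"),
)
# (2.50 + 1.20 * 8 + 0.30 * 20) * 1.5 = (2.50 + 9.60 + 6.00) * 1.5 = 27.15
```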

Payment is charged asynchronously after the COMPLETED event fires. Failed charges are retried with exponential backoff. Driver payouts are batched daily. All payment records are stored in MySQL (ACID guarantees are required for money movement).
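
A minimal sketch of the retry policy, assuming a doubling delay starting at 1 second and capped at 60 seconds; those numbers, the TransientPaymentError type, and the charge callable are hypothetical. The charge operation is assumed to be idempotent (for example, keyed by trip ID) so that a retry after an ambiguous failure cannot charge the rider twice.

```python
import time


class TransientPaymentError(Exception):
    """Hypothetical error type for failures that are safe to retry."""


def backoff_delays(max_retries, base_s=1.0, cap_s=60.0):
    # Delay before retry i is base * 2^i, capped: 1, 2, 4, 8, ... seconds.
    return [min(cap_s, base_s * 2 ** i) for i in range(max_retries)]


def charge_with_retry(charge, max_retries=4, sleep=time.sleep):
    """Call `charge` until it succeeds or retries are exhausted."""
    delays = backoff_delays(max_retries)
    for attempt in range(max_retries + 1):
        try:
            return charge()
        except TransientPaymentError:
            if attempt == max_retries:
                raise  # out of retries; surface the failure to the caller
            sleep(delays[attempt])


# Test double: fails twice, then succeeds. Sleeps are recorded, not executed.
calls = {"n": 0}
slept = []


def flaky_charge():
    calls["n"] += 1
    if calls["n"] <= 2:
        raise TransientPaymentError("gateway timeout")
    return "charged"


result = charge_with_retry(flaky_charge, sleep=slept.append)
```

Production retry loops usually add random jitter to each delay so that many failed charges do not retry in lockstep; that is omitted here to keep the schedule easy to read.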

Notification Service

Consumes trip events from Kafka and sends real-time updates:

  • Push notifications via APNs (iOS) and FCM (Android) for state changes
  • SMS fallback (via Twilio) if push delivery is not confirmed within 5 seconds
  • In-app WebSocket channel for live driver location updates to rider
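
The push-then-SMS fallback can be sketched as below. The send_push and send_sms callables are hypothetical placeholders, not APNs, FCM, or Twilio client APIs; send_push is assumed to return True only when delivery is confirmed within the given timeout.

```python
def notify(send_push, send_sms, message, push_timeout_s=5.0):
    """Try push first; fall back to SMS when delivery is not confirmed."""
    try:
        delivered = send_push(message, push_timeout_s)
    except Exception:
        # Treat any push error the same as an unconfirmed delivery.
        delivered = False
    if delivered:
        return "push"
    send_sms(message)
    return "sms"


sms_outbox = []
first = notify(lambda msg, timeout: True, sms_outbox.append, "Your driver has arrived")
second = notify(lambda msg, timeout: False, sms_outbox.append, "Your driver has arrived")
```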

Database Choices

  • MySQL: trips, payments — transactional integrity required
  • Redis: live driver locations (Geo), surge multipliers, driver status
  • Cassandra: time-series trip event log, driver location history
  • Kafka: event bus between all services

Scaling Strategy

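Geo-partition the deployment: drivers and riders in NYC are handled by the US-East cluster, keeping dispatch latency under 100 ms. Dispatch servers are stateless and load-balanced using consistent hashing on region_id. Redis is clustered, with driver locations split across keys by geohash prefix (for example, one geo set per prefix rather than a single drivers key), because Redis Cluster assigns each key to exactly one hash slot.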
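
Consistent hashing on region_id can be sketched with a sorted ring of virtual nodes. The class name, the number of virtual nodes, and the choice of SHA-256 are assumptions for illustration; the property that matters is that removing a server only remaps the regions that server owned.

```python
import bisect
import hashlib


class ConsistentHashRing:
    def __init__(self, nodes, vnodes=100):
        # Each node is placed on the ring `vnodes` times to even out load.
        self._ring = []  # sorted list of (hash, node)
        for node in nodes:
            for i in range(vnodes):
                self._ring.append((self._hash(f"{node}#{i}"), node))
        self._ring.sort()
        self._hashes = [h for h, _ in self._ring]

    @staticmethod
    def _hash(key: str) -> int:
        # First 8 bytes of SHA-256 as an integer position on the ring.
        return int.from_bytes(hashlib.sha256(key.encode()).digest()[:8], "big")

    def node_for(self, key: str) -> str:
        # The first virtual node clockwise from the key's hash owns the key.
        idx = bisect.bisect_right(self._hashes, self._hash(key)) % len(self._hashes)
        return self._ring[idx][1]


ring = ConsistentHashRing(["dispatch-a", "dispatch-b", "dispatch-c"])
owner = ring.node_for("region:nyc")

# After removing dispatch-c, regions owned by a or b keep their owner.
smaller_ring = ConsistentHashRing(["dispatch-a", "dispatch-b"])
regions = [f"region:{i}" for i in range(200)]
moved = [
    r for r in regions
    if ring.node_for(r) != "dispatch-c" and smaller_ring.node_for(r) != ring.node_for(r)
]
```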

Key APIs

POST /rides/request              → {ride_id, estimated_fare, eta}
GET  /rides/{id}/status          → {state, driver_location, eta}
POST /drivers/location           → 200 OK
GET  /rides/{id}/fare-estimate   → {low, high, surge_multiplier}

Interview Tips

  • Why not use a graph DB for location? Redis Geo is purpose-built for radius queries and fits in memory — graph DBs add complexity without benefit here.
  • How do you handle a driver going offline mid-trip? Trip Service has a heartbeat watchdog; if no driver ping for 30s during IN_PROGRESS, alert operations.
  • Surge pricing fairness: cap multiplier at 3x during emergencies (configurable policy).
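
For the offline-driver question, the heartbeat watchdog can be sketched as a map from trip to last ping time. The class and method names are hypothetical, and timestamps are passed in explicitly so the logic is easy to test; a real service would feed it from the location stream and run the check on a timer.

```python
class HeartbeatWatchdog:
    def __init__(self, timeout_s: float = 30.0):
        self._timeout_s = timeout_s
        self._last_ping_s = {}  # trip_id -> time of the latest driver ping

    def record_ping(self, trip_id: str, now_s: float) -> None:
        self._last_ping_s[trip_id] = now_s

    def end_trip(self, trip_id: str) -> None:
        # Stop watching once the trip leaves IN_PROGRESS.
        self._last_ping_s.pop(trip_id, None)

    def stale_trips(self, now_s: float):
        # Trips whose driver has been silent for longer than the timeout.
        return sorted(
            trip_id
            for trip_id, last in self._last_ping_s.items()
            if now_s - last > self._timeout_s
        )


watchdog = HeartbeatWatchdog()
watchdog.record_ping("trip_1", now_s=0.0)
watchdog.record_ping("trip_2", now_s=0.0)
watchdog.record_ping("trip_2", now_s=25.0)  # trip_2's driver is still pinging
stale = watchdog.stale_trips(now_s=40.0)    # only trip_1 has been silent > 30 s
```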

