Driver State Machine
Every driver in the system exists in one of four states: offline, available, on_trip, or returning. The transitions are: offline → available (driver goes online), available → on_trip (trip accepted and started), on_trip → returning (trip completed, driver returning to available area), returning → available (driver marks ready for next trip), and any state → offline (driver goes offline).
State is stored in a driver_states table: driver_id, state, last_updated, current_trip_id (nullable). The canonical state lives in the database, but for low-latency dispatch reads, it is mirrored in Redis: HSET driver:{driver_id} state available lat 37.77 lon -122.41 updated_at 1720000000.
Only available drivers are indexed for matching. When a driver transitions to on_trip, they are removed from the geospatial matching index (ZREM drivers:available {driver_id}). When they return to available, they are re-added. State transitions are published to a Kafka topic driver-state-changes for consumption by analytics pipelines, incentive tracking services, and the demand forecasting model.
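The transition flow above can be sketched as a single handler that updates the canonical row, the Redis mirror, the matching index, and the event stream together. This is a minimal in-memory sketch: the dicts and list stand in for the database, Redis, and Kafka, and all names are illustrative rather than a real service API.

```python
import time

# Legal transitions from the state machine above ("any state -> offline"
# is expanded into its three non-offline sources).
VALID = {
    ("offline", "available"), ("available", "on_trip"),
    ("on_trip", "returning"), ("returning", "available"),
    ("available", "offline"), ("on_trip", "offline"), ("returning", "offline"),
}

db = {}            # driver_states table stand-in
redis_hash = {}    # driver:{driver_id} hash mirror stand-in
geo_index = set()  # drivers:available membership (GEO key stand-in)
kafka_events = []  # driver-state-changes topic stand-in

def transition(driver_id, new_state, trip_id=None):
    old = db.get(driver_id, {}).get("state", "offline")
    if (old, new_state) not in VALID:
        raise ValueError(f"illegal transition {old} -> {new_state}")
    now = time.time()
    # Canonical state in the database, mirrored into Redis for dispatch reads.
    db[driver_id] = {"state": new_state, "last_updated": now,
                     "current_trip_id": trip_id}
    redis_hash[driver_id] = {"state": new_state, "updated_at": now}
    if new_state == "available":
        geo_index.add(driver_id)      # re-add to the matching index
    else:
        geo_index.discard(driver_id)  # ZREM equivalent
    # Publish the transition for analytics, incentives, and forecasting.
    kafka_events.append({"driver_id": driver_id, "from": old,
                         "to": new_state, "ts": now})
```

The key property is that the matching-index update is driven by the same transition that writes the canonical row, so the index can only drift from the database between a write and a crash, not by design.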
Supply and Demand Heatmap
Demand is measured as the rate of incoming ride requests per geographic cell per time window. The geohash grid (precision 6, ~1.2km x 0.6km cells) is used to bin both supply (available drivers) and demand (open requests). For each 5-minute window, the system increments a Redis counter per geohash cell as requests arrive: INCR demand:{geohash}:{window_id}, setting a 15-minute TTL via EXPIRE on the key's first increment (INCR alone does not attach an expiry).
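The counter pattern can be sketched as follows. A tiny in-memory fake replaces the Redis client so the snippet is self-contained; a real service would use a redis-py connection with the same INCR/EXPIRE calls.

```python
import time

class FakeRedis:
    """Just enough of INCR/EXPIRE/GET for this sketch."""
    def __init__(self):
        self.store, self.expiry = {}, {}
    def incr(self, key):
        self._evict(key)
        self.store[key] = self.store.get(key, 0) + 1
        return self.store[key]
    def expire(self, key, seconds):
        self.expiry[key] = time.time() + seconds
    def get(self, key):
        self._evict(key)
        return self.store.get(key)
    def _evict(self, key):
        if key in self.expiry and time.time() > self.expiry[key]:
            self.store.pop(key, None)
            self.expiry.pop(key, None)

r = FakeRedis()

def record_request(geohash, now=None):
    now = now or time.time()
    window_id = int(now // 300)          # 5-minute windows
    key = f"demand:{geohash}:{window_id}"
    if r.incr(key) == 1:                 # first hit in this window
        r.expire(key, 900)               # attach the 15-minute TTL once
    return key
```

Setting the TTL only when INCR returns 1 avoids resetting the expiry on every request, so the key reliably dies 15 minutes after the window opens.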
Supply is the count of available drivers per geohash cell, maintained as a Redis sorted set or counter updated in real time as drivers change state. The demand/supply ratio per cell is computed by a background job every 30 seconds and stored in Redis for fast reads by the pricing service and the driver-facing heatmap API.
When demand/supply ratio exceeds threshold T1 (e.g., 2.0), the surge pricing multiplier activates for that cell. When ratio exceeds T2 (e.g., 3.5), driver repositioning incentives are triggered: drivers nearby receive push notifications showing the high-demand zone highlighted on their in-app map, along with a guaranteed minimum earnings bonus for trips originating from that zone.
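The two-threshold logic can be made concrete with a small decision function. The T1/T2 values match the examples above; the linear multiplier curve and its 3.0x cap are illustrative assumptions, not production values.

```python
T1, T2 = 2.0, 3.5  # surge and repositioning thresholds from the text

def surge_decision(demand, supply):
    ratio = demand / max(supply, 1)  # guard against zero supply
    surge = ratio > T1
    reposition = ratio > T2
    # Illustrative piecewise multiplier: 1.0 below T1, then linear in the
    # excess ratio, capped at 3.0x.
    multiplier = 1.0 if not surge else round(min(1.0 + 0.5 * (ratio - T1), 3.0), 2)
    return {"ratio": ratio, "surge": surge,
            "reposition_incentive": reposition, "multiplier": multiplier}
```

In practice the ratio-to-multiplier mapping is a tuned, per-market configuration; the point here is only that surge and repositioning trigger at different ratios.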
Geospatial Indexing
Driver locations are stored in Redis using the native geospatial commands. When a driver is available and submits a location ping, the system executes: GEOADD drivers:available {longitude} {latitude} {driver_id}. Location pings arrive every 5 seconds from the driver app. The ping handler updates both the Redis geospatial index and the driver_locations table for audit/replay purposes.
To find candidate drivers for a new ride request, the dispatch service executes: GEOSEARCH drivers:available FROMLONLAT {pickup_lon} {pickup_lat} BYRADIUS 5 km ASC COUNT 20. This returns up to 20 nearest available drivers sorted by distance. The COUNT limit prevents the result set from becoming unmanageably large in dense urban areas.
For very high-density markets, the geospatial index can be sharded by city or region. A routing layer maps each request’s geohash to the appropriate Redis shard. This also provides natural failure isolation: an outage in one shard affects only one city’s matching.
Stale driver locations are a concern. If a driver’s app crashes without going offline, their location in the index becomes stale. A TTL-based expiration handles this: a separate sorted set driver:last_ping tracks the last ping timestamp per driver. A background job runs every 30 seconds and removes drivers from drivers:available if their last ping is more than 60 seconds old, transitioning them to offline in the database.
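The sweep can be sketched as below, with plain Python containers standing in for the driver:last_ping sorted set and the drivers:available index; in production the scan would be a ZRANGEBYSCORE over ping timestamps.

```python
import time

STALE_AFTER = 60  # seconds without a ping before a driver is considered gone

def sweep_stale(last_ping, available, now=None):
    """last_ping: driver_id -> last ping timestamp; available: set of ids.
    Removes stale drivers from the index and returns them."""
    now = now or time.time()
    stale = [d for d in available if now - last_ping.get(d, 0) > STALE_AFTER]
    for d in stale:
        available.discard(d)  # ZREM from the geospatial index
        # ...a real sweep would also mark the driver offline in driver_states
        # and publish the transition to driver-state-changes.
    return stale
```

Because the sweep runs every 30 seconds against a 60-second threshold, a crashed app lingers in the index for at most about 90 seconds.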
Greedy vs Batch Matching
Greedy matching processes each ride request as it arrives. The dispatch service calls GEOSEARCH, applies filters (rating, vehicle type, acceptance rate), scores candidates, and offers the trip to the top-ranked available driver. This approach is simple, low-latency (sub-100ms to offer generation), and works well under moderate load. The downside is suboptimality: when two requests arrive nearly simultaneously in nearby locations, greedy may give the first request the driver the second request needed, producing one short pickup and one long one when swapping the assignments would reduce total ETA.
Batch matching collects all incoming requests over a fixed window (typically 3-5 seconds), then solves the assignment problem over the full set of (requests, available drivers) to minimize total estimated pickup time. The Hungarian algorithm solves this in O(n^3) time; for n up to a few hundred, this runs in well under 1 second. For larger n (major cities at peak), approximate algorithms or auction-based methods are used.
The batch approach improves total system efficiency by 10-20% in dense markets. The trade-off is a fixed delay of up to 5 seconds before the rider receives a driver offer. Most production systems use greedy matching with a short debounce window (1-2 seconds) as a compromise, capturing most of the batch benefit without the full delay.
Driver Offer and Acceptance
Once a driver is selected, an offer is sent via push notification and WebSocket. The offer payload includes: rider name, pickup location, estimated pickup distance and time, destination (optional, configurable by market), trip type, and current surge multiplier. The driver has a 15-second acceptance window, configurable per market.
If the driver declines explicitly or the window expires without response, the driver is temporarily deprioritized in the matching algorithm (soft cooldown of 60 seconds) and the offer moves to the next candidate. The original request does not restart the full matching process; the dispatch service maintains an ordered candidate list and walks it sequentially, refreshing driver availability from Redis before each offer to avoid offering to a driver who has gone offline between candidates.
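The candidate walk can be sketched as a loop over the ordered list. The `is_available` and `send_offer` callables are stand-ins for the Redis availability refresh and the push/WebSocket offer round-trip; in production `send_offer` would block on acceptance or the 15-second timeout.

```python
import time

COOLDOWN_SECONDS = 60  # soft deprioritization after a decline or timeout

def walk_candidates(candidates, is_available, send_offer, cooldowns, now=None):
    """candidates: ordered driver ids from matching.
    is_available(id) -> bool refreshes driver state from Redis.
    send_offer(id) -> True if the driver accepted within the window.
    cooldowns: driver_id -> timestamp until which the driver is skipped.
    Returns the accepting driver id, or None if the list is exhausted."""
    now = now or time.time()
    for driver_id in candidates:
        if cooldowns.get(driver_id, 0) > now:  # still in soft cooldown
            continue
        if not is_available(driver_id):        # went offline mid-walk
            continue
        if send_offer(driver_id):
            return driver_id
        # Decline or timeout: record the cooldown and move on.
        cooldowns[driver_id] = now + COOLDOWN_SECONDS
    return None
```

Refreshing availability immediately before each offer is what prevents offering a trip to a driver who went offline while an earlier candidate was deciding.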
Decline rates are tracked per driver: driver_stats table with a rolling 30-day decline rate column. Drivers with decline rate above threshold receive in-app warnings and may be subject to reduced incentive eligibility. This data also feeds the driver scoring function used in matching to prefer drivers with low historical decline rates.
ETA Computation
Two ETAs matter: pickup ETA (time for driver to reach rider) and trip ETA (time for driver to take rider to destination). Pickup ETA is computed by routing from driver’s current location to the pickup point using a road network graph (OSRM, Valhalla, or Google Maps Platform). The raw routing time is adjusted upward by a traffic factor derived from current segment speeds from a traffic data provider, applied per road segment.
Trip ETA uses a combination of historical trip times for the origin-destination pair at the current hour of day and day of week, plus a live traffic adjustment. Historical data is stored in a pre-aggregated table: trip_time_stats with columns origin_geohash, dest_geohash, hour_of_week (0-167), p50_seconds, p90_seconds. This covers the typical case; for novel OD pairs, routing-based fallback is used.
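The lookup can be sketched as below: an hour_of_week bucket computed from the request time, a read against the pre-aggregated p50, a live traffic adjustment, and a routing-based fallback for novel OD pairs. The table contents and traffic factor here are made-up example values.

```python
from datetime import datetime

def hour_of_week(dt):
    """0-167 bucket: Monday 00:00 is 0, Sunday 23:00 is 167."""
    return dt.weekday() * 24 + dt.hour

def trip_eta_seconds(stats, origin_gh, dest_gh, dt, traffic_factor, fallback):
    """stats: {(origin_geohash, dest_geohash, hour_of_week): p50_seconds},
    mirroring the trip_time_stats table. fallback is the routing-based
    estimate used when no historical row exists for the OD pair."""
    p50 = stats.get((origin_gh, dest_gh, hour_of_week(dt)))
    if p50 is None:
        return fallback
    return int(p50 * traffic_factor)
```

A traffic factor above 1.0 stretches the historical p50 when current segment speeds are below normal; below 1.0 it compresses it.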
ETA updates are pushed to the rider every 30 seconds during the driver-approaching state. If the driver deviates significantly from the expected route (detected by comparing reported location to routing path), a re-route and re-estimate is triggered. ETA accuracy is tracked as a key metric: the difference between estimated and actual pickup time at offer generation, segmented by city, time of day, and weather conditions.
Fleet Positioning
When drivers complete trips or go online, they tend to cluster in areas they know—often residential zones or their own neighborhoods—while demand concentrates in commercial districts, transit hubs, and entertainment zones. Left uncorrected, this supply mismatch causes long ETAs in high-demand areas even when total supply is adequate.
The repositioning system addresses this by surfacing demand heatmaps to drivers in the app: geohash cells colored by demand intensity, updated every 60 seconds. Drivers can see where riders are requesting trips before deciding where to wait. Incentive overlays show guaranteed minimum earnings per trip originating from specific zones during surge windows.
The system avoids creating perverse incentives. If all drivers follow the heatmap to the same hotspot, they overshoot and the incentive must be designed to decay as supply fills a zone. The incentive engine uses a feedback loop: as available driver count in a geohash cell rises toward the demand level, the incentive multiplier for that cell decreases, naturally spreading drivers across multiple high-demand areas.
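One simple shape for that feedback loop is a bonus multiplier that decays linearly as supply in a cell approaches measured demand. The 1.5x ceiling and linear decay are illustrative assumptions; a production incentive engine would tune both per market.

```python
def incentive_multiplier(demand, supply, max_bonus=1.5):
    """Bonus multiplier for a cell: max_bonus when the cell is empty of
    supply, decaying to 1.0 (no bonus) once supply covers demand."""
    if demand <= 0 or supply >= demand:
        return 1.0
    shortage = (demand - supply) / demand  # fraction of demand unfilled
    return round(1.0 + (max_bonus - 1.0) * shortage, 3)
```

As drivers flow into the hotspot, `supply` rises, the multiplier falls, and neighboring high-demand cells become relatively more attractive, which is the spreading effect described above.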
Cancellation Handling
Cancellations are modeled as state machine transitions on the trip entity. A trip can be cancelled by either party before completion. Cancellation policy determines whether fees apply: riders who cancel after the driver has been en route for more than N minutes (typically 5) are charged a flat cancellation fee, handled via the same payment flow as trip charges. Riders who cancel within N minutes pay nothing.
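The fee rule reduces to a threshold check on en-route time. The 5-minute window and 500-cent fee below are example values standing in for per-market configuration.

```python
FREE_CANCEL_WINDOW_MINUTES = 5  # N in the policy above (per-market config)
CANCEL_FEE_CENTS = 500          # illustrative flat fee

def rider_cancellation_fee(en_route_minutes):
    """Flat fee applies only once the driver has been en route longer
    than the free-cancellation window; otherwise the rider pays nothing."""
    if en_route_minutes > FREE_CANCEL_WINDOW_MINUTES:
        return CANCEL_FEE_CENTS
    return 0
```

The fee, when charged, goes through the same payment flow as a trip charge, so no separate billing path is needed.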
Driver cancellations re-enter the rider into the matching queue with elevated priority: their request gets a short-circuit path to the front of the batch window. The cancelling driver’s record is updated with a cancellation event. Rolling cancellation rate (cancellations / offers accepted, 30-day window) is computed nightly and stored in driver_stats. Drivers above the threshold receive a warning and are flagged for review; repeated high-cancellation behavior triggers a deactivation review.
Both rider and driver cancellation rates are surfaced in their respective profiles and factor into matching: a rider with a high cancellation rate may be deprioritized in tight supply conditions, or required to pre-authorize payment to reduce frivolous bookings.