Question 1

How does the driver matching algorithm work in a ride-sharing app?

Accepted Answer

When a rider requests a ride: (1) Geographic search -- query Redis GEOSEARCH for available drivers within 3-5 km of the pickup location. (2) ETA calculation -- for each candidate, estimate arrival time using a routing API or a pre-computed travel time matrix. (3) Ranking -- score candidates by ETA (shortest first), driver rating, acceptance rate, and vehicle type match. (4) Offer -- send the request to the top 3-5 candidates simultaneously with a 15-30 second acceptance window. The first to accept gets the trip. Parallel offering bounds the worst case at a single acceptance window (15-30 seconds), whereas three sequential 30-second timeouts could take up to 90 seconds. For high-demand areas, batch multiple ride requests and solve the global assignment with the Hungarian algorithm, minimizing total wait time across all pending riders.
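To see why batching can beat assigning each rider the nearest driver, here is a minimal sketch. It is not the Hungarian algorithm itself: for readability it uses exhaustive search over permutations, which finds the same optimum but only scales to a handful of riders. The ETA matrix and the function names are made up for illustration, and the sketch assumes there are at least as many drivers as riders.

```python
from itertools import permutations

def greedy_assignment(eta):
    """Each rider, in request order, takes the nearest driver still free."""
    taken = set()
    pairs = []
    for rider, row in enumerate(eta):
        free = [d for d in range(len(row)) if d not in taken]
        driver = min(free, key=lambda d: row[d])
        taken.add(driver)
        pairs.append((rider, driver))
    return pairs

def optimal_assignment(eta):
    """Brute-force global optimum (O(n!)); a stand-in for the Hungarian algorithm."""
    n_riders, n_drivers = len(eta), len(eta[0])
    best = min(
        permutations(range(n_drivers), n_riders),
        key=lambda perm: sum(eta[r][d] for r, d in enumerate(perm)),
    )
    return list(enumerate(best))

def total_wait(eta, pairs):
    """Sum of pickup ETAs (minutes) over all rider-driver pairs."""
    return sum(eta[r][d] for r, d in pairs)

# Hypothetical ETA matrix in minutes: eta[rider][driver]
eta = [
    [2, 3],   # rider 0: driver 0 is 2 min away, driver 1 is 3 min away
    [3, 10],  # rider 1: driver 0 is 3 min away, driver 1 is 10 min away
]

print(total_wait(eta, greedy_assignment(eta)))   # 12
print(total_wait(eta, optimal_assignment(eta)))  # 6
```

Greedy gives rider 0 the closest driver and leaves rider 1 with a 10-minute wait; the global assignment swaps the drivers and halves the total wait. A production system would replace the brute-force search with a polynomial-time solver.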

Question 2

How does surge pricing work technically?

Accepted Answer

Divide the city into hexagonal zones (H3 geospatial indexing). For each zone, every 1-2 minutes, compute supply (available drivers) and demand (ride requests in the last 5 minutes). Surge multiplier = f(demand/supply). When demand is 3x supply, the multiplier might be 2.0x. The function is typically piecewise: no surge (1.0x) when the ratio is below 1.2, then a gradual increase with the ratio up to a cap (e.g., 5x). Store multipliers in Redis per zone. The pricing service reads the surge for the pickup zone and applies it to the base fare. Display the multiplier to the rider before confirmation. Purpose: attract drivers to high-demand areas (higher earnings), reduce demand from price-sensitive riders, and balance supply and demand in real time. The surge is zone-specific and time-limited.
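One possible shape for f(demand/supply) is sketched below. The threshold of 1.2 and the 5x cap come from the answer above; the linear slope of 0.5 is an assumption chosen only so that a 3x ratio yields the 2.0x multiplier mentioned there, and the function name and the handling of zero supply are likewise illustrative.

```python
def surge_multiplier(demand, supply, threshold=1.2, slope=0.5, cap=5.0):
    """Piecewise surge: 1.0 below the threshold, then linear in the ratio, capped.

    slope is an assumed tuning parameter, not a value from any real pricing system.
    """
    if supply <= 0:
        # Assumption: no drivers but pending requests means maximum surge.
        return cap if demand > 0 else 1.0
    ratio = demand / supply
    if ratio < threshold:
        return 1.0
    raw = 1.0 + slope * (ratio - 1.0)
    # Round to one decimal for display, then apply the cap.
    return min(cap, round(raw, 1))

print(surge_multiplier(demand=30, supply=10))   # 2.0
print(surge_multiplier(demand=10, supply=10))   # 1.0
print(surge_multiplier(demand=200, supply=10))  # 5.0
```

The resulting value per zone is what would be written to Redis and multiplied into the base fare by the pricing service.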

Question 3

How do you track 1 million driver locations in real-time?

Accepted Answer

Each active driver sends GPS coordinates every 3-5 seconds. With 1 million drivers, that is roughly 200K-333K updates/sec. Real-time index: Redis GEO data type. Use GEOADD drivers longitude latitude driver_id for updates and GEOSEARCH for proximity queries. Depending on hardware, a single Redis instance may approach this write rate with pipelining, but it leaves little headroom, so in practice shard by geographic region. Location history: stream updates to Kafka and store them in a time-series database (TimescaleDB, InfluxDB) for ETA model training, route optimization, and fraud detection. The real-time index is separate from the historical store. Rider updates: once matched, the rider app receives the driver's location via WebSocket every 3-5 seconds. The server subscribes to the matched driver's location stream and pushes updates.
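The throughput figures follow from simple arithmetic, sketched below along with a rough shard-count estimate. The per-shard capacity of 100K writes/sec and the 50% headroom factor are placeholder assumptions, not measured Redis numbers; real capacity depends on hardware, pipelining, and key sizes and should be benchmarked.

```python
import math

def updates_per_second(drivers, interval_s):
    """Average write rate when every driver reports once per interval."""
    return drivers / interval_s

def shards_needed(peak_updates_per_s, per_shard_capacity, headroom=0.5):
    """Shards required if each shard is only loaded to `headroom` of its capacity."""
    usable = per_shard_capacity * headroom
    return math.ceil(peak_updates_per_s / usable)

low = updates_per_second(1_000_000, 5)   # 5-second reporting interval
high = updates_per_second(1_000_000, 3)  # 3-second reporting interval
print(f"{low:,.0f} to {high:,.0f} updates/sec")  # 200,000 to 333,333 updates/sec

# Assumed capacity of 100K writes/sec per shard (placeholder, measure your own).
print(shards_needed(high, per_shard_capacity=100_000))  # 7
```

With geographic sharding, a proximity query near a region boundary may need to consult the neighboring shard as well.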

Question 4

How is ETA calculated accurately in a ride-sharing app?

Accepted Answer

ETA has two components: driver-to-pickup time and trip duration (pickup to destination). For driver-to-pickup: use a routing service (Google Maps, OSRM) with real-time traffic. For the initial estimate before driver assignment, use a pre-computed travel time matrix (geohash cell to cell, updated hourly from historical GPS traces). For trip duration: use the same routing service with current traffic conditions. ML improvement: train a gradient boosted tree on features such as distance, time of day, day of week, traffic congestion index, weather, and special events. Historical trip data provides the labels (actual duration). The model learns city-specific patterns (school zones at 3 PM, construction on specific routes). Display a range (25-35 minutes) rather than a point estimate to manage expectations. After driver assignment, recalculate every 30 seconds using the driver's real-time GPS position.
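A minimal sketch of turning the two components into a displayed range follows. The 15% relative uncertainty and the rounding to 5-minute steps are assumptions picked for illustration; a real system would more likely derive the band from the model's observed error distribution.

```python
import math

def eta_range(pickup_min, trip_min, rel_uncertainty=0.15, step=5):
    """Return (low, high) minutes around the point estimate, rounded outward to `step`.

    rel_uncertainty and step are hypothetical display parameters.
    """
    total = pickup_min + trip_min
    low = math.floor(total * (1 - rel_uncertainty) / step) * step
    high = math.ceil(total * (1 + rel_uncertainty) / step) * step
    return low, high

# 6 minutes to pickup plus a 24-minute trip: point estimate of 30 minutes.
low, high = eta_range(pickup_min=6, trip_min=24)
print(f"{low}-{high} minutes")  # 25-35 minutes
```

Rerunning this every 30 seconds with a fresh driver-to-pickup estimate narrows the range as the driver approaches.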

System Design: Ride-Sharing (Uber/Lyft) — Matching Algorithm, ETA, Surge Pricing, Driver Location Tracking

High-Level Architecture

Driver Location Tracking

Matching Algorithm

Surge Pricing

ETA Calculation

Payment and Trip Settlement