What Is an ETA Prediction Service?
An ETA (Estimated Time of Arrival) prediction service answers the question: given a route from A to B, when will the traveler arrive? The challenge is that static graph weights are insufficient — ETA depends on current traffic, historical patterns, time of day, weather, and route-specific variability. A production ETA service combines graph-based travel time estimation with machine learning to produce calibrated arrival time distributions.
Data Model
- Historical segment speeds: (edge_id BIGINT, day_of_week TINYINT, hour_of_day TINYINT, speed_p50 FLOAT, speed_p85 FLOAT, speed_p95 FLOAT). Percentile speeds by time bucket, precomputed from probe data.
- Live traffic: (edge_id BIGINT, observed_at TIMESTAMP, travel_time_s FLOAT, source ENUM('probe','sensor','incident'))
- Incident: (incident_id BIGINT, edge_id BIGINT, type ENUM('accident','construction','closure'), delay_factor FLOAT, starts_at TIMESTAMP, ends_at TIMESTAMP)
- ETA request log: (request_id UUID, route_id UUID, predicted_eta TIMESTAMP, actual_arrival TIMESTAMP, error_s INT). Used to monitor model accuracy and trigger retraining.
- Feature store: precomputed route-level features (total distance, number of turns, road class distribution, historical variance) stored in Redis or a feature store (Feast) for low-latency ML inference.
Core Algorithm: Hybrid Graph + ML
Step 1 — Base Travel Time
Sum edge-level travel times along the route using the best available speed estimate: live > historical percentile > speed limit. This gives a baseline ETA.
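The priority rule above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the Edge class, its field names, and the use of metres per second are assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Edge:
    """One route segment; all field names here are illustrative."""
    length_m: float
    speed_limit_mps: float
    live_speed_mps: Optional[float] = None        # fresh live observation, if any
    historical_speed_mps: Optional[float] = None  # e.g. speed_p50 for the time bucket


def best_speed(edge: Edge) -> float:
    """Pick a speed in priority order: live > historical > speed limit."""
    for candidate in (edge.live_speed_mps, edge.historical_speed_mps):
        if candidate is not None and candidate > 0:
            return candidate
    return edge.speed_limit_mps


def baseline_travel_time_s(route: list[Edge]) -> float:
    """Sum per-edge travel times (length / speed) along the route."""
    return sum(edge.length_m / best_speed(edge) for edge in route)


route = [
    Edge(length_m=1000, speed_limit_mps=20, live_speed_mps=10),       # live wins
    Edge(length_m=600, speed_limit_mps=20, historical_speed_mps=15),  # historical
    Edge(length_m=500, speed_limit_mps=25),                           # speed limit
]
print(baseline_travel_time_s(route))  # 100 + 40 + 20 = 160.0
```

Treating a missing or non-positive speed as "not available" keeps the fallback chain from dividing by zero when a feed reports a bad value.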
Step 2 — ML Correction Layer
A gradient boosted model (XGBoost or LightGBM) takes as input:
- Baseline travel time from Step 1
- Time of day and day of week
- Historical variance of the route (coefficient of variation of past travel times)
- Number and severity of active incidents on the route
- Weather features (precipitation, visibility) from a weather API
- Recent probe speed ratio: live speed / historical speed for key segments
The model outputs a corrected expected travel time and optionally a confidence interval. Training uses the ETA request log, pairing route features at request time with actual arrival times as labels. The model is retrained daily on a rolling 90-day window.
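As a rough sketch of how the inputs listed above might be assembled into one feature row, the following uses only the Python standard library. The function name, the feature names, and the choice of the largest incident delay factor as a severity proxy are all assumptions for illustration; the resulting dictionary would then be handed to whichever gradient boosted model is in use.

```python
import statistics
from datetime import datetime


def build_features(
    baseline_s: float,
    request_time: datetime,
    past_travel_times_s: list[float],
    incident_delay_factors: list[float],
    precipitation_mm_per_h: float,
    visibility_km: float,
    live_speed_mps: float,
    historical_speed_mps: float,
) -> dict[str, float]:
    """Assemble one feature row for the correction model (names are illustrative)."""
    mean_tt = statistics.mean(past_travel_times_s)
    # Coefficient of variation: standard deviation relative to the mean.
    cv = statistics.pstdev(past_travel_times_s) / mean_tt if mean_tt > 0 else 0.0
    # Ratio below 1.0 means traffic is currently slower than usual.
    ratio = live_speed_mps / historical_speed_mps if historical_speed_mps > 0 else 1.0
    return {
        "baseline_s": baseline_s,
        "hour_of_day": float(request_time.hour),
        "day_of_week": float(request_time.weekday()),  # Monday = 0
        "route_cv": cv,
        "incident_count": float(len(incident_delay_factors)),
        "max_delay_factor": max(incident_delay_factors, default=1.0),
        "precipitation_mm_per_h": precipitation_mm_per_h,
        "visibility_km": visibility_km,
        "probe_speed_ratio": ratio,
    }


features = build_features(
    baseline_s=1000.0,
    request_time=datetime(2024, 1, 1, 8, 30),  # a Monday morning
    past_travel_times_s=[900.0, 1000.0, 1100.0],
    incident_delay_factors=[1.2, 1.5],
    precipitation_mm_per_h=0.0,
    visibility_km=10.0,
    live_speed_mps=10.0,
    historical_speed_mps=20.0,
)
print(features["probe_speed_ratio"])  # 0.5
```

For training, the label paired with each row would be the observed travel time derived from actual_arrival in the ETA request log.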
Step 3 — Uncertainty Quantification
For high-variance routes (busy highways, event venues) the service returns three values: best case (p15 travel time, i.e., the faster end of the speed distribution), expected (p50), and worst case (p85 travel time). The UI surfaces this as a range (e.g., “35–50 min”) rather than a point estimate, improving user trust.
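A minimal sketch of how such a range could be derived, assuming the percentiles are read as travel-time percentiles over past trips on the route. The function names and the 5-minute minimum spread for showing a range are hypothetical choices, not part of the design above.

```python
import statistics


def eta_range_min(travel_times_s: list[float]) -> tuple[int, int, int]:
    """Return (best, expected, worst) in whole minutes from p15/p50/p85 travel times."""
    # quantiles(n=100) returns the 99 cut points p1..p99; index k-1 is the k-th percentile.
    pct = statistics.quantiles(travel_times_s, n=100, method="inclusive")
    best, expected, worst = pct[14], pct[49], pct[84]
    return round(best / 60), round(expected / 60), round(worst / 60)


def format_eta(best: int, expected: int, worst: int, min_spread: int = 5) -> str:
    """Show a range only when the spread is wide enough to matter."""
    if worst - best < min_spread:
        return f"{expected} min"
    return f"{best}–{worst} min"


# 101 evenly spaced past travel times from 30 to 50 minutes.
history_s = [1800 + 12 * k for k in range(101)]
best, expected, worst = eta_range_min(history_s)
print(format_eta(best, expected, worst))  # 33–47 min
```

Falling back to a single number on low-variance routes avoids showing ranges such as "40–41 min" that add noise without adding information.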
Failure Handling
- ML model serving failure: fall back to the graph-only baseline ETA. Accuracy degrades but the service remains functional.
- Stale live traffic: blend live data with historical using a staleness-weighted average; weight live data at 0 if older than 10 minutes.
- Feature store unavailability: precompute a minimal feature set on the fly from the route graph; skip features requiring external lookups (weather).
- Model drift: monitor mean absolute error (MAE) of ETA predictions against actuals in real time. Alert and trigger emergency retraining if MAE exceeds a threshold (e.g., 15% above baseline).
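Two of the rules above, staleness-weighted blending and the MAE drift check, can be sketched as follows. The linear decay of the live weight is an assumption (the text only fixes the weight at 0 beyond 10 minutes), and the function names are illustrative.

```python
STALE_AFTER_S = 600.0  # live data older than 10 minutes gets weight 0


def live_weight(age_s: float) -> float:
    """Linear decay from 1.0 (fresh) to 0.0 at the cutoff; the shape is an assumption."""
    return min(1.0, max(0.0, 1.0 - age_s / STALE_AFTER_S))


def blended_travel_time_s(live_s: float, historical_s: float, age_s: float) -> float:
    """Staleness-weighted average of live and historical travel time for one edge."""
    w = live_weight(age_s)
    return w * live_s + (1.0 - w) * historical_s


def mean_absolute_error(errors_s: list[float]) -> float:
    """MAE over signed prediction errors (predicted minus actual), in seconds."""
    return sum(abs(e) for e in errors_s) / len(errors_s)


def drift_detected(recent_errors_s: list[float], baseline_mae_s: float,
                   tolerance: float = 0.15) -> bool:
    """True when recent MAE exceeds the baseline MAE by more than `tolerance`."""
    return mean_absolute_error(recent_errors_s) > baseline_mae_s * (1.0 + tolerance)


print(blended_travel_time_s(live_s=120.0, historical_s=100.0, age_s=300.0))  # 110.0
print(drift_detected([60.0, -80.0, 100.0], baseline_mae_s=60.0))             # True
```

In this sketch a 5-minute-old observation carries half the weight of a fresh one; an exponential decay would be an equally reasonable choice.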
Scalability Considerations
- ETA inference is fast (<10 ms) once features are assembled; the bottleneck is feature retrieval. A Redis-backed feature store with sub-millisecond reads keeps end-to-end p99 latency under 50 ms.
- High-volume probe data is ingested through Kafka; a Flink streaming job aggregates per-edge speeds and writes them to the live traffic table every 30 seconds.
- Batch retraining runs on a Spark cluster overnight; the model artifact is pushed to an artifact store (MLflow) and rolled out to inference servers via a canary deployment.
- For very long routes (cross-country), the route is split into segments and ETA is computed per segment, then summed, reducing per-request feature volume.
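The per-edge aggregation performed by the streaming job can be illustrated with a plain-Python stand-in. This is not Flink or Kafka code; it only shows the 30-second tumbling-window logic, and the (edge_id, epoch_s, speed_mps) probe tuple layout is an assumption.

```python
from collections import defaultdict

WINDOW_S = 30  # tumbling window length in seconds


def aggregate_probe_speeds(probes):
    """Average probe speeds per (edge_id, window_start) over 30 s tumbling windows.

    `probes` is an iterable of (edge_id, epoch_s, speed_mps) tuples.
    """
    sums = defaultdict(lambda: [0.0, 0])  # key -> [speed total, sample count]
    for edge_id, epoch_s, speed_mps in probes:
        window_start = int(epoch_s // WINDOW_S) * WINDOW_S
        acc = sums[(edge_id, window_start)]
        acc[0] += speed_mps
        acc[1] += 1
    return {key: total / count for key, (total, count) in sums.items()}


probes = [
    (1, 0, 10.0),
    (1, 29, 14.0),   # same window as the first probe
    (1, 30, 20.0),   # next window for edge 1
    (2, 5, 8.0),
]
print(aggregate_probe_speeds(probes))
# {(1, 0): 12.0, (1, 30): 20.0, (2, 0): 8.0}
```

A real streaming job would also have to handle late and out-of-order events, which this batch-style sketch ignores.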
Summary
A production ETA prediction service layers a graph-based travel time estimator with a machine learning correction model trained on historical arrival data. The key design decisions are: maintain live and historical speed profiles per edge, build a low-latency feature store, and always have a graph-only fallback. Continuous monitoring of prediction error against actuals closes the feedback loop and keeps accuracy high as traffic patterns evolve.