Waze is a fascinating mobile system design topic — millions of users producing real-time GPS pings, manual incident reports, and traffic data, all aggregated into routing decisions for everyone else. The interview tests how you handle massive write throughput, geographic data partitioning, and the social dynamics of crowd-sourced data.
Functional requirements
- Real-time map with traffic overlay
- Turn-by-turn routing
- User-reported incidents (accidents, police, hazards)
- Estimated arrival time updates as conditions change
- Reporting and confirming reports
Architecture
Three pipelines: ingest (GPS pings + reports), aggregate (compute road-segment speeds), serve (routing + map).
GPS ingest at scale
Each active Waze user sends a GPS ping every 5–10 seconds during navigation. With millions of concurrent users, ingest is millions of writes per second.
Architecture:
- Ingest pings to a streaming pipeline (Kafka or equivalent)
- Partition by S2 cell or geohash to localize processing
- Stream-processing layer (Flink, Beam) computes per-segment speeds
- Speeds written to a fast-read store (Redis or in-memory cache)
Map matching
GPS pings are noisy. Map-match each ping to the most plausible road segment. Done client-side (using the bundled map data) before sending to the server.
The server trusts the client’s segment assignment but cross-validates against neighboring pings.
Speed aggregation
For each road segment, compute average speed of recent pings (last 60 seconds). Smoothing reduces noise. Outliers (a single car going 5mph on a clear highway) are filtered.
Sparse segments (few pings) revert to historical averages or rely on neighboring segment data.
Routing
Server-side. Take user A→B request. Compute shortest paths weighted by current segment speeds. Return polyline + ETA. Re-route automatically when conditions change.
Algorithms: contraction hierarchies, customizable contraction hierarchies (CCH), or A* with live edge weights.
Incident reporting
User taps “Report → Police.” Client sends location + type. Server stores in geographic index. Other clients within X miles of the report see it.
Reports decay over time. Other users can confirm or dispute, raising or lowering the report’s confidence score.
Voting and trust
To prevent abuse:
- New users have lower trust scores
- Confirmed reports increase the reporter’s score
- Disputed or stale reports decrease score
- Multiple high-trust confirmations create a “verified” report
Data partitioning
Geographic data scales by space, not by user count. Partition by S2 cells. Each cell has its own:
- Active users / pings stream
- Speed cache
- Incident database
Cross-cell queries (long routes) join across cells; in practice, sharded routing engines handle this.
Battery
- GPS at high accuracy during active navigation
- Cellular uplink batched every 5–10s
- Disable GPS when navigation ends
Frequently Asked Questions
How does Waze handle a user who reports false incidents?
Trust scoring penalizes their reports. Repeated abuse can result in account flags. Most reports are passive (driving slowly = traffic), so manipulation is hard.
Why does Waze sometimes give worse routes than Google Maps?
Waze optimizes for current traffic; Google often has better long-range planning and faster recovery from sudden conditions. Tradeoffs in routing algorithms.
How fresh is the traffic data?
Sub-30-second freshness for high-density areas. Lower-density areas degrade to 1-5 minute freshness.