System Design Interview: Design Google Maps / Navigation System
Designing a mapping and navigation system like Google Maps is a complex system design question that tests geospatial knowledge, real-time routing, data storage at scale, and map rendering. It’s commonly asked at Google, Uber, Lyft, Apple (Maps), and DoorDash.
Requirements Clarification
Functional Requirements
- Display map tiles for any location and zoom level
- Search for locations (places, addresses, POIs)
- Calculate optimal routes between two points (driving, walking, transit)
- Turn-by-turn navigation with real-time traffic
- ETA calculation and dynamic re-routing
- Traffic layer — show congestion in real-time
Non-Functional Requirements
- Scale: 1 billion users, 10M concurrent navigation sessions
- Map tile serving: extremely read-heavy, cacheable
- Route calculation: P99 under 2 seconds for any global route
- Traffic updates: near real-time (30-60 second staleness)
- High availability: 99.99% (maps are mission-critical for drivers)
High-Level Architecture
Client → CDN (map tiles) → Load Balancer
↓
┌─────────────────┴──────────────────┐
│ │
Tile Service Navigation Service
(static tiles) (routing + traffic)
│ │
Object Storage ┌──────────────┴───────────┐
(S3/GCS) │ │
Graph DB Traffic Service
(road network) (Kafka + stream proc)
│ │
Route Engine Location DB
(Dijkstra/A*) (time-series)
Component 1: Map Tile System
Tile Coordinate System
Maps are divided into tiles using the Slippy Map system. Each tile is a 256×256 or 512×512 PNG/WebP image:
- Zoom level 0: entire world in 1 tile
- Zoom level 1: 4 tiles (2×2 grid)
- Zoom level z: 4^z tiles (2^z x 2^z grid)
- Zoom level 20: over 1 trillion tiles (street-level detail)
import math
def lat_lon_to_tile(lat: float, lon: float, zoom: int):
"""Convert lat/lon to tile coordinates at given zoom"""
n = 2 ** zoom
x_tile = int((lon + 180.0) / 360.0 * n)
lat_rad = math.radians(lat)
y_tile = int((1.0 - math.asinh(math.tan(lat_rad)) / math.pi) / 2.0 * n)
return x_tile, y_tile
def tile_to_lat_lon(x: int, y: int, zoom: int):
"""Convert tile coordinates back to lat/lon (NW corner of tile)"""
n = 2 ** zoom
lon = x / n * 360.0 - 180.0
lat_rad = math.atan(math.sinh(math.pi * (1 - 2 * y / n)))
lat = math.degrees(lat_rad)
return lat, lon
# Tile URL pattern: /{zoom}/{x}/{y}.png
# Example: /15/16378/10892.png
Tile Storage and Serving
- Generation: Offline process renders tiles from OpenStreetMap data using tools like Mapnik. Takes days for full global render.
- Storage: ~100TB for all zoom levels globally. Stored in object storage (GCS/S3) keyed by zoom/x/y.
- Serving: CDN with edge caching — tiles are static and highly cacheable (1-year TTL). Cache hit rate 99%+ for popular areas.
- Differential updates: When map data changes (new road), only affected tiles are re-rendered and cache-invalidated.
Component 2: Road Network Graph
Graph Representation
class RoadNode:
def __init__(self, node_id: int, lat: float, lon: float):
self.node_id = node_id
self.lat = lat
self.lon = lon
class RoadEdge:
def __init__(self, from_node: int, to_node: int,
distance_meters: float, speed_limit_kmh: int,
road_class: str, is_one_way: bool):
self.from_node = from_node
self.to_node = to_node
self.distance = distance_meters
self.speed_limit = speed_limit_kmh
self.road_class = road_class # highway, arterial, residential
self.is_one_way = is_one_way
def travel_time_seconds(self, current_speed_multiplier: float = 1.0) -> float:
"""Base travel time adjusted for current traffic"""
base_speed = self.speed_limit * current_speed_multiplier
return (self.distance / 1000) / base_speed * 3600
Graph Scale
- OpenStreetMap: ~7 billion nodes, ~800 million ways (roads)
- Road graph after preprocessing: ~1 billion nodes, ~1.5 billion edges
- Cannot fit in single machine memory — partitioned by geography (country/region)
- Contracted Hierarchies (CH) preprocessing reduces effective graph size dramatically
Component 3: Routing Engine
Dijkstra’s Algorithm (Baseline)
import heapq
from typing import Dict, List, Tuple
def dijkstra(graph: Dict[int, List[Tuple[int, float]]],
start: int, end: int) -> Tuple[float, List[int]]:
"""
Standard Dijkstra — too slow for global routing (billions of nodes)
graph: {node_id: [(neighbor_id, weight), ...]}
Returns: (total_distance, path)
"""
distances = {start: 0}
prev = {start: None}
pq = [(0, start)] # (distance, node)
while pq:
dist, node = heapq.heappop(pq)
if node == end:
# Reconstruct path
path = []
while node is not None:
path.append(node)
node = prev[node]
return dist, list(reversed(path))
if dist > distances.get(node, float('inf')):
continue # Stale entry
for neighbor, weight in graph.get(node, []):
new_dist = dist + weight
if new_dist < distances.get(neighbor, float('inf')):
distances[neighbor] = new_dist
prev[neighbor] = node
heapq.heappush(pq, (new_dist, neighbor))
return float('inf'), [] # No path found
A* Algorithm (Faster with Heuristic)
def haversine_distance(lat1, lon1, lat2, lon2) -> float:
"""Great-circle distance in meters"""
R = 6371000 # Earth radius in meters
phi1, phi2 = math.radians(lat1), math.radians(lat2)
dphi = math.radians(lat2 - lat1)
dlambda = math.radians(lon2 - lon1)
a = math.sin(dphi/2)**2 + math.cos(phi1)*math.cos(phi2)*math.sin(dlambda/2)**2
return R * 2 * math.atan2(math.sqrt(a), math.sqrt(1-a))
def astar(graph, node_coords, start, end) -> Tuple[float, List[int]]:
"""
A* with straight-line distance heuristic.
For routing: heuristic = haversine(current, goal) / max_speed
Explores far fewer nodes than Dijkstra for point-to-point routing.
"""
end_lat, end_lon = node_coords[end]
def heuristic(node_id):
lat, lon = node_coords[node_id]
dist = haversine_distance(lat, lon, end_lat, end_lon)
return dist / (130 / 3.6) # 130 km/h = max highway speed
g_score = {start: 0}
f_score = {start: heuristic(start)}
open_set = [(f_score[start], start)]
came_from = {}
while open_set:
_, current = heapq.heappop(open_set)
if current == end:
path = []
while current in came_from:
path.append(current)
current = came_from[current]
path.append(start)
return g_score[end], list(reversed(path))
for neighbor, weight in graph.get(current, []):
tentative_g = g_score[current] + weight
if tentative_g < g_score.get(neighbor, float('inf')):
came_from[neighbor] = current
g_score[neighbor] = tentative_g
f_score[neighbor] = tentative_g + heuristic(neighbor)
heapq.heappush(open_set, (f_score[neighbor], neighbor))
return float('inf'), []
Contraction Hierarchies (Production Algorithm)
Google Maps uses Contraction Hierarchies (CH) or similar hierarchical routing. Key idea:
- Preprocessing: Rank nodes by importance (major highways > arterials > residential). Add “shortcut” edges that bypass less-important nodes.
- Query: Bidirectional Dijkstra from source upward and from destination upward. Only explore edges going “up” the hierarchy. Meet in the middle at the most important node.
- Result: 1000x speedup over plain Dijkstra. Cross-continental routes in milliseconds.
Component 4: Real-Time Traffic
Traffic Data Pipeline
Mobile Clients (GPS probes) → Kafka → Stream Processor (Flink)
↓
Segment Speed Aggregator
(rolling 5-min window)
↓
Traffic DB (time-series)
↓
Route Engine (updated edge weights)
Probe Data Processing
class GPSProbe:
def __init__(self, user_id: int, lat: float, lon: float,
speed_kmh: float, timestamp: int, heading: float):
self.user_id = user_id
self.lat = lat
self.lon = lon
self.speed_kmh = speed_kmh
self.timestamp = timestamp
self.heading = heading
def map_match_to_segment(probe: GPSProbe, road_network) -> str:
"""
Map matching: snap GPS point to nearest road segment.
Uses HMM (Hidden Markov Model) for sequence of GPS points.
Returns road segment ID.
"""
# Find candidate road segments within 50 meters
candidates = road_network.find_nearby_segments(probe.lat, probe.lon, radius=50)
# Score by distance + heading alignment
best = min(candidates, key=lambda seg: heading_distance_score(probe, seg))
return best.segment_id
# Flink aggregation: rolling 5-minute window per segment
# Output: {segment_id: avg_speed_kmh, sample_count, timestamp}
# Route engine reads: edge_weight = segment_distance / adjusted_speed
Geospatial Indexing
Geohash
Geohash encodes lat/lon into a string where common prefix = geographic proximity. Level 6 geohash (~1.2km x 0.6km) is common for POI search. Used to partition data and find nearby points:
import geohash2
# Encode
gh = geohash2.encode(37.7749, -122.4194, precision=6) # "9q8yy9"
# Decode back to lat/lon
lat, lon, lat_err, lon_err = geohash2.decode_exactly(gh)
# Find neighbors (for search: query center + 8 surrounding cells)
neighbors = geohash2.neighbors(gh)
# Returns: {'n': '9q8yyy', 's': '9q8yy3', 'e': '9q8yyd', ...}
# POI search: find all places in geohash cell + neighbors
# SELECT * FROM pois WHERE geohash6 IN ('9q8yy9', '9q8yyy', ...)
S2 Geometry (Google’s approach)
Google Maps uses S2 library which projects Earth onto a cube, then uses Hilbert curves for space-filling indexing. Benefits over Geohash: no distortion at poles, cells are more uniform in area. S2 cells at different levels correspond to different zoom levels.
ETA Calculation
def calculate_eta(route_segments: list, traffic_data: dict) -> int:
"""
Calculate ETA in seconds accounting for:
- Base travel time from speed limits
- Current traffic (probe speeds)
- Historical patterns (time-of-day, day-of-week)
- Intersection delays (turn penalties)
"""
total_seconds = 0
for segment in route_segments:
base_time = segment.distance / segment.speed_limit_ms
traffic_factor = traffic_data.get(segment.id, {}).get('factor', 1.0)
# Historical adjustment: rush hour = 1.5x, 3am = 0.8x
historical = get_historical_factor(segment.road_class, segment.timestamp)
total_seconds += base_time * traffic_factor * historical
# Add intersection delays (left turns slower, traffic lights)
turn_penalties = sum(get_turn_penalty(s1, s2)
for s1, s2 in zip(route_segments, route_segments[1:]))
return int(total_seconds + turn_penalties)
Interview Discussion Points
- Why not use standard graph DBs for routing? Neo4j/Neptune can’t handle 1B+ node road networks with sub-second queries. Custom graph representations with Contraction Hierarchies are necessary.
- How do you handle map updates? OSM community updates + paid data providers (HERE, TomTom). Differential tile rendering. Road network updates require rerunning CH preprocessing (takes hours — done nightly).
- How does offline maps work? Download geohash cells at all zoom levels + road graph for region. SQLite database on device. Route calculation runs locally with A*.
- How do you route in areas with no GPS? Dead reckoning: accelerometer + gyroscope + last known position. Map matching helps confirm position when GPS signal returns.