System Design Interview: Design Google Maps / Navigation System

System Design Interview: Design Google Maps / Navigation System

Designing a mapping and navigation system like Google Maps is a complex system design question that tests geospatial knowledge, real-time routing, data storage at scale, and map rendering. It’s commonly asked at Google, Uber, Lyft, Apple (Maps), and DoorDash.

Requirements Clarification

Functional Requirements

  • Display map tiles for any location and zoom level
  • Search for locations (places, addresses, POIs)
  • Calculate optimal routes between two points (driving, walking, transit)
  • Turn-by-turn navigation with real-time traffic
  • ETA calculation and dynamic re-routing
  • Traffic layer — show congestion in real-time

Non-Functional Requirements

  • Scale: 1 billion users, 10M concurrent navigation sessions
  • Map tile serving: extremely read-heavy, cacheable
  • Route calculation: P99 under 2 seconds for any global route
  • Traffic updates: near real-time (30-60 second staleness)
  • High availability: 99.99% (maps are mission-critical for drivers)

High-Level Architecture

Client → CDN (map tiles) → Load Balancer
                                ↓
              ┌─────────────────┴──────────────────┐
              │                                    │
        Tile Service                      Navigation Service
        (static tiles)                   (routing + traffic)
              │                                    │
        Object Storage              ┌──────────────┴───────────┐
        (S3/GCS)                    │                          │
                              Graph DB                   Traffic Service
                           (road network)              (Kafka + stream proc)
                                    │                          │
                             Route Engine               Location DB
                           (Dijkstra/A*)               (time-series)

Component 1: Map Tile System

Tile Coordinate System

Maps are divided into tiles using the Slippy Map system. Each tile is a 256×256 or 512×512 PNG/WebP image:

  • Zoom level 0: entire world in 1 tile
  • Zoom level 1: 4 tiles (2×2 grid)
  • Zoom level z: 4^z tiles (2^z x 2^z grid)
  • Zoom level 20: over 1 trillion tiles (street-level detail)
import math

def lat_lon_to_tile(lat: float, lon: float, zoom: int):
    """Convert lat/lon to tile coordinates at given zoom"""
    n = 2 ** zoom
    x_tile = int((lon + 180.0) / 360.0 * n)
    lat_rad = math.radians(lat)
    y_tile = int((1.0 - math.asinh(math.tan(lat_rad)) / math.pi) / 2.0 * n)
    return x_tile, y_tile

def tile_to_lat_lon(x: int, y: int, zoom: int):
    """Convert tile coordinates back to lat/lon (NW corner of tile)"""
    n = 2 ** zoom
    lon = x / n * 360.0 - 180.0
    lat_rad = math.atan(math.sinh(math.pi * (1 - 2 * y / n)))
    lat = math.degrees(lat_rad)
    return lat, lon

# Tile URL pattern: /{zoom}/{x}/{y}.png
# Example: /15/16378/10892.png

Tile Storage and Serving

  • Generation: Offline process renders tiles from OpenStreetMap data using tools like Mapnik. Takes days for full global render.
  • Storage: ~100TB for all zoom levels globally. Stored in object storage (GCS/S3) keyed by zoom/x/y.
  • Serving: CDN with edge caching — tiles are static and highly cacheable (1-year TTL). Cache hit rate 99%+ for popular areas.
  • Differential updates: When map data changes (new road), only affected tiles are re-rendered and cache-invalidated.

Component 2: Road Network Graph

Graph Representation

class RoadNode:
    def __init__(self, node_id: int, lat: float, lon: float):
        self.node_id = node_id
        self.lat = lat
        self.lon = lon

class RoadEdge:
    def __init__(self, from_node: int, to_node: int,
                 distance_meters: float, speed_limit_kmh: int,
                 road_class: str, is_one_way: bool):
        self.from_node = from_node
        self.to_node = to_node
        self.distance = distance_meters
        self.speed_limit = speed_limit_kmh
        self.road_class = road_class  # highway, arterial, residential
        self.is_one_way = is_one_way

    def travel_time_seconds(self, current_speed_multiplier: float = 1.0) -> float:
        """Base travel time adjusted for current traffic"""
        base_speed = self.speed_limit * current_speed_multiplier
        return (self.distance / 1000) / base_speed * 3600

Graph Scale

  • OpenStreetMap: ~7 billion nodes, ~800 million ways (roads)
  • Road graph after preprocessing: ~1 billion nodes, ~1.5 billion edges
  • Cannot fit in single machine memory — partitioned by geography (country/region)
  • Contracted Hierarchies (CH) preprocessing reduces effective graph size dramatically

Component 3: Routing Engine

Dijkstra’s Algorithm (Baseline)

import heapq
from typing import Dict, List, Tuple

def dijkstra(graph: Dict[int, List[Tuple[int, float]]], 
             start: int, end: int) -> Tuple[float, List[int]]:
    """
    Standard Dijkstra — too slow for global routing (billions of nodes)
    graph: {node_id: [(neighbor_id, weight), ...]}
    Returns: (total_distance, path)
    """
    distances = {start: 0}
    prev = {start: None}
    pq = [(0, start)]  # (distance, node)

    while pq:
        dist, node = heapq.heappop(pq)

        if node == end:
            # Reconstruct path
            path = []
            while node is not None:
                path.append(node)
                node = prev[node]
            return dist, list(reversed(path))

        if dist > distances.get(node, float('inf')):
            continue  # Stale entry

        for neighbor, weight in graph.get(node, []):
            new_dist = dist + weight
            if new_dist < distances.get(neighbor, float('inf')):
                distances[neighbor] = new_dist
                prev[neighbor] = node
                heapq.heappush(pq, (new_dist, neighbor))

    return float('inf'), []  # No path found

A* Algorithm (Faster with Heuristic)

def haversine_distance(lat1, lon1, lat2, lon2) -> float:
    """Great-circle distance in meters"""
    R = 6371000  # Earth radius in meters
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlambda = math.radians(lon2 - lon1)
    a = math.sin(dphi/2)**2 + math.cos(phi1)*math.cos(phi2)*math.sin(dlambda/2)**2
    return R * 2 * math.atan2(math.sqrt(a), math.sqrt(1-a))

def astar(graph, node_coords, start, end) -> Tuple[float, List[int]]:
    """
    A* with straight-line distance heuristic.
    For routing: heuristic = haversine(current, goal) / max_speed
    Explores far fewer nodes than Dijkstra for point-to-point routing.
    """
    end_lat, end_lon = node_coords[end]

    def heuristic(node_id):
        lat, lon = node_coords[node_id]
        dist = haversine_distance(lat, lon, end_lat, end_lon)
        return dist / (130 / 3.6)  # 130 km/h = max highway speed

    g_score = {start: 0}
    f_score = {start: heuristic(start)}
    open_set = [(f_score[start], start)]
    came_from = {}

    while open_set:
        _, current = heapq.heappop(open_set)
        if current == end:
            path = []
            while current in came_from:
                path.append(current)
                current = came_from[current]
            path.append(start)
            return g_score[end], list(reversed(path))

        for neighbor, weight in graph.get(current, []):
            tentative_g = g_score[current] + weight
            if tentative_g < g_score.get(neighbor, float('inf')):
                came_from[neighbor] = current
                g_score[neighbor] = tentative_g
                f_score[neighbor] = tentative_g + heuristic(neighbor)
                heapq.heappush(open_set, (f_score[neighbor], neighbor))

    return float('inf'), []

Contraction Hierarchies (Production Algorithm)

Google Maps uses Contraction Hierarchies (CH) or similar hierarchical routing. Key idea:

  1. Preprocessing: Rank nodes by importance (major highways > arterials > residential). Add “shortcut” edges that bypass less-important nodes.
  2. Query: Bidirectional Dijkstra from source upward and from destination upward. Only explore edges going “up” the hierarchy. Meet in the middle at the most important node.
  3. Result: 1000x speedup over plain Dijkstra. Cross-continental routes in milliseconds.

Component 4: Real-Time Traffic

Traffic Data Pipeline

Mobile Clients (GPS probes) → Kafka → Stream Processor (Flink)
                                              ↓
                                    Segment Speed Aggregator
                                    (rolling 5-min window)
                                              ↓
                                    Traffic DB (time-series)
                                              ↓
                                    Route Engine (updated edge weights)

Probe Data Processing

class GPSProbe:
    def __init__(self, user_id: int, lat: float, lon: float, 
                 speed_kmh: float, timestamp: int, heading: float):
        self.user_id = user_id
        self.lat = lat
        self.lon = lon
        self.speed_kmh = speed_kmh
        self.timestamp = timestamp
        self.heading = heading

def map_match_to_segment(probe: GPSProbe, road_network) -> str:
    """
    Map matching: snap GPS point to nearest road segment.
    Uses HMM (Hidden Markov Model) for sequence of GPS points.
    Returns road segment ID.
    """
    # Find candidate road segments within 50 meters
    candidates = road_network.find_nearby_segments(probe.lat, probe.lon, radius=50)
    # Score by distance + heading alignment
    best = min(candidates, key=lambda seg: heading_distance_score(probe, seg))
    return best.segment_id

# Flink aggregation: rolling 5-minute window per segment
# Output: {segment_id: avg_speed_kmh, sample_count, timestamp}
# Route engine reads: edge_weight = segment_distance / adjusted_speed

Geospatial Indexing

Geohash

Geohash encodes lat/lon into a string where common prefix = geographic proximity. Level 6 geohash (~1.2km x 0.6km) is common for POI search. Used to partition data and find nearby points:

import geohash2

# Encode
gh = geohash2.encode(37.7749, -122.4194, precision=6)  # "9q8yy9"

# Decode back to lat/lon
lat, lon, lat_err, lon_err = geohash2.decode_exactly(gh)

# Find neighbors (for search: query center + 8 surrounding cells)
neighbors = geohash2.neighbors(gh)
# Returns: {'n': '9q8yyy', 's': '9q8yy3', 'e': '9q8yyd', ...}

# POI search: find all places in geohash cell + neighbors
# SELECT * FROM pois WHERE geohash6 IN ('9q8yy9', '9q8yyy', ...)

S2 Geometry (Google’s approach)

Google Maps uses S2 library which projects Earth onto a cube, then uses Hilbert curves for space-filling indexing. Benefits over Geohash: no distortion at poles, cells are more uniform in area. S2 cells at different levels correspond to different zoom levels.

ETA Calculation

def calculate_eta(route_segments: list, traffic_data: dict) -> int:
    """
    Calculate ETA in seconds accounting for:
    - Base travel time from speed limits
    - Current traffic (probe speeds)
    - Historical patterns (time-of-day, day-of-week)
    - Intersection delays (turn penalties)
    """
    total_seconds = 0
    for segment in route_segments:
        base_time = segment.distance / segment.speed_limit_ms
        traffic_factor = traffic_data.get(segment.id, {}).get('factor', 1.0)
        # Historical adjustment: rush hour = 1.5x, 3am = 0.8x
        historical = get_historical_factor(segment.road_class, segment.timestamp)
        total_seconds += base_time * traffic_factor * historical

    # Add intersection delays (left turns slower, traffic lights)
    turn_penalties = sum(get_turn_penalty(s1, s2) 
                         for s1, s2 in zip(route_segments, route_segments[1:]))
    return int(total_seconds + turn_penalties)

Interview Discussion Points

  • Why not use standard graph DBs for routing? Neo4j/Neptune can’t handle 1B+ node road networks with sub-second queries. Custom graph representations with Contraction Hierarchies are necessary.
  • How do you handle map updates? OSM community updates + paid data providers (HERE, TomTom). Differential tile rendering. Road network updates require rerunning CH preprocessing (takes hours — done nightly).
  • How does offline maps work? Download geohash cells at all zoom levels + road graph for region. SQLite database on device. Route calculation runs locally with A*.
  • How do you route in areas with no GPS? Dead reckoning: accelerometer + gyroscope + last known position. Map matching helps confirm position when GPS signal returns.
Scroll to Top