Google Maps serves over 1 billion monthly active users with real-time navigation, traffic data, satellite imagery, and location search. Designing a maps platform combines geospatial data structures, graph algorithms for routing, image tiling for rendering, and ML for traffic prediction. This is one of the most comprehensive system design questions — touching storage, algorithms, real-time data, and rendering. This guide covers the key architectural components.
Geospatial Data Model
The world map is represented as: (1) Road network graph — nodes are intersections, edges are road segments with attributes: distance, speed limit, number of lanes, one-way flag, road type (highway, residential), and current traffic speed. The global road graph has billions of nodes and edges. (2) Points of Interest (POI) — restaurants, gas stations, hospitals with coordinates, names, categories, ratings, hours. Billions of POIs worldwide. (3) Map tiles — pre-rendered image tiles at multiple zoom levels for visual display. (4) Address data — mapping street addresses to coordinates (geocoding) and coordinates to addresses (reverse geocoding). Geospatial indexing: storing and querying spatial data efficiently. Approaches: (1) Geohash — encode latitude/longitude into a string. Nearby locations share a common prefix. geohash(“40.7128, -74.0060”) = “dr5ru7”. Prefix “dr5ru” covers a small area. Use for: proximity queries (find POIs near a location). (2) S2 Geometry (Google) — divides the Earth into hierarchical cells using a space-filling curve. Each cell has a unique 64-bit ID. Cells at the same level are roughly equal in area (unlike geohash which has distortion at poles). Google Maps uses S2 internally. (3) R-tree — a balanced tree for spatial indexing. Each node represents a bounding rectangle. Used by PostGIS for spatial queries.
Routing and Navigation
Finding the fastest route from A to B on a graph with billions of edges requires optimizations beyond basic Dijkstra (which is too slow for real-time queries on the full graph). Contraction Hierarchies (CH): a preprocessing technique that adds “shortcut” edges to skip intermediate nodes. During preprocessing: repeatedly contract the least important node (remove it and add shortcuts between its neighbors if the shortest path goes through it). The augmented graph enables bidirectional Dijkstra that expands very few nodes — sub-second queries on continent-scale graphs. Google and OSRM use variants of CH. A* with landmarks: A* uses a heuristic to guide search toward the destination. Landmarks are pre-selected distant points with pre-computed distances to all nodes. The triangle inequality provides a tight lower bound, dramatically reducing explored nodes. Traffic-aware routing: edge weights change with real-time traffic. The routing engine uses current traffic speeds (from GPS traces of millions of drivers) as edge weights. Traffic updates invalidate pre-computed shortcuts, so a hybrid approach is used: CH for the highway backbone (less affected by traffic) and real-time Dijkstra for the last-mile urban routing. Alternative routes: after finding the optimal route, compute 2-3 alternatives that are meaningfully different (not just minor detours). Penalize edges in the optimal route and re-run the search.
Map Tile Rendering
The visual map is rendered as tiles: square images at multiple zoom levels. At zoom level 0: one tile covers the entire world. At zoom level 1: 4 tiles (2×2). At zoom level Z: 4^Z tiles. At zoom level 18 (street level): approximately 69 billion tiles. Tiles are pre-rendered and served from CDN cache. When the user pans or zooms, the client requests the tiles for the visible area. Vector tiles (modern approach): instead of pre-rendered images, send the raw geographic data (road geometries, building outlines, labels) as Protocol Buffer-encoded vector tiles. The client renders them on the GPU. Benefits: smaller data size (10x smaller than raster tiles), smooth zooming (no pixelation), dynamic styling (switch between day/night mode without re-downloading), and rotation/tilt. Google Maps, Mapbox, and Apple Maps use vector tiles. Tile serving: tiles are keyed by (zoom_level, x, y). Store in object storage (S3/GCS). Serve via CDN with long cache TTLs (tiles change infrequently). At zoom level 18, most tiles are rarely requested (middle of the ocean). Only render on demand and cache; pre-render for popular areas.
ETA Prediction and Traffic
ETA accuracy is critical for navigation and ride-sharing. Google Maps uses an ML model (DeepMind collaboration) that predicts travel time with high accuracy. Data sources: (1) Real-time GPS traces from millions of Android phones and Google Maps users. Aggregated and anonymized: average speed per road segment updated every few minutes. (2) Historical traffic patterns: time-of-day and day-of-week patterns. Monday morning rush hour on Highway 101 is predictable. (3) Live incident data: accidents, road closures, construction from user reports and official feeds. (4) Weather data: rain, snow, and fog affect travel speeds. ML model: a graph neural network operates on the road network graph. Node features: road type, speed limit, number of lanes. Edge features: current speed, historical speed pattern, incidents. The model predicts travel time for each road segment and sums them for the full route. The model captures complex interactions: a traffic jam on one road causes spillover to alternate routes. Training data: billions of historical trips with actual travel times. ETA display: show a range (25-35 minutes) during planning. Show a single estimate during active navigation, updated every minute based on real-time position and conditions.
Location Search and Geocoding
Location search: the user types “coffee near me” or “123 Main St, New York.” The search system must understand: text queries (search POI database by name, category, and location), address queries (geocode the address to coordinates), and categorical queries (find nearby restaurants, sorted by distance and relevance). Geocoding: converting an address to coordinates. This requires a structured address database mapping street names, numbers, cities, and postal codes to coordinate ranges. Reverse geocoding: coordinates to address. Given (40.7128, -74.0060), return “New York, NY, USA.” Implemented with a spatial index: find the nearest address point or road segment to the coordinates. Autocomplete: as the user types, suggest completions from POIs, addresses, and recent searches. Rank by: relevance (text match), proximity (closer results first), popularity (frequently searched places first), and personalization (places the user has visited). The search system combines a text index (Elasticsearch for POI names) with a spatial index (S2/geohash for proximity) and an ML ranker that blends text relevance with geographic and behavioral signals.