Geo Routing Service Low-Level Design: IP Geolocation, Latency-Based Routing, and Failover

What a Geo Routing Service Does

A geo routing service maps each incoming request to the most appropriate backend region based on the client's geographic location, measured or estimated network latency, and the health of each region. It sits upstream of your application, either as part of the DNS layer or as a traffic manager, and steers requests before they reach any application server.

IP Geolocation: Mapping Addresses to Locations

The foundation is an IP-to-location database. MaxMind GeoIP2 is one of the most widely used: it maps CIDR blocks to country, region, city, and approximate latitude/longitude. Databases of this kind are typically compiled from BGP routing tables, WHOIS registration data, and active probing. A lookup is an in-memory longest-prefix match against a prefix tree and completes in microseconds.
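To make the prefix-tree lookup concrete, here is a minimal sketch of a binary trie doing longest-prefix match. The class name GeoTrie and the sample prefixes are hypothetical; this is not the on-disk format or API of any real geolocation database, only an illustration of the lookup technique.

```python
import ipaddress


class _Node:
    __slots__ = ("children", "value")

    def __init__(self):
        self.children = [None, None]  # child for bit 0, child for bit 1
        self.value = None             # location stored at this prefix, if any


class GeoTrie:
    """Binary trie mapping CIDR prefixes to a location label (hypothetical helper)."""

    def __init__(self):
        # Separate tries for IPv4 and IPv6.
        self._roots = {4: _Node(), 6: _Node()}

    def insert(self, cidr, location):
        net = ipaddress.ip_network(cidr)
        node = self._roots[net.version]
        addr = int(net.network_address)
        width = net.max_prefixlen
        # Walk one bit per level, most significant bit first.
        for i in range(net.prefixlen):
            bit = (addr >> (width - 1 - i)) & 1
            if node.children[bit] is None:
                node.children[bit] = _Node()
            node = node.children[bit]
        node.value = location

    def lookup(self, ip_text):
        ip = ipaddress.ip_address(ip_text)
        node = self._roots[ip.version]
        addr = int(ip)
        width = ip.max_prefixlen
        best = node.value
        for i in range(width):
            bit = (addr >> (width - 1 - i)) & 1
            node = node.children[bit]
            if node is None:
                break
            if node.value is not None:
                best = node.value  # remember the most specific match so far
        return best
```

With "198.51.100.0/24" mapped to one label and "198.51.100.128/25" mapped to another, a lookup for an address in the upper half of the /24 returns the /25 label, because the more specific prefix wins.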

Accuracy varies by network type. Residential ISPs are typically locatable to city level. Corporate and cloud provider ranges are often mapped to headquarters, not physical location. VPN and proxy users appear at the VPN exit node. For geo routing purposes, country-level accuracy is usually sufficient to pick the right continent-scale region.

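EDNS Client Subnet (ECS, RFC 7871) improves accuracy when the routing decision is made at the DNS layer. A recursive resolver that supports ECS includes a truncated prefix of the client's IP address in the query it sends to the authoritative server, allowing that server to geolocate the client's network rather than the resolver. Without ECS, the authoritative server sees only the resolver's address, which can be far from the client when a public resolver is in use.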
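The truncation step can be sketched as follows. The function name ecs_prefix is hypothetical, and the default prefix lengths (/24 for IPv4, /56 for IPv6) are assumed here as commonly used values; a given resolver may be configured differently.

```python
import ipaddress


def ecs_prefix(client_ip, v4_len=24, v6_len=56):
    """Return the truncated network a resolver might send in an ECS option.

    The prefix lengths are assumptions for illustration, not fixed by this document.
    """
    ip = ipaddress.ip_address(client_ip)
    prefix_len = v4_len if ip.version == 4 else v6_len
    # strict=False zeroes the host bits instead of raising an error.
    return ipaddress.ip_network(f"{ip}/{prefix_len}", strict=False)
```

The authoritative server would then geolocate the returned network (for example 203.0.113.0/24 for the client 203.0.113.77) instead of the resolver's own address.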

Latency-Based Routing

IP geolocation tells you where a client probably is; latency probing tells you which region is actually fastest for them. Active latency measurement runs continuous synthetic probes from each region to reference points (other regions, major ISP peering points). These measurements feed a latency matrix. When routing a client, the service selects the healthy region with the lowest measured round-trip time to the reference point nearest the client's estimated location.

Probe data goes stale quickly because internet routes change, so probes must run frequently (every 30–60 seconds) to stay current. Routing on a windowed p50/p95 latency rather than on the latest sample is more stable: a single transient spike does not cause the routing decision to flap.
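A minimal sketch of percentile-based selection, assuming one rolling window of probe samples per region for a given reference point. The names LatencyWindow and pick_region, the window size, and the nearest-rank percentile method are all illustrative choices, not something the design above prescribes.

```python
import math
from collections import deque


class LatencyWindow:
    """Keeps the most recent RTT samples for one (region, reference point) pair."""

    def __init__(self, max_samples=60):
        self._samples = deque(maxlen=max_samples)  # old samples fall off automatically

    def add(self, rtt_ms):
        self._samples.append(rtt_ms)

    def percentile(self, pct):
        """Nearest-rank percentile; returns None when no samples exist."""
        if not self._samples:
            return None
        ordered = sorted(self._samples)
        rank = max(1, math.ceil(pct / 100 * len(ordered)))
        return ordered[rank - 1]


def pick_region(windows, healthy, pct=50):
    """Choose the healthy region with the lowest windowed percentile RTT."""
    best_region, best_rtt = None, None
    for region, window in windows.items():
        if region not in healthy:
            continue
        rtt = window.percentile(pct)
        if rtt is None:
            continue
        if best_rtt is None or rtt < best_rtt:
            best_region, best_rtt = region, rtt
    return best_region
```

If a region's samples are 20, 21, 22, 23, and then a 500 ms spike, its p50 is still 22 ms, so it keeps winning against a steady 40 ms region even though its latest sample is far worse.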

Health-Aware Failover

Region health is a prerequisite for routing: only healthy regions are eligible targets. Each region exposes a health endpoint that the routing service polls on a short interval (10–30 seconds). Health checks should verify that the region is actually serving traffic, not just that the health endpoint responds. Relevant signals: request error rate, available capacity, and whether the region is in a scheduled maintenance window.

When a primary region becomes unhealthy, the routing service removes it from the candidate set and routes affected clients to the nearest healthy secondary. This failover must propagate quickly. DNS-based failover is limited by TTL; a traffic manager that intercepts requests at the proxy or load-balancer layer can fail over in seconds by changing upstream selection without waiting for DNS TTL expiration.
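The removal and fallback step might look like the sketch below. The consecutive-failure and consecutive-success thresholds are an assumption added here to avoid flapping on a single bad poll; the names RegionHealth and first_healthy are hypothetical.

```python
class RegionHealth:
    """Tracks one region's health from a stream of poll results."""

    def __init__(self, fail_threshold=3, recover_threshold=2):
        self.fail_threshold = fail_threshold        # assumed value
        self.recover_threshold = recover_threshold  # assumed value
        self.healthy = True
        self._fails = 0
        self._passes = 0

    def record(self, check_passed):
        """Record one poll result and return the region's current health."""
        if check_passed:
            self._passes += 1
            self._fails = 0
            if not self.healthy and self._passes >= self.recover_threshold:
                self.healthy = True
        else:
            self._fails += 1
            self._passes = 0
            if self.healthy and self._fails >= self.fail_threshold:
                self.healthy = False
        return self.healthy


def first_healthy(preference, health):
    """Return the first healthy region in a nearest-first preference list."""
    for region in preference:
        if health[region].healthy:
            return region
    return None  # no healthy region: the caller decides how to degrade
```

With a 10-second poll interval and a threshold of three failures, detection alone takes up to about 30 seconds, which is worth budgeting for when setting a failover target.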

GeoDNS Integration

GeoDNS is the most common implementation of geo routing at scale. The authoritative DNS server for your zone returns different A or CNAME records depending on where the query comes from. A query from a European resolver returns the Frankfurt endpoint; a query from an Asian resolver returns the Tokyo endpoint. Route 53 Latency-Based Routing and Cloudflare Load Balancing both offer this as a managed service.
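As a rough illustration of the answer selection, the sketch below maps the continent of the query source (the resolver, or the ECS prefix when present) to a CNAME target. The hostnames, continent codes, default target, and TTL are all hypothetical placeholders, not values from any managed service.

```python
# Hypothetical endpoints keyed by continent code, for illustration only.
ENDPOINTS = {
    "EU": "fra.example.com.",
    "AS": "nrt.example.com.",
    "NA": "iad.example.com.",
}
DEFAULT_ENDPOINT = "iad.example.com."  # assumed catch-all for unmapped sources


def geodns_answer(source_continent, ttl_seconds=60):
    """Build the record an authoritative server could return for this source."""
    target = ENDPOINTS.get(source_continent, DEFAULT_ENDPOINT)
    return {"type": "CNAME", "target": target, "ttl": ttl_seconds}
```

The TTL in the answer is what later bounds DNS-based failover: resolvers may keep serving the old target until it expires.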

The DNS layer enforces coarse-grained routing. Finer control — per-user routing, real-time capacity-based steering — requires a proxy layer that can inspect requests and redirect at the HTTP level.

Traffic Steering Policies

Beyond simple nearest-region routing, a geo routing service supports multiple steering policies:

  • Weighted split: send 90% to primary region, 10% to secondary (useful for load distribution when regions have different capacities)
  • Blue-green per region: route all traffic for a region to the new deployment, with instant cut-back if health degrades
  • Canary per region: gradually increase the percentage of a region's traffic going to the new version
  • Data residency enforcement: EU-origin traffic must route only to EU regions regardless of latency, to satisfy GDPR or data sovereignty requirements
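Two of these policies can be sketched briefly: a weighted split (which also covers a canary, by raising the percentage over time) and a data residency filter. Hashing the client key keeps each client on the same side of the split across requests; the hashing scheme and all function names are assumptions made for illustration.

```python
import hashlib


def bucket_for(client_key, buckets=100):
    """Map a client key to a stable bucket in [0, buckets)."""
    digest = hashlib.sha256(client_key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % buckets


def weighted_region(client_key, weights):
    """weights: list of (region, percent) pairs whose percents sum to 100."""
    if sum(percent for _, percent in weights) != 100:
        raise ValueError("weights must sum to 100")
    bucket = bucket_for(client_key)
    upper = 0
    for region, percent in weights:
        upper += percent
        if bucket < upper:
            return region


def residency_filter(candidates, jurisdiction, allowed_by_jurisdiction):
    """Drop candidate regions not permitted for the client's jurisdiction."""
    allowed = allowed_by_jurisdiction.get(jurisdiction)
    if allowed is None:
        return list(candidates)  # no residency rule for this jurisdiction
    return [region for region in candidates if region in allowed]
```

The residency filter should run before any latency or weight calculation, so that a faster but disallowed region is never even a candidate.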

Regional Capacity Weighting

Regions are not equal in capacity. A geo routing service should factor in each region's available capacity when distributing load. A region at 90% capacity should receive less traffic than one at 40%, even if it is geographically closer. Capacity signals come from the region itself (CPU utilization, queue depth, active connection count) and feed into the routing weight calculation alongside latency and health.
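One possible way to combine the signals, shown only as a sketch: score each healthy region by its capacity headroom divided by its latency, then normalize the scores into traffic shares. The formula is an assumption chosen for simplicity; a production system would likely tune or replace it.

```python
def routing_weights(regions):
    """regions: {name: (utilization in 0..1, rtt_ms)} -> {name: traffic share}.

    Assumes unhealthy regions were already removed from the input.
    """
    raw = {}
    for name, (utilization, rtt_ms) in regions.items():
        headroom = max(0.0, 1.0 - utilization)   # spare capacity fraction
        raw[name] = headroom / max(rtt_ms, 1.0)  # guard against zero latency
    total = sum(raw.values())
    if total == 0:
        return {name: 0.0 for name in raw}       # every region is saturated
    return {name: score / total for name, score in raw.items()}
```

For a nearby region at 90% utilization and 20 ms versus a farther one at 40% and 60 ms, this scoring sends roughly one third of the traffic to the nearby region and two thirds to the farther one, matching the intent that a nearly full region receives less load despite its proximity.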

Anycast BGP Routing

At the network layer, anycast BGP routing provides automatic proximity routing without any application-layer involvement. Multiple PoPs announce the same IP prefix, and each network along the way picks a route using BGP best-path selection: local policy first, then the shortest AS path, followed by further tie-breakers. Traffic therefore flows to the topologically nearest PoP, which is usually, but not always, the geographically nearest one. CDNs and authoritative DNS providers use this for their own infrastructure. Application-layer geo routing builds on top of anycast for the final request dispatch.
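A heavily simplified sketch of how one router might choose among anycast announcements for the same prefix. It models only two steps of BGP best-path selection (highest local preference, then shortest AS path) and omits the remaining tie-breakers; the Route type, PoP names, and AS numbers are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Route:
    pop: str          # PoP that originated the announcement (illustrative label)
    local_pref: int   # operator policy; higher is preferred
    as_path: tuple    # sequence of AS numbers the announcement traversed


def best_route(routes):
    """Prefer the highest local_pref, then the shortest AS path."""
    return min(routes, key=lambda r: (-r.local_pref, len(r.as_path)))
```

Because local policy is evaluated before path length, an operator's preference can send traffic to a PoP that is more AS hops away, which is one reason anycast does not always pick the geographically closest site.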

Measuring Routing Accuracy and Failover Propagation

Routing accuracy is measured by comparing the region a client was routed to against the region with the lowest actual latency for that client. Persistent misrouting (e.g., European traffic landing in US-East) indicates a stale or inaccurate geolocation database or a misconfigured steering policy. Failover propagation time is the interval from the moment a region fails its health check to the moment new clients stop being routed there. For DNS-based routing this is governed by the record TTL, since resolvers may keep serving the cached answer until it expires; for proxy-based routing it should be under 30 seconds.
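Both metrics can be computed from logged routing decisions. The sketch below assumes hypothetical log shapes: each accuracy sample pairs the routed region with measured per-region RTTs for that client, and each decision record is a timestamp plus the chosen region.

```python
def routing_accuracy(samples):
    """samples: iterable of (routed_region, {region: measured_rtt_ms}).

    Returns the fraction of clients sent to their lowest-latency region,
    or None when there are no samples.
    """
    total = correct = 0
    for routed, rtts in samples:
        total += 1
        if routed == min(rtts, key=rtts.get):
            correct += 1
    return correct / total if total else None


def failover_propagation_s(failed_at, decisions, failed_region):
    """decisions: iterable of (timestamp_s, region) for new-client routing.

    Returns seconds between the failed health check and the last time a
    new client was still sent to the failed region (0.0 if never).
    """
    late = [ts for ts, region in decisions
            if region == failed_region and ts >= failed_at]
    return max(late) - failed_at if late else 0.0
```

Tracking the accuracy figure per source country or network makes persistent misrouting easier to attribute to specific database entries or policies.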

Frequently Asked Questions

How does IP geolocation mapping work at scale?

IP geolocation databases (such as MaxMind GeoIP2) map CIDR prefixes to geographic coordinates by aggregating BGP routing table data, WHOIS registry records, and active measurement probes. At scale these databases are loaded into memory-mapped radix trees so that an incoming IP can be resolved to a country, region, or city in sub-microsecond time without a network round-trip.

How is latency-based routing measured between regions?

Latency-based routing relies on continuous synthetic probes (typically ICMP or TCP SYN pings) sent between all region pairs on a fixed interval, with results stored in a low-latency key-value store that the routing layer reads per request. AWS Route 53 latency routing, for example, uses passive measurements from actual resolver queries to maintain a region-to-resolver latency matrix that is updated every few minutes.

How does geo-routing handle data residency requirements?

Geo-routing enforces data residency by maintaining a policy table that maps jurisdiction codes (derived from the client's resolved location) to allowed region sets, and the routing layer rejects or redirects any request whose origin jurisdiction is incompatible with the candidate region. This is typically combined with per-tenant metadata so that a GDPR-regulated EU user is never routed to a US region even if that region currently has lower latency.

How quickly can traffic be shifted during a regional failover?

Traffic shift speed is bounded by DNS TTL for DNS-based geo-routing (tens of seconds to minutes) or by control-plane propagation lag for anycast/BGP-based routing (typically under 30 seconds once a withdraw is announced). Pre-warming the target region's connection pools and auto-scaling groups before the shift is critical to prevent a thundering-herd collapse when redirected traffic arrives.
