Traffic Management Service Low-Level Design: Virtual Services, Weighted Routing, and Fault Injection

Overview

A traffic management service provides a centralized control plane for defining how traffic flows between services in a microservice deployment. It exposes high-level routing abstractions such as virtual services and destination rules, translates them into proxy-level configuration, and distributes that configuration to the data plane. This enables canary releases, A/B testing, fault injection for chaos engineering, and header-based routing for multi-tenant environments without code changes.

Requirements

Functional Requirements

  • Define virtual service routing rules that match on HTTP method, path, headers, and query parameters.
  • Split traffic by percentage across multiple service versions using weighted destinations.
  • Inject artificial delays and abort responses for specified fractions of requests (fault injection).
  • Route traffic based on request headers to support dark launching and user cohort routing.
  • Apply timeout and retry policies at the virtual service level.

Non-Functional Requirements

  • Policy changes propagate to all proxies within 30 seconds under normal conditions.
  • The control plane handles clusters with 10,000 services and 100,000 proxy instances.
  • Configuration is version-controlled with rollback capability to any previous version.
  • Validation rejects syntactically invalid or semantically conflicting rules before applying them.

Data Model

Virtual Service

A virtual service object contains: name, host (the DNS name of the destination service), a list of HTTP route rules, and optional TLS and TCP route rules. Each HTTP route rule contains: an ordered list of match conditions (path, headers, query params, source labels), a list of weighted route destinations each specifying a subset name and weight, and optional fault injection and timeout overrides. Route rules are evaluated in order; the first matching rule wins.
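The first-match evaluation described above can be sketched as follows. This is a minimal illustration, not the service's actual schema: the class names, the field names, and the choice that a rule matches when any one of its match conditions matches (with an empty list matching everything) are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Match:
    # Hypothetical match condition; an unset field matches any request.
    path_prefix: Optional[str] = None
    headers: tuple = ()  # tuple of (header_name, exact_value) pairs


@dataclass(frozen=True)
class Destination:
    subset: str
    weight: int


@dataclass
class HttpRoute:
    matches: list        # assumed semantics: rule matches if ANY Match matches
    destinations: list   # weighted Destination entries
    timeout_s: Optional[float] = None


def match_ok(m, path, headers):
    """Return True if a single match condition accepts the request."""
    if m.path_prefix is not None and not path.startswith(m.path_prefix):
        return False
    return all(headers.get(name) == value for name, value in m.headers)


def select_route(routes, path, headers):
    """Walk the rules in order and return the first one that matches."""
    for route in routes:
        if not route.matches or any(match_ok(m, path, headers) for m in route.matches):
            return route
    return None
```

Because evaluation stops at the first match, a catch-all rule with no match conditions belongs at the end of the list.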

Destination Rule

A destination rule defines subsets of a service and connection policies. Each subset is a name bound to a label selector (such as version: v2) that picks out the matching service instances. The traffic policy section of the destination rule specifies connection pool settings (max connections, max pending requests, max retries) and outlier detection settings (consecutive errors before ejection, maximum ejection percentage, inspection interval). These settings are translated into cluster-level circuit breaker and outlier detection (passive health checking) configuration in the proxy.
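To make the label selector concrete, here is a small sketch of how a subset could be resolved to endpoints. The endpoint record shape (a dict with "address" and "labels") and the function name are hypothetical, chosen only for illustration.

```python
def endpoints_for_subset(endpoints, selector):
    """Return addresses of endpoints whose labels contain every selector pair.

    endpoints: list of {"address": str, "labels": dict} (assumed shape)
    selector:  dict of label name -> required value, e.g. {"version": "v2"}
    """
    return [
        e["address"]
        for e in endpoints
        if all(e["labels"].get(k) == v for k, v in selector.items())
    ]
```

An endpoint may carry extra labels; only the pairs named in the selector have to match.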

Configuration Store

All virtual service and destination rule objects are persisted in a strongly consistent configuration store keyed by (type, namespace, name). Each write creates a new immutable version with a monotonically increasing sequence number. The current active version per key is tracked in a separate pointer record. Rollback sets the pointer to a previous version without deleting the history, enabling audit trails and instant recovery.
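The version-plus-pointer scheme can be illustrated with an in-memory sketch. It is only a model of the bookkeeping: a real store would need the strong consistency described above, and the class and method names here are assumptions.

```python
import copy


class VersionedStore:
    """In-memory model: immutable versions plus an active pointer per key."""

    def __init__(self):
        self._versions = {}  # key -> {sequence number: spec}
        self._active = {}    # key -> sequence number of the active version
        self._seq = 0        # monotonically increasing across all writes

    def put(self, key, spec):
        """Store a new version and make it active; return its sequence number."""
        self._seq += 1
        # Deep copy so later mutation by the caller cannot alter history.
        self._versions.setdefault(key, {})[self._seq] = copy.deepcopy(spec)
        self._active[key] = self._seq
        return self._seq

    def get_active(self, key):
        seq = self._active[key]
        return seq, self._versions[key][seq]

    def rollback(self, key, seq):
        """Move the pointer to an earlier version; history is left untouched."""
        if seq not in self._versions.get(key, {}):
            raise KeyError(f"no version {seq} for {key}")
        self._active[key] = seq

    def history(self, key):
        return sorted(self._versions.get(key, {}))
```

Rollback only rewrites the pointer record, which is why it is cheap and leaves the audit trail intact.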

Core Algorithms

Route Rule Compilation

When a virtual service is created or updated, the control plane compiles the high-level routing rules into xDS RouteConfiguration objects for delivery to proxies. The compilation process:

  • Validates that all referenced destination subsets exist in a corresponding destination rule.
  • Checks that weights across destinations in each route rule sum to 100.
  • Detects conflicting rules (identical match conditions on different routes) and returns a validation error.
  • Translates header match conditions into Envoy-native regex or exact match structures.
  • Serializes the compiled configuration to protobuf and writes it to the xDS snapshot cache keyed by the virtual service name.
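The three validation steps in the list above can be sketched as one pass over the rules. The input shape is an assumption for illustration: each rule is a dict with a hashable "match" key and a "destinations" list of (subset, weight) pairs. Translation to proxy structures and protobuf serialization are out of scope here.

```python
def validate_routes(routes, known_subsets):
    """Return a list of human-readable validation errors (empty if valid).

    routes:        list of {"match": hashable, "destinations": [(subset, weight)]}
    known_subsets: set of subset names defined by the destination rule
    """
    errors = []
    first_seen = {}  # match key -> index of the first rule that used it
    for i, route in enumerate(routes):
        # 1. Every referenced subset must exist.
        for subset, _ in route["destinations"]:
            if subset not in known_subsets:
                errors.append(f"route {i}: unknown subset {subset!r}")
        # 2. Weights within a rule must sum to 100.
        total = sum(weight for _, weight in route["destinations"])
        if total != 100:
            errors.append(f"route {i}: weights sum to {total}, expected 100")
        # 3. Identical match conditions on two rules are a conflict.
        key = route["match"]
        if key in first_seen:
            errors.append(f"route {i}: same match conditions as route {first_seen[key]}")
        else:
            first_seen[key] = i
    return errors
```

Collecting all errors instead of stopping at the first lets a dry-run call report everything wrong with a spec in one response.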

Weighted Traffic Splitting

The compiled route destination list is delivered to proxies with integer weights. The proxy performs weighted random selection per request: it generates a random number in [0, total_weight) and linearly scans the destination list, accumulating weights until the cumulative sum exceeds the random value. The selected destination determines the upstream cluster. For heavily imbalanced splits (such as 99/1), the linear scan is efficient as long as the heaviest destination is listed first, because the scan then stops at the first entry in the vast majority of cases. For many-way splits, a binary search on the prefix sum array reduces selection to O(log D), where D is the destination count.
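Both selection strategies are short enough to show side by side. This is a sketch with assumed function names; the random draw is passed in as r so the two variants can be compared deterministically.

```python
import bisect
import itertools
import random


def pick_linear(destinations, r):
    """destinations: list of (name, weight); r: integer in [0, total_weight)."""
    cumulative = 0
    for name, weight in destinations:
        cumulative += weight
        if cumulative > r:  # first cumulative sum that exceeds the draw
            return name
    raise ValueError("r must be below the total weight")


def build_prefix_sums(destinations):
    """Precompute cumulative weights once per configuration update."""
    return list(itertools.accumulate(weight for _, weight in destinations))


def pick_bisect(destinations, prefix_sums, r):
    # bisect_right returns the index of the first prefix sum strictly greater than r.
    return destinations[bisect.bisect_right(prefix_sums, r)][0]


def pick_random(destinations, rng=random):
    """Per-request selection using a fresh draw in [0, total_weight)."""
    total = sum(weight for _, weight in destinations)
    return pick_linear(destinations, rng.randrange(total))
```

For a 90/10 split, draws 0 through 89 select the first destination and draws 90 through 99 select the second, under either strategy.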

Fault Injection

Fault injection rules specify a delay fault (hold the request for a fixed delay, or a delay drawn from a range, before forwarding it), an abort fault (return an error response immediately without forwarding), or both. Each fault has a percentage specifying what fraction of matching requests it applies to. The proxy evaluates fault rules after route selection: it draws a random value in [0, 100) and applies the fault if the value is below the configured percentage. Delay faults use the proxy event loop timer to inject the delay without blocking a worker thread, preserving connection handling capacity during the injected pause.
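The percentage check and the non-blocking delay can be modeled with asyncio, where asyncio.sleep plays the role of the event loop timer. The function and parameter names are hypothetical, and drawing independently for the delay and the abort is an assumption of this sketch rather than something the design specifies.

```python
import asyncio
import random


async def example_upstream():
    # Hypothetical stand-in for forwarding the request upstream.
    return 200


async def apply_faults(forward, *, delay_pct=0.0, delay_s=0.0,
                       abort_pct=0.0, abort_status=503, rng=random):
    """Apply optional delay and abort faults, then forward the request."""
    # rng.random() * 100 lies in [0, 100), matching the draw described above.
    if rng.random() * 100 < delay_pct:
        # Yields to the event loop; other requests keep being served meanwhile.
        await asyncio.sleep(delay_s)
    if rng.random() * 100 < abort_pct:
        # Abort fault: answer immediately without contacting upstream.
        return abort_status
    return await forward()
```

With abort_pct=100.0 every request is aborted, and with both percentages at 0 the wrapper is a pass-through.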

API Design

  • ApplyVirtualService(spec) — validates and applies a virtual service definition; returns the new version number on success.
  • ApplyDestinationRule(spec) — validates and applies a destination rule; checks that all referenced subsets are defined.
  • RollbackVirtualService(name, namespace, version) — reverts the active pointer to a specified historical version.
  • GetVirtualService(name, namespace) — returns the current active specification and version history.
  • ValidateVirtualService(spec) — dry-run validation that returns errors without persisting the object.
  • GetPropagationStatus(name, namespace) — returns the count of proxies that have acknowledged the current configuration version.

Scalability

xDS Distribution

The control plane uses the Aggregated Discovery Service (ADS) protocol to multiplex all xDS resource types over a single gRPC stream per proxy. When any virtual service or destination rule changes, the control plane computes a new configuration snapshot and notifies affected proxy streams via the ADS push mechanism. Proxies acknowledge each version, and the control plane tracks ACK state per proxy to monitor propagation progress. To avoid overwhelming slow proxies with rapid successive updates, the control plane coalesces changes that arrive within a 100 ms debounce window into a single push.
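One way to picture the debounce step is a small state machine driven by an explicit clock. This sketch assumes a trailing-edge debounce (each change restarts the quiet period) and uses integer milliseconds; the class name and that interpretation of the 100 ms window are assumptions.

```python
class PushDebouncer:
    """Coalesce changes until no new change has arrived for window_ms."""

    def __init__(self, window_ms=100):
        self.window_ms = window_ms
        self._pending = set()   # resources changed since the last push
        self._deadline = None   # time at which the batch may be pushed

    def record_change(self, resource, now_ms):
        self._pending.add(resource)
        # Each change restarts the quiet period.
        self._deadline = now_ms + self.window_ms

    def poll(self, now_ms):
        """Return the batch to push, or None if the window is still open."""
        if self._deadline is None or now_ms < self._deadline:
            return None
        batch, self._pending, self._deadline = self._pending, set(), None
        return batch
```

A steady stream of changes would postpone the push indefinitely under this scheme, so a practical version would also cap the total wait.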

Horizontal Scaling

Control plane instances are stateless with respect to proxy connections. Proxy-to-control-plane assignment uses consistent hashing on pod ID, distributing connections evenly. When a control plane instance fails, affected proxies reconnect and are redistributed to surviving instances. Reconnecting proxies receive the current configuration snapshot immediately upon connection establishment, restoring them to the correct state within one round-trip time after reconnection.
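A consistent hash ring with virtual nodes is one common way to get the redistribution behavior described above. The sketch below is an assumption about how the assignment might be done; the instance names, the SHA-256 based hash, and the virtual node count are illustrative.

```python
import bisect
import hashlib


def _hash(key):
    """Map a string to a 64-bit integer position on the ring."""
    return int.from_bytes(hashlib.sha256(key.encode()).digest()[:8], "big")


class HashRing:
    def __init__(self, instances, vnodes=100):
        # Each instance owns several points so load spreads more evenly.
        self._ring = sorted(
            (_hash(f"{inst}#{i}"), inst) for inst in instances for i in range(vnodes)
        )
        self._points = [point for point, _ in self._ring]

    def owner(self, pod_id):
        """Return the instance owning the first ring point after the pod's hash."""
        idx = bisect.bisect_right(self._points, _hash(pod_id)) % len(self._ring)
        return self._ring[idx][1]
```

When an instance is removed from the ring, only the pods it owned move; every other pod keeps its current control plane instance.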

Monitoring

Key metrics include:

  • Configuration push latency percentiles.
  • ACK rate: the fraction of proxies that acknowledged the latest version within 30 seconds.
  • Validation error rate per API call.
  • Rollback frequency; a high rollback rate signals instability in the deployment pipeline.
  • Per-virtual-service request distribution across weighted destinations, sampled from proxy metrics, to verify that actual traffic ratios match configured weights within acceptable statistical variance.
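The phrase "acceptable statistical variance" can be made concrete with a binomial check: for n sampled requests and configured share p, the count for a destination has standard deviation sqrt(n p (1 - p)). The sketch below flags a split when any destination is more than z standard deviations from its expected count. The function name and the default z = 3 are assumptions, not values from the design.

```python
import math


def split_within_tolerance(observed, weights, z=3.0):
    """Check observed request counts against configured weights.

    observed: dict of subset -> sampled request count
    weights:  dict of subset -> configured weight
    """
    n = sum(observed.values())
    total_weight = sum(weights.values())
    for subset, weight in weights.items():
        p = weight / total_weight
        expected = n * p
        sigma = math.sqrt(n * p * (1 - p))  # binomial standard deviation
        if abs(observed.get(subset, 0) - expected) > z * sigma:
            return False
    return True
```

For a 90/10 split sampled over 10,000 requests, sigma is 30 requests, so a canary count of 970 passes while a count of 500 does not.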

Frequently Asked Questions

How do virtual service routing rules work in a traffic management system?

A VirtualService resource defines match conditions (host, path, headers, method) and corresponding routing actions (destination, weight, rewrite). The data plane (sidecar proxies) watches the control plane for VirtualService updates and recompiles its internal route tables. Rules are evaluated top-to-bottom with first-match semantics, so more-specific rules are placed earlier. This decouples deployment topology from routing intent, enabling dark launches and A/B tests without redeployment.

How is weighted traffic splitting implemented for canary and blue-green deployments?

Weighted splitting assigns integer percentage weights to multiple destination subsets (e.g., stable=90, canary=10). Each proxy normalises the weights to a probability distribution and selects a subset per request using a random draw or a consistent-hash slot assignment. The control plane provides an API to adjust weights without downtime, allowing gradual canary promotion while monitoring error rates; rolling back is a single weight update.

How does header-based routing enable targeted traffic management?

Header-based routing matches specific HTTP headers (such as X-User-Id, X-Region, or custom feature-flag headers) against exact values, prefixes, or regexes. Matching requests are forwarded to a designated destination subset regardless of path. This pattern supports internal employee testing (route requests with X-Internal: true to a staging cluster), regional affinity, and per-tenant routing without changing client-side URLs.

What are fault injection primitives in traffic management and why are they useful?

Fault injection allows operators to deliberately introduce latency (delay faults) or error responses (abort faults) on a configured percentage of requests. Delay faults test timeout handling and cascade resilience; abort faults validate circuit-breaker and retry logic. Because injection is controlled by the data plane rather than application code, it can target specific routes or downstream services in production-like conditions without modifying service binaries.
