Low Level Design: Shadow Mode Testing

Shadow mode testing (also called dark launch or traffic mirroring) runs a new code path in parallel with the production path and compares their outputs without affecting users. It validates a new implementation against real production traffic before cutover. The technique is especially useful when replacing critical components (search ranking models, recommendation algorithms, database migrations) where synthetic tests cannot fully represent production behavior.

How Shadow Mode Works

The production system receives a request and routes it to two paths simultaneously: the primary path (existing code, result returned to user) and the shadow path (new code, result discarded). Both paths receive identical inputs. The shadow path results are logged and compared against the primary path results. Differences are recorded for analysis. Users experience no change — they see only the primary path result. The shadow path can fail without impacting users.

Traffic Mirroring at the Proxy Layer

Traffic mirroring can be implemented at the proxy layer (Nginx, Envoy, Istio) without application code changes. With Envoy's request mirroring, the proxy sends a copy of each incoming request to the shadow upstream (a different service version or endpoint). The mirrored request is fire-and-forget: its response is ignored and does not affect the latency of the primary request. This makes it the cleanest implementation. Limit mirroring to a percentage of traffic (e.g., 10%) to reduce load on the shadow service.
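Proxies make the percentage decision internally, so the following is only an illustration of the idea, not how any particular proxy implements it. It is a minimal Python sketch that assumes each request carries a string ID; the function name should_mirror and the hashing scheme are hypothetical. Hashing the ID (rather than drawing a random number) keeps the decision stable for a given request, which can help when correlating logs.

```python
import hashlib


def should_mirror(request_id: str, percent: float) -> bool:
    """Return True for roughly `percent` percent of request IDs.

    The decision is deterministic: the same request ID always gets
    the same answer. (Hypothetical helper, for illustration only.)
    """
    digest = hashlib.sha256(request_id.encode("utf-8")).digest()
    # Map the first 8 bytes of the hash onto buckets 0..9999,
    # giving a resolution of 0.01 percent.
    bucket = int.from_bytes(digest[:8], "big") % 10_000
    return bucket < percent * 100


if __name__ == "__main__":
    mirrored = sum(should_mirror(f"req-{i}", 10.0) for i in range(100_000))
    print(f"mirrored {mirrored} of 100000 requests")
```

With percent set to 10.0, about one request in ten is selected for mirroring; 0 disables mirroring and 100 mirrors everything.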

Application-Level Shadow Mode

For finer control, implement shadow mode in application code: call the new implementation asynchronously in a background goroutine/thread, log the result, and return the primary result without waiting. This allows shadow testing of internal functions (not just HTTP endpoints), applying shadow mode selectively (only for specific request types), and implementing sophisticated comparison logic. Ensure the shadow execution cannot cause side effects (do not write to production databases, send emails, or charge payments).
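One way to structure this in Python is sketched below, assuming the primary and shadow implementations are plain callables that take the same payload. The names run_with_shadow and record, and the pool size of 4, are hypothetical choices for illustration. The shadow call runs on a background thread pool, receives its own copy of the input, and any exception it raises is logged rather than propagated.

```python
import copy
import logging
from concurrent.futures import ThreadPoolExecutor
from typing import Any, Callable, Optional

logger = logging.getLogger("shadow")

# Small dedicated pool so shadow work cannot exhaust request-serving threads.
_shadow_pool = ThreadPoolExecutor(max_workers=4, thread_name_prefix="shadow")


def run_with_shadow(
    request_id: str,
    payload: Any,
    primary: Callable[[Any], Any],
    shadow: Callable[[Any], Any],
    record: Optional[Callable[[str, Any, Any], None]] = None,
) -> Any:
    """Return the primary result; run the shadow path in the background."""
    # Copy the input first so neither path can see the other's mutations.
    shadow_payload = copy.deepcopy(payload)
    primary_result = primary(payload)

    def _shadow_task() -> None:
        try:
            shadow_result = shadow(shadow_payload)
        except Exception:
            # A shadow failure is logged and never reaches the user.
            logger.exception("shadow path failed for request %s", request_id)
            return
        logger.info(
            "request %s shadow_match=%s",
            request_id,
            shadow_result == primary_result,
        )
        if record is not None:
            record(request_id, primary_result, shadow_result)

    # Fire and forget: the caller does not wait on the returned future.
    _shadow_pool.submit(_shadow_task)
    return primary_result
```

In this sketch the shadow task is submitted after the primary path finishes so that the comparison has the primary result available; the user-facing call still returns without waiting for the shadow path.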

Result Comparison

Log both primary and shadow results with a shared request ID for correlation. Compare: response structure (same fields, same types), response values (numeric differences within tolerance), and response codes. For recommendation engines, compare the ranked item lists — measuring overlap (Jaccard similarity) and rank correlation (Kendall’s tau). For search ranking, compare NDCG of shadow vs primary results against clicked documents. Store comparison results in a data warehouse for analysis and visualization.
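The list-comparison metrics mentioned above can be sketched in plain Python as follows. This is a minimal version under stated assumptions: items are hashable, neither list contains duplicates, and Kendall's tau is computed in its simplest form (tau-a) over the items present in both lists. The function names and default tolerances are illustrative choices, not part of any library.

```python
import math
from itertools import combinations
from typing import Hashable, Optional, Sequence


def jaccard(a: Sequence[Hashable], b: Sequence[Hashable]) -> float:
    """Overlap of two item lists, ignoring order: |A & B| / |A | B|."""
    set_a, set_b = set(a), set(b)
    if not set_a and not set_b:
        return 1.0  # two empty results are treated as identical
    return len(set_a & set_b) / len(set_a | set_b)


def kendall_tau(a: Sequence[Hashable], b: Sequence[Hashable]) -> Optional[float]:
    """Kendall's tau-a over items that appear in both ranked lists.

    Returns a value in [-1, 1], or None when fewer than two items are shared.
    """
    pos_b = {item: i for i, item in enumerate(b)}
    common = [item for item in a if item in pos_b]  # keeps a's ordering
    n = len(common)
    if n < 2:
        return None
    concordant = discordant = 0
    for x, y in combinations(common, 2):
        # x is ranked above y in list a; check whether b agrees.
        if pos_b[x] < pos_b[y]:
            concordant += 1
        else:
            discordant += 1
    return (concordant - discordant) / (n * (n - 1) / 2)


def within_tolerance(primary: float, shadow: float,
                     rel_tol: float = 1e-6, abs_tol: float = 1e-9) -> bool:
    """Numeric comparison that tolerates small floating-point differences."""
    return math.isclose(primary, shadow, rel_tol=rel_tol, abs_tol=abs_tol)
```

Jaccard similarity answers "did both paths return the same items", while Kendall's tau answers "did they order the shared items the same way", so the two are usually logged together.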

Side Effect Prevention

Shadow code must not produce side effects visible to users or external systems. Intercept and nullify: database writes (use a shadow database or NOP the write), external API calls (mock or shadow endpoint), event publishing (discard or publish to a shadow Kafka topic), emails/SMS (suppress or send to a test mailbox). Use dependency injection to swap production dependencies for shadow-safe stubs when running in shadow mode. Flag the execution context so all downstream code knows it is in shadow mode.
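A minimal Python sketch of the context flag and dependency swap described above, using email sending as the example side effect. All class and function names here are hypothetical, and SmtpEmailSender is a stand-in that only records calls rather than talking to a real mail server. The flag is stored in a ContextVar so that it is scoped to the current execution context instead of being a process-wide global.

```python
from contextlib import contextmanager
from contextvars import ContextVar
from typing import Iterator, List, Protocol, Tuple

_in_shadow: ContextVar[bool] = ContextVar("in_shadow", default=False)


@contextmanager
def shadow_context() -> Iterator[None]:
    """Mark the enclosed code as running in shadow mode."""
    token = _in_shadow.set(True)
    try:
        yield
    finally:
        _in_shadow.reset(token)


def in_shadow_mode() -> bool:
    return _in_shadow.get()


class EmailSender(Protocol):
    def send(self, to: str, body: str) -> None: ...


class SmtpEmailSender:
    """Stand-in for the production sender (records instead of sending)."""
    def __init__(self) -> None:
        self.sent: List[Tuple[str, str]] = []

    def send(self, to: str, body: str) -> None:
        self.sent.append((to, body))


class SuppressedEmailSender:
    """Shadow-safe stub: keeps the message for inspection, sends nothing."""
    def __init__(self) -> None:
        self.suppressed: List[Tuple[str, str]] = []

    def send(self, to: str, body: str) -> None:
        self.suppressed.append((to, body))


class GuardedEmailSender:
    """Injected everywhere; picks the real sender or the stub per call."""
    def __init__(self, real: EmailSender, stub: EmailSender) -> None:
        self._real = real
        self._stub = stub

    def send(self, to: str, body: str) -> None:
        target = self._stub if in_shadow_mode() else self._real
        target.send(to, body)
```

If the shadow path runs on a background thread, it is safest to enter shadow_context() inside the shadow task itself rather than relying on the flag being inherited from the thread that submitted the work.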

Gradual Cutover

After shadow testing validates the new implementation, gradually shift traffic: 0% → 1% → 5% → 20% → 50% → 100%. During each stage, monitor production metrics (error rate, latency, business metrics) for regressions. The shadow period builds confidence; the gradual cutover detects issues at small blast radius. Maintain the ability to roll back quickly by keeping the old implementation ready. Shadow testing does not replace gradual rollout — it provides pre-cutover validation.
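The staged rollout with rollback can be sketched as a small control loop. This is an illustration under assumptions: set_traffic_percent and read_metrics are hypothetical hooks into whatever traffic-splitting and monitoring systems are in use, the health thresholds are arbitrary example values, and the soak time that would normally pass between changing traffic and reading metrics is omitted.

```python
from dataclasses import dataclass
from typing import Callable, Sequence

# Traffic percentages from the rollout plan.
STAGES = (0, 1, 5, 20, 50, 100)


@dataclass(frozen=True)
class StageMetrics:
    error_rate: float       # fraction of failed requests, e.g. 0.01
    p99_latency_ms: float


def healthy(current: StageMetrics, baseline: StageMetrics,
            max_error_increase: float = 0.001,
            max_latency_ratio: float = 1.10) -> bool:
    """Example regression check against pre-cutover baseline metrics."""
    return (
        current.error_rate <= baseline.error_rate + max_error_increase
        and current.p99_latency_ms <= baseline.p99_latency_ms * max_latency_ratio
    )


def run_cutover(
    set_traffic_percent: Callable[[int], None],
    read_metrics: Callable[[], StageMetrics],
    baseline: StageMetrics,
    stages: Sequence[int] = STAGES,
) -> int:
    """Advance through the stages; roll back to 0% on the first regression.

    Returns the traffic percentage in effect when the function exits.
    """
    current = 0
    for percent in stages:
        set_traffic_percent(percent)
        current = percent
        # A real rollout would wait here for metrics to accumulate.
        if not healthy(read_metrics(), baseline):
            set_traffic_percent(0)  # old implementation takes all traffic again
            return 0
    return current
```

Rolling back to 0% only works if the old implementation is kept deployed and ready throughout the rollout, which is why the loop never removes it.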

Use Cases

Shadow mode is most valuable for: ML model upgrades (compare recommendation quality and latency), database engine migrations (run the same queries against the old and new databases and compare results and performance), microservice extractions (shadow the new service against the monolith it replaces), algorithm replacements (new sorting or ranking logic), and compliance validation (verify new audit logging or data masking before cutover). Whenever correctness matters more than speed of delivery, shadow mode de-risks the change.
