Low Level Design: Service Mesh and Sidecar Proxy Design

A service mesh is an infrastructure layer that handles service-to-service communication in a microservices architecture, providing observability, traffic management, and security without requiring application code changes. It achieves this by deploying a sidecar proxy (Envoy, Linkerd-proxy) alongside each service instance that intercepts all inbound and outbound traffic. Understanding service meshes is critical for system design interviews at companies running large-scale microservices.

Sidecar Proxy Architecture

In a Kubernetes pod, the sidecar proxy runs as an additional container alongside the application container. Both containers share a network namespace, and iptables rules (installed by an init container) redirect all traffic through the proxy rather than letting it reach the application directly. For inbound traffic (from other services), the proxy receives the connection, applies policies, and passes the request to the application on localhost. For outbound traffic, the application sends to what it thinks is the destination, the proxy intercepts, applies policies, and forwards to the destination service. The application never makes direct network calls to other services; the proxy mediates everything transparently.

# Kubernetes pod with sidecar injection (Istio)
apiVersion: v1
kind: Pod
metadata:
  name: myservice
spec:
  containers:
  - name: app
    image: myservice:v1.2
    # app binds to localhost:8080
  - name: istio-proxy   # auto-injected by admission webhook
    image: istio/proxyv2:1.18
    # intercepts all traffic via the iptables rules below
  initContainers:
  - name: istio-init    # runs first; sets up iptables redirect rules
    image: istio/proxyv2:1.18
    # simplified: the real init container also redirects inbound
    # traffic to the proxy's inbound port 15006
    command: ["/bin/sh", "-c", "iptables -t nat -A OUTPUT -p tcp ! -d 127.0.0.1 -j REDIRECT --to-ports 15001"]

Control Plane and Data Plane

A service mesh has two planes: the data plane (the proxies handling live traffic) and the control plane (which manages proxy configuration). The control plane (Istiod in Istio) watches Kubernetes resources, computes routing rules and policies, and pushes configuration to every proxy over the xDS API, Envoy's gRPC streaming discovery protocol. Proxies apply rules locally without per-request calls to the control plane; configuration is pre-distributed. The control plane is deployed for high availability but sits off the critical path for live traffic: if it becomes unavailable, proxies continue serving with their last-received configuration.
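To make the xDS flow concrete, here is an illustrative sketch of the kind of cluster object the control plane computes and pushes to each proxy (the service name is hypothetical; Istio's `outbound|port||host` naming convention is real, and field names follow Envoy's cluster config, though a real pushed object carries more fields):

```yaml
# Illustrative Envoy cluster, as distributed via CDS/EDS
clusters:
- name: outbound|8080||reviews.default.svc.cluster.local
  type: EDS                # endpoint list resolved dynamically
  eds_cluster_config:
    eds_config:
      ads: {}              # endpoints arrive over the same ADS stream
  connect_timeout: 1s
  lb_policy: ROUND_ROBIN
```

The proxy keeps this object in memory and routes requests against it locally, which is why a control plane outage does not interrupt live traffic.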

Traffic Management Features

The proxy layer enables traffic management without application changes:

  • Load balancing: round-robin, least connections, or consistent hash by header
  • Circuit breaking: trip when the error rate exceeds a threshold, returning fast errors while the service recovers
  • Retries with backoff: automatically retry failed requests with configurable attempts and timeout budgets
  • Traffic splitting: send 10% of traffic to a canary version, 90% to stable
  • Fault injection: inject delays or errors for resilience testing in production without code changes
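Several of these features can be sketched with Istio's real VirtualService and DestinationRule resources; the service name and version labels below are hypothetical:

```yaml
# 90/10 canary split with retries and circuit breaking (sketch)
apiVersion: networking.istio.io/v1beta1
kind: VirtualService
metadata:
  name: reviews
spec:
  hosts:
  - reviews
  http:
  - route:
    - destination: {host: reviews, subset: stable}
      weight: 90
    - destination: {host: reviews, subset: canary}
      weight: 10
    retries:
      attempts: 3
      perTryTimeout: 2s
---
apiVersion: networking.istio.io/v1beta1
kind: DestinationRule
metadata:
  name: reviews
spec:
  host: reviews
  subsets:
  - name: stable
    labels: {version: v1}
  - name: canary
    labels: {version: v2}
  trafficPolicy:
    outlierDetection:            # circuit breaking: eject failing hosts
      consecutive5xxErrors: 5
      interval: 30s
      baseEjectionTime: 30s
```

Shifting the canary weight from 10 to 50 to 100 is a pure configuration change; the application deploys once and never knows the split exists.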

mTLS: Zero-Trust Security

Service meshes implement mutual TLS (mTLS) between all proxies automatically. Each workload receives a short-lived X.509 certificate from the control plane's CA, carrying a SPIFFE-compliant identity. Every service-to-service call is encrypted and authenticated at the proxy level; the application code never handles TLS. Authorization policies declare which identities may talk to which services (in Istio, PeerAuthentication requires mTLS and AuthorizationPolicy restricts callers): for example, service A may call service B on port 8080 using GET only. This implements zero-trust networking: no implicit trust is granted based on network location.
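That example policy can be expressed with Istio's real security resources; the namespace, labels, and service-account names below are hypothetical:

```yaml
# Require mTLS namespace-wide, then allow only service-a's identity
# to issue GETs to service-b on port 8080 (sketch)
apiVersion: security.istio.io/v1beta1
kind: PeerAuthentication
metadata:
  name: default
  namespace: prod
spec:
  mtls:
    mode: STRICT               # reject any plaintext connection
---
apiVersion: security.istio.io/v1beta1
kind: AuthorizationPolicy
metadata:
  name: allow-service-a
  namespace: prod
spec:
  selector:
    matchLabels:
      app: service-b
  action: ALLOW
  rules:
  - from:
    - source:
        principals: ["cluster.local/ns/prod/sa/service-a"]
    to:
    - operation:
        methods: ["GET"]
        ports: ["8080"]
```

The principal string is the SPIFFE identity derived from the caller's certificate, so the rule holds even if an attacker is inside the network.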

Observability: Automatic Telemetry

Proxies emit telemetry automatically for every request: distributed traces (Zipkin B3 or W3C trace context headers, forwarded to Jaeger/Tempo), metrics (request rate, error rate, p50/p95/p99 latency per service pair, exported to Prometheus), and access logs (structured JSON per request). This gives full observability of inter-service traffic without any application instrumentation. Service topology maps (Kiali) are automatically derived from the traffic data the proxies report to the control plane.
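As a sketch of how the proxy metrics get consumed, a Prometheus recording rule can compute per-service error rate from Istio's standard istio_requests_total metric (the rule and group names are hypothetical; the metric and label names follow Istio's documented conventions):

```yaml
# Prometheus recording rule over Istio's standard proxy metrics
groups:
- name: istio-slo
  rules:
  - record: service:error_rate:5m
    expr: |
      sum(rate(istio_requests_total{response_code=~"5.."}[5m])) by (destination_service)
      /
      sum(rate(istio_requests_total[5m])) by (destination_service)
```

No service in the mesh had to emit these numbers itself; the sidecars label every request with source and destination, which is also what topology tools build their graphs from.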

Key Interview Discussion Points

  • Latency overhead: sidecar proxy adds ~1-2ms per hop; acceptable for most microservices but significant for high-frequency internal calls
  • eBPF-based meshes: Cilium uses eBPF kernel programs instead of sidecar proxies, reducing overhead to microseconds
  • Ambient mesh: Istio ambient mode replaces per-pod sidecars with a per-node ztunnel proxy (L4 and mTLS) plus optional waypoint proxies for L7 policy, reducing memory overhead in large clusters
  • East-west vs. north-south traffic: service meshes handle east-west (service-to-service); API gateways handle north-south (external to internal)
  • Service mesh vs. API gateway: gateway is the ingress point for external traffic; mesh handles internal traffic — both use Envoy as proxy in many deployments