System Design Interview: API Gateway and Service Mesh

API gateways and service meshes are the networking backbone of modern microservices architectures. They handle cross-cutting concerns — authentication, rate limiting, routing, observability — so individual services don’t have to. Understanding when to use each is a common senior engineering interview topic.

API Gateway: North-South Traffic

An API gateway handles traffic entering the system from external clients (internet → services). It is the single entry point for all external requests.

External Client
    │
    ▼
API Gateway (Kong, AWS API Gateway, nginx, Envoy)
    │  Responsibilities:
    │  ├── TLS termination
    │  ├── Authentication (JWT validation, API keys, OAuth)
    │  ├── Rate limiting (per-client, per-endpoint)
    │  ├── Request routing (path → service)
    │  ├── Request/response transformation
    │  ├── Load balancing (to service instances)
    │  └── Observability (access logs, metrics, tracing)
    │
    ├── /api/v1/users    → User Service
    ├── /api/v1/orders   → Order Service
    └── /api/v1/products → Product Service
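The path-to-service mapping above can be sketched as a longest-prefix route table. The service names come from the diagram; the lookup logic itself is an illustrative sketch, not any particular gateway's implementation:

```python
# Route table from the diagram above (service names as shown).
ROUTES = {
    "/api/v1/users": "user-service",
    "/api/v1/orders": "order-service",
    "/api/v1/products": "product-service",
}

def route(path):
    """Longest-prefix match: '/api/v1/users/42' routes to user-service."""
    matches = [p for p in ROUTES if path == p or path.startswith(p + "/")]
    return ROUTES[max(matches, key=len)] if matches else None
```

Longest-prefix matching lets a more specific route (e.g. a dedicated `/api/v1/users/search` entry) shadow a general one without reordering the table.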

Authentication at the Gateway

JWT validation flow:
  Client → "Authorization: Bearer eyJhbGc..."
  Gateway:
    1. Parse JWT header → algorithm (RS256/ES256)
    2. Fetch public key from JWKS endpoint (cached)
    3. Verify signature
    4. Check exp, iss, aud claims
    5. Extract user_id, tenant_id, scopes
    6. Forward to service as X-User-ID, X-Tenant-ID headers
       (services trust these headers without re-validation, so the
        gateway must strip them from incoming external requests to
        prevent header spoofing)
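Steps 4 and 5 above can be sketched in isolation. The function below assumes the signature has already been verified by a JWT library (steps 1-3) and checks the exp, iss, and aud claims before extracting the identity fields; claim names follow the example above:

```python
import base64
import json
import time

def check_claims(token, issuer, audience, now=None):
    """Decode the (already signature-verified) JWT payload and enforce
    the exp/iss/aud checks from steps 4-5 of the gateway flow."""
    payload_b64 = token.split(".")[1]
    payload_b64 += "=" * (-len(payload_b64) % 4)   # restore base64 padding
    claims = json.loads(base64.urlsafe_b64decode(payload_b64))
    now = time.time() if now is None else now
    if claims["exp"] < now:
        raise ValueError("token expired")
    if claims["iss"] != issuer:
        raise ValueError("unexpected issuer")
    aud = claims["aud"]
    if audience not in ([aud] if isinstance(aud, str) else aud):
        raise ValueError("wrong audience")
    return claims   # user_id / tenant_id forwarded as X-User-ID, X-Tenant-ID
```

In production the decode and signature check would come from a JWT library against the cached JWKS key; this sketch only illustrates the claim validation the gateway performs afterwards.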

API Key auth:
  Client → "X-API-Key: sk_live_xxxxx"
  Gateway → hash(key) → lookup in API key store (Redis)
           → return associated tenant_id + rate limits
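A minimal in-process sketch of this lookup, with a plain dict standing in for the Redis key store (the function names and sample key are hypothetical):

```python
import hashlib

# In-memory stand-in for the Redis API-key store.
API_KEY_STORE = {}

def register_key(raw_key, tenant_id, rate_limit):
    """Store only a hash, so a leaked store does not leak usable keys."""
    digest = hashlib.sha256(raw_key.encode()).hexdigest()
    API_KEY_STORE[digest] = {"tenant_id": tenant_id, "rate_limit": rate_limit}

def authenticate(raw_key):
    """Hash the presented key and look it up; None means 401."""
    digest = hashlib.sha256(raw_key.encode()).hexdigest()
    return API_KEY_STORE.get(digest)
```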

Rate Limiting at the Gateway

Fixed-window counter (Redis + Lua; often loosely called a token bucket,
though a true token bucket refills continuously rather than resetting
per window):
  Key: rate_limit:{client_id}:{endpoint}
  Per-request Lua script (atomic):
    current = INCR key
    if current == 1:
        EXPIRE key window_seconds   (start the window on the first hit;
                                     re-setting EXPIRE on every request
                                     would keep extending the window)
    if current <= limit:
        allow request
    else:
        return 429 Too Many Requests
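For a single gateway instance, the same windowed counter can be kept in process memory instead of Redis. A minimal sketch (class and parameter names are illustrative):

```python
import time

class FixedWindowLimiter:
    """In-process version of the Redis counter above (single gateway)."""

    def __init__(self, limit, window_seconds):
        self.limit = limit
        self.window = window_seconds
        self.counters = {}   # key -> (window_start, count)

    def allow(self, key, now=None):
        now = time.time() if now is None else now
        start, count = self.counters.get(key, (now, 0))
        if now - start >= self.window:       # window expired: reset
            start, count = now, 0
        if count < self.limit:
            self.counters[key] = (start, count + 1)
            return True
        return False                         # caller returns 429
```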

Response headers:
  X-RateLimit-Limit: 1000
  X-RateLimit-Remaining: 847
  X-RateLimit-Reset: 1713272400  (Unix epoch when window resets)
  Retry-After: 15               (seconds until retry allowed)

Distributed rate limiting (multiple gateway instances):
  All gateways share Redis cluster for consistent counting
  Trade-off: Redis round-trip adds ~1ms per request
  Alternative: approximate counting with local buckets + periodic sync
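The local-bucket alternative can be modeled as each gateway claiming batches of tokens from the shared counter, so only one shared round-trip is paid per batch. This is a simplified sketch of the idea, not any specific product's algorithm:

```python
import threading

class SharedCounter:
    """Stand-in for the shared Redis counter."""
    def __init__(self):
        self.used = 0

    def claim(self, n, limit):
        """Atomically grant up to n tokens without exceeding limit."""
        grant = min(n, max(0, limit - self.used))
        self.used += grant
        return grant

class LocalBucketLimiter:
    """Each gateway spends from a local allowance and refills in batches,
    trading exactness for fewer shared round-trips."""
    def __init__(self, shared_counter, limit, batch=10):
        self.shared = shared_counter
        self.limit = limit
        self.batch = batch
        self.local_allowance = 0
        self.lock = threading.Lock()

    def allow(self):
        with self.lock:
            if self.local_allowance == 0:
                # One shared round-trip claims a batch of tokens.
                self.local_allowance = self.shared.claim(self.batch, self.limit)
            if self.local_allowance > 0:
                self.local_allowance -= 1
                return True
            return False
```

The approximation: a gateway that claimed a batch but crashed "wastes" those tokens, and bursts can briefly exceed the limit across gateways, which is usually acceptable for rate limiting.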

Service Mesh: East-West Traffic

A service mesh handles traffic between services within the cluster. It operates at the infrastructure layer without requiring application code changes.

Service A ─── Envoy sidecar ──► Envoy sidecar ─── Service B
                   │                    │
                   └────────────────────┘
                         Control plane
                    (Istio / Linkerd / Consul)

Sidecar proxy handles:
  ├── mTLS: automatic certificate rotation, mutual auth
  ├── Load balancing: round-robin, least-request, zone-aware
  ├── Circuit breaking: open circuit after N failures
  ├── Retries: automatic retry with exponential backoff
  ├── Timeouts: per-route timeout enforcement
  ├── Observability: metrics, distributed traces (no app changes)
  └── Traffic shaping: canary deployment, A/B testing
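The retry behavior in the list above can be sketched as a helper with exponential backoff and full jitter (attempt counts and delays are illustrative defaults, not Envoy's actual configuration):

```python
import random
import time

def call_with_retries(call, max_attempts=3, base_delay=0.1, sleep=time.sleep):
    """Retry a failing call with exponential backoff and full jitter."""
    for attempt in range(max_attempts):
        try:
            return call()
        except Exception:
            if attempt == max_attempts - 1:
                raise                       # out of attempts: propagate
            # Full jitter: sleep uniformly in [0, base * 2^attempt].
            sleep(random.uniform(0, base_delay * 2 ** attempt))
```

Jitter matters in a mesh: without it, many sidecars retrying a recovering service at the same intervals produce synchronized retry storms.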

mTLS: Zero-Trust Service Identity

Without service mesh:
  Service A → Service B (no authentication — any pod can call any service)

With mTLS (mutual TLS):
  Istio CA issues X.509 certificate to each service (SPIFFE format)
  Certificate: "spiffe://cluster.local/ns/default/sa/order-service"

  Service A presents cert → Service B verifies A's identity
  Service B presents cert → Service A verifies B's identity
  Channel is encrypted end-to-end

  AuthorizationPolicy (Istio):
    apiVersion: security.istio.io/v1beta1
    kind: AuthorizationPolicy
    metadata:
      name: order-service-policy
    spec:
      selector:
        matchLabels:
          app: order-service
      rules:
      - from:
        - source:
            principals: ["cluster.local/ns/default/sa/checkout-service"]
        to:
        - operation:
            methods: ["POST"]
            paths: ["/api/orders"]
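The effect of this policy can be modeled as a small allow-list check: a request is permitted only if some rule matches both the caller's mTLS principal and the operation. This is a simplified sketch of the sidecar's evaluation, not Istio's actual matcher:

```python
def is_allowed(policy_rules, principal, method, path):
    """Allow if any rule matches both the caller and the operation;
    deny by default when no rule matches."""
    for rule in policy_rules:
        if (principal in rule["principals"]
                and method in rule["methods"]
                and path in rule["paths"]):
            return True
    return False

# Rules mirroring the AuthorizationPolicy above.
RULES = [{
    "principals": ["cluster.local/ns/default/sa/checkout-service"],
    "methods": ["POST"],
    "paths": ["/api/orders"],
}]
```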

Circuit Breaker Pattern

States: CLOSED → OPEN → HALF-OPEN

CLOSED (normal): requests pass through; failure rate tracked
  If failure rate > threshold (e.g., 50% in 10s):
    → OPEN: fail fast, return error immediately (no actual call)

OPEN: wait for recovery period (e.g., 30s)
  → HALF-OPEN: allow small fraction of requests through

HALF-OPEN: test if dependency has recovered
  If requests succeed → CLOSED (resume normal operation)
  If requests fail    → OPEN again (wait longer)
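The state machine above can be sketched directly. Thresholds and the clock are parameters so the transitions are easy to exercise; this is an illustrative model, not Envoy's outlier detection:

```python
import time

class CircuitBreaker:
    """Minimal CLOSED -> OPEN -> HALF-OPEN state machine."""

    def __init__(self, failure_threshold=5, recovery_seconds=30.0,
                 clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.recovery_seconds = recovery_seconds
        self.clock = clock
        self.state = "CLOSED"
        self.failures = 0
        self.opened_at = 0.0

    def call(self, fn):
        if self.state == "OPEN":
            if self.clock() - self.opened_at >= self.recovery_seconds:
                self.state = "HALF-OPEN"        # let a probe through
            else:
                raise RuntimeError("circuit open: failing fast")
        try:
            result = fn()
        except Exception:
            self.failures += 1
            if self.state == "HALF-OPEN" or self.failures >= self.failure_threshold:
                self.state = "OPEN"             # probe failed or threshold hit
                self.opened_at = self.clock()
            raise
        self.failures = 0
        if self.state == "HALF-OPEN":
            self.state = "CLOSED"               # probe succeeded: resume
        return result
```

This sketch counts consecutive failures; production implementations typically track a failure *rate* over a sliding window, as in the Istio config below.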

Service mesh implementation (Istio DestinationRule):
  trafficPolicy:
    outlierDetection:
      consecutive5xxErrors: 5      # open after 5 consecutive errors
      interval: 10s                # evaluation window
      baseEjectionTime: 30s        # how long to eject unhealthy host
      maxEjectionPercent: 50       # eject at most 50% of hosts

API Gateway vs Service Mesh Comparison

Concern            API Gateway                          Service Mesh
Traffic direction  North-south (external → internal)    East-west (service → service)
Authentication     External JWT/API key validation      Internal mTLS identity
Rate limiting      Per-client/IP global limits          Service-to-service limits
Observability      Access logs, API analytics           Service dependency map, latency breakdown
Implementation     Application-aware (paths, headers)   Infrastructure layer (transparent)
Typical tools      Kong, AWS API GW, nginx, Traefik     Istio, Linkerd, Consul Connect

BFF Pattern: Backend for Frontend

Problem: one API must serve multiple clients with different needs
  Mobile app: needs lightweight responses, push notifications
  Web app:    can handle richer data, SSE instead of polling
  Partner API: needs different auth, rate limits, data format

Solution: BFF (Backend For Frontend)
  Mobile Gateway  → Mobile-optimized API → Services
  Web Gateway     → Web-optimized API    → Services
  Partner Gateway → Partner API          → Services

Benefits:
  - Each gateway tailored to client needs (field selection, pagination)
  - Independent versioning and deprecation
  - Client-specific auth strategies
  - Different rate limits per client type

Implementation: GraphQL as BFF (Apollo Federation)
  Client sends GraphQL query → BFF resolves to multiple service calls
  → assembles composite response → returns only requested fields
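A BFF endpoint can be sketched as one handler that fans out to downstream services and trims the response to the requested fields. The service stubs and field names below are hypothetical:

```python
def get_user_service(user_id):
    """Stand-in for a call to the User Service."""
    return {"id": user_id, "name": "Ada", "email": "ada@example.com",
            "address": {"city": "London", "zip": "N1"}}

def get_orders_service(user_id):
    """Stand-in for a call to the Order Service."""
    return [{"order_id": 1, "total": 42.0, "items": ["book"]}]

def mobile_profile(user_id, fields=("name",)):
    """Mobile BFF handler: aggregate two downstream calls, then return
    only the fields the mobile client asked for."""
    user = get_user_service(user_id)
    orders = get_orders_service(user_id)
    slim_user = {k: user[k] for k in fields}     # field selection
    return {"user": slim_user, "order_count": len(orders)}
```

A GraphQL BFF generalizes this: the field selection comes from the client's query instead of a server-side tuple.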

Interview Discussion Points

  • When do you need a service mesh? When you have 10+ microservices and need consistent observability, mTLS, and traffic management without modifying every service. For simple architectures (< 5 services), the operational complexity of Istio outweighs the benefits — use a shared middleware library instead.
  • How to handle API versioning? URL path versioning (/v1/, /v2/) is most explicit. Header-based versioning (Accept: application/vnd.api+json;version=2) is RESTful but harder to route. Keep at most 2 major versions in production simultaneously; deprecate old versions with sunset headers and 12-month migration periods.
  • Service mesh overhead: Envoy sidecar adds ~5-10ms per hop and 50-100MB RAM per pod. Justify with the operational savings on observability and security. Ambient mesh (Istio 1.15+) removes per-pod sidecars — uses node-level DaemonSet instead, reducing overhead significantly.

Frequently Asked Questions

Q: What is the difference between an API gateway and a service mesh?
A: An API gateway handles north-south traffic: requests entering the system from external clients. It enforces authentication (JWT validation, API keys), rate limiting per client, request routing, and API versioning. A service mesh handles east-west traffic: communication between microservices within the cluster. It provides mTLS for service-to-service authentication, circuit breaking, automatic retries, distributed tracing, and traffic shaping (canary deployments) without requiring application code changes. Both can coexist: the API gateway is the external entry point, while the service mesh governs internal service communication. They are complementary, not alternatives.

Q: How does mTLS in a service mesh improve security over traditional service-to-service calls?
A: Without mTLS, any pod in the cluster can call any service without authentication, so a compromised pod can impersonate any service. With mTLS, each service has a cryptographic identity (an X.509 certificate issued by the mesh CA, in SPIFFE format). Both sides of a connection present certificates and verify each other's identity before communication begins. This enables AuthorizationPolicies: "only the checkout-service can call POST /orders on the order-service." Certificate rotation is automatic (every 24h by default in Istio), removing the operational burden of manual certificate management. mTLS also encrypts all inter-service traffic, protecting against network-level eavesdropping within the cluster.

Q: What is the Backend for Frontend (BFF) pattern and when is it useful?
A: BFF creates a dedicated API layer for each client type (mobile app, web app, partner API) rather than one general-purpose API. This solves the problem of different clients needing different data shapes, response sizes, auth strategies, and rate limits. A mobile BFF can aggregate multiple service calls into one response optimized for mobile bandwidth constraints. A web BFF can use server-sent events instead of polling. A partner BFF enforces different rate limits and data access controls without affecting internal clients. GraphQL is commonly used as a BFF because it allows clients to request exactly the fields they need, with the BFF resolving those fields from multiple downstream services.
