Load balancers distribute incoming traffic across multiple servers to ensure no single server becomes a bottleneck. They are a fundamental component of every scalable web architecture. This guide covers load balancer internals — Layer 4 vs Layer 7, routing algorithms, health checking, SSL termination, and production deployment patterns — essential knowledge for system design interviews and infrastructure engineering.
Layer 4 vs Layer 7 Load Balancing
Layer 4 (transport layer) load balancers route traffic based on IP address and TCP/UDP port without inspecting the application payload. They forward raw TCP connections to backend servers, seeing only the source IP, destination IP, source port, and destination port — nothing about HTTP headers, URLs, or cookies. Advantages: extremely fast (no payload parsing) and protocol-agnostic (works with any TCP/UDP application). Disadvantages: cannot make routing decisions based on content (no URL-based routing, no cookie-based session affinity). Examples: AWS NLB, Linux IPVS, F5 BIG-IP in L4 mode.

Layer 7 (application layer) load balancers inspect the HTTP request and make routing decisions based on URL path, Host header, HTTP method, headers, cookies, and request body. Advantages: content-based routing (/api -> api-servers, /static -> CDN), cookie-based session affinity, HTTP/2 multiplexing, request/response modification, SSL termination, and Web Application Firewall (WAF) integration. Disadvantages: higher latency (must parse HTTP) and higher resource usage (TLS termination is CPU-intensive). Examples: AWS ALB, Nginx, HAProxy, Envoy.
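As a concrete illustration of L7 content-based routing, here is a minimal Nginx sketch (the upstream name, addresses, and paths are hypothetical) that sends /api traffic to an application pool and serves /static locally:

```nginx
# L7 routing sketch: upstream name and addresses are illustrative.
upstream api_servers {
    server 10.0.1.10:8080;
    server 10.0.1.11:8080;
}

server {
    listen 80;

    location /api/ {
        proxy_pass http://api_servers;   # routing decision based on URL path
    }

    location /static/ {
        root /var/www;                   # serve directly, or proxy to a CDN origin
    }
}
```

A pure L4 balancer could not express these location rules at all: it sees only the TCP 4-tuple, so every connection to port 80 would go to the same pool.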
Load Balancing Algorithms
Common algorithms:

(1) Round Robin — distribute requests sequentially: server 1, server 2, server 3, server 1, … Simple and effective when all servers have equal capacity and all requests have similar cost.
(2) Weighted Round Robin — assign weights proportional to server capacity. A server with weight 3 receives 3x the requests of a server with weight 1. Use when servers have different hardware specs.
(3) Least Connections — route to the server with the fewest active connections. Better than round robin when request durations vary widely (a slow request keeps a connection open longer).
(4) Least Response Time — route to the server with the lowest average response time and fewest active connections. Adapts to server performance in real time.
(5) IP Hash — hash the client IP to determine the server. The same client always reaches the same server (sticky sessions without cookies). Downside: uneven distribution if client IPs are not uniformly distributed (NAT, corporate proxies).
(6) Consistent Hashing — similar to IP Hash but minimizes redistribution when servers are added or removed. Used by CDNs and distributed caches.

For most web applications, Least Connections is the best default — it naturally adapts to varying request costs and server speeds.
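Several of these selection strategies can be sketched in a few lines of Python (server names, connection counts, and the virtual-node count are illustrative):

```python
import bisect
import hashlib
from itertools import cycle

servers = ["s1", "s2", "s3"]

# (1) Round robin: cycle through servers in order.
rr = cycle(servers)
order = [next(rr) for _ in range(4)]  # s1, s2, s3, s1

# (2) Weighted round robin: a weight-3 server appears 3x in the rotation.
weighted = [s for s, w in [("big", 3), ("small", 1)] for _ in range(w)]
wrr = cycle(weighted)  # big, big, big, small, big, ...

# (3) Least connections: pick the server with the fewest active connections.
active = {"s1": 12, "s2": 4, "s3": 9}
least_conn = min(active, key=active.get)  # "s2"

# (5) IP hash: the same client IP always maps to the same server.
def ip_hash(client_ip: str, pool: list) -> str:
    digest = hashlib.md5(client_ip.encode()).hexdigest()
    return pool[int(digest, 16) % len(pool)]

# (6) Consistent hashing: place servers (with virtual nodes) on a ring;
# a key routes to the first server clockwise from its hash, so removing
# one server only remaps the keys that pointed at it.
class HashRing:
    def __init__(self, nodes, vnodes=100):
        self.ring = sorted(
            (self._hash(f"{n}#{i}"), n) for n in nodes for i in range(vnodes)
        )

    @staticmethod
    def _hash(key: str) -> int:
        return int(hashlib.md5(key.encode()).hexdigest(), 16)

    def get(self, key: str) -> str:
        h = self._hash(key)
        idx = bisect.bisect(self.ring, (h, "")) % len(self.ring)
        return self.ring[idx][1]
```

Note the simple modulo in ip_hash: if the pool size changes, almost every client remaps to a different server, which is exactly the problem the hash ring avoids.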
Health Checks and Failover
Health checks determine whether a backend server is capable of receiving traffic. Without health checks, the load balancer sends requests to dead servers, causing errors. Types:

(1) TCP health check — attempt a TCP connection to the server port. If the connection succeeds, the server is healthy. Fast, but only verifies the port is open, not that the application is functioning.
(2) HTTP health check — send an HTTP GET request to a health endpoint (e.g., /health). The server returns 200 if healthy. The health endpoint should verify dependencies: database connectivity, cache connectivity, sufficient disk space. If any dependency is unhealthy, return 503.
(3) gRPC health check — use the standard gRPC health checking protocol (grpc.health.v1.Health).

Health check parameters: interval (how often to check — typically every 5-10 seconds), timeout (how long to wait for a response — typically 2-3 seconds), unhealthy threshold (consecutive failures before marking a server unhealthy — typically 3), and healthy threshold (consecutive successes before marking it healthy again — typically 2). The healthy threshold prevents flapping: a server that passes one check after failing should not immediately receive full traffic.
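The threshold logic can be sketched as a small state machine (the class name is hypothetical; the 3-failure/2-success defaults mirror the typical values above):

```python
class HealthTracker:
    """Tracks consecutive check results and flips state only at thresholds,
    so one flaky check in either direction does not change routing."""

    def __init__(self, unhealthy_threshold: int = 3, healthy_threshold: int = 2):
        self.unhealthy_threshold = unhealthy_threshold
        self.healthy_threshold = healthy_threshold
        self.healthy = True   # servers start in the healthy pool
        self._fails = 0       # consecutive failed checks
        self._successes = 0   # consecutive passed checks

    def record(self, check_passed: bool) -> bool:
        """Record one health check result; return current health state."""
        if check_passed:
            self._fails = 0
            self._successes += 1
            if not self.healthy and self._successes >= self.healthy_threshold:
                self.healthy = True
        else:
            self._successes = 0
            self._fails += 1
            if self.healthy and self._fails >= self.unhealthy_threshold:
                self.healthy = False
        return self.healthy
```

Resetting the opposite counter on every result is what makes the thresholds mean *consecutive* failures or successes, not cumulative ones.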
SSL/TLS Termination
SSL termination decrypts HTTPS traffic at the load balancer and forwards plain HTTP to backend servers. Benefits:

(1) Certificate management — manage TLS certificates in one place (the load balancer) instead of on every backend server.
(2) CPU offload — TLS handshakes and encryption/decryption are CPU-intensive. The load balancer handles this, freeing backend servers to process application logic. Modern load balancers use hardware TLS acceleration.
(3) HTTP inspection — after decryption, the load balancer can inspect HTTP content for routing decisions.
(4) HTTP/2 to HTTP/1.1 translation — terminate HTTP/2 at the load balancer and proxy HTTP/1.1 to backends that do not support HTTP/2.

Security consideration: traffic between the load balancer and backend servers is unencrypted (plain HTTP). In a trusted network (same VPC, same datacenter), this is often acceptable. For zero-trust environments, use SSL re-encryption: the load balancer decrypts, inspects, and re-encrypts before forwarding. Alternatively, use SSL passthrough: the load balancer forwards the encrypted traffic without decryption, but loses the ability to inspect HTTP content. AWS ALB supports SSL termination by default, with certificates managed through AWS Certificate Manager (ACM) at no additional cost.
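A minimal Nginx termination sketch, assuming hypothetical certificate paths and backend addresses (the commented line shows the re-encryption variant):

```nginx
# TLS termination sketch: cert paths and backend addresses are illustrative.
server {
    listen 443 ssl;
    server_name example.com;

    ssl_certificate     /etc/nginx/tls/example.com.crt;
    ssl_certificate_key /etc/nginx/tls/example.com.key;

    location / {
        # Decrypted here; forwarded as plain HTTP inside the trusted network.
        proxy_pass http://10.0.2.10:8080;
        # Re-encryption variant for zero-trust networks:
        # proxy_pass https://10.0.2.10:8443;
    }
}
```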
Nginx and HAProxy in Production
Nginx: originally a web server, now one of the most widely deployed reverse proxies and load balancers. Its event-driven architecture handles thousands of concurrent connections with minimal memory. Configuration: define upstream blocks with backend servers and proxy_pass in location blocks. Supports round robin, least connections (least_conn), IP hash (ip_hash), and consistent hashing. Dynamic upstream management requires the commercial Nginx Plus (which adds active health checks, monitoring, and API-driven configuration); open-source Nginx relies on passive health checks (mark a server as failed after N errors).

HAProxy: purpose-built for high-performance load balancing. Supports both L4 and L7 modes. Advanced health checking (HTTP, TCP, custom scripts), connection queuing, rate limiting, and request buffering. The HAProxy stats page provides real-time metrics (requests per second, error rates, connection counts per backend). Configuration is declarative: define frontends (incoming traffic), backends (server pools), and ACLs (routing rules). HAProxy is a common default choice for high-throughput, low-latency load balancing.

In Kubernetes environments, Ingress controllers (nginx-ingress, HAProxy Ingress) provide the same load balancing functionality, configured via Kubernetes Ingress or Gateway API resources.
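A minimal HAProxy sketch of the frontend/backend/ACL structure described above (names, addresses, and the /api ACL are illustrative; the health check parameters mirror the typical values from the health check section):

```haproxy
# HAProxy sketch: names, addresses, and the ACL are illustrative.
frontend web_in
    bind *:80
    acl is_api path_beg /api        # L7 routing rule
    use_backend api_pool if is_api
    default_backend web_pool

backend api_pool
    balance leastconn               # adapts to varying request durations
    option httpchk GET /health
    server api1 10.0.1.10:8080 check inter 5s fall 3 rise 2
    server api2 10.0.1.11:8080 check inter 5s fall 3 rise 2

backend web_pool
    balance roundrobin
    server web1 10.0.1.20:8080 check
```

Here `inter 5s fall 3 rise 2` encodes the health check interval, unhealthy threshold, and healthy threshold directly in the server line.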
Global Server Load Balancing (GSLB)
GSLB distributes traffic across multiple geographic regions, typically via DNS-based routing: when a user queries your domain, the DNS server returns the IP address of the nearest (or healthiest) datacenter. Methods:

(1) GeoDNS — return different IP addresses based on the client's geographic location (derived from the DNS resolver IP). A user in Europe resolves to the EU datacenter; a user in Asia resolves to the APAC datacenter. Examples: AWS Route 53 geolocation routing, Cloudflare load balancing.
(2) Latency-based routing — return the IP of the datacenter with the lowest latency to the client. AWS Route 53 measures latency from each region and routes accordingly.
(3) Weighted routing — distribute a percentage of traffic to each datacenter. Use for gradual migration or A/B testing across regions.
(4) Failover routing — return the primary datacenter IP; if the primary's health check fails, return the secondary datacenter IP. When the primary recovers, traffic gradually shifts back (subject to DNS TTL propagation).

DNS TTL tradeoff: a short TTL (30 seconds) enables fast failover but increases DNS query volume; a long TTL (5 minutes) reduces DNS load but delays failover. A 60-second TTL is a common compromise for production systems.
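The failover and weighted policies reduce to simple selection logic over DNS record sets. A Python sketch (region names and IPs are hypothetical, drawn from documentation-reserved address ranges):

```python
import random

# Hypothetical record sets for two regions; IPs are from reserved
# documentation ranges (RFC 5737), not real endpoints.
REGIONS = {
    "primary":   {"ip": "198.51.100.10", "healthy": True},
    "secondary": {"ip": "203.0.113.20",  "healthy": True},
}

def failover_answer(regions: dict) -> str:
    """Failover routing: answer with the primary IP while its health
    check passes, otherwise fall back to the secondary IP."""
    if regions["primary"]["healthy"]:
        return regions["primary"]["ip"]
    return regions["secondary"]["ip"]

def weighted_answer(weights: dict, rng=random) -> str:
    """Weighted routing: pick a region in proportion to its weight,
    e.g. {"eu": 90, "us": 10} sends ~10% of resolutions to "us"."""
    names = list(weights)
    return rng.choices(names, weights=[weights[n] for n in names])[0]
```

In a real deployment this decision runs inside the authoritative DNS service (e.g., Route 53), and each answer is cached by resolvers for the record's TTL, which is why failover speed is bounded by the TTL.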