Introduction
Kubernetes orchestrates containerized workloads across a cluster, managing scheduling, scaling, self-healing, and networking. The control plane manages desired state while worker nodes run the actual workloads.
Control Plane Components
The API Server is the central REST endpoint for all cluster operations. etcd is the distributed key-value store holding all cluster state. The Scheduler assigns pods to nodes based on resource requirements and constraints. The Controller Manager runs controllers: Deployment, ReplicaSet, Node, Service Account, and others. All components except etcd are stateless. The API Server is the only component that reads from and writes to etcd.
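All of these controllers share one pattern: a level-triggered reconciliation loop that compares desired state (the spec stored in etcd) against observed state and acts to converge them. A minimal sketch of that loop, with hypothetical callback names (real controllers use watches against the API server rather than polling):

```python
# Hedged sketch of the control-loop pattern shared by Kubernetes controllers.
# get_desired / get_observed / act are illustrative stand-ins, not real APIs.

def control_loop(get_desired, get_observed, act):
    """One reconciliation pass: diff desired vs observed state, act on the gap."""
    desired = get_desired()    # e.g. spec read from etcd via the API server
    observed = get_observed()  # e.g. status reported by kubelets
    if desired != observed:
        act(desired, observed) # issue API calls to converge observed -> desired
        return "acted"
    return "converged"
```

Because the loop compares state rather than reacting to individual events, a controller that misses an event still converges on its next pass.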
Pod Scheduling
The scheduler selects a node for a pod in two phases. Filtering removes nodes that cannot run the pod: insufficient CPU or memory, untolerated node taints, node affinity mismatches, and pod anti-affinity violations. Scoring then ranks the remaining nodes by criteria such as balanced resource utilization, zone spreading, and topology constraints, and the pod is assigned to the highest-scoring node. The pod spec drives both phases through its resource requests (guaranteed allocation, used for filtering) and limits (maximum allowed usage).
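The two phases can be sketched as a filter followed by an argmax. The node fields and the scoring function below are illustrative assumptions, not the real scheduler plugins:

```python
# Sketch of the two-phase scheduling decision. Node/pod dict fields and the
# "most balanced headroom" score are assumptions for illustration only.

def schedule(pod, nodes):
    """Pick a node name for the pod, or None if no node is feasible."""
    # Phase 1: filtering - drop nodes that cannot run the pod.
    feasible = [
        n for n in nodes
        if n["free_cpu"] >= pod["cpu_request"]
        and n["free_mem"] >= pod["mem_request"]
        and pod.get("tolerations", set()) >= n.get("taints", set())
    ]
    if not feasible:
        return None  # no feasible node: the pod stays Pending

    # Phase 2: scoring - prefer the node with the most remaining headroom.
    def score(n):
        return min(n["free_cpu"] - pod["cpu_request"],
                   n["free_mem"] - pod["mem_request"])

    return max(feasible, key=score)["name"]
```

A pod that survives filtering on no node stays Pending until the cluster changes (a node is added, or another pod frees resources).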
Deployments and ReplicaSets
A Deployment defines the desired state: image, replica count, and update strategy. The Deployment controller creates and manages a ReplicaSet, which ensures N pod replicas are running at all times. During a rolling update a new ReplicaSet is created and pods are gradually shifted from the old ReplicaSet to the new one. Revision history is retained for rollback. A readiness probe gates traffic: a pod receives Service traffic only once the probe passes. A liveness probe detects unhealthy containers; when it fails, the kubelet restarts the container.
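The ReplicaSet's "ensure N replicas" behavior reduces to a small diff computation each reconciliation pass. A simplified sketch (real ReplicaSets track pods via label selectors and owner references):

```python
# Sketch of one ReplicaSet reconciliation pass. Action tuples are an
# illustrative stand-in for the create/delete API calls it would issue.

def reconcile(desired: int, running: list) -> list:
    """Return the actions needed to converge the running pods to N replicas."""
    diff = desired - len(running)
    if diff > 0:
        return [("create", None)] * diff                    # too few: start new pods
    if diff < 0:
        return [("delete", p) for p in running[desired:]]   # too many: remove extras
    return []                                               # converged: nothing to do
```

A rolling update is this same mechanism applied to two ReplicaSets at once: the new one is scaled up while the old one is scaled down.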
Services and Networking
A Service provides a stable DNS name and virtual IP for a group of pods. Service types: ClusterIP (internal only), NodePort (external via a static port on every node's IP), LoadBalancer (external via a cloud load balancer), ExternalName (DNS alias to an external name). kube-proxy implements service routing via iptables or IPVS rules on each node. CoreDNS resolves service names to ClusterIP addresses. Pod-to-pod networking is handled by a CNI plugin (Flannel, Calico, Cilium).
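Conceptually, what kube-proxy programs into iptables or IPVS is a mapping from a virtual IP to the set of ready pod endpoints, with one backend picked per connection. A toy model (the addresses below are made up for illustration):

```python
import random

# Toy model of kube-proxy's service routing: a ClusterIP load-balanced
# across ready pod endpoints. All addresses here are illustrative.

ENDPOINTS = {
    "10.96.0.10": ["10.244.1.5:8080", "10.244.2.7:8080"],  # ClusterIP -> pod IPs
}

def route(cluster_ip: str) -> str:
    """Pick a backend pod for a new connection to the service's virtual IP."""
    backends = ENDPOINTS.get(cluster_ip)
    if not backends:
        # A service with no ready endpoints rejects connections.
        raise ConnectionRefusedError(f"no endpoints for {cluster_ip}")
    return random.choice(backends)  # iptables mode selects randomly per connection
```

This is also why readiness probes matter for networking: a pod that fails its readiness probe is removed from the endpoint list and stops receiving connections.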
Resource Management
Resource requests define the minimum guaranteed CPU and memory and are what the scheduler uses for placement decisions. Resource limits define the maximum allowed usage: CPU is throttled at the limit, and a container is OOM-killed if it exceeds its memory limit. Quality of Service classes: Guaranteed (requests equal limits for every container), Burstable (at least one request or limit is set, but the pod does not qualify as Guaranteed), BestEffort (no requests or limits set). LimitRange sets per-namespace defaults and maximums. ResourceQuota limits total resources consumed per namespace.
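The QoS rules above can be expressed as a short classification function. This sketch simplifies to a single container (the real rules apply across all containers in the pod, and requests default to limits when only limits are set):

```python
# Simplified QoS classification for a single-container pod.
# Real pods aggregate this across every container in the spec.

def qos_class(requests: dict, limits: dict) -> str:
    """Classify a pod spec into a Kubernetes QoS class."""
    if not requests and not limits:
        return "BestEffort"   # nothing set: first to be evicted under pressure
    if limits and requests == limits and set(limits) == {"cpu", "memory"}:
        return "Guaranteed"   # both resources pinned: requests == limits
    return "Burstable"        # something set, but not fully pinned
```

QoS class matters under node memory pressure: BestEffort pods are evicted first, Guaranteed pods last.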
Horizontal Pod Autoscaler
The HPA scales replica count based on metrics such as CPU utilization or custom metrics via the Prometheus adapter. With a target of 70% CPU utilization, replicas are added when average CPU exceeds 70% and removed when it falls sufficiently below the target. Scale-up is aggressive (fast response); scale-down is conservative, with a 5-minute stabilization window by default to prevent thrashing. KEDA extends autoscaling to event-driven sources such as Kafka consumer lag or queue depth.
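The core of the HPA algorithm is the documented ratio rule, desired = ceil(current × observed / target), with a tolerance band (10% by default) inside which no scaling happens:

```python
import math

# The HPA scaling rule: desired = ceil(current * observed / target),
# skipped when the observed/target ratio is within the tolerance band.

def desired_replicas(current: int, observed_util: float, target_util: float,
                     tolerance: float = 0.1) -> int:
    """Compute the HPA's desired replica count for a utilization metric."""
    ratio = observed_util / target_util
    if abs(ratio - 1.0) <= tolerance:
        return current                       # close enough to target: no change
    return max(1, math.ceil(current * ratio))
```

For example, 4 replicas averaging 90% CPU against a 70% target yields ceil(4 × 90/70) = 6 replicas.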
Storage
A PersistentVolume (PV) abstracts the underlying storage. A PersistentVolumeClaim (PVC) requests storage by size and access mode (ReadWriteOnce, ReadOnlyMany, ReadWriteMany). A StorageClass provisions volumes dynamically (AWS EBS, GCP PD, NFS). StatefulSets give pods stable network identities and persistent storage across rescheduling. PVs persist independently of the pod lifecycle, so data is not lost when a pod is rescheduled.
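Static PVC-to-PV binding reduces to matching a claim against the available volumes. A sketch under simplified assumptions (real binding also considers StorageClass, node affinity, and volume binding mode):

```python
# Sketch of static PVC-to-PV binding: the smallest available PV that
# satisfies the claim's size and access mode. Dict fields are illustrative.

def bind(claim: dict, volumes: list):
    """Bind the claim to a matching PV and return its name, or None."""
    candidates = [v for v in volumes
                  if v["available"]
                  and v["size_gi"] >= claim["size_gi"]
                  and claim["access_mode"] in v["access_modes"]]
    if not candidates:
        return None  # the claim stays Pending until a matching PV appears
    best = min(candidates, key=lambda v: v["size_gi"])  # avoid over-allocating
    best["available"] = False  # a bound PV serves exactly one claim
    return best["name"]
```

Picking the smallest sufficient volume keeps large PVs free for large claims; an unmatched claim simply waits, which is also how dynamic provisioning gets triggered via a StorageClass.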