Introduction
Kubernetes orchestrates containerized workloads across a cluster, managing scheduling, scaling, self-healing, and networking. The control plane manages desired state while worker nodes run the actual workloads.
Control Plane Components
The API Server is the central REST endpoint for all cluster operations. etcd is the distributed key-value store holding all cluster state. The Scheduler assigns pods to nodes based on resource requirements and constraints. The Controller Manager runs controllers: Deployment, ReplicaSet, Node, Service Account, and others. All components except etcd are stateless. The API Server is the only component that reads from and writes to etcd.
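All of these controllers share one pattern: a level-triggered reconciliation loop that compares desired state (the spec stored in etcd) against observed state and acts to converge them. A minimal sketch of that loop, with hypothetical callback names (real controllers use watches against the API server rather than polling):

```python
# Hedged sketch of the control-loop pattern shared by Kubernetes controllers.
# get_desired / get_observed / act are illustrative stand-ins, not real APIs.

def control_loop(get_desired, get_observed, act):
    """One reconciliation pass: diff desired vs observed state, act on the gap."""
    desired = get_desired()    # e.g. spec read from etcd via the API server
    observed = get_observed()  # e.g. status reported by kubelets
    if desired != observed:
        act(desired, observed) # issue API calls to converge observed -> desired
        return "acted"
    return "converged"
```

Because the loop compares state rather than reacting to individual events, a controller that misses an event still converges on its next pass.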
Pod Scheduling
The scheduler selects a node for a pod in two phases. Filtering removes nodes that cannot run the pod: insufficient CPU or memory, untolerated node taints, node affinity mismatches, and pod anti-affinity violations. Scoring then ranks the remaining nodes by criteria such as balanced resource utilization, zone spreading, and topology constraints, and the pod is assigned to the highest-scoring node. The pod spec drives both phases through its resource requests (guaranteed allocation, used for filtering) and limits (maximum allowed usage).
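The two phases can be sketched as a filter followed by an argmax. The node fields and the scoring function below are illustrative assumptions, not the real scheduler plugins:

```python
# Sketch of the two-phase scheduling decision. Node/pod dict fields and the
# "most balanced headroom" score are assumptions for illustration only.

def schedule(pod, nodes):
    """Pick a node name for the pod, or None if no node is feasible."""
    # Phase 1: filtering - drop nodes that cannot run the pod.
    feasible = [
        n for n in nodes
        if n["free_cpu"] >= pod["cpu_request"]
        and n["free_mem"] >= pod["mem_request"]
        and pod.get("tolerations", set()) >= n.get("taints", set())
    ]
    if not feasible:
        return None  # no feasible node: the pod stays Pending

    # Phase 2: scoring - prefer the node with the most remaining headroom.
    def score(n):
        return min(n["free_cpu"] - pod["cpu_request"],
                   n["free_mem"] - pod["mem_request"])

    return max(feasible, key=score)["name"]
```

A pod that survives filtering on no node stays Pending until the cluster changes (a node is added, or another pod frees resources).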
Deployments and ReplicaSets
A Deployment defines the desired state: image, replica count, and update strategy. The Deployment controller creates and manages a ReplicaSet, which ensures N pod replicas are running at all times. During a rolling update a new ReplicaSet is created and pods are gradually shifted from the old ReplicaSet to the new one. Revision history is retained for rollback. A readiness probe gates traffic: a pod receives Service traffic only once the probe passes. A liveness probe detects unhealthy containers; when it fails, the kubelet restarts the container.
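The ReplicaSet's "ensure N replicas" behavior reduces to a small diff computation each reconciliation pass. A simplified sketch (real ReplicaSets track pods via label selectors and owner references):

```python
# Sketch of one ReplicaSet reconciliation pass. Action tuples are an
# illustrative stand-in for the create/delete API calls it would issue.

def reconcile(desired: int, running: list) -> list:
    """Return the actions needed to converge the running pods to N replicas."""
    diff = desired - len(running)
    if diff > 0:
        return [("create", None)] * diff                    # too few: start new pods
    if diff < 0:
        return [("delete", p) for p in running[desired:]]   # too many: remove extras
    return []                                               # converged: nothing to do
```

A rolling update is this same mechanism applied to two ReplicaSets at once: the new one is scaled up while the old one is scaled down.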
Services and Networking
A Service provides a stable DNS name and virtual IP for a group of pods. Service types: ClusterIP (internal only), NodePort (external via a static port on every node's IP), LoadBalancer (external via a cloud load balancer), ExternalName (DNS alias to an external name). kube-proxy implements service routing via iptables or IPVS rules on each node. CoreDNS resolves service names to ClusterIP addresses. Pod-to-pod networking is handled by a CNI plugin (Flannel, Calico, Cilium).
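Conceptually, what kube-proxy programs into iptables or IPVS is a mapping from a virtual IP to the set of ready pod endpoints, with one backend picked per connection. A toy model (the addresses below are made up for illustration):

```python
import random

# Toy model of kube-proxy's service routing: a ClusterIP load-balanced
# across ready pod endpoints. All addresses here are illustrative.

ENDPOINTS = {
    "10.96.0.10": ["10.244.1.5:8080", "10.244.2.7:8080"],  # ClusterIP -> pod IPs
}

def route(cluster_ip: str) -> str:
    """Pick a backend pod for a new connection to the service's virtual IP."""
    backends = ENDPOINTS.get(cluster_ip)
    if not backends:
        # A service with no ready endpoints rejects connections.
        raise ConnectionRefusedError(f"no endpoints for {cluster_ip}")
    return random.choice(backends)  # iptables mode selects randomly per connection
```

This is also why readiness probes matter for networking: a pod that fails its readiness probe is removed from the endpoint list and stops receiving connections.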
Resource Management
Resource requests define the minimum guaranteed CPU and memory and are what the scheduler uses for placement decisions. Resource limits define the maximum allowed usage: CPU is throttled at the limit, and a container is OOM-killed if it exceeds its memory limit. Quality of Service classes: Guaranteed (requests equal limits for every container), Burstable (at least one request or limit is set, but the pod does not qualify as Guaranteed), BestEffort (no requests or limits set). LimitRange sets per-namespace defaults and maximums. ResourceQuota limits total resources consumed per namespace.
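The QoS rules above can be expressed as a short classification function. This sketch simplifies to a single container (the real rules apply across all containers in the pod, and requests default to limits when only limits are set):

```python
# Simplified QoS classification for a single-container pod.
# Real pods aggregate this across every container in the spec.

def qos_class(requests: dict, limits: dict) -> str:
    """Classify a pod spec into a Kubernetes QoS class."""
    if not requests and not limits:
        return "BestEffort"   # nothing set: first to be evicted under pressure
    if limits and requests == limits and set(limits) == {"cpu", "memory"}:
        return "Guaranteed"   # both resources pinned: requests == limits
    return "Burstable"        # something set, but not fully pinned
```

QoS class matters under node memory pressure: BestEffort pods are evicted first, Guaranteed pods last.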
Horizontal Pod Autoscaler
The HPA scales replica count based on metrics such as CPU utilization or custom metrics via the Prometheus adapter. With a target of 70% CPU utilization, replicas are added when average CPU exceeds 70% and removed when it falls sufficiently below the target. Scale-up is aggressive (fast response); scale-down is conservative, with a 5-minute stabilization window by default to prevent thrashing. KEDA extends autoscaling to event-driven sources such as Kafka consumer lag or queue depth.
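The core of the HPA algorithm is the documented ratio rule, desired = ceil(current × observed / target), with a tolerance band (10% by default) inside which no scaling happens:

```python
import math

# The HPA scaling rule: desired = ceil(current * observed / target),
# skipped when the observed/target ratio is within the tolerance band.

def desired_replicas(current: int, observed_util: float, target_util: float,
                     tolerance: float = 0.1) -> int:
    """Compute the HPA's desired replica count for a utilization metric."""
    ratio = observed_util / target_util
    if abs(ratio - 1.0) <= tolerance:
        return current                       # close enough to target: no change
    return max(1, math.ceil(current * ratio))
```

For example, 4 replicas averaging 90% CPU against a 70% target yields ceil(4 × 90/70) = 6 replicas.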
Storage
A PersistentVolume (PV) abstracts the underlying storage. A PersistentVolumeClaim (PVC) requests storage by size and access mode (ReadWriteOnce, ReadOnlyMany, ReadWriteMany). A StorageClass provisions volumes dynamically (AWS EBS, GCP PD, NFS). StatefulSets give pods stable network identities and persistent storage across rescheduling. PVs persist independently of the pod lifecycle, so data is not lost when a pod is rescheduled.
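Static PVC-to-PV binding reduces to matching a claim against the available volumes. A sketch under simplified assumptions (real binding also considers StorageClass, node affinity, and volume binding mode):

```python
# Sketch of static PVC-to-PV binding: the smallest available PV that
# satisfies the claim's size and access mode. Dict fields are illustrative.

def bind(claim: dict, volumes: list):
    """Bind the claim to a matching PV and return its name, or None."""
    candidates = [v for v in volumes
                  if v["available"]
                  and v["size_gi"] >= claim["size_gi"]
                  and claim["access_mode"] in v["access_modes"]]
    if not candidates:
        return None  # the claim stays Pending until a matching PV appears
    best = min(candidates, key=lambda v: v["size_gi"])  # avoid over-allocating
    best["available"] = False  # a bound PV serves exactly one claim
    return best["name"]
```

Picking the smallest sufficient volume keeps large PVs free for large claims; an unmatched claim simply waits, which is also how dynamic provisioning gets triggered via a StorageClass.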