System Design: Docker Containers — Namespaces, Cgroups, Image Layers, Dockerfile Best Practices, Container Security

Containers are the building blocks of modern cloud-native applications. Knowing how Docker works internally — Linux namespaces, cgroups, layered filesystems, and image construction — demystifies the infrastructure your code runs on. This guide covers container internals, Dockerfile best practices, and container security — essential knowledge for backend engineering and DevOps interviews.

How Containers Work: Namespaces and Cgroups

Containers are not virtual machines. A container is a regular Linux process with two kernel features applied:

(1) Namespaces — provide isolation. Each container gets its own view of:

- PID namespace: the container sees its processes starting from PID 1, isolated from host processes.
- Network namespace: its own network interfaces, IP address, and port space.
- Mount namespace: its own filesystem tree.
- UTS namespace: its own hostname.
- IPC namespace: isolated inter-process communication.
- User namespace: can map container root to an unprivileged host user.

The container process believes it is running in its own isolated Linux environment, but it shares the host kernel.

(2) Cgroups (control groups) — limit and account for resource usage. A container cgroup constrains:

- CPU: shares, quota, period — e.g., this container gets at most 2 CPU cores.
- Memory: maximum memory usage — exceeding the limit triggers the OOM killer.
- I/O: block device read/write rate limits.
- PIDs: maximum number of processes.

Together: namespaces provide isolation (what the process can see), cgroups provide resource limits (what the process can use). No hypervisor, no guest kernel — containers are lightweight because they share the host kernel.
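On a host with Docker installed, the cgroup limits above map directly onto `docker run` flags. A sketch — the image, container name, and limit values are illustrative:

```shell
# Run a container with explicit cgroup limits:
#   --cpus        CPU quota (at most 2 cores' worth of CPU time)
#   --memory      memory limit; exceeding it triggers the OOM killer
#   --pids-limit  maximum number of processes in the container
docker run -d --name limited-app \
  --cpus="2.0" \
  --memory="512m" \
  --pids-limit=100 \
  nginx:1.27

# Inspect what the kernel actually enforces (cgroup v2 path):
docker exec limited-app cat /sys/fs/cgroup/memory.max
```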

Docker Image Layers and the Union Filesystem

A Docker image is a stack of read-only filesystem layers. Each instruction in a Dockerfile (FROM, RUN, COPY, ADD) creates a new layer. Layers are content-addressed: the layer ID is the SHA256 hash of its contents.

Layer reuse: if two images share the same base (FROM python:3.12), that base layer is stored once on disk and shared. This saves disk space and speeds up image pulls (only the layers you do not already have are downloaded).

When a container runs, Docker adds a thin read-write layer on top of the image layers using a union filesystem (overlay2 on modern Linux). Writes go to the read-write layer; reads fall through to the image layers. When the container is deleted, the read-write layer is discarded — the image layers are unchanged. This is why containers are ephemeral: any data written inside the container is lost unless it is written to a volume (a host directory mounted into the container).

Image size optimization: minimize the number of layers and the size of each layer. Combine RUN commands with && to reduce layers, and remove temporary files in the same RUN command that creates them (apt-get clean, rm -rf /var/lib/apt/lists/*) — a file deleted in a later layer still occupies space in the earlier layer that created it.
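A sketch of the cleanup pattern (base image and package are illustrative). Because each RUN produces an immutable layer, deleting files in a later RUN does not shrink the image; cleanup must happen in the same RUN that created the files:

```dockerfile
FROM debian:bookworm-slim

# BAD: the apt package lists persist in the first layer,
# even though a later layer deletes them:
#   RUN apt-get update && apt-get install -y curl
#   RUN rm -rf /var/lib/apt/lists/*

# GOOD: install and clean up in one RUN, producing a single small layer.
RUN apt-get update \
 && apt-get install -y --no-install-recommends curl \
 && apt-get clean \
 && rm -rf /var/lib/apt/lists/*
```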

Dockerfile Best Practices

Production Dockerfile patterns:

(1) Multi-stage builds — use one stage to build the application and another to run it. The build stage includes compilers, build tools, and dependencies; the run stage contains only the compiled binary and runtime dependencies. This reduces image size from 1GB+ (build stage) to 50-100MB (run stage).

(2) Use specific base image tags — FROM python:3.12.3-slim, not FROM python:latest. The latest tag changes unexpectedly and breaks reproducibility.

(3) Order instructions by change frequency — put rarely changing instructions (installing system packages) early and frequently changing instructions (copying application code) late. Docker caches layers — changing an early instruction invalidates all subsequent layers.

(4) Run as non-root — add USER appuser after creating the user. Running as root inside a container is a security risk (container escape vulnerabilities allow root on the host).

(5) Use .dockerignore — exclude .git, node_modules, __pycache__, and other unnecessary files from the build context. This speeds up builds and reduces image size.

(6) Health checks — add HEALTHCHECK CMD curl -f http://localhost:8080/health || exit 1. Docker and orchestrators use this to detect unhealthy containers.
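Several of these patterns can be combined in one file. A minimal multi-stage sketch, assuming a Go service with its main package at the repository root (paths and image tags are illustrative):

```dockerfile
# --- Build stage: full Go toolchain (hundreds of MB) ---
FROM golang:1.22 AS build
WORKDIR /src
# Copy dependency manifests first so this layer stays cached
# until go.mod/go.sum change (practice 3: order by change frequency).
COPY go.mod go.sum ./
RUN go mod download
COPY . .
RUN CGO_ENABLED=0 go build -o /app .

# --- Run stage: minimal runtime image, only the binary ---
FROM gcr.io/distroless/static-debian12
COPY --from=build /app /app
# distroless ships a built-in unprivileged user (practice 4: non-root).
USER nonroot
ENTRYPOINT ["/app"]
```

The run stage never sees the compiler or source tree, so the final image is typically tens of MB instead of the build stage's hundreds.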

Container Networking

Docker provides several networking modes:

(1) Bridge (default) — containers connect to a virtual bridge network (docker0). Each container gets an IP on the bridge subnet, and containers on the same bridge can communicate. Port mapping (-p 8080:80) exposes container ports to the host.

(2) Host — the container shares the host network namespace. No network isolation — the container uses the host IP and ports directly. Higher performance (no NAT overhead) but no port isolation.

(3) None — no networking. The container has only a loopback interface. Use for batch processing that does not need network access.

(4) Overlay — multi-host networking for Docker Swarm or Kubernetes. VXLAN encapsulation creates a virtual network spanning multiple hosts, so containers on different hosts communicate as if on the same LAN.

In Kubernetes, the CNI plugin (Calico, Cilium, Flannel) handles container networking. Each pod gets a unique IP, and pods can communicate across nodes without NAT. Service discovery uses CoreDNS: a Service name resolves to a ClusterIP that load-balances to backend pods.
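The single-host modes can be sketched with `docker run` (image tags and ports are illustrative; requires a Docker host):

```shell
# Bridge (default): the container gets an IP on docker0,
# and -p publishes container port 80 as host port 8080.
docker run -d -p 8080:80 --name web nginx:1.27
curl http://localhost:8080

# Host: shares the host network namespace; nginx binds
# host port 80 directly, with no NAT and no port mapping.
docker run -d --network host nginx:1.27

# None: loopback only — the command below shows just the lo interface.
docker run --rm --network none alpine:3.20 ip addr
```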

Container Security

Container security layers:

(1) Image scanning — scan images for known vulnerabilities (CVEs) before deployment. Tools: Trivy (open source, fast), Snyk Container, AWS ECR scanning. Integrate scanning into CI/CD — fail the build if critical CVEs are found.

(2) Run as non-root — the container process should not run as UID 0. If an attacker escapes the container, they are an unprivileged user on the host.

(3) Read-only filesystem — mount the container filesystem as read-only (docker run --read-only). The application writes only to explicitly mounted volumes. This prevents attackers from modifying binaries or installing tools.

(4) Seccomp profiles — restrict which system calls the container can make. Docker applies a default seccomp profile that blocks dangerous syscalls (reboot, mount, kexec). Custom profiles can further restrict based on application needs.

(5) Network policies — in Kubernetes, NetworkPolicy resources restrict which pods can communicate. Default deny all ingress, then allow specific traffic (pod A can reach pod B on port 8080). This limits lateral movement after a compromise.

(6) Secrets management — never bake secrets into images. Use Kubernetes Secrets, Vault, or environment variables injected at runtime.
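The default-deny-then-allow pattern can be sketched as two Kubernetes NetworkPolicy resources (the namespace and pod labels are illustrative):

```yaml
# First: deny all ingress to every pod in the namespace.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-ingress
  namespace: prod
spec:
  podSelector: {}            # empty selector = all pods in the namespace
  policyTypes: ["Ingress"]   # no ingress rules listed => deny all ingress
---
# Then: explicitly allow pod A to reach pod B on port 8080.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-a-to-b
  namespace: prod
spec:
  podSelector:
    matchLabels: { app: pod-b }
  policyTypes: ["Ingress"]
  ingress:
    - from:
        - podSelector:
            matchLabels: { app: pod-a }
      ports:
        - protocol: TCP
          port: 8080
```

Note that enforcement requires a CNI plugin that implements NetworkPolicy (Calico, Cilium); with a non-enforcing plugin the resources are accepted but ignored.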

Container Registry and Image Distribution

A container registry stores and distributes Docker images. Architecture: images are stored as a manifest (JSON describing layers) and blobs (the actual layer data). The Docker Registry HTTP API V2 provides push (upload image layers and manifest), pull (download layers and manifest), and tag management.

Public registries: Docker Hub (default, rate-limited for anonymous pulls), GitHub Container Registry (ghcr.io), Google Container Registry (gcr.io). Private registries: AWS ECR, Google Artifact Registry, Azure Container Registry, or self-hosted (Harbor).

Image distribution optimization:

(1) Layer deduplication — shared layers are stored once. If 50 images use the same python:3.12 base, the base layers are stored once.

(2) Pull-through cache — a local registry caches images from Docker Hub, reducing external bandwidth and pull latency.

(3) Image pre-pulling — for large images, pre-pull to cluster nodes before deploying. The Kubernetes scheduler prefers nodes that already have the image cached (ImageLocality scoring).

(4) Slim images — use minimal base images (alpine, distroless, scratch for Go binaries) to reduce pull time and attack surface.
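A pull-through cache can be sketched with the open-source `registry:2` image plus a daemon-side mirror setting (the port and host name are illustrative):

```shell
# Run a local registry configured as a pull-through cache of Docker Hub:
docker run -d -p 5000:5000 --name mirror \
  -e REGISTRY_PROXY_REMOTEURL=https://registry-1.docker.io \
  registry:2

# Point the Docker daemon at the mirror in /etc/docker/daemon.json,
# then restart the daemon:
#   { "registry-mirrors": ["http://localhost:5000"] }
#
# Subsequent pulls hit the local cache first and only fetch
# layers that are missing upstream — layer deduplication means
# shared base layers are downloaded from Docker Hub once.
```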
