System Design Interview Framework — Step-by-Step: Requirements, Estimation, High-Level Design, Deep Dive, Tradeoffs

The system design interview evaluates your ability to design large-scale systems under ambiguity. Unlike coding interviews with a single correct answer, system design is about demonstrating structured thinking, making reasonable tradeoffs, and communicating clearly. This guide provides a step-by-step framework for tackling any system design question in 45 minutes — the same framework used by interviewers at Google, Meta, Amazon, and other top companies to evaluate candidates.

Step 1: Requirements Clarification (5 minutes)

Never start designing without clarifying requirements. Ask questions to narrow the scope. Functional requirements: “What are the core features we need to support?” For a URL shortener: create short URLs, redirect to original URLs, analytics (click counts). Non-functional requirements: scale (how many users, how many requests per second), latency (what is the acceptable response time), availability (what is the uptime target), consistency (is eventual consistency acceptable), durability (can we lose data). Example questions: “How many URLs are shortened per day?” “What is the read-to-write ratio?” “Do short URLs expire?” “Do we need real-time analytics or daily batch?” The interviewer will give you numbers or ask you to estimate. These numbers drive your architecture decisions — a system handling 100 requests per second has a very different architecture from one handling 1 million RPS. Write down the agreed requirements on the whiteboard. This is your design contract — refer back to it throughout the interview to justify your decisions.

Step 2: Back-of-Envelope Estimation (5 minutes)

Estimation grounds your design in reality. Key estimates: (1) Traffic estimation: 100M URLs created per day = 100M / 86400 = ~1200 writes per second. With a 100:1 read-to-write ratio, that is ~120,000 reads per second. (2) Storage estimation: each URL record is ~500 bytes (short URL, original URL, creation date, user ID). 100M * 500 bytes = 50GB per day. Over 5 years: 50GB * 365 * 5 = ~91TB. (3) Bandwidth estimation: 120,000 reads/sec * 500 bytes = 60MB/s outbound. (4) Memory estimation for caching: if we cache the top 20% of URLs (Pareto principle), and roughly 1 billion unique URLs are requested daily: 1B * 20% * 500 bytes = 100GB cache. This fits in a single high-memory Redis instance or a small Redis cluster. These numbers tell you: you need a distributed database (91TB does not fit on one server), a caching layer (120K reads/sec would overwhelm the database alone), and the write load (1200/sec) is manageable for a single primary database with sharding for growth. Show your math on the whiteboard. Interviewers evaluate the process, not the exact numbers.
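The arithmetic above can be checked in a few lines. This is just the worked example's numbers restated as code (Python used for illustration; the document rounds 1157 writes/sec to ~1200 and ~58MB/s to 60MB/s):

```python
# Back-of-envelope estimates for the URL shortener example.
SECONDS_PER_DAY = 86_400

urls_per_day = 100_000_000
write_qps = urls_per_day / SECONDS_PER_DAY       # ~1,157 writes/sec (~1200)
read_qps = write_qps * 100                       # 100:1 read-to-write ratio (~120K)

record_bytes = 500
storage_per_day_gb = urls_per_day * record_bytes / 10**9    # 50 GB/day
storage_5y_tb = storage_per_day_gb * 365 * 5 / 1000         # ~91 TB over 5 years

bandwidth_mb_per_s = read_qps * record_bytes / 10**6        # ~58 MB/s outbound
cache_gb = 10**9 * 0.20 * record_bytes / 10**9              # 100 GB for the top 20%
```

Precision is beside the point here; the orders of magnitude are what drive the architecture decisions.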

Step 3: High-Level Design (10 minutes)

Draw the major components and their interactions on the whiteboard. Start with the user and work inward: Client -> Load Balancer -> Application Servers -> Cache -> Database. For each component, explain why it exists. Load balancer: distributes traffic across multiple application servers for scalability and availability. Cache (Redis): serves frequent URL lookups without hitting the database. Database (PostgreSQL with sharding): stores all URL mappings durably. Key design decisions at this stage: (1) How to generate short URLs? Base62 encoding of an auto-incrementing ID, or MD5/SHA256 hash truncated to 7 characters, or a pre-generated key service (KGS) that maintains a pool of unused short codes. (2) How to handle reads? Check cache first (Redis). On cache miss, query the database, populate the cache, return the result. (3) How to handle writes? Generate short URL, write to database, populate cache. Draw arrows showing data flow for both reads and writes. Do not go deep into any one component yet — cover the full system first. The interviewer may redirect you to a different component for the deep dive.
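The first option above, Base62 encoding of an auto-incrementing ID, is short enough to sketch directly (Python used for illustration; the alphabet ordering is an arbitrary choice). Seven Base62 characters give 62^7 ≈ 3.5 trillion codes, comfortably more than the ~180 billion URLs accumulated over 5 years at 100M/day:

```python
import string

# 62 characters: 0-9, a-z, A-Z. The ordering is arbitrary but must be fixed.
BASE62_ALPHABET = string.digits + string.ascii_lowercase + string.ascii_uppercase


def encode_base62(n: int) -> str:
    """Encode a non-negative integer (e.g. an auto-increment ID) as Base62."""
    if n == 0:
        return BASE62_ALPHABET[0]
    digits = []
    while n > 0:
        n, rem = divmod(n, 62)
        digits.append(BASE62_ALPHABET[rem])
    return "".join(reversed(digits))
```

A tradeoff worth voicing: sequential IDs make short codes predictable (users can enumerate them), which is one motivation for the hash-based or pre-generated-key alternatives.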

Step 4: Deep Dive (15 minutes)

The interviewer picks 1-2 components for a detailed discussion. Common deep-dive topics: (1) Database design — schema design, indexing strategy, sharding approach. For the URL shortener: shard by hash of short_code for even distribution. Index on short_code (a hash index gives O(1) lookups; a B-tree gives O(log n), effectively constant in practice). (2) Caching strategy — eviction policy (LRU), cache invalidation, cache consistency. For the URL shortener: cache-aside pattern. URLs rarely change, so cache staleness is minimal. Set TTL to 24 hours. (3) Scalability bottlenecks — what breaks at 10x scale? The key generation service becomes a bottleneck if it uses a single database sequence. Solution: pre-generate batches of keys and distribute them to application servers. (4) Availability and fault tolerance — what happens when a component fails? Database fails: serve reads from cache (degraded but available). Cache fails: reads go to the database (higher latency but functional). Application server fails: load balancer routes to healthy servers. During the deep dive, explicitly state tradeoffs: “I chose this approach because X, but the tradeoff is Y.” This demonstrates engineering maturity — there is no perfect solution, only informed tradeoffs.
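The cache-aside read path described above can be sketched in a few lines. This is a minimal illustration with plain dicts standing in for the stores; a real deployment would use a Redis client and a sharded SQL database, and the function name is an invention for this sketch:

```python
from typing import Optional

CACHE_TTL_SECONDS = 24 * 60 * 60  # the 24-hour TTL from the discussion above


def resolve_short_url(short_code: str, cache: dict, db: dict) -> Optional[str]:
    """Cache-aside read path: check cache, fall back to DB, repopulate."""
    # 1. Serve from cache when possible (the common case at a 100:1 read ratio).
    if short_code in cache:
        return cache[short_code]
    # 2. Cache miss: query the authoritative store.
    original = db.get(short_code)
    # 3. Populate the cache so subsequent reads skip the database.
    if original is not None:
        cache[short_code] = original  # with Redis: SET short_code url EX 86400
    return original
```

Because the application, not the cache, owns this logic, a cache outage degrades to the "reads go to the database" fallback described above with no code change.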

Step 5: Wrap-Up and Extensions (5 minutes)

Summarize your design and proactively address extensions. Summary: restate the key design decisions and why they satisfy the requirements. “The system handles 120K reads/sec through the Redis cache layer, stores 91TB over 5 years across sharded PostgreSQL, and achieves 99.9% availability through redundancy at every layer.” Extensions to mention: (1) Analytics — real-time click tracking using Kafka for event streaming and ClickHouse for analytical queries. (2) Rate limiting — token bucket algorithm at the API gateway to prevent abuse. (3) Geographic distribution — deploy in multiple regions with GeoDNS routing users to the nearest datacenter. (4) Monitoring and alerting — RED metrics (Rate, Errors, Duration) on Grafana with PagerDuty integration. Mentioning extensions shows breadth of knowledge and production awareness — you are not just designing the happy path. However, do not overextend. If the interviewer asks about one extension, go deep on it. If they do not, mentioning them briefly is sufficient. Time management: aim to cover all 5 steps. Running out of time during the deep dive without a wrap-up leaves a worse impression than a slightly shorter deep dive with a complete narrative.
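The token bucket algorithm mentioned as a rate-limiting extension is worth being able to sketch on demand. A minimal single-process version (names are illustrative; a gateway would typically keep one bucket per client key, often in Redis):

```python
import time


class TokenBucket:
    """Token-bucket limiter: refills `rate` tokens/sec up to `capacity`."""

    def __init__(self, rate: float, capacity: float):
        self.rate = rate                    # sustained requests per second
        self.capacity = capacity            # maximum burst size
        self.tokens = capacity              # start with a full bucket
        self.last_refill = time.monotonic()

    def allow(self) -> bool:
        now = time.monotonic()
        elapsed = now - self.last_refill
        # Refill in proportion to elapsed time, never exceeding capacity.
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.last_refill = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False
```

The design point to voice: capacity controls burst tolerance while rate controls the sustained limit, which is why token bucket is usually preferred over a fixed-window counter at an API gateway.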

Common Mistakes to Avoid

Mistakes that cost candidates: (1) Jumping into design without requirements — this signals junior thinking. Always ask questions first. (2) Over-engineering — designing for Google scale when the requirements say 1000 users. Scale your architecture to the stated requirements, not to hypothetical billions. (3) Monologuing — the interview is a conversation, not a lecture. Check in with the interviewer: “Should I go deeper here or move on?” (4) Ignoring tradeoffs — every design decision has tradeoffs. Saying “we use Kafka” without explaining why Kafka over RabbitMQ or SQS misses the point. (5) Not quantifying — “the system will be fast” is meaningless. “P99 latency will be 50ms because reads are served from Redis” is specific and verifiable. (6) Ignoring failure modes — production systems fail. Address what happens when each component goes down. (7) Forgetting about data consistency — in distributed systems, consistency is a design choice, not a default. State whether you need strong consistency or eventual consistency and why. (8) Not drawing diagrams — always use the whiteboard. Diagrams communicate more effectively than words and show structured thinking.
