Back-of-envelope calculations are a core skill in system design interviews. They let you quickly determine whether a single server suffices or whether you need a distributed fleet, whether a relational database can handle the load or whether you need sharding. Interviewers are not looking for precision — they’re evaluating whether you can reason about scale quantitatively.
Key Latency Numbers Every Engineer Should Know
These numbers, originally compiled by Jeff Dean at Google, give you intuition for where time goes in a system. Memorize the order of magnitude for each:
| Operation | Latency |
|---|---|
| L1 cache reference | 1 ns |
| L2 cache reference | 5 ns |
| L3 cache reference | 20 ns |
| Main memory (DRAM) access | 100 ns |
| SSD random read | 100 μs (0.1 ms) |
| HDD seek | 10 ms |
| Network round-trip (same datacenter) | 0.5 ms |
| Network round-trip (cross-region) | ~100 ms |
Practical implications: reading from memory is 1000× faster than reading from SSD, which is 100× faster than HDD. A cross-region network call is 200× slower than an intra-datacenter call. These ratios justify caching layers and regional data replication.
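These ratios fall straight out of the table; a quick sketch (the dictionary below is just the table transcribed into nanoseconds, with illustrative key names):

```python
# Latency numbers from the table above, in nanoseconds.
LATENCY_NS = {
    "dram": 100,
    "ssd_random_read": 100_000,       # 100 us
    "hdd_seek": 10_000_000,           # 10 ms
    "rtt_same_datacenter": 500_000,   # 0.5 ms
    "rtt_cross_region": 100_000_000,  # ~100 ms
}

def slowdown(slow: str, fast: str) -> float:
    """How many times slower the first operation is than the second."""
    return LATENCY_NS[slow] / LATENCY_NS[fast]

print(slowdown("ssd_random_read", "dram"))                # memory vs. SSD
print(slowdown("hdd_seek", "ssd_random_read"))            # SSD vs. HDD
print(slowdown("rtt_cross_region", "rtt_same_datacenter"))  # cross-region vs. intra-DC
```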
Data Size Reference
Knowing the size of common data types lets you estimate storage without a calculator:
| Data Type | Size |
|---|---|
| 1 ASCII character | 1 byte |
| Integer (int32) | 4 bytes |
| Long (int64) | 8 bytes |
| UUID / GUID | 16 bytes |
| Average URL | ~100 bytes |
| Average tweet | ~140 bytes |
| Thumbnail image | ~200 KB |
| HD photo | ~5 MB |
| 1 minute HD video | ~50 MB |
These are approximations — interviewers expect round numbers, not precision. If an interviewer challenges your assumption, adjust and recalculate. The ability to adapt quickly matters more than the initial estimate.
QPS Estimation Template
Queries Per Second (QPS) drives decisions about server count, load balancing, and database connection pooling. Use this template:
Average QPS = DAU × requests_per_user_per_day / 86400
Peak QPS = Average QPS × 2 to 3
Example:
DAU = 100M users
Requests per user per day = 10
Average QPS = 100,000,000 × 10 / 86,400 ≈ 11,574 QPS ≈ 12K QPS
Peak QPS = 12K × 3 = 36K QPS
86,400 is the number of seconds in a day (60 × 60 × 24); for mental math, round it up to 100,000. A single modern API server handles roughly 1K–10K QPS depending on request complexity, so 36K peak QPS implies somewhere between 4 and 36 servers, plus a load balancer.
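The template translates directly into code; a minimal sketch (the function name and the default peak factor of 3 are illustrative assumptions, not from any library):

```python
SECONDS_PER_DAY = 86_400  # 60 * 60 * 24

def estimate_qps(dau: int, requests_per_user_per_day: float,
                 peak_factor: float = 3.0) -> tuple[float, float]:
    """Average and peak QPS; a peak factor of 2-3x is a common assumption."""
    average = dau * requests_per_user_per_day / SECONDS_PER_DAY
    return average, average * peak_factor

avg, peak = estimate_qps(100_000_000, 10)
print(f"avg ≈ {avg:,.0f} QPS, peak ≈ {peak:,.0f} QPS")
```

Note the example above rounds the average to 12K before multiplying, giving 36K; the unrounded peak is about 35K. Either is fine at this level of precision.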
Storage Estimation Template
Storage estimation determines database sizing, object storage needs, and backup costs. Always estimate daily growth first, then project to 1 year and 5 years:
Daily storage = events_per_day × avg_size_per_event
Annual storage = daily_storage × 365
5-year storage = annual_storage × 5
Twitter example:
100M new tweets/day × 140 bytes = 14 GB/day (raw text)
With metadata, indexes, replication × 5 = 70 GB/day
5-year total = 70 GB × 365 × 5 ≈ 128 TB
Always apply a multiplier for metadata, indexes, and replication overhead. A factor of 3–5× is common. State your assumption explicitly: "I’ll use a 5× multiplier for metadata and replication."
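The same template as code, a sketch with the 5× overhead multiplier stated as an explicit default (function name is illustrative):

```python
def estimate_storage(events_per_day: int, avg_size_bytes: int,
                     overhead_multiplier: float = 5.0,
                     years: int = 5) -> tuple[float, float]:
    """Daily and multi-year storage in bytes, including metadata/replication overhead."""
    daily = events_per_day * avg_size_bytes * overhead_multiplier
    return daily, daily * 365 * years

daily, total = estimate_storage(100_000_000, 140)  # the Twitter example
print(f"{daily / 1e9:.0f} GB/day, {total / 1e12:.0f} TB over 5 years")
```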
Bandwidth Estimation
Bandwidth drives CDN cost, network card specifications, and egress pricing. Estimate inbound (upload) and outbound (download) separately:
Upload bandwidth = uploads_per_day × avg_size / 86400
Download bandwidth = upload_bandwidth × read_to_write_ratio
Photo upload example (1M photos/day × 200KB):
Upload = 1,000,000 × 200,000 bytes / 86,400 ≈ 2.3 MB/s
Download = 2.3 MB/s × 10 (read:write ratio) = 23 MB/s outbound
Convert to bits for network capacity planning (1 byte = 8 bits). 23 MB/s = 184 Mbps — well within a 1Gbps NIC but worth noting at peak multiplier.
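Sketched in code, using decimal units as in the example (function name is an illustrative assumption):

```python
SECONDS_PER_DAY = 86_400

def estimate_bandwidth(uploads_per_day: int, avg_size_bytes: int,
                       read_write_ratio: float) -> tuple[float, float]:
    """Inbound and outbound bandwidth in bytes per second."""
    upload = uploads_per_day * avg_size_bytes / SECONDS_PER_DAY
    return upload, upload * read_write_ratio

up, down = estimate_bandwidth(1_000_000, 200_000, 10)
print(f"{up / 1e6:.1f} MB/s in, {down / 1e6:.0f} MB/s out, "
      f"{down * 8 / 1e6:.0f} Mbps out")  # x8 converts bytes/s to bits/s
```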
Cache Sizing
The Pareto principle is a useful rule of thumb for caching: roughly 20% of content drives 80% of traffic. Caching that hot 20% yields around an 80% cache hit rate, dramatically reducing database load.
Cache size needed = total_active_data × 0.20
Example: 1 TB of active data
Cache needed = 1 TB × 0.20 = 200 GB of RAM (Redis cluster)
Redis keeps data in RAM with per-key overhead, so provision more memory than the raw data size: a 200 GB cache needs roughly a 256 GB RAM node or a cluster of smaller nodes. State the eviction policy: allkeys-lru (Least Recently Used) is appropriate for most cache workloads; note that Redis's default, noeviction, rejects writes when memory is full.
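The sizing and the resulting database load can be sketched together (the 20% hot fraction and 80% hit rate are the stated assumptions; the function name is illustrative):

```python
def cache_plan(active_data_bytes: float, hot_fraction: float = 0.20,
               hit_rate: float = 0.80) -> tuple[float, float]:
    """Cache size for the hot set, and the fraction of reads that miss to the DB."""
    return active_data_bytes * hot_fraction, 1.0 - hit_rate

cache_bytes, db_fraction = cache_plan(1e12)  # 1 TB of active data
print(f"{cache_bytes / 1e9:.0f} GB cache; {db_fraction:.0%} of reads reach the DB")
```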
Availability Numbers
Availability SLOs are expressed in nines. Know what each level means in terms of allowable downtime:
| Availability | Downtime per Year | Downtime per Month |
|---|---|---|
| 99% ("two nines") | 3.65 days | 7.3 hours |
| 99.9% ("three nines") | 8.76 hours | 43.8 minutes |
| 99.99% ("four nines") | 52.6 minutes | 4.4 minutes |
| 99.999% ("five nines") | 5.26 minutes | 26 seconds |
Five nines requires redundancy at every layer, automated failover, zero-downtime deployments, and serious operational investment. Ask whether that’s actually required before designing for it — most systems don’t need it.
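The downtime budgets follow directly from the availability fraction; a sketch using a 365-day year and year/12 months, as in the table:

```python
def downtime_budget(availability: float) -> tuple[float, float]:
    """Allowed downtime in seconds per year and per month (year / 12)."""
    unavailable = 1.0 - availability
    per_year = unavailable * 365 * 24 * 3600
    return per_year, per_year / 12

per_year, per_month = downtime_budget(0.999)  # three nines
print(f"{per_year / 3600:.2f} h/year, {per_month / 60:.1f} min/month")
```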
Worked Example: URL Shortener
Assumptions: 1B total URLs stored, 100M reads/day, 100K new URLs created/day.
Storage:
1B URLs × 100 bytes = 100 GB total URL data
With metadata × 3 = 300 GB
Read QPS:
100M reads/day / 86,400 = 1,157 QPS average
Peak = 1,157 × 3 = 3,471 QPS ≈ 3.5K QPS
Write QPS:
100K new URLs/day / 86,400 = 1.16 QPS (negligible)
Cache:
Top 20% of URLs = 20% of 100 GB = 20 GB of RAM for cache
With 80% hit rate: only 700 QPS reaches the database
Conclusion: a small cluster of API servers, a single primary PostgreSQL database with a read replica, and a 20 GB Redis cache handle this workload comfortably. No sharding needed at this scale.
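The whole estimate fits in a few lines; a sketch reproducing the arithmetic above:

```python
SECONDS_PER_DAY = 86_400

total_urls = 1_000_000_000
reads_per_day = 100_000_000
writes_per_day = 100_000

storage_gb = total_urls * 100 / 1e9        # ~100 bytes per URL, raw
read_qps = reads_per_day / SECONDS_PER_DAY
peak_read_qps = read_qps * 3
db_qps = peak_read_qps * (1 - 0.80)        # 80% cache hit rate
write_qps = writes_per_day / SECONDS_PER_DAY

print(f"storage {storage_gb:.0f} GB raw, peak reads {peak_read_qps:,.0f} QPS, "
      f"DB {db_qps:,.0f} QPS, writes {write_qps:.2f} QPS")
```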
Worked Example: Instagram-Scale Photo Sharing
Assumptions: 1B registered users, 20% DAU = 200M active users, 2 photo uploads per DAU per day, 30 photo reads per DAU per day.
Upload QPS:
200M users × 2 uploads/day / 86,400 = 4,630 QPS uploads
Read QPS:
200M users × 30 reads/day / 86,400 = 69,444 QPS reads
Peak reads = 69,444 × 2 = 138,888 QPS ≈ 139K QPS
Photo storage:
400M uploads/day (200M DAU × 2) × 5 MB (HD photo) = 2,000 TB/day = 2 PB/day
With thumbnails + multiple resolutions × 3 = 6 PB/day
5-year storage = 6 PB/day × 365 × 5 ≈ 10,950 PB ≈ 11 exabytes
At this scale: CDN is mandatory (139K read QPS cannot hit origin servers), photos go to object storage (S3 or equivalent), metadata goes into a sharded NoSQL store (Cassandra), and the upload pipeline uses async processing (upload to S3 → trigger Lambda → generate thumbnails → update metadata). This is a fundamentally different architecture from the URL shortener example.
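The same arithmetic as a sketch, derived from the stated assumptions (note that 2 uploads per DAU means 400M uploads/day):

```python
SECONDS_PER_DAY = 86_400
dau = 200_000_000  # 20% of 1B registered users

upload_qps = dau * 2 / SECONDS_PER_DAY
read_qps = dau * 30 / SECONDS_PER_DAY

daily_bytes = dau * 2 * 5e6 * 3  # 5 MB per photo, 3x for extra resolutions
five_year_pb = daily_bytes * 365 * 5 / 1e15

print(f"{upload_qps:,.0f} upload QPS, {read_qps:,.0f} read QPS, "
      f"{daily_bytes / 1e15:.0f} PB/day, {five_year_pb:,.0f} PB over 5 years")
```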