Question 1

How do you generate unique short codes for a URL shortener?

Accepted Answer

Three main approaches:

(1) Hash-based: hash the original URL with MD5 or SHA-256 and take the first 7 characters of the digest. Encode the digest as base62 before truncating; 7 hex characters would give only 16^7, about 268 million codes. Fast, but requires collision handling - if the short code already exists for a different URL, append a counter to the input and rehash.

(2) Counter-based: take an auto-incrementing ID from the database and encode it as base62 (0-9, a-z, A-Z). Codes are guaranteed unique with no collision handling, and length grows predictably: 7 base62 characters cover 62^7, about 3.5 trillion unique codes. Risk: IDs are sequential, so codes are guessable and easy to enumerate.

(3) Pre-generated pool: batch-generate random base62 codes ahead of time and keep the unused ones in a Redis set. On URL creation, SPOP one code from the set; SPOP removes the member atomically, so two requests never receive the same code. Simplest creation logic, and no DB roundtrip for the code itself (only a Redis call).
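The counter-based approach can be sketched as follows. This is a minimal illustration, not a reference implementation: the function names encode_base62 and decode_base62 are assumptions, and the alphabet order (0-9, a-z, A-Z) simply follows the order given in the answer.

```python
import string

# Alphabet order follows the answer above: 0-9, then a-z, then A-Z.
ALPHABET = string.digits + string.ascii_lowercase + string.ascii_uppercase
BASE = len(ALPHABET)  # 62


def encode_base62(n: int) -> str:
    """Encode a non-negative integer ID (e.g. a DB auto-increment) as base62."""
    if n < 0:
        raise ValueError("ID must be non-negative")
    if n == 0:
        return ALPHABET[0]
    chars = []
    while n > 0:
        n, rem = divmod(n, BASE)
        chars.append(ALPHABET[rem])
    # Digits were produced least-significant first, so reverse them.
    return "".join(reversed(chars))


def decode_base62(code: str) -> int:
    """Inverse of encode_base62; raises ValueError on characters outside the alphabet."""
    n = 0
    for ch in code:
        n = n * BASE + ALPHABET.index(ch)
    return n
```

Codes stay at 7 characters or fewer until the counter reaches 62^7 = 3,521,614,606,208, at which point the encoded length grows to 8.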

Question 2

Should a URL shortener return HTTP 301 or 302 for redirects?

Accepted Answer

HTTP 301 (Moved Permanently): browsers cache the redirect, so subsequent requests go straight to the destination without hitting your server. This reduces load but loses click analytics for repeat visits. HTTP 302 (Found, historically "Moved Temporarily"): not cached by default, so every redirect request hits your server, enabling accurate click counting, geographic data, and referrer tracking. Recommended: use 302 by default to keep analytics capability. Offer 301 as an option for users who do not need analytics and want maximum performance. Never use 301 for links that may expire or change - a browser that has cached the redirect keeps sending users to the old destination even after the URL is updated or deleted on your server.
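One way to express this policy in code is sketched below. The Link record and its fields (analytics_enabled, expires_at, mutable) are hypothetical names chosen for illustration; the explicit Cache-Control header on the 302 is an optional extra, not something the answer requires.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass
class Link:
    # Hypothetical link record; field names are assumptions for this sketch.
    destination: str
    analytics_enabled: bool = True
    expires_at: Optional[float] = None  # Unix timestamp; None means no expiry
    mutable: bool = False               # True if the destination may be edited


def redirect_response(link: Link) -> Tuple[int, Dict[str, str]]:
    """Pick the status code and headers for a redirect.

    301 is used only when the owner opted out of analytics AND the link can
    never expire or change; everything else gets a 302.
    """
    permanent = (
        not link.analytics_enabled
        and link.expires_at is None
        and not link.mutable
    )
    if permanent:
        return 301, {"Location": link.destination}
    # A 302 is not cached by default; the header just makes that explicit.
    return 302, {"Location": link.destination, "Cache-Control": "no-store"}
```

With these defaults, a freshly created link redirects with 302, and 301 has to be requested deliberately by disabling analytics on a link that has no expiry and is not editable.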

Question 3

How do you scale a URL shortener to handle 100K redirects per second?

Accepted Answer

The redirect path is read-heavy (10:1 to 100:1 read/write ratio), so caching is the primary scaling mechanism. Redis cluster: store short_code -> original_url with a TTL matching the link expiry (or a 24h default). A cache hit rate above 90% is expected for popular links. Redirect servers are stateless and scale horizontally behind a load balancer. The database is hit only on a cache miss (cold start or long-tail links). Write path: URL creation at ~1000/sec is well within the capacity of a single DB primary. For the DB, use a single table indexed on short_code, with a read replica for any reporting queries. At extreme scale, shard the URL table by the first character of the short_code (62 shards). This balances load only when codes are spread evenly over the alphabet, which holds for random or hash-based codes but not for sequential counter-based codes, where new codes share the same leading character for long stretches.
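The cache-miss flow described above is the cache-aside pattern. The sketch below uses a small in-memory TTLCache class as a stand-in for Redis so that it runs without a server; TTLCache, resolve, and db_lookup are hypothetical names, and a real deployment would replace the stand-in with Redis GET and SET with an expiry.

```python
import time
from typing import Callable, Dict, Optional, Tuple

DEFAULT_TTL_SECONDS = 24 * 60 * 60  # the 24h default mentioned above


class TTLCache:
    """In-memory stand-in for Redis GET / SET-with-expiry (illustration only)."""

    def __init__(self, clock: Callable[[], float] = time.time) -> None:
        self._clock = clock
        self._data: Dict[str, Tuple[str, float]] = {}

    def get(self, key: str) -> Optional[str]:
        entry = self._data.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if self._clock() >= expires_at:
            del self._data[key]  # expired: behave like a miss
            return None
        return value

    def set(self, key: str, value: str, ttl: float) -> None:
        self._data[key] = (value, self._clock() + ttl)


def resolve(
    short_code: str,
    cache: TTLCache,
    db_lookup: Callable[[str], Optional[str]],
    ttl: float = DEFAULT_TTL_SECONDS,
) -> Optional[str]:
    """Cache-aside lookup: try the cache, fall back to the DB, then populate."""
    url = cache.get(short_code)
    if url is not None:
        return url
    url = db_lookup(short_code)  # only reached on a cache miss
    if url is not None:
        cache.set(short_code, url, ttl)
    return url
```

Unknown codes are not cached in this sketch, so repeated requests for a nonexistent code reach the database every time; whether to cache such misses is a separate design decision.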

Question 4

How do you track click analytics for a URL shortener?

Accepted Answer

On each redirect, publish an event to Kafka asynchronously (do not block the 302 response on analytics writes). Event payload: {short_code, timestamp, ip_address, user_agent, referer_header}. A Kafka consumer enriches each event with geo data (IP-to-country lookup) and writes it to a click_events table. A scheduled aggregation job (runs every hour) computes summary stats: total clicks, clicks per country, clicks per referrer, and click time distribution. Store the summaries in a stats table for fast dashboard queries. For real-time counters (to show total clicks immediately): INCR click_count:{short_code} in Redis on each redirect, and persist the value to the DB in the hourly batch. This pattern absorbs the high write volume of analytics without blocking redirects.
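The hourly aggregation step might look roughly like the sketch below. It works on an in-memory list of events rather than a real click_events table or Kafka topic. The ClickEvent fields mirror the payload above, while aggregate, the ip_to_country callback, and the "(direct)" label for an empty referer are assumptions made for illustration.

```python
from collections import Counter
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Callable, Dict, Iterable


@dataclass(frozen=True)
class ClickEvent:
    # Fields mirror the event payload described above.
    short_code: str
    timestamp: float      # Unix seconds
    ip_address: str
    user_agent: str
    referer_header: str   # empty string when the header was absent


def aggregate(
    events: Iterable[ClickEvent],
    ip_to_country: Callable[[str], str],
) -> Dict[str, dict]:
    """Compute per-link summary stats from raw click events.

    ip_to_country is a hypothetical geo lookup supplied by the caller.
    """
    stats: Dict[str, dict] = {}
    for ev in events:
        s = stats.setdefault(ev.short_code, {
            "total_clicks": 0,
            "by_country": Counter(),
            "by_referrer": Counter(),
            "by_hour": Counter(),  # click time distribution, hour of day in UTC
        })
        s["total_clicks"] += 1
        s["by_country"][ip_to_country(ev.ip_address)] += 1
        s["by_referrer"][ev.referer_header or "(direct)"] += 1
        hour = datetime.fromtimestamp(ev.timestamp, tz=timezone.utc).hour
        s["by_hour"][hour] += 1
    return stats
```

The returned dictionary has one entry per short code, which maps naturally onto one row per link in the stats table.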

Question 5

How do you handle custom aliases in a URL shortener?

Accepted Answer

Custom aliases are user-defined short codes (e.g., /my-company-blog). Validation: reject anything on a reserved words list (api, admin, login, static, health, etc.), enforce a max length (20 chars), and allow only alphanumeric characters and hyphens. Uniqueness: put a unique constraint on short_code and insert with conflict detection (in PostgreSQL, for example, INSERT ... ON CONFLICT DO NOTHING). A SELECT FOR UPDATE followed by an INSERT is weaker: in PostgreSQL it takes no lock when the row does not exist yet, so two concurrent requests can both pass the check. Store a boolean custom_alias flag on the URL record to distinguish custom from auto-generated codes. Rate limit custom alias creation per user to prevent squatting. Plans: allow custom aliases on paid plans; free plans get auto-generated codes only. Conflict resolution: if the requested alias is taken, return a 409 Conflict error with a suggestion of a similar available alias.
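The format checks above (everything except the uniqueness check, which belongs to the database) can be sketched as a single function. The name validate_alias, the ASCII-only reading of "alphanumeric", and the case-insensitive reserved-word comparison are assumptions; the reserved list contains only the examples named in the answer.

```python
import re
from typing import Optional

# Only the reserved words named in the answer; a real list would be longer.
RESERVED_WORDS = frozenset({"api", "admin", "login", "static", "health"})
MAX_ALIAS_LENGTH = 20
# Assumption: "alphanumeric" means ASCII letters and digits.
ALIAS_PATTERN = re.compile(r"[A-Za-z0-9-]+")


def validate_alias(alias: str) -> Optional[str]:
    """Return None if the alias passes the format checks, else an error message."""
    if not alias:
        return "alias must not be empty"
    if len(alias) > MAX_ALIAS_LENGTH:
        return f"alias must be at most {MAX_ALIAS_LENGTH} characters"
    if not ALIAS_PATTERN.fullmatch(alias):
        return "alias may contain only letters, digits, and hyphens"
    # Assumption: reserved words are matched case-insensitively.
    if alias.lower() in RESERVED_WORDS:
        return "alias is reserved"
    return None
```

Running these checks before touching the database keeps malformed requests cheap; only aliases that pass go on to the insert that detects conflicts.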

URL Shortener System Low-Level Design

Requirements

Short Code Generation

1. Hash-Based

2. Counter-Based

3. Pre-Generated Pool

Base62 Encoding

Database Schema

Redirect Performance: 301 vs 302

Caching Layer

Analytics

Custom Aliases

Scale Math

Horizontal Scaling