Requirements
Functional requirements for a URL shortener:
- Create a short URL from a long original URL
- Redirect users from the short URL to the original URL
- Optional: allow custom aliases (e.g., example.com/my-brand)
- Optional: URL expiry after a set time
- Analytics: track click counts, referrers, and geographic data
Short Code Generation
Three viable approaches to generating a 7-character short code:
1. Hash-Based
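Take the MD5 or SHA-256 hash of the original URL and use the first 7 characters of the digest. If a collision occurs (a different URL maps to the same code), append a counter suffix to the input and rehash until the code is unique. Note that 7 characters of a hex digest cover only 16^7 (about 268 million) codes; base62-encode the digest first to use the full 62^7 space. Simple, but collision handling adds complexity at scale.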
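The hash-and-retry flow can be sketched as follows. This is a minimal illustration, not a production implementation: the `store` dict stands in for the database, and the `hash_code` and `shorten` names and the `#<attempt>` suffix format are assumptions made for the example.

```python
import hashlib

def hash_code(url: str, attempt: int = 0, length: int = 7) -> str:
    """Return the first `length` hex characters of SHA-256 over the URL."""
    # Attempt 0 hashes the URL as-is; later attempts append a counter suffix.
    material = url if attempt == 0 else f"{url}#{attempt}"
    return hashlib.sha256(material.encode("utf-8")).hexdigest()[:length]

def shorten(url: str, store: dict, max_attempts: int = 10) -> str:
    """Store `url` under a unique code, rehashing when a collision occurs."""
    for attempt in range(max_attempts):
        code = hash_code(url, attempt)
        existing = store.get(code)
        if existing is None:
            store[code] = url   # free slot: claim it
            return code
        if existing == url:
            return code         # this URL was shortened before: reuse its code
        # A different URL owns this code: loop and rehash with the next suffix.
    raise RuntimeError("no free code found within max_attempts")
```

In a real system the lookup and insert would need to be a single atomic operation (for example an INSERT that relies on the unique constraint), otherwise two concurrent requests could claim the same code.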
2. Counter-Based
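Use an auto-incrementing integer ID from the database and encode it as base62. ID 1 becomes “0000001”, and ID 3,521,614,606,207 (62^7 - 1, the largest 7-character value) becomes “ZZZZZZZ” with the alphabet shown below. Guaranteed unique, no collision handling needed. Risk: sequential IDs are guessable and expose creation volume. A distributed ID generator (Snowflake) makes IDs non-consecutive, but its 64-bit IDs encode to as many as 11 base62 characters rather than 7.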
3. Pre-Generated Pool
Generate random 7-character base62 codes in bulk offline, check uniqueness, and store unused codes in a Redis set. On each URL creation request, pop one code from the set. A worker process refills the pool when it drops below a threshold. This eliminates per-request uniqueness checks entirely.
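A rough sketch of the pool mechanics, using in-memory Python sets in place of Redis so it runs standalone. The `CodePool` class, its `target_size` and `low_water` parameters, and the inline refill are all assumptions for illustration; with Redis, the pop step would typically map to SPOP on the set of unused codes, and the refill would run in a separate worker.

```python
import secrets

ALPHABET = "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ"

class CodePool:
    """In-memory stand-in for the Redis set of unused short codes."""

    def __init__(self, target_size: int = 1000, low_water: int = 200):
        self.target_size = target_size
        self.low_water = low_water
        self.unused = set()   # plays the role of the Redis set
        self.issued = set()   # plays the role of codes already in the urls table

    def refill(self) -> None:
        """Worker step: top the pool up with fresh random codes."""
        while len(self.unused) < self.target_size:
            code = "".join(secrets.choice(ALPHABET) for _ in range(7))
            if code not in self.issued:
                self.unused.add(code)   # the set dedupes within the pool

    def take(self) -> str:
        """Request path: pop one code without any uniqueness check."""
        if len(self.unused) <= self.low_water:
            self.refill()   # done inline here only to keep the sketch short
        code = self.unused.pop()
        self.issued.add(code)
        return code
```

The uniqueness work moves entirely into `refill`, which is the point of the approach: the request path is a single pop.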
Base62 Encoding
Base62 uses 62 characters: digits 0-9, lowercase a-z, and uppercase A-Z. With 7 characters the total capacity is 62^7 = 3,521,614,606,208, over 3.5 trillion unique short codes. At the 100M new URLs per day assumed in the scale math later on, that lasts roughly 96 years.
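def encode_base62(num):
    chars = "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ"
    result = []
    # Peel off base62 digits, least significant first
    while num:
        result.append(chars[num % 62])
        num //= 62
    # Reverse to most-significant-first and left-pad with "0" to 7 characters
    return ''.join(reversed(result)).zfill(7)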
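For completeness, here is a sketch of the inverse operation alongside a self-contained copy of the encoder. The `decode_base62` function and the `CHARS`/`INDEX` names are additions for illustration and are not part of the original design.

```python
CHARS = "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ"
INDEX = {ch: i for i, ch in enumerate(CHARS)}

def encode_base62(num: int, width: int = 7) -> str:
    """Encode a non-negative integer as base62, left-padded to `width`."""
    digits = []
    while num:
        num, rem = divmod(num, 62)
        digits.append(CHARS[rem])
    return "".join(reversed(digits)).rjust(width, CHARS[0])

def decode_base62(code: str) -> int:
    """Decode a base62 string back to the integer ID."""
    num = 0
    for ch in code:
        num = num * 62 + INDEX[ch]
    return num

print(encode_base62(1))            # 0000001
print(encode_base62(62**7 - 1))    # ZZZZZZZ
print(decode_base62("0000010"))    # 62
```

Decoding lets a counter-based design recover the numeric primary key directly from the short code, which can be useful if lookups go by `id` rather than by `short_code`.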
Database Schema
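CREATE TABLE urls (
    id            BIGINT PRIMARY KEY AUTO_INCREMENT,
    -- Binary collation keeps base62 codes case-sensitive ("abc" != "ABC").
    -- UNIQUE already creates the index used by redirect lookups,
    -- so no separate index on short_code is needed.
    short_code    VARCHAR(20) COLLATE utf8mb4_bin NOT NULL UNIQUE,
    original_url  TEXT NOT NULL,
    user_id       BIGINT,
    created_at    TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
    expires_at    TIMESTAMP NULL,
    custom_alias  BOOLEAN DEFAULT FALSE,
    INDEX idx_user_id (user_id)
);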
The index on short_code serves the critical read path. original_url is TEXT because URLs can exceed 2,083 characters (the old Internet Explorer limit). A NULL expires_at means no expiry.
Redirect Performance: 301 vs 302
This is a deliberate design decision:
- HTTP 301 (Permanent): Browser caches the redirect. Subsequent visits skip the server entirely, which gives maximum performance. Downside: repeat clicks from a cached browser cannot be tracked, and destination updates do not reach browsers that already cached the redirect.
- HTTP 302 (Temporary): Browser always hits the server. Enables click tracking, destination updates, and expiry enforcement. Slight latency overhead, negligible with caching.
Recommendation: use 302 by default for analytics. Offer 301 as an opt-in for power users who do not need tracking and want maximum redirect speed.
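The recommendation could translate into a small helper like the sketch below. `build_redirect` and its `track_clicks` flag are hypothetical names chosen for this example and do not belong to any particular web framework.

```python
from http import HTTPStatus

def build_redirect(original_url: str, track_clicks: bool = True):
    """Return (status_code, headers) for a redirect response."""
    # 302 keeps every click flowing through the server, so it can be counted;
    # 301 lets the browser cache the redirect and skip the server next time.
    status = HTTPStatus.FOUND if track_clicks else HTTPStatus.MOVED_PERMANENTLY
    headers = {"Location": original_url}
    return int(status), headers

print(build_redirect("https://example.com/landing"))
# (302, {'Location': 'https://example.com/landing'})
```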
Caching Layer
Redis is the centerpiece of redirect performance. Store each mapping as a plain string key, url:{short_code} -> original_url, so that every entry can carry its own TTL.
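# Redirect handler pseudocode (redis, db, now, and redirect_to are placeholders)
def redirect(short_code):
    # Fast path: serve straight from the cache
    url = redis.get(f"url:{short_code}")
    if url:
        return redirect_to(url)
    # Cache miss: fall back to the database
    row = db.query("SELECT original_url, expires_at FROM urls WHERE short_code = %s", short_code)
    if not row or (row.expires_at and row.expires_at <= now()):
        return 404
    # total_seconds() covers the whole interval; .seconds would drop full days
    remaining = int((row.expires_at - now()).total_seconds()) if row.expires_at else 86400
    # Cap at 24 hours and keep the TTL at least 1 second, since SETEX rejects 0
    ttl = max(1, min(remaining, 86400))
    redis.setex(f"url:{short_code}", ttl, row.original_url)
    return redirect_to(row.original_url)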
The cache TTL is the time remaining until expiry, capped at 24 hours (which is also the default when no expiry is set). With a hot dataset, expect a 90%+ cache hit rate. The DB only sees cache misses and new URLs.
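The TTL rule can be isolated into a small, testable function. This is a sketch under assumptions: `cache_ttl` and `DEFAULT_TTL` are illustrative names, and returning None for an already-expired link is a convention chosen here. It uses `timedelta.total_seconds()` rather than `.seconds`, because `.seconds` holds only the sub-day remainder of an interval.

```python
from datetime import datetime, timedelta, timezone

DEFAULT_TTL = 86400  # 24 hours, in seconds

def cache_ttl(expires_at, now):
    """Return the cache TTL in seconds, or None if the link has expired."""
    if expires_at is None:
        return DEFAULT_TTL              # no expiry: cache for the default period
    remaining = int((expires_at - now).total_seconds())
    if remaining <= 0:
        return None                     # expired: do not cache
    return min(remaining, DEFAULT_TTL)  # never cache past the expiry time

now = datetime(2025, 1, 1, tzinfo=timezone.utc)
print(cache_ttl(None, now))                       # 86400
print(cache_ttl(now + timedelta(hours=1), now))   # 3600
print(cache_ttl(now + timedelta(days=3), now))    # 86400
```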
Analytics
Do not write analytics synchronously on every redirect – it would add latency and become a bottleneck at scale.
CREATE TABLE click_events (
    id          BIGINT PRIMARY KEY AUTO_INCREMENT,
    short_code  VARCHAR(20) NOT NULL,
    clicked_at  TIMESTAMP NOT NULL,
    ip_hash     VARCHAR(64), -- hashed for privacy
    country     VARCHAR(2),
    referrer    VARCHAR(500),
    INDEX idx_short_code_time (short_code, clicked_at)
);
CREATE TABLE click_summary (
    short_code   VARCHAR(20) NOT NULL,
    date         DATE NOT NULL,
    click_count  INT DEFAULT 0,
    PRIMARY KEY (short_code, date)
);
On each redirect, publish a lightweight event to Kafka. A consumer writes to click_events. An hourly batch job aggregates into click_summary. This decouples analytics writes from the redirect hot path entirely.
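The aggregation step of that pipeline might look like the following sketch. It assumes events arrive as dicts with `short_code` and `clicked_at` keys mirroring the click_events columns; the `aggregate_clicks` name is hypothetical, and the Kafka consumer and database writes are left out.

```python
from collections import Counter
from datetime import datetime

def aggregate_clicks(events):
    """Roll raw click events up into (short_code, date) -> click_count."""
    summary = Counter()
    for event in events:
        day = event["clicked_at"].date()   # bucket by calendar day
        summary[(event["short_code"], day)] += 1
    return summary

events = [
    {"short_code": "abc1234", "clicked_at": datetime(2025, 1, 1, 9, 0)},
    {"short_code": "abc1234", "clicked_at": datetime(2025, 1, 1, 23, 59)},
    {"short_code": "abc1234", "clicked_at": datetime(2025, 1, 2, 0, 1)},
    {"short_code": "xyz7890", "clicked_at": datetime(2025, 1, 1, 12, 0)},
]
for (code, day), count in sorted(aggregate_clicks(events).items()):
    print(code, day, count)
```

Each resulting pair corresponds to one row of click_summary, so the batch job can upsert the counts keyed on (short_code, date).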
Custom Aliases
Rules for custom aliases:
- Must be unique across all short codes (same urls table, same uniqueness constraint)
- Maintain a reserved words list: admin, api, login, static, assets, www, etc.
- Maximum length: 20 characters
- Allowed characters: a-z, 0-9, hyphens (no uppercase for custom aliases to avoid confusion)
- On conflict: return a clear error, do not silently modify the alias
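These rules can be sketched as a single validation function. The names `validate_alias`, `RESERVED`, and `ALIAS_RE` are illustrative, the `taken` set stands in for a lookup against the urls table, and the minimum length of 1 is an assumption since the rules above only state a maximum.

```python
import re
from typing import Optional

RESERVED = {"admin", "api", "login", "static", "assets", "www"}
ALIAS_RE = re.compile(r"[a-z0-9-]{1,20}")   # lowercase, digits, hyphens only

def validate_alias(alias: str, taken: set) -> Optional[str]:
    """Return an error message, or None if the alias is acceptable."""
    if not ALIAS_RE.fullmatch(alias):
        return "alias must be 1-20 characters of a-z, 0-9, or hyphen"
    if alias in RESERVED:
        return "alias is reserved"
    if alias in taken:
        return "alias is already in use"   # clear error, no silent rewrite
    return None

print(validate_alias("my-brand", set()))   # None
print(validate_alias("My-Brand", set()))   # alias must be 1-20 characters of a-z, 0-9, or hyphen
print(validate_alias("admin", set()))      # alias is reserved
```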
Scale Math
Let’s size the system:
- 100M URLs created per day = 100,000,000 / 86,400 = ~1,157 writes/sec
- 10B redirects per day = 10,000,000,000 / 86,400 = ~115,700 reads/sec
This is a heavily read-skewed system (~100:1 read/write ratio). A 3-node Redis cluster handles 100K+ reads/sec easily. DB writes at ~1,157/sec are well within the capacity of a single primary MySQL/PostgreSQL instance (a rough ceiling is 10K-50K simple writes/sec, depending on hardware and schema).
Storage: 100M URLs/day * 365 days * ~500 bytes/row = ~18 TB/year. Partition the table by created_at and archive cold data to object storage.
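The arithmetic above can be reproduced in a few lines. The inputs are the figures stated in this section; the keyspace lifetime is an extra derived number, and storage uses decimal terabytes (10^12 bytes).

```python
SECONDS_PER_DAY = 86_400
urls_per_day = 100_000_000
redirects_per_day = 10_000_000_000
bytes_per_row = 500   # rough per-row estimate from the text

writes_per_sec = urls_per_day / SECONDS_PER_DAY
reads_per_sec = redirects_per_day / SECONDS_PER_DAY
read_write_ratio = redirects_per_day / urls_per_day
storage_tb_per_year = urls_per_day * 365 * bytes_per_row / 1e12
years_until_exhausted = 62**7 / (urls_per_day * 365)   # 7-char base62 keyspace

print(f"{writes_per_sec:,.0f} writes/sec")                         # 1,157 writes/sec
print(f"{reads_per_sec:,.0f} reads/sec")                           # 115,741 reads/sec
print(f"{read_write_ratio:.0f}:1 read/write ratio")                # 100:1 read/write ratio
print(f"{storage_tb_per_year:.2f} TB/year")                        # 18.25 TB/year
print(f"{years_until_exhausted:.1f} years of 7-character codes")   # 96.5 years of 7-character codes
```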
Horizontal Scaling
- Redirect servers: stateless – any server can handle any redirect. Scale horizontally behind a load balancer. Auto-scale based on CPU/RPS.
- Redis: Redis Cluster with hash slots. Shard by short_code. Add replicas for read scaling.
- Database: Shard by hash of short_code across multiple DB primaries. Each shard handles a subset of short codes. Use read replicas for any reporting queries.
- ID generation: Use a Snowflake-style distributed ID service or a dedicated counter service per shard to avoid global coordination.
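Routing a short code to a database shard could be sketched as below. The `shard_for` function and the choice of SHA-256 with a simple modulo are assumptions for illustration. A stable hash is used instead of Python's built-in `hash()`, which is randomized per process for strings and would send the same code to different shards on different servers.

```python
import hashlib

def shard_for(short_code: str, num_shards: int) -> int:
    """Map a short code to a shard index in [0, num_shards)."""
    digest = hashlib.sha256(short_code.encode("utf-8")).digest()
    # Use the first 8 bytes as an unsigned integer, then reduce modulo shards.
    return int.from_bytes(digest[:8], "big") % num_shards

for code in ("0000001", "aZ3kP9q", "my-brand"):
    print(code, "-> shard", shard_for(code, 4))
```

Plain modulo routing remaps most keys whenever the shard count changes, so a design that expects to add shards often may prefer consistent hashing or a fixed number of logical shards mapped onto physical servers.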