System Design: URL Shortener and Click Analytics Platform (2025)

Requirements and Scale

Functional: shorten long URLs, redirect short URLs to originals, track click analytics (count, geography, device, referrer), support custom aliases, set expiry. Non-functional: 100M URLs created/month (40 writes/sec), 10B redirects/month (4000 reads/sec – read-heavy), P99 redirect latency under 10ms. The redirect path is the critical hot path – everything optimizes for it.
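The headline throughput figures follow from simple back-of-envelope arithmetic. A quick sanity check, assuming a 30-day month (the figures in the text are rounded):

```python
# Back-of-envelope check of the stated throughput numbers
# (assuming a 30-day month; the text rounds these up).
SECONDS_PER_MONTH = 30 * 24 * 3600  # 2,592,000

writes_per_sec = 100_000_000 / SECONDS_PER_MONTH    # ~38.6, quoted as ~40
reads_per_sec = 10_000_000_000 / SECONDS_PER_MONTH  # ~3858, quoted as ~4000
read_write_ratio = reads_per_sec / writes_per_sec   # 100:1

print(f"{writes_per_sec:.1f} writes/s, {reads_per_sec:.0f} reads/s, "
      f"{read_write_ratio:.0f}:1 read/write")
```

The 100:1 read/write ratio is why everything below optimizes the redirect path first.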

URL Shortening and ID Generation

Core challenge: generate a short, unique, URL-safe identifier for each long URL. Three common approaches:

Base62 encoding of an auto-increment ID: use a counter (a database sequence or a distributed ID generator such as Snowflake) and encode the integer in base62 (a-z, A-Z, 0-9). Seven base62 characters give 62^7 ≈ 3.5 trillion unique codes. Pros: simple, predictable length, no collisions. Cons: sequential IDs are guessable, which exposes creation volume and enables enumeration.

Random code: generate 7 random base62 characters and check the database for a collision. Pros: unguessable. Cons: every write needs a collision check; retries are rare but possible.

Hash of the URL (MD5/SHA): hash the long URL and take the first 7 characters. Pros: deterministic, so the same URL always yields the same short code. Cons: truncated hashes can collide, and mapping duplicate submissions to one shared short code may be undesirable (it merges different users' analytics).

Recommended: base62-encode a distributed Snowflake-style ID. This combines guaranteed uniqueness with counter-class write speed, and the timestamp-plus-worker structure of Snowflake IDs avoids a dense, trivially enumerable sequence.
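The base62 scheme above can be sketched in a few lines. This is a minimal illustration: a production system would take `n` from a Snowflake-style ID service rather than a local counter, and the alphabet ordering is arbitrary as long as it is fixed.

```python
# Minimal base62 encoder for integer IDs (illustrative sketch).
ALPHABET = "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ"
BASE = len(ALPHABET)  # 62

def encode_base62(n: int) -> str:
    """Encode a non-negative integer as a base62 string."""
    if n == 0:
        return ALPHABET[0]
    digits = []
    while n > 0:
        n, rem = divmod(n, BASE)
        digits.append(ALPHABET[rem])
    return "".join(reversed(digits))  # most significant digit first

# Seven characters cover the stated capacity:
assert 62 ** 7 == 3_521_614_606_208  # ~3.5 trillion codes
```

Decoding is the mirror operation (multiply-accumulate over the alphabet indices), which is why the scheme needs no lookup table beyond the alphabet itself.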

Data Model

URLs table: url_id (BIGINT primary key), short_code (VARCHAR(8), unique index), long_url (TEXT), user_id, created_at, expires_at, is_active.

Clicks table: click_id, url_id, clicked_at, ip_hash (anonymized), country_code (2-char ISO code), city, device_type (MOBILE/DESKTOP/TABLET/BOT), os, browser, referrer_domain.

ClickAggregate table (pre-rolled): url_id, period_start (bucket start time), period_type (HOUR/DAY/MONTH), total_clicks, unique_ips, country_breakdown (JSONB), device_breakdown (JSONB).

Aggregates let analytics queries read a handful of bucket rows instead of scanning the raw clicks table.

Redirect Architecture – The Hot Path

# Hot path: short code -> long URL redirect; P99 target < 10 ms.
# Assumes initialized redis, db, and kafka clients in scope.
from datetime import datetime

def resolve_redirect(short_code: str, request_meta: dict) -> str | None:
    # 1. Check Redis
    url = redis.get(f"url:{short_code}")
    if url:
        # Fire-and-forget click event (async, non-blocking)
        kafka.produce("clicks", {"code": short_code, "meta": request_meta})
        return url

    # 2. Database lookup (cache miss)
    record = db.query("SELECT long_url, expires_at, is_active "
                      "FROM urls WHERE short_code = %s", short_code)
    if not record or not record.is_active:
        return None
    if record.expires_at and record.expires_at < datetime.utcnow():
        return None

    # 3. Populate cache
    redis.setex(f"url:{short_code}", 86400, record.long_url)
    kafka.produce("clicks", {"code": short_code, "meta": request_meta})
    return record.long_url

Click Analytics Pipeline

Raw click events flow through Kafka to a stream processor (Flink/Spark Streaming), which writes to two sinks: (1) the raw clicks table for full-fidelity queries, and (2) pre-aggregated counters updated every minute. Stream-processor responsibilities: IP geolocation (MaxMind GeoIP), user-agent parsing (device/OS/browser), bot filtering (known bot user agents, request-rate anomalies), and deduplication within one-minute windows for unique-IP counts. Aggregation granularity: hourly buckets kept for 90 days, daily buckets for 3 years, monthly buckets indefinitely. The analytics API reads aggregates for dashboard charts (O(buckets) per query) and raw clicks for paginated drill-down. Realtime counter: a Redis counter incremented on each click gives a live total with zero DB writes.
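A simplified consumer loop for this pipeline might look like the sketch below. The `is_bot`, `hour_bucket`, and `aggregate` helpers are illustrative stand-ins: a real deployment would run in Flink/Spark Streaming, use MaxMind GeoIP and a proper user-agent parser for enrichment, and dedup within one-minute windows rather than per hourly bucket as done here for brevity.

```python
# Illustrative click-event rollup: filter bots, bucket by hour,
# count clicks and unique IPs. `events` stands in for a Kafka consumer.
from collections import defaultdict
from datetime import datetime, timezone

BOT_MARKERS = ("bot", "crawler", "spider")

def is_bot(user_agent: str) -> bool:
    """Crude substring check; real systems also use rate anomalies."""
    ua = user_agent.lower()
    return any(marker in ua for marker in BOT_MARKERS)

def hour_bucket(ts: float) -> str:
    """Truncate a Unix timestamp to its UTC hour bucket."""
    return datetime.fromtimestamp(ts, tz=timezone.utc).strftime("%Y-%m-%dT%H:00")

def aggregate(events):
    # (short_code, hour) -> click count and unique-IP set
    buckets = defaultdict(lambda: {"clicks": 0, "ips": set()})
    for ev in events:
        if is_bot(ev["user_agent"]):
            continue  # drop bot traffic before it pollutes the counters
        key = (ev["code"], hour_bucket(ev["ts"]))
        buckets[key]["clicks"] += 1
        buckets[key]["ips"].add(ev["ip_hash"])  # set dedups unique IPs
    return {k: {"clicks": v["clicks"], "unique_ips": len(v["ips"])}
            for k, v in buckets.items()}
```

The output rows map directly onto the ClickAggregate table: one upsert per (url, bucket) pair instead of one write per click.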

Custom Aliases and Expiry

Custom aliases: allow users to specify their own short code (e.g., /my-promo). Check uniqueness against the urls table and enforce format rules: 3-32 characters, alphanumeric plus hyphens, and a reserved-word blocklist (api, admin, static, login). Expiry: store an expires_at timestamp; the redirect service checks it inline. A daily background job marks expired URLs is_active=false and purges their Redis cache entries. Expired short codes become eligible for reuse only after a configurable cooldown (default 30 days), so stale cached links are not silently redirected to a new destination. Deletion is a soft delete (is_active=false) rather than a hard delete, preserving click analytics history.
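The alias rules above translate directly into a validation check. A minimal sketch (the uniqueness check against the urls table is elided; `validate_alias` is an illustrative name):

```python
import re

RESERVED = {"api", "admin", "static", "login"}
# 3-32 chars, alphanumeric plus hyphens, per the rules above.
ALIAS_RE = re.compile(r"^[A-Za-z0-9-]{3,32}$")

def validate_alias(alias: str) -> bool:
    """Return True if a custom alias satisfies the format rules."""
    if alias.lower() in RESERVED:
        return False  # blocklisted regardless of casing
    return bool(ALIAS_RE.fullmatch(alias))
```

Running the format check before the database uniqueness lookup rejects most bad input without a round trip.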


Scaling the Read and Write Paths

Read path: a CDN caches popular short codes at edge nodes globally (~2ms, no origin hit for cached codes); a Redis cluster holds the active hot set (~0.5ms); the database is hit only on cold cache misses. Write path: URL creation at 40/sec is trivial for any relational database, and analytics ingestion via Kafka decouples write amplification from the hot path. Horizontal scaling: a stateless redirect service behind a load balancer, a Redis cluster with read replicas, and Kafka topics sharded by short_code prefix.

