What Is a Link Analytics Service?
A Link Analytics Service tracks every interaction with a shortened URL or QR code and surfaces aggregated statistics: click counts, geographic distribution, referrer breakdown, device types, and time-series trends. It is the reporting backbone for URL shorteners and QR platforms. The core challenge is ingesting high-velocity click events without impacting redirect latency, then making aggregated data queryable in near-real-time.
Data Model
Analytics data lives in two tiers: a raw event store for auditability and a pre-aggregated summary store for fast dashboard queries.
-- Raw click events (append-only, partitioned by month)
CREATE TABLE click_event (
  id BIGINT UNSIGNED AUTO_INCREMENT,
  link_id BIGINT UNSIGNED NOT NULL,
  clicked_at DATETIME(3) NOT NULL,
  country_code CHAR(2),
  region VARCHAR(64),
  city VARCHAR(128),
  referrer VARCHAR(2048),
  user_agent VARCHAR(512),
  device_type ENUM('desktop','mobile','tablet','bot'),
  browser VARCHAR(64),
  os VARCHAR(64),
  ip_hash CHAR(64), -- SHA-256 of IP for unique-visitor estimation
  PRIMARY KEY (id, clicked_at) -- MySQL requires the partitioning column in every unique key
) PARTITION BY RANGE (YEAR(clicked_at) * 100 + MONTH(clicked_at)) (
  PARTITION p202501 VALUES LESS THAN (202502),
  PARTITION p202502 VALUES LESS THAN (202503),
  PARTITION pmax VALUES LESS THAN MAXVALUE -- split out new months as they arrive
);
-- Pre-aggregated hourly rollups
CREATE TABLE link_stats_hourly (
  link_id BIGINT UNSIGNED NOT NULL,
  hour_bucket DATETIME NOT NULL, -- truncated to the hour
  total_clicks BIGINT DEFAULT 0,
  unique_ips BIGINT DEFAULT 0, -- HyperLogLog estimate, persisted by the nightly rollup
  PRIMARY KEY (link_id, hour_bucket)
);
-- Dimension breakdown table
CREATE TABLE link_stats_dimension (
  link_id BIGINT UNSIGNED NOT NULL,
  day_bucket DATE NOT NULL,
  dimension VARCHAR(32) NOT NULL, -- e.g., 'country', 'device', 'referrer'
  value VARCHAR(128) NOT NULL, -- long referrer values are truncated to fit
  click_count BIGINT DEFAULT 0,
  PRIMARY KEY (link_id, day_bucket, dimension, value)
);
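To show the payoff of the second tier, a dashboard endpoint can answer a country breakdown straight from the dimension table. A minimal sketch in Python, assuming conn is any DB-API 2.0 connection to the analytics database (the function name and default limits are illustrative):

# Hypothetical dashboard read: top countries for one link over the last 7 days.
# Touches only the pre-aggregated dimension table, never the raw event store.
from datetime import date, timedelta

def top_countries(conn, link_id, days=7, limit=10):
    since = date.today() - timedelta(days=days)
    sql = (
        "SELECT value AS country, SUM(click_count) AS clicks "
        "FROM link_stats_dimension "
        "WHERE link_id = %s AND dimension = 'country' AND day_bucket >= %s "
        "GROUP BY value ORDER BY clicks DESC LIMIT %s"
    )
    with conn.cursor() as cur:
        cur.execute(sql, (link_id, since, limit))
        return cur.fetchall()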
Click Tracking Workflow
The redirect hot path must stay fast. Click tracking is therefore fully asynchronous.
- Redirect service issues the 302 and immediately emits a lightweight event to a Kafka topic click-events. The emit is fire-and-forget with a local buffer to survive brief Kafka unavailability (see the producer sketch below).
- Stream processor (Flink or Kafka Streams) consumes the topic, enriches each event with GeoIP lookup and User-Agent parsing, and writes enriched records to the click_event table.
- Aggregation job runs on a micro-batch schedule (every 60 seconds) to upsert counts into link_stats_hourly and link_stats_dimension (a rollup sketch follows the enrichment step).
- Dashboard API reads only from the aggregated tables; raw events are queried only for exports or deep-dive forensics.
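The fire-and-forget emit might look like the following sketch, using the kafka-python client. The topic name comes from the text; the broker address, buffer size, and bare-bones drop counter are assumptions:

# Sketch: fire-and-forget click emit on the redirect hot path (kafka-python).
# acks=0 never waits for broker confirmation; buffer_memory is the local buffer
# that rides out brief broker unavailability; max_block_ms=0 fails fast when full.
import json
from kafka import KafkaProducer
from kafka.errors import KafkaError

producer = KafkaProducer(
    bootstrap_servers="kafka:9092",
    acks=0,
    max_block_ms=0,
    buffer_memory=64 * 1024 * 1024,
    value_serializer=lambda e: json.dumps(e).encode(),
)

clicks_dropped = 0  # exported as a real metric in production

def emit_click(link_id, ip, user_agent, referrer):
    global clicks_dropped
    event = {"link_id": link_id, "ip": ip, "ua": user_agent, "ref": referrer}
    try:
        producer.send("click-events", value=event)
    except KafkaError:
        clicks_dropped += 1  # accept minor under-counting over added redirect latency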
# Pseudo-code: stream enrichment step
def on_event(raw):
    geo = geoip_lookup(raw.ip)             # country_code, region, city
    ua = parse_user_agent(raw.user_agent)  # device_type, browser, os
    event = merge(raw, geo, ua)
    # The daily salt rotates the hash so one IP cannot be traced across days
    event.ip_hash = sha256(raw.ip + daily_salt)
    write_to_click_event(event)       # append the enriched record to the raw store
    emit_to_aggregation_queue(event)  # feed the 60-second micro-batch rollup
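The 60-second rollup can lean on MySQL upsert semantics. A minimal sketch, assuming batch is a list of enriched event dicts and conn a DB-API connection; dimension rollups work the same way against link_stats_dimension:

# Sketch: micro-batch upsert of hourly totals via ON DUPLICATE KEY UPDATE.
from collections import Counter

def rollup_hourly(conn, batch):
    # Collapse the batch to one (link_id, hour) -> count map before writing
    counts = Counter()
    for e in batch:
        hour = e["clicked_at"].replace(minute=0, second=0, microsecond=0)
        counts[(e["link_id"], hour)] += 1
    sql = (
        "INSERT INTO link_stats_hourly (link_id, hour_bucket, total_clicks) "
        "VALUES (%s, %s, %s) "
        "ON DUPLICATE KEY UPDATE total_clicks = total_clicks + VALUES(total_clicks)"
    )
    with conn.cursor() as cur:
        cur.executemany(sql, [(lid, h, n) for (lid, h), n in counts.items()])
    conn.commit()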
Unique Visitor Estimation
Exact unique-IP counting at scale is expensive. Use a HyperLogLog structure (native in Redis) per link per day. On each click, call PFADD link:{id}:uv:{date} {ip_hash}. Query with PFCOUNT for an estimate with <1% error and O(1) memory regardless of cardinality. Persist the HLL to the summary table during the nightly rollup.
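A minimal sketch of that flow with the redis-py client (key layout from the text; the salt source, host, and two-day TTL are assumptions):

# Sketch: per-link, per-day unique-visitor estimation via Redis HyperLogLog.
import hashlib
from datetime import date
import redis

r = redis.Redis(host="localhost", port=6379)

def record_visitor(link_id, ip, daily_salt):
    ip_hash = hashlib.sha256((ip + daily_salt).encode()).hexdigest()
    key = f"link:{link_id}:uv:{date.today().isoformat()}"
    r.pfadd(key, ip_hash)        # ~12 KB per HLL regardless of cardinality
    r.expire(key, 60 * 60 * 48)  # keep two days; the nightly rollup persists the count

def unique_visitors(link_id, day):
    return r.pfcount(f"link:{link_id}:uv:{day.isoformat()}")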
Failure Handling
- Kafka producer buffer full: Drop the event and increment a clicks_dropped counter. Accept minor under-counting rather than slow the redirect. Alert when the drop rate exceeds 0.1%.
- GeoIP service down: Enrich with empty geo fields and flag the record. A backfill job re-enriches flagged records once the service recovers.
- Aggregation job lag: Dashboard queries fall back to the last successful rollup and display a staleness indicator. Use a Kafka consumer group lag metric to trigger auto-scaling of the processor.
- Unbounded raw-data growth: Archive partitions older than 90 days to cold object storage (Parquet on S3). Query via Athena or BigQuery for historical reports without keeping old partitions hot in MySQL (see the archival sketch below).
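A minimal archival sketch, assuming pandas with pyarrow and s3fs installed; the bucket path and the p<yyyymm> partition naming (as in the DDL above) are illustrative:

# Sketch: export one monthly partition to Parquet on S3, then drop it from MySQL.
import pandas as pd

def archive_partition(conn, year_month):
    # year_month is an integer like 202501, matching the partition naming scheme
    df = pd.read_sql(
        "SELECT * FROM click_event "
        "WHERE YEAR(clicked_at) * 100 + MONTH(clicked_at) = %s",
        conn,
        params=(year_month,),
    )
    df.to_parquet(f"s3://analytics-archive/click_event/{year_month}.parquet")
    with conn.cursor() as cur:
        cur.execute(f"ALTER TABLE click_event DROP PARTITION p{year_month}")  # DDL auto-commits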
Scalability Considerations
- Event volume: At 100K redirects/sec, Kafka partitions should be sized at ~10K events/sec each. Use at least 10 partitions per topic; increase as volume grows.
- Write amplification: Batch inserts into click_event in micro-batches of 500-1000 rows to reduce per-row overhead (see the batching sketch after this list).
- Query isolation: Never run dashboard aggregation queries against the raw event table in production. All user-facing queries hit the pre-aggregated tables or a read replica.
- Time-zone handling: Store all timestamps in UTC. Apply time-zone offsets at the API layer when returning bucketed data to clients.
- Multi-tenant isolation: Shard link_stats_hourly and link_stats_dimension by link_id range or tenant ID to prevent noisy-neighbor effects on large accounts.
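A minimal batching sketch for the write-amplification point (buffer size and column subset are illustrative; conn is an assumed DB-API connection, and a production version would also flush on a timer):

# Sketch: buffer enriched rows in memory and flush them in micro-batches.
BATCH_SIZE = 500
_pending = []

def buffer_click(conn, row):
    # row = (link_id, clicked_at, country_code, referrer, device_type, ip_hash)
    _pending.append(row)
    if len(_pending) >= BATCH_SIZE:
        flush(conn)

def flush(conn):
    if not _pending:
        return
    with conn.cursor() as cur:
        cur.executemany(
            "INSERT INTO click_event "
            "(link_id, clicked_at, country_code, referrer, device_type, ip_hash) "
            "VALUES (%s, %s, %s, %s, %s, %s)",
            _pending,
        )
    conn.commit()  # one commit per batch instead of one per row
    _pending.clear()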
Summary
A Link Analytics Service succeeds by decoupling the critical redirect path from all analytics work. Kafka absorbs the event burst; a stream processor enriches and stores raw data; micro-batch jobs maintain fast-query aggregation tables. HyperLogLog handles unique-visitor estimation at negligible memory cost. Partition pruning and cold-tier archival keep the hot database lean as data accumulates over months and years.
Frequently Asked Questions
What are the main components of a link analytics system design?
A link analytics system needs an event ingestion layer (e.g., Kafka or Kinesis) to capture click events at high throughput, a stream processing layer (e.g., Flink or Spark Streaming) to enrich and aggregate events in near real-time, a time-series or columnar data store (e.g., ClickHouse, Druid, or BigQuery) for analytical queries, and a dashboard API that serves aggregated metrics to users.
How do you accurately count unique visitors in a link analytics system without storing every user ID?
Use probabilistic data structures such as HyperLogLog to estimate unique-visitor counts with very low memory overhead and configurable error rates (typically under 1%). For exact counts on smaller datasets, a bitmap (Roaring Bitmap) per time window per link is efficient. Storing a fingerprint of IP + User-Agent in a Bloom filter can also deduplicate clicks in real time before counting.
How do you design the storage layer for a link analytics system that must support both real-time and historical queries?
Use a lambda or kappa architecture. The speed layer writes pre-aggregated counters to a fast store like Redis for the most recent time windows. The batch layer writes raw or aggregated events to a columnar store like ClickHouse or BigQuery for historical range queries. The query API merges results from both layers, giving low-latency access to recent data and full historical depth.
How do you prevent bot traffic from polluting link analytics data?
Filter events server-side by matching the User-Agent string against known bot lists (the IAB/ABC list is a standard reference). Apply IP reputation scoring and rate limiting: if a single IP generates hundreds of clicks per minute, it is almost certainly automated. For higher accuracy, use a JavaScript challenge or a CAPTCHA on redirect pages, and cross-reference with browser fingerprinting signals before counting an event as a genuine human click.
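To make the bot-filtering answer concrete, a coarse server-side User-Agent check might look like the sketch below; the marker tuple is a tiny illustrative stand-in for a maintained bot list such as the IAB's:

# Sketch: first-pass bot detection by User-Agent substring match.
BOT_MARKERS = ("bot", "crawler", "spider", "curl", "wget", "python-requests")

def is_probable_bot(user_agent):
    ua = (user_agent or "").lower()
    return any(marker in ua for marker in BOT_MARKERS)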