Time-Series Database Low-Level Design

What is a Time-Series Database?

A time-series database (TSDB) is optimized for storing and querying data points indexed by time: metrics (CPU usage, request rate), IoT sensor readings, financial tick data, and application traces. Time-series data has distinct characteristics: high write volume (often hundreds of thousands of data points per second), write-once semantics (existing points are almost never updated), queries that are almost always range-based (last 1 hour, last 7 days), and strong recency bias (recent data is read far more often than old data). InfluxDB, TimescaleDB, Prometheus, and Druid are purpose-built TSDBs.

Requirements

  • Ingest 500K data points/second (metrics from 10K servers, 50 metrics each)
  • Point query: get CPU usage for server-42 at 2026-04-12 14:00:00
  • Range query: average CPU across all servers over the last 1 hour, grouped by 5-minute windows
  • Downsampling: automatically roll up high-resolution data to lower resolution as it ages
  • Data retention: keep raw data for 7 days, hourly aggregates for 1 year, daily aggregates forever
  • Query latency: <100ms for queries spanning up to 7 days of raw data
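These requirements imply a rough storage budget. A quick back-of-envelope calculation (assuming roughly 16 bytes per point after columnar compression — an assumption for illustration, not a number from the requirements):

```python
# Back-of-envelope sizing for the requirements above.
POINTS_PER_SEC = 500_000
BYTES_PER_POINT = 16          # assumed post-compression footprint
SECONDS_PER_DAY = 86_400

raw_points_7d = POINTS_PER_SEC * SECONDS_PER_DAY * 7
raw_bytes_7d = raw_points_7d * BYTES_PER_POINT

print(f"raw points over 7 days: {raw_points_7d:,}")            # 302,400,000,000
print(f"raw storage over 7 days: {raw_bytes_7d / 1e12:.1f} TB")  # 4.8 TB
```

Roughly 300 billion raw points and a few terabytes over the 7-day retention window — large, but well within what compressed, time-partitioned storage handles.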

Core Data Structure

Measurement: the "table" — e.g., "cpu_usage", "request_latency", "temperature"

Tags: indexed metadata — {"host": "server-42", "region": "us-east-1", "env": "prod"}
      Tags are used in WHERE clauses for filtering

Fields: actual values — {"value": 73.5}
        Fields are not indexed (their cardinality is effectively unbounded); they are the values you query and aggregate, not filter on

Timestamp: nanosecond precision Unix timestamp

Data point example:
  measurement="cpu_usage"
  tags={host="server-42", region="us-east-1"}
  fields={value=73.5}
  time=1712935200000000000
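One way to model a data point and its series identity in code (a hypothetical sketch, not any particular TSDB's API — the DataPoint type and series_key helper are illustrative names):

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class DataPoint:
    measurement: str
    tags: tuple       # sorted (key, value) pairs, kept hashable
    fields: tuple     # (name, value) pairs
    time_ns: int      # nanosecond-precision Unix timestamp

def series_key(measurement: str, tags: dict) -> str:
    """A series is identified by measurement + canonicalized (sorted) tag set."""
    return measurement + "," + ",".join(f"{k}={v}" for k, v in sorted(tags.items()))

p = DataPoint(
    measurement="cpu_usage",
    tags=(("host", "server-42"), ("region", "us-east-1")),
    fields=(("value", 73.5),),
    time_ns=1712935200000000000,
)
print(series_key(p.measurement, dict(p.tags)))
# cpu_usage,host=server-42,region=us-east-1
```

Canonicalizing the tag order matters: {host, region} and {region, host} must map to the same series, or the database would split one series into two.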

Storage Design

Time-series data is stored in time-ordered chunks. Within each chunk, data is column-oriented and compressed:

-- TimescaleDB (PostgreSQL extension) approach:
CREATE TABLE cpu_usage (
    time        TIMESTAMPTZ NOT NULL,
    host        TEXT,
    region      TEXT,
    value       DOUBLE PRECISION
);

-- Convert to hypertable: auto-partitions by time
SELECT create_hypertable('cpu_usage', 'time', chunk_time_interval => INTERVAL '1 day');

-- Each partition (chunk) is a 1-day range of data
-- TimescaleDB auto-creates and manages chunks
-- Compression: SELECT add_compression_policy('cpu_usage', INTERVAL '7 days');
-- Old chunks compressed to ~10% of raw size

-- Index for tag-based filtering
CREATE INDEX ON cpu_usage (host, time DESC);
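TimescaleDB handles chunk routing internally, but the core time-to-chunk mapping is simple integer arithmetic. A sketch (mirroring the 1-day chunk_time_interval above, not TimescaleDB's actual implementation):

```python
# Toy chunk routing for a hypertable with a 1-day chunk_time_interval.
NS_PER_DAY = 86_400 * 1_000_000_000

def chunk_for(time_ns: int, interval_ns: int = NS_PER_DAY) -> tuple:
    """Return the [start, end) time range of the chunk that owns this timestamp."""
    start = (time_ns // interval_ns) * interval_ns
    return (start, start + interval_ns)

# Two points an hour apart on the same day map to the same chunk...
assert chunk_for(1712935200_000000000) == chunk_for(1712938800_000000000)
# ...while a point one day later maps to the adjacent chunk.
a = chunk_for(1712935200_000000000)
b = chunk_for(1712935200_000000000 + NS_PER_DAY)
assert b[0] == a[1]
```

Because the mapping is deterministic, both the write path (which chunk receives a point) and the query planner (which chunks can satisfy a time predicate) use the same arithmetic.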

Write Path: Batching and LSM

For 500K points/second, individual inserts are too slow. Write path:

  1. Ingest API receives metrics in batches (e.g., 1000 points per POST)
  2. Points written to an in-memory write buffer (WAL for durability)
  3. Buffer flushed to disk as immutable sorted files (SSTable) every few seconds
  4. Background compaction merges small files into larger ones

This is a Log-Structured Merge (LSM) tree — the same approach used by Cassandra, LevelDB, and RocksDB. Writes are always sequential appends (no random I/O), which is much faster than in-place B-tree updates.
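The steps above can be illustrated with a toy in-memory LSM (a deliberately simplified sketch — real engines add a WAL, bloom filters, and leveled compaction):

```python
import bisect

class TinyLSM:
    """Toy LSM: mutable in-memory buffer, flushed to immutable sorted runs."""

    def __init__(self, flush_threshold=1000):
        self.memtable = {}        # key -> value, mutable (step 2)
        self.sstables = []        # list of sorted [(key, value)] runs (step 3)
        self.flush_threshold = flush_threshold

    def put(self, key, value):
        self.memtable[key] = value
        if len(self.memtable) >= self.flush_threshold:
            self.flush()

    def flush(self):
        # Sequential write: dump the buffer as one sorted, immutable run.
        if self.memtable:
            self.sstables.append(sorted(self.memtable.items()))
            self.memtable = {}

    def get(self, key):
        if key in self.memtable:
            return self.memtable[key]
        # Newest runs win: search from most recent to oldest.
        for run in reversed(self.sstables):
            i = bisect.bisect_left(run, (key,))
            if i < len(run) and run[i][0] == key:
                return run[i][1]
        return None

    def compact(self):
        # Step 4: merge all runs into one; newer values overwrite older ones.
        merged = {}
        for run in self.sstables:          # runs are in chronological order
            merged.update(run)
        self.sstables = [sorted(merged.items())]

db = TinyLSM(flush_threshold=2)
for i, v in enumerate([10.0, 20.0, 30.0]):
    db.put(("cpu_usage", i), v)
assert db.get(("cpu_usage", 0)) == 10.0   # served from a flushed run
```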

Downsampling and Retention

-- TimescaleDB continuous aggregates (auto-updating materialized views)
CREATE MATERIALIZED VIEW cpu_hourly
WITH (timescaledb.continuous) AS
SELECT
    time_bucket('1 hour', time) AS bucket,
    host, region,
    avg(value) AS avg_cpu,
    max(value) AS max_cpu,
    min(value) AS min_cpu
FROM cpu_usage
GROUP BY bucket, host, region;

-- Refresh policy: update as new data arrives
SELECT add_continuous_aggregate_policy('cpu_hourly',
    start_offset => INTERVAL '3 hours',
    end_offset   => INTERVAL '1 hour',
    schedule_interval => INTERVAL '1 hour');

-- Retention policies:
SELECT add_retention_policy('cpu_usage', INTERVAL '7 days');     -- raw: 7 days
SELECT add_retention_policy('cpu_hourly', INTERVAL '1 year');    -- hourly: 1 year
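The rollup that the continuous aggregate maintains can be expressed in plain code. A sketch of the hourly aggregation (assuming points arrive as (time_sec, host, value) tuples; the region tag is omitted for brevity):

```python
from collections import defaultdict

HOUR = 3600

def rollup_hourly(points):
    """points: iterable of (time_sec, host, value).
    Returns {(bucket_start, host): (avg, min, max)} — what cpu_hourly stores."""
    acc = defaultdict(list)
    for t, host, v in points:
        bucket = (t // HOUR) * HOUR      # like time_bucket('1 hour', time)
        acc[(bucket, host)].append(v)
    return {key: (sum(vs) / len(vs), min(vs), max(vs)) for key, vs in acc.items()}

raw = [(0, "server-42", 50.0), (1800, "server-42", 70.0), (3600, "server-42", 90.0)]
print(rollup_hourly(raw))
# {(0, 'server-42'): (60.0, 50.0, 70.0), (3600, 'server-42'): (90.0, 90.0, 90.0)}
```

Storing (avg, min, max) per bucket collapses 3,600 one-second points into one row per host per hour, which is why year-long queries over the aggregate stay fast.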

Query Optimization

-- Range query with time bucketing
SELECT time_bucket('5 minutes', time) AS bucket,
       avg(value) as avg_cpu
FROM cpu_usage
WHERE time >= NOW() - INTERVAL '1 hour'
  AND host = 'server-42'
GROUP BY bucket
ORDER BY bucket;

-- Execution plan: index scan on (host, time DESC) → only scans 1-hour chunk
-- Without time partitioning, this would scan all historical data
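The chunk-exclusion step in that plan is an interval-overlap test. A sketch (assuming each chunk records its covering half-open time range):

```python
def prune_chunks(chunks, query_start, query_end):
    """chunks: list of (chunk_start, chunk_end) half-open ranges.
    Keep only chunks that can contain rows in [query_start, query_end)."""
    return [c for c in chunks if c[0] < query_end and c[1] > query_start]

DAY = 86_400
chunks = [(d * DAY, (d + 1) * DAY) for d in range(7)]   # 7 one-day chunks
# A 1-hour query at the end of day 6 touches exactly one chunk:
hit = prune_chunks(chunks, 6 * DAY + 23 * 3600, 7 * DAY)
print(len(hit), "of", len(chunks), "chunks scanned")
# 1 of 7 chunks scanned
```

The planner does this before touching any data, so a 1-hour query pays for one chunk's index scan regardless of how much history exists.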

Key Design Decisions

  • Time-based partitioning — queries touch only the relevant time chunks, not the full dataset
  • LSM writes — sequential writes, no random I/O; handles 500K points/second on commodity hardware
  • Continuous aggregates for downsampling — pre-computed rollups, queries on hourly/daily data are instant
  • Tags indexed, fields not — tags have bounded cardinality (hundreds of hosts, a handful of regions); field values are effectively unbounded (a new float every second per series)
  • Column-oriented compressed chunks — 10-100× better compression than row storage for numeric time-series data
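The compression figure in the last bullet comes from how regular the data is. For example, delta-of-delta encoding — the standard technique (popularized by Gorilla-style compressors) for timestamp columns — turns evenly spaced timestamps into runs of zeros that compress extremely well. A sketch of the encoding step:

```python
def delta_of_delta(timestamps):
    """Encode as: first value, first delta, then deltas of deltas.
    Evenly spaced series (e.g. one point per second) collapse to zeros."""
    if len(timestamps) < 2:
        return list(timestamps)
    deltas = [b - a for a, b in zip(timestamps, timestamps[1:])]
    dods = [b - a for a, b in zip(deltas, deltas[1:])]
    return [timestamps[0], deltas[0]] + dods

ts = [1712935200, 1712935201, 1712935202, 1712935203, 1712935204]
print(delta_of_delta(ts))
# [1712935200, 1, 0, 0, 0]
```

After this transform, a bit-packing stage stores each zero in one or two bits instead of eight bytes, which is where the order-of-magnitude compression comes from.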

FAQ

What makes a time-series database different from a regular relational database?
Time-series databases are optimized for insert-heavy, append-only workloads with time-based range queries. Key differences: (1) time-based partitioning — data is automatically split into chunks by time, so queries only scan relevant periods; (2) column-oriented storage with compression — numeric time-series data compresses 10-100x better than row storage; (3) LSM-tree writes — sequential appends, no random I/O; (4) automatic downsampling and retention — data ages from raw to hourly to daily aggregates automatically.

How does TimescaleDB handle 500K data points per second?
TimescaleDB (a PostgreSQL extension) uses hypertables that automatically partition data into time-based chunks. Writes go to the current chunk — a standard PostgreSQL table covering a time range. TimescaleDB's compression converts old chunks from row format to columnar, with delta-of-delta encoding for timestamps and XOR compression for floats, achieving 90%+ compression. Batch inserts (COPY or multi-row INSERT) are essential — individual row inserts at 500K/s would overwhelm even TimescaleDB.

What is downsampling in time-series databases and why do you need it?
Downsampling aggregates high-resolution data into lower-resolution summaries as it ages. Raw data (one point per second) is kept for 7 days; hourly averages for 1 year; daily averages forever. Without downsampling, 1-year queries over raw data would scan trillions of rows. With continuous aggregates (TimescaleDB) or recording rules (Prometheus), rollups are pre-computed and updated automatically as new data arrives. Queries on historical data hit the aggregate tables, not the raw data.

When would you use Prometheus vs TimescaleDB vs InfluxDB?
Prometheus: best for infrastructure monitoring (Kubernetes, servers) — pull-based scraping model, excellent ecosystem (Grafana, Alertmanager), PromQL query language; not designed for long-term storage, so use Thanos or Cortex for multi-year retention. TimescaleDB: best when you need SQL queries, existing PostgreSQL expertise, or multi-dimensional data with complex joins. InfluxDB: a purpose-built TSDB, strong for IoT and high-cardinality metrics, with its own Flux query language. For interviews, know any one of them well enough to discuss its trade-offs.

What is cardinality and why does it matter in time-series systems?
Cardinality is the number of unique tag combinations (series). High cardinality causes performance problems: a metric "request_latency" with tags {user_id, endpoint, region} could have millions of unique combinations (one series per user). Most TSDBs keep an index entry per series — millions of series means memory exhaustion and slow queries. Avoid high-cardinality values (user_id, request_id) as tags; reserve tags for bounded-cardinality dimensions: host (100s), region (10s), service (10s).

