Question 1

How does delta-of-delta timestamp compression work?

Accepted Answer

Delta-of-delta encoding first computes the difference between consecutive timestamps (the delta), then the difference between consecutive deltas (the delta-of-delta). For regular time-series data collected at a fixed interval (e.g., every 15 seconds), the delta is constant, so the delta-of-delta is 0. With a variable-length encoding, each zero costs a single bit. Only irregular samples (missed intervals, jitter) produce non-zero delta-of-delta values, and small ones are encoded in a few extra bits. This is the timestamp encoding described in Facebook's Gorilla paper, which reports that about 96% of timestamps compress to a single bit. The often-quoted 1.37 figure from that paper is the average compressed size in bytes per data point (timestamp and value together), not bits per timestamp.
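To make the arithmetic concrete, here is a minimal Python sketch of the two differencing passes and their inverse. The function names and the sample timestamps are illustrative assumptions, not part of any particular database; bit-level packing is left out.

```python
from typing import List


def delta_of_delta(timestamps: List[int]) -> List[int]:
    """Return the delta-of-delta sequence for integer timestamps.

    The first timestamp and the first delta have no predecessor, so the
    result has len(timestamps) - 2 entries.
    """
    deltas = [b - a for a, b in zip(timestamps, timestamps[1:])]
    return [b - a for a, b in zip(deltas, deltas[1:])]


def reconstruct(first: int, first_delta: int, dods: List[int]) -> List[int]:
    """Invert delta_of_delta given the first timestamp and the first delta."""
    out = [first, first + first_delta]
    delta = first_delta
    for dod in dods:
        delta += dod              # recover the next delta
        out.append(out[-1] + delta)
    return out


regular = [1000, 1015, 1030, 1045, 1060]   # fixed 15-second interval
jittery = [1000, 1015, 1031, 1045, 1060]   # one sample arrives 1 second late

print(delta_of_delta(regular))   # all zeros
print(delta_of_delta(jittery))   # small non-zero values around the late sample
```

A single late sample disturbs three consecutive delta-of-delta values, after which the stream returns to zeros.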

Question 2

How does XOR value compression work for floating-point time-series?

Accepted Answer

XOR compression, from the Gorilla algorithm, XORs each float64 value with its predecessor. When a sensor reads similar values over time, consecutive floats share the same high-order bits (sign, exponent, and leading mantissa bits), and often the same low-order bits as well. The XOR result then has a long prefix of leading zeros and a long suffix of trailing zeros, leaving only a small meaningful block in between. The encoder stores the number of leading zeros, the length of the meaningful block, and the meaningful bits themselves; an XOR of exactly zero (an unchanged value) is stored as a single bit. For slowly changing metrics, meaningful block sizes are typically 10-20 bits versus 64 bits raw, yielding roughly 3-6x compression on values alone.
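The three quantities the encoder stores can be computed directly from the bit patterns. The sketch below is an illustration only: `float_to_bits` and `xor_block` are hypothetical helper names, and the control bits and field widths of a real encoder are omitted.

```python
import struct
from typing import Tuple


def float_to_bits(x: float) -> int:
    """Reinterpret a float64 as its 64-bit unsigned integer pattern."""
    return struct.unpack(">Q", struct.pack(">d", x))[0]


def xor_block(prev: float, curr: float) -> Tuple[int, int, int]:
    """Return (leading_zeros, block_length, meaningful_bits) for prev XOR curr.

    Identical values give (64, 0, 0): there is no meaningful block.
    """
    x = float_to_bits(prev) ^ float_to_bits(curr)
    if x == 0:
        return 64, 0, 0
    leading = 64 - x.bit_length()          # zeros above the highest set bit
    trailing = (x & -x).bit_length() - 1   # zeros below the lowest set bit
    length = 64 - leading - trailing
    return leading, length, x >> trailing


print(xor_block(12.0, 12.0))   # unchanged value
print(xor_block(12.0, 24.0))   # values differ in a single exponent bit
```

For 12.0 and 24.0 the patterns differ in exactly one bit, so the meaningful block is one bit long.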

Question 3

What triggers chunk sealing and what happens after a chunk is sealed?

Accepted Answer

A chunk is sealed when it reaches its time boundary (e.g., the 8-hour window closes) or when it hits a maximum byte size. Before sealing, the chunk is mutable and lives in memory. On seal, the chunk is compressed using delta-of-delta + XOR encoding, written to disk as an immutable block, and its metadata (chunk_start, chunk_end, row_count, compressed_size) is recorded in the catalog. The in-memory buffer is then released. After sealing, the chunk can be replicated, moved to warm storage, or downsampled. Sealed chunks are immutable and never modified in place; corrections are handled by writing a new data point with a later ingest timestamp.
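The two seal triggers and the post-seal immutability can be sketched as follows. This is a simplified model under stated assumptions: `ActiveChunk`, `RAW_POINT_BYTES`, and the dict-shaped metadata are hypothetical, the size check uses the raw in-memory footprint, and the actual compression and disk write are skipped (so `compressed_size` is not produced).

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

RAW_POINT_BYTES = 16  # assumed: 8-byte timestamp + 8-byte float64 per point


@dataclass
class ActiveChunk:
    chunk_start: int   # inclusive, seconds
    chunk_end: int     # exclusive, seconds
    max_bytes: int
    points: List[Tuple[int, float]] = field(default_factory=list)
    sealed: bool = False

    def append(self, ts: int, value: float) -> None:
        if self.sealed:
            raise ValueError("sealed chunks are immutable")
        self.points.append((ts, value))

    def should_seal(self, now: int) -> bool:
        """True once the time window has closed or the size limit is hit."""
        too_big = len(self.points) * RAW_POINT_BYTES >= self.max_bytes
        return now >= self.chunk_end or too_big

    def seal(self) -> Dict[str, int]:
        """Mark the chunk sealed, release the buffer, return catalog metadata."""
        meta = {
            "chunk_start": self.chunk_start,
            "chunk_end": self.chunk_end,
            "row_count": len(self.points),
        }
        # A real engine would compress and persist self.points here.
        self.points = []
        self.sealed = True
        return meta


chunk = ActiveChunk(chunk_start=0, chunk_end=8 * 3600, max_bytes=1024)
chunk.append(10, 1.0)
chunk.append(25, 1.5)
if chunk.should_seal(now=8 * 3600):
    print(chunk.seal())
```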

Question 4

How do tiered retention transitions work in a time-series database?

Accepted Answer

Tiered retention moves chunks from hot storage (full resolution, SSD) to warm storage (downsampled to 1-minute rollups, HDD) after a configurable age, and then to cold storage (1-hour rollups, object storage such as S3) after a longer age. The background retention job scans MetricChunk records older than the hot tier threshold, runs the downsampling aggregation (MIN, MAX, AVG, COUNT per interval), writes the rollup chunk to the next tier, and then deletes the original full-resolution chunk. Cold tier chunks are stored as compressed Parquet files in object storage. Queries are tier-aware: the query planner selects the finest-resolution tier that covers the requested time range.
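The downsampling step in that job amounts to bucketing points by interval and computing the four aggregates per bucket. A minimal sketch, assuming points are `(timestamp_seconds, value)` tuples and using a hypothetical `downsample` function:

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def downsample(points: Iterable[Tuple[int, float]], interval: int) -> List[Dict]:
    """Aggregate points into fixed-width buckets of `interval` seconds.

    Each rollup row carries MIN, MAX, AVG, and COUNT for its bucket.
    """
    buckets: Dict[int, List[float]] = defaultdict(list)
    for ts, value in points:
        buckets[ts - ts % interval].append(value)   # align to bucket start
    return [
        {
            "bucket_start": start,
            "min": min(vals),
            "max": max(vals),
            "avg": sum(vals) / len(vals),
            "count": len(vals),
        }
        for start, vals in sorted(buckets.items())
    ]


raw = [(0, 1.0), (15, 3.0), (45, 2.0), (60, 10.0)]
for row in downsample(raw, interval=60):
    print(row)
```

Keeping COUNT next to AVG matters for the second hop: a 1-hour average can be rebuilt from 1-minute rollups as a count-weighted average, whereas averaging the averages would be wrong when buckets hold different numbers of points.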

Question 5

How does delta-of-delta timestamp compression work?

Accepted Answer

The first timestamp is stored as-is. Each subsequent timestamp is represented by its difference from the previous one (the delta), and what is actually encoded is the difference between consecutive deltas (the delta-of-delta). For regular intervals this is always 0, which costs only 1 bit per timestamp.
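The 1-bit figure comes from the variable-length scheme. The sketch below counts bits per delta-of-delta using the bucket boundaries described in the Gorilla paper; treat those boundaries as an assumption, since other engines may pick different ranges. `timestamp_bits` is a hypothetical helper that only counts bits and does not pack them.

```python
def timestamp_bits(dod: int) -> int:
    """Bits needed to encode one delta-of-delta value (Gorilla-style buckets)."""
    if dod == 0:
        return 1            # control bit '0'
    if -63 <= dod <= 64:
        return 2 + 7        # '10' + 7-bit value
    if -255 <= dod <= 256:
        return 3 + 9        # '110' + 9-bit value
    if -2047 <= dod <= 2048:
        return 4 + 12       # '1110' + 12-bit value
    return 4 + 32           # '1111' + 32-bit value


# 100 samples where one arrives a second late and the next delta corrects it.
dods = [0] * 98 + [1, -1]
total = sum(timestamp_bits(d) for d in dods)
print(total, total / len(dods))
```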

Question 6

How does XOR compression encode floating-point values?

Accepted Answer

The first value is stored uncompressed, and each subsequent value is XORed with the previous one. Because consecutive measurements are usually similar, the XOR result has many leading and trailing zeros, which enables a compact variable-length encoding (the Gorilla algorithm).
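That the scheme is lossless follows from XOR being its own inverse. A small round-trip sketch, with hypothetical function names and with the XOR words kept as plain integers rather than bit-packed:

```python
import struct
from typing import List


def to_bits(x: float) -> int:
    return struct.unpack(">Q", struct.pack(">d", x))[0]


def from_bits(b: int) -> float:
    return struct.unpack(">d", struct.pack(">Q", b))[0]


def xor_encode(values: List[float]) -> List[int]:
    """First word is the raw 64-bit pattern; later words are XORs with the predecessor."""
    bits = [to_bits(v) for v in values]
    return bits[:1] + [b ^ a for a, b in zip(bits, bits[1:])]


def xor_decode(words: List[int]) -> List[float]:
    """Undo xor_encode by re-applying XOR cumulatively."""
    out = []
    prev = 0
    for word in words:
        prev ^= word          # first word XOR 0 is the raw pattern itself
        out.append(from_bits(prev))
    return out


readings = [21.5, 21.5, 21.75]
encoded = xor_encode(readings)
print([hex(w) for w in encoded])
print(xor_decode(encoded))
```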

Question 7

How are chunks sealed and when does sealing occur?

Accepted Answer

A chunk is sealed when it reaches its time boundary (e.g., the 8-hour window ends) or its size limit. Sealed chunks are compressed and made immutable; only the active (unsealed) chunk accepts new writes.

Question 8

How does tiered retention work for long-term storage?

Accepted Answer

A retention job checks chunk ages against tier thresholds. Hot-tier chunks (SSD, full resolution) are downsampled and moved to the warm tier (HDD, 1-minute rollup) after 7 days. Warm chunks are further downsampled and archived to cold storage after 30 days.
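The age check can be sketched as a pure planning function. Assumptions to note: age is measured from `chunk_end`, the 30-day threshold is read as total chunk age rather than time spent in the warm tier, and `ChunkMeta`, `target_tier`, and `plan_transitions` are hypothetical names.

```python
from dataclasses import dataclass
from typing import List, Tuple

SECONDS_PER_DAY = 86_400
HOT_MAX_AGE_DAYS = 7     # from the answer above
WARM_MAX_AGE_DAYS = 30   # assumed to mean total chunk age


@dataclass
class ChunkMeta:
    chunk_id: int
    chunk_end: int   # seconds since epoch
    tier: str        # "hot", "warm", or "cold"


def target_tier(age_days: float) -> str:
    """Tier a chunk of the given age should live in."""
    if age_days < HOT_MAX_AGE_DAYS:
        return "hot"
    if age_days < WARM_MAX_AGE_DAYS:
        return "warm"
    return "cold"


def plan_transitions(chunks: List[ChunkMeta], now: int) -> List[Tuple[int, str, str]]:
    """Return (chunk_id, current_tier, wanted_tier) for chunks that must move."""
    moves = []
    for c in chunks:
        wanted = target_tier((now - c.chunk_end) / SECONDS_PER_DAY)
        if wanted != c.tier:
            moves.append((c.chunk_id, c.tier, wanted))
    return moves


now = 100 * SECONDS_PER_DAY
catalog = [
    ChunkMeta(1, now - 1 * SECONDS_PER_DAY, "hot"),
    ChunkMeta(2, now - 8 * SECONDS_PER_DAY, "hot"),
    ChunkMeta(3, now - 40 * SECONDS_PER_DAY, "warm"),
]
print(plan_transitions(catalog, now))
```

Separating the plan from its execution keeps the job easy to test and makes a retry after a partial failure safe, since re-running the planner only lists chunks still in the wrong tier.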

Time-Series Database Low-Level Design: Chunk Storage, Compression, Downsampling, and Retention

Chunk Storage Architecture

Timestamp Compression: Delta-of-Delta Encoding

Value Compression: XOR (Gorilla Algorithm)

Write Path

Read Path

Continuous Aggregation and Downsampling

SQL DDL: Metadata Catalog

Python: Core Operations

Design Considerations Summary