Hot-Cold Data Tiering Low-Level Design: Access Pattern Detection, Tier Migration, and Storage Cost Optimization

Why Hot-Cold Tiering?

Not all data is accessed equally. A small fraction of items (the “hot” set) drives the majority of reads, while the bulk of the dataset (the “cold” set) is rarely or never accessed. Storing all data on fast, expensive SSD storage wastes money. Hot-cold tiering automatically moves data between storage tiers based on access frequency, delivering both performance and cost efficiency.

Typical cost differential: SSD NVMe at $0.10/GB/month vs S3 object storage at $0.023/GB/month — over 4x savings for cold data, and real-world workloads often have 80-90% of data in the cold tier.

Access Frequency Tracking

Each item maintains an access counter with time-decay. Two common models:

  • LFU with exponential decay: access count is halved every N hours. Recent accesses count more than old ones. An item accessed 100 times two weeks ago may score lower than one accessed 10 times today.
  • Sliding window count: count accesses in the last 30 days only. Simple; no decay math required. Old accesses fall off naturally as the window slides.

On every read, access_count_30d is incremented and last_accessed_at is updated. A background job re-computes counts by querying the access event log, keeping the counter accurate over rolling windows.

Hot Promotion Threshold

An item is promoted to the hot tier if its access count exceeds a threshold within the tracking window. For example: promote if access_count_30d > 100. The threshold is tunable per data class — user profile photos may have a lower threshold than product images.

Promotion is executed by the background tiering job, not on the read path, to avoid write amplification on every access.

Cold Migration

Items not accessed in N days (configurable per tier) are migrated to cold storage:

  1. Background batch job queries DataItem WHERE last_accessed_at < now() - interval 'N days' AND tier = 'hot'.
  2. For each candidate: compress the item, upload to S3 (or HDD path), record the cold_storage_key in the metadata store.
  3. Mark the item as tier = 'cold' in the hot-tier metadata store.
  4. Optionally delete the hot-tier copy (or retain for a short grace period).

Migration is async and non-blocking. Reads during migration are served from the source (hot) tier until the migration flag is cleared.

Retrieval on Cold Miss

When a read request arrives for a cold item:

  1. Look up cold_storage_key from the metadata store.
  2. Fetch the compressed object from S3/HDD.
  3. Decompress and serve the item.
  4. Evaluate whether to re-promote: if the item's access pattern crosses the hot threshold again, schedule a promotion job.

Cold retrieval adds latency (10ms-1s depending on storage backend). Surface this to callers via a cache-hit header or metric so SLAs can be set appropriately.

Warm Tier (Optional Middle Layer)

A warm tier (e.g., HDD local disk) can sit between hot SSD and cold object storage. Items moving from hot to cold pass through warm first, allowing re-promotion without a full S3 round-trip. Warm tier adds operational complexity but reduces re-promotion latency.

SQL Schema

CREATE TABLE DataItem (
    id                  BIGSERIAL PRIMARY KEY,
    item_key            TEXT UNIQUE NOT NULL,
    tier                TEXT NOT NULL DEFAULT 'hot' CHECK (tier IN ('hot','warm','cold')),
    access_count_30d    INT NOT NULL DEFAULT 0,
    last_accessed_at    TIMESTAMPTZ NOT NULL DEFAULT now(),
    cold_storage_key    TEXT,
    size_bytes          BIGINT,
    checksum            TEXT
);

CREATE INDEX idx_item_tier_access ON DataItem (tier, last_accessed_at);

CREATE TABLE TierMigrationJob (
    id              BIGSERIAL PRIMARY KEY,
    item_id         BIGINT REFERENCES DataItem(id),
    from_tier       TEXT NOT NULL,
    to_tier         TEXT NOT NULL,
    started_at      TIMESTAMPTZ NOT NULL DEFAULT now(),
    completed_at    TIMESTAMPTZ,
    bytes_migrated  BIGINT,
    status          TEXT NOT NULL DEFAULT 'running' CHECK (status IN ('running','complete','failed'))
);

CREATE TABLE AccessEvent (
    id          BIGSERIAL PRIMARY KEY,
    item_id     BIGINT REFERENCES DataItem(id),
    accessed_at TIMESTAMPTZ NOT NULL DEFAULT now()
);

CREATE INDEX idx_access_item_time ON AccessEvent (item_id, accessed_at);

Python Implementation Sketch

import boto3, time, gzip, json
from datetime import datetime, timedelta

class TieringManager:
    def __init__(self, db, s3_bucket: str):
        self.db = db
        self.s3 = boto3.client('s3')
        self.s3_bucket = s3_bucket
        self.hot_threshold = 100   # accesses in 30 days
        self.cold_after_days = 30  # migrate if not accessed in N days

    def track_access(self, item_id: int):
        self.db.execute(
            "INSERT INTO AccessEvent (item_id) VALUES (%s)", (item_id,)
        )
        self.db.execute(
            "UPDATE DataItem SET access_count_30d = access_count_30d + 1, last_accessed_at = now() WHERE id = %s",
            (item_id,)
        )

    def evaluate_tier(self, item_id: int) -> str:
        row = self.db.fetchone(
            "SELECT tier, access_count_30d, last_accessed_at FROM DataItem WHERE id = %s", (item_id,)
        )
        if not row:
            return 'unknown'
        if row['tier'] == 'cold' and row['access_count_30d'] >= self.hot_threshold:
            return 'promote'
        if row['tier'] == 'hot' and (datetime.utcnow() - row['last_accessed_at']).days >= self.cold_after_days:
            return 'demote'
        return row['tier']

    def migrate_to_cold(self, item_id: int):
        row = self.db.fetchone("SELECT * FROM DataItem WHERE id = %s", (item_id,))
        job_id = self.db.fetchone(
            "INSERT INTO TierMigrationJob (item_id, from_tier, to_tier) VALUES (%s, %s, %s) RETURNING id",
            (item_id, 'hot', 'cold')
        )['id']
        try:
            data = self._load_from_hot(row['item_key'])
            compressed = gzip.compress(json.dumps(data).encode())
            cold_key = f"cold/{item_id}/{row['item_key']}"
            self.s3.put_object(Bucket=self.s3_bucket, Key=cold_key, Body=compressed)
            self.db.execute(
                "UPDATE DataItem SET tier = 'cold', cold_storage_key = %s WHERE id = %s",
                (cold_key, item_id)
            )
            self.db.execute(
                "UPDATE TierMigrationJob SET completed_at = now(), bytes_migrated = %s, status = 'complete' WHERE id = %s",
                (len(compressed), job_id)
            )
        except Exception as e:
            self.db.execute("UPDATE TierMigrationJob SET status = 'failed' WHERE id = %s", (job_id,))
            raise

    def retrieve_cold(self, item_id: int) -> dict:
        row = self.db.fetchone("SELECT cold_storage_key FROM DataItem WHERE id = %s", (item_id,))
        obj = self.s3.get_object(Bucket=self.s3_bucket, Key=row['cold_storage_key'])
        compressed = obj['Body'].read()
        data = json.loads(gzip.decompress(compressed).decode())
        self.track_access(item_id)
        if self.evaluate_tier(item_id) == 'promote':
            # Schedule async promotion back to hot
            pass
        return data

    def _load_from_hot(self, item_key: str) -> dict:
        return self.db.fetchone("SELECT * FROM hot_storage WHERE item_key = %s", (item_key,))

See also: Scale AI Interview Guide 2026: Data Infrastructure, RLHF Pipelines, and ML Engineering

See also: Netflix Interview Guide 2026: Streaming Architecture, Recommendation Systems, and Engineering Excellence

See also: Databricks Interview Guide 2026: Spark Internals, Delta Lake, and Lakehouse Architecture

Scroll to Top