Low Level Design: Hot-Cold Data Tiering

Data tiering organizes data across storage tiers based on access frequency and cost sensitivity. Hot data (frequently accessed) lives on fast, expensive storage; cold data (rarely accessed) on slow, cheap storage. The tiering system automatically moves data between tiers based on access patterns, balancing performance and cost without manual intervention.

Storage Tier Classification

Hot tier: NVMe SSD or in-memory cache (Redis). Sub-millisecond access. Cost: ~$0.10-1.00/GB/month. Use for: active user sessions, recent transactions, real-time feeds, search indexes.

Warm tier: HDD or standard SSD (AWS EBS gp3). 1-10ms access. Cost: ~$0.02-0.10/GB/month. Use for: historical data accessed weekly, analytics queries on recent months.

Cold tier: object storage (S3 Standard, S3-IA). 10-100ms access. Cost: ~$0.004-0.023/GB/month. Use for: backups, audit logs, data accessed monthly.

Archive tier: S3 Glacier. Retrieval takes hours. Cost: ~$0.001/GB/month. Use for: compliance data, 7-year retention requirements.

All prices are approximate and vary by provider, region, and storage class.

Access Pattern Tracking

Track last_accessed_at and access_count per data object, and record an access event on each read. Classification rules: data accessed in the last 7 days → hot; last 30 days → warm; last 90 days → cold; beyond 90 days → archive. Writing metadata synchronously on every read would turn each read into a write, so keep tracking cheap with three techniques: batch updates (buffer access events in memory and flush them periodically to avoid write amplification), approximate counters (HyperLogLog to estimate the number of distinct readers of an object without storing every reader ID), and sampling (track access for only a percentage of objects or requests to reduce overhead).
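The classification rules and batched updates described above can be sketched as follows. This is a minimal illustration, not a production design: the names classify_tier and AccessBuffer are hypothetical, and a plain dict stands in for the access metadata table.

```python
from collections import defaultdict
from datetime import datetime, timedelta, timezone


def classify_tier(last_accessed_at: datetime, now: datetime) -> str:
    """Map time since last access to a tier using the 7/30/90-day rules."""
    age = now - last_accessed_at
    if age <= timedelta(days=7):
        return "hot"
    if age <= timedelta(days=30):
        return "warm"
    if age <= timedelta(days=90):
        return "cold"
    return "archive"


class AccessBuffer:
    """Buffers access events in memory and applies them in one batch."""

    def __init__(self, flush_threshold: int = 1000):
        self.flush_threshold = flush_threshold
        self._counts = defaultdict(int)   # object_id -> buffered read count
        self._last_seen = {}              # object_id -> latest buffered read time
        self._pending = 0

    def record(self, object_id: str, at: datetime) -> None:
        self._counts[object_id] += 1
        prev = self._last_seen.get(object_id)
        if prev is None or at > prev:
            self._last_seen[object_id] = at
        self._pending += 1

    def should_flush(self) -> bool:
        return self._pending >= self.flush_threshold

    def flush(self, store: dict) -> int:
        """Apply buffered updates to `store` (assumed stand-in for the
        metadata table). Returns the number of rows written."""
        written = 0
        for object_id, count in self._counts.items():
            row = store.setdefault(
                object_id, {"access_count": 0, "last_accessed_at": None}
            )
            row["access_count"] += count
            seen = self._last_seen[object_id]
            if row["last_accessed_at"] is None or seen > row["last_accessed_at"]:
                row["last_accessed_at"] = seen
            written += 1
        self._counts.clear()
        self._last_seen.clear()
        self._pending = 0
        return written
```

Many reads of the same object collapse into a single row update per flush, which is where the reduction in write amplification comes from. The trade-off is that buffered events are lost if the process crashes before flushing, which is usually acceptable for tiering decisions.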

Automatic Tiering Policies

S3 Intelligent-Tiering automatically moves objects between access tiers based on access patterns, with no retrieval fee (it does charge a small per-object monitoring and automation fee). Objects not accessed for 30 consecutive days move to the Infrequent Access tier; objects not accessed for 90 consecutive days move to the Archive Instant Access tier. Custom tiering policies: a scheduled job scans the access metadata table, identifies objects whose current tier no longer matches their access pattern, and moves them. Predictive tiering goes one step further and moves data to the fast tier before it is needed: daily reports that are read every Monday can be pre-warmed to the hot tier on Sunday evening.
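A custom policy job of this kind might look like the sketch below. It only plans moves; actually copying data between tiers is left out. The row layout (object_id, tier, last_accessed_at), the function names, and the prewarm_ids parameter are assumptions made for illustration, and the thresholds reuse the 7/30/90-day rules from the access-tracking section.

```python
from datetime import datetime, timedelta, timezone

# (maximum age since last access, tier), checked in order
THRESHOLDS = [
    (timedelta(days=7), "hot"),
    (timedelta(days=30), "warm"),
    (timedelta(days=90), "cold"),
]


def target_tier(last_accessed_at: datetime, now: datetime) -> str:
    age = now - last_accessed_at
    for limit, tier in THRESHOLDS:
        if age <= limit:
            return tier
    return "archive"


def plan_moves(rows, now, prewarm_ids=frozenset()):
    """Return (object_id, current_tier, target_tier) for misplaced objects.

    `rows` is an iterable of dicts read from the metadata table.
    `prewarm_ids` lists objects a predictive rule wants in the hot tier
    regardless of their last access time.
    """
    moves = []
    for row in rows:
        if row["object_id"] in prewarm_ids:
            target = "hot"
        else:
            target = target_tier(row["last_accessed_at"], now)
        if target != row["tier"]:
            moves.append((row["object_id"], row["tier"], target))
    return moves
```

Separating planning from execution makes the job easy to dry-run: the planned moves can be logged and reviewed before any data is copied.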

Database-Level Tiering

For databases: partition data by age (PostgreSQL declarative partitioning, TimescaleDB hypertables). Recent partitions reside on fast SSD; older partitions are detached, exported to cheaper storage (a compressed columnar format such as Parquet in S3, queryable via Athena or Redshift Spectrum), and then dropped from the database. Queries that span hot and cold data go through a query federation layer that sends the hot portion of the query to the database and the cold portion to the archive store, then merges the results. Alternatively, keep all partitions inside PostgreSQL and use tablespaces to place hot and cold partitions on different storage media.
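The routing decision in a federation layer can be reduced to splitting the requested time range at the archive cutoff. The sketch below assumes a single cutoff date, half-open ranges, and the hypothetical backend labels "archive" and "database"; a real layer would also issue the sub-queries and merge their results.

```python
from datetime import date


def split_range(start: date, end: date, cutoff: date) -> dict:
    """Split the half-open range [start, end) at `cutoff`.

    Rows dated before `cutoff` are assumed to live in the archive store;
    rows on or after it are assumed to live in the database.
    Returns a dict mapping backend name to the sub-range it should serve.
    """
    plan = {}
    cold_end = min(end, cutoff)
    if start < cold_end:
        plan["archive"] = (start, cold_end)
    hot_start = max(start, cutoff)
    if hot_start < end:
        plan["database"] = (hot_start, end)
    return plan
```

A query that falls entirely on one side of the cutoff produces a single entry, so purely hot queries never touch the slower archive path.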

Cache Tiering

Multi-level caches implement tiering within the hot tier: L1 in-process memory (sub-microsecond, limited capacity) and L2 Redis (sub-millisecond, larger capacity), backed by the database as the source of truth. A cache miss in L1 promotes the object from L2 to L1; a miss in L2 loads the object from the database into L2. Eviction: L1 uses LRU and can demote evicted entries to L2; L2 uses LRU and simply drops evicted entries, because the database still holds the data (only a write-back cache must flush dirty entries to the database before evicting them). This keeps the hottest data near the CPU while larger, less frequently accessed data stays in the next tier.
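The promotion and eviction flow can be illustrated with a small two-level cache. This is a sketch under stated assumptions: both levels are in-process LRU maps (the second merely stands in for Redis), the cache is read-through with the database as the source of truth so L2 evictions are simply discarded, and load_from_db is a hypothetical callback.

```python
from collections import OrderedDict

_MISSING = object()  # sentinel so that None can be a cached value


class LRU:
    def __init__(self, capacity: int):
        self.capacity = capacity
        self.data = OrderedDict()  # oldest entry first

    def get(self, key):
        if key not in self.data:
            return _MISSING
        self.data.move_to_end(key)  # mark as most recently used
        return self.data[key]

    def put(self, key, value):
        """Insert or refresh `key`; return the evicted (key, value) or None."""
        self.data[key] = value
        self.data.move_to_end(key)
        if len(self.data) > self.capacity:
            return self.data.popitem(last=False)
        return None


class TieredCache:
    def __init__(self, l1_capacity: int, l2_capacity: int, load_from_db):
        self.l1 = LRU(l1_capacity)
        self.l2 = LRU(l2_capacity)  # stand-in for Redis
        self.load_from_db = load_from_db

    def get(self, key):
        value = self.l1.get(key)
        if value is not _MISSING:
            return value
        value = self.l2.get(key)
        if value is _MISSING:
            value = self.load_from_db(key)  # miss in both levels
            self.l2.put(key, value)         # L2 evictions are dropped
        self._put_l1(key, value)            # promote into L1
        return value

    def _put_l1(self, key, value):
        evicted = self.l1.put(key, value)
        if evicted is not None:
            self.l2.put(*evicted)           # demote the L1 victim to L2
```

With a real Redis L2, eviction would normally be delegated to Redis itself (for example through its maxmemory policy) rather than tracked in application code.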

Read Path and Transparent Access

A tiering proxy or storage abstraction layer makes tiering transparent to application code: the application reads from a logical address without knowing which physical tier holds the data. The proxy checks hot tier first, falls back to warm, then cold. On a cold tier hit, the proxy optionally promotes the object to a warmer tier (based on configurable promotion rules) for faster future access. This avoids the application managing tier selection manually and centralizes tiering policy in the infrastructure layer.
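As a rough illustration of this read path, the sketch below checks tiers from fastest to slowest and promotes an object one tier up after a configurable number of hits below the hot tier. The class name TieringProxy, the promote_after rule, and the use of dicts as tier backends are all assumptions made for the example.

```python
class TieringProxy:
    """Reads a key from the first tier that holds it, fastest tier first."""

    def __init__(self, tiers, promote_after: int = 2):
        # tiers: list of (name, dict-like store), ordered fastest to slowest
        self.tiers = tiers
        self.promote_after = promote_after
        self.hits = {}  # key -> hits served from below the hot tier

    def read(self, key):
        """Return (value, name of the tier that served the read)."""
        for index, (name, store) in enumerate(self.tiers):
            if key in store:
                value = store[key]
                if index > 0:
                    self._maybe_promote(key, index)
                return value, name
        raise KeyError(key)

    def _maybe_promote(self, key, index: int) -> None:
        self.hits[key] = self.hits.get(key, 0) + 1
        if self.hits[key] >= self.promote_after:
            _, source = self.tiers[index]
            _, target = self.tiers[index - 1]
            target[key] = source.pop(key)  # move one tier warmer
            del self.hits[key]
```

Requiring more than one hit before promotion avoids churning the warmer tier with objects that are read once, for example during a backup scan.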

Cost Optimization

Measure cost per tier per team to provide visibility into storage spending. Identify data that has been in the hot tier for 90 days without access; it should be cold. Identify data that is frequently read from the cold tier; it should be warm. Set per-team hot-tier quotas to enforce discipline. Alert when hot-tier usage grows unexpectedly, which usually means data that should be archived is being retained in the hot tier. Budget for tiering transition costs: S3 charges a per-request fee for lifecycle transitions, Glacier charges for retrievals, and infrequent-access and archive storage classes bill a minimum storage duration (for example, 30 days for S3 Standard-IA). Factor these into the tiering policy thresholds, because moving many small or short-lived objects can cost more than it saves.
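The arithmetic behind per-team cost reporting and transition budgeting is simple enough to sketch. The prices below are illustrative values taken from the low end of the ranges in the tier classification section, not quoted provider prices, and both function names are hypothetical.

```python
# Illustrative $/GB/month, assumed for this example only
PRICE_PER_GB_MONTH = {"hot": 0.10, "warm": 0.02, "cold": 0.004, "archive": 0.001}


def monthly_cost(usage_gb: dict) -> dict:
    """usage_gb maps team -> {tier: GB stored}; returns team -> $/month."""
    return {
        team: sum(PRICE_PER_GB_MONTH[tier] * gb for tier, gb in tiers.items())
        for team, tiers in usage_gb.items()
    }


def breakeven_months(size_gb: float, from_tier: str, to_tier: str,
                     transition_cost: float):
    """Months of storage savings needed to repay a one-off transition cost.

    Returns None when the move does not reduce the monthly storage bill.
    """
    saving = (PRICE_PER_GB_MONTH[from_tier] - PRICE_PER_GB_MONTH[to_tier]) * size_gb
    if saving <= 0:
        return None
    return transition_cost / saving
```

For instance, moving 100 GB from warm to cold saves about $1.60 per month at these prices, so a one-off transition cost of $0.80 is repaid in roughly half a month. If the data is likely to be deleted or re-promoted sooner than the break-even point, the move is not worth making.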