Storage tiering assigns data to different storage media based on access frequency: hot data (accessed frequently) lives on fast, expensive storage; cold data (rarely accessed) lives on slow, cheap storage. This can cut infrastructure cost substantially while keeping performance good for the common case, because most reads hit hot data. Storage tiering is used by cloud providers (S3 Intelligent-Tiering), supported by database features (Cassandra TWCS, TimescaleDB compression), and used by observability platforms (Datadog, Grafana) to manage large data volumes cost-effectively.
Storage Tiers and Their Characteristics
Four storage tiers, with progressively lower cost and higher latency:
- In-memory (RAM): ~100ns access, ~$10/GB/month, best for the working set.
- NVMe SSD: ~0.1ms access, ~$0.50/GB/month, best for active data (last 30 days).
- HDD / network block storage: 5-10ms access, ~$0.05/GB/month, good for warm data (30-365 days).
- Object storage (S3/GCS): 50-200ms first-byte latency, ~$0.023/GB/month plus retrieval costs, best for cold data (1+ years).
For S3, the Glacier storage classes add further tiers: Glacier Instant Retrieval (~$0.004/GB/month, millisecond access) and Glacier Deep Archive (~$0.00099/GB/month, standard retrieval within 12 hours).
S3 Lifecycle policy: automatic tiering by age. Objects under logs/ move to Standard-IA (Infrequent Access) at 30 days, Glacier Instant Retrieval at 90 days, and Glacier Deep Archive at 365 days, and are deleted after 2555 days (about 7 years). This is age-based lifecycle tiering, not S3 Intelligent-Tiering.
{
  "Rules": [
    {
      "ID": "LogsAgeBasedTiering",
      "Status": "Enabled",
      "Filter": {"Prefix": "logs/"},
      "Transitions": [
        {"Days": 30, "StorageClass": "STANDARD_IA"},
        {"Days": 90, "StorageClass": "GLACIER_IR"},
        {"Days": 365, "StorageClass": "DEEP_ARCHIVE"}
      ],
      "Expiration": {"Days": 2555}
    }
  ]
}
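To sanity-check a policy like the one above before applying it, the age thresholds can be simulated locally. The following is a minimal Python sketch, not AWS code: the `storage_class_at` helper and the constants are hypothetical names that simply mirror the transitions and expiration in the policy. Real lifecycle actions run asynchronously, so actual transitions may lag the configured day counts.

```python
from typing import Optional

# Hypothetical constants mirroring the lifecycle policy above:
# (minimum age in days, storage class the object transitions to).
TRANSITIONS = [(30, "STANDARD_IA"), (90, "GLACIER_IR"), (365, "DEEP_ARCHIVE")]
EXPIRATION_DAYS = 2555  # about 7 years


def storage_class_at(age_days: int) -> Optional[str]:
    """Return the storage class expected at a given object age, or None once expired."""
    if age_days >= EXPIRATION_DAYS:
        return None  # the expiration rule has deleted the object
    current = "STANDARD"  # assumed class at upload time
    for min_age, storage_class in TRANSITIONS:
        if age_days >= min_age:
            current = storage_class  # the latest matching transition wins
    return current


if __name__ == "__main__":
    for age in (0, 45, 120, 400, 3000):
        print(age, storage_class_at(age))
```

A table of ages and expected classes like this is a cheap way to review thresholds with stakeholders before the policy touches real data.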
Database-Level Tiering: Cassandra TWCS
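Cassandra's Time Window Compaction Strategy (TWCS) groups SSTables by time window (e.g., 1 day). Within the active window, compaction merges SSTables as they are flushed. Once a window has passed, its SSTables are compacted into a single SSTable that is normally not compacted again, so old windows stay stable on disk. That stability makes age-based tiering practical: recent windows live on NVMe SSDs, older windows on HDDs, and very old windows are exported to S3. Open-source Cassandra does not move SSTables between storage media on its own, so this step relies on external tooling or vendor features. Queries against recent data hit SSDs, queries against older windows hit the slower HDD tier, and data exported to S3 must be restored or queried through a separate path. TWCS is designed specifically for time-series workloads where data is written once and old data is read rarely.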
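The window grouping and the age-based placement described above can be illustrated with a short Python sketch. It is not Cassandra code: `window_start` and `tier_for_window` are hypothetical helpers, the 1-day window matches the example in the text, and the 30-day and 365-day thresholds are assumptions borrowed from the tier descriptions earlier in this article.

```python
from datetime import datetime, timedelta, timezone

WINDOW = timedelta(days=1)  # example window size from the text
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def window_start(ts: datetime, window: timedelta = WINDOW) -> datetime:
    """Round a timestamp down to the start of the time window that contains it."""
    elapsed = ts - EPOCH
    return EPOCH + (elapsed // window) * window


def tier_for_window(start: datetime, now: datetime) -> str:
    """Pick a storage tier from the age of a window (assumed thresholds)."""
    age = now - start
    if age < timedelta(days=30):
        return "nvme"
    if age < timedelta(days=365):
        return "hdd"
    return "object-storage"


if __name__ == "__main__":
    now = datetime(2024, 6, 1, tzinfo=timezone.utc)
    ts = datetime(2024, 3, 15, 13, 45, tzinfo=timezone.utc)
    start = window_start(ts)
    print(start.isoformat(), tier_for_window(start, now))
```

Because every write in the same day maps to the same window start, a placement job only has to decide once per window rather than once per row.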
Access Pattern Detection
Promoting and demoting data between tiers requires tracking access patterns. Two approaches:
- Last-access tracking: record the last access time per data object or partition key. A background job scans for objects not accessed in N days and demotes them to cold storage.
- S3 Intelligent-Tiering: AWS monitors access patterns automatically and moves objects between its Frequent Access and Infrequent Access tiers. Objects not accessed for 30 consecutive days move to Infrequent Access; accessing such an object moves it back to Frequent Access automatically.
For database workloads, track access time per partition in a separate metadata table to avoid per-row overhead.
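The last-access approach can be sketched in a few lines of Python. This is an in-memory illustration under stated assumptions: `AccessTracker` is a hypothetical class, and a real system would persist the timestamps in the metadata table mentioned above rather than in a dict.

```python
from datetime import datetime, timedelta, timezone
from typing import Dict, List


class AccessTracker:
    """Keeps one last-access timestamp per partition, not per row."""

    def __init__(self) -> None:
        self._last_access: Dict[str, datetime] = {}

    def record_access(self, partition: str, when: datetime) -> None:
        """Remember the most recent access time seen for a partition."""
        previous = self._last_access.get(partition)
        if previous is None or when > previous:
            self._last_access[partition] = when

    def demotion_candidates(self, now: datetime, idle_days: int) -> List[str]:
        """Return partitions not accessed within the last idle_days days."""
        cutoff = now - timedelta(days=idle_days)
        return sorted(p for p, t in self._last_access.items() if t < cutoff)


if __name__ == "__main__":
    now = datetime(2024, 6, 30, tzinfo=timezone.utc)
    tracker = AccessTracker()
    tracker.record_access("metrics-2024-01", now - timedelta(days=90))
    tracker.record_access("metrics-2024-06", now - timedelta(days=2))
    print(tracker.demotion_candidates(now, idle_days=30))
```

A background job would call `demotion_candidates` on a schedule and hand the result to whatever mechanism actually moves the data.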
Tiered Query Routing
Queries spanning multiple tiers must be routed correctly. For time-range queries: if the range covers only hot data (last 30 days), route to the SSD tier only. If it spans hot and cold data, fan out to both tiers and merge the results. For the cold tier, consider pre-warming: before a user runs a historical report, trigger a Glacier restore in advance. A restore makes a temporary readable copy and takes minutes to hours for Glacier Flexible Retrieval, and up to 12 hours (standard) or 48 hours (bulk) for Deep Archive; Glacier Instant Retrieval needs no restore. Expose tier latency in the UI: warn users that queries covering archived data will take longer. Some systems maintain a metadata index on the hot tier that summarizes cold data, enabling aggregate queries without restoring cold data.
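The routing rule for time-range queries can be sketched as a function that splits a range at the hot/cold boundary. This is a minimal Python illustration: `plan_query` and the tier labels are hypothetical, ranges are treated as half-open `[start, end)`, and the 30-day hot window follows the example in the text.

```python
from datetime import datetime, timedelta, timezone
from typing import List, Tuple

SubQuery = Tuple[str, datetime, datetime]  # (tier, start, end)


def plan_query(start: datetime, end: datetime, now: datetime,
               hot_days: int = 30) -> List[SubQuery]:
    """Split [start, end) into per-tier sub-queries at the hot/cold boundary."""
    if start >= end:
        raise ValueError("start must be before end")
    boundary = now - timedelta(days=hot_days)
    plan: List[SubQuery] = []
    if start < boundary:
        # Part of the range is older than the hot window: query the cold tier.
        plan.append(("cold", start, min(end, boundary)))
    if end > boundary:
        # Part of the range falls inside the hot window: query the SSD tier.
        plan.append(("hot", max(start, boundary), end))
    return plan


if __name__ == "__main__":
    now = datetime(2024, 6, 30, tzinfo=timezone.utc)
    jan = datetime(2024, 1, 1, tzinfo=timezone.utc)
    jun = datetime(2024, 6, 20, tzinfo=timezone.utc)
    for tier, s, e in plan_query(jan, jun, now):
        print(tier, s.date(), e.date())
```

A plan with a single "hot" entry can run immediately, while any "cold" entry is the signal to warn the user or trigger a restore before merging results.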
Key Interview Discussion Points
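- Cost calculation: if 90% of data is cold (never accessed after 30 days), storing it at Glacier Instant Retrieval rates instead of SSD cuts the cost of the cold portion by about 99% and the blended bill by about 89% (0.10 × $0.50 + 0.90 × $0.004 ≈ $0.054 vs. $0.50 per GB/month, before retrieval fees); calculate ROI for the specific access pattern
- Retrieval costs: S3 Standard-IA and the Glacier classes charge per GB retrieved and have minimum storage durations (30 days for Standard-IA, 90 days for Glacier Instant Retrieval, 180 days for Deep Archive); factor this in for data that is occasionally accessed
- Compression on cold tiers: compress data before moving it to cold storage (Parquet with zstd can reach around 10x on repetitive structured data, though the ratio depends on the data), multiplying the cost savings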
- Tiering in observability: metrics/logs/traces older than 30 days move to object storage; dashboards covering recent data load fast, historical dashboards are slower but rare
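- Regulatory retention: hot/cold tiering enables cheap long-term retention for compliance (for example, 7-year retention of audit records under SOX, 6-year retention of HIPAA-required documentation). GDPR sets no fixed retention period; it requires deleting personal data once it is no longer needed, which lifecycle expiration rules can enforce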
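The cost and retrieval points above lend themselves to a back-of-the-envelope calculation. The Python sketch below is illustrative only: the function names are hypothetical, the storage prices are the approximate per-GB figures quoted earlier in this article, and the retrieval price and monthly read fraction are placeholder assumptions to be replaced with current pricing and measured access rates.

```python
def blended_cost_per_gb(cold_fraction: float, hot_price: float, cold_price: float,
                        monthly_read_fraction: float = 0.0,
                        retrieval_price: float = 0.0) -> float:
    """Average monthly cost per GB when cold_fraction of the data sits on the cold tier."""
    hot = (1.0 - cold_fraction) * hot_price
    cold = cold_fraction * cold_price
    # Retrieval fees apply only to the share of cold data actually read each month.
    retrieval = cold_fraction * monthly_read_fraction * retrieval_price
    return hot + cold + retrieval


def savings_vs_all_hot(blended: float, hot_price: float) -> float:
    """Fractional saving relative to keeping everything on the hot tier."""
    return 1.0 - blended / hot_price


if __name__ == "__main__":
    SSD = 0.50          # $/GB/month, figure used in this article
    GLACIER_IR = 0.004  # $/GB/month, figure used in this article
    storage_only = blended_cost_per_gb(0.90, SSD, GLACIER_IR)
    # Assumed for illustration: 5% of cold data read per month at $0.03/GB.
    with_reads = blended_cost_per_gb(0.90, SSD, GLACIER_IR, 0.05, 0.03)
    print(f"storage only: ${storage_only:.4f}/GB, saving {savings_vs_all_hot(storage_only, SSD):.1%}")
    print(f"with reads:   ${with_reads:.4f}/GB, saving {savings_vs_all_hot(with_reads, SSD):.1%}")
```

With 90% cold data the storage-only figure comes to about $0.054/GB/month, roughly 89% below an all-SSD layout; the hot 10% dominates the bill, so moving the cold data to an even cheaper class changes the total very little.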