Archive System Low-Level Design

An archive system moves cold data from primary storage to cheaper, slower tiers while keeping it accessible on demand. This pattern appears at every company managing large datasets: email archiving, log archiving, database record archiving, and document storage.

Tiered Storage Model

Hot tier:    Primary PostgreSQL / MySQL — latest 90 days, sub-10ms reads
Warm tier:   Separate "archive" DB or read replica — 90 days to 2 years
Cold tier:   S3 / GCS object storage — 2+ years, minutes to retrieve
Glacier:     AWS Glacier / S3 Glacier — 7+ years, hours to retrieve
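As a sketch, tier selection by record age might look like the following (thresholds mirror the table above; the names and cutoffs are assumptions, not a fixed API):

```python
from datetime import datetime, timedelta, timezone

# Illustrative thresholds mirroring the tier table above.
TIER_THRESHOLDS = [
    (timedelta(days=90), 'hot'),
    (timedelta(days=365 * 2), 'warm'),
    (timedelta(days=365 * 7), 'cold'),
]

def tier_for(created_at, now=None):
    """Return the storage tier a record of this age belongs in."""
    age = (now or datetime.now(timezone.utc)) - created_at
    for threshold, tier in TIER_THRESHOLDS:
        if age < threshold:
            return tier
    return 'glacier'
```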

Data schema to track location across tiers:

Record
  id              BIGSERIAL PK
  created_at      TIMESTAMPTZ NOT NULL
  archived_at     TIMESTAMPTZ          -- null = still in hot tier
  archive_tier    TEXT                 -- 'warm', 'cold', 'glacier'
  archive_key     TEXT                 -- S3 key or warm DB reference
  data            JSONB                -- null if moved to another tier

ArchiveJob
  id              BIGSERIAL PK
  tier_target     TEXT NOT NULL
  cutoff_date     TIMESTAMPTZ NOT NULL
  status          TEXT DEFAULT 'running'
  records_moved   INT DEFAULT 0
  started_at      TIMESTAMPTZ
  completed_at    TIMESTAMPTZ

Archive Worker: Moving to Cold Tier

import gzip
import json

# Assumes a module-level `db` connection and a boto3 `s3` client.
def archive_batch_to_s3(cutoff_date, batch_size=1000):
    while True:
        # Fetch a batch of records to archive
        records = db.execute("""
            SELECT * FROM Record
            WHERE created_at < %(cutoff)s
              AND archived_at IS NULL
            ORDER BY created_at
            LIMIT %(batch_size)s
        """, {'cutoff': cutoff_date, 'batch_size': batch_size})

        if not records:
            break

        # Serialize to JSON and write to S3 (the key uses the first
        # record's month; a batch may straddle a month boundary)
        year_month = records[0].created_at.strftime('%Y/%m')
        s3_key = f'archive/{year_month}/{records[0].id}-{records[-1].id}.jsonl.gz'

        # '\n' joins rows into JSONL; default=str serializes datetimes
        payload = gzip.compress(
            '\n'.join(json.dumps(dict(r), default=str) for r in records).encode()
        )
        s3.put_object(Bucket='my-archive', Key=s3_key, Body=payload)

        # Mark records as archived and remove from hot tier
        ids = [r.id for r in records]
        db.execute("""
            UPDATE Record
            SET archived_at=NOW(), archive_tier='cold',
                archive_key=%(key)s, data=NULL
            WHERE id = ANY(%(ids)s)
        """, {'key': s3_key, 'ids': ids})

        db.commit()
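The serialization format the worker writes (gzip-compressed JSONL) can be isolated into a pair of helpers, which also makes the round trip testable; the names here are illustrative:

```python
import gzip
import json

def serialize_batch(records):
    """Serialize a batch of plain dicts into the gzip-compressed JSONL
    format the worker uploads to S3."""
    body = '\n'.join(json.dumps(r, sort_keys=True) for r in records)
    return gzip.compress(body.encode())

def deserialize_batch(payload):
    """Inverse of serialize_batch: decompress and parse one archive file."""
    return [json.loads(line)
            for line in gzip.decompress(payload).decode().split('\n')]
```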

Retrieval: Restoring an Archived Record

def get_record(record_id):
    record = db.get(Record, record_id)

    if record.data is not None:
        return record.data  # Hot tier — return immediately

    if record.archive_tier == 'warm':
        return warm_db.get(record_id)

    if record.archive_tier == 'cold':
        return restore_from_s3(record.archive_key, record_id)

    if record.archive_tier == 'glacier':
        # Initiate restore (async — takes hours with the Standard tier)
        s3.restore_object(
            Bucket='my-archive', Key=record.archive_key,
            RestoreRequest={'Days': 7,
                            'GlacierJobParameters': {'Tier': 'Standard'}},
        )
        raise ArchiveRestorePending('Glacier restore initiated, retry in 3-5 hours')

def restore_from_s3(s3_key, target_id):
    obj = s3.get_object(Bucket='my-archive', Key=s3_key)
    lines = gzip.decompress(obj['Body'].read()).decode().split('\n')
    for line in lines:
        r = json.loads(line)
        if r['id'] == target_id:
            return r
    raise NotFound(f'Record {target_id} not found in archive file {s3_key}')
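Because one S3 object holds a whole batch, restoring several records from the same batch re-downloads and re-decompresses the file each time. A small in-process cache avoids that; this is a sketch in which `fetch_bytes` stands in for the `s3.get_object` + read step:

```python
import gzip
import json
from functools import lru_cache

def make_cached_restore(fetch_bytes, maxsize=32):
    """Wrap a raw fetcher (s3_key -> gzipped JSONL bytes) with an LRU
    cache of parsed batches so repeated lookups into the same archive
    file hit S3 only once. `fetch_bytes` is an assumed callable that
    would wrap s3.get_object in production."""
    @lru_cache(maxsize=maxsize)
    def load_batch(s3_key):
        lines = gzip.decompress(fetch_bytes(s3_key)).decode().split('\n')
        return {r['id']: r for r in map(json.loads, lines)}

    def restore(s3_key, target_id):
        batch = load_batch(s3_key)
        if target_id not in batch:
            raise KeyError(f'Record {target_id} not found in {s3_key}')
        return batch[target_id]

    return restore
```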

Indexing Archived Data

Problem: archived records have data=NULL in the primary DB. You need to find records without fetching every S3 file.

Solution: Keep a lightweight index in PostgreSQL even after archiving:

-- After archiving, keep id, created_at, archive_key, and key searchable fields
-- Remove only the large data column

-- The archive worker already NULLs data per row; dropping the column
-- outright is only safe once every record in the table is archived
UPDATE Record SET data = NULL WHERE archived_at IS NOT NULL;

-- Add index for date-range queries on archived records
CREATE INDEX idx_record_archived ON Record(created_at)
WHERE archived_at IS NOT NULL;

-- For text search on archived content: Elasticsearch index
-- Pipe records through Kafka to ES before/during archival
-- ES retains searchable fields; full record retrieved from S3

Archive Policy Configuration

ArchivePolicy
  id              BIGSERIAL PK
  record_type     TEXT NOT NULL      -- 'orders', 'logs', 'messages'
  hot_retention   INT NOT NULL       -- days to keep in hot tier
  warm_retention  INT                -- days in warm tier (null = skip)
  cold_retention  INT                -- days in cold tier
  glacier_enabled BOOLEAN DEFAULT false
  compress        BOOLEAN DEFAULT true
  enabled         BOOLEAN DEFAULT true

Store policies in the database — not hardcoded — so they can be updated without code deploys. The archive worker reads the policy table to determine per-record-type cutoff dates.
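A minimal sketch of how the worker might turn a policy row into a cutoff date for its batch query (the dict-style policy access is an assumption):

```python
from datetime import datetime, timedelta, timezone

def cutoff_for(policy, now=None):
    """Translate a policy row's hot_retention (days) into the cutoff
    timestamp the archive worker passes to its batch query.
    `policy` is assumed to be a dict-like row."""
    now = now or datetime.now(timezone.utc)
    return now - timedelta(days=policy['hot_retention'])
```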

Safety: Soft Delete Before Hard Delete

-- Phase 1 (day 0): Mark records for archival, do NOT delete yet
UPDATE Record
SET archived_at=NOW(), archive_tier='cold', archive_key=:key, data=NULL
WHERE id = ANY(:ids);

-- Phase 2 (day 7): Verify S3 file is accessible and correct count
SELECT count(*) FROM Record WHERE archive_key=:key
-- Compare against line count in S3 file

-- Phase 3 (day 30): Hard delete raw data if verified
-- At this point data=NULL already; the S3 file is the record of truth

Never hard-delete source data in the same job that writes to S3. Add a verification step with a delay.
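The phase-2 check can be sketched as a line-count comparison; the helper name is an assumption, and `payload` is the downloaded S3 object body:

```python
import gzip

def verify_archive_batch(db_count, payload):
    """Phase-2 check: compare the row count the DB says was archived
    under one archive_key against the number of JSONL lines actually
    present in the downloaded (gzipped) S3 object body."""
    lines = gzip.decompress(payload).decode().split('\n')
    return sum(1 for l in lines if l.strip()) == db_count
```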

Key Interview Points

  • Separate archival from deletion: Archive first, verify, delete later. Data loss from a buggy archive job is catastrophic.
  • Keep index rows in primary DB: Retain id, created_at, archive_key. The full payload moves to S3; searching remains fast.
  • Batch size tuning: 1,000-5,000 records per batch avoids long transactions and excessive S3 PUTs. S3 PUT costs $0.005/1K requests — batching matters.
  • Glacier retrieval latency: Standard retrieval takes 3-5 hours; Expedited costs $0.03/GB but takes 1-5 min. Design UI to set expectations — no “instant” access for glacier data.
  • Compression: JSONL files gzip at 5-10x typical compression. A 1GB archive batch becomes 100-200MB in S3. Always compress.
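The compression claim is easy to sanity-check. The sketch below uses synthetic, highly repetitive rows, so the resulting ratio overstates what typical production data achieves:

```python
import gzip
import json

# Synthetic order-like rows; repetitive fields compress far better than
# real data, so treat the ratio as an upper bound.
rows = [{'id': i, 'status': 'shipped', 'note': 'ok' * 20} for i in range(2000)]
raw = '\n'.join(json.dumps(r) for r in rows).encode()
packed = gzip.compress(raw)
ratio = len(raw) / len(packed)
```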
