API Key Management Low-Level Design: Generation, Scopes, Rate Limiting, and Rotation

API key management handles the full lifecycle of programmatic access credentials: creation, scoped authorization, usage tracking, rotation, and revocation. API keys authenticate machine-to-machine requests where interactive OAuth flows are impractical. The design challenges are storing keys securely (you should not be able to recover a key after creation), enforcing per-key rate limits and scope restrictions, and supporting rotation without downtime.

Core Data Model

CREATE TABLE ApiKey (
    key_id          UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    user_id         BIGINT NOT NULL REFERENCES "User"(id),  -- quoted: USER is a reserved word in PostgreSQL
    name            VARCHAR(100) NOT NULL,         -- human label: "production key", "CI/CD"
    key_prefix      VARCHAR(8) NOT NULL,           -- first 8 chars, shown in UI
    key_hash        VARCHAR(64) NOT NULL UNIQUE,   -- SHA-256 of full key
    scopes          TEXT[] NOT NULL DEFAULT '{}',  -- ['read:users', 'write:orders']
    rate_limit_rpm  INT NOT NULL DEFAULT 1000,     -- requests per minute
    expires_at      TIMESTAMPTZ,                   -- NULL = no expiry
    last_used_at    TIMESTAMPTZ,
    created_at      TIMESTAMPTZ DEFAULT NOW(),
    revoked_at      TIMESTAMPTZ,
    revoke_reason   TEXT
);

CREATE INDEX idx_apikey_user ON ApiKey(user_id) WHERE revoked_at IS NULL;
CREATE INDEX idx_apikey_hash ON ApiKey(key_hash) WHERE revoked_at IS NULL;

-- Usage log for billing and audit
CREATE TABLE ApiKeyUsage (
    id          BIGSERIAL,
    key_id      UUID NOT NULL,
    endpoint    VARCHAR(255) NOT NULL,
    status_code INT NOT NULL,
    latency_ms  INT,
    ip_address  INET,
    occurred_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
    PRIMARY KEY (id, occurred_at)  -- PostgreSQL requires the partition key in the primary key
) PARTITION BY RANGE (occurred_at);
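
A range-partitioned table accepts rows only once a partition covering their occurred_at value exists. As a rough sketch, assuming monthly partitions and a hypothetical naming scheme ApiKeyUsage_YYYY_MM (neither is specified above), a helper like this could generate the DDL for a scheduled job to run:

```python
from datetime import date

def monthly_partition_ddl(year: int, month: int) -> str:
    """Build the CREATE TABLE statement for one monthly ApiKeyUsage partition.

    The partition name format is an assumption made for this sketch.
    """
    start = date(year, month, 1)
    # First day of the following month; the upper bound is exclusive.
    end = date(year + 1, 1, 1) if month == 12 else date(year, month + 1, 1)
    name = f"ApiKeyUsage_{start:%Y_%m}"
    return (
        f"CREATE TABLE {name} PARTITION OF ApiKeyUsage "
        f"FOR VALUES FROM ('{start.isoformat()}') TO ('{end.isoformat()}');"
    )

print(monthly_partition_ddl(2025, 12))
```

Creating partitions a month or two ahead avoids insert failures when the calendar rolls over.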

Key Generation and Storage

import secrets
import hashlib

def create_api_key(user_id: int, name: str, scopes: list[str],
                   rate_limit_rpm: int = 1000) -> dict:
    """
    Returns the full key ONCE — never stored in plaintext.
    After this call, only the hash and prefix are retained.
    """
    # Format: prefix_randompart  e.g., "sk_live_aB3kR9xZmQpLnT7vWq2Y"
    prefix = "sk_live"
    random_part = secrets.token_urlsafe(32)  # 256 bits of entropy
    full_key = f"{prefix}_{random_part}"

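    # First 8 random chars identify the key in the UI; full_key[:8] would be
    # "sk_live_" for every key and could not tell two keys apart.
    key_prefix = random_part[:8]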
    key_hash = hashlib.sha256(full_key.encode()).hexdigest()

    key_id = db.fetchone("""
        INSERT INTO ApiKey (user_id, name, key_prefix, key_hash, scopes, rate_limit_rpm)
        VALUES (%s, %s, %s, %s, %s, %s)
        RETURNING key_id
    """, [user_id, name, key_prefix, key_hash, scopes, rate_limit_rpm])['key_id']

    return {
        'key_id': str(key_id),
        'key': full_key,           # shown to user ONCE, never again
        'prefix': key_prefix,
        'message': 'Store this key securely. It will not be shown again.'
    }
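
Because only the hash is stored, checking a presented key means hashing it again and comparing digests. The following is a minimal in-memory sketch of that round trip; issue_key and matches are hypothetical names used for illustration and are not part of the design above.

```python
import hashlib
import hmac
import secrets

def issue_key(prefix: str = "sk_live") -> tuple[str, str]:
    """Return (full_key, key_hash); only key_hash would be persisted."""
    full_key = f"{prefix}_{secrets.token_urlsafe(32)}"
    key_hash = hashlib.sha256(full_key.encode()).hexdigest()
    return full_key, key_hash

def matches(presented_key: str, stored_hash: str) -> bool:
    """Re-hash the presented key and compare digests in constant time."""
    candidate = hashlib.sha256(presented_key.encode()).hexdigest()
    return hmac.compare_digest(candidate, stored_hash)

full_key, stored_hash = issue_key()
print(matches(full_key, stored_hash))        # True
print(matches(full_key + "x", stored_hash))  # False
```

In the database-backed flow the comparison is done implicitly by the indexed lookup on key_hash, so an explicit compare like this is mainly useful in tests or in stores without such an index.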

Request Authentication and Rate Limiting

import json
import time
from datetime import datetime, timezone

def authenticate_request(api_key_raw: str, required_scope: str,
                          endpoint: str, ip: str) -> dict:
    key_hash = hashlib.sha256(api_key_raw.encode()).hexdigest()

    # Cache lookups: most keys are valid and hot
    cache_key = f"apikey:{key_hash}"
    cached = redis.get(cache_key)
    if cached:
        key_data = json.loads(cached)
    else:
        key_data = db.fetchone("""
            SELECT key_id, user_id, scopes, rate_limit_rpm, expires_at
            FROM ApiKey
            WHERE key_hash = %s AND revoked_at IS NULL
        """, [key_hash])

        if not key_data:
            raise UnauthorizedError("Invalid API key")

        # Cache for 5 minutes; revocation takes up to 5 min to propagate
        redis.setex(cache_key, 300, json.dumps(key_data, default=str))

    # Check expiry on every request, including cache hits. Cached values come
    # back as ISO strings (json.dumps(default=str)), so parse them first.
    expires_at = key_data['expires_at']
    if isinstance(expires_at, str):
        expires_at = datetime.fromisoformat(expires_at)
    # TIMESTAMPTZ values are timezone-aware, so compare against an aware "now"
    if expires_at and datetime.now(timezone.utc) > expires_at:
        raise UnauthorizedError("API key expired")

    # Check scope
    if required_scope not in (key_data['scopes'] or []):
        raise ForbiddenError(f"Key lacks required scope: {required_scope}")

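    # Per-key rate limiting using a fixed one-minute window counter in Redis.
    # The key embeds the minute bucket, so the count resets at each minute boundary.
    rate_key = f"rate:{key_data['key_id']}:{int(time.time()) // 60}"
    count = redis.incr(rate_key)
    redis.expire(rate_key, 120)  # let Redis drop the bucket after 2 minutes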
    if count > key_data['rate_limit_rpm']:
        raise RateLimitError(f"Rate limit exceeded: {key_data['rate_limit_rpm']} RPM")

    # Async usage logging (non-blocking)
    log_usage_async(key_data['key_id'], endpoint, ip)

    return key_data

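def log_usage_async(key_id: str, endpoint: str, ip: str):
    """Fire-and-forget usage logging."""
    # Write coalescing for last_used_at: SET NX succeeds at most once per
    # 60 seconds per key, so the DB row is touched at most once a minute.
    # 'touch_last_used' is a hypothetical job name for that update.
    if redis.set(f"last_used:{key_id}", int(time.time()), ex=60, nx=True):
        usage_queue.enqueue('touch_last_used', key_id=key_id)
    # Every request is still queued for the ApiKeyUsage log.
    usage_queue.enqueue('flush_key_usage', key_id=key_id, endpoint=endpoint, ip=ip)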
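
The Redis counter in authenticate_request is keyed by the minute bucket, so it behaves as a fixed-window counter: a caller can send close to twice the limit in a short burst that straddles a minute boundary. If a true sliding window is wanted, one option is a sliding-window log. The sketch below is in-memory and single-process, with a hypothetical class name, and only illustrates the technique; a shared deployment would need the same logic in Redis (for example with a sorted set per key).

```python
from collections import deque

class SlidingWindowLimiter:
    """Sliding-window log: keeps the timestamps of accepted requests per key."""

    def __init__(self, limit: int, window_seconds: float = 60.0):
        self.limit = limit
        self.window = window_seconds
        self._hits: dict[str, deque] = {}

    def allow(self, key_id: str, now: float) -> bool:
        hits = self._hits.setdefault(key_id, deque())
        # Drop timestamps that have fallen out of the trailing window.
        cutoff = now - self.window
        while hits and hits[0] <= cutoff:
            hits.popleft()
        if len(hits) >= self.limit:
            return False  # rejected requests are not recorded
        hits.append(now)
        return True

limiter = SlidingWindowLimiter(limit=3, window_seconds=60)
print([limiter.allow("key-1", t) for t in (0, 10, 20, 30, 60)])
# [True, True, True, False, True]
```

The trade-off is memory: the log stores one timestamp per accepted request, whereas the fixed-window counter stores a single integer per key per minute.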

Key Rotation Without Downtime

from datetime import datetime, timedelta, timezone

def rotate_api_key(key_id: str, user_id: int) -> dict:
    """
    Issue a new key while keeping the old one valid for a grace period.
    Allows callers to update their key without downtime.
    """
    old_key = db.fetchone("""
        SELECT * FROM ApiKey WHERE key_id=%s AND user_id=%s AND revoked_at IS NULL
    """, [key_id, user_id])
    if not old_key:
        raise NotFoundError("Key not found")

    # Issue new key with same config
    new_key_result = create_api_key(
        user_id=user_id,
        name=f"{old_key['name']} (rotated)",
        scopes=old_key['scopes'],
        rate_limit_rpm=old_key['rate_limit_rpm']
    )

    # Let the old key expire after the grace period (24 hours). Writing a
    # future revoked_at would lock the key out immediately, because the
    # authentication query only accepts rows WHERE revoked_at IS NULL.
    # LEAST ignores NULLs in PostgreSQL, so an earlier expiry is preserved.
    revoke_at = datetime.now(timezone.utc) + timedelta(hours=24)
    db.execute("""
        UPDATE ApiKey SET expires_at = LEAST(expires_at, %s)
        WHERE key_id=%s
    """, [revoke_at, key_id])

    return {
        'new_key': new_key_result['key'],
        'old_key_revokes_at': revoke_at.isoformat(),
        'message': 'Update your integration within 24 hours'
    }
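
The grace period amounts to a deadline check at authentication time. A minimal sketch, assuming the key row carries optional timezone-aware revoked_at and expires_at values and that both are treated as deadlines; the helper name key_is_active is hypothetical:

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

def key_is_active(revoked_at: Optional[datetime],
                  expires_at: Optional[datetime],
                  now: datetime) -> bool:
    """A key is usable until its earliest deadline; None means no deadline."""
    deadlines = [d for d in (revoked_at, expires_at) if d is not None]
    return all(now < d for d in deadlines)

now = datetime(2025, 1, 1, 12, 0, tzinfo=timezone.utc)
grace_end = now + timedelta(hours=24)
print(key_is_active(None, grace_end, now))                        # True
print(key_is_active(None, grace_end, now + timedelta(hours=25)))  # False
```

Keeping the check as a pure function of the timestamps and the current time makes the grace-period behavior easy to unit test without a database.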

Key Interview Points

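  • Never store raw API keys; store a SHA-256 hash. If the database is breached, attackers get only hashes, which cannot be used to authenticate. A fast, unsalted hash is acceptable here (unlike for passwords) because each key carries 256 bits of random entropy and cannot be brute-forced. The raw key is shown once at creation.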
  • The key_prefix field (first 8 chars) lets users identify which key made a request in logs without exposing the full key — safe to display in UI and audit logs.
  • Cache key lookups in Redis with a short TTL (5 minutes) — authentication happens on every request; a DB query per request is too expensive at scale. Revocation takes up to one TTL to propagate.
  • Per-key rate limiting isolates misbehaving callers: one runaway integration exhausts only its own quota and cannot use up the request budget of other keys.
  • Rotation with a grace period allows zero-downtime key updates — callers have 24 hours to switch to the new key before the old one is revoked.
  • Scopes limit blast radius: a key with only read:reports scope cannot write data even if stolen. Always require callers to request the minimum scope needed.
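
The revocation delay caused by the cache TTL can be shortened by evicting the cache entry at the moment a key is revoked. A minimal sketch, assuming the cache key format apikey:{key_hash} used in the authentication code; revoke_key, InMemoryCache, and the dict standing in for the ApiKey table are hypothetical stand-ins so the example runs without Redis or a database:

```python
from datetime import datetime, timezone

class InMemoryCache:
    """Stand-in for Redis with just the calls this sketch needs."""
    def __init__(self):
        self._data = {}

    def set(self, key, value):
        self._data[key] = value

    def get(self, key):
        return self._data.get(key)

    def delete(self, key):
        self._data.pop(key, None)

def revoke_key(rows: dict, cache: InMemoryCache, key_hash: str, reason: str) -> bool:
    """Mark the row revoked, then evict its cache entry so the next request misses."""
    row = rows.get(key_hash)
    if row is None or row["revoked_at"] is not None:
        return False
    row["revoked_at"] = datetime.now(timezone.utc)
    row["revoke_reason"] = reason
    cache.delete(f"apikey:{key_hash}")
    return True

rows = {"abc123": {"revoked_at": None, "revoke_reason": None}}
cache = InMemoryCache()
cache.set("apikey:abc123", '{"scopes": ["read:users"]}')
print(revoke_key(rows, cache, "abc123", "leaked"))  # True
print(cache.get("apikey:abc123"))                   # None
```

Updating the row before deleting the cache entry matters: in the opposite order, a concurrent request could repopulate the cache from the still-valid row.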

