API Key Management Low-Level Design: Generation, Scopes, Rate Limiting, and Rotation

API key management handles the full lifecycle of programmatic access credentials: creation, scoped authorization, usage tracking, rotation, and revocation. API keys authenticate machine-to-machine requests where interactive OAuth flows are impractical. The design challenges are storing keys securely (the service itself must not be able to recover a key after creation), enforcing per-key rate limits and scope restrictions, and supporting rotation without downtime.

Core Data Model

CREATE TABLE ApiKey (
    key_id          UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    user_id         BIGINT NOT NULL REFERENCES User(id),
    name            VARCHAR(100) NOT NULL,         -- human label: "production key", "CI/CD"
    key_prefix      VARCHAR(12) NOT NULL,          -- static prefix + first few random chars, shown in UI
    key_hash        VARCHAR(64) NOT NULL UNIQUE,   -- SHA-256 of full key
    scopes          TEXT[] NOT NULL DEFAULT '{}',  -- ['read:users', 'write:orders']
    rate_limit_rpm  INT NOT NULL DEFAULT 1000,     -- requests per minute
    expires_at      TIMESTAMPTZ,                   -- NULL = no expiry
    last_used_at    TIMESTAMPTZ,
    created_at      TIMESTAMPTZ DEFAULT NOW(),
    revoked_at      TIMESTAMPTZ,
    revoke_reason   TEXT
);

CREATE INDEX idx_apikey_user ON ApiKey(user_id) WHERE revoked_at IS NULL;
-- key_hash lookups are served by the index its UNIQUE constraint already creates

-- Usage log for billing and audit
CREATE TABLE ApiKeyUsage (
    id          BIGSERIAL PRIMARY KEY,
    key_id      UUID NOT NULL,
    endpoint    VARCHAR(255) NOT NULL,
    status_code INT NOT NULL,
    latency_ms  INT,
    ip_address  INET,
    occurred_at TIMESTAMPTZ NOT NULL DEFAULT NOW()
) PARTITION BY RANGE (occurred_at);
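ApiKeyUsage is range-partitioned by `occurred_at`, so a maintenance job must create each partition ahead of time. A minimal sketch of a helper that generates the monthly-partition DDL (the partition naming scheme here is an assumption, not part of the schema above):

```python
from datetime import date

def monthly_partition_ddl(year: int, month: int) -> str:
    """Build the CREATE TABLE statement for one monthly ApiKeyUsage partition."""
    start = date(year, month, 1)
    # First day of the following month is the exclusive upper bound
    end = date(year + 1, 1, 1) if month == 12 else date(year, month + 1, 1)
    name = f"apikeyusage_{start:%Y_%m}"
    return (
        f"CREATE TABLE IF NOT EXISTS {name} PARTITION OF ApiKeyUsage "
        f"FOR VALUES FROM ('{start}') TO ('{end}');"
    )
```

A scheduled job would run this a month or two in advance; old partitions can be detached and dropped wholesale, which is far cheaper than DELETE on a huge table.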

Key Generation and Storage

import secrets
import hashlib

def create_api_key(user_id: int, name: str, scopes: list[str],
                   rate_limit_rpm: int = 1000) -> dict:
    """
    Returns the full key ONCE — never stored in plaintext.
    After this call, only the hash and prefix are retained.
    """
    # Format: prefix_randompart  e.g., "sk_live_aB3kR9xZmQpLnT7vWq2Y"
    prefix = "sk_live"
    random_part = secrets.token_urlsafe(32)  # 32 random bytes = 256 bits of entropy
    full_key = f"{prefix}_{random_part}"

    # Take the static prefix plus a few random chars; "sk_live_" alone
    # would be identical for every key and identify nothing.
    key_prefix = full_key[:12]
    key_hash = hashlib.sha256(full_key.encode()).hexdigest()

    key_id = db.fetchone("""
        INSERT INTO ApiKey (user_id, name, key_prefix, key_hash, scopes, rate_limit_rpm)
        VALUES (%s, %s, %s, %s, %s, %s)
        RETURNING key_id
    """, [user_id, name, key_prefix, key_hash, scopes, rate_limit_rpm])['key_id']

    return {
        'key_id': str(key_id),
        'key': full_key,           # shown to user ONCE, never again
        'prefix': key_prefix,
        'message': 'Store this key securely. It will not be shown again.'
    }
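The reason lookup-by-hash works without ever storing the plaintext key is that SHA-256 is deterministic: hashing the key the caller presents reproduces the stored hash exactly. A standalone sketch (`make_key` is a hypothetical, DB-free variant of `create_api_key` above):

```python
import hashlib
import secrets

def make_key(env: str = "live") -> tuple[str, str, str]:
    """Return (full_key, display_prefix, stored_hash).
    Only the prefix and hash would be persisted; the full key is
    shown to the caller exactly once."""
    full_key = f"sk_{env}_{secrets.token_urlsafe(32)}"
    return full_key, full_key[:12], hashlib.sha256(full_key.encode()).hexdigest()

full, prefix, stored = make_key()
# Re-hashing the presented key reproduces the stored hash
assert hashlib.sha256(full.encode()).hexdigest() == stored
```

Note that unlike passwords, a slow hash (bcrypt, argon2) is unnecessary here: the key carries 256 bits of entropy, so brute-forcing the SHA-256 preimage is infeasible, and a fast hash keeps per-request authentication cheap.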

Request Authentication and Rate Limiting

import json
import time
from datetime import datetime, timedelta, timezone

def authenticate_request(api_key_raw: str, required_scope: str,
                         endpoint: str, ip: str) -> dict:
    key_hash = hashlib.sha256(api_key_raw.encode()).hexdigest()

    # Cache lookups: most keys are valid and hot
    cache_key = f"apikey:{key_hash}"
    cached = redis.get(cache_key)
    if cached:
        key_data = json.loads(cached)
    else:
        # A future revoked_at is a scheduled (rotation) revocation:
        # the key stays valid until that timestamp.
        key_data = db.fetchone("""
            SELECT key_id, user_id, scopes, rate_limit_rpm, expires_at
            FROM ApiKey
            WHERE key_hash = %s
              AND (revoked_at IS NULL OR revoked_at > NOW())
        """, [key_hash])

        if not key_data:
            raise UnauthorizedError("Invalid API key")

        # Timezone-aware now() matches the TIMESTAMPTZ value from the driver
        if key_data['expires_at'] and datetime.now(timezone.utc) > key_data['expires_at']:
            raise UnauthorizedError("API key expired")

        # Cache for 5 minutes; revocation and expiry take up to one TTL to propagate
        redis.setex(cache_key, 300, json.dumps(key_data, default=str))

    # Check scope
    if required_scope not in (key_data['scopes'] or []):
        raise ForbiddenError(f"Key lacks required scope: {required_scope}")

    # Per-key rate limiting using a fixed-window counter in Redis
    # (O(1) per request; permits up to 2x bursts across window boundaries)
    rate_key = f"rate:{key_data['key_id']}:{int(time.time()) // 60}"
    count = redis.incr(rate_key)
    if count == 1:
        redis.expire(rate_key, 120)  # set TTL once, spanning two windows
    if count > key_data['rate_limit_rpm']:
        raise RateLimitError(f"Rate limit exceeded: {key_data['rate_limit_rpm']} RPM")

    # Async usage logging (non-blocking)
    log_usage_async(key_data['key_id'], endpoint, ip)

    return key_data

def log_usage_async(key_id: str, endpoint: str, ip: str):
    """Fire-and-forget usage logging."""
    # Every request gets a usage row: billing and audit need a complete log
    usage_queue.enqueue('flush_key_usage', key_id=key_id, endpoint=endpoint, ip=ip)

    # Write coalescing for last_used_at: SET NX EX succeeds at most once
    # per key per 60 seconds, so the hot DB row is touched at most once a minute
    if redis.set(f"last_used:{key_id}", int(time.time()), ex=60, nx=True):
        db.execute("UPDATE ApiKey SET last_used_at = NOW() WHERE key_id = %s",
                   [key_id])

Key Rotation Without Downtime

def rotate_api_key(key_id: str, user_id: int) -> dict:
    """
    Issue a new key while keeping the old one valid for a grace period.
    Allows callers to update their key without downtime.
    """
    old_key = db.fetchone("""
        SELECT * FROM ApiKey WHERE key_id=%s AND user_id=%s AND revoked_at IS NULL
    """, [key_id, user_id])
    if not old_key:
        raise NotFoundError("Key not found")

    # Issue new key with same config
    new_key_result = create_api_key(
        user_id=user_id,
        name=f"{old_key['name']} (rotated)",
        scopes=old_key['scopes'],
        rate_limit_rpm=old_key['rate_limit_rpm']
    )

    # Schedule old key revocation after grace period (24 hours)
    # Schedule old-key revocation after a 24-hour grace period. The auth
    # path must treat a future revoked_at as "still valid until then".
    revoke_at = datetime.now(timezone.utc) + timedelta(hours=24)
    db.execute("""
        UPDATE ApiKey SET revoked_at=%s, revoke_reason='rotated'
        WHERE key_id=%s
    """, [revoke_at, key_id])

    return {
        'new_key': new_key_result['key'],
        'old_key_revokes_at': revoke_at.isoformat(),
        'message': 'Update your integration within 24 hours'
    }

Key Interview Points

  • Never store raw API keys — store SHA-256 hash. If the database is breached, attackers get only hashes, which cannot be used to authenticate. The raw key is shown once at creation.
  • The key_prefix field (the static prefix plus the first few random characters) lets users identify which key made a request in logs without exposing the full key; it is safe to display in the UI and in audit logs.
  • Cache key lookups in Redis with a short TTL (5 minutes) — authentication happens on every request; a DB query per request is too expensive at scale. Revocation takes up to one TTL to propagate.
  • Per-key rate limiting isolates misbehaving callers — one runaway integration cannot impact other users’ rate limits.
  • Rotation with a grace period allows zero-downtime key updates — callers have 24 hours to switch to the new key before the old one is revoked.
  • Scopes limit blast radius: a key with only read:reports scope cannot write data even if stolen. Always require callers to request the minimum scope needed.
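The scope check from the bullets above is an exact-membership test at the authentication middleware. A minimal sketch, with `PermissionError` standing in for an HTTP 403 response (the function name is illustrative):

```python
def require_scope(key_scopes: list[str], required: str) -> None:
    """Reject unless the key explicitly holds the required scope.
    No wildcard expansion: explicit scopes keep the blast radius auditable."""
    if required not in key_scopes:
        raise PermissionError(f"Key lacks required scope: {required}")

require_scope(["read:reports", "read:users"], "read:users")   # passes silently
```

In PostgreSQL the equivalent check can be pushed into the query with array containment: `WHERE scopes @> ARRAY[%s]`.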


