An audit logging system records every significant action in a platform — who did what to which resource, when, and with what outcome — in a tamper-evident, append-only store. This guide covers the low-level design: structured event schema, append-only enforcement, hash chaining for tamper detection, partition-based retention, and a compliance query API.
1. Structured Event Schema
Each audit event captures five dimensions:
- Who:
actor_id(user or service identifier),actor_type(HUMAN, SERVICE, SYSTEM) - What:
action(CREATE, READ, UPDATE, DELETE, LOGIN, EXPORT, …),resource_type,resource_id - Context:
ip_address,user_agent,request_id(correlates with application trace) - Outcome: SUCCESS or FAILURE (with error code)
- State diff:
before_stateJSONB andafter_stateJSONB — capture only changed fields for UPDATE events to keep storage bounded
-- Audit event table, partitioned by month
CREATE TABLE AuditEvent (
id BIGSERIAL,
created_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
actor_id TEXT NOT NULL,
actor_type TEXT NOT NULL,
action TEXT NOT NULL,
resource_type TEXT NOT NULL,
resource_id TEXT NOT NULL,
outcome TEXT NOT NULL, -- SUCCESS | FAILURE
error_code TEXT,
ip_address INET,
user_agent TEXT,
request_id UUID,
before_state JSONB,
after_state JSONB,
prev_hash TEXT NOT NULL, -- SHA-256 of previous record
record_hash TEXT NOT NULL -- SHA-256 of this record fields + prev_hash
) PARTITION BY RANGE (created_at);
-- Monthly partitions (created by automated job or pg_partman)
CREATE TABLE AuditEvent_2025_01
PARTITION OF AuditEvent
FOR VALUES FROM ('2025-01-01') TO ('2025-02-01');
-- Indexes on common filter columns
CREATE INDEX ON AuditEvent (actor_id, created_at);
CREATE INDEX ON AuditEvent (resource_type, resource_id, created_at);
CREATE INDEX ON AuditEvent (action, created_at);
2. Append-Only Enforcement
Application code must never issue UPDATE or DELETE on AuditEvent. Enforce this at two levels:
- DB trigger: a BEFORE UPDATE OR DELETE trigger raises an exception unconditionally.
- DB role: the application database user is granted INSERT and SELECT only — no UPDATE, DELETE, or TRUNCATE on
AuditEvent.
CREATE OR REPLACE FUNCTION audit_immutable()
RETURNS TRIGGER LANGUAGE plpgsql AS $$
BEGIN
RAISE EXCEPTION 'Audit log is immutable: % not allowed', TG_OP;
END;
$$;
CREATE TRIGGER audit_no_modify
BEFORE UPDATE OR DELETE ON AuditEvent
FOR EACH ROW EXECUTE FUNCTION audit_immutable();
Partition detachment (for archival) is a privileged DBA operation logged separately and requires a second approver in the runbook.
3. Hash Chaining for Tamper Detection
Each record stores prev_hash (the record_hash of the immediately preceding row) and record_hash (SHA-256 of the concatenation of all record fields plus prev_hash). The genesis record uses prev_hash = '0' * 64.
To verify integrity of a range [A, B]:
- Fetch all records ordered by id from A to B.
- For each record, recompute SHA-256(fields + prev_hash) and compare to stored
record_hash. - Verify that record[n].prev_hash == record[n-1].record_hash.
- Any mismatch indicates tampering or gap at that position.
For large tables, verify in chunks of 10,000 rows and publish the chain head hash to an external notary (e.g., a public blockchain timestamp or a signed S3 object) periodically.
4. Partitioning and Retention
- Monthly RANGE partitions allow dropping old data by detaching a partition instead of row-level deletes — avoiding table bloat and lock contention.
- After detachment, the partition is compressed (pg_dump to zstd) and moved to cold storage (S3 Glacier).
- A
RetentionPolicytable maps event types to retention durations. Financial transactions: 7 years (SOX). Auth events: 1 year. Debug events: 90 days. - A nightly job computes which partitions are past retention for all their event types and schedules detachment.
5. Compliance Query API
Queries use a cursor of (created_at, id) for stable pagination that survives concurrent inserts:
-- Fetch next page after cursor
SELECT * FROM AuditEvent
WHERE actor_id = $1
AND created_at BETWEEN $2 AND $3
AND (created_at, id) > ($4, $5) -- cursor
ORDER BY created_at, id
LIMIT 100;
Compound index on (actor_id, created_at, id) satisfies this query without a sort step.
6. Python Reference Implementation
import hashlib, json, time, uuid
from typing import Optional, Any
GENESIS_HASH = "0" * 64
def _compute_hash(prev_hash: str, fields: dict) -> str:
payload = prev_hash + json.dumps(fields, sort_keys=True, default=str)
return hashlib.sha256(payload.encode()).hexdigest()
def log_event(
actor_id: str,
actor_type: str,
action: str,
resource_type: str,
resource_id: str,
outcome: str,
ip_address: Optional[str] = None,
request_id: Optional[str] = None,
before: Optional[Any] = None,
after: Optional[Any] = None,
) -> int:
"""Insert a new audit event; compute and chain hashes."""
# Serialize state diffs — only store changed keys for UPDATE
before_state = json.dumps(before) if before is not None else None
after_state = json.dumps(after) if after is not None else None
# Fetch previous hash with row lock to serialize inserts
row = db.query_one(
"SELECT record_hash FROM AuditEvent ORDER BY id DESC LIMIT 1 FOR UPDATE"
)
prev_hash = row.record_hash if row else GENESIS_HASH
fields = dict(
actor_id=actor_id, actor_type=actor_type, action=action,
resource_type=resource_type, resource_id=resource_id,
outcome=outcome, ip_address=ip_address, request_id=request_id,
before_state=before_state, after_state=after_state,
)
record_hash = _compute_hash(prev_hash, fields)
event_id = db.execute_returning(
"""INSERT INTO AuditEvent
(actor_id, actor_type, action, resource_type, resource_id, outcome,
ip_address, request_id, before_state, after_state, prev_hash, record_hash)
VALUES (%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s) RETURNING id""",
[actor_id, actor_type, action, resource_type, resource_id, outcome,
ip_address, request_id, before_state, after_state, prev_hash, record_hash]
)
return event_id
def verify_chain(from_id: int, to_id: int) -> bool:
"""Recompute and verify hash chain over a range of records."""
rows = db.query(
"SELECT * FROM AuditEvent WHERE id BETWEEN %s AND %s ORDER BY id",
[from_id, to_id]
)
expected_prev = rows[0].prev_hash
for row in rows:
if row.prev_hash != expected_prev:
print(f"Chain break at id={row.id}: expected prev_hash {expected_prev}")
return False
fields = {k: getattr(row, k) for k in
["actor_id","actor_type","action","resource_type","resource_id",
"outcome","ip_address","request_id","before_state","after_state"]}
computed = _compute_hash(row.prev_hash, fields)
if computed != row.record_hash:
print(f"Hash mismatch at id={row.id}")
return False
expected_prev = row.record_hash
return True
def query_audit(filters: dict, cursor: Optional[tuple] = None, page_size: int = 100) -> list:
"""Paginated compliance query with optional cursor (created_at, id)."""
conditions = []
params = []
if "actor_id" in filters:
conditions.append("actor_id = %s"); params.append(filters["actor_id"])
if "resource_type" in filters:
conditions.append("resource_type = %s"); params.append(filters["resource_type"])
if "action" in filters:
conditions.append("action = %s"); params.append(filters["action"])
if "from_date" in filters:
conditions.append("created_at >= %s"); params.append(filters["from_date"])
if "to_date" in filters:
conditions.append("created_at (%s, %s)")
params.extend(cursor)
where = ("WHERE " + " AND ".join(conditions)) if conditions else ""
params.append(page_size)
return db.query(
f"SELECT * FROM AuditEvent {where} ORDER BY created_at, id LIMIT %s",
params
)
7. Operational Considerations
- Write throughput: the row-level lock for hash chaining serializes inserts. For high-throughput systems, use a dedicated audit writer service with an in-memory queue and micro-batch inserts (batch 100 events, chain within the batch, single transaction). Alternatively, use asynchronous chaining: insert without hash, then a background job chains in order — but this creates a window where records are unchained.
- Index bloat: monthly partition pruning keeps index sizes bounded; use BRIN indexes on
created_atwithin each partition for range scans. - Export for SIEM: tail the
AuditEventtable with logical replication or a Change Data Capture (CDC) tool and stream to Splunk / Elasticsearch.
See also: Atlassian Interview Guide
See also: Meta Interview Guide 2026: Facebook, Instagram, WhatsApp Engineering
See also: Stripe Interview Guide 2026: Process, Bug Bash Round, and Payment Systems