What is an Audit Log?
An audit log is an immutable, append-only record of every significant action taken in a system: who did what, to which resource, when, and from where. Audit logs are required for compliance (SOC 2, HIPAA, PCI-DSS), security incident investigation, and debugging production issues. The key properties: tamper-evident (entries cannot be modified or deleted), complete (no events silently dropped), and queryable (quickly find all actions by a user or on a resource).
Requirements
- Record every create/update/delete operation with actor, resource, old and new values
- Immutable: no UPDATE or DELETE on audit records
- Query: all actions by user X in the last 30 days; all changes to resource Y
- Retention: 7 years for financial data (SOX), 1 year for general access logs
- 10K events/second at peak; query response <500ms
- Export audit report for compliance officers (date range, actor, resource type)
Data Model
AuditEvent(
event_id UUID PRIMARY KEY,
tenant_id UUID NOT NULL,
actor_id UUID NOT NULL, -- user or service account who performed the action
actor_type ENUM(USER, SERVICE, SYSTEM),
action VARCHAR NOT NULL, -- 'user.login', 'order.cancel', 'permission.grant'
resource_type VARCHAR, -- 'order', 'user', 'document'
resource_id UUID,
before_state JSONB, -- snapshot of resource before change (nullable)
after_state JSONB, -- snapshot of resource after change (nullable)
metadata JSONB, -- ip_address, user_agent, request_id, etc.
created_at TIMESTAMPTZ NOT NULL, -- immutable
-- NO updated_at column — this record never changes
CHECK (created_at IS NOT NULL)
)
-- Indexes for common query patterns
CREATE INDEX idx_audit_actor ON AuditEvent(tenant_id, actor_id, created_at DESC);
CREATE INDEX idx_audit_resource ON AuditEvent(tenant_id, resource_type, resource_id, created_at DESC);
CREATE INDEX idx_audit_action ON AuditEvent(tenant_id, action, created_at DESC);
Capturing Events
Two approaches for capturing audit events:
Application-level: call an audit logging function explicitly in every service operation:
def cancel_order(order_id, actor_id, reason):
before = db.get_order(order_id)
db.update_order(order_id, status='CANCELLED')
after = db.get_order(order_id)
audit_log.record(
actor_id=actor_id,
action='order.cancel',
resource_type='order',
resource_id=order_id,
before_state=before.to_dict(),
after_state=after.to_dict(),
metadata={'reason': reason, 'ip': request.remote_addr}
)
Database triggers: PostgreSQL triggers fire on every INSERT/UPDATE/DELETE and write to the audit table automatically. No risk of missing events if developers forget to call the audit function. Con: triggers run in the same transaction — a slow audit write blocks the main operation.
Async Write Path
For high throughput (10K events/second), don’t write audit events synchronously in the transaction:
# Application writes to Kafka (non-blocking)
def record(event_data):
kafka.produce('audit-events', value=json.dumps({
**event_data,
'event_id': str(uuid4()),
'created_at': datetime.utcnow().isoformat()
}))
# Kafka consumer writes to audit DB (separate process)
for message in kafka.consume('audit-events'):
db.insert(AuditEvent(**json.loads(message.value)))
Trade-off: async = eventual consistency. The audit event may arrive 1-2 seconds after the action. Acceptable for compliance queries; not for real-time security alerting (use a separate streaming pipeline for that).
Immutability Enforcement
-- PostgreSQL: revoke UPDATE and DELETE privileges on audit table REVOKE UPDATE, DELETE ON AuditEvent FROM application_user; -- Only INSERT is allowed; SELECT for queries -- Append-only table: use PostgreSQL partitioning with partition by month -- Old partitions can be moved to cold storage (S3 via foreign data wrapper) -- after their compliance retention window expires
Tamper Detection
For high-assurance audit trails, chain events with a hash of the previous entry:
AuditEvent.chain_hash = SHA256(previous_event_id || event_id || created_at || action || resource_id)
Any modification to a past event breaks the hash chain. Verify chain integrity by replaying all hashes from the beginning. Used in financial systems and blockchain-inspired audit trails.
Key Design Decisions
- JSONB for before/after state — flexible schema for any resource type without altering the audit table
- Kafka for async ingestion — decouples application throughput from audit DB write speed
- DB-level REVOKE — immutability enforced at the database, not just application, layer
- action namespace (resource.verb) — enables precise filtering (all ‘user.*’ actions, all ‘*.delete’ actions)
- Partitioning by month — enables cheap archival of old partitions to cold storage after retention window
Audit log and compliance system design is discussed in Atlassian system design interview questions.
Immutable audit trail and tamper-evident log design is covered in Coinbase system design interview preparation.
Audit logging and financial record compliance design is in Stripe system design interview guide.