Overview
A secrets manager provides secure storage, access control, and lifecycle management for credentials, API keys, and other sensitive values. This LLD covers encrypted storage, key hierarchy, versioning, dynamic secrets, and audit logging.
Core Data Model
Secret Table
secrets (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
path VARCHAR NOT NULL, -- e.g. /prod/db/password
version INT NOT NULL DEFAULT 1,
ciphertext BYTEA NOT NULL, -- AES-256-GCM encrypted value
dek_ref VARCHAR NOT NULL, -- reference to encrypted DEK in KMS
created_by VARCHAR NOT NULL,
created_at TIMESTAMPTZ NOT NULL DEFAULT now(),
expires_at TIMESTAMPTZ,
UNIQUE (path, version)
)
Access Policy Table
access_policies (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
principal VARCHAR NOT NULL, -- user, role, or service identity
path_pattern VARCHAR NOT NULL, -- supports wildcards: /prod/db/*
actions TEXT[] NOT NULL, -- {read, write, list, delete}
created_at TIMESTAMPTZ NOT NULL DEFAULT now()
)
Lease Table (Dynamic Secrets)
leases (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
secret_path VARCHAR NOT NULL,
principal VARCHAR NOT NULL,
lease_duration INTERVAL NOT NULL,
renewed_at TIMESTAMPTZ,
expires_at TIMESTAMPTZ NOT NULL,
revoked BOOL NOT NULL DEFAULT false
)
Audit Log Table
audit_log (
id BIGSERIAL PRIMARY KEY,
event_type VARCHAR NOT NULL, -- read, write, delete, rotate, revoke
principal VARCHAR NOT NULL,
path VARCHAR NOT NULL,
version INT,
ip INET,
timestamp TIMESTAMPTZ NOT NULL DEFAULT now()
)
Encryption Architecture
Key Hierarchy
A three-tier key hierarchy isolates blast radius and enables key rotation without re-encrypting all secrets:
HSM Master Key
└── KMS Key (encrypted by master key)
└── DEK - Data Encryption Key (encrypted by KMS key, one per secret)
└── Secret Value (encrypted by DEK, AES-256-GCM)
Encryption Flow (Write)
1. Generate random 256-bit DEK
2. Encrypt secret value: ciphertext = AES-256-GCM(DEK, plaintext)
3. Encrypt DEK via KMS API: encrypted_dek = KMS.Encrypt(kms_key_id, DEK)
4. Store (path, version, ciphertext, encrypted_dek) in secrets table
5. Discard plaintext DEK from memory
Decryption Flow (Read)
1. Fetch (ciphertext, dek_ref) for (path, version)
2. Decrypt DEK: DEK = KMS.Decrypt(dek_ref)
3. Decrypt value: plaintext = AES-256-GCM-Decrypt(DEK, ciphertext)
4. Return plaintext to authorized caller
5. Log read event to audit_log
Secret Versioning
Each write creates a new version; old versions are retained for a configurable grace period to allow rotation without downtime:
-- On write, increment version
INSERT INTO secrets (path, version, ciphertext, dek_ref, created_by)
SELECT $1, COALESCE(MAX(version),0)+1, $2, $3, $4
FROM secrets WHERE path = $1;
-- Purge versions older than grace period (e.g. 7 days)
DELETE FROM secrets
WHERE path = $1
AND version < (SELECT MAX(version) FROM secrets WHERE path = $1)
AND created_at < now() - INTERVAL '7 days';
Access Policy Enforcement
-- Check if principal has required action on path
SELECT 1 FROM access_policies
WHERE principal = $1 -- or matching role
AND $2 LIKE replace(path_pattern,'*','%') -- wildcard match
AND $3 = ANY(actions)
LIMIT 1;
Policy evaluation order: explicit deny > explicit allow > default deny.
Dynamic Secrets
For database credentials, the secrets manager generates short-lived credentials on demand rather than storing a static password:
-- On lease request for /prod/db/dynamic
1. Generate random username + password
2. CREATE USER dyn_<lease_id> WITH PASSWORD '...' VALID UNTIL '...'
3. GRANT SELECT ON TABLE orders TO dyn_<lease_id>
4. Insert lease record with expires_at = now() + lease_duration
5. Return {username, password, expires_at} to caller
-- Revocation worker (runs every minute)
SELECT * FROM leases
WHERE expires_at <= now() AND revoked = false;
-- For each expired lease:
DROP USER dyn_<lease_id>;
UPDATE leases SET revoked=true WHERE id=$1;
REST API
GET /secrets/:path -- read latest version
GET /secrets/:path?version=N -- read specific version
GET /secrets/:path/versions -- list all versions + metadata
PUT /secrets/:path -- write new version
DELETE /secrets/:path -- soft-delete (mark all versions inactive)
DELETE /secrets/:path?version=N -- delete specific version
POST /leases -- request dynamic secret lease
GET /leases/:id -- get lease status
PUT /leases/:id/renew -- renew lease
DELETE /leases/:id -- revoke lease immediately
Renewal and Revocation API
-- Renew: extend lease if within renewal window
UPDATE leases
SET expires_at = now() + lease_duration,
renewed_at = now()
WHERE id = $1
AND principal = $2
AND revoked = false
AND expires_at > now();
-- Revoke: immediately expire and clean up
UPDATE leases SET revoked=true WHERE id=$1;
-- Trigger DROP USER in DB
Sequence: Secret Read
Client → API Gateway → AuthN (JWT/mTLS) → Policy Engine
→ Secrets Service → KMS (decrypt DEK) → DB (fetch ciphertext)
→ Return plaintext → Audit Log (async)
High Availability Considerations
- Postgres with streaming replication for secrets/lease/audit tables
- KMS calls cached locally for < 5 minutes to reduce latency
- API tier stateless, horizontally scalable behind load balancer
- HSM in active-passive pair per region
- Rate limiting per principal to prevent enumeration
- TLS required for all API calls; mTLS preferred for service-to-service
See also: Scale AI Interview Guide 2026: Data Infrastructure, RLHF Pipelines, and ML Engineering
See also: Stripe Interview Guide 2026: Process, Bug Bash Round, and Payment Systems