Low Level Design: Audit Logging System
5 min read An audit log records a tamper-evident, chronologically ordered history of all significant actions in a system: who did what, when, […] Read article
5 min read A real-time leaderboard ranks users or entities by a score that updates continuously. The core challenge is serving low-latency rank Read article
3 min read Zero-downtime deployment updates production services without dropping user requests. Modern techniques — rolling updates, blue-green deployments, and canary releases — Read article
5 min read SLIs (Service Level Indicators), SLOs (Service Level Objectives), and error budgets are the quantitative framework for reliability engineering. An SLI Read article
5 min read Feature flags (feature toggles) decouple code deployment from feature release. Code ships to production with a feature disabled; the flag Read article
3 min read Apache Kafka is a distributed event streaming platform built around a partitioned, replicated, append-only log. Understanding Kafka internals — partitioning Read article
3 min read Write-heavy systems must sustain high write throughput without overwhelming the storage layer. Techniques include write batching, asynchronous writes, write coalescing, Read article
5 min read Read-heavy systems serve many more reads than writes — often 100:1 or higher ratios. Optimizing for reads requires layered caching, Read article
3 min read Cross-region failover reroutes traffic from a failed primary region to a healthy secondary region. The failover must be fast (under Read article
4 min read Binary protocols encode messages as compact byte sequences, achieving lower overhead, faster parsing, and smaller payloads than text-based formats (JSON, Read article
3 min read Data tiering organizes data across storage tiers based on access frequency and cost sensitivity. Hot data (frequently accessed) lives on Read article
4 min read Platform engineering builds an Internal Developer Platform (IDP) that provides self-service infrastructure capabilities to application teams. Instead of every team Read article
3 min read Adaptive concurrency limiting automatically tunes the number of concurrent requests a service allows based on observed performance. Unlike static rate Read article
3 min read Content fingerprinting detects duplicate or near-duplicate content at scale: identifying web pages that have been copied, finding similar images across Read article
4 min read Tail latency (p99, p999 latency) is the response time experienced by the slowest few percent of requests. While average latency Read article