Low Level Design: MapReduce and Batch Processing Design
3 min read MapReduce is a programming model for processing large datasets in parallel across a cluster of commodity machines. Introduced by Google […] Read article
6 min read Database indexes are data structures that allow the database engine to find rows matching a query condition without scanning the Read article
4 min read A distributed cache stores frequently accessed data in memory across a cluster of nodes, reducing latency and database load. Redis Read article
4 min read OAuth 2.0 is an authorization framework that allows applications to obtain limited access to user accounts on third-party services without Read article
3 min read Consensus algorithms allow a cluster of nodes to agree on a single value even when some nodes fail or messages Read article
6 min read WebSockets provide full-duplex, persistent communication between a browser and server over a single TCP connection. Unlike HTTP request-response, either side Read article
4 min read An operating system scheduler decides which process or thread runs on each CPU core at any given moment. The scheduler Read article
4 min read Sharding (horizontal partitioning) splits a large dataset across multiple database nodes to scale beyond what a single machine can handle. Read article
3 min read A memory allocator manages heap memory, fulfilling malloc/new requests and returning freed memory for reuse. The C standard library Read article
5 min read TCP and UDP are the two dominant transport-layer protocols underpinning all internet communication. TCP provides reliable, ordered, connection-oriented delivery with Read article
3 min read Database replication copies data from a primary database to one or more replica databases to achieve high availability, read scalability, Read article
5 min read Forward proxies, reverse proxies, and load balancers are often confused because they all sit between clients and servers in a Read article
3 min read A columnar database stores table data organized by column rather than by row, making it dramatically more efficient for analytical Read article
3 min read A distributed lock manager (DLM) coordinates exclusive access to shared resources across multiple processes or services running on different machines. Read article
5 min read Garbage collection (GC) automatically reclaims memory occupied by objects no longer reachable from the application. The JVM provides multiple GC Read article