Low Level Design: Mutex, Semaphore, and Synchronization Primitives

Mutex Basics

A mutex (mutual exclusion lock) guarantees that only one thread can hold it at any given time. The two fundamental operations are lock(), which blocks the calling thread if the mutex is already held, and unlock(), which releases the mutex and wakes one waiting thread. A mutex has exactly two states: locked and unlocked.

Non-recursive mutexes will deadlock if the same thread tries to lock twice. Recursive (reentrant) mutexes maintain an ownership counter so the same thread can call lock() multiple times without blocking — it must call unlock() an equal number of times. Recursive mutexes are heavier and hide design problems; prefer non-recursive unless re-entrancy is explicitly required.
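The difference is easy to demonstrate with Python's threading module, which ships both flavors (threading.Lock is non-recursive, threading.RLock is reentrant) — a minimal sketch:

```python
import threading

plain = threading.Lock()    # non-recursive: a second lock() by the same thread deadlocks
rlock = threading.RLock()   # recursive: tracks an owner thread and a hold count

def nested_with_rlock():
    with rlock:             # first acquisition: hold count = 1
        with rlock:         # same thread re-enters: count = 2, no blocking
            return "ok"
    # each `with` exit calls release(); the lock frees only when the count hits 0

print(nested_with_rlock())            # → ok
print(plain.acquire(blocking=False))  # → True  (first acquisition succeeds)
print(plain.acquire(blocking=False))  # → False (already held; a blocking call here would deadlock)
plain.release()
```

The non-blocking `acquire(blocking=False)` is used above only to make the self-deadlock visible without hanging the example.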

Spinlock Implementation

A spinlock uses busy-waiting: the thread loops continuously checking whether the lock is free instead of sleeping. The core operation is an atomic test-and-set (TAS) or compare-and-swap (CAS) instruction that atomically reads the current state and sets it to "locked" in one uninterruptible step.

/* C11 <stdatomic.h>; lock is an _Atomic int: 0 = free, 1 = held */
int expected = 0;                 /* a failed CAS overwrites expected, so reset it each try */
while (!atomic_compare_exchange_weak(&lock, &expected, 1)) { expected = 0; /* spin */ }

Spinlocks are appropriate when the expected hold time is shorter than a context switch (typically ~1–10 µs on modern hardware). They waste CPU cycles if held longer. OS kernels use spinlocks heavily because sleeping inside interrupt handlers is not permitted. In userspace, prefer a mutex backed by a futex unless you have measured that the contention window is truly tiny.

Futex (Fast Userspace Mutex)

The futex (fast userspace mutex) is the building block of POSIX mutexes on Linux. It uses a two-level design to avoid expensive syscalls in the common uncontended case.

Lock path: Attempt an atomic CAS on a userspace integer from 0 (unlocked) to 1 (locked). If the CAS succeeds, no syscall is needed — the lock is acquired entirely in userspace. If the CAS fails (contended), the thread calls futex(FUTEX_WAIT) to sleep on a kernel-managed wait queue associated with the memory address.

Unlock path: Atomically write 0 to the lock word. If waiters were registered, call futex(FUTEX_WAKE) to wake one sleeping thread; the kernel moves the woken thread back to the run queue. The unlocker must be able to tell whether waiters exist, so practical implementations encode contention in the lock word itself (e.g., 0 = unlocked, 1 = locked, 2 = locked with waiters), letting the uncontended unlock skip the wake syscall as well.

This design means uncontended lock/unlock costs two atomic instructions and zero syscalls — a critical optimization since most mutex acquisitions in practice are uncontended.
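The two-level logic can be sketched as a toy Python model of the three-state scheme (0 = unlocked, 1 = locked/no waiters, 2 = locked/waiters). This is purely illustrative: the real fast path is a single hardware CAS and the real slow path a futex() syscall; here a guard lock stands in for atomicity, a Condition stands in for the kernel wait queue, and a counter records how often the simulated "kernel" is entered:

```python
import threading

class FutexModel:
    """Toy model of a futex-backed mutex. _cas() stands in for one atomic
    instruction; _kernel stands in for the kernel's per-address wait queue;
    syscalls counts simulated FUTEX_WAIT / FUTEX_WAKE calls."""

    def __init__(self):
        self.state = 0                          # 0 unlocked, 1 locked, 2 locked+waiters
        self.syscalls = 0
        self._guard = threading.Lock()          # only to make the toy CAS atomic
        self._kernel = threading.Condition(self._guard)

    def _cas(self, expect, new):
        # In real code this is one uninterruptible instruction (e.g. lock cmpxchg).
        if self.state == expect:
            self.state = new
            return True
        return False

    def lock(self):
        with self._guard:
            if self._cas(0, 1):
                return                          # fast path: acquired, zero "syscalls"
            while not self._cas(0, 2):          # contended: mark state 2 and sleep
                self.state = 2
                self.syscalls += 1              # simulated futex(FUTEX_WAIT)
                self._kernel.wait()

    def unlock(self):
        with self._guard:
            old, self.state = self.state, 0
            if old == 2:                        # waiters present: wake one
                self.syscalls += 1              # simulated futex(FUTEX_WAKE)
                self._kernel.notify()

m = FutexModel()
m.lock(); m.unlock()
print(m.syscalls)  # → 0: the uncontended path never enters the "kernel"
```

After a wakeup the winner re-acquires with state 2 rather than 1 — it cannot cheaply know whether other waiters remain, so it stays conservative and the next unlock issues a wake.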

Semaphore

A semaphore is a generalization of a mutex: instead of a binary locked/unlocked state it holds an integer counter. The two operations are traditionally named P() (wait/decrement) and V() (signal/increment), from Dijkstra’s original Dutch terminology.

P() (wait): Decrement the counter. If the result is negative, block the calling thread on the semaphore’s wait queue.

V() (signal): Increment the counter. If any threads are waiting (counter was negative), wake one.

A counting semaphore with initial value N allows up to N threads to proceed concurrently — useful for resource pools (e.g., a database connection pool of size 10). A binary semaphore has a max value of 1 and behaves like a mutex, with one critical difference: a semaphore can be signaled from a different thread than the one that waited, making it suitable for signaling between threads (e.g., producer signals consumer).
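The counting behavior can be seen with Python's threading.Semaphore — an illustrative pool of size 3 shared by 10 workers, where the sleep is a stand-in for actually using the resource:

```python
import threading, time

pool = threading.Semaphore(3)            # counting semaphore: P()/V() == acquire()/release()
guard = threading.Lock()
stats = {"current": 0, "peak": 0}

def worker():
    with pool:                           # P(): blocks while 3 holders already exist
        with guard:
            stats["current"] += 1
            stats["peak"] = max(stats["peak"], stats["current"])
        time.sleep(0.01)                 # stand-in for using the pooled resource
        with guard:
            stats["current"] -= 1
    # leaving the with-block is V(): increment the counter and wake one waiter

threads = [threading.Thread(target=worker) for _ in range(10)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(stats["peak"])  # never exceeds the pool size of 3
```

The inner Lock only protects the bookkeeping dict; the admission control itself is entirely the semaphore's counter.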

Condition Variable

A condition variable lets threads wait for an arbitrary boolean condition to become true, in coordination with a mutex. The three operations are wait(mutex, cond), signal(cond), and broadcast(cond).

wait: Atomically releases the mutex and suspends the thread on the condition variable’s queue. When woken, the thread reacquires the mutex before returning. The atomicity of release + suspend is essential to prevent lost wakeups.

signal / broadcast: signal wakes one waiting thread; broadcast wakes all of them. POSIX allows calling either with or without the mutex held, but holding it gives predictable scheduling and guarantees the signal cannot fire between a waiter's predicate check and its sleep.

Always re-check the condition in a while loop, not an if, because spurious wakeups are permitted by POSIX — a thread may return from wait without anyone calling signal. The canonical producer-consumer pattern uses two conditions: not_full (producer waits when buffer is full) and not_empty (consumer waits when buffer is empty).
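The pattern maps directly onto Python's threading.Condition — a bounded-buffer sketch with the two conditions sharing one mutex (the capacity of 2 is arbitrary):

```python
import threading
from collections import deque

CAPACITY = 2
buf = deque()
mutex = threading.Lock()
not_full = threading.Condition(mutex)    # producers wait here while the buffer is full
not_empty = threading.Condition(mutex)   # consumers wait here while the buffer is empty

def produce(item):
    with mutex:
        while len(buf) >= CAPACITY:      # while, not if: spurious wakeups are permitted
            not_full.wait()              # atomically releases mutex and sleeps
        buf.append(item)
        not_empty.notify()

def consume():
    with mutex:
        while not buf:
            not_empty.wait()
        item = buf.popleft()
        not_full.notify()
        return item

results = []
consumer = threading.Thread(target=lambda: results.extend(consume() for _ in range(5)))
consumer.start()
for i in range(5):
    produce(i)                           # blocks whenever the consumer falls 2 behind
consumer.join()
print(results)  # → [0, 1, 2, 3, 4]
```

With one producer and one consumer over a FIFO deque, items arrive in order even though the two threads interleave freely.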

Reader-Writer Lock

A reader-writer lock (RWLock) allows multiple concurrent readers or exactly one exclusive writer. Read locks are acquired when no writer holds or is queued for the lock. Write locks are acquired only when no readers or writers are active.

A naive implementation can cause writer starvation: a steady stream of readers means a writer never gets the lock. Writer-preference variants queue incoming readers behind waiting writers to prevent this. Java’s ReentrantReadWriteLock supports both fair and non-fair modes. PostgreSQL uses LWLock (lightweight lock) internally with similar semantics for buffer manager and catalog access.

RWLocks are only beneficial when reads are significantly more frequent than writes and the critical section is non-trivial. For very short critical sections, the overhead of an RWLock (more complex state machine, more atomic operations) can exceed the benefit.
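Python's standard library has no reader-writer lock, so here is a minimal writer-preference sketch built from a single Condition — incoming readers queue behind waiting writers, as described above (illustrative only; a production lock would add fairness and error handling):

```python
import threading

class RWLock:
    """Minimal writer-preference reader-writer lock (illustrative sketch)."""

    def __init__(self):
        self._cond = threading.Cond if False else threading.Condition()
        self._readers = 0            # number of active readers
        self._writer = False         # is a writer active?
        self._writers_waiting = 0    # writers queued for the lock

    def acquire_read(self):
        with self._cond:
            # readers queue behind waiting writers to prevent writer starvation
            while self._writer or self._writers_waiting:
                self._cond.wait()
            self._readers += 1

    def release_read(self):
        with self._cond:
            self._readers -= 1
            if self._readers == 0:
                self._cond.notify_all()   # last reader out: let a writer in

    def acquire_write(self):
        with self._cond:
            self._writers_waiting += 1
            while self._writer or self._readers:
                self._cond.wait()
            self._writers_waiting -= 1
            self._writer = True

    def release_write(self):
        with self._cond:
            self._writer = False
            self._cond.notify_all()       # wake both queued readers and writers

rw = RWLock()
rw.acquire_read(); rw.acquire_read()      # multiple concurrent readers are fine
rw.release_read(); rw.release_read()
rw.acquire_write(); rw.release_write()    # writer needs exclusive access
```

The extra state (`_writers_waiting`, `notify_all`, re-checked predicates) is exactly the "more complex state machine" overhead mentioned above.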

Deadlock Prevention

A deadlock occurs when a cycle of threads each holds a resource and waits for a resource held by the next thread in the cycle. Prevention strategies eliminate one of the four Coffman conditions:

  • Lock ordering: Define a global total order on all locks. Every thread must acquire locks in that order. No cycle can form because the "holds A, waits for B" edge always points in the same direction.
  • tryLock with timeout: If a lock cannot be acquired within a deadline, release all held locks and retry. Requires idempotent critical sections or compensation logic.
  • Lock hierarchy / levels: Assign numeric levels to locks. A thread holding a lock at level N may not acquire a lock at level ≤ N. Enforced at runtime in debug builds (e.g., Linux lockdep).
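The first strategy can be sketched in Python by deriving the global order from id() — any consistent key works. Two threads that need the same pair of locks in opposite argument order still acquire them in one direction, so the "holds A, waits for B" edge never reverses:

```python
import threading

lock_a, lock_b = threading.Lock(), threading.Lock()

def acquire_in_order(*locks):
    """Acquire locks in a global total order (here: by id()), so no cycle can form."""
    ordered = sorted(locks, key=id)
    for l in ordered:
        l.acquire()
    return ordered

def release_all(held):
    for l in reversed(held):
        l.release()

def transfer(first, second, log):
    # Without ordering, transfer(a, b) racing transfer(b, a) can deadlock:
    # each thread grabs its first argument, then waits forever on the other.
    held = acquire_in_order(first, second)
    log.append("critical section")
    release_all(held)

log = []
t1 = threading.Thread(target=transfer, args=(lock_a, lock_b, log))
t2 = threading.Thread(target=transfer, args=(lock_b, lock_a, log))
t1.start(); t2.start()
t1.join(); t2.join()
print(log)  # → ['critical section', 'critical section']
```

In real code the order key is usually something stable and meaningful (an account ID, a lock level) rather than a memory address, but the invariant is the same.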

Deadlock Detection

When prevention is impractical, detection and recovery can be used instead. A wait-for graph has threads as nodes and a directed edge from thread A to thread B if A is waiting for a resource currently held by B. A cycle in this graph indicates a deadlock.
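Detection on such a graph is a cycle search. Assuming each blocked thread waits on at most one lock holder (as with mutexes), the wait-for graph can be modeled as a dict and walked directly — an illustrative sketch:

```python
def find_deadlock(wait_for):
    """Detect a cycle in a wait-for graph given as {thread: thread_it_waits_on}.
    Returns the list of threads forming a cycle, or None. Each blocked thread
    waits on exactly one holder, so a walk with per-walk positions suffices."""
    done = set()                                # nodes proven cycle-free
    for start in wait_for:
        if start in done:
            continue
        path, on_path = [], {}                  # current walk and node -> index
        node = start
        while node in wait_for and node not in done:
            if node in on_path:                 # revisited within this walk: cycle
                return path[on_path[node]:]
            on_path[node] = len(path)
            path.append(node)
            node = wait_for[node]               # follow the single wait edge
        done.update(path)
    return None

# T1 waits for T2, T2 waits for T3, T3 waits for T1: a deadlock cycle.
print(find_deadlock({"T1": "T2", "T2": "T3", "T3": "T1"}))  # → ['T1', 'T2', 'T3']
print(find_deadlock({"T1": "T2", "T2": "T3"}))              # → None
```

A recovery step would then pick one thread on the returned cycle as the victim, exactly as databases abort one transaction to break the loop.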

The Banker’s Algorithm is strictly an avoidance technique rather than detection: it tracks each process’s declared maximum resource needs and current allocation, and refuses any grant that could leave the system in an unsafe state. It is a staple of OS courses but too conservative for most real systems, where maximum needs are rarely known in advance.

Databases use timeout-based detection: if a transaction waits longer than a threshold (e.g., PostgreSQL’s deadlock_timeout default 1 s), a deadlock check runs on the wait-for graph. If a cycle is found, the database aborts one transaction (the "victim") to break the cycle. The aborted transaction can be retried by the application.

What is the difference between a mutex and a semaphore?

A mutex (mutual exclusion lock) allows only one thread to hold it at a time and must be released by the same thread that acquired it. A semaphore is a counter that allows N concurrent holders; it can be signaled by any thread. A binary semaphore (count=1) is similar to a mutex but lacks ownership semantics, making it suitable for signaling between threads rather than mutual exclusion.

What is a futex and why is it used?

A futex (fast userspace mutex) avoids kernel syscalls in the uncontended case. The lock state is stored in a shared integer in userspace. A thread atomically checks and sets the lock with CAS; if successful, no kernel call is needed. Only on contention does it call the kernel to sleep. Linux pthreads, Java monitors, and most modern runtimes use futex-based synchronization.

What is a reader-writer lock and when should it be used?

A reader-writer lock allows multiple concurrent readers or exactly one writer, but not both simultaneously. It is appropriate when reads are frequent and writes are rare. In Java, ReentrantReadWriteLock implements this. Care must be taken to prevent writer starvation — if readers continuously hold the lock, writers may never proceed. A fair variant alternates between readers and writers.

How do you detect and prevent deadlock?

Deadlock requires four conditions: mutual exclusion, hold and wait, no preemption, and circular wait. Prevention strategies: lock ordering (always acquiring locks in a consistent global order eliminates circular wait), lock timeouts (tryLock with a deadline, releasing everything on failure), or lock-free data structures. Detection uses a resource-allocation or wait-for graph; a cycle indicates deadlock. Java thread dumps show lock ownership chains.

What is a condition variable?

A condition variable allows a thread to atomically release a mutex and wait until signaled by another thread. The waiting thread re-acquires the mutex before returning. Spurious wakeups can occur, so the wait must always be in a loop checking the condition. In Java, Object.wait/notify or Condition.await/signal implement this. Condition variables decouple the signaling logic from the lock.
