Why Distributed Locks?
A mutex works within a single process. A database row lock works within a single database. When multiple application servers must coordinate access to a shared resource (run a cron job on exactly one server, prevent two checkouts from reserving the same inventory), you need a distributed lock — one that works across servers and data centers.
Redis-based Locking (Simple)
SET lock:{resource} {unique_id} NX PX {ttl_ms}. NX: set only if not exists (atomic acquisition). PX {ttl_ms}: expire automatically if the holder crashes (prevents deadlock). The unique_id (UUID) identifies the lock owner — critical for safe release.
Safe release (Lua script for atomicity): if the current value matches the owner UUID, delete the key. Without this check: a slow process might release a lock acquired by another process (its own lock expired, another acquired it, then the slow process deletes the new lock). The Lua script is atomic on Redis (single-threaded).
```lua
-- Lua script for safe release: KEYS[1] = lock key, ARGV[1] = owner UUID
if redis.call("get", KEYS[1]) == ARGV[1] then
    return redis.call("del", KEYS[1])
else
    return 0
end
```
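The acquire/release semantics above can be sketched without a live server. The `FakeRedis` class below is a hypothetical in-memory stand-in for the two operations the lock needs: `SET key value NX PX ttl` and the Lua script's check-and-delete. A real client (e.g., redis-py) would issue these against an actual Redis instance; this only demonstrates the ownership logic:

```python
import time
import uuid

class FakeRedis:
    """Hypothetical in-memory stand-in for the two Redis operations the lock uses."""
    def __init__(self):
        self._store = {}  # key -> (value, expiry deadline)

    def set_nx_px(self, key, value, ttl_ms):
        """SET key value NX PX ttl_ms: succeed only if key is absent (or expired)."""
        entry = self._store.get(key)
        if entry is not None and entry[1] > time.monotonic():
            return False  # someone else holds a live lock
        self._store[key] = (value, time.monotonic() + ttl_ms / 1000.0)
        return True

    def delete_if_owner(self, key, owner):
        """Semantics of the Lua script: delete only if the stored value matches."""
        entry = self._store.get(key)
        if entry is not None and entry[1] > time.monotonic() and entry[0] == owner:
            del self._store[key]
            return 1
        return 0

r = FakeRedis()
owner_a, owner_b = str(uuid.uuid4()), str(uuid.uuid4())

assert r.set_nx_px("lock:inventory", owner_a, 30_000)       # A acquires
assert not r.set_nx_px("lock:inventory", owner_b, 30_000)   # B is blocked
assert r.delete_if_owner("lock:inventory", owner_b) == 0    # B cannot release A's lock
assert r.delete_if_owner("lock:inventory", owner_a) == 1    # A releases safely
```

The owner check in `delete_if_owner` is exactly the scenario the Lua script guards against: without it, B's release call would delete whichever lock happens to exist, including one B never acquired.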
Redlock Algorithm
The simple Redis lock fails if the Redis master crashes before replicating the lock key: the newly promoted master has no record of the lock, so a second client can acquire it. Redlock uses N independent Redis instances (N=5 is typical). To acquire: try to SET the lock on all N instances. If a majority (floor(N/2)+1 = 3 for N=5) succeed within a timeout, the lock is acquired. The effective lock TTL is reduced by the time spent acquiring. To release: release on all N instances. Properties: tolerates up to floor((N-1)/2) = 2 instance failures. The clock drift assumption is the main criticism: Martin Kleppmann argued that process pauses and unreliable clocks make Redlock unsafe in theory; in practice it works for most use cases with careful TTL settings.
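The quorum arithmetic can be sketched as follows. `LockInstance` is a hypothetical stand-in for one Redis node (the real algorithm issues SET NX PX over the network); the drift allowance of 1% of the TTL plus 2 ms follows the published Redlock description, and `redlock_acquire` is an illustrative name, not a library API:

```python
import time
import uuid

class LockInstance:
    """Hypothetical stand-in for one Redis instance (SET NX PX + delete-if-owner)."""
    def __init__(self):
        self.store = {}
    def try_set(self, key, value, ttl_ms):
        if key in self.store and self.store[key][1] > time.monotonic():
            return False
        self.store[key] = (value, time.monotonic() + ttl_ms / 1000)
        return True
    def del_if_owner(self, key, owner):
        if key in self.store and self.store[key][0] == owner:
            del self.store[key]

def redlock_acquire(instances, resource, ttl_ms):
    owner = str(uuid.uuid4())
    quorum = len(instances) // 2 + 1   # floor(N/2) + 1 = 3 when N = 5
    start = time.monotonic()
    won = sum(inst.try_set(resource, owner, ttl_ms) for inst in instances)
    elapsed_ms = (time.monotonic() - start) * 1000
    drift_ms = ttl_ms * 0.01 + 2       # clock-drift allowance from the Redlock spec
    validity_ms = ttl_ms - elapsed_ms - drift_ms
    if won >= quorum and validity_ms > 0:
        return owner, validity_ms
    for inst in instances:             # failed: undo any partial acquisition everywhere
        inst.del_if_owner(resource, owner)
    return None, 0.0

instances = [LockInstance() for _ in range(5)]
owner, validity = redlock_acquire(instances, "lock:job", 10_000)
assert owner is not None and validity > 0
# A second caller cannot reach quorum while the first holds a majority.
other, _ = redlock_acquire(instances, "lock:job", 10_000)
assert other is None
```

Note the cleanup loop on failure runs against all N instances, not just the ones that reported success: a SET may have succeeded on a node whose reply was lost.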
ZooKeeper / etcd-based Locking
ZooKeeper uses ephemeral sequential nodes. To acquire: create a node /locks/mylock/guid- (sequential). List all children; if your node has the lowest sequence number, you have the lock. If not, watch the node with the next-lower sequence number — when it is deleted, recheck. ZooKeeper guarantees linearizability and handles sessions (ephemeral nodes are deleted when the session expires, releasing the lock automatically on crash). etcd uses leases (TTL-based) and compare-and-swap operations for similar semantics. More heavyweight than Redis but stronger consistency guarantees.
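The decision step of the ZooKeeper recipe (lowest sequence number wins; otherwise watch only your immediate predecessor) can be sketched in isolation. The node names and the `lock_decision` helper are illustrative; a real implementation would use a client library such as Kazoo, and ZooKeeper itself assigns the sequence suffixes:

```python
def lock_decision(my_node, children):
    """Given my znode name and all children of the lock path, decide whether
    I hold the lock or which predecessor node to watch.
    Names look like 'lock-0000000003'; ZooKeeper assigns the numeric suffix."""
    ordered = sorted(children, key=lambda n: int(n.rsplit("-", 1)[1]))
    idx = ordered.index(my_node)
    if idx == 0:
        return "HELD", None                 # lowest sequence number holds the lock
    return "WAIT", ordered[idx - 1]         # watch only the next-lower node

children = ["lock-0000000003", "lock-0000000001", "lock-0000000002"]
assert lock_decision("lock-0000000001", children) == ("HELD", None)
assert lock_decision("lock-0000000003", children) == ("WAIT", "lock-0000000002")
```

Watching only the immediate predecessor (rather than the lowest node) avoids a "herd effect": when a lock is released, exactly one waiter is notified instead of all of them.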
Database-based Locking
INSERT INTO distributed_locks (resource, owner, expires_at) VALUES (X, Y, NOW() + INTERVAL '30 seconds'). A unique constraint on resource prevents duplicate locks; on a constraint violation, another holder has the lock. Heartbeat: the holder updates expires_at every 10 seconds so the lock does not expire while it is still active. To release: DELETE WHERE resource = X AND owner = Y. Expired locks are cleaned up with DELETE WHERE expires_at < NOW(), run periodically or on the next acquisition attempt. Simpler than Redis, with no extra infrastructure, but higher latency: a database round trip versus an in-memory Redis call. Good for: applications already running on a database, infrequent lock acquisitions.
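The full acquire/heartbeat/release cycle can be demonstrated against SQLite (a stand-in here for whatever relational database the application already uses; the table name matches the text, the function names are illustrative). The primary key on `resource` plays the role of the unique constraint:

```python
import sqlite3
import time

conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE distributed_locks (
    resource   TEXT PRIMARY KEY,     -- unique constraint prevents duplicate locks
    owner      TEXT NOT NULL,
    expires_at REAL NOT NULL)""")

def acquire(resource, owner, ttl_s=30):
    # Opportunistic cleanup of expired locks before trying to insert.
    conn.execute("DELETE FROM distributed_locks WHERE expires_at < ?", (time.time(),))
    try:
        conn.execute("INSERT INTO distributed_locks VALUES (?, ?, ?)",
                     (resource, owner, time.time() + ttl_s))
        return True
    except sqlite3.IntegrityError:   # constraint violation: another holder has it
        return False

def heartbeat(resource, owner, ttl_s=30):
    # Extend the expiry while the holder is still working.
    conn.execute("UPDATE distributed_locks SET expires_at = ? "
                 "WHERE resource = ? AND owner = ?",
                 (time.time() + ttl_s, resource, owner))

def release(resource, owner):
    # Owner check in the WHERE clause: can only delete your own lock.
    conn.execute("DELETE FROM distributed_locks WHERE resource = ? AND owner = ?",
                 (resource, owner))

assert acquire("nightly-report", "server-a")
assert not acquire("nightly-report", "server-b")   # constraint blocks the duplicate
release("nightly-report", "server-b")              # wrong owner: no-op
assert not acquire("nightly-report", "server-b")   # server-a still holds it
release("nightly-report", "server-a")
assert acquire("nightly-report", "server-b")       # now free to acquire
```

The `AND owner = ?` in `release` mirrors the Lua ownership check in the Redis version: a holder whose lock already expired and was re-acquired cannot delete the new holder's row.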
Fencing Tokens
Even with a perfect distributed lock, a process can hold a lock, pause (GC, network partition), have its lock expire, another process acquires the lock — both believe they hold the lock. Solution: fencing tokens. On lock acquisition, return a monotonically increasing token. When writing to the protected resource, include the token. The resource rejects writes with tokens older than the last accepted token. Only one process (the one with the highest token) can successfully write, regardless of clock drift or pauses.
When to Use Each
| Mechanism | Latency | Reliability | Complexity | Use When |
|---|---|---|---|---|
| Simple Redis SET NX | Very Low | Medium | Low | Single Redis, tolerate rare failures |
| Redlock | Low | High | Medium | Multi-datacenter, high reliability needed |
| ZooKeeper / etcd | Medium | Very High | High | Critical coordination, strong consistency |
| Database lock | Medium | High | Low | Simple use case, DB already available |
{
  "@context": "https://schema.org",
  "@type": "FAQPage",
  "mainEntity": [
    {
      "@type": "Question",
      "name": "How do you implement a distributed lock with Redis?",
      "acceptedAnswer": {
        "@type": "Answer",
        "text": "Use SET lock:{resource} {unique_owner_id} NX PX {ttl_ms}. NX ensures the SET only succeeds if the key does not exist (atomic acquisition). PX sets a TTL so the lock auto-expires if the holder crashes (prevents deadlock). The unique_owner_id (UUID) is critical for safe release: use a Lua script to atomically check ownership before deleting. Without the check, a slow holder could release a lock that has expired and been re-acquired by another process. Typical TTL: 30 seconds for short operations, longer for batch jobs (with heartbeat renewal). Heartbeat renewal: extend the TTL every 10 seconds while the holder is still active using PEXPIRE with a new TTL."
      }
    },
    {
      "@type": "Question",
      "name": "What is the Redlock algorithm and when should you use it?",
      "acceptedAnswer": {
        "@type": "Answer",
        "text": "Redlock uses N independent Redis instances (N=5 recommended). To acquire a lock: try SET NX PX on all N instances with the same lock name and owner ID. If a majority (3 of 5) succeed within a small timeout (10ms), the lock is acquired. The effective TTL is reduced by acquisition time. To release: run the Lua delete-if-owner script on all N instances. Redlock tolerates (N-1)/2 instance failures. Use Redlock when: you need high availability and cannot accept a single Redis SPOF. Do not use when: strong safety guarantees are critical and clock drift is a concern (use ZooKeeper or etcd instead). For most applications where lock loss probability is acceptable (e.g., job scheduling, not financial transactions), simple Redis SET NX is sufficient."
      }
    },
    {
      "@type": "Question",
      "name": "What is a fencing token and why is it needed?",
      "acceptedAnswer": {
        "@type": "Answer",
        "text": "Even with a perfect distributed lock, a process can experience: GC pause, network partition, or slow disk — during which its lock expires and another process acquires the same lock. Now two processes believe they hold the lock simultaneously. A fencing token solves this: the lock server returns a monotonically increasing token on each acquisition (implemented with Redis INCR or ZooKeeper zxid). When writing to the protected resource, the holder includes its token. The resource rejects any write with a token older than the last accepted token. This means even if an old lock holder resumes after a pause, its writes are rejected because a newer token has been accepted. Fencing tokens make distributed locking safe even under arbitrary delays."
      }
    },
    {
      "@type": "Question",
      "name": "How does ZooKeeper implement distributed locking?",
      "acceptedAnswer": {
        "@type": "Answer",
        "text": "ZooKeeper uses ephemeral sequential znodes. To acquire a lock on /locks/myresource: create a node /locks/myresource/lock- with the EPHEMERAL_SEQUENTIAL flag. ZooKeeper assigns a sequence number (e.g., lock-0000000003). List all children of /locks/myresource. If your node has the lowest sequence number, you have the lock. If not: watch the node with the next-lower sequence number. When that node is deleted (its holder released or crashed), you are notified and re-check. Ephemeral nodes: ZooKeeper automatically deletes ephemeral nodes when the client session expires (client crash = lock released). This provides reliable crash detection without TTL expiry uncertainty."
      }
    },
    {
      "@type": "Question",
      "name": "How do you prevent deadlock in distributed locking?",
      "acceptedAnswer": {
        "@type": "Answer",
        "text": "Deadlock occurs when two processes each hold a lock the other needs: A holds lock1, needs lock2; B holds lock2, needs lock1. Prevention strategies: (1) Lock ordering: always acquire locks in the same order (alphabetical by resource name). If everyone acquires lock1 before lock2, deadlock is impossible. (2) TTL-based expiry: all distributed locks have a TTL. Deadlock is bounded — the locks will expire within the TTL window. (3) Timeout with backoff: if acquisition times out, release all held locks and retry with exponential backoff and jitter. (4) Single lock for the entire operation: if possible, acquire one lock covering all resources rather than multiple fine-grained locks. TTL expiry is the most common approach in production — deadlock is simply bounded rather than prevented."
      }
    }
  ]
}
Asked at: Netflix Interview Guide
Asked at: Cloudflare Interview Guide
Asked at: Databricks Interview Guide
Asked at: Atlassian Interview Guide
Asked at: Coinbase Interview Guide