Fixed Window Counter and Its Problem
The simplest rate limiter increments a counter per time window and resets it at window boundaries:
key = "ratelimit:{user_id}:{current_minute}"
count = INCR key
if count == 1: EXPIRE key 60   # set the TTL once, when the key is first created
if count > limit: reject
The critical flaw: a client can send the full limit of requests at 11:59:59 and the full limit again at 12:00:00, achieving twice the limit within a two-second span. This window boundary burst problem motivates sliding window approaches.
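The boundary burst is easy to reproduce. Below is a minimal in-memory, single-process sketch (no Redis; the `FixedWindowLimiter` name and the injectable `now` parameter are illustrative) showing a burst straddling the boundary admitting twice the limit:

```python
import time
from collections import defaultdict

class FixedWindowLimiter:
    """In-memory fixed window counter (single-process sketch)."""

    def __init__(self, limit, window_seconds=60):
        self.limit = limit
        self.window = window_seconds
        self.counts = defaultdict(int)  # window_start -> request count

    def allow(self, now=None):
        now = time.time() if now is None else now
        window_start = int(now // self.window) * self.window
        if self.counts[window_start] >= self.limit:
            return False
        self.counts[window_start] += 1
        return True

limiter = FixedWindowLimiter(limit=100, window_seconds=60)
# Burst just before the boundary: all 100 admitted at t=59s.
before = sum(limiter.allow(now=59.0) for _ in range(100))
# Burst just after the boundary: a fresh window, another 100 admitted at t=60s.
after = sum(limiter.allow(now=60.0) for _ in range(100))
# 200 requests admitted within one second -- twice the nominal limit.
```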
Sliding Window Log
The exact sliding window stores a timestamp for every request in a sorted set and counts entries within the window:
-- Redis sorted set: score = timestamp, member = unique request ID
ZADD ratelimit:{user_id} {now_ms} {request_id}
ZREMRANGEBYSCORE ratelimit:{user_id} 0 {now_ms - window_ms}
count = ZCARD ratelimit:{user_id}
EXPIRE ratelimit:{user_id} {window_seconds}
if count > limit: reject
Pros: Perfectly accurate. No boundary burst is possible, because the count always reflects exactly the requests received in the trailing window interval.
Cons: Memory cost is O(requests per window per user). At 1000 req/min per user with 100,000 users, that is up to 100 million sorted-set entries in Redis — several gigabytes once per-entry overhead is counted. For high-volume APIs with many users, this is prohibitive.
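The same log-based algorithm can be sketched in-process with a deque standing in for the sorted set (class and parameter names are illustrative); it mirrors the exact behavior of the Redis commands above and shows the boundary burst being rejected:

```python
import time
from collections import deque

class SlidingWindowLog:
    """Exact sliding window log (in-memory sketch of the sorted-set approach)."""

    def __init__(self, limit, window_seconds=60):
        self.limit = limit
        self.window = window_seconds
        self.log = deque()  # request timestamps, oldest first

    def allow(self, now=None):
        now = time.time() if now is None else now
        # Evict entries older than the window (ZREMRANGEBYSCORE equivalent).
        while self.log and self.log[0] <= now - self.window:
            self.log.popleft()
        if len(self.log) >= self.limit:  # ZCARD equivalent
            return False
        self.log.append(now)             # ZADD equivalent
        return True

lim = SlidingWindowLog(limit=100, window_seconds=60)
assert sum(lim.allow(now=59.0) for _ in range(100)) == 100
# Unlike the fixed window, the burst just past the boundary is rejected:
assert lim.allow(now=60.0) is False
# Once the old entries age out of the window, requests are admitted again.
assert lim.allow(now=119.5) is True
```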
Sliding Window Counter Hybrid
The hybrid approach approximates a sliding window using two fixed-window counters, trading a small accuracy loss for constant memory per user:
window_size = 60 seconds
current_window_start = floor(now / window_size) * window_size
prev_window_start = current_window_start - window_size
prev_count = GET ratelimit:{user_id}:{prev_window_start} or 0
curr_count = GET ratelimit:{user_id}:{current_window_start} or 0
elapsed_in_current = now - current_window_start
overlap_fraction = 1 - (elapsed_in_current / window_size)
estimated_count = prev_count * overlap_fraction + curr_count
if estimated_count + 1 > limit:
    reject
else:
    INCR ratelimit:{user_id}:{current_window_start}
    EXPIRE ratelimit:{user_id}:{current_window_start} {window_size * 2}
The formula weights the previous window's count by how much it overlaps with the current sliding window. If you are 15 seconds into the current minute, the previous minute contributes 75% of its count to the estimate.
Accuracy: Analyzing about 400 million requests in production, Cloudflare found that only roughly 0.003% were wrongly allowed or rate limited by this approximation — negligible for rate limiting purposes. Memory cost is O(1) per user (two integers).
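A minimal in-memory sketch of the hybrid estimate (names are illustrative; a real deployment would keep the two counters in Redis as in the pseudocode above):

```python
import time
from collections import defaultdict

class SlidingWindowCounter:
    """Hybrid sliding window counter using two fixed-window counts."""

    def __init__(self, limit, window_seconds=60):
        self.limit = limit
        self.window = window_seconds
        self.counts = defaultdict(int)  # window_start -> request count

    def allow(self, now=None):
        now = time.time() if now is None else now
        curr_start = int(now // self.window) * self.window
        prev_start = curr_start - self.window
        # Weight the previous window by how much it overlaps the sliding window.
        overlap = 1.0 - (now - curr_start) / self.window
        estimated = self.counts[prev_start] * overlap + self.counts[curr_start]
        if estimated + 1 > self.limit:
            return False
        self.counts[curr_start] += 1
        return True

lim = SlidingWindowCounter(limit=100, window_seconds=60)
assert sum(lim.allow(now=59.0) for _ in range(100)) == 100
# 15 s into the next window, the previous minute still contributes 75% of its
# count, so only ~25 extra requests are admitted instead of another full 100.
admitted = sum(lim.allow(now=75.0) for _ in range(100))
```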
Redis Sorted Set Implementation with Pipeline
import time
from uuid import uuid4

now_ms = int(time.time() * 1000)
window_ms = 60_000
key = f"sw:{user_id}"
request_id = str(uuid4())

pipeline = redis.pipeline()
pipeline.zadd(key, {request_id: now_ms})
pipeline.zremrangebyscore(key, 0, now_ms - window_ms)
pipeline.zcard(key)
pipeline.expire(key, 70)  # slightly longer than the window so idle keys still expire
results = pipeline.execute()

count = results[2]
if count > limit:
    redis.zrem(key, request_id)  # roll back the entry we just added before rejecting
    reject()
Pipeline batches the four commands into a single round trip but is not atomic — a race condition exists between the ZCARD read and the decision. For strict accuracy, wrap in a Lua script:
local key = KEYS[1]
local now = tonumber(ARGV[1])
local window = tonumber(ARGV[2])
local limit = tonumber(ARGV[3])
local req_id = ARGV[4]

redis.call('ZADD', key, now, req_id)
redis.call('ZREMRANGEBYSCORE', key, 0, now - window)
local count = redis.call('ZCARD', key)
redis.call('EXPIRE', key, math.ceil(window / 1000) + 10)

if count > limit then
    redis.call('ZREM', key, req_id)
    return 0
end
return 1
Accuracy vs Memory Trade-off Summary
- Fixed window: O(1) memory, O(1) time, up to 2x burst at boundary
- Sliding window log: O(requests in window) memory, O(log N) time, exact accuracy
- Sliding window counter (hybrid): O(1) memory, O(1) time, ~0% error in practice
For most production API rate limiting, the hybrid sliding window counter is the correct choice: near-perfect accuracy with constant memory cost per user.
Redis Cluster Considerations
In Redis Cluster mode, all keys for a given user must hash to the same shard to allow multi-key operations. Use hash tags to force co-location:
key = f"sw:{{{user_id}}}:{window_start}"
#          ^^^^^^^^^^^ hash tag — only this part determines the shard.
# Note the tripled braces: in a Python f-string, {{ and }} emit the literal
# braces of the hash tag, while the inner {user_id} is interpolated.
Without hash tags, ZADD and ZCARD on different keys may target different shards, requiring cross-slot coordination that Redis Cluster does not support in a single command. Alternatively, use a dedicated Redis instance (not cluster) for rate limiting state — the data is small and the isolation is operationally simpler.
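The tag-extraction rule can be sketched in Python. Redis Cluster computes CRC16 (the XModem variant) over either the whole key or, when a non-empty {...} hash tag is present, just the tag, then takes the result modulo 16384 slots; the key names below are illustrative:

```python
def crc16_xmodem(data: bytes) -> int:
    """CRC16-CCITT (XModem), the checksum Redis Cluster uses for key slots."""
    crc = 0
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            crc = ((crc << 1) ^ 0x1021 if crc & 0x8000 else crc << 1) & 0xFFFF
    return crc

def key_slot(key: str) -> int:
    """Redis Cluster slot: hash only the hash tag if a non-empty one exists."""
    start = key.find("{")
    if start != -1:
        end = key.find("}", start + 1)
        if end != -1 and end != start + 1:  # tag must be non-empty
            key = key[start + 1:end]
    return crc16_xmodem(key.encode()) % 16384

# Same hash tag -> same slot, so multi-key commands and Lua scripts work:
assert key_slot("sw:{user42}:1700000000") == key_slot("sw:{user42}:1700000060")
```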