Refresh Token Rotation Service Low-Level Design: Family Invalidation, Reuse Detection, and Binding

What Is Refresh Token Rotation?

Refresh tokens authorize the issuance of new access tokens after the short-lived access token expires. Because refresh tokens are long-lived and stored on the client, they are high-value targets for theft. Refresh token rotation mitigates this risk by issuing a new refresh token on every use and invalidating the old one. If an attacker uses a stolen token before the legitimate client does, the legitimate client's next refresh presents an already-rotated token, which triggers reuse detection and invalidates the entire token family. The same happens if the attacker replays the token after the legitimate client has already rotated it.

Requirements

Functional Requirements

Issue a new refresh token and invalidate the old one atomically on every refresh grant.
Group tokens into families; revoking a family invalidates all of its active tokens, and any later use of a token from that family is rejected.
Detect reuse of a previously rotated token (stolen token signal) and invalidate the entire family.
Bind refresh tokens to the originating device fingerprint; cross-device use triggers invalidation.
Support explicit revocation (logout) that invalidates the entire family.

Non-Functional Requirements

Rotation and reuse detection must be atomic; two concurrent refreshes with the same token must not both succeed.
Lookup latency under 5 ms; every client goes through the refresh path each time its access token expires, so a slow lookup stalls the API calls waiting on the new access token.
Token records must be durable; loss of an active token would log out users unexpectedly.

Data Model

Token Record

token_id — UUID, primary key; the value given to the client is a signed opaque reference to this ID.
family_id — UUID; shared by all tokens in a rotation chain originating from the same login.
user_id, client_id, device_fingerprint.
status — ENUM: ACTIVE, USED, REVOKED.
parent_token_id — reference to the token this was rotated from; forms a linked list for audit.
issued_at, expires_at, used_at.

Family Record

family_id — primary key.
status — ENUM: ACTIVE, COMPROMISED, REVOKED.
user_id, created_at, revoked_at, revoke_reason.
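
The two records above can be sketched as Python dataclasses. This is only an illustration of the fields listed in this section: the class names, the start_family helper, and the 30-day default lifetime are assumptions, not part of the design.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from enum import Enum
from typing import Optional
import uuid


class TokenStatus(Enum):
    ACTIVE = "ACTIVE"
    USED = "USED"
    REVOKED = "REVOKED"


class FamilyStatus(Enum):
    ACTIVE = "ACTIVE"
    COMPROMISED = "COMPROMISED"
    REVOKED = "REVOKED"


@dataclass
class TokenFamily:
    family_id: str
    user_id: str
    created_at: datetime
    status: FamilyStatus = FamilyStatus.ACTIVE
    revoked_at: Optional[datetime] = None
    revoke_reason: Optional[str] = None


@dataclass
class RefreshToken:
    token_id: str
    family_id: str
    user_id: str
    client_id: str
    device_fingerprint: str
    issued_at: datetime
    expires_at: datetime
    status: TokenStatus = TokenStatus.ACTIVE
    parent_token_id: Optional[str] = None  # None for the first token of a family
    used_at: Optional[datetime] = None


def start_family(user_id, client_id, device_fingerprint, ttl=timedelta(days=30)):
    """Hypothetical login helper: create a family and its first token.

    The 30-day ttl is an assumed default, not a value from the design.
    """
    now = datetime.now(timezone.utc)
    family = TokenFamily(family_id=str(uuid.uuid4()), user_id=user_id, created_at=now)
    token = RefreshToken(
        token_id=str(uuid.uuid4()),
        family_id=family.family_id,
        user_id=user_id,
        client_id=client_id,
        device_fingerprint=device_fingerprint,
        issued_at=now,
        expires_at=now + ttl,
    )
    return family, token
```

Every later rotation would copy family_id from the parent token and set parent_token_id, which is what makes family-wide revocation a single lookup on family_id.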

Core Algorithm: Rotation with Reuse Detection

Step 1 — Token Decode and Lookup

Verify the HMAC signature on the opaque token value. Extract token_id. Fetch the token record and its family record from the database within a serializable transaction.
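
One way the signed opaque value could look is token_id followed by an HMAC-SHA256 signature. The following is a minimal sketch under that assumption; the "<token_id>.<signature>" layout, the function names, and the hard-coded key are illustrative only, and a real deployment would load the key from a secret store.

```python
import base64
import hashlib
import hmac
from typing import Optional

SECRET_KEY = b"demo-only-secret"  # assumption: placeholder key for the example


def _sign(token_id: str) -> str:
    digest = hmac.new(SECRET_KEY, token_id.encode("utf-8"), hashlib.sha256).digest()
    # URL-safe base64 without padding keeps the value cookie- and header-friendly.
    return base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")


def encode_token(token_id: str) -> str:
    """Return '<token_id>.<signature>' for the client to store."""
    return f"{token_id}.{_sign(token_id)}"


def decode_token(value: str) -> Optional[str]:
    """Return token_id if the signature verifies, otherwise None."""
    token_id, sep, signature = value.partition(".")
    if not sep:
        return None  # no signature part at all
    expected = _sign(token_id)
    # compare_digest avoids leaking the match position through timing.
    if not hmac.compare_digest(expected.encode("utf-8"), signature.encode("utf-8")):
        return None
    return token_id
```

Verifying the signature before touching the database means forged or truncated values never cause a lookup, which helps keep the lookup path cheap.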

Step 2 — Family Status Check

If the family status is COMPROMISED or REVOKED, reject immediately with 401. Log the attempt with the device fingerprint for fraud analysis. This is the key protection: once a family is compromised or revoked, no token from it can be used again.

Step 3 — Reuse Detection

If the token status is USED: a previously rotated token is being replayed. This indicates token theft. Atomically set family status to COMPROMISED, revoke all ACTIVE tokens in the family, log the event with both device fingerprints (original and current), and return 401. Optionally, notify the user via email or push alert.
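
Steps 2 and 3 can be sketched as a single decision function. The version below works on plain dictionaries standing in for database rows and would run inside the transaction opened in Step 1; the function name, the reason strings, and the dictionary layout are assumptions made for the example.

```python
from datetime import datetime, timezone


def check_family_and_reuse(token: dict, family: dict, family_tokens: list) -> tuple:
    """Return (http_status, reason). Mutates the records when reuse is detected."""
    # Step 2: a compromised or revoked family rejects every token it ever issued.
    if family["status"] in ("COMPROMISED", "REVOKED"):
        return 401, "family_" + family["status"].lower()

    # Step 3: a USED token presented again is treated as a theft signal.
    if token["status"] == "USED":
        family.update(
            status="COMPROMISED",
            revoked_at=datetime.now(timezone.utc),
            revoke_reason="token_reuse",
        )
        for t in family_tokens:
            if t["status"] == "ACTIVE":
                t["status"] = "REVOKED"
        return 401, "token_reuse_detected"

    # An individually revoked token in an otherwise healthy family.
    if token["status"] != "ACTIVE":
        return 401, "token_revoked"
    return 200, "ok"
```

Note that the reuse branch has to be committed even though the request fails: rolling back on the 401 would undo the family invalidation.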

Step 4 — Device Binding Check

Compute the device fingerprint from the current request (user agent, IP subnet, TLS client hello hash, or a client-provided device_id). Compare against the stored fingerprint. On mismatch, apply configurable policy: challenge (step-up auth) or revoke. Do not hard-reject on IP change alone, as mobile clients frequently roam.
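
A possible shape for this check is to hash each signal separately, so that the policy can tell an IP-only change from a device change. In the sketch below the prefix lengths (/24 for IPv4, /48 for IPv6), the signal names, and the function names are assumptions; the TLS client hello hash is left out for brevity.

```python
import hashlib
import ipaddress
from enum import Enum


class BindingDecision(Enum):
    ALLOW = "allow"
    CHALLENGE = "challenge"  # step-up authentication
    REVOKE = "revoke"


def ip_subnet(ip: str) -> str:
    """Coarsen an address to its subnet (assumed prefixes: /24 IPv4, /48 IPv6)."""
    addr = ipaddress.ip_address(ip)
    prefix = 24 if addr.version == 4 else 48
    return str(ipaddress.ip_network(f"{ip}/{prefix}", strict=False))


def fingerprint(user_agent: str, ip: str, device_id: str) -> dict:
    """Hash each signal on its own so a mismatch can be attributed to one signal."""
    def digest(value: str) -> str:
        return hashlib.sha256(value.encode("utf-8")).hexdigest()

    return {
        "ua": digest(user_agent),
        "subnet": digest(ip_subnet(ip)),
        "device": digest(device_id),
    }


def evaluate_binding(stored: dict, current: dict,
                     on_mismatch: BindingDecision = BindingDecision.CHALLENGE) -> BindingDecision:
    changed = {key for key in stored if stored[key] != current.get(key)}
    # No change, or an IP change alone (roaming mobile client), is allowed.
    if changed <= {"subnet"}:
        return BindingDecision.ALLOW
    # Any other mismatch falls through to the configured policy.
    return on_mismatch
```

Passing on_mismatch=BindingDecision.REVOKE would give the stricter of the two policies named in this step.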

Step 5 — Atomic Rotation

Within the same transaction: set the old token's status to USED and record used_at, then create a new token record with status ACTIVE, the same family_id, and parent_token_id pointing to the old token. Commit. Only after the commit succeeds, issue a new access token and return both tokens to the client.
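
The rotation step can be illustrated with a conditional UPDATE that acts as a compare-and-set on the token status. The sketch below uses an in-memory SQLite database purely so that it runs anywhere; the reduced table schema and the rotate function are assumptions, and the same statements would need adapting to the production database.

```python
import sqlite3
import uuid
from datetime import datetime, timezone
from typing import Optional

# Reduced schema for the example: only the columns the rotation touches.
SCHEMA = """
CREATE TABLE tokens (
    token_id TEXT PRIMARY KEY,
    family_id TEXT NOT NULL,
    status TEXT NOT NULL,
    parent_token_id TEXT,
    used_at TEXT
);
"""


def rotate(conn: sqlite3.Connection, old_token_id: str) -> Optional[str]:
    """Mark the old token USED and insert its successor in one transaction.

    Returns the new token_id, or None if the old token was not ACTIVE.
    """
    now = datetime.now(timezone.utc).isoformat()
    with conn:  # commits on normal exit, rolls back if an exception escapes
        # Compare-and-set: only one caller can flip ACTIVE -> USED.
        cur = conn.execute(
            "UPDATE tokens SET status = 'USED', used_at = ? "
            "WHERE token_id = ? AND status = 'ACTIVE'",
            (now, old_token_id),
        )
        if cur.rowcount != 1:
            return None  # already rotated or revoked; the caller handles reuse
        family_id = conn.execute(
            "SELECT family_id FROM tokens WHERE token_id = ?", (old_token_id,)
        ).fetchone()[0]
        new_id = str(uuid.uuid4())
        conn.execute(
            "INSERT INTO tokens (token_id, family_id, status, parent_token_id) "
            "VALUES (?, ?, 'ACTIVE', ?)",
            (new_id, family_id, old_token_id),
        )
    return new_id
```

Because the UPDATE only matches a row whose status is still ACTIVE, a second call with the same token changes nothing and returns None, which is the point at which the reuse handling from Step 3 would apply.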

API Design

POST /token with grant_type=refresh_token — perform rotation; returns new access_token and refresh_token.
POST /token/revoke — explicit logout; revokes the entire family.
GET /sessions — lists active families (sessions) for a user; supports session management UI.
DELETE /sessions/{family_id} — revoke a specific session (e.g., sign out of another device).
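
The two session endpoints map directly onto the family records. The following sketch shows the logic behind them using in-memory dictionaries in place of the database; the function names, the returned status codes (204 and 404), and the dictionary layout are assumptions chosen for the example rather than part of the API definition above.

```python
from datetime import datetime, timezone


def list_sessions(families: dict, user_id: str) -> list:
    """Logic behind GET /sessions: the user's ACTIVE families."""
    return [
        {"family_id": family_id, "created_at": family["created_at"]}
        for family_id, family in families.items()
        if family["user_id"] == user_id and family["status"] == "ACTIVE"
    ]


def revoke_session(families: dict, tokens: dict, user_id: str, family_id: str,
                   reason: str = "user_revoked") -> int:
    """Logic behind DELETE /sessions/{family_id}; returns an HTTP status code."""
    family = families.get(family_id)
    # Treat another user's family as not found so its existence is not revealed.
    if family is None or family["user_id"] != user_id:
        return 404
    if family["status"] == "ACTIVE":
        family.update(
            status="REVOKED",
            revoked_at=datetime.now(timezone.utc),
            revoke_reason=reason,
        )
        for token in tokens.values():
            if token["family_id"] == family_id and token["status"] == "ACTIVE":
                token["status"] = "REVOKED"
    return 204  # idempotent: revoking an already revoked family also succeeds
```

POST /token/revoke could reuse revoke_session after resolving the presented token to its family_id.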

Scalability Considerations

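Serialize rotation operations per family_id using a distributed lock to prevent two concurrent refresh calls from both succeeding and forking the rotation chain. Take the lock with Redis SET using the NX option and an expiry on a key derived from family_id, rather than bare SETNX, so that a crashed holder cannot block the family indefinitely. Hold the lock only for the duration of the database transaction, and keep the transaction correct on its own, because the lock can expire while the transaction is still running.

Store the token table in PostgreSQL with an index on (family_id, status) for efficient family-wide revocation queries. Cache the family status in Redis (TTL equal to the shortest remaining token expiry in the family) so that requests against a compromised or revoked family can be rejected without a database read. Update or delete the cached entry in the same code path that changes the family status; otherwise a stale ACTIVE value could be served after revocation. Purge expired token records using a time-partitioned table and scheduled range-delete jobs.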
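
The per-family lock can be illustrated without a running Redis server. The sketch below uses a small in-memory class that mimics the semantics of SET with NX and PX (set only if absent, with a millisecond expiry) plus an owner-checked release; InMemoryLockStore, the key prefix, and the 5000 ms default are assumptions, and the class is single-process and not thread-safe.

```python
import time
import uuid
from typing import Optional


class InMemoryLockStore:
    """Stand-in for Redis: set_nx_px mimics `SET key value NX PX ttl_ms`."""

    def __init__(self):
        self._data = {}  # key -> (value, expiry in monotonic seconds)

    def set_nx_px(self, key: str, value: str, ttl_ms: int) -> bool:
        now = time.monotonic()
        entry = self._data.get(key)
        if entry is not None and entry[1] > now:
            return False  # held by another owner and not yet expired
        self._data[key] = (value, now + ttl_ms / 1000)
        return True

    def delete_if_equal(self, key: str, value: str) -> bool:
        """Delete only if the caller still owns the lock."""
        entry = self._data.get(key)
        if entry is not None and entry[0] == value:
            del self._data[key]
            return True
        return False


def acquire_family_lock(store, family_id: str, ttl_ms: int = 5000) -> Optional[str]:
    """Return an owner token on success, or None if the family is locked."""
    owner = str(uuid.uuid4())
    acquired = store.set_nx_px(f"lock:family:{family_id}", owner, ttl_ms)
    return owner if acquired else None


def release_family_lock(store, family_id: str, owner: str) -> bool:
    return store.delete_if_equal(f"lock:family:{family_id}", owner)
```

The random owner value matters: without it, a request whose lock had already expired could delete a lock now held by a different request. Against a real Redis server the check-and-delete would have to be a single atomic operation, for example a short server-side script.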

Summary

Refresh token rotation protects against token theft through atomic single-use enforcement and family-based invalidation. Reuse detection converts a stolen token event into a security response that protects the user even if the attacker acts first. Device binding adds a second dimension of verification, and distributed locking ensures the rotation invariant holds under concurrent load.

Frequently Asked Questions

What is family-based refresh token invalidation?

Every refresh token is assigned a family_id at initial issuance (e.g., at login). Each rotation issues a new token with the same family_id and records the parent-child chain. The family is the unit of invalidation: revoking the family_id immediately invalidates all tokens in the lineage regardless of individual expiry. This allows logging out a specific device (revoke its family) without affecting other active sessions.

How does reuse detection trigger full family compromise in refresh token rotation?

When a refresh token is rotated, the old token is marked consumed (status USED). If the server receives a request using an already-consumed token, it indicates either a replay attack or that the legitimate token was stolen and used first. The server responds by revoking the entire family immediately, invalidating all tokens in the lineage, and should alert the user. This is the key security property of refresh token rotation: reuse is a compromise signal.

How do you bind refresh tokens to device fingerprints?

At issuance, compute a device fingerprint hash from stable client signals: User-Agent, IP subnet, and a device-generated identifier stored in local storage or a secure cookie. Store the fingerprint hash alongside the token record in the database. On each refresh request, recompute the fingerprint and compare it to the stored hash. A mismatch triggers elevated scrutiny (step-up authentication or family revocation) without blocking legitimate roaming users outright; consider partial matching tolerances.

Why use a distributed lock per refresh token family?

Without a lock, two concurrent requests using the same refresh token can both pass the consumed check and each receive a new token, creating a forked family tree and breaking reuse detection. Acquire a distributed lock before reading and rotating the token, e.g., Redis SET with the NX and PX options on a key derived from family_id, with a unique value and a 5000 ms expiry. Release the lock after writing the new token and marking the old one consumed. Concurrent requests block on the lock; the second request sees the token already consumed and is rejected cleanly.