OAuth 2.0 Authorization Server Low-Level Design: Grant Flows, Token Issuance, and Introspection

An OAuth 2.0 authorization server is the trust anchor of any modern authentication infrastructure. Designing one correctly means handling multiple grant flows, issuing tokens securely, enforcing scopes, and providing introspection endpoints that resource servers can rely on. This guide walks through the key design decisions you will face in a system design interview.

Requirements

Functional Requirements

  • Support Authorization Code + PKCE flow for user-delegated access
  • Support Client Credentials flow for machine-to-machine access
  • Issue short-lived access tokens and longer-lived refresh tokens
  • Expose a token introspection endpoint for resource servers
  • Support scope-based authorization with per-client scope whitelists
  • Allow token revocation by user or administrator

Non-Functional Requirements

  • Token issuance latency under 50ms at p99
  • Introspection endpoint must support 50,000 requests per second
  • High availability with no single point of failure
  • Audit log every authorization event for compliance

Data Model

The core entities are clients, authorization codes, tokens, and scopes. The clients table stores client_id, hashed client_secret, allowed grant types, redirect URI whitelist, and allowed scopes. The authorization_codes table stores a short-lived code (TTL 60 seconds), the code_challenge and method for PKCE validation, the requested scopes, and the authenticated user ID. The tokens table stores a token_id, the client_id, user_id (nullable for client credentials), scopes granted, issued_at, expires_at, and a revoked flag. Refresh tokens are stored similarly but with a longer expiry and a reference to their token family, which lets the server detect reuse of a token that has already been rotated.

Token values themselves are opaque random strings (32 bytes, base64url encoded). The database stores a SHA-256 hash of the token value, never the raw value. Lookup by token uses the hash as a secondary index.
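A minimal sketch of the token generation and hashing described above, using only the Python standard library. The function names generate_token and token_hash are illustrative, and storing the digest as a hex string is an assumption; the article only specifies SHA-256.

```python
import base64
import hashlib
import secrets


def generate_token() -> str:
    """Return a 32-byte random token, base64url encoded without padding."""
    raw = secrets.token_bytes(32)
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")


def token_hash(token: str) -> str:
    """Return the SHA-256 digest (hex) that is stored and indexed instead of the raw token."""
    return hashlib.sha256(token.encode("ascii")).hexdigest()


token = generate_token()        # handed to the client once, never persisted
stored_key = token_hash(token)  # the value written to the tokens table
```

Because the tokens are high-entropy random values, a fast unsalted hash is adequate here; a leaked table of hashes cannot feasibly be reversed into usable tokens.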

Core Algorithms

Authorization Code + PKCE Flow

  • Client generates a random code_verifier (43-128 chars) and computes code_challenge = BASE64URL(SHA256(code_verifier))
  • Authorization request includes code_challenge and method=S256
  • Server stores code_challenge with the authorization code record
  • Token exchange: server computes BASE64URL(SHA256(received code_verifier)) and compares it to the stored challenge
  • A mismatch rejects the request, so an intercepted authorization code is useless without the matching code_verifier
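The PKCE steps above can be sketched as follows. This is an illustration of the S256 method from RFC 7636, not a full endpoint; the function names are hypothetical, and the constant-time comparison is a defensive choice rather than something the flow strictly requires.

```python
import base64
import hashlib
import hmac
import secrets


def _b64url(data: bytes) -> str:
    """Base64url encode without padding, as PKCE requires."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def make_code_verifier() -> str:
    """Client side: 32 random bytes encode to a 43-character verifier (the minimum length)."""
    return _b64url(secrets.token_bytes(32))


def s256_challenge(code_verifier: str) -> str:
    """code_challenge = BASE64URL(SHA256(code_verifier))."""
    return _b64url(hashlib.sha256(code_verifier.encode("ascii")).digest())


def verify_pkce(code_verifier: str, stored_challenge: str) -> bool:
    """Server side: recompute the challenge from the verifier and compare to the stored one."""
    return hmac.compare_digest(s256_challenge(code_verifier), stored_challenge)
```

The client calls make_code_verifier and s256_challenge before the authorization request; the server calls verify_pkce at token exchange with the challenge it stored alongside the authorization code.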

Token Introspection

Resource servers call the introspection endpoint with the raw token. The server hashes the token, looks up the hash in the tokens table, checks the revoked flag and expiry, then returns an RFC 7662 response with active, scope, sub, exp, and client_id fields. To avoid hammering the database, the introspection layer maintains a short TTL cache (30 seconds) keyed on the token hash. Revocation invalidates the cache entry immediately via a pub/sub message to all introspection nodes.
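A rough sketch of that lookup path, assuming an in-memory dict as a stand-in for both the tokens table and the cache. The class and method names (TokenRecord, Introspector, evict) are hypothetical; a real deployment would back them with the database and a shared cache, and evict would be called by the pub/sub subscriber.

```python
import hashlib
import time
from dataclasses import dataclass
from typing import Optional


@dataclass
class TokenRecord:
    client_id: str
    sub: Optional[str]  # None for client-credentials tokens
    scope: str
    expires_at: float
    revoked: bool = False


class Introspector:
    def __init__(self, tokens_by_hash: dict, cache_ttl: float = 30.0):
        self._tokens = tokens_by_hash  # stand-in for the tokens table
        self._cache = {}               # token hash -> (cache expiry, response)
        self._cache_ttl = cache_ttl

    def introspect(self, raw_token: str, now: Optional[float] = None) -> dict:
        now = time.time() if now is None else now
        key = hashlib.sha256(raw_token.encode()).hexdigest()
        cached = self._cache.get(key)
        if cached is not None and cached[0] > now:
            return cached[1]
        record = self._tokens.get(key)
        if record is None or record.revoked or record.expires_at <= now:
            response = {"active": False}
            cache_until = now + self._cache_ttl
        else:
            response = {"active": True, "scope": record.scope,
                        "exp": int(record.expires_at), "client_id": record.client_id}
            if record.sub is not None:
                response["sub"] = record.sub
            # Never cache an active answer past the token's own expiry.
            cache_until = min(now + self._cache_ttl, record.expires_at)
        self._cache[key] = (cache_until, response)
        return response

    def evict(self, token_hash: str) -> None:
        """Called when a revocation message arrives for this token hash."""
        self._cache.pop(token_hash, None)
```

Note that an inactive token returns only {"active": false}, which is what RFC 7662 recommends so that callers learn nothing about tokens they should not be using.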

Refresh Token Rotation

Each use of a refresh token issues a new refresh token and invalidates the old one. The server groups all refresh tokens descended from the same grant into a token family. If an already-revoked refresh token is presented, the server treats it as a replay attack and revokes the entire family. This limits the blast radius of a stolen refresh token.
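One way to sketch rotation with reuse detection is shown below. The RefreshStore class, its in-memory dict, and the use of PermissionError are assumptions made for illustration; a production server would run the same logic inside a database transaction.

```python
import hashlib
import secrets
from dataclasses import dataclass
from typing import Optional


@dataclass
class RefreshRecord:
    family_id: str
    revoked: bool = False


class RefreshStore:
    def __init__(self):
        self._by_hash = {}  # SHA-256 of token value -> RefreshRecord

    @staticmethod
    def _key(value: str) -> str:
        return hashlib.sha256(value.encode()).hexdigest()

    def issue(self, family_id: Optional[str] = None) -> str:
        """Issue a refresh token, starting a new family unless one is given."""
        value = secrets.token_urlsafe(32)
        self._by_hash[self._key(value)] = RefreshRecord(family_id or secrets.token_hex(8))
        return value

    def rotate(self, presented: str) -> str:
        """Exchange a refresh token for a new one in the same family."""
        record = self._by_hash.get(self._key(presented))
        if record is None:
            raise PermissionError("unknown refresh token")
        if record.revoked:
            # Reuse of a rotated token: assume theft and revoke the whole family.
            for other in self._by_hash.values():
                if other.family_id == record.family_id:
                    other.revoked = True
            raise PermissionError("refresh token reuse detected; family revoked")
        record.revoked = True
        return self.issue(record.family_id)
```

After a reuse is detected, even the newest token in the family stops working, so both the attacker and the legitimate client are forced back through a fresh authorization.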

API Design

  • GET /authorize: Validates the client and redirect URI, redirects the user to login, and issues an authorization code on consent
  • POST /token: Exchanges an authorization code, refresh token, or client credentials for tokens
  • POST /introspect: Returns token metadata for resource servers; requires client authentication
  • POST /revoke: Accepts a token and an optional token_type_hint; invalidates the token and its associated family
  • GET /.well-known/oauth-authorization-server: Publishes server metadata per RFC 8414
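As an illustration of what the token endpoint does for the client credentials grant, the sketch below checks the grant type and requested scopes against the client record and builds the response body. It assumes client authentication has already succeeded; the Client and OAuthError types, the 300-second lifetime, and the default of granting all allowed scopes when none are requested are assumptions for this example. Persisting the token hash is omitted.

```python
import secrets
from dataclasses import dataclass


@dataclass(frozen=True)
class Client:
    client_id: str
    allowed_grant_types: frozenset
    allowed_scopes: frozenset


class OAuthError(Exception):
    """Carries an RFC 6749 error code such as invalid_scope."""

    def __init__(self, code: str):
        super().__init__(code)
        self.code = code


def issue_client_credentials_token(client: Client, requested_scope: str = "",
                                   ttl_seconds: int = 300) -> dict:
    if "client_credentials" not in client.allowed_grant_types:
        raise OAuthError("unauthorized_client")
    # An empty scope parameter falls back to everything the client is allowed.
    scopes = set(requested_scope.split()) or set(client.allowed_scopes)
    if not scopes <= client.allowed_scopes:
        raise OAuthError("invalid_scope")
    return {
        "access_token": secrets.token_urlsafe(32),
        "token_type": "Bearer",
        "expires_in": ttl_seconds,
        "scope": " ".join(sorted(scopes)),
    }
```

The response carries no refresh token: with client credentials the client can simply authenticate again, so a refresh token adds risk without adding value.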

Scalability and Infrastructure

Authorization servers run as stateless pods behind a load balancer: all token state lives in the database, so any pod can serve any request. The tokens table lives in a primary-replica PostgreSQL cluster. Token hashes are indexed, and the table is partitioned by issued_at for efficient expiry purges.

Introspection is the hot path. Cache token validation results in Redis with a 30-second TTL, keyed on the token hash. A revocation event publishes to a Redis pub/sub channel, and subscribers evict the affected cache key immediately, keeping the staleness window near zero. Redis pub/sub is fire-and-forget, so if a subscriber misses a message the 30-second TTL becomes the upper bound on staleness.

For very high introspection throughput, consider self-contained JWT access tokens signed with RS256. Resource servers validate the signature and expiry locally without a network call. Because a signed JWT cannot be recalled once issued, revocation relies on short token lifetimes (at most 5 minutes) plus a token status list endpoint that resource servers poll periodically. This eliminates the introspection bottleneck at the cost of slightly delayed revocation.
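The resource-server side of this trade-off might look like the sketch below. It covers only the checks that follow signature verification, which is assumed to have been done already by a JWT library. The RevocationCache class, the fetch_revoked callable, the 60-second poll interval, and the use of the jti claim as the status list key are all assumptions for illustration.

```python
import time
from typing import Callable, Optional


class RevocationCache:
    """Local copy of revoked token IDs, refreshed by polling a status list."""

    def __init__(self, fetch_revoked: Callable[[], set], poll_interval: float = 60.0):
        self._fetch = fetch_revoked  # hypothetical call to the status list endpoint
        self._poll_interval = poll_interval
        self._revoked = set()
        self._next_poll = 0.0

    def is_revoked(self, jti: str, now: float) -> bool:
        if now >= self._next_poll:
            self._revoked = set(self._fetch())
            self._next_poll = now + self._poll_interval
        return jti in self._revoked


def claims_are_valid(claims: dict, revocations: RevocationCache,
                     now: Optional[float] = None) -> bool:
    """Check expiry and revocation for claims whose signature was already verified."""
    now = time.time() if now is None else now
    if claims.get("exp", 0) <= now:
        return False
    return not revocations.is_revoked(claims.get("jti", ""), now)
```

The worst-case revocation delay in this sketch is the poll interval, and no revoked token survives longer than its own lifetime, which is why the two values are tuned together.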

Security Considerations

  • Always use PKCE even for confidential clients to guard against code interception
  • Bind authorization codes to the redirect URI; reject mismatches at token exchange
  • Rotate signing keys periodically; publish JWKS endpoint for async key discovery
  • Rate-limit the /token endpoint per client_id to slow credential stuffing
  • Store client secrets as bcrypt hashes; never log raw secrets or tokens
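For the rate-limiting item, a per-client token bucket is one common approach. The sketch below keeps state in process memory, which is an assumption made for brevity; with multiple stateless pods the counters would need to live in a shared store. The class name and parameters are hypothetical.

```python
import time
from typing import Optional


class ClientRateLimiter:
    """Token-bucket limiter keyed by client_id."""

    def __init__(self, rate_per_second: float, burst: int):
        self._rate = rate_per_second
        self._burst = float(burst)
        self._buckets = {}  # client_id -> (permits remaining, last update time)

    def allow(self, client_id: str, now: Optional[float] = None) -> bool:
        now = time.monotonic() if now is None else now
        permits, last = self._buckets.get(client_id, (self._burst, now))
        # Refill in proportion to elapsed time, capped at the burst size.
        permits = min(self._burst, permits + (now - last) * self._rate)
        if permits < 1.0:
            self._buckets[client_id] = (permits, now)
            return False
        self._buckets[client_id] = (permits - 1.0, now)
        return True
```

A handler for /token would call allow(client_id) before verifying the secret and return an error response when it is False, so that repeated guesses against one client are throttled without affecting others.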

Interview Talking Points

Interviewers want to see that you understand the difference between the authorization server and the resource server, can explain why the authorization code flow with PKCE replaces the implicit flow, and know how to make introspection scale. Be ready to discuss the trade-off between opaque tokens with central introspection versus self-contained JWTs with local validation. The right answer depends on how quickly you need revocation to propagate.

Frequently Asked Questions

How does the authorization code flow with PKCE work in an OAuth server design?

The client generates a random code_verifier and hashes it (SHA-256) to produce a code_challenge, which is sent with the authorization request. The server stores the challenge and issues an auth code. At token exchange the client sends the original code_verifier; the server hashes it and compares to the stored challenge, preventing interception attacks without requiring a client secret.

When should you use the client credentials flow instead of the authorization code flow?

Client credentials is machine-to-machine: there is no end user. A service authenticates directly with the authorization server using its client_id and client_secret and receives an access token scoped to its own resources. Use it for back-end service APIs, cron jobs, and daemon processes where a user context is absent.

What is token introspection and why is it needed?

Token introspection (RFC 7662) lets a resource server query the authorization server to determine whether a presented access token is active and to retrieve its metadata (scope, expiry, subject). It is essential when tokens are opaque strings rather than self-contained JWTs, allowing centralized revocation checks without exposing signing keys to every resource server.

How do you design token revocation in an OAuth server?

RFC 7009 defines a revocation endpoint that accepts a token and optional token_type_hint. For opaque tokens, remove them from the active-token store. For JWTs, maintain a deny-list keyed by jti (JWT ID) with TTL matching the token expiry. Revoke both the access token and its associated refresh token to fully terminate a session. Propagate revocation events to resource servers via a pub/sub channel or short cache TTLs.
