What Is Session Consistency?
Session consistency is a client-centric consistency model whose guarantees hold within the scope of a single client session. It is weaker than linearizability but stronger than eventual consistency. The two guarantees this design enforces are:
- Read-your-writes: Within a session, reads always reflect all prior writes made in that same session.
- Monotonic reads: Within a session, the versions a client observes never go backwards. Once a client has read a particular version of the data, no later read in that session returns an older one.
Session consistency allows reads to be served from replicas, enabling horizontal read scaling, while still providing meaningful per-client guarantees.
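To make the two guarantees concrete, here is a minimal sketch of a checker that scans one session's history and reports violations. The event format and the function name `violations` are assumptions made for illustration, not part of any particular system.

```python
from typing import List, Tuple

# Each event is ("write", lsn_assigned) or ("read", lsn_observed), in session order.
Event = Tuple[str, int]


def violations(history: List[Event]) -> List[str]:
    """Return a description of every session-guarantee violation in history."""
    problems = []
    last_write = 0  # highest LSN this session has written
    last_read = 0   # highest LSN this session has observed
    for i, (op, lsn) in enumerate(history):
        if op == "write":
            last_write = max(last_write, lsn)
        elif op == "read":
            if lsn < last_write:
                problems.append(f"event {i}: read at {lsn} misses own write at {last_write}")
            if lsn < last_read:
                problems.append(f"event {i}: read at {lsn} is older than earlier read at {last_read}")
            last_read = max(last_read, lsn)
    return problems
```

A history such as write at 10, read at 12, read at 11 satisfies read-your-writes (both reads are at or past LSN 10) but breaks monotonic reads, because the second read goes back from 12 to 11.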
Session Token Design
The session token is an opaque value issued to the client that encodes the consistency state of the session. It carries two fields:
- last_write_lsn: The log sequence number (LSN) of the most recent write the client made. This is used to enforce read-your-writes.
- last_read_lsn: The LSN of the most recent read served to the client. This is used to enforce monotonic reads.
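The token is typically base64-encoded or packaged as a JWT. Clients should treat it as an opaque string, while servers parse it. Plain base64 is only an encoding: the client can decode and alter it, so the token should be signed (as a JWT is) if servers must trust its contents. It travels in an HTTP header (e.g., X-Session-Token) on every request.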
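One way to make such a token tamper-evident is to append an HMAC over the payload. The sketch below is an illustration under stated assumptions: the `SECRET` value, the "payload.signature" layout, and the function names are hypothetical, and a production system would more likely use an established JWT library.

```python
import base64
import hashlib
import hmac
import json
import time

SECRET = b"demo-secret"  # hypothetical key; load from a secret store in practice


def encode_token(last_write_lsn: int, last_read_lsn: int, ttl_s: int = 3600) -> str:
    """Serialize the session state and append an HMAC-SHA256 signature."""
    payload = json.dumps({
        "last_write_lsn": last_write_lsn,
        "last_read_lsn": last_read_lsn,
        "expires_at": time.time() + ttl_s,
    }, sort_keys=True).encode()
    sig = hmac.new(SECRET, payload, hashlib.sha256).digest()
    # Layout (assumed): base64url(payload) + "." + base64url(signature)
    return (base64.urlsafe_b64encode(payload) + b"." + base64.urlsafe_b64encode(sig)).decode()


def decode_token(token: str) -> dict:
    """Verify the signature and return the payload; raise ValueError if invalid."""
    payload_b64, sig_b64 = token.encode().split(b".")
    payload = base64.urlsafe_b64decode(payload_b64)
    expected = hmac.new(SECRET, payload, hashlib.sha256).digest()
    if not hmac.compare_digest(expected, base64.urlsafe_b64decode(sig_b64)):
        raise ValueError("bad signature")
    return json.loads(payload)
```

The server would call `decode_token` on the X-Session-Token header value and reject the request (or fall back to a fresh token) when verification fails.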
Read-Your-Writes Enforcement
When the client sends a read request with a session token, the server extracts last_write_lsn from the token and compares it to its own apply_lsn (the LSN up to which the replica has applied changes from the write-ahead log).
- If apply_lsn >= last_write_lsn, the replica can serve the read immediately — the client's write is visible.
- If apply_lsn < last_write_lsn, the server either waits (with a configurable timeout) for replication to catch up, or routes the request to the primary or a more up-to-date replica.
On completion, the server returns the current apply_lsn in the response, allowing the client to update its session token.
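The serve-or-wait decision above can be sketched as a small server-side handler. This is a minimal illustration, assuming the replica exposes its apply position through a callable; `handle_read`, `get_apply_lsn`, and the returned dictionary shape are hypothetical names.

```python
import time
from typing import Callable


def handle_read(get_apply_lsn: Callable[[], int], last_write_lsn: int,
                timeout_s: float = 0.5, poll_s: float = 0.01) -> dict:
    """Decide how a replica answers a read that carries last_write_lsn."""
    deadline = time.monotonic() + timeout_s
    while True:
        apply_lsn = get_apply_lsn()
        if apply_lsn >= last_write_lsn:
            # The client's own write is visible here; report apply_lsn
            # so the client can update its session token.
            return {"action": "serve", "apply_lsn": apply_lsn}
        if time.monotonic() >= deadline:
            # Replication did not catch up in time; hand off to the primary.
            return {"action": "redirect_to_primary", "apply_lsn": apply_lsn}
        time.sleep(poll_s)
```

Polling keeps the sketch short; a real replica could instead block on a notification from its replication apply loop.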
Monotonic Read Enforcement
The session token also encodes last_read_lsn. When the client reads, the server must serve the read at an LSN that is at least last_read_lsn. This prevents the client from observing a version that is older than what it already read.
The required LSN for a read is therefore: max(last_write_lsn, last_read_lsn). After serving the read, the server reports its current apply_lsn, and the client updates last_read_lsn to max(last_read_lsn, returned_lsn).
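As a worked example of the two formulas, the sketch below tracks both fields on the client side. The class and function names are illustrative assumptions, and the LSN values are made up.

```python
from dataclasses import dataclass


@dataclass
class SessionState:
    last_write_lsn: int = 0
    last_read_lsn: int = 0


def required_lsn(state: SessionState) -> int:
    """Lowest LSN a replica must have applied to serve this session's next read."""
    return max(state.last_write_lsn, state.last_read_lsn)


def record_read(state: SessionState, returned_lsn: int) -> None:
    """Advance last_read_lsn; it never moves backwards."""
    state.last_read_lsn = max(state.last_read_lsn, returned_lsn)


state = SessionState(last_write_lsn=120, last_read_lsn=95)
print(required_lsn(state))  # 120: the session's own write dominates
record_read(state, 134)     # a replica served the read at apply_lsn 134
print(required_lsn(state))  # 134: later reads must not go below what was seen
```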
Session Migration
When a client connects to a different backend server (due to load balancing, failover, or reconnection), it presents its session token. The new server must either:
- Verify that its apply_lsn satisfies the token's required LSN and serve the request directly.
- Wait for replication to catch up before serving.
- Route the request to the primary if the required LSN is too far ahead.
This means session tokens enable consistent behavior across server migrations without requiring sticky sessions at the load balancer level.
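The three options can be expressed as a single decision on how far the new server lags behind the token. This is a sketch under an assumption: the text does not define "too far ahead", so the `max_wait_lag` threshold (measured in LSN units) is a hypothetical tuning parameter.

```python
def route_after_migration(required_lsn: int, apply_lsn: int, max_wait_lag: int = 100) -> str:
    """Pick one of the three migration options for a newly contacted server."""
    lag = required_lsn - apply_lsn
    if lag <= 0:
        return "serve"             # server already satisfies the token
    if lag <= max_wait_lag:
        return "wait"              # small gap: wait for replication to catch up
    return "route_to_primary"      # gap too large: send the request to the primary


print(route_after_migration(required_lsn=500, apply_lsn=520))  # serve
print(route_after_migration(required_lsn=500, apply_lsn=450))  # wait
print(route_after_migration(required_lsn=500, apply_lsn=100))  # route_to_primary
```

Because the decision depends only on the token and the server's own apply_lsn, any server can make it without knowing which server handled the session before.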
Session Expiry
Session tokens carry a TTL (time-to-live). When a session expires:
- The server no longer honors LSN requirements from the expired token.
- The client is issued a fresh token with no LSN constraints.
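- The client effectively degrades to eventual consistency with respect to its earlier operations. Guarantees resume from its next read or write, which sets new LSN constraints in the fresh token.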
Session expiry prevents unbounded accumulation of session state and allows the system to reclaim resources for inactive sessions.
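The expiry rules can be sketched as a function that either keeps a live token or replaces it with an unconstrained one. The names `Token` and `effective_token` and the default TTL are assumptions for illustration.

```python
import time
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Token:
    last_write_lsn: int
    last_read_lsn: int
    expires_at: float  # Unix timestamp in seconds


def effective_token(token: Token, ttl_s: float = 3600.0, now: Optional[float] = None) -> Token:
    """Return the token to honor: the original if still live, else a fresh one."""
    now = time.time() if now is None else now
    if now < token.expires_at:
        return token
    # Expired: drop all LSN requirements and start a new TTL window.
    return Token(last_write_lsn=0, last_read_lsn=0, expires_at=now + ttl_s)
```

With a fresh token the required LSN is 0, so any replica may serve the next read, which is the degraded behavior described above.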
Tradeoffs vs. Linearizability
Session consistency is significantly weaker than linearizability. Under linearizability, every operation appears to take effect atomically at a single point between its invocation and its response, which requires coordination (quorum reads, leader reads) on every operation. Session consistency allows reads from any replica that has caught up to the session's required LSN and needs coordination only when a replica lags behind it. This makes session consistency well-suited for read-heavy, geographically distributed workloads.
SQL Schema
-- Tracks per-client session state
CREATE TABLE ClientSession (
    session_id     UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    last_write_lsn BIGINT NOT NULL DEFAULT 0,
    last_read_lsn  BIGINT NOT NULL DEFAULT 0,
    created_at     TIMESTAMPTZ NOT NULL DEFAULT now(),
    expires_at     TIMESTAMPTZ NOT NULL
);
CREATE INDEX idx_clientsession_expires ON ClientSession(expires_at);

-- Audit log of reads served under session consistency
CREATE TABLE SessionRead (
    id            BIGSERIAL PRIMARY KEY,
    session_id    UUID NOT NULL REFERENCES ClientSession(session_id),
    requested_lsn BIGINT NOT NULL,
    served_lsn    BIGINT NOT NULL,
    served_by     VARCHAR(64) NOT NULL, -- replica host
    latency_ms    INTEGER NOT NULL,
    read_at       TIMESTAMPTZ NOT NULL DEFAULT now()
);
CREATE INDEX idx_sessionread_session ON SessionRead(session_id, read_at);
Python Implementation
import base64
import json
import time
from dataclasses import dataclass
from typing import Optional


@dataclass
class SessionToken:
    write_lsn: int
    read_lsn: int
    expires_at: float


class SessionConsistencyClient:
    def __init__(self, servers: list, session_ttl_s: int = 3600):
        self.servers = servers  # servers[0] is treated as the primary
        self.session_ttl_s = session_ttl_s
        self.write_lsn: int = 0
        self.read_lsn: int = 0

    def encode_token(self, write_lsn: int, read_lsn: int) -> str:
        payload = {
            "write_lsn": write_lsn,
            "read_lsn": read_lsn,
            "expires_at": time.time() + self.session_ttl_s
        }
        return base64.b64encode(json.dumps(payload).encode()).decode()

    def decode_token(self, token: str) -> SessionToken:
        payload = json.loads(base64.b64decode(token).decode())
        return SessionToken(
            write_lsn=payload["write_lsn"],
            read_lsn=payload["read_lsn"],
            expires_at=payload["expires_at"]
        )

    def wait_for_lsn(self, server: str, required_lsn: int, timeout_ms: int = 500) -> bool:
        """Poll server until its apply_lsn >= required_lsn or timeout."""
        # Monotonic clock: the deadline is unaffected by wall-clock adjustments.
        deadline = time.monotonic() + timeout_ms / 1000
        while time.monotonic() < deadline:
            if self._get_apply_lsn(server) >= required_lsn:
                return True
            time.sleep(0.02)
        return False

    def _get_apply_lsn(self, server: str) -> int:
        # Placeholder: HTTP call to /internal/apply_lsn
        # Returns the server's current replication apply position
        return 0

    def write(self, key: str, value: str) -> int:
        """Write to primary and update session write LSN."""
        primary = self.servers[0]
        # Placeholder: POST /write {key, value} -> returns {"lsn": N}
        returned_lsn = self._do_write(primary, key, value)
        self.write_lsn = max(self.write_lsn, returned_lsn)
        return returned_lsn

    def read(self, key: str, timeout_ms: int = 500) -> Optional[str]:
        """Read from any server that satisfies session LSN requirements."""
        required_lsn = max(self.write_lsn, self.read_lsn)
        for server in self.servers:
            if self.wait_for_lsn(server, required_lsn, timeout_ms):
                value, served_lsn = self._do_read(server, key)
                self.read_lsn = max(self.read_lsn, served_lsn)
                return value
        # Fall back to primary
        value, served_lsn = self._do_read(self.servers[0], key)
        self.read_lsn = max(self.read_lsn, served_lsn)
        return value

    def _do_write(self, server: str, key: str, value: str) -> int:
        return 0  # Placeholder

    def _do_read(self, server: str, key: str):
        return None, 0  # Placeholder
FAQ
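- Read-your-writes vs linearizability: Session consistency constrains only what a client observes relative to its own earlier reads and writes, and makes no promise about other clients' writes; this is why it is much cheaper than full linearizability.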
- Session token content: last_write_lsn, last_read_lsn, and expiry timestamp, base64 or JWT encoded.
- LSN wait timeout: Configurable per request; on timeout, route to primary or return error.
- Session expiry behavior: Token invalidated; client degrades to eventual consistency for its earlier operations until its next read or write sets new LSN constraints.