Cursor Pagination — Low-Level Design
Cursor pagination (also called keyset pagination) navigates large datasets stably and efficiently by using a pointer to the last seen row rather than a numeric offset. It is the correct pagination strategy for APIs, feeds, and any dataset that changes while users browse. This pattern is asked at every company building a data-heavy API.
Why OFFSET Pagination Fails at Scale
-- OFFSET pagination
SELECT * FROM Post ORDER BY created_at DESC LIMIT 20 OFFSET 1000;
Problems:
1. Performance: OFFSET 1000 scans and discards 1000 rows before returning 20.
At OFFSET 1,000,000: scans 1M rows — seconds of latency.
2. Stability: if a new post is inserted at the top while a user is on page 3,
all subsequent pages shift by 1. Page 4 now contains items that were
on page 3 before — the user sees the same item twice.
3. Database work grows linearly with page number: O(offset + limit).
Cursor Pagination by Single Column
-- First page: no cursor
SELECT * FROM Post
WHERE author_id = %(uid)s
ORDER BY created_at DESC
LIMIT 21; -- fetch 21, return 20 to detect if there is a next page
-- Extract cursor from last returned item:
next_cursor = base64(json({'created_at': last_item.created_at.isoformat()}))
-- Next page: use cursor
SELECT * FROM Post
WHERE author_id = %(uid)s
AND created_at < %(cursor_created_at)s -- strictly less than cursor
ORDER BY created_at DESC
LIMIT 21;
-- Index that makes this efficient:
CREATE INDEX idx_post_author_time ON Post(author_id, created_at DESC);
-- Query becomes: index seek to author_id + created_at < cursor, then forward scan
Compound Cursor (Stable Tiebreaker)
-- Problem: sorting by created_at alone is unstable if two rows share a timestamp
-- (batch imports, millisecond collisions). The same row may appear on two pages.
-- Solution: compound cursor on (created_at DESC, id DESC)
-- id is globally unique, so the compound is always stable.
SELECT * FROM Post
WHERE author_id = %(uid)s
AND (created_at, id) < (%(cursor_ts)s, %(cursor_id)s)
ORDER BY created_at DESC, id DESC
LIMIT 21;
-- Encode cursor:
def encode_cursor(row):
return base64.b64encode(json.dumps({
'ts': row.created_at.isoformat(),
'id': row.id
}).encode()).decode()
def decode_cursor(cursor_str):
data = json.loads(base64.b64decode(cursor_str))
return data['ts'], data['id']
# The (created_at, id) < (ts, id) condition uses PostgreSQL's
# row value comparison — equivalent to:
# created_at < ts OR (created_at = ts AND id < cursor_id)
Cursor for “Sort by Score” (Non-Monotonic)
-- Sorting by upvote_count: scores can change while paginating
-- (a post gains votes between page 1 and page 2 fetches)
-- Accept some instability; use compound cursor as best-effort.
SELECT * FROM Post
WHERE (upvote_count, id) < (%(cursor_votes)s, %(cursor_id)s)
ORDER BY upvote_count DESC, id DESC
LIMIT 20;
-- Index:
CREATE INDEX idx_post_votes ON Post(upvote_count DESC, id DESC);
-- For "sort by score" feeds: accept that new viral posts
-- may appear unexpectedly on later pages. This is unavoidable
-- without fully materializing the sorted list.
API Response Shape
def paginate(query_fn, cursor_str=None, limit=20):
cursor = decode_cursor(cursor_str) if cursor_str else None
items = query_fn(cursor=cursor, limit=limit + 1)
has_next = len(items) > limit
if has_next:
items = items[:limit]
next_cursor = encode_cursor(items[-1]) if has_next else None
return {
'items': items,
'next_cursor': next_cursor, # null on last page
'has_next': has_next,
}
# Response:
{
"items": [...],
"next_cursor": "eyJ0cyI6IjIwMjQtMDEtMTVUMTA6MzA6MDAiLCJpZCI6MTIzNH0=",
"has_next": true
}
# Client passes next_cursor as ?cursor= on the next request
Bi-Directional Cursor Pagination
-- Supporting both "next page" and "previous page" requires two cursor types
-- Forward cursor (next page): items where (ts, id) cursor, ORDER ASC, then reverse
def get_page(cursor=None, direction='next', limit=20):
if direction == 'next' or cursor is None:
rows = db.execute("""
SELECT * FROM Post WHERE (created_at, id) (%(ts)s, %(id)s)
ORDER BY created_at ASC, id ASC LIMIT %(lim)s
""", ...)
rows = list(reversed(rows)) # Re-reverse to consistent DESC order
return rows
Key Interview Points
- Cursor pagination is O(1) regardless of page depth: The database seeks to the cursor position using the index and scans forward. No rows before the cursor are visited. OFFSET pagination is O(offset) — it gets slower as the user goes deeper.
- Always use a tiebreaker: A cursor on a non-unique column (created_at) can skip or duplicate rows if multiple rows share the same value. Add the primary key as a tiebreaker to make the cursor unambiguous.
- Encode cursors as opaque tokens: Base64-encode the cursor data so clients treat it as an opaque string. This allows changing the cursor format (e.g., adding fields) without breaking clients that just pass it back.
- Fetch limit+1 to detect next page: Fetching one extra row and checking if it exists is cheaper than a COUNT(*) query. If N+1 rows are returned, there is a next page; return only N rows to the client.
Cursor-based feed pagination and infinite scroll design is discussed in Twitter system design interview questions.
Feed pagination and cursor-based API design is covered in LinkedIn system design interview preparation.
News feed pagination and cursor API design is discussed in Meta system design interview guide.