Question 1

Why does offset pagination break at scale and what replaces it?

Accepted Answer

OFFSET N makes the database read and discard N rows before it returns anything: at OFFSET 100000 it walks past 100,000 rows to return 20, so cost grows linearly with page depth. Offsets are also unstable under concurrent writes. With a newest-first ordering, a row inserted between the page 1 and page 2 fetches shifts every later row down by one, so the last row of page 1 reappears on page 2; a delete shifts rows the other way and a row is skipped. Cursor (keyset) pagination replaces this: the cursor encodes the sort key of the last row seen (e.g., {created_at, id}), and the query uses a keyset condition WHERE (created_at, id) < (cursor_ts, cursor_id) ORDER BY created_at DESC, id DESC LIMIT 20. With a matching index, this is an O(log N) seek plus a read of LIMIT rows regardless of page depth, and inserts or deletes elsewhere in the list no longer shift the page boundary.
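The drift described above can be reproduced in a few lines. This is a minimal sketch that uses an in-memory SQLite database as a stand-in for the real store; the six sample rows, the integer timestamps, and the helper names `offset_page` and `keyset_page` are assumptions made for illustration. The keyset condition is written in its expanded form, which is equivalent to `(created_at, id) < (cursor_ts, cursor_id)`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Post (id INTEGER PRIMARY KEY, created_at INTEGER NOT NULL)")
# Six sample posts: id 1..6 with increasing timestamps (illustrative data).
conn.executemany(
    "INSERT INTO Post (id, created_at) VALUES (?, ?)",
    [(i, 1000 + i) for i in range(1, 7)],
)

PAGE = 3

def offset_page(offset):
    """Offset pagination: skip `offset` rows, then return PAGE ids."""
    rows = conn.execute(
        "SELECT id FROM Post ORDER BY created_at DESC, id DESC LIMIT ? OFFSET ?",
        (PAGE, offset),
    ).fetchall()
    return [r[0] for r in rows]

def keyset_page(cursor_ts, cursor_id):
    """Keyset pagination: return PAGE ids strictly after the cursor position."""
    rows = conn.execute(
        "SELECT id FROM Post "
        "WHERE created_at < ? OR (created_at = ? AND id < ?) "
        "ORDER BY created_at DESC, id DESC LIMIT ?",
        (cursor_ts, cursor_ts, cursor_id, PAGE),
    ).fetchall()
    return [r[0] for r in rows]

page1 = offset_page(0)            # newest three posts: ids 6, 5, 4
last_ts, last_id = 1004, 4        # sort key of the last row on page 1

# A new post arrives between the two page fetches.
conn.execute("INSERT INTO Post (id, created_at) VALUES (7, 1007)")

offset_page2 = offset_page(PAGE)              # id 4 shows up again
keyset_page2 = keyset_page(last_ts, last_id)  # continues after id 4

print(page1, offset_page2, keyset_page2)
```

The offset version returns id 4 on both pages, while the keyset version resumes exactly after the last row the client saw.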

Question 2

What must a cursor encode and why should it be opaque to the client?

Accepted Answer

A cursor must encode every column in the ORDER BY clause, typically {created_at, id} for a time-ordered feed. Both columns are needed because created_at alone is not unique (multiple rows can share a timestamp); the id breaks ties so the ordering is total. The cursor should be an opaque token such as base64-encoded JSON, not a raw integer or exposed column value, for two reasons: (1) opacity discourages clients from constructing cursors by hand and coupling to your DB schema, and (2) you can change the cursor internals (for example, add a new sort column) without breaking the client API contract. Base64 is an encoding, not encryption: a client can still decode or forge a token, so always validate decoded cursors on the server before using them in queries, and sign them if tampering matters.
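One possible shape for such a token is sketched below. The function names `encode_cursor` and `decode_cursor`, the JSON field names, and the choice of an ISO 8601 string for the timestamp are assumptions for illustration, not a prescribed format.

```python
import base64
import json
from datetime import datetime

def encode_cursor(created_at: str, post_id: int) -> str:
    """Pack the sort key of the last row on the page into an opaque token."""
    payload = json.dumps({"created_at": created_at, "id": post_id})
    return base64.urlsafe_b64encode(payload.encode("utf-8")).decode("ascii")

def decode_cursor(token: str) -> tuple:
    """Decode and validate a client-supplied token; raise ValueError if it is bad."""
    try:
        data = json.loads(base64.urlsafe_b64decode(token.encode("ascii")))
    except ValueError as exc:  # covers bad base64, bad UTF-8, and bad JSON
        raise ValueError("malformed cursor") from exc
    if not isinstance(data, dict):
        raise ValueError("malformed cursor")
    created_at, post_id = data.get("created_at"), data.get("id")
    if not isinstance(created_at, str) or type(post_id) is not int:
        raise ValueError("cursor has wrong field types")
    datetime.fromisoformat(created_at)  # raises ValueError on a bad timestamp
    return created_at, post_id

token = encode_cursor("2024-05-01T12:00:00+00:00", 42)
print(token)
print(decode_cursor(token))
```

The decoded values should still be passed to the database as bound parameters, never interpolated into the SQL string.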

Question 3

How do you return has_next without a COUNT(*) query?

Accepted Answer

Request limit + 1 rows. If the result set has limit + 1 rows, there is a next page: set has_next=true and return only the first limit rows. If it has limit or fewer rows, has_next=false. This costs at most one extra row of work, whereas COUNT(*) must visit every matching row. Avoid SELECT COUNT(*) on the pagination hot path: on a table with millions of rows it is a full table scan (or full index scan) that can add hundreds of milliseconds to every paginated request.
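The limit + 1 trick can be sketched independently of any database. Here `paginate` and its `fetch` callback are hypothetical names; `fetch(n)` stands for whatever runs the page query with LIMIT n, and a plain list plays that role in the usage lines.

```python
from typing import Callable, List, Sequence, Tuple

def paginate(fetch: Callable[[int], Sequence], limit: int) -> Tuple[List, bool]:
    """Fetch one row more than requested to learn whether another page exists."""
    rows = list(fetch(limit + 1))
    has_next = len(rows) > limit   # the extra row is only a probe
    return rows[:limit], has_next  # never hand the probe row to the client

# Stand-in for a query: returns up to n rows from an in-memory list.
data = ["a", "b", "c", "d", "e"]
fetch = lambda n: data[:n]

print(paginate(fetch, 2))
print(paginate(fetch, 5))
```

The next cursor should be built from the last row that is actually returned, not from the probe row.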

Question 4

What index is required for cursor pagination to be efficient?

Accepted Answer

The index columns must match the ORDER BY columns, in the same order. For SELECT * FROM Post WHERE (created_at, id) < (%s, %s) ORDER BY created_at DESC, id DESC LIMIT 20, create: CREATE INDEX idx_post_pagination ON Post(created_at DESC, id DESC). PostgreSQL can use this index for both the keyset WHERE condition and the ORDER BY, making the query an index range scan starting from the cursor position rather than a sequential scan. Because PostgreSQL can scan a B-tree index backward, an ascending index on (created_at, id) serves this query equally well; explicit per-column directions only matter when the sort mixes ASC and DESC. Without a suitable index, PostgreSQL has to scan and sort the matching rows on every request, which discards most of the advantage over offset.
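Whether the planner actually picks the index is worth checking rather than assuming. The sketch below does that check against an in-memory SQLite database, which is only a stand-in: its planner and plan text differ from PostgreSQL's, where the equivalent check is to run EXPLAIN on the query. The integer timestamp column and the `plan_details` helper are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Post (id INTEGER PRIMARY KEY, created_at INTEGER NOT NULL, body TEXT)"
)
conn.execute("CREATE INDEX idx_post_pagination ON Post(created_at DESC, id DESC)")

QUERY = (
    "SELECT id, created_at FROM Post "
    "WHERE (created_at, id) < (?, ?) "
    "ORDER BY created_at DESC, id DESC LIMIT 20"
)

def plan_details(sql, params):
    """Return the human-readable plan lines (the last column of each plan row)."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return [row[-1] for row in rows]

details = plan_details(QUERY, (1700000000, 42))
uses_index = any("idx_post_pagination" in d for d in details)

for line in details:
    print(line)
print("uses pagination index:", uses_index)
```

A plan that mentions a separate sort step, or no index at all, is the signal that the index and the query have drifted apart.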

Question 5

How does cursor pagination work with filtered queries?

Accepted Answer

The filter column must be added to the index as the leading column. For a feed filtered by user_id: CREATE INDEX idx_post_user ON Post(user_id, created_at DESC, id DESC). The query becomes: WHERE user_id = :uid AND (created_at, id) < (:ts, :id) ORDER BY created_at DESC, id DESC LIMIT 20. The cursor is valid only within the context of the same filter: a cursor from user_id=5's feed cannot be used for user_id=6. Either encode the filter parameters in the cursor and reject a request whose filters do not match, or require the client to pass the same filters on every page. Changing filters requires starting from the first page.
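One way to bind a cursor to its filter is to store the filter value inside the token and compare it on every request. The sketch below only builds the SQL text and parameter dictionary; `encode_feed_cursor`, `build_feed_query`, and the short field names `uid`, `ts`, `id` are hypothetical names chosen to mirror the placeholders in the query above.

```python
import base64
import json
from typing import Optional

BASE_SQL = "SELECT id, created_at FROM Post WHERE user_id = :uid"
ORDER_SQL = " ORDER BY created_at DESC, id DESC LIMIT :limit"

def encode_feed_cursor(user_id: int, created_at: str, post_id: int) -> str:
    """Store the filter value next to the sort key so it can be checked later."""
    payload = json.dumps({"uid": user_id, "ts": created_at, "id": post_id})
    return base64.urlsafe_b64encode(payload.encode("utf-8")).decode("ascii")

def build_feed_query(user_id: int, cursor: Optional[str], limit: int = 20):
    """Return (sql, params); reject a cursor issued for a different user_id."""
    params = {"uid": user_id, "limit": limit}
    if cursor is None:  # first page: no keyset condition
        return BASE_SQL + ORDER_SQL, params
    try:
        data = json.loads(base64.urlsafe_b64decode(cursor.encode("ascii")))
    except ValueError as exc:
        raise ValueError("malformed cursor") from exc
    if not isinstance(data, dict) or not {"uid", "ts", "id"} <= data.keys():
        raise ValueError("malformed cursor")
    if data["uid"] != user_id:
        raise ValueError("cursor does not belong to this filter")
    params.update(ts=data["ts"], id=data["id"])
    return BASE_SQL + " AND (created_at, id) < (:ts, :id)" + ORDER_SQL, params

cursor = encode_feed_cursor(5, "2024-05-01T12:00:00+00:00", 42)
sql, params = build_feed_query(5, cursor)
print(sql)
print(params)
```

Rejecting the mismatch outright is a design choice; silently restarting from the first page is another option, but it hides client bugs.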

Cursor Pagination Low-Level Design

Compound Cursor (Stable Tiebreaker)

Cursor for “Sort by Score” (Non-Monotonic)

API Response Shape

Key Interview Points