Question 1

Why does offset pagination degrade at large page numbers?

Accepted Answer

OFFSET N tells the database to scan and discard N rows before returning results. Even with an index, PostgreSQL must traverse N index entries to skip past them, an O(N) cost that grows linearly with the page number. At OFFSET 100,000, the database discards 100,000 rows to return 20, which is slow and wastes I/O. Offset pages are also unstable under concurrent writes: if a row is inserted ahead of the current position between fetching page 1 and page 2, every later row shifts down by one and the last item of page 1 reappears at the top of page 2 (a concurrent delete causes the opposite problem, a skipped item). Cursor pagination avoids both problems: it seeks directly to the cursor position using the index (O(log N)) and never scans rows just to discard them.
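The duplicate-item problem is easy to reproduce. The sketch below is illustrative only: it uses an in-memory SQLite database as a stand-in for PostgreSQL, and the `items` table, the page size, and the helper names `offset_page` and `keyset_page` are all assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, name TEXT)")
# Newest-first feed: ids 1..10, where a higher id means a newer row.
conn.executemany(
    "INSERT INTO items (id, name) VALUES (?, ?)",
    [(i, f"item-{i}") for i in range(1, 11)],
)

PAGE_SIZE = 3

def offset_page(page_number):
    # OFFSET forces the engine to walk past and discard all earlier rows.
    rows = conn.execute(
        "SELECT id FROM items ORDER BY id DESC LIMIT ? OFFSET ?",
        (PAGE_SIZE, page_number * PAGE_SIZE),
    ).fetchall()
    return [r[0] for r in rows]

def keyset_page(after_id=None):
    # Keyset: seek to the cursor position through the primary-key index.
    if after_id is None:
        rows = conn.execute(
            "SELECT id FROM items ORDER BY id DESC LIMIT ?", (PAGE_SIZE,)
        ).fetchall()
    else:
        rows = conn.execute(
            "SELECT id FROM items WHERE id < ? ORDER BY id DESC LIMIT ?",
            (after_id, PAGE_SIZE),
        ).fetchall()
    return [r[0] for r in rows]

offset_first = offset_page(0)   # [10, 9, 8]
keyset_first = keyset_page()    # [10, 9, 8]

# A new row lands at the head of the feed between the two requests.
conn.execute("INSERT INTO items (id, name) VALUES (11, 'item-11')")

offset_second = offset_page(1)                          # [8, 7, 6]: id 8 repeats
keyset_second = keyset_page(after_id=keyset_first[-1])  # [7, 6, 5]: no repeat
```

The offset version returns id 8 on both pages because the insert shifted every row down by one position, while the keyset version is anchored to a value rather than a position.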

Question 2

How do you implement cursor pagination for a multi-column sort order?

Accepted Answer

For ORDER BY score DESC, id ASC, the cursor must encode both columns: base64({score: 0.95, id: 12345}). The next-page WHERE clause expands the tuple comparison into two branches: WHERE (score < 0.95) OR (score = 0.95 AND id > 12345). A single row-value comparison such as (score, id) < (0.95, 12345) only works when every column sorts in the same direction, so the mixed DESC/ASC order needs the expanded form. This correctly handles ties in the sort key: rows with score=0.95 and id>12345 are on the next page. A composite index on (score DESC, id ASC) makes this efficient. Without the composite index, the OR condition requires a full scan. Always expose cursors as opaque base64 tokens; this decouples the client from the internal cursor format.
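A minimal sketch of the two-column cursor follows, again using in-memory SQLite in place of PostgreSQL. The `results` table, the JSON cursor layout, and the helpers `encode_cursor`, `decode_cursor`, and `fetch_page` are hypothetical names chosen for the example, not a prescribed format.

```python
import base64
import json
import sqlite3

def encode_cursor(score, row_id):
    # Opaque token: clients pass it back unchanged and never parse it.
    payload = json.dumps({"score": score, "id": row_id}).encode("utf-8")
    return base64.urlsafe_b64encode(payload).decode("ascii")

def decode_cursor(token):
    data = json.loads(base64.urlsafe_b64decode(token.encode("ascii")))
    return data["score"], data["id"]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE results (id INTEGER PRIMARY KEY, score REAL NOT NULL)")
# Composite index matching the sort order.
conn.execute("CREATE INDEX results_score_id ON results (score DESC, id ASC)")
conn.executemany(
    "INSERT INTO results (id, score) VALUES (?, ?)",
    [(1, 0.99), (2, 0.95), (3, 0.95), (4, 0.95), (5, 0.90), (6, 0.80)],
)

def fetch_page(limit, after=None):
    if after is None:
        rows = conn.execute(
            "SELECT id, score FROM results "
            "ORDER BY score DESC, id ASC LIMIT ?",
            (limit,),
        ).fetchall()
    else:
        score, row_id = decode_cursor(after)
        # Strictly lower score, or the same score with a higher id (tie-break).
        rows = conn.execute(
            "SELECT id, score FROM results "
            "WHERE score < ? OR (score = ? AND id > ?) "
            "ORDER BY score DESC, id ASC LIMIT ?",
            (score, score, row_id, limit),
        ).fetchall()
    next_cursor = encode_cursor(rows[-1][1], rows[-1][0]) if rows else None
    return rows, next_cursor

page1, cursor1 = fetch_page(2)
page2, cursor2 = fetch_page(2, after=cursor1)
page3, cursor3 = fetch_page(2, after=cursor2)
```

Three rows share score 0.95 here, and the id tie-break splits them across pages 1 and 2 without repeating or skipping any of them.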

Question 3

How do you provide a total count with cursor-based pagination?

Accepted Answer

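Cursor pagination cannot efficiently provide exact total counts without a full table scan. Options: estimated count from pg_class.reltuples (fast, ~1-10% error for large tables, which is acceptable for 'about N results'; note that it is a whole-table estimate refreshed by VACUUM and ANALYZE, so it does not apply to filtered queries), capped count (SELECT COUNT(*) FROM (SELECT 1 ... LIMIT 1001) AS capped, which is exact up to the cap and otherwise returns '1000+'; the LIMIT must sit inside the subquery, because a LIMIT on the outer COUNT(*) only limits the single result row and does not bound the scan), pre-computed denormalized count (maintain a counter updated on INSERT/DELETE, which is exact but adds write complexity), or no count at all (infinite scroll UX with 'load more' eliminates the need for totals). For search results where 'showing N of M' matters, use the capped count with a reasonable cap (1000 or 10000).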
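The capped count can be sketched as follows, with in-memory SQLite standing in for PostgreSQL. The `docs` table, its `topic` column, and the `capped_count` helper are assumptions for illustration; the point is only that the LIMIT sits inside the subquery so the engine stops after cap + 1 matching rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE docs (id INTEGER PRIMARY KEY, topic TEXT)")
conn.executemany(
    "INSERT INTO docs (topic) VALUES (?)",
    [("db",)] * 1500 + [("ui",)] * 40,
)

def capped_count(topic, cap=1000):
    # The inner LIMIT bounds the work: at most cap + 1 rows are counted.
    (n,) = conn.execute(
        "SELECT COUNT(*) FROM "
        "(SELECT 1 FROM docs WHERE topic = ? LIMIT ?) AS capped",
        (topic, cap + 1),
    ).fetchone()
    # Seeing cap + 1 rows means the true total exceeds the cap.
    return f"{cap}+" if n > cap else str(n)

big = capped_count("db")    # "1000+"
small = capped_count("ui")  # "40"
```

Fetching one row beyond the cap is what lets the caller distinguish "exactly 1000" from "more than 1000".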

Question 4

What is the Relay pagination spec and why follow it?

Accepted Answer

The Relay spec defines a standard cursor pagination interface for GraphQL: list fields return a connection type with edges (containing nodes and per-edge cursors) and pageInfo (hasNextPage, hasPreviousPage, startCursor, endCursor). Arguments: first/after for forward pagination, last/before for backward pagination. Following Relay means: client libraries (React Relay, Apollo Client) automatically handle pagination state; frontend engineers recognize the pattern immediately; the API is predictable across all paginated resources. Deviating from Relay means inventing a custom format that every client must re-learn. Even for REST APIs, the concepts (opaque cursors, hasNextPage, pageInfo) are worth adopting.
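To make the connection shape concrete, here is a rough sketch of forward pagination (first/after) over an in-memory list sorted by id. The `connection` function, the `id:` cursor payload, and the item layout are hypothetical; a real server would run a keyset query instead of filtering a list, and last/before is omitted for brevity.

```python
import base64

def encode_cursor(item_id):
    return base64.b64encode(f"id:{item_id}".encode("utf-8")).decode("ascii")

def decode_cursor(cursor):
    return int(base64.b64decode(cursor).decode("utf-8").split(":", 1)[1])

def connection(items, first, after=None):
    """Build a Relay-style connection from items sorted by ascending id."""
    after_id = decode_cursor(after) if after is not None else None
    remaining = [it for it in items if after_id is None or it["id"] > after_id]
    # Take one extra item so hasNextPage needs no separate count.
    window = remaining[: first + 1]
    nodes = window[:first]
    edges = [{"node": n, "cursor": encode_cursor(n["id"])} for n in nodes]
    return {
        "edges": edges,
        "pageInfo": {
            "hasNextPage": len(window) > first,
            "hasPreviousPage": after_id is not None
            and any(it["id"] <= after_id for it in items),
            "startCursor": edges[0]["cursor"] if edges else None,
            "endCursor": edges[-1]["cursor"] if edges else None,
        },
    }

items = [{"id": i} for i in range(1, 6)]
page1 = connection(items, first=2)
page2 = connection(items, first=2, after=page1["pageInfo"]["endCursor"])
page3 = connection(items, first=2, after=page2["pageInfo"]["endCursor"])
```

A client only ever needs `pageInfo.endCursor` and `pageInfo.hasNextPage` to request the next page, which is why the same loop works against any resource that follows the pattern.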

Low Level Design: Cursor-Based vs Offset Pagination

Offset Pagination

Cursor-Based Pagination

Opaque Cursor Encoding

Keyset Pagination for Complex Sorts

Total Count and Estimated Counts

Relay GraphQL Pagination Spec

When to Use Each Approach