Pagination enables clients to retrieve large result sets page by page instead of in one response. Two primary approaches exist: offset-based pagination (LIMIT N OFFSET M) and cursor-based pagination (WHERE id > last_seen_id LIMIT N). The choice dramatically affects performance, consistency, and user experience at scale. Offset pagination is simple but breaks down at large page numbers; cursor pagination is more complex but scales to arbitrarily large datasets.
Offset Pagination
Offset pagination skips a fixed number of rows: SELECT * FROM posts ORDER BY created_at DESC LIMIT 20 OFFSET 1000. The database scans and discards the first 1000 rows, then returns the next 20. Problems: performance degrades with page number (OFFSET 100000 scans 100,000 rows to discard them — O(offset) cost); items shift when data is inserted or deleted between page fetches (a new post inserted between page 1 and page 2 pushes all posts down one position — page 2 may contain duplicates or skip items); total count queries (SELECT COUNT(*) FROM posts with a WHERE clause) are expensive for large tables. Use offset pagination only for small, stable result sets (< 10,000 rows) or admin interfaces where performance is not critical.
Cursor-Based Pagination
Cursor pagination uses a stable cursor (typically the last-seen record’s unique ID or a composite of sort key + ID) to efficiently resume where the previous page ended. SELECT * FROM posts WHERE id < :last_seen_id ORDER BY id DESC LIMIT 20. The database uses the index on id to seek directly to the cursor position — O(1) seek regardless of cursor position. No rows are scanned and discarded. Stable: new inserts don't shift pages because each page picks up from where the cursor left off. Limitation: no random page access (cannot jump to page 50) — must follow cursors sequentially. Cannot trivially provide a total result count without a full scan.
Opaque Cursor Encoding
Expose cursors as opaque tokens, not raw IDs or offsets. Encode the cursor as base64(JSON({id: 12345, ts: 1716000000})). This allows changing the cursor implementation without breaking clients (the cursor is treated as opaque). Validate cursors on receipt (decode, verify format, reject malformed cursors). For multi-field sort orders (ORDER BY score DESC, id ASC), the cursor must encode all sort fields: base64({score: 0.95, id: 12345}). The WHERE clause uses the tuple comparison: (score 12345). This correctly handles ties in the sort key and avoids missing rows at sort key boundaries.
Keyset Pagination for Complex Sorts
Keyset pagination generalizes cursor pagination for arbitrary sort orders. For ORDER BY date DESC, title ASC: the cursor is (date, title) at the boundary. Next page: WHERE (date :cursor_title). This requires a composite index on (date, title) to use the index efficiently — without the index, the WHERE clause requires a full scan. For nullable sort columns, handle NULL values explicitly in the cursor comparison (NULLs sort last or first depending on NULLS FIRST/LAST). Keyset pagination is the correct implementation for infinite scrolling feeds (social media timelines, news feeds) where offset pagination would degrade at large page numbers.
Total Count and Estimated Counts
Cursor pagination cannot provide accurate total result counts without an expensive full-table scan — a fundamental trade-off. Alternatives: estimated counts (SELECT reltuples FROM pg_class WHERE relname = ‘posts’ — PostgreSQL’s row estimate; fast, but approximate), separate pre-computed count (maintain a denormalized count updated on INSERT/DELETE — exact but adds write overhead), limit-based count (count up to a maximum: SELECT COUNT(*) FROM posts WHERE user_id = X LIMIT 1000 — exact up to the limit, “1000+” for larger results), and no count at all (“load more” UX instead of “page X of Y” — eliminates the need for a count). For most infinite scroll UIs, total counts are not needed.
Relay GraphQL Pagination Spec
The Relay specification defines a standard cursor pagination interface for GraphQL APIs. The connection type wraps list fields: posts { edges { node { id, title } cursor } pageInfo { hasNextPage, hasPreviousPage, startCursor, endCursor } }. Arguments: first (take first N), after (cursor after which to start), last (take last N), before (cursor before which to end). This supports both forward and backward pagination. The cursor is opaque (base64 encoded). This standard is widely adopted — React applications using Relay or Apollo Client work automatically with any server implementing this specification. Following the Relay spec avoids inventing a custom pagination format that clients must re-learn.
When to Use Each Approach
Offset pagination: small result sets (< 10,000 rows), random page access required (jump to page 50), total page count needed in the UI, and data is mostly static (inserts/deletes are infrequent). Cursor pagination: large result sets (millions of rows), infinite scroll UX (no page numbers), data changes frequently (real-time feeds, social media), and performance is critical. Hybrid: use cursor pagination for the primary data feed and a separate count API (with a cap) for the "showing N results" message. Avoid combining cursor and offset — mixing the two approaches creates consistency problems when data changes between requests.