Pagination is not optional for any endpoint that returns a list — without it, a single request can return millions of rows, exhaust server memory, and timeout the client. There are three meaningful strategies: offset pagination (simple but broken at scale), cursor pagination (scalable and consistent), and keyset pagination (fastest for very large datasets). The right choice depends on whether your clients need random page access and how large the dataset is.
Offset Pagination (and Why It Breaks)
-- Client sends: GET /posts?page=5&per_page=20
SELECT * FROM Post
ORDER BY created_at DESC
LIMIT 20 OFFSET 100; -- skip 100 rows, return 20
Problems: (1) Performance degrades with depth — OFFSET 100000 scans and discards 100,000 rows before returning 20. At page 5000, a single query takes seconds. (2) Data drift — if a new post is inserted between page 1 and page 2 fetches, the user sees a duplicate (the post that was on the boundary shifts to the next page). (3) No stable cursor — if rows are deleted, page numbers shift.
Use offset pagination only for small datasets (<10,000 rows) or admin interfaces where users rarely go past page 10.
Cursor Pagination (Standard for APIs)
-- Client sends: GET /posts?limit=20&cursor=eyJpZCI6MTIzNH0=
-- Server decodes cursor: {"id": 1234, "created_at": "2024-01-15T10:00:00Z"}
SELECT * FROM Post
WHERE (created_at, id) < ('2024-01-15T10:00:00Z', 1234) -- keyset condition
ORDER BY created_at DESC, id DESC
LIMIT 20;
-- Server encodes next cursor from last row of results:
next_cursor = base64(json({"id": last_row.id, "created_at": last_row.created_at}))
def paginate_posts(cursor_token: str | None, limit: int = 20) -> dict:
limit = min(limit, 100) # cap at 100
if cursor_token:
cursor = json.loads(base64.decode(cursor_token))
rows = db.fetchall("""
SELECT id, title, created_at FROM Post
WHERE (created_at, id) limit
rows = rows[:limit]
next_cursor = None
if has_next and rows:
last = rows[-1]
next_cursor = base64.encode(json({'id': last['id'], 'created_at': last['created_at']}))
return {'data': rows, 'next_cursor': next_cursor, 'has_next': has_next}
Required Index for Cursor Pagination
-- Composite index matching the ORDER BY and WHERE clause
CREATE INDEX idx_post_pagination ON Post(created_at DESC, id DESC);
-- This makes the keyset condition (created_at, id) < (ts, id) an index scan
-- bounded by the cursor position — O(log N + page_size), not O(N)
Bi-directional Pagination
Some UIs need both forward and backward navigation (e.g., a chat thread). Encode direction in the cursor and flip the inequality:
// Cursor encodes: {id, created_at, direction}
// Forward: WHERE (created_at, id) (cursor.created_at, cursor.id) ORDER BY ASC
// then reverse the result in application code to maintain descending display order
// Also return a prev_cursor from the first row and next_cursor from the last row
return {
data: rows,
next_cursor: encode({...last_row, direction: 'forward'}),
prev_cursor: encode({...first_row, direction: 'backward'})
}
Filtering with Cursor Pagination
When filters are applied, the cursor must be stable with those filters. The cursor encodes the position within the filtered result set, not the full table. The filter columns must be part of the index:
-- Filter by user_id (filtered feed)
CREATE INDEX idx_post_user_pagination ON Post(user_id, created_at DESC, id DESC);
SELECT * FROM Post
WHERE user_id = :user_id
AND (created_at, id) < (:cursor_ts, :cursor_id)
ORDER BY created_at DESC, id DESC
LIMIT 20;
Key Interview Points
- Always return
has_next(boolean) — computed by fetchinglimit+1rows and checking if the extra row exists. Never make the client guess. - Cursors must be opaque to clients — base64-encoded JSON. Never expose raw IDs or offsets; that couples clients to your DB structure.
- For total count: avoid
COUNT(*)on large tables. Instead, return an approximate count frompg_class.reltuplesor a separate counter table, or omit the count entirely (most mobile feeds don’t show “page 42 of 8,293”). - The cursor is a position, not a page number — it’s valid only for the current sort order and filter set. Changing the sort order requires a new first-page request.
- Keyset pagination (the WHERE clause above) is O(log N) — performance is identical whether you’re on page 1 or page 10,000.
- For search results (Elasticsearch), use
search_afterinstead offrom/size— it’s the ES equivalent of keyset pagination.
API pagination and infinite scroll feed design is discussed in Twitter system design interview questions.
API pagination and feed pagination design is covered in LinkedIn system design interview preparation.
API pagination and product catalog pagination design is discussed in Shopify system design interview guide.