Cursor Pagination Low-Level Design

Cursor Pagination — Low-Level Design

Cursor pagination (also called keyset pagination) navigates large datasets stably and efficiently by using a pointer to the last seen row rather than a numeric offset. It is the correct pagination strategy for APIs, feeds, and any dataset that changes while users browse. This pattern is asked at every company building a data-heavy API.

Why OFFSET Pagination Fails at Scale

-- OFFSET pagination
SELECT * FROM Post ORDER BY created_at DESC LIMIT 20 OFFSET 1000;

Problems:
1. Performance: OFFSET 1000 scans and discards 1000 rows before returning 20.
   At OFFSET 1,000,000: scans 1M rows — seconds of latency.

2. Stability: if a new post is inserted at the top while a user is on page 3,
   all subsequent pages shift by 1. Page 4 now contains items that were
   on page 3 before — the user sees the same item twice.

3. Database work grows linearly with page number: O(offset + limit).

Cursor Pagination by Single Column

-- First page: no cursor
SELECT * FROM Post
WHERE author_id = %(uid)s
ORDER BY created_at DESC
LIMIT 21;  -- fetch 21, return 20 to detect if there is a next page

-- Extract cursor from last returned item:
next_cursor = base64(json({'created_at': last_item.created_at.isoformat()}))

-- Next page: use cursor
SELECT * FROM Post
WHERE author_id = %(uid)s
  AND created_at < %(cursor_created_at)s  -- strictly less than cursor
ORDER BY created_at DESC
LIMIT 21;

-- Index that makes this efficient:
CREATE INDEX idx_post_author_time ON Post(author_id, created_at DESC);
-- Query becomes: index seek to author_id + created_at < cursor, then forward scan

Compound Cursor (Stable Tiebreaker)

-- Problem: sorting by created_at alone is unstable if two rows share a timestamp
-- (batch imports, millisecond collisions). The same row may appear on two pages.

-- Solution: compound cursor on (created_at DESC, id DESC)
-- id is globally unique, so the compound is always stable.

SELECT * FROM Post
WHERE author_id = %(uid)s
  AND (created_at, id) < (%(cursor_ts)s, %(cursor_id)s)
ORDER BY created_at DESC, id DESC
LIMIT 21;

-- Encode cursor:
def encode_cursor(row):
    return base64.b64encode(json.dumps({
        'ts': row.created_at.isoformat(),
        'id': row.id
    }).encode()).decode()

def decode_cursor(cursor_str):
    data = json.loads(base64.b64decode(cursor_str))
    return data['ts'], data['id']

# The (created_at, id) < (ts, id) condition uses PostgreSQL's
# row value comparison — equivalent to:
# created_at < ts OR (created_at = ts AND id < cursor_id)

Cursor for “Sort by Score” (Non-Monotonic)

-- Sorting by upvote_count: scores can change while paginating
-- (a post gains votes between page 1 and page 2 fetches)
-- Accept some instability; use compound cursor as best-effort.

SELECT * FROM Post
WHERE (upvote_count, id) < (%(cursor_votes)s, %(cursor_id)s)
ORDER BY upvote_count DESC, id DESC
LIMIT 20;

-- Index:
CREATE INDEX idx_post_votes ON Post(upvote_count DESC, id DESC);

-- For "sort by score" feeds: accept that new viral posts
-- may appear unexpectedly on later pages. This is unavoidable
-- without fully materializing the sorted list.

API Response Shape

def paginate(query_fn, cursor_str=None, limit=20):
    cursor = decode_cursor(cursor_str) if cursor_str else None
    items = query_fn(cursor=cursor, limit=limit + 1)

    has_next = len(items) > limit
    if has_next:
        items = items[:limit]

    next_cursor = encode_cursor(items[-1]) if has_next else None

    return {
        'items': items,
        'next_cursor': next_cursor,    # null on last page
        'has_next': has_next,
    }

# Response:
{
  "items": [...],
  "next_cursor": "eyJ0cyI6IjIwMjQtMDEtMTVUMTA6MzA6MDAiLCJpZCI6MTIzNH0=",
  "has_next": true
}
# Client passes next_cursor as ?cursor= on the next request

Bi-Directional Cursor Pagination

-- Supporting both "next page" and "previous page" requires two cursor types

-- Forward cursor (next page): items where (ts, id)  cursor, ORDER ASC, then reverse

def get_page(cursor=None, direction='next', limit=20):
    if direction == 'next' or cursor is None:
        rows = db.execute("""
            SELECT * FROM Post WHERE (created_at, id)  (%(ts)s, %(id)s)
            ORDER BY created_at ASC, id ASC LIMIT %(lim)s
        """, ...)
        rows = list(reversed(rows))  # Re-reverse to consistent DESC order

    return rows

Key Interview Points

  • Cursor pagination is O(1) regardless of page depth: The database seeks to the cursor position using the index and scans forward. No rows before the cursor are visited. OFFSET pagination is O(offset) — it gets slower as the user goes deeper.
  • Always use a tiebreaker: A cursor on a non-unique column (created_at) can skip or duplicate rows if multiple rows share the same value. Add the primary key as a tiebreaker to make the cursor unambiguous.
  • Encode cursors as opaque tokens: Base64-encode the cursor data so clients treat it as an opaque string. This allows changing the cursor format (e.g., adding fields) without breaking clients that just pass it back.
  • Fetch limit+1 to detect next page: Fetching one extra row and checking if it exists is cheaper than a COUNT(*) query. If N+1 rows are returned, there is a next page; return only N rows to the client.

Cursor-based feed pagination and infinite scroll design is discussed in Twitter system design interview questions.

Feed pagination and cursor-based API design is covered in LinkedIn system design interview preparation.

News feed pagination and cursor API design is discussed in Meta system design interview guide.

Scroll to Top