Cursor-Based Sync Low-Level Design: Incremental Delta Pull, Conflict Resolution, and Offline Support

A cursor-based incremental sync system lets clients efficiently download only the data that has changed since their last sync, rather than re-fetching the entire dataset. Used in mobile apps, offline-first systems, and integration APIs. Core challenges: defining a reliable sync cursor (what counts as “changed since X”), handling deletes without tombstoning every table, resolving conflicts when both sides mutate the same record, and scaling the sync endpoint without full-table scans.

Core Data Model

-- Monotonically increasing sequence number assigned to every write
-- Acts as the sync cursor
CREATE SEQUENCE global_sync_seq START 1 INCREMENT 1;

-- Every mutable table gets a sync_seq column
ALTER TABLE Product ADD COLUMN sync_seq BIGINT DEFAULT nextval('global_sync_seq');
ALTER TABLE Order   ADD COLUMN sync_seq BIGINT DEFAULT nextval('global_sync_seq');
ALTER TABLE User    ADD COLUMN sync_seq BIGINT DEFAULT nextval('global_sync_seq');

-- Trigger: update sync_seq on every INSERT or UPDATE
CREATE OR REPLACE FUNCTION set_sync_seq()
RETURNS trigger LANGUAGE plpgsql AS $$
BEGIN
    NEW.sync_seq := nextval('global_sync_seq');
    RETURN NEW;
END;
$$;

CREATE TRIGGER trg_product_sync BEFORE INSERT OR UPDATE ON Product
    FOR EACH ROW EXECUTE FUNCTION set_sync_seq();

-- Soft-delete for sync visibility
ALTER TABLE Product ADD COLUMN deleted_at TIMESTAMPTZ;
-- Rows with deleted_at IS NOT NULL appear in sync with a deleted flag
-- Physical deletion only happens after a retention window (e.g., 90 days)

-- Client sync state (persisted server-side for diagnostic purposes)
CREATE TABLE ClientSyncState (
    client_id    TEXT NOT NULL,
    user_id      UUID NOT NULL,
    last_sync_seq BIGINT NOT NULL DEFAULT 0,
    synced_at    TIMESTAMPTZ NOT NULL DEFAULT NOW(),
    PRIMARY KEY (client_id, user_id)
);

Sync API: Delta Pull

from dataclasses import dataclass
from datetime import datetime, timezone

@dataclass
class SyncResponse:
    changes: list[dict]
    cursor: int          # new sync position for next call
    has_more: bool       # true if more changes available in this batch

PAGE_SIZE = 500

def get_changes_since(conn, user_id: str, since_seq: int) -> SyncResponse:
    """
    Return all records modified after since_seq, across all synced tables.
    Returns a unified list of changes with type, operation, and payload.
    """
    changes = []

    # Sync Products (user-scoped)
    with conn.cursor() as cur:
        cur.execute("""
            SELECT product_id, name, price_cents, sync_seq,
                   deleted_at IS NOT NULL AS is_deleted
            FROM Product
            WHERE user_id = %s AND sync_seq > %s
            ORDER BY sync_seq ASC
            LIMIT %s
        """, (user_id, since_seq, PAGE_SIZE))
        for row in cur.fetchall():
            changes.append({
                "type": "product",
                "id": str(row[0]),
                "operation": "delete" if row[4] else "upsert",
                "payload": None if row[4] else {
                    "product_id": str(row[0]),
                    "name": row[1],
                    "price_cents": row[2]
                },
                "seq": row[3]
            })

    # Sync Orders
    with conn.cursor() as cur:
        cur.execute("""
            SELECT order_id, status, total_cents, sync_seq,
                   deleted_at IS NOT NULL AS is_deleted
            FROM Order
            WHERE user_id = %s AND sync_seq > %s
            ORDER BY sync_seq ASC
            LIMIT %s
        """, (user_id, since_seq, PAGE_SIZE))
        for row in cur.fetchall():
            changes.append({
                "type": "order",
                "id": str(row[0]),
                "operation": "delete" if row[4] else "upsert",
                "payload": None if row[4] else {
                    "order_id": str(row[0]),
                    "status": row[1],
                    "total_cents": row[2]
                },
                "seq": row[3]
            })

    # Sort all changes by seq, take first PAGE_SIZE
    changes.sort(key=lambda x: x["seq"])
    has_more = len(changes) >= PAGE_SIZE
    if has_more:
        changes = changes[:PAGE_SIZE]

    new_cursor = changes[-1]["seq"] if changes else since_seq

    # Update client sync state
    with conn.cursor() as cur:
        cur.execute("""
            INSERT INTO ClientSyncState (client_id, user_id, last_sync_seq)
            VALUES (%s, %s, %s)
            ON CONFLICT (client_id, user_id) DO UPDATE
            SET last_sync_seq=%s, synced_at=NOW()
        """, ("client", user_id, new_cursor, new_cursor))
    conn.commit()

    return SyncResponse(changes=changes, cursor=new_cursor, has_more=has_more)

Conflict Resolution: Last-Write-Wins

def apply_client_update(conn, user_id: str, resource_type: str,
                          resource_id: str, client_payload: dict,
                          client_version: int) -> dict:
    """
    Client pushes a local change back to the server.
    Last-write-wins: server version wins if server_seq > client_version.
    Returns {"status": "accepted"} or {"status": "conflict", "server_record": ...}
    """
    if resource_type == "product":
        with conn.cursor() as cur:
            cur.execute(
                "SELECT sync_seq, name, price_cents FROM Product WHERE product_id=%s AND user_id=%s",
                (resource_id, user_id)
            )
            server_row = cur.fetchone()

        if server_row and server_row[0] > client_version:
            # Server is newer — conflict. Return server record; client must merge.
            return {
                "status": "conflict",
                "server_record": {
                    "product_id": resource_id,
                    "name": server_row[1],
                    "price_cents": server_row[2],
                    "sync_seq": server_row[0]
                }
            }

        # Client version is current or newer — accept the update
        with conn.cursor() as cur:
            cur.execute("""
                UPDATE Product
                SET name=%s, price_cents=%s
                WHERE product_id=%s AND user_id=%s
            """, (client_payload['name'], client_payload['price_cents'],
                  resource_id, user_id))
        conn.commit()
        return {"status": "accepted"}

    raise ValueError(f"Unknown resource type: {resource_type}")

Key Interview Points

  • Sequence number vs timestamp cursor: Timestamps have two problems: clock skew (two writes at the same millisecond get the same timestamp) and clock drift (NTP corrections can cause time to go backwards). A monotonically increasing database sequence (nextval) is gapless and strictly ordered within a transaction. The sync cursor is a sequence number, not a timestamp — no duplicates, no backward movement possible. In multi-region setups, use a logical clock (Hybrid Logical Clock) or a globally-ordered sequence service.
  • Soft delete for sync visibility: If you physically delete a row, the client syncing after the deletion will never see the delete — their local copy is stale forever. Soft-delete (deleted_at IS NOT NULL) keeps the row visible in the sync stream with operation=’delete’. The client removes it from their local store. Physical cleanup: after a 90-day retention window, a nightly job permanently deletes soft-deleted rows and removes them from the sync stream.
  • Multi-table unified sync stream: Clients need to sync Products, Orders, and Users in a single request. Maintain a unified sync_seq across all tables (single Postgres sequence shared by all triggers). The query filters each table with sync_seq > since, merges the results, sorts by seq, and returns the top PAGE_SIZE. This ensures correct ordering of cross-table changes: if a Product was updated at seq=100 and an Order at seq=101, they arrive in that order.
  • Pagination and resume: For users with large datasets (first sync after 6 months offline), a single sync request may need to return 100K records. PAGE_SIZE=500 means 200 round trips — correct behavior. The client calls the API in a loop: while has_more: call sync(cursor=new_cursor). On each call, the cursor advances. If interrupted mid-sync, resume from the last successful cursor. Store the cursor durably on the client (SQLite local DB, not in-memory).
  • Conflict resolution strategies: Last-write-wins (LWW) by server sequence is the simplest: server data always wins if newer. This is appropriate for most CRUD apps. For collaborative editing (both client and server can independently edit the same field), use three-way merge: compare client change, server change, and common ancestor. For financial data: reject client updates that conflict — require the client to re-fetch and re-apply their change. Choose the strategy per resource type, not globally.

Cursor-based sync and offline-first data design is discussed in Google system design interview questions.

Cursor-based sync and cross-device data synchronization design is covered in Apple system design interview preparation.

Cursor-based sync and distributed data replication design is discussed in Amazon system design interview guide.

Scroll to Top