Bookmark System Low-Level Design: Folders, Sync, and Full-Text Search

A bookmark (save) system lets users collect and organize content they want to revisit — articles, products, listings, posts. Core design challenges: supporting folders and collections, syncing across devices in real time, fast retrieval by folder, handling very large bookmark counts (power users with 50,000+ bookmarks), and enabling full-text search across saved content metadata.

Core Data Model

CREATE TABLE BookmarkFolder (
    folder_id   UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    user_id     UUID NOT NULL,
    name        TEXT NOT NULL,
    parent_id   UUID REFERENCES BookmarkFolder(folder_id),  -- nested folders
    sort_order  INT NOT NULL DEFAULT 0,
    created_at  TIMESTAMPTZ NOT NULL DEFAULT NOW(),
    UNIQUE (user_id, name, parent_id)
);
CREATE INDEX idx_folder_user ON BookmarkFolder (user_id, parent_id);

CREATE TABLE Bookmark (
    bookmark_id   UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    user_id       UUID NOT NULL,
    folder_id     UUID REFERENCES BookmarkFolder(folder_id) ON DELETE SET NULL,
    content_type  TEXT NOT NULL,    -- 'article', 'product', 'post', 'video'
    content_id    TEXT NOT NULL,
    -- Snapshot metadata for display without re-fetching
    title         TEXT,
    thumbnail_url TEXT,
    url           TEXT,
    -- Sync
    sort_order    INT NOT NULL DEFAULT 0,
    created_at    TIMESTAMPTZ NOT NULL DEFAULT NOW(),
    updated_at    TIMESTAMPTZ NOT NULL DEFAULT NOW(),
    UNIQUE (user_id, content_type, content_id)   -- one bookmark per content per user
);
CREATE INDEX idx_bookmark_user_folder ON Bookmark (user_id, folder_id, sort_order);
CREATE INDEX idx_bookmark_user_recent ON Bookmark (user_id, created_at DESC);

Bookmark CRUD

from uuid import uuid4
from datetime import datetime, timezone
import psycopg2

MAX_BOOKMARKS_PER_USER = 50_000

def add_bookmark(conn, user_id: str, content_type: str, content_id: str,
                  folder_id: str | None = None, metadata: dict | None = None) -> str:
    """
    Add a bookmark. Returns existing bookmark_id if already bookmarked (idempotent).
    Enforces per-user limit.
    """
    # Check existing
    with conn.cursor() as cur:
        cur.execute(
            "SELECT bookmark_id FROM Bookmark WHERE user_id=%s AND content_type=%s AND content_id=%s",
            (user_id, content_type, content_id)
        )
        existing = cur.fetchone()
    if existing:
        return str(existing[0])

    # Check limit
    with conn.cursor() as cur:
        cur.execute("SELECT COUNT(*) FROM Bookmark WHERE user_id=%s", (user_id,))
        count = cur.fetchone()[0]
    if count >= MAX_BOOKMARKS_PER_USER:
        raise ValueError(f"Bookmark limit ({MAX_BOOKMARKS_PER_USER}) reached")

    meta = metadata or {}
    bookmark_id = str(uuid4())
    with conn.cursor() as cur:
        cur.execute("""
            INSERT INTO Bookmark
            (bookmark_id, user_id, folder_id, content_type, content_id,
             title, thumbnail_url, url)
            VALUES (%s,%s,%s,%s,%s,%s,%s,%s)
        """, (bookmark_id, user_id, folder_id, content_type, content_id,
              meta.get('title'), meta.get('thumbnail_url'), meta.get('url')))
    conn.commit()
    return bookmark_id

def move_bookmark(conn, bookmark_id: str, user_id: str, new_folder_id: str | None):
    """Move a bookmark to a different folder (or to root if folder_id=None)."""
    with conn.cursor() as cur:
        cur.execute(
            "UPDATE Bookmark SET folder_id=%s, updated_at=NOW() WHERE bookmark_id=%s AND user_id=%s",
            (new_folder_id, bookmark_id, user_id)
        )
        if cur.rowcount == 0:
            raise ValueError("Bookmark not found or not owned by user")
    conn.commit()

def get_bookmarks(conn, user_id: str, folder_id: str | None = None,
                   limit: int = 50, before: datetime | None = None) -> list[dict]:
    """Cursor-paginated bookmark list for a folder (or root/all)."""
    with conn.cursor() as cur:
        params = [user_id]
        folder_clause = "AND folder_id IS NULL" if folder_id is None else "AND folder_id = %s"
        if folder_id:
            params.append(folder_id)
        cursor_clause = ""
        if before:
            cursor_clause = "AND created_at < %s"
            params.append(before)

        cur.execute(f"""
            SELECT bookmark_id, content_type, content_id, title, thumbnail_url, url,
                   folder_id, created_at
            FROM Bookmark
            WHERE user_id = %s {folder_clause} {cursor_clause}
            ORDER BY created_at DESC
            LIMIT %s
        """, params + [limit])

        cols = [d[0] for d in cur.description]
        return [dict(zip(cols, row)) for row in cur.fetchall()]

Sync Across Devices (Last-Write-Wins with Change Log)

-- Change log for sync (replicated to client on reconnect)
CREATE TABLE BookmarkChangeLog (
    change_id    BIGSERIAL PRIMARY KEY,
    user_id      UUID NOT NULL,
    operation    TEXT NOT NULL,   -- 'add', 'remove', 'move', 'rename_folder'
    bookmark_id  UUID,
    folder_id    UUID,
    payload      JSONB,
    occurred_at  TIMESTAMPTZ NOT NULL DEFAULT NOW()
);
CREATE INDEX idx_changelog_user ON BookmarkChangeLog (user_id, change_id ASC);

def get_changes_since(conn, user_id: str, since_change_id: int) -> list[dict]:
    """
    Return all bookmark changes since the client's last sync point.
    Client stores last seen change_id; sends it on reconnect.
    """
    with conn.cursor() as cur:
        cur.execute("""
            SELECT change_id, operation, bookmark_id, folder_id, payload, occurred_at
            FROM BookmarkChangeLog
            WHERE user_id = %s AND change_id > %s
            ORDER BY change_id ASC
            LIMIT 1000
        """, (user_id, since_change_id))
        cols = [d[0] for d in cur.description]
        return [dict(zip(cols, row)) for row in cur.fetchall()]

Key Interview Points

  • Snapshot metadata for offline display: When a user views their bookmarks, the bookmarked URLs may be dead (404), paywalled, or slow. Store a snapshot of the title and thumbnail at bookmark time in the Bookmark row. This enables fast, reliable list display without re-fetching the original URL. Refresh snapshots in a background job (weekly) — detect dead links and mark them.
  • UNIQUE constraint deduplication: UNIQUE (user_id, content_type, content_id) prevents the same content from being bookmarked twice. The INSERT … ON CONFLICT pattern makes add_bookmark idempotent — double-tapping the bookmark icon from the client won’t create two rows. Return the existing bookmark_id on conflict instead of an error.
  • Change log for sync: Mobile apps go offline and come back. On reconnect, send the last seen change_id; receive all changes since then. This is cheaper than comparing full bookmark lists (O(changes since disconnect) vs O(total bookmarks)). Prune the change log after 30 days — devices offline longer than 30 days do a full sync instead. change_id is a BIGSERIAL (monotonically increasing), so ordering by it gives a reliable event sequence.
  • Folder hierarchy depth: Cap nesting depth at 3–5 levels (similar to browser bookmarks). Deep hierarchies add complexity to tree queries without user benefit. Validate on folder creation: SELECT COUNT(*) traversing ancestors via parent_id until NULL — reject if depth would exceed limit. Use the materialized path pattern (from lld-organization-hierarchy) if deeper hierarchies are needed.
  • Full-text search across bookmarks: SELECT * FROM Bookmark WHERE user_id = X AND to_tsvector(title || ‘ ‘ || url) @@ plainto_tsquery($query). Create a GIN index: CREATE INDEX ON Bookmark USING gin(to_tsvector(‘english’, coalesce(title,”) || ‘ ‘ || coalesce(url,”))). This enables full-text search across saved content titles. For richer search (search inside saved content body), index via Elasticsearch where the document text is stored at bookmark time.

Bookmark and content organization system design is discussed in Atlassian system design interview questions.

Bookmark and saved content system design is covered in LinkedIn system design interview preparation.

Bookmark and cross-device sync system design is discussed in Google system design interview guide.

Scroll to Top