A bookmark (save) system lets users collect and organize content they want to revisit — articles, products, listings, posts. Core design challenges: supporting folders and collections, syncing across devices in real time, fast retrieval by folder, handling very large bookmark counts (power users with 50,000+ bookmarks), and enabling full-text search across saved content metadata.
Core Data Model
CREATE TABLE BookmarkFolder (
folder_id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
user_id UUID NOT NULL,
name TEXT NOT NULL,
parent_id UUID REFERENCES BookmarkFolder(folder_id), -- nested folders
sort_order INT NOT NULL DEFAULT 0,
created_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
UNIQUE (user_id, name, parent_id)
);
CREATE INDEX idx_folder_user ON BookmarkFolder (user_id, parent_id);
CREATE TABLE Bookmark (
bookmark_id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
user_id UUID NOT NULL,
folder_id UUID REFERENCES BookmarkFolder(folder_id) ON DELETE SET NULL,
content_type TEXT NOT NULL, -- 'article', 'product', 'post', 'video'
content_id TEXT NOT NULL,
-- Snapshot metadata for display without re-fetching
title TEXT,
thumbnail_url TEXT,
url TEXT,
-- Sync
sort_order INT NOT NULL DEFAULT 0,
created_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
updated_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
UNIQUE (user_id, content_type, content_id) -- one bookmark per content per user
);
CREATE INDEX idx_bookmark_user_folder ON Bookmark (user_id, folder_id, sort_order);
CREATE INDEX idx_bookmark_user_recent ON Bookmark (user_id, created_at DESC);
Bookmark CRUD
from uuid import uuid4
from datetime import datetime, timezone
import psycopg2
MAX_BOOKMARKS_PER_USER = 50_000
def add_bookmark(conn, user_id: str, content_type: str, content_id: str,
folder_id: str | None = None, metadata: dict | None = None) -> str:
"""
Add a bookmark. Returns existing bookmark_id if already bookmarked (idempotent).
Enforces per-user limit.
"""
# Check existing
with conn.cursor() as cur:
cur.execute(
"SELECT bookmark_id FROM Bookmark WHERE user_id=%s AND content_type=%s AND content_id=%s",
(user_id, content_type, content_id)
)
existing = cur.fetchone()
if existing:
return str(existing[0])
# Check limit
with conn.cursor() as cur:
cur.execute("SELECT COUNT(*) FROM Bookmark WHERE user_id=%s", (user_id,))
count = cur.fetchone()[0]
if count >= MAX_BOOKMARKS_PER_USER:
raise ValueError(f"Bookmark limit ({MAX_BOOKMARKS_PER_USER}) reached")
meta = metadata or {}
bookmark_id = str(uuid4())
with conn.cursor() as cur:
cur.execute("""
INSERT INTO Bookmark
(bookmark_id, user_id, folder_id, content_type, content_id,
title, thumbnail_url, url)
VALUES (%s,%s,%s,%s,%s,%s,%s,%s)
""", (bookmark_id, user_id, folder_id, content_type, content_id,
meta.get('title'), meta.get('thumbnail_url'), meta.get('url')))
conn.commit()
return bookmark_id
def move_bookmark(conn, bookmark_id: str, user_id: str, new_folder_id: str | None):
"""Move a bookmark to a different folder (or to root if folder_id=None)."""
with conn.cursor() as cur:
cur.execute(
"UPDATE Bookmark SET folder_id=%s, updated_at=NOW() WHERE bookmark_id=%s AND user_id=%s",
(new_folder_id, bookmark_id, user_id)
)
if cur.rowcount == 0:
raise ValueError("Bookmark not found or not owned by user")
conn.commit()
def get_bookmarks(conn, user_id: str, folder_id: str | None = None,
limit: int = 50, before: datetime | None = None) -> list[dict]:
"""Cursor-paginated bookmark list for a folder (or root/all)."""
with conn.cursor() as cur:
params = [user_id]
folder_clause = "AND folder_id IS NULL" if folder_id is None else "AND folder_id = %s"
if folder_id:
params.append(folder_id)
cursor_clause = ""
if before:
cursor_clause = "AND created_at < %s"
params.append(before)
cur.execute(f"""
SELECT bookmark_id, content_type, content_id, title, thumbnail_url, url,
folder_id, created_at
FROM Bookmark
WHERE user_id = %s {folder_clause} {cursor_clause}
ORDER BY created_at DESC
LIMIT %s
""", params + [limit])
cols = [d[0] for d in cur.description]
return [dict(zip(cols, row)) for row in cur.fetchall()]
Sync Across Devices (Last-Write-Wins with Change Log)
-- Change log for sync (replicated to client on reconnect)
CREATE TABLE BookmarkChangeLog (
change_id BIGSERIAL PRIMARY KEY,
user_id UUID NOT NULL,
operation TEXT NOT NULL, -- 'add', 'remove', 'move', 'rename_folder'
bookmark_id UUID,
folder_id UUID,
payload JSONB,
occurred_at TIMESTAMPTZ NOT NULL DEFAULT NOW()
);
CREATE INDEX idx_changelog_user ON BookmarkChangeLog (user_id, change_id ASC);
def get_changes_since(conn, user_id: str, since_change_id: int) -> list[dict]:
"""
Return all bookmark changes since the client's last sync point.
Client stores last seen change_id; sends it on reconnect.
"""
with conn.cursor() as cur:
cur.execute("""
SELECT change_id, operation, bookmark_id, folder_id, payload, occurred_at
FROM BookmarkChangeLog
WHERE user_id = %s AND change_id > %s
ORDER BY change_id ASC
LIMIT 1000
""", (user_id, since_change_id))
cols = [d[0] for d in cur.description]
return [dict(zip(cols, row)) for row in cur.fetchall()]
Key Interview Points
- Snapshot metadata for offline display: When a user views their bookmarks, the bookmarked URLs may be dead (404), paywalled, or slow. Store a snapshot of the title and thumbnail at bookmark time in the Bookmark row. This enables fast, reliable list display without re-fetching the original URL. Refresh snapshots in a background job (weekly) — detect dead links and mark them.
- UNIQUE constraint deduplication: UNIQUE (user_id, content_type, content_id) prevents the same content from being bookmarked twice. The INSERT … ON CONFLICT pattern makes add_bookmark idempotent — double-tapping the bookmark icon from the client won’t create two rows. Return the existing bookmark_id on conflict instead of an error.
- Change log for sync: Mobile apps go offline and come back. On reconnect, send the last seen change_id; receive all changes since then. This is cheaper than comparing full bookmark lists (O(changes since disconnect) vs O(total bookmarks)). Prune the change log after 30 days — devices offline longer than 30 days do a full sync instead. change_id is a BIGSERIAL (monotonically increasing), so ordering by it gives a reliable event sequence.
- Folder hierarchy depth: Cap nesting depth at 3–5 levels (similar to browser bookmarks). Deep hierarchies add complexity to tree queries without user benefit. Validate on folder creation: SELECT COUNT(*) traversing ancestors via parent_id until NULL — reject if depth would exceed limit. Use the materialized path pattern (from lld-organization-hierarchy) if deeper hierarchies are needed.
- Full-text search across bookmarks: SELECT * FROM Bookmark WHERE user_id = X AND to_tsvector(title || ‘ ‘ || url) @@ plainto_tsquery($query). Create a GIN index: CREATE INDEX ON Bookmark USING gin(to_tsvector(‘english’, coalesce(title,”) || ‘ ‘ || coalesce(url,”))). This enables full-text search across saved content titles. For richer search (search inside saved content body), index via Elasticsearch where the document text is stored at bookmark time.
Bookmark and content organization system design is discussed in Atlassian system design interview questions.
Bookmark and saved content system design is covered in LinkedIn system design interview preparation.
Bookmark and cross-device sync system design is discussed in Google system design interview guide.