Overview
A collaborative whiteboard allows multiple users to simultaneously draw, add shapes, write text, and move objects on a shared infinite canvas. The core challenges are real-time synchronization of concurrent edits, conflict resolution, undo/redo history, and efficient rendering of a potentially massive canvas.
Requirements
Functional Requirements
- Draw freehand strokes, add shapes (rectangle, ellipse, line, arrow), insert text and sticky notes
- Move, resize, and delete objects
- Multiple users edit simultaneously with cursor visibility
- Infinite canvas with pan and zoom
- Undo/redo per user
- Persistent board state: users can rejoin and see the full history
- Export to PNG/SVG/PDF
Non-Functional Requirements
- Operations applied locally immediately (optimistic); synchronized within 100ms to all collaborators
- Support 100+ concurrent editors on one board
- Board can contain tens of thousands of objects
- No data loss on concurrent conflicting edits
Canvas State Representation
Scene Graph
The whiteboard state is a flat map of objects (shapes), each with a unique ID, type, geometry, and style properties. A flat map is preferred over a tree for simplicity; z-order is a separate sorted list of IDs.
Board {
boardId: string
objects: Map<objectId, WhiteboardObject>
zOrder: objectId[] // bottom to top
viewport: { x, y, zoom }
}
WhiteboardObject {
id: string // UUID
type: stroke | rect | ellipse | arrow | text | image | sticky
x: number
y: number
width: number
height: number
rotation: number // degrees
style: { strokeColor, fillColor, strokeWidth, opacity, fontSize, fontFamily }
points: [x,y][] // for freehand strokes
content: string // for text/sticky
createdBy: userId
createdAt: timestamp
updatedAt: timestamp
}
CRDT-Based Synchronization
Conflict-free Replicated Data Types (CRDTs) allow concurrent edits to be merged automatically without a central arbiter, ensuring eventual consistency.
Why CRDTs over OT
Operational Transformation (OT), as usually deployed, relies on a central server to serialize and transform operations. CRDTs are peer-friendly, work offline, and have simpler correctness proofs. The trade-off is extra metadata per object (timestamps, unique tags, tombstones).
LWW-Map for Object Properties
Each object property uses Last-Write-Wins (LWW) semantics with a Hybrid Logical Clock (HLC) timestamp. HLC combines physical time with a logical counter, providing causally ordered timestamps even across machines with clock skew.
LWWRegister<T> {
value: T
timestamp: HLC // { wallTime: uint64, logical: uint32, nodeId: string }
}
merge(a: LWWRegister, b: LWWRegister) -> LWWRegister:
return hlc_compare(a.timestamp, b.timestamp) >= 0 ? a : b
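As a rough sketch, the comparison and merge above might look like this in TypeScript. The `hlcNow` and `hlcReceive` helpers are not part of the original design; they follow the usual HLC update rules and are assumptions about how timestamps would be produced. `wallTime` is a plain `number` here rather than a uint64.

```typescript
// Hybrid Logical Clock timestamp, mirroring the fields listed above.
interface HlcTimestamp {
  wallTime: number; // milliseconds since epoch (assumed unit)
  logical: number;
  nodeId: string;
}

interface LwwRegister<T> {
  value: T;
  timestamp: HlcTimestamp;
}

// Total order: wall time, then logical counter, then nodeId as a tiebreaker.
function hlcCompare(a: HlcTimestamp, b: HlcTimestamp): number {
  if (a.wallTime !== b.wallTime) return a.wallTime < b.wallTime ? -1 : 1;
  if (a.logical !== b.logical) return a.logical < b.logical ? -1 : 1;
  if (a.nodeId === b.nodeId) return 0;
  return a.nodeId < b.nodeId ? -1 : 1;
}

// Timestamp for a local event. If the physical clock has not advanced
// (or went backwards), only the logical counter is bumped.
function hlcNow(last: HlcTimestamp, physicalMs: number): HlcTimestamp {
  if (physicalMs > last.wallTime) {
    return { wallTime: physicalMs, logical: 0, nodeId: last.nodeId };
  }
  return { wallTime: last.wallTime, logical: last.logical + 1, nodeId: last.nodeId };
}

// Local clock update on receipt of a remote timestamp, so that every later
// local event sorts after the remote one.
function hlcReceive(last: HlcTimestamp, remote: HlcTimestamp, physicalMs: number): HlcTimestamp {
  const wallTime = Math.max(last.wallTime, remote.wallTime, physicalMs);
  let logical = 0;
  if (wallTime === last.wallTime && wallTime === remote.wallTime) {
    logical = Math.max(last.logical, remote.logical) + 1;
  } else if (wallTime === last.wallTime) {
    logical = last.logical + 1;
  } else if (wallTime === remote.wallTime) {
    logical = remote.logical + 1;
  }
  return { wallTime, logical, nodeId: last.nodeId };
}

// Last-Write-Wins merge: keep whichever register has the later timestamp.
function mergeLww<T>(a: LwwRegister<T>, b: LwwRegister<T>): LwwRegister<T> {
  return hlcCompare(a.timestamp, b.timestamp) >= 0 ? a : b;
}
```

Because `nodeId` breaks ties, two different writers never produce equal timestamps, so the merge result does not depend on argument order.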
Add-Remove Set for Object Membership
Object creation and deletion use an Observed-Remove Set (OR-Set). Each add carries a unique tag; removal explicitly tombstones that tag. This prevents the ABA problem (add, remove, add) from losing the second add.
OR-Set operations:
add(objectId, tag=uuid4()) -> adds (objectId, tag) to add-set
remove(objectId) -> adds all observed (objectId, tag) pairs to remove-set
contains(objectId) -> true if any (objectId, tag) is in add-set but not in remove-set
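The operations above can be sketched as a small state-based OR-Set. This is an illustrative version, not a production CRDT: the `OrSet` class and its method names are hypothetical, and tags are passed in by the caller (a UUID in practice) so the example stays deterministic.

```typescript
// Observed-Remove Set keyed by objectId.
class OrSet {
  private addSet = new Map<string, Set<string>>(); // objectId -> every tag added for it
  private removeSet = new Set<string>();           // tombstoned tags

  add(objectId: string, tag: string): void {
    let tags = this.addSet.get(objectId);
    if (!tags) {
      tags = new Set<string>();
      this.addSet.set(objectId, tags);
    }
    tags.add(tag);
  }

  // Tombstones only the tags this replica has observed, and returns them
  // so the removal can be sent to other replicas.
  remove(objectId: string): string[] {
    const tags = this.addSet.get(objectId);
    const observed: string[] = tags ? Array.from(tags) : [];
    for (const tag of observed) this.removeSet.add(tag);
    return observed;
  }

  // Present if at least one tag for the object has not been tombstoned.
  contains(objectId: string): boolean {
    const tags = this.addSet.get(objectId);
    if (!tags) return false;
    for (const tag of tags) {
      if (!this.removeSet.has(tag)) return true;
    }
    return false;
  }

  // State-based merge: union of both sets, so it is commutative and idempotent.
  merge(other: OrSet): void {
    for (const [objectId, tags] of other.addSet) {
      for (const tag of tags) this.add(objectId, tag);
    }
    for (const tag of other.removeSet) this.removeSet.add(tag);
  }
}
```

If one replica deletes an object while another concurrently re-adds it with a fresh tag, the merged set still contains the object, because the delete only tombstoned the tag it had seen.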
Fractional Indexing for Z-Order
Z-order (stacking) is maintained as a sorted list using fractional indexing. Each object gets a string key between its neighbors (e.g., between "a" and "b" insert "am"). This avoids rewriting the entire list on every reorder. Libraries like fractional-indexing handle key generation.
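To make the idea concrete, here is a simplified midpoint routine over a lowercase alphabet. It is a sketch written for this article, not the API of any particular library; `keyBetween` and `DIGITS` are hypothetical names, and the exact keys it produces are a property of this sketch only. It assumes keys never end in the zero digit `a`, which guarantees there is always room to insert below an existing key.

```typescript
const DIGITS = "abcdefghijklmnopqrstuvwxyz"; // 'a' acts as the zero digit

// Returns a key that sorts strictly between lo and hi (lexicographic order).
// lo = "" means "before everything"; hi = null means "after everything".
function keyBetween(lo: string, hi: string | null): string {
  if (hi !== null && lo >= hi) throw new Error("lo must sort before hi");
  if (hi !== null) {
    // Strip the common prefix, treating lo as if padded with zero digits.
    let n = 0;
    while ((lo.charAt(n) || DIGITS.charAt(0)) === hi.charAt(n)) n++;
    if (n > 0) return hi.slice(0, n) + keyBetween(lo.slice(n), hi.slice(n));
  }
  // From here on the first digits differ (an empty lo counts as zero).
  const digitLo = lo ? DIGITS.indexOf(lo.charAt(0)) : 0;
  const digitHi = hi !== null ? DIGITS.indexOf(hi.charAt(0)) : DIGITS.length;
  if (digitHi - digitLo > 1) {
    // There is a free digit between the two: take the middle one.
    return DIGITS.charAt(Math.round((digitLo + digitHi) / 2));
  }
  // Adjacent digits. If hi has more digits, its first digit alone fits.
  if (hi !== null && hi.length > 1) return hi.slice(0, 1);
  // Otherwise keep lo's first digit and find room after the rest of lo.
  return DIGITS.charAt(digitLo) + keyBetween(lo.slice(1), null);
}
```

Moving an object between two neighbors then only writes one new key for that object; the neighbors and the rest of the list are untouched. Concurrent inserts at the same position can still produce identical keys, so a real implementation needs a tiebreaker such as the object ID.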
Operation Log
All mutations are expressed as operations appended to an immutable log. The current state is derived by replaying the log (or from a snapshot + subsequent log entries).
Operation {
opId: string // UUID
boardId: string
userId: string
sessionId: string
type: add_object | update_object | delete_object | move_object | reorder
objectId: string
delta: Partial<WhiteboardObject> // only changed fields
hlcTimestamp: HLC
parentOpId: string // causal parent (for undo chains)
}
Operation Application
- Client applies operation locally (optimistic update) and renders immediately.
- Client sends operation to the server via WebSocket.
- Server appends to the operation log (Kafka topic or Postgres append-only table).
- Server broadcasts operation to other clients in the board session.
- Other clients apply operation using CRDT merge rules.
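The last step, applying a received operation with CRDT merge rules, might be sketched as follows for property updates. The `BoardReplica` class is hypothetical; it covers only per-property LWW and opId deduplication, and leaves object membership to the OR-Set.

```typescript
type OpStamp = { wallTime: number; logical: number; nodeId: string };

interface PropertyOp {
  opId: string;
  objectId: string;
  delta: Record<string, unknown>; // only the changed fields
  hlcTimestamp: OpStamp;
}

function stampCompare(a: OpStamp, b: OpStamp): number {
  if (a.wallTime !== b.wallTime) return a.wallTime - b.wallTime;
  if (a.logical !== b.logical) return a.logical - b.logical;
  return a.nodeId < b.nodeId ? -1 : a.nodeId > b.nodeId ? 1 : 0;
}

class BoardReplica {
  private seenOps = new Set<string>();
  // objectId -> property name -> LWW register
  private registers = new Map<string, Map<string, { value: unknown; stamp: OpStamp }>>();

  // Returns false when the op was a duplicate (for example, resent after reconnect).
  apply(op: PropertyOp): boolean {
    if (this.seenOps.has(op.opId)) return false;
    this.seenOps.add(op.opId);
    let props = this.registers.get(op.objectId);
    if (!props) {
      props = new Map();
      this.registers.set(op.objectId, props);
    }
    for (const [key, value] of Object.entries(op.delta)) {
      const current = props.get(key);
      // Each property is merged independently: the later timestamp wins.
      if (!current || stampCompare(op.hlcTimestamp, current.stamp) > 0) {
        props.set(key, { value, stamp: op.hlcTimestamp });
      }
    }
    return true;
  }

  get(objectId: string, key: string): unknown {
    return this.registers.get(objectId)?.get(key)?.value;
  }
}
```

Two replicas that receive the same operations in different orders end up with the same property values, which is the convergence property the broadcast step relies on.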
Cursor Broadcasting
Each user’s cursor position on the canvas is broadcast to all collaborators in real time. Cursor positions are ephemeral — not persisted to the operation log.
CursorEvent {
userId: string
displayName: string
color: string // user-specific color for cursor and selection highlight
x: number // canvas coordinates (not screen coordinates)
y: number
timestamp: uint64
}
Cursor events are sent via WebSocket at up to 30 times/second. On the receiver, linear interpolation smooths cursor movement between received events. Cursors disappear after 5 seconds of inactivity.
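A minimal sketch of the sender-side rate limit, the receiver-side interpolation, and the inactivity timeout. The function names are hypothetical, and timestamps are assumed to be milliseconds.

```typescript
interface CursorSample {
  x: number;
  y: number;
  timestamp: number; // milliseconds (assumed unit)
}

// Position at renderTime, linearly interpolated between two received samples.
// Rendering slightly in the past keeps renderTime between prev and next.
function interpolateCursor(
  prev: CursorSample,
  next: CursorSample,
  renderTime: number,
): { x: number; y: number } {
  const span = next.timestamp - prev.timestamp;
  const t = span <= 0 ? 1 : Math.min(1, Math.max(0, (renderTime - prev.timestamp) / span));
  return {
    x: prev.x + (next.x - prev.x) * t,
    y: prev.y + (next.y - prev.y) * t,
  };
}

// Sender-side limit: at most maxPerSecond events (about one every 33 ms at 30/s).
function shouldSendCursor(lastSentAt: number, now: number, maxPerSecond = 30): boolean {
  return now - lastSentAt >= 1000 / maxPerSecond;
}

// Receiver-side: hide a remote cursor once no event has arrived for timeoutMs.
function isCursorStale(lastEventAt: number, now: number, timeoutMs = 5000): boolean {
  return now - lastEventAt >= timeoutMs;
}
```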
Undo/Redo History
Per-User Undo Stack
Each user maintains their own undo stack. Undoing reverses only that user’s operations, not other users’ concurrent changes. This is selective undo — standard in collaborative editors.
UndoStack {
userId: string
undoStack: Operation[]
redoStack: Operation[]
}
Generating Inverse Operations
Each operation type has an inverse:
- add_object inverts to delete_object
- delete_object inverts to add_object (restoring the full object state)
- update_object inverts to update_object with the previous values (stored at operation time)
- move_object inverts to move_object with the previous position
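The inverse rules can be sketched as a single function. The `UndoableOp` shape is a trimmed-down, hypothetical version of the Operation record, and `before` stands for the object state captured just before the operation was applied.

```typescript
type ObjState = Record<string, unknown>;

interface UndoableOp {
  type: "add_object" | "delete_object" | "update_object" | "move_object";
  objectId: string;
  delta: ObjState;
}

// `before` is undefined when the object did not exist yet (an add).
function invertOp(op: UndoableOp, before: ObjState | undefined): UndoableOp {
  switch (op.type) {
    case "add_object":
      return { type: "delete_object", objectId: op.objectId, delta: {} };
    case "delete_object":
      if (!before) throw new Error("cannot restore an object without its prior state");
      // Restoring needs the full object, not only the changed fields.
      return { type: "add_object", objectId: op.objectId, delta: { ...before } };
    case "update_object":
    case "move_object": {
      if (!before) throw new Error("cannot invert an update without the prior state");
      // Capture the previous value of exactly the fields the op changed.
      const previous: ObjState = {};
      for (const key of Object.keys(op.delta)) previous[key] = before[key];
      return { type: op.type, objectId: op.objectId, delta: previous };
    }
  }
}
```

Pushing the original operation onto the undo stack together with its `before` state is one way to make the inverse available at undo time.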
Undo in the Presence of Concurrent Edits
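If User A undoes moving an object that User B has since edited, the undo must not overwrite B’s edits. Because the inverse operation is a new write with a fresh timestamp, plain LWW merge alone would let it win over B’s edit. The undo therefore checks each property before emitting the inverse: if the latest write to a property is no longer A’s original operation, that property is dropped from the inverse delta. A’s undo becomes a no-op for properties B changed, but still reverts properties only A changed.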
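One way to get this behavior is to filter the inverse delta against the current registers before sending it. The sketch below assumes each property register records the opId of its latest write; `PropRegister` and `filterUndoDelta` are hypothetical names.

```typescript
interface PropRegister {
  value: unknown;
  lastOpId: string; // opId of the most recent write to this property (assumed field)
}

// Keep only the properties whose most recent write is still the operation
// being undone. Anything another user has written since is left alone.
function filterUndoDelta(
  inverseDelta: Record<string, unknown>,
  undoneOpId: string,
  current: Map<string, PropRegister>,
): Record<string, unknown> {
  const result: Record<string, unknown> = {};
  for (const [key, value] of Object.entries(inverseDelta)) {
    if (current.get(key)?.lastOpId === undoneOpId) result[key] = value;
  }
  return result;
}
```

If the filtered delta is empty, the client can skip sending the undo entirely and just pop its stack.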
Infinite Canvas and Pagination
Viewport Model
The canvas coordinate system is unbounded. The viewport is a window into the canvas defined by an offset (x, y) and a zoom level. Canvas coordinates are device-independent pixels at zoom=1.
screenToCanvas(sx, sy, viewport):
  cx = (sx - viewport.offsetX) / viewport.zoom
  cy = (sy - viewport.offsetY) / viewport.zoom
canvasToScreen(cx, cy, viewport):
  sx = cx * viewport.zoom + viewport.offsetX
  sy = cy * viewport.zoom + viewport.offsetY
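The two transforms translate directly to TypeScript. The `zoomAt` helper is an addition not described in the text: it shows how the same formulas keep the canvas point under the pointer fixed while zooming, which is the behavior users typically expect from wheel or pinch zoom.

```typescript
interface ViewTransform {
  offsetX: number; // screen-space position of canvas origin
  offsetY: number;
  zoom: number;
}

function screenToCanvas(sx: number, sy: number, v: ViewTransform): { x: number; y: number } {
  return { x: (sx - v.offsetX) / v.zoom, y: (sy - v.offsetY) / v.zoom };
}

function canvasToScreen(cx: number, cy: number, v: ViewTransform): { x: number; y: number } {
  return { x: cx * v.zoom + v.offsetX, y: cy * v.zoom + v.offsetY };
}

// Zoom by `factor` while keeping the canvas point under (sx, sy) in place.
function zoomAt(v: ViewTransform, sx: number, sy: number, factor: number): ViewTransform {
  const anchor = screenToCanvas(sx, sy, v);
  const zoom = v.zoom * factor;
  // Solve canvasToScreen(anchor) = (sx, sy) for the new offset.
  return { zoom, offsetX: sx - anchor.x * zoom, offsetY: sy - anchor.y * zoom };
}
```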
Spatial Indexing
With tens of thousands of objects, rendering everything every frame is too slow. Use an R-tree (or quadtree) spatial index to query only objects intersecting the current viewport. On pan/zoom, re-query the index.
visibleObjects = spatialIndex.query(viewportBoundingBox)
render(visibleObjects)
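To illustrate the query step, the sketch below swaps in a uniform grid in place of an R-tree or quadtree. It is a simpler stand-in with the same `query(boundingBox)` contract, not the structure the text recommends for production, and `GridIndex` is a hypothetical class.

```typescript
interface Bounds { minX: number; minY: number; maxX: number; maxY: number }

function intersects(a: Bounds, b: Bounds): boolean {
  return a.minX <= b.maxX && a.maxX >= b.minX && a.minY <= b.maxY && a.maxY >= b.minY;
}

class GridIndex {
  private cellSize: number;
  private cells = new Map<string, Set<string>>(); // "cx,cy" -> object ids
  private bounds = new Map<string, Bounds>();     // object id -> its bounding box

  constructor(cellSize: number) {
    this.cellSize = cellSize;
  }

  // Keys of every grid cell the box touches.
  private cellKeys(b: Bounds): string[] {
    const keys: string[] = [];
    const x0 = Math.floor(b.minX / this.cellSize);
    const x1 = Math.floor(b.maxX / this.cellSize);
    const y0 = Math.floor(b.minY / this.cellSize);
    const y1 = Math.floor(b.maxY / this.cellSize);
    for (let cx = x0; cx <= x1; cx++) {
      for (let cy = y0; cy <= y1; cy++) keys.push(`${cx},${cy}`);
    }
    return keys;
  }

  // Inserting an existing id moves it.
  insert(id: string, b: Bounds): void {
    this.remove(id);
    this.bounds.set(id, b);
    for (const key of this.cellKeys(b)) {
      let cell = this.cells.get(key);
      if (!cell) {
        cell = new Set<string>();
        this.cells.set(key, cell);
      }
      cell.add(id);
    }
  }

  remove(id: string): void {
    const b = this.bounds.get(id);
    if (!b) return;
    for (const key of this.cellKeys(b)) this.cells.get(key)?.delete(id);
    this.bounds.delete(id);
  }

  // Ids of objects whose bounding box intersects the view, sorted for stable output.
  query(view: Bounds): string[] {
    const hits = new Set<string>();
    for (const key of this.cellKeys(view)) {
      for (const id of this.cells.get(key) ?? []) {
        const b = this.bounds.get(id);
        if (b && intersects(b, view)) hits.add(id);
      }
    }
    return Array.from(hits).sort();
  }
}
```

A grid works well when objects are of similar size; boards mixing tiny strokes with huge frames are where an R-tree starts to pay off.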
Chunked Loading
On initial load, only objects near the viewport are fetched. As the user pans, additional chunks are loaded on demand. The canvas is divided into a grid of chunks; each chunk is fetched and cached independently. Chunks that leave the viewport are evicted from memory after a grace period.
Chunk {
chunkX: int // grid coordinate
chunkY: int
objects: WhiteboardObject[]
lastUpdated: timestamp
}
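A sketch of how a client might decide which chunks to fetch and which to evict as the viewport moves. The chunk size, the `"x,y"` key format, and the function names are assumptions for illustration; the grace period before eviction is left out.

```typescript
interface ViewBox { minX: number; minY: number; maxX: number; maxY: number }

// Grid keys ("chunkX,chunkY") of every chunk the view overlaps, with an
// optional margin (in canvas units) to prefetch just outside the viewport.
function chunkKeysForView(view: ViewBox, chunkSize: number, margin = 0): string[] {
  const keys: string[] = [];
  const x0 = Math.floor((view.minX - margin) / chunkSize);
  const x1 = Math.floor((view.maxX + margin) / chunkSize);
  const y0 = Math.floor((view.minY - margin) / chunkSize);
  const y1 = Math.floor((view.maxY + margin) / chunkSize);
  for (let cx = x0; cx <= x1; cx++) {
    for (let cy = y0; cy <= y1; cy++) keys.push(`${cx},${cy}`);
  }
  return keys;
}

// Compare what is cached with what the view needs.
function planChunkLoads(
  loaded: Set<string>,
  needed: string[],
): { toFetch: string[]; toEvict: string[] } {
  const neededSet = new Set(needed);
  return {
    toFetch: needed.filter((key) => !loaded.has(key)),
    toEvict: Array.from(loaded).filter((key) => !neededSet.has(key)),
  };
}
```

`Math.floor` keeps the grid consistent for negative coordinates, which matters on an unbounded canvas where the origin has no special role.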
Real-Time Transport
WebSocket Rooms
Each board maps to a WebSocket room. All clients in the room receive broadcasts. The server maintains a room registry (Redis pub/sub or a dedicated room service) so broadcasts reach clients connected to any server node.
Message Types
Client -> Server: join_board, leave_board, operation, cursor_move, undo, redo
Server -> Client: board_snapshot, operation_ack, operation_broadcast, cursor_update, user_joined, user_left, error
Presence
Active users on a board are tracked in Redis with a TTL refreshed by a heartbeat every 30 seconds. The presence list is broadcast to all clients on join, leave, and reconnect events.
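The TTL-plus-heartbeat pattern can be shown with an in-memory stand-in for the Redis keys. `PresenceTracker` is hypothetical, and the 90-second TTL used in the usage note is an assumption (three missed 30-second heartbeats); the text does not specify the TTL.

```typescript
// In-memory stand-in for per-user keys with an expiry.
class PresenceTracker {
  private ttlMs: number;
  private expiresAt = new Map<string, number>(); // userId -> expiry time in ms

  constructor(ttlMs: number) {
    this.ttlMs = ttlMs;
  }

  // Called on join and on every heartbeat; pushes the expiry forward.
  heartbeat(userId: string, now: number): void {
    this.expiresAt.set(userId, now + this.ttlMs);
  }

  // Explicit leave removes the user immediately.
  leave(userId: string): void {
    this.expiresAt.delete(userId);
  }

  // Users whose TTL has not lapsed; lapsed entries are purged as a side effect.
  activeUsers(now: number): string[] {
    const active: string[] = [];
    for (const [userId, expiry] of this.expiresAt) {
      if (expiry <= now) this.expiresAt.delete(userId);
      else active.push(userId);
    }
    return active.sort();
  }
}
```

With `new PresenceTracker(90_000)`, a client that crashes without sending leave_board drops out of the list once its last heartbeat is more than 90 seconds old.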
Snapshot and Persistence
Event Sourcing
The operation log is the source of truth. Current board state is rebuilt by replaying all operations. For boards with long history, full replay is expensive.
Periodic Snapshots
A background job periodically computes a full state snapshot and stores it. On load, the server sends the latest snapshot plus operations after the snapshot timestamp. This bounds replay cost to recent operations only.
Snapshot {
snapshotId: UUID
boardId: string
state: Board // full serialized board state
lastOpId: string // opId of the last operation (in log order) included in the snapshot
createdAt: timestamp
}
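A minimal sketch of loading from a snapshot plus the log tail. It assumes each logged operation carries a monotonically increasing `seq` (a log offset), which is not in the schema above but makes "operations after the snapshot" well defined when opIds are UUIDs. The state model is deliberately simplified to plain objects.

```typescript
type BoardState = Record<string, Record<string, unknown>>;

interface LoggedOp {
  seq: number;                           // position in the log (assumed field)
  objectId: string;
  delta: Record<string, unknown> | null; // null marks a delete
}

interface BoardSnapshot {
  state: BoardState;
  lastSeq: number; // seq of the last operation folded into `state`
}

// Applies one operation without mutating the input state.
function applyLogged(state: BoardState, op: LoggedOp): BoardState {
  const next: BoardState = { ...state };
  if (op.delta === null) delete next[op.objectId];
  else next[op.objectId] = { ...(next[op.objectId] ?? {}), ...op.delta };
  return next;
}

function replay(base: BoardState, ops: LoggedOp[]): BoardState {
  return ops.reduce(applyLogged, base);
}

// What the background job computes.
function takeSnapshot(log: LoggedOp[], upToSeq: number): BoardSnapshot {
  return { state: replay({}, log.filter((op) => op.seq <= upToSeq)), lastSeq: upToSeq };
}

// What a joining client receives: snapshot state plus only the newer operations.
function loadBoard(snapshot: BoardSnapshot, log: LoggedOp[]): BoardState {
  return replay(snapshot.state, log.filter((op) => op.seq > snapshot.lastSeq));
}
```

Loading through a snapshot should always produce the same state as a full replay; checking that equality is a cheap sanity test for the snapshot job.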
Database Schema
boards(
board_id UUID PK,
owner_user_id BIGINT,
title VARCHAR(255),
created_at TIMESTAMP,
updated_at TIMESTAMP,
is_public BOOL
)
board_operations(
op_id VARCHAR(64) PK,
board_id UUID FK,
user_id BIGINT,
session_id VARCHAR(64),
type VARCHAR(32),
object_id VARCHAR(64),
delta JSONB,
hlc_timestamp VARCHAR(64),
parent_op_id VARCHAR(64),
created_at TIMESTAMP
)
board_snapshots(
snapshot_id UUID PK,
board_id UUID FK,
state JSONB,
last_op_id VARCHAR(64),
created_at TIMESTAMP
)
board_members(
board_id UUID FK,
user_id BIGINT FK,
role ENUM('owner','editor','viewer'),
added_at TIMESTAMP,
PRIMARY KEY (board_id, user_id)
)
Rendering Architecture
Canvas vs SVG vs WebGL
- SVG: DOM-based and easy to implement, but slow with thousands of objects. Good for simple boards.
- Canvas 2D: Fast for moderate object counts, with a simple API. It is immediate-mode, so there is no retained per-object state; the application handles hit testing and redraw itself.
- WebGL / WebGPU: GPU-accelerated. Needed for very large boards with complex effects, at the cost of higher implementation complexity. Use a library like Pixi.js or a custom renderer.
Dirty Rect Rendering
Only re-render regions that changed. Maintain a dirty rect list updated on each operation. On the next animation frame, clear and redraw only the dirty regions. This reduces GPU bandwidth for boards with sparse activity.
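Maintaining the dirty rect list might look like the sketch below, which merges any overlapping rectangles so the same pixels are not cleared and redrawn twice. The names are hypothetical, and merging by bounding-box union is one simple policy among several.

```typescript
interface DirtyRect { x: number; y: number; width: number; height: number }

// Strict overlap: rectangles that only touch at an edge are kept separate.
function rectsOverlap(a: DirtyRect, b: DirtyRect): boolean {
  return a.x < b.x + b.width && b.x < a.x + a.width &&
         a.y < b.y + b.height && b.y < a.y + a.height;
}

// Smallest rectangle containing both inputs.
function unionRect(a: DirtyRect, b: DirtyRect): DirtyRect {
  const x = Math.min(a.x, b.x);
  const y = Math.min(a.y, b.y);
  const right = Math.max(a.x + a.width, b.x + b.width);
  const bottom = Math.max(a.y + a.height, b.y + b.height);
  return { x, y, width: right - x, height: bottom - y };
}

// Adds `rect` to the list, folding in every rect it overlaps. The union can
// grow into rects it did not touch before, so repeat until nothing changes.
function addDirtyRect(list: DirtyRect[], rect: DirtyRect): DirtyRect[] {
  let merged = rect;
  let rest = list;
  let changed = true;
  while (changed) {
    changed = false;
    const remaining: DirtyRect[] = [];
    for (const r of rest) {
      if (rectsOverlap(r, merged)) {
        merged = unionRect(r, merged);
        changed = true;
      } else {
        remaining.push(r);
      }
    }
    rest = remaining;
  }
  return [...rest, merged];
}
```

On each animation frame the renderer would clear and redraw each rectangle in the list, then reset the list to empty.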
Layer Architecture
Layer 0: Background (grid lines, infinite canvas pattern), rarely redrawn
Layer 1: Objects (shapes, strokes, images), redrawn on operations
Layer 2: Selection handles, redrawn on selection change
Layer 3: Cursor overlays, redrawn at 30fps
Each layer is a separate canvas element composited by the browser, avoiding full redraws of all layers on any change.
Export
Raster export renders the board to an off-screen canvas at the desired resolution and encodes it; vector export serializes the object graph directly:
- PNG: canvas.toBlob(callback, 'image/png')
- SVG: serialize the object graph to SVG elements; text objects map to <text>, strokes to <polyline>, shapes to <rect>/<ellipse>
- PDF: use a client-side PDF library (jsPDF, pdfmake) to render SVG or canvas into a PDF page
For very large boards, export is done server-side: a headless Chromium instance renders the board and produces the output file, which is uploaded to object storage and returned as a download link.
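The SVG path is plain string serialization and can be sketched without any browser API. `ExportObject` is a reduced, hypothetical version of WhiteboardObject; the sketch assumes `x` and `y` are the top-left corner of the bounding box and ignores rotation, stroke width, and fonts.

```typescript
interface ExportObject {
  type: "rect" | "ellipse" | "stroke" | "text";
  x: number;
  y: number;
  width: number;
  height: number;
  strokeColor?: string;
  fillColor?: string;
  points?: [number, number][]; // freehand strokes
  content?: string;            // text
}

// Escape characters that are special in XML text and attribute values.
function escapeXml(text: string): string {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

function toSvgElement(o: ExportObject): string {
  const stroke = `stroke="${escapeXml(o.strokeColor ?? "black")}"`;
  const fill = `fill="${escapeXml(o.fillColor ?? "none")}"`;
  switch (o.type) {
    case "rect":
      return `<rect x="${o.x}" y="${o.y}" width="${o.width}" height="${o.height}" ${stroke} ${fill}/>`;
    case "ellipse":
      // SVG ellipses are defined by center and radii, not a bounding box.
      return `<ellipse cx="${o.x + o.width / 2}" cy="${o.y + o.height / 2}" rx="${o.width / 2}" ry="${o.height / 2}" ${stroke} ${fill}/>`;
    case "stroke": {
      const pts = (o.points ?? []).map(([px, py]) => `${px},${py}`).join(" ");
      return `<polyline points="${pts}" ${stroke} fill="none"/>`;
    }
    case "text":
      return `<text x="${o.x}" y="${o.y}">${escapeXml(o.content ?? "")}</text>`;
  }
}

// Objects are expected in z-order, bottom to top; SVG paints in document order.
function exportSvg(objects: ExportObject[], width: number, height: number): string {
  const body = objects.map((o) => toSvgElement(o)).join("\n");
  return `<svg xmlns="http://www.w3.org/2000/svg" width="${width}" height="${height}">\n${body}\n</svg>`;
}
```

Note that the SVG `<text>` element positions text by its baseline, so a real exporter would offset `y` by the font ascent.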
Failure Handling
- Client disconnect: Operations buffered locally during a disconnect are sent on reconnect. The server deduplicates by opId. Clients receive all operations they missed since their last known opId.
- Operation conflicts: CRDT merge handles concurrent edits automatically. No manual conflict resolution is needed for supported operation types.
- Large boards: If a board exceeds a configured object count threshold, writes are rate-limited per user. The oldest operations already covered by a snapshot are archived to cold storage and excluded from real-time sync, but remain queryable for history.
Summary
A collaborative whiteboard combines an OR-Set CRDT for object membership, LWW registers for property updates, fractional indexing for z-order, and per-user undo stacks. Real-time transport uses WebSocket rooms with Redis pub/sub for multi-node fan-out. Cursor positions are broadcast ephemerally. An infinite canvas uses spatial indexing and chunked loading to keep rendering fast at scale. Periodic snapshots bound replay cost for long-lived boards.