Question 1

What is the difference between Operational Transformation and CRDTs for collaborative editing?

Accepted Answer

Operational Transformation (OT) transforms concurrent operations against each other so that all replicas converge. If User A inserts a character at position 5 while User B concurrently deletes the character at position 3, OT rewrites A's operation to account for B's deletion: the insert now lands at position 4. Production OT systems almost always route operations through a central server that serializes and transforms them; decentralized OT algorithms have been proposed, but they are hard to get right and several published ones were later shown not to converge. Google Docs uses OT for text editing. CRDTs (Conflict-Free Replicated Data Types) are data structures whose merge operations commute (state-based CRDTs also require the merge to be associative and idempotent), so replicas converge regardless of the order in which updates arrive. No central coordinator is needed. Last-Write-Wins (LWW) registers, G-Sets (grow-only sets), and OR-Sets (observed-remove sets) are CRDTs. For whiteboards: element positions use an LWW register (on concurrent moves, the latest timestamp wins); element presence uses an Add-Wins Set (on a concurrent add and delete, the add wins). CRDTs are usually the simpler fit for a canvas with many independent element types; OT remains a common choice for rich text editing, where precise character positions matter.
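To make the two approaches concrete, here is a minimal Python sketch, not a production design. The first part covers only the single-character insert-versus-delete case from the answer (0-based positions assumed). The second part sketches the two whiteboard CRDTs mentioned above. All names (`transform_insert`, `LWWPosition`, `AddWinsSet`, the `client_id` tie-breaker, the add tags) are hypothetical and chosen for illustration.

```python
from dataclasses import dataclass, field

# --- OT: single-character insert vs. delete (0-based positions, assumed) ---

def transform_insert(insert_pos: int, delete_pos: int) -> int:
    """Shift an insert left if a concurrent delete removed an earlier character."""
    return insert_pos - 1 if delete_pos < insert_pos else insert_pos

def transform_delete(delete_pos: int, insert_pos: int) -> int:
    """Shift a delete right if a concurrent insert landed at or before it."""
    return delete_pos + 1 if insert_pos <= delete_pos else delete_pos

def apply_insert(doc: str, pos: int, ch: str) -> str:
    return doc[:pos] + ch + doc[pos:]

def apply_delete(doc: str, pos: int) -> str:
    return doc[:pos] + doc[pos + 1:]

# --- CRDT: LWW register for an element's position ---

@dataclass(frozen=True)
class LWWPosition:
    x: float
    y: float
    timestamp: int
    client_id: str  # assumed tie-breaker when timestamps are equal

def merge_position(a: LWWPosition, b: LWWPosition) -> LWWPosition:
    """Keep the write with the latest (timestamp, client_id); order-independent."""
    return max(a, b, key=lambda p: (p.timestamp, p.client_id))

# --- CRDT: Add-Wins (observed-remove) set for element presence ---

@dataclass
class AddWinsSet:
    adds: dict = field(default_factory=dict)    # element id -> set of add tags
    removed: set = field(default_factory=set)   # tags tombstoned by removes

    def add(self, element_id: str, tag: str) -> None:
        self.adds.setdefault(element_id, set()).add(tag)

    def remove(self, element_id: str) -> None:
        # A remove only cancels the add tags this replica has observed.
        self.removed |= self.adds.get(element_id, set())

    def merge(self, other: "AddWinsSet") -> None:
        for element_id, tags in other.adds.items():
            self.adds.setdefault(element_id, set()).update(tags)
        self.removed |= other.removed

    def contains(self, element_id: str) -> bool:
        # Present if at least one add tag has not been tombstoned.
        return bool(self.adds.get(element_id, set()) - self.removed)

# User A inserts "X" at 5 while User B deletes the character at 3.
doc = "abcdefgh"
site_a = apply_delete(apply_insert(doc, 5, "X"), transform_delete(3, 5))
site_b = apply_insert(apply_delete(doc, 3), transform_insert(5, 3), "X")
print(site_a, site_b)  # both replicas end with the same text
```

A real OT system has to handle every pair of operation types and multi-character ranges, which is where most of the complexity lives. The CRDT half needs no transform step at all: replicas exchange state and call `merge` in any order.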

Question 2

How do you implement per-user undo in a collaborative whiteboard?

Accepted Answer

Per-user undo reverts the individual user's last action without affecting other users' edits. This is fundamentally different from global undo, which would be disruptive in collaborative settings because it could revert someone else's work. Implementation: maintain a per-user operation stack. Each operation is stored as an invertible action: move shape from (x1,y1) to (x2,y2) → undo = move from (x2,y2) to (x1,y1); add element → undo = delete element; delete → undo = restore element. When the user triggers undo: pop their last operation, apply the inverse to the current board state, and broadcast the inverse operation to other clients. The inverse is applied to the current state, not the state at the time of the original operation, so if another user moved the shape in between, the undo still applies: it moves the shape from wherever it currently is back to its pre-operation position (x1,y1). Note that this overrides the other user's intermediate move.
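The per-user stack described above can be sketched as follows. This is a minimal single-process illustration under stated assumptions: the `Board`, `Move`, `Add`, and `Delete` names are hypothetical, positions are plain tuples, and `broadcast_log` is a stand-in for sending the operation to other clients.

```python
from collections import defaultdict
from dataclasses import dataclass

@dataclass(frozen=True)
class Move:
    element_id: str
    from_pos: tuple
    to_pos: tuple

    def inverse(self):
        return Move(self.element_id, self.to_pos, self.from_pos)

@dataclass(frozen=True)
class Add:
    element_id: str
    pos: tuple

    def inverse(self):
        return Delete(self.element_id, self.pos)

@dataclass(frozen=True)
class Delete:
    element_id: str
    pos: tuple  # position at deletion time, kept so undo can restore it

    def inverse(self):
        return Add(self.element_id, self.pos)

class Board:
    def __init__(self):
        self.elements = {}                    # element id -> position
        self.undo_stacks = defaultdict(list)  # user id -> that user's operations
        self.broadcast_log = []               # stand-in for a network broadcast

    def _apply(self, op):
        # Operations are applied to the current state, whatever it is now.
        if isinstance(op, Move):
            if op.element_id in self.elements:
                self.elements[op.element_id] = op.to_pos
        elif isinstance(op, Add):
            self.elements[op.element_id] = op.pos
        elif isinstance(op, Delete):
            self.elements.pop(op.element_id, None)

    def perform(self, user_id, op):
        self._apply(op)
        self.undo_stacks[user_id].append(op)
        self.broadcast_log.append(op)

    def undo(self, user_id):
        stack = self.undo_stacks[user_id]
        if not stack:
            return None  # nothing of this user's to undo
        inverse = stack.pop().inverse()
        self._apply(inverse)
        self.broadcast_log.append(inverse)
        return inverse

board = Board()
board.perform("alice", Add("s1", (0, 0)))
board.perform("alice", Move("s1", (0, 0), (10, 10)))
board.perform("bob", Move("s1", (10, 10), (50, 50)))  # bob moves it in between
board.undo("alice")
print(board.elements)  # s1 is back at alice's pre-move position (0, 0)
```

Alice's undo only consumes her own stack, so Bob's history is untouched, although his intermediate move is overridden on the board. A design that wants to avoid this could check whether the element changed since the original operation before applying the inverse.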

Question 3

How do you efficiently load a large whiteboard with millions of objects?

Accepted Answer

Loading millions of objects at once is impractical: it would take too long and overwhelm the browser rendering engine. Two optimizations: (1) Viewport-based loading: only load elements within (or near) the user's current viewport bounding box. Use a spatial index (R-tree or quadtree on the server) for efficient bounding-box queries. As the user pans or zooms, incrementally load newly visible elements. (2) Snapshot + operation log: instead of replaying the full operation history on load, periodically (e.g., every 1,000 operations or hourly) snapshot the board state as a serialized JSON blob. On load: fetch the latest snapshot plus only the operations applied after it. The number of operations to replay is bounded by the snapshot frequency, making load time O(snapshot_size + ops_since_snapshot) rather than O(total_ops). Store snapshots in S3 and serve them via CDN for fast global access.
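Both optimizations can be sketched in a few lines of Python. Two assumptions to flag: the spatial index below is a uniform grid, used here as a simple stand-in for the R-tree or quadtree the answer recommends, and the operation format (`seq`, `type`, `id`, `box`) is hypothetical. Bounding boxes are `(x1, y1, x2, y2)` in board units.

```python
import json
from collections import defaultdict

CELL = 1000  # grid cell size in board units (assumed value)

class GridIndex:
    """Uniform grid index: a simple stand-in for an R-tree or quadtree."""

    def __init__(self):
        self.cells = defaultdict(set)  # (cx, cy) -> element ids touching that cell
        self.boxes = {}                # element id -> bounding box

    def _cells_for(self, box):
        x1, y1, x2, y2 = box
        for cx in range(int(x1 // CELL), int(x2 // CELL) + 1):
            for cy in range(int(y1 // CELL), int(y2 // CELL) + 1):
                yield (cx, cy)

    def insert(self, element_id, box):
        self.boxes[element_id] = tuple(box)
        for cell in self._cells_for(box):
            self.cells[cell].add(element_id)

    def query(self, viewport):
        """Return ids of elements whose box intersects the viewport box."""
        vx1, vy1, vx2, vy2 = viewport
        hits = set()
        for cell in self._cells_for(viewport):
            for element_id in self.cells.get(cell, ()):
                x1, y1, x2, y2 = self.boxes[element_id]
                if x1 <= vx2 and x2 >= vx1 and y1 <= vy2 and y2 >= vy1:
                    hits.add(element_id)
        return hits

def apply_op(state, op):
    """Apply one logged operation to the board state in place."""
    if op["type"] == "add":
        state[op["id"]] = op["box"]
    elif op["type"] == "move" and op["id"] in state:
        state[op["id"]] = op["box"]
    elif op["type"] == "delete":
        state.pop(op["id"], None)

def load_board(snapshot_json, snapshot_seq, op_log):
    """Rebuild state from the latest snapshot plus the ops logged after it."""
    state = json.loads(snapshot_json)
    for op in op_log:
        if op["seq"] > snapshot_seq:  # earlier ops are already in the snapshot
            apply_op(state, op)
    return state

# Snapshot taken at seq 2; only ops 3 and 4 need replaying.
snapshot = json.dumps({"a": [0, 0, 10, 10]})
log = [
    {"seq": 1, "type": "add", "id": "a", "box": [0, 0, 5, 5]},
    {"seq": 2, "type": "move", "id": "a", "box": [0, 0, 10, 10]},
    {"seq": 3, "type": "add", "id": "b", "box": [5000, 5000, 5100, 5100]},
    {"seq": 4, "type": "add", "id": "c", "box": [990, 990, 1010, 1010]},
]
state = load_board(snapshot, snapshot_seq=2, op_log=log)

index = GridIndex()
for element_id, box in state.items():
    index.insert(element_id, box)
visible = index.query((0, 0, 1000, 1000))
print(sorted(visible))  # elements intersecting the viewport: a and c, not b
```

In a real service the log query would fetch only operations with a sequence number above the snapshot's, rather than filtering in memory as this sketch does, and the index would live on the server so that clients receive only the visible elements.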

System Design Interview: Design a Real-Time Collaborative Whiteboard (Miro/Figma)

What Is a Real-Time Collaborative Whiteboard?

System Requirements

Functional

Non-Functional

Core Data Model

Real-Time Sync: WebSocket Architecture

Conflict Resolution: OT vs. CRDTs

Operational Transformation (OT)

CRDTs (Conflict-Free Replicated Data Types)

Element Versioning and Undo

Canvas State Loading

Cursor Presence

Viewport and Large Boards

Interview Tips