Figma’s collaborative editor is the canonical frontend system design problem of the 2020s. The question (design a multi-user editor where many people can simultaneously edit a shared canvas without conflicts) touches every dimension of frontend architecture: real-time sync, canvas rendering, state management, conflict resolution, and performance at scale. It is the frontend equivalent of “design Twitter,” and senior+ frontend interviews at companies such as Figma, Notion, Linear, and Vercel ask versions of it routinely.
This piece walks through the problem, the architectural decisions, and what senior interviewers grade.
The problem
Build a collaborative editor where multiple users can edit a shared design canvas (vector shapes, text, images) simultaneously. Users see each other’s cursors and selections in real time. Edits propagate to all connected clients within ~50ms. The system must handle:
- Multiple users editing the same object simultaneously.
- Network latency and brief disconnections.
- Hundreds of objects on the canvas without lag.
- Persistent storage so the document survives reconnection.
The architectural decisions
1. Sync model: CRDT vs OT
Two main approaches to real-time collaboration:
- Operational Transformation (OT). Used by Google Docs. Each operation is transformed against concurrent operations to maintain consistency. Centralized server determines the canonical order. Complex to implement; requires a central authority.
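- Conflict-Free Replicated Data Types (CRDTs). Concurrent operations commute by design, so replicas converge regardless of the order in which operations arrive. Yjs and Automerge are well-known implementations. Figma describes its own multiplayer system as CRDT-inspired (last-writer-wins per property, arbitrated by a central server) rather than a pure CRDT. Convergence is mathematically guaranteed. More metadata and memory overhead, but conceptually simpler.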
Senior candidates pick a CRDT-style model for Figma’s use case: for graphics editing with potentially long disconnections, eventual consistency without per-operation coordination is the right fit. They can also explain that OT works well for linear text but is harder to apply to arbitrary tree structures like a Figma document.
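To make the commutativity claim concrete, here is a minimal sketch of a last-writer-wins (LWW) register, one of the simplest CRDTs. The names `LwwRegister`, `wins`, and `mergeLww` are illustrative assumptions, not taken from any library; a production system would more likely build on Yjs or Automerge.

```typescript
// A last-writer-wins register. All names here are hypothetical.
interface LwwRegister<T> {
  value: T;
  clock: number;    // logical (Lamport-style) timestamp of the write
  clientId: string; // breaks ties between writes with equal clocks
}

// True when `a` should win over `b`. The ordering is total, so every replica
// picks the same winner no matter which write it saw first.
function wins<T>(a: LwwRegister<T>, b: LwwRegister<T>): boolean {
  if (a.clock !== b.clock) return a.clock > b.clock;
  return a.clientId > b.clientId;
}

// Merge is commutative, associative, and idempotent.
function mergeLww<T>(a: LwwRegister<T>, b: LwwRegister<T>): LwwRegister<T> {
  return wins(a, b) ? a : b;
}

// Two users set the fill of the same rectangle concurrently.
const fromAlice: LwwRegister<string> = { value: "#ff0000", clock: 7, clientId: "alice" };
const fromBob: LwwRegister<string> = { value: "#0000ff", clock: 7, clientId: "bob" };

// Each replica receives the two writes in a different order.
const replicaA = mergeLww(fromAlice, fromBob);
const replicaB = mergeLww(fromBob, fromAlice);
// Both replicas end up with the same value.
```

The tie-break on `clientId` is arbitrary but deterministic, which is all convergence requires. Whether "last writer wins" is the right product behavior for a given property is a separate design question.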
2. Document model
The document is a tree of objects: the page contains frames; frames contain shapes; shapes have properties. The CRDT representation:
- Each object has a unique ID (e.g., UUID).
- Each property change is an operation tagged with a logical timestamp (a Lamport clock or hybrid logical clock, HLC) plus the originating client’s ID to break ties.
- The current state is computed by applying all operations in causal order.
- Tree structure changes (adding/removing objects, reparenting) are operations themselves.
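The points above can be sketched as a set of operations applied to a map of nodes. This is an illustrative model, not Figma’s real schema: `DocOp`, `SceneNode`, `Stamp`, and `applyOp` are hypothetical names, and rejecting cyclic reparents locally is a simplification. Concurrent reparenting needs more care to converge, and a central server is one place to make that decision once for everyone.

```typescript
// Hypothetical shapes for a tree-structured document.
type Stamp = { clock: number; clientId: string };

type DocOp =
  | { kind: "create"; id: string; parentId: string | null; stamp: Stamp }
  | { kind: "set"; id: string; key: string; value: unknown; stamp: Stamp }
  | { kind: "reparent"; id: string; parentId: string | null; stamp: Stamp }
  | { kind: "delete"; id: string; stamp: Stamp };

interface SceneNode {
  id: string;
  parentId: string | null;
  parentStamp: Stamp; // stamp of the op that last set parentId
  props: Map<string, { value: unknown; stamp: Stamp }>;
  deleted: boolean;   // tombstone instead of physical removal
}

function newer(a: Stamp, b: Stamp): boolean {
  return a.clock !== b.clock ? a.clock > b.clock : a.clientId > b.clientId;
}

// Would moving `id` under `parentId` make the node its own ancestor?
function createsCycle(doc: Map<string, SceneNode>, id: string, parentId: string | null): boolean {
  let cur: string | null = parentId;
  while (cur !== null) {
    if (cur === id) return true;
    cur = doc.get(cur)?.parentId ?? null;
  }
  return false;
}

function applyOp(doc: Map<string, SceneNode>, op: DocOp): void {
  if (op.kind === "create") {
    if (!doc.has(op.id)) {
      doc.set(op.id, { id: op.id, parentId: op.parentId, parentStamp: op.stamp, props: new Map(), deleted: false });
    }
    return;
  }
  const node = doc.get(op.id);
  if (!node) return; // a real system would buffer the op until the create arrives
  switch (op.kind) {
    case "set": {
      const current = node.props.get(op.key);
      if (!current || newer(op.stamp, current.stamp)) {
        node.props.set(op.key, { value: op.value, stamp: op.stamp });
      }
      break;
    }
    case "reparent":
      if (newer(op.stamp, node.parentStamp) && !createsCycle(doc, op.id, op.parentId)) {
        node.parentId = op.parentId;
        node.parentStamp = op.stamp;
      }
      break;
    case "delete":
      node.deleted = true;
      break;
  }
}

const sceneDoc = new Map<string, SceneNode>();
const sampleOps: DocOp[] = [
  { kind: "create", id: "frame", parentId: null, stamp: { clock: 1, clientId: "a" } },
  { kind: "create", id: "rect", parentId: null, stamp: { clock: 2, clientId: "a" } },
  { kind: "reparent", id: "rect", parentId: "frame", stamp: { clock: 3, clientId: "a" } },
  { kind: "set", id: "rect", key: "fill", value: "#00ff00", stamp: { clock: 5, clientId: "b" } },
  { kind: "set", id: "rect", key: "fill", value: "#ff0000", stamp: { clock: 4, clientId: "a" } }, // stale, ignored
  { kind: "reparent", id: "frame", parentId: "rect", stamp: { clock: 6, clientId: "b" } }, // would create a cycle, rejected
];
for (const op of sampleOps) applyOp(sceneDoc, op);
```

After the loop, the rectangle sits inside the frame with the green fill, and the frame is still a root node. Keeping a stamp per property (rather than per object) is what lets two users edit different properties of the same shape without overwriting each other.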
3. Network protocol
WebSocket for real-time sync. The protocol:
- Client → server: “I made these edits at logical time T.”
- Server → all clients: broadcast the edits.
- Each client applies the edits to its local document.
- Clients periodically send heartbeats.
- On disconnect/reconnect, the client requests missed operations since its last logical time.
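The protocol steps above can be sketched with an in-memory relay in place of a real WebSocket server. Everything here is an assumption made for illustration: the message shapes, the `RelayServer` and `ClientSession` classes, and the choice of a server-assigned sequence number as the "logical time" used for catch-up. Operations are plain strings to keep the focus on the message flow.

```typescript
// Hypothetical wire messages; field names are illustrative.
type ClientMsg =
  | { type: "edits"; clientId: string; ops: string[] }
  | { type: "resync"; clientId: string; sinceSeq: number }
  | { type: "heartbeat"; clientId: string };

interface LogEntry { seq: number; op: string }
type ServerMsg = { type: "ops"; entries: LogEntry[] };
type Send = (msg: ServerMsg) => void;

class RelayServer {
  private log: LogEntry[] = [];
  private sockets = new Map<string, Send>();

  connect(clientId: string, send: Send): void { this.sockets.set(clientId, send); }
  disconnect(clientId: string): void { this.sockets.delete(clientId); }

  handle(msg: ClientMsg): void {
    switch (msg.type) {
      case "edits": {
        const entries: LogEntry[] = [];
        for (const op of msg.ops) {
          const entry = { seq: this.log.length + 1, op };
          this.log.push(entry);
          entries.push(entry);
        }
        // Broadcast to everyone, including the sender (this doubles as an ack).
        this.sockets.forEach((send) => send({ type: "ops", entries }));
        break;
      }
      case "resync": {
        const send = this.sockets.get(msg.clientId);
        const missed = this.log.filter((e) => e.seq > msg.sinceSeq);
        if (send) send({ type: "ops", entries: missed });
        break;
      }
      case "heartbeat":
        break; // a real server would refresh a liveness timer here
    }
  }
}

class ClientSession {
  lastSeq = 0;
  applied: string[] = [];

  receive = (msg: ServerMsg): void => {
    for (const e of msg.entries) {
      if (e.seq <= this.lastSeq) continue; // skip duplicates after a reconnect
      this.applied.push(e.op);
      this.lastSeq = e.seq;
    }
  };
}

const server = new RelayServer();
const alice = new ClientSession();
const bob = new ClientSession();
server.connect("alice", alice.receive);
server.connect("bob", bob.receive);

server.handle({ type: "edits", clientId: "alice", ops: ["move rect-1"] });
server.disconnect("bob"); // Bob's connection drops
server.handle({ type: "edits", clientId: "alice", ops: ["resize rect-1", "recolor rect-1"] });

server.connect("bob", bob.receive); // Bob reconnects and asks for what he missed
server.handle({ type: "resync", clientId: "bob", sinceSeq: bob.lastSeq });
```

After the resync, Bob has applied the same three operations as Alice, in the same order. An unbounded log is fine for a sketch; a real service would periodically snapshot the document and truncate the log.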
4. Canvas rendering
How to render hundreds of objects efficiently:
- Canvas vs SVG vs WebGL. SVG is convenient but doesn’t scale to thousands of objects. Canvas is faster for bulk rendering but loses some accessibility. WebGL (Figma’s choice) is fastest and supports complex effects.
- Spatial indexing. An R-tree or quadtree to quickly find which objects are visible in the viewport.
- Dirty region tracking. Only redraw the parts of the canvas that changed.
- Off-main-thread rendering. Web Workers + OffscreenCanvas to keep rendering off the main thread.
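As a sketch of the spatial-indexing idea, the example below culls objects against the viewport using a uniform grid. The grid is a deliberately simpler stand-in for the R-tree or quadtree mentioned above, and `Box`, `GridIndex`, and the cell size are hypothetical choices for illustration.

```typescript
// Axis-aligned bounding box in canvas coordinates.
interface Box { x: number; y: number; w: number; h: number }

function overlaps(a: Box, b: Box): boolean {
  return a.x < b.x + b.w && b.x < a.x + a.w && a.y < b.y + b.h && b.y < a.y + a.h;
}

// Uniform grid: each object is registered in every cell its bounds touch.
class GridIndex {
  private cellSize: number;
  private cells = new Map<string, string[]>();
  private boxes = new Map<string, Box>();

  constructor(cellSize: number) { this.cellSize = cellSize; }

  private cellKeys(box: Box): string[] {
    const keys: string[] = [];
    const x0 = Math.floor(box.x / this.cellSize);
    const x1 = Math.floor((box.x + box.w) / this.cellSize);
    const y0 = Math.floor(box.y / this.cellSize);
    const y1 = Math.floor((box.y + box.h) / this.cellSize);
    for (let cx = x0; cx <= x1; cx++) {
      for (let cy = y0; cy <= y1; cy++) keys.push(`${cx},${cy}`);
    }
    return keys;
  }

  insert(id: string, box: Box): void {
    this.boxes.set(id, box);
    for (const key of this.cellKeys(box)) {
      const bucket = this.cells.get(key);
      if (bucket) bucket.push(id);
      else this.cells.set(key, [id]);
    }
  }

  // Ids of objects whose bounds intersect the viewport, sorted for stable output.
  query(viewport: Box): string[] {
    const seen = new Set<string>();
    const hits: string[] = [];
    for (const key of this.cellKeys(viewport)) {
      for (const id of this.cells.get(key) ?? []) {
        const box = this.boxes.get(id);
        if (!seen.has(id) && box && overlaps(box, viewport)) {
          seen.add(id);
          hits.push(id);
        }
      }
    }
    return hits.sort();
  }
}

const index = new GridIndex(100);
index.insert("a", { x: 0, y: 0, w: 50, h: 50 });
index.insert("b", { x: 120, y: 120, w: 200, h: 200 });
index.insert("c", { x: 1000, y: 1000, w: 10, h: 10 });

const visibleNearOrigin = index.query({ x: 0, y: 0, w: 130, h: 130 });
const visibleAfterPan = index.query({ x: 100, y: 100, w: 150, h: 150 });
```

Only the cells under the viewport are visited, so the far-away object `c` is never tested. A grid works best when objects are roughly similar in size; trees adapt better to documents that mix tiny icons with page-sized frames.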
5. State management
Multiple sources of state:
- The document state (CRDT, large, persisted).
- The selection state (per-user, in memory only).
- The viewport state (zoom, pan; per-user, occasionally persisted).
- The presence state (cursors, names, selections of other users; ephemeral).
Each has different sync requirements. Document state syncs through CRDT operations. Presence syncs through a separate ephemeral channel (often a Redis pub/sub on the backend).
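One way to keep these kinds of state from interfering with each other on the client is a small external store whose subscribers watch a single slice. The sketch below is framework-agnostic and entirely illustrative (`SliceStore` and `EditorState` are hypothetical names); in a React app, a store like this would typically be read through the `useSyncExternalStore` hook.

```typescript
type Listener = () => void;

class SliceStore<S> {
  private state: S;
  private listeners: Listener[] = [];

  constructor(initial: S) { this.state = initial; }

  getState(): S { return this.state; }

  setState(update: (prev: S) => S): void {
    this.state = update(this.state);
    this.listeners.slice().forEach((listener) => listener());
  }

  // Calls `onChange` only when the selected slice changes by reference.
  // Returns an unsubscribe function.
  subscribeSlice<T>(select: (s: S) => T, onChange: (slice: T) => void): () => void {
    let last = select(this.state);
    const listener = () => {
      const next = select(this.state);
      if (next !== last) {
        last = next;
        onChange(next);
      }
    };
    this.listeners.push(listener);
    return () => {
      this.listeners = this.listeners.filter((l) => l !== listener);
    };
  }
}

interface EditorState {
  objects: { [id: string]: { x: number; y: number } };     // document: synced and persisted
  selectedIds: string[];                                   // selection: local, in memory
  cursors: { [userId: string]: { x: number; y: number } }; // presence: ephemeral
}

const store = new SliceStore<EditorState>({ objects: {}, selectedIds: [], cursors: {} });

let objectNotifications = 0;
let cursorNotifications = 0;
store.subscribeSlice((s) => s.objects, () => { objectNotifications++; });
const stopWatchingCursors = store.subscribeSlice((s) => s.cursors, () => { cursorNotifications++; });

// Three presence updates: only the cursor subscriber is notified.
for (let i = 0; i < 3; i++) {
  store.setState((prev) => ({ ...prev, cursors: { ...prev.cursors, bob: { x: i, y: i } } }));
}
// One document edit: only the objects subscriber is notified.
store.setState((prev) => ({ ...prev, objects: { ...prev.objects, "rect-1": { x: 10, y: 20 } } }));
```

Because updates are immutable, a cursor move leaves the `objects` reference untouched and the expensive document subscribers stay quiet. The same separation applies on the wire: document changes go through the persisted operation log, while presence can be dropped on the floor without consequence.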
6. Performance: avoiding stutter at scale
- Throttle high-frequency updates (cursor moves: 30 Hz, not on every mouse event).
- Batch operations on the wire to reduce round-trips.
- Use requestAnimationFrame for rendering.
- Avoid re-renders of components that don’t need to update.
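The first two items can be sketched as a throttle and a per-frame batcher. Both are minimal illustrations with hypothetical names (`throttle`, `OpBatcher`). The throttle takes an injected clock so the example is deterministic, and it is leading-edge only; a production version would usually also send the final position after the pointer stops.

```typescript
// Leading-edge throttle: lets a call through at most once per interval.
function throttle<A extends unknown[]>(
  fn: (...args: A) => void,
  intervalMs: number,
  now: () => number,
): (...args: A) => void {
  let lastCall = -Infinity;
  return (...args: A) => {
    const t = now();
    if (t - lastCall >= intervalMs) {
      lastCall = t;
      fn(...args);
    }
  };
}

// Collects operations and hands them over as one batch per flush.
class OpBatcher<T> {
  private pending: T[] = [];
  private sendBatch: (ops: T[]) => void;

  constructor(sendBatch: (ops: T[]) => void) { this.sendBatch = sendBatch; }

  add(op: T): void { this.pending.push(op); }

  // Call once per frame, for example from a requestAnimationFrame callback.
  flush(): void {
    if (this.pending.length === 0) return;
    const batch = this.pending;
    this.pending = [];
    this.sendBatch(batch);
  }
}

// Simulate 100 mouse events, one per millisecond, throttled to 30 Hz.
let clockMs = 0;
const cursorSends: number[] = [];
const sendCursor = throttle((x: number, y: number) => { cursorSends.push(x); }, 1000 / 30, () => clockMs);
for (clockMs = 0; clockMs < 100; clockMs++) sendCursor(clockMs, clockMs);

// Three edits made within one frame go out as a single message.
const wireMessages: string[][] = [];
const batcher = new OpBatcher<string>((ops) => { wireMessages.push(ops); });
batcher.add("move rect-1");
batcher.add("move rect-2");
batcher.add("resize rect-1");
batcher.flush();
batcher.flush(); // nothing pending, so nothing is sent
```

In this run, 100 mouse events turn into 3 cursor messages (at 0, 34, and 68 ms), and three edits turn into one wire message.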
Common candidate mistakes
- Using OT for arbitrary document structure. CRDTs are the modern choice for tree-structured documents.
- Storing document state in React component state. Document state is too large; should live in an external store with React subscribing to slices.
- SVG-everything rendering. Convenient, but per-element DOM overhead means it doesn’t scale to thousands of objects.
- Forgetting offline / reconnect handling. Real-time sync needs reliable reconnection logic.
- Treating presence and document state the same. They have different consistency and persistence requirements.
- Synchronous server validation. Edits should be optimistic: apply locally first, send to the server, then reconcile with what the server confirms.
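The optimistic pattern in the last item can be sketched as a confirmed state plus a queue of unacknowledged local edits. This is one possible arrangement, not a description of any particular product: `OptimisticDoc`, `MoveOp`, and the rejection callback are hypothetical, and each object is reduced to a single `x` coordinate to keep the example short.

```typescript
interface MoveOp { opId: string; objectId: string; x: number }
type Positions = { [objectId: string]: number };

class OptimisticDoc {
  private confirmed: Positions = {}; // state the server has acknowledged
  private pending: MoveOp[] = [];    // local edits not yet acknowledged
  private send: (op: MoveOp) => void;

  constructor(send: (op: MoveOp) => void) { this.send = send; }

  // Apply locally first, then send; the user never waits on the network.
  edit(op: MoveOp): void {
    this.pending.push(op);
    this.send(op);
  }

  // Every op relayed by the server (ours or someone else's) lands in confirmed
  // state. If it is one of ours, it also leaves the pending queue.
  onServerOp(op: MoveOp): void {
    this.confirmed = { ...this.confirmed, [op.objectId]: op.x };
    this.pending = this.pending.filter((p) => p.opId !== op.opId);
  }

  // The server refused an op (for example, a permissions failure): roll it back.
  onRejected(opId: string): void {
    this.pending = this.pending.filter((p) => p.opId !== opId);
  }

  // What the UI renders: confirmed state with pending local edits on top.
  view(): Positions {
    const result: Positions = { ...this.confirmed };
    for (const op of this.pending) result[op.objectId] = op.x;
    return result;
  }
}

const outbox: MoveOp[] = [];
const local = new OptimisticDoc((op) => { outbox.push(op); });

const myMove: MoveOp = { opId: "a1", objectId: "rect", x: 100 };
local.edit(myMove);
const afterLocalEdit = local.view()["rect"]; // visible immediately, before any round trip

local.onServerOp({ opId: "b1", objectId: "rect", x: 40 }); // a remote edit arrives first
const afterRemoteOp = local.view()["rect"]; // our pending edit still shows on top

local.onServerOp(myMove); // the server echoes our op back as the ack
const afterAck = local.view()["rect"];

local.edit({ opId: "a2", objectId: "rect", x: 250 });
local.onRejected("a2"); // rolled back to the last confirmed value
const afterRejection = local.view()["rect"];
```

The view reads 100 after the local edit, stays at 100 when the remote edit arrives and again after the ack, and returns to 100 once the rejected edit is rolled back. Rendering "confirmed plus pending" means a rollback is just removing an entry from the queue, with no need to compute inverse operations.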
Stretch topics for senior+ rounds
- How to handle a 5000-object document.
- How to support undo/redo with collaboration.
- How to handle permissions (read-only collaborators, view-only links).
- How to implement comments anchored to specific objects.
- How to support animations / interactive prototypes (Figma’s prototype mode).
- How to design the export pipeline (PNG, SVG, PDF).
What scores well
- Picking CRDTs over OT for the right reasons.
- Separating document state, presence state, and ephemeral UI state with appropriate sync models.
- Articulating the canvas rendering tradeoff (SVG vs Canvas vs WebGL) with reasoning about scale.
- Discussing the offline / reconnect story explicitly.
- Thinking about performance at the multi-thousand-object scale.
What scores poorly
- Generic “use Redux for state” without depth.
- Ignoring conflict resolution.
- Not distinguishing real-time channels (WebSocket) from data channels (HTTP/REST).
- Treating the rendering layer as if it’s just React DOM.
- Forgetting that latency tolerance for cursor sync is much tighter than for object edits.
How to prepare
- Read about CRDTs (Yjs and Automerge are the canonical libraries; their docs are educational).
- Read about Figma’s actual architecture (their engineering blog has multiple posts on multiplayer).
- Read about WebGL rendering basics if you’re targeting Figma specifically.
- Practice the design end-to-end on a whiteboard or in a doc, walking through every section and narrating the tradeoffs as you go.
Frequently Asked Questions
Is this question only at Figma?
No. Notion, Linear, design-tool startups, and other companies building collaborative editing ask versions of it. Even general frontend interviews use it as a system design probe.
Is OT outdated?
Not outdated, but less suited to arbitrary document structures. Google Docs still uses OT for text editing. CRDTs are the modern choice for richer structures.
Do I need to know WebGL?
For senior frontend roles at Figma, yes. For senior frontend roles elsewhere, the high-level tradeoff (when to use canvas vs WebGL) is enough.
How long does this discussion run?
60 minutes is typical. The discussion can comfortably fill that time at senior+ levels.
Can I use a CRDT library?
For production, yes (Yjs, Automerge). For the interview, you should articulate the underlying concepts; you do not need to implement the CRDT yourself.