Frontend System Design: Build Google Docs Real-Time Editor

“Design Google Docs” is the canonical frontend system-design interview at senior+ levels. It tests whether you can think about distributed state, real-time conflict resolution, and the tradeoffs between latency, consistency, and offline support. This guide covers what interviewers actually probe.

Clarify the scope

Don’t solve all of Docs in 45 minutes. Pin down:

  • Just text, or also images, tables, comments, suggestions?
  • How many concurrent collaborators? (Docs caps around 100; design for 10–50)
  • Offline support? (Yes for production; the interviewer may say no for scope)
  • Mobile apps included, or web only?

The document model

The document is a tree of blocks, each with inline content and marks. Selection and cursor state are first-class. Edits are expressed as transactions on the document tree, not as raw DOM mutations. (Same model as ProseMirror, Lexical, etc.)
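
As a rough illustration of that model, here is a minimal TypeScript sketch. Every name in it (Block, TextRun, Step, applyTransaction, and so on) is hypothetical and chosen for this example; it is not the ProseMirror or Lexical API, and offsets are plain UTF-16 string offsets.

```typescript
// Hypothetical minimal document model: a tree of blocks holding marked text runs.
type Mark = "bold" | "italic" | "link";
interface TextRun { text: string; marks: Mark[] }
interface Block {
  id: string;
  type: "paragraph" | "heading" | "list_item";
  content: TextRun[];
  children: Block[];
}
// Selection lives in editor state, not in the DOM.
interface EditorSelection {
  anchor: { blockId: string; offset: number };
  head: { blockId: string; offset: number };
}
interface EditorState { doc: Block[]; selection: EditorSelection }

// A transaction is a list of steps applied together.
type Step =
  | { kind: "insertText"; blockId: string; offset: number; text: string }
  | { kind: "deleteText"; blockId: string; offset: number; length: number };
interface Transaction { steps: Step[] }

type Cell = { ch: string; marks: Mark[] };

// Expand runs into one cell per character so edits become simple splices.
function toCells(runs: TextRun[]): Cell[] {
  const cells: Cell[] = [];
  for (const run of runs) {
    for (const ch of run.text.split("")) cells.push({ ch, marks: run.marks });
  }
  return cells;
}

// Merge adjacent cells with identical marks back into runs.
function toRuns(cells: Cell[]): TextRun[] {
  const runs: TextRun[] = [];
  for (const cell of cells) {
    const last = runs[runs.length - 1];
    if (last && last.marks.join() === cell.marks.join()) last.text += cell.ch;
    else runs.push({ text: cell.ch, marks: cell.marks });
  }
  return runs;
}

function blockText(block: Block): string {
  return block.content.map((run) => run.text).join("");
}

// Apply one step, returning new blocks and leaving the old tree untouched.
function applyStep(blocks: Block[], step: Step): Block[] {
  return blocks.map((block) => {
    if (block.id !== step.blockId) {
      return { ...block, children: applyStep(block.children, step) };
    }
    const cells = toCells(block.content);
    if (step.kind === "insertText") {
      // Inserted text inherits the marks of the character before it.
      const prev = cells[step.offset - 1];
      const marks: Mark[] = prev ? prev.marks : [];
      const inserted: Cell[] = step.text.split("").map((ch) => ({ ch, marks }));
      cells.splice(step.offset, 0, ...inserted);
    } else {
      cells.splice(step.offset, step.length);
    }
    return { ...block, content: toRuns(cells) };
  });
}

function applyTransaction(state: EditorState, tr: Transaction): EditorState {
  let doc = state.doc;
  for (const step of tr.steps) doc = applyStep(doc, step);
  return { ...state, doc };
}
```

The point to make in the interview is that the state is immutable and every change, local or remote, is a transaction value that can be logged, sent over the wire, or inverted.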

The conflict-resolution choice

Two real approaches:

Operational Transform (OT)

  • Each edit is an operation: insert at position 10, delete 3 chars at position 25
  • When op A and op B happen concurrently, the server transforms B against A so it makes sense in the post-A world
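  • The approach Google Docs is built on. The transformation logic is intricate and lives mostly on the server.
  • In practice relies on a central server to impose a single operation order; peer-to-peer OT variants exist but are notoriously hard to get right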
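
To make "transform B against A" concrete, here is a minimal sketch for plain-text inserts and deletes. The Op shape, the client tie-break rule, and the function names are assumptions for illustration; production OT also has to handle rich-text attributes and server-side revision numbers, which are left out here.

```typescript
// Hypothetical plain-text operations; `client` is only used to break ties.
type Op =
  | { type: "insert"; pos: number; text: string; client: string }
  | { type: "delete"; pos: number; length: number; client: string };

function applyOp(doc: string, op: Op): string {
  if (op.type === "insert") return doc.slice(0, op.pos) + op.text + doc.slice(op.pos);
  return doc.slice(0, op.pos) + doc.slice(op.pos + op.length);
}

function applyAll(doc: string, ops: Op[]): string {
  return ops.reduce(applyOp, doc);
}

// Rewrite `op` so it can be applied AFTER `against` has already been applied.
// Returns a list because a delete may be split in two by a concurrent insert.
function transform(op: Op, against: Op): Op[] {
  if (against.type === "insert") {
    const n = against.text.length;
    if (op.type === "insert") {
      // Same position: the lower client id goes first on every replica.
      const againstFirst =
        against.pos < op.pos || (against.pos === op.pos && against.client < op.client);
      return [againstFirst ? { ...op, pos: op.pos + n } : op];
    }
    const end = op.pos + op.length;
    if (against.pos <= op.pos) return [{ ...op, pos: op.pos + n }];
    if (against.pos >= end) return [op];
    // The insert landed inside the deleted range: delete around it.
    return [
      { ...op, length: against.pos - op.pos },
      { ...op, pos: op.pos + n, length: end - against.pos },
    ];
  }
  const start = against.pos;
  const stop = against.pos + against.length;
  if (op.type === "insert") {
    if (op.pos <= start) return [op];
    if (op.pos >= stop) return [{ ...op, pos: op.pos - against.length }];
    return [{ ...op, pos: start }]; // insertion point was deleted; collapse to its start
  }
  const end = op.pos + op.length;
  if (end <= start) return [op];
  if (op.pos >= stop) return [{ ...op, pos: op.pos - against.length }];
  // Overlapping deletes: only remove what the other delete left behind.
  const overlap = Math.min(end, stop) - Math.max(op.pos, start);
  const length = op.length - overlap;
  return length > 0 ? [{ ...op, pos: Math.min(op.pos, start), length }] : [];
}
```

The property worth stating out loud is convergence: applying A then transform(B, A) must produce the same document as applying B then transform(A, B).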

CRDT (Conflict-free Replicated Data Type)

  • Each character (or block) has a unique stable ID; ordering is preserved by the data structure itself
  • Operations commute: applying ops in any order gives the same result
  • Modern default. Yjs and Automerge are the dominant libraries.
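  • Works peer-to-peer; supports offline naturally; replicas converge automatically, though the merged text can still surprise users when two people rewrite the same sentence concurrently
  • Tradeoff: per-character metadata can be large, and deleted content leaves tombstones behind, so “garbage collection” is needed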
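
The commutativity claim can be demonstrated with a deliberately simplified sequence CRDT. This sketch orders characters by a fractional position plus a client tie-break, which is an assumption made for brevity: real libraries such as Yjs and Automerge use more sophisticated schemes, and plain floating-point positions run out of precision after many nested inserts. All names here are hypothetical.

```typescript
// Hypothetical position-based sequence CRDT (not the algorithm Yjs uses).
interface CharId { client: string; counter: number }
interface CharItem { id: CharId; position: number; value: string }

type CrdtOp =
  | { kind: "insert"; item: CharItem }
  | { kind: "delete"; id: CharId };

const keyOf = (id: CharId): string => `${id.client}:${id.counter}`;

// Total order: position first, then client id, then counter.
function compareItems(a: CharItem, b: CharItem): number {
  if (a.position !== b.position) return a.position - b.position;
  if (a.id.client !== b.id.client) return a.id.client < b.id.client ? -1 : 1;
  return a.id.counter - b.id.counter;
}

// Pick a position strictly between two neighbours (document bounds are 0 and 1).
function positionBetween(left: CharItem | null, right: CharItem | null): number {
  const lo = left ? left.position : 0;
  const hi = right ? right.position : 1;
  return (lo + hi) / 2;
}

class SequenceCrdt {
  private items = new Map<string, CharItem>();
  private tombstones = new Set<string>();

  // Both branches are set additions, so ops commute and are idempotent.
  apply(op: CrdtOp): void {
    if (op.kind === "insert") this.items.set(keyOf(op.item.id), op.item);
    else this.tombstones.add(keyOf(op.id));
  }

  text(): string {
    const visible: CharItem[] = [];
    this.items.forEach((item, key) => {
      if (!this.tombstones.has(key)) visible.push(item);
    });
    return visible.sort(compareItems).map((item) => item.value).join("");
  }
}
```

Note that a delete can arrive before the insert it refers to and still work, because the tombstone is remembered. That same tombstone set is the metadata that grows without garbage collection.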

For an interview in 2026, recommend CRDT with Yjs and explain why: simpler offline story, easier to reason about, less server logic. Note that OT is not obsolete: it still runs large production systems, Google Docs among them.

The transport layer

WebSocket connection per client to a session server. The server fans out updates to all participants in a doc. Snapshot-and-update protocol: client gets the latest state on connect, then receives delta ops. On disconnect, queue ops locally; on reconnect, send queued ops.
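
A minimal sketch of the client side of that protocol follows. The message shapes, the Channel interface, and the SyncClient class are assumptions for illustration; in a browser, handleMessage and handleClose would be wired to a WebSocket's message and close events, and the snapshot would more likely be a compacted state than a raw op list.

```typescript
// Hypothetical wire messages: one snapshot on connect, then individual updates.
type ServerMessage<Op> =
  | { type: "snapshot"; ops: Op[] }
  | { type: "update"; op: Op };

// Anything that can send an op to the server (a thin WebSocket wrapper in practice).
interface Channel<Op> {
  send(op: Op): void;
}

class SyncClient<Op> {
  private channel: Channel<Op>;
  private onRemote: (op: Op) => void;
  private ready = false;
  private queue: Op[] = [];

  constructor(channel: Channel<Op>, onRemote: (op: Op) => void) {
    this.channel = channel;
    this.onRemote = onRemote;
  }

  // Local edit: send right away when synced, otherwise hold it locally.
  submit(op: Op): void {
    if (this.ready) this.channel.send(op);
    else this.queue.push(op);
  }

  handleMessage(message: ServerMessage<Op>): void {
    if (message.type === "snapshot") {
      message.ops.forEach((op) => this.onRemote(op));
      this.ready = true;
      this.flush(); // replay edits made while disconnected
    } else {
      this.onRemote(message.op);
    }
  }

  handleClose(): void {
    this.ready = false;
  }

  pendingCount(): number {
    return this.queue.length;
  }

  private flush(): void {
    const queued = this.queue;
    this.queue = [];
    for (const op of queued) this.channel.send(op);
  }
}
```

Waiting for the snapshot before flushing keeps the client from sending edits against a state it has not caught up to yet.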

Presence (cursors, selections, names)

Presence is ephemeral state — separate from the document. Each client publishes its cursor position and metadata over the same channel. Other clients render colored carets and selection ranges. No persistence needed; on disconnect, presence vanishes.
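
One way to keep presence apart from the document is a small in-memory store keyed by client ID, sketched below. The field names and the PresenceStore class are hypothetical, and the timeout-based prune is an added assumption to cover clients that vanish without a clean disconnect.

```typescript
// Hypothetical presence payload published by each client.
interface PresenceState {
  name: string;
  color: string;
  anchor: number; // selection start
  head: number;   // selection end (equal to anchor for a bare caret)
}

interface PeerEntry {
  state: PresenceState;
  lastSeen: number; // timestamp in milliseconds
}

class PresenceStore {
  private peers = new Map<string, PeerEntry>();

  // Called for every presence message; never written to the document or to disk.
  update(clientId: string, state: PresenceState, now: number): void {
    this.peers.set(clientId, { state, lastSeen: now });
  }

  // Called on an explicit disconnect.
  remove(clientId: string): void {
    this.peers.delete(clientId);
  }

  // Drop peers that have been silent for longer than timeoutMs.
  prune(now: number, timeoutMs: number): void {
    const stale: string[] = [];
    this.peers.forEach((entry, id) => {
      if (now - entry.lastSeen > timeoutMs) stale.push(id);
    });
    stale.forEach((id) => this.peers.delete(id));
  }

  // Everyone except the local user, ready to render as carets.
  visiblePeers(selfId: string): PresenceState[] {
    const result: PresenceState[] = [];
    this.peers.forEach((entry, id) => {
      if (id !== selfId) result.push(entry.state);
    });
    return result;
  }
}
```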

The rendering layer

  • The editor (Lexical/ProseMirror/CodeMirror/custom) renders the document tree to the DOM
  • Edits update the document state; the editor diffs and patches the DOM
  • Remote ops are applied through the same path so local and remote edits look the same to the renderer

Offline mode

Persist the CRDT state in IndexedDB. On reconnect, sync local ops with the server (the CRDT handles the merge). The merge always converges, but the result can surprise someone who edited offline for a long time, so surface it gracefully. Docs shows a “you were offline” notice; you should mention this UX detail.
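
The reconnect sync can be sketched with state vectors: each side summarises what it has seen per client, and only the difference is exchanged. This is a simplified illustration with hypothetical names (LoggedOp, stateVectorOf, missingOps); it keeps the op log in memory where a browser client would read it from IndexedDB, and it assumes each client's sequence numbers arrive without gaps. Yjs exposes a comparable state-vector mechanism in binary form.

```typescript
// Hypothetical op log entry: `seq` increases by one per op for each client.
interface LoggedOp {
  client: string;
  seq: number;
  payload: string;
}

// Highest sequence number seen from each client.
type StateVector = Map<string, number>;

function stateVectorOf(log: LoggedOp[]): StateVector {
  const vector: StateVector = new Map();
  for (const op of log) {
    vector.set(op.client, Math.max(vector.get(op.client) || 0, op.seq));
  }
  return vector;
}

// Ops in `log` that the other side, described by `remote`, has not seen yet.
function missingOps(log: LoggedOp[], remote: StateVector): LoggedOp[] {
  return log.filter((op) => op.seq > (remote.get(op.client) || 0));
}
```

On reconnect the client sends its vector, the server replies with missingOps(serverLog, clientVector), and the client uploads missingOps(localLog, serverVector). Neither side resends the whole document.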

What interviewers reward

  • Naming OT vs CRDT and explaining the tradeoff
  • Recognizing presence as separate from document state
  • Discussing reconnect and offline replay
  • Mentioning IndexedDB for local persistence
  • Discussing scaling: connection pooling, sharding sessions across servers
  • Touching on undo/redo as transaction history (and the harder question of “what does undo mean in a multi-user doc?”)

Common omissions

  • Forgetting that selection state must survive remote edits arriving
  • Mixing presence with persistent state
  • No story for paste, drag-drop, or images
  • No discussion of access control, comments, suggestions
  • Treating it as a backend system-design question instead of frontend
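
The first omission in the list above, selection surviving remote edits, comes down to mapping positions through each incoming change. Here is a minimal sketch over flat text offsets; the RemoteEdit shape and function names are hypothetical, and the choice to leave the caret in place when a remote insert lands exactly on it is one reasonable convention, not the only one.

```typescript
// Hypothetical description of a remote change in flat text offsets.
type RemoteEdit =
  | { type: "insert"; pos: number; length: number }
  | { type: "delete"; pos: number; length: number };

// Where does a local position end up after the remote edit is applied?
function mapPosition(pos: number, edit: RemoteEdit): number {
  if (edit.type === "insert") {
    // Inserts strictly before the caret push it right; inserts at or after it do not.
    return edit.pos < pos ? pos + edit.length : pos;
  }
  if (pos <= edit.pos) return pos;
  // Inside the deleted range the caret collapses to the start of the range;
  // past it, the caret shifts left by the deleted length.
  return Math.max(edit.pos, pos - edit.length);
}

interface TextSelection {
  anchor: number;
  head: number;
}

function mapSelection(selection: TextSelection, edit: RemoteEdit): TextSelection {
  return {
    anchor: mapPosition(selection.anchor, edit),
    head: mapPosition(selection.head, edit),
  };
}
```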

Frequently Asked Questions

Yjs or Automerge?

Yjs is the production default in 2026 (used by JupyterLab, Notion-style editors, and many other tools). Automerge is the more academically rigorous project but has historically carried more memory and storage overhead for the same content. Pick Yjs unless you have a specific reason.

What about undo in a multi-user doc?

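“Local undo only” is the practical answer: undo your own edits and leave everyone else’s untouched. CRDTs make this tricky because your edits may be interleaved with remote ones; Yjs provides an UndoManager that can be restricted to changes from tracked origins, which is how you scope undo to the local client.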
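
The idea of scoping undo by origin can be sketched without any library. The names below (Change, ChangeOrigin, LocalUndoStack) are hypothetical and this is not the Yjs UndoManager API. The sketch assumes changes reference stable element IDs, as in a CRDT, so an inverse stays valid no matter what remote edits arrived in between.

```typescript
type ChangeOrigin = "local" | "remote";

// Hypothetical change record: refers to an element by its stable ID.
interface Change {
  kind: "insert" | "delete";
  elementId: string;
}

class LocalUndoStack {
  private stack: Change[] = [];

  // Every change passes through here, but only local ones become undoable.
  record(change: Change, origin: ChangeOrigin): void {
    if (origin === "local") this.stack.push(change);
  }

  // Returns the inverse of the most recent local change, or null if none remain.
  // The caller applies the inverse as a new change and broadcasts it like any other edit.
  undo(): Change | null {
    const last = this.stack.pop();
    if (!last) return null;
    return {
      kind: last.kind === "insert" ? "delete" : "insert",
      elementId: last.elementId,
    };
  }
}
```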

How do I handle 1000 concurrent editors?

You don’t — most products cap at 50–100. Beyond that, switch to a different model (read-mostly with suggested edits, or chunked editing).
