System Design Interview: Real-Time Collaboration (Google Docs)

The Core Challenge

Real-time collaborative editing requires multiple users to edit the same document simultaneously and see each other's changes within 100-200ms, with every client eventually converging to the same document state. The hard part is when two users edit the same position at the same time: their changes must be merged without losing either edit and without corrupting the document. This question is asked at Google, Figma, Notion, and any company building collaborative tools.

Operational Transformation (OT)

Google Docs uses Operational Transformation. Every edit is an operation: Insert("a", position=5) or Delete(position=5, length=1). When two users make concurrent operations, OT transforms one operation against the other so both can be applied in any order to produce the same result.


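from dataclasses import dataclass

@dataclass(frozen=True)
class Insert:
    text: str
    pos: int

# User A inserts "X" at position 3
op_a = Insert("X", pos=3)

# User B (concurrently) inserts "Y" at position 5
op_b = Insert("Y", pos=5)

# If A is applied first, B must be transformed: A inserted one character
# before B's position, so B shifts right by len("X") == 1.
op_b_prime = Insert("Y", pos=op_b.pos + len(op_a.text))  # Insert("Y", pos=6)

# If B is applied first, A needs no change because B inserted after A's position.
# Either order leaves "X" at index 3 and "Y" at index 6.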

In practice, OT systems such as Google Docs rely on a central server to establish a total ordering of operations (decentralized OT variants exist, but they are much harder to get right). The server is the source of truth: all clients send operations to the server; the server orders them, transforms each incoming operation against the concurrent operations it has already committed, applies the result, and broadcasts the transformed operation to all other clients. Clients apply operations optimistically (show immediately in their local UI) then reconcile when they receive the server-confirmed operations.
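The transformation step can be sketched as a small function. This is a minimal illustration, not Google Docs' actual algorithm: it assumes single-character deletes, and the `site` field, `apply`, and `transform` are hypothetical names introduced here. The `site` ID only exists to break ties when two inserts target the same position.

```python
from dataclasses import dataclass, replace
from typing import Optional, Union


@dataclass(frozen=True)
class Insert:
    text: str
    pos: int
    site: str = ""  # hypothetical client ID, used only to break ties


@dataclass(frozen=True)
class Delete:
    pos: int  # deletes exactly one character, to keep the sketch small


Op = Union[Insert, Delete]


def apply(doc: str, op: Optional[Op]) -> str:
    if op is None:  # the operation was cancelled by a transform
        return doc
    if isinstance(op, Insert):
        return doc[:op.pos] + op.text + doc[op.pos:]
    return doc[:op.pos] + doc[op.pos + 1:]


def transform(op: Op, against: Op) -> Optional[Op]:
    """Rewrite `op` so it can be applied after `against` has been applied."""
    if isinstance(against, Insert):
        if isinstance(op, Insert):
            # Equal positions: the insert from the lower site ID goes first.
            shifts = against.pos < op.pos or (
                against.pos == op.pos and against.site < op.site
            )
        else:
            shifts = against.pos <= op.pos
        return replace(op, pos=op.pos + len(against.text)) if shifts else op
    # From here on, `against` is a Delete.
    if isinstance(op, Delete) and against.pos == op.pos:
        return None  # both users deleted the same character
    return replace(op, pos=op.pos - 1) if against.pos < op.pos else op


doc = "abcdefgh"
op_a = Insert("X", pos=3, site="a")
op_b = Insert("Y", pos=5, site="b")
a_first = apply(apply(doc, op_a), transform(op_b, op_a))
b_first = apply(apply(doc, op_b), transform(op_a, op_b))
```

Both application orders end in the same string, which is the convergence property the server relies on. Real editors also need range deletes and formatting operations, and each extra operation type adds more transform cases.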

CRDTs: The Alternative Approach

Conflict-free Replicated Data Types (CRDTs) are a mathematically defined family of data structures that always merge consistently without a central server. Figma uses a CRDT-inspired approach for its collaborative canvas. For text: each character is assigned a globally unique ID (for example, a replica ID plus a logical timestamp). The ordering of characters is determined by their IDs, not their positions, so two concurrent inserts can always be merged deterministically. CRDTs enable peer-to-peer collaboration and work naturally offline: edits accumulate locally and sync when reconnected. The tradeoff: CRDTs use more memory (metadata per character, including markers for deleted characters) and are harder to implement correctly for complex data types like nested structures.
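The ID-based ordering can be illustrated with a small sequence CRDT. This is a sketch in the style of a replicated growable array, not Figma's implementation. The class name `TextReplica` and its methods are hypothetical. Each character records the ID of the character it was inserted after, and siblings are ordered by ID, newest first.

```python
from typing import Dict, List, Optional, Set, Tuple

CharId = Tuple[int, str]  # (logical clock, replica name)


class TextReplica:
    def __init__(self, name: str) -> None:
        self.name = name
        self.clock = 0
        # id -> (id of the character it was inserted after, the character)
        self.nodes: Dict[CharId, Tuple[Optional[CharId], str]] = {}
        self.deleted: Set[CharId] = set()  # tombstones; never removed

    def _visible(self) -> List[CharId]:
        children: Dict[Optional[CharId], List[CharId]] = {}
        for cid, (parent, _) in self.nodes.items():
            children.setdefault(parent, []).append(cid)
        ordered: List[CharId] = []
        stack = sorted(children.get(None, []))
        while stack:
            cid = stack.pop()  # largest ID first, so newer siblings lead
            ordered.append(cid)
            stack.extend(sorted(children.get(cid, [])))
        return [cid for cid in ordered if cid not in self.deleted]

    def text(self) -> str:
        return "".join(self.nodes[cid][1] for cid in self._visible())

    def insert(self, index: int, ch: str) -> None:
        visible = self._visible()
        parent = visible[index - 1] if index > 0 else None
        self.clock += 1
        self.nodes[(self.clock, self.name)] = (parent, ch)

    def delete(self, index: int) -> None:
        self.deleted.add(self._visible()[index])

    def merge(self, other: "TextReplica") -> None:
        # Merging is a set union, so it is order-independent and idempotent.
        self.nodes.update(other.nodes)
        self.deleted |= other.deleted
        self.clock = max(self.clock, other.clock)


a, b = TextReplica("a"), TextReplica("b")
a.insert(0, "a")
a.insert(1, "b")
b.merge(a)
a.insert(1, "X")  # concurrent with the next line
b.insert(1, "Y")
a.merge(b)
b.merge(a)
```

After the two merges both replicas render the same text with no server involved. The `deleted` set shows the memory cost mentioned above: deleted characters stay in `nodes` as tombstones.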

Architecture

WebSocket Connection Management

Each document editing session opens a WebSocket connection to a document server. The document server holds the in-memory document state and the operation log. Users editing the same document connect to the same server (sticky sessions via consistent hashing on document ID). When the server receives an operation from a client: (1) Transform it against any concurrent operations committed since the client last synced, so its positions refer to the current document. (2) Persist the transformed operation to the operation log (append-only, durable). (3) Apply it to the in-memory document. (4) Broadcast the confirmed (possibly transformed) operation to all other connected clients.
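A minimal sketch of that server loop follows, under several assumptions: operations are inserts only, each client has at most one unacknowledged operation in flight, and an in-memory list stands in for the durable log. `DocumentServer`, `base_rev`, and the `clients` callback map are hypothetical names. The sketch transforms an incoming operation before applying it, which the server has to do so that positions refer to the current document.

```python
from dataclasses import dataclass, replace
from typing import Callable, Dict, List


@dataclass(frozen=True)
class Insert:
    text: str
    pos: int
    client: str
    base_rev: int  # how many committed operations the sender had seen


def transform(op: Insert, against: Insert) -> Insert:
    """Shift `op` right if `against` inserted at or before it (ties by client ID)."""
    if against.pos < op.pos or (
        against.pos == op.pos and against.client < op.client
    ):
        return replace(op, pos=op.pos + len(against.text))
    return op


class DocumentServer:
    def __init__(self, text: str = "") -> None:
        self.text = text
        self.log: List[Insert] = []  # stand-in for the durable operation log
        self.clients: Dict[str, Callable[[Insert], None]] = {}

    def receive(self, op: Insert) -> Insert:
        # 1. Transform against everything committed since the client last synced.
        for committed in self.log[op.base_rev:]:
            op = transform(op, committed)
        op = replace(op, base_rev=len(self.log))
        # 2. Persist to the append-only log.
        self.log.append(op)
        # 3. Apply to the in-memory document.
        self.text = self.text[:op.pos] + op.text + self.text[op.pos:]
        # 4. Broadcast the confirmed operation to every other client.
        for client_id, send in self.clients.items():
            if client_id != op.client:
                send(op)
        return op


server = DocumentServer("abcdefgh")
seen_by_b: List[Insert] = []
server.clients["a"] = lambda op: None
server.clients["b"] = seen_by_b.append
server.receive(Insert("X", 3, "a", base_rev=0))
committed = server.receive(Insert("Y", 5, "b", base_rev=0))
```

Here client B's insert was written against revision 0, so the server shifts it past A's insert before committing it.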

Persistence and Document Snapshots

The operation log grows unboundedly. Bound the replay cost with periodic snapshots: every 1000 operations, persist a full snapshot of the document state. On server restart or when a new client joins, load the latest snapshot and replay only the operations since that snapshot, which is at most 999 operations at this interval. Snapshots are stored in object storage (S3). The operation log is stored in a database ordered by (document_id, operation_sequence_number). This enables: (1) Fast cold start, (2) Full edit history for version history features, (3) Point-in-time document recovery.
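The snapshot-plus-replay pattern can be sketched in memory as follows. Assume a dict stands in for object storage, a list stands in for the database table, and operations are simple (position, text) inserts. `DocumentStore` and its methods are hypothetical names.

```python
from typing import Dict, List, Optional, Tuple

SNAPSHOT_EVERY = 1000  # the interval used in the text


class DocumentStore:
    def __init__(self, snapshot_every: int = SNAPSHOT_EVERY) -> None:
        self.snapshot_every = snapshot_every
        # Operation with sequence number n lives at index n - 1.
        self.log: List[Tuple[int, str]] = []
        # Sequence number -> full document text at that point.
        self.snapshots: Dict[int, str] = {0: ""}

    def append(self, pos: int, text: str) -> int:
        self.log.append((pos, text))
        seq = len(self.log)
        if seq % self.snapshot_every == 0:
            self.snapshots[seq] = self.load(seq)
        return seq

    def load(self, upto: Optional[int] = None) -> str:
        """Rebuild the document as of sequence number `upto` (default: latest)."""
        upto = len(self.log) if upto is None else upto
        base = max(seq for seq in self.snapshots if seq <= upto)
        doc = self.snapshots[base]
        for pos, text in self.log[base:upto]:  # replay only the tail
            doc = doc[:pos] + text + doc[pos:]
        return doc


store = DocumentStore(snapshot_every=3)
for i, ch in enumerate("abcdefg"):
    store.append(i, ch)
```

Calling `store.load()` starts from the snapshot at sequence 6 and replays one operation, while `store.load(4)` gives a point-in-time view built from the snapshot at sequence 3.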

Presence and Cursor Sharing

Real-time presence (showing collaborators' names and cursor positions) is a separate, ephemeral system with no durability requirement. Use a pub/sub channel per document. Each client broadcasts its cursor position every 100ms. The server fans out to all other clients in the document session. Cursor positions expire after 5 seconds if not refreshed (handles disconnection). This traffic pattern (fan-out, high frequency, no persistence) is ideal for Redis pub/sub or a dedicated WebSocket message broker, separate from the durable operation pipeline.
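The expiry rule can be sketched with an in-memory registry. This stands in for whatever pub/sub layer carries the updates and is not a Redis client. `PresenceRegistry` is a hypothetical name, and the `now` parameter exists only so the behaviour can be exercised without waiting.

```python
import time
from typing import Dict, Optional, Tuple

CURSOR_TTL_SECONDS = 5.0  # the expiry window used in the text


class PresenceRegistry:
    def __init__(self, ttl: float = CURSOR_TTL_SECONDS) -> None:
        self.ttl = ttl
        # user -> (cursor position, time of last update)
        self._cursors: Dict[str, Tuple[int, float]] = {}

    def update(self, user: str, position: int, now: Optional[float] = None) -> None:
        now = time.monotonic() if now is None else now
        self._cursors[user] = (position, now)

    def active(self, now: Optional[float] = None) -> Dict[str, int]:
        """Drop cursors not refreshed within the TTL and return the rest."""
        now = time.monotonic() if now is None else now
        self._cursors = {
            user: (pos, seen)
            for user, (pos, seen) in self._cursors.items()
            if now - seen < self.ttl
        }
        return {user: pos for user, (pos, _) in self._cursors.items()}


presence = PresenceRegistry()
presence.update("alice", 12, now=0.0)
presence.update("bob", 40, now=3.0)
```

With these updates, `presence.active(now=4.0)` still lists both users, while `presence.active(now=6.0)` lists only bob because alice has not refreshed for more than 5 seconds. Nothing is written to durable storage, which matches the ephemeral nature of presence.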

Offline Editing

When a client loses connectivity, it continues accepting local edits. Operations are queued locally (IndexedDB in the browser). On reconnection: (1) Client sends all queued operations to the server with the last-known server state ID. (2) Server fetches all operations that were committed while the client was offline. (3) Server transforms the client operations against those missed operations and applies them. (4) Server sends all missed operations to the client. (5) Client transforms the missed operations against its own queued operations and applies them to reconcile. This is a full OT merge: the same algorithm as concurrent editing, just covering a longer offline period.
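The merge on reconnection can be sketched as a rebase of two operation lists. This is an illustration under simplifying assumptions (inserts only, ties broken by client ID), and `rebase`, `transform`, and `apply_all` are hypothetical names. For brevity one function computes both halves: the queued operations as the server would apply them, and the missed operations as the client would apply them.

```python
from dataclasses import dataclass, replace
from typing import List, Tuple


@dataclass(frozen=True)
class Insert:
    text: str
    pos: int
    client: str


def transform(op: Insert, against: Insert) -> Insert:
    """Shift `op` right if `against` inserted at or before it (ties by client ID)."""
    if against.pos < op.pos or (
        against.pos == op.pos and against.client < op.client
    ):
        return replace(op, pos=op.pos + len(against.text))
    return op


def apply_all(doc: str, ops: List[Insert]) -> str:
    for op in ops:
        doc = doc[:op.pos] + op.text + doc[op.pos:]
    return doc


def rebase(
    queued: List[Insert], missed: List[Insert]
) -> Tuple[List[Insert], List[Insert]]:
    """Return (queued ops to run after `missed`, missed ops to run after `queued`)."""
    missed = list(missed)
    rebased: List[Insert] = []
    for op in queued:
        next_missed: List[Insert] = []
        for server_op in missed:
            # Transform the pair against each other, then carry both forward.
            next_missed.append(transform(server_op, op))
            op = transform(op, server_op)
        rebased.append(op)
        missed = next_missed
    return rebased, missed


base = "abcdef"
missed = [Insert("1", 0, "s"), Insert("2", 4, "s")]   # committed while offline
queued = [Insert("X", 2, "c"), Insert("Y", 7, "c")]   # made locally while offline
for_server, for_client = rebase(queued, missed)
server_doc = apply_all(apply_all(base, missed), for_server)
client_doc = apply_all(apply_all(base, queued), for_client)
```

Both sides end with the same document. The nested loop performs one transform pair per (queued, missed) combination, so the work grows with the product of the two list lengths.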

Interview Tips

Mention both OT (the Google Docs approach) and CRDTs (Figma's approach is CRDT-inspired): this shows breadth
The central server for OT is not a weakness — it simplifies correctness at the cost of requiring connectivity
Separate the concerns: operation sync (durable, ordered) vs presence (ephemeral, high-frequency)
Snapshots + operation log is the standard persistence pattern — interviewers expect this
Offline support differentiates a senior answer

Frequently Asked Questions

What is Operational Transformation and how does Google Docs use it?

Operational Transformation (OT) is an algorithm for real-time collaborative editing. When two users make concurrent edits (e.g., User A inserts "X" at position 3 while User B inserts "Y" at position 5), OT transforms operations against each other so they can be applied in any order and still produce the same result. Google Docs uses a central server to establish a total ordering of operations. Each client sends operations to the server; the server orders them, transforms each against concurrently received operations, applies the result, and broadcasts the (possibly transformed) operation to all other clients. Clients apply operations optimistically to their local state for instant UI response, then reconcile when the server confirms. This architecture uses the server as a serialization point; it is not peer-to-peer.

What is the difference between OT and CRDTs for collaborative editing?

Operational Transformation (OT), as used by Google Docs, relies on a central server to serialize concurrent operations. The server establishes a global operation order; clients transform their pending operations against server-confirmed operations to stay in sync. OT is proven in production, but clients must reach the server to exchange edits, so offline work has to be queued and merged on reconnection. CRDTs (Conflict-free Replicated Data Types) are mathematically defined data structures that always merge consistently without central coordination. Each character in a CRDT text document has a globally unique identifier; ordering is determined by IDs rather than positions. CRDTs enable peer-to-peer collaboration and natural offline editing: edits accumulate locally and sync when reconnected. Figma uses a CRDT-inspired approach. The tradeoff: CRDTs use more memory (per-character metadata) and are more complex to implement correctly for rich document structures.

How do you handle offline editing in a collaborative document editor?

When a client loses connectivity, it continues accepting local edits. Operations are queued in browser storage (IndexedDB). On reconnection: the client sends all queued operations with its last-known server sequence number. The server fetches all operations that occurred while the client was offline. It transforms the client operations against the missed server operations (same OT algorithm used for concurrent edits), applies them, and broadcasts the result. The client receives all missed server operations, transforms them against its own queued operations, and applies them to reconcile its local state. All clients converge on the same document, because OT handles the merge regardless of the gap duration. For very long offline periods (days of edits from hundreds of collaborators), the merge can involve thousands of transformation steps. The same transformation rules apply, and they converge as long as the transform functions themselves are correct.
