Low-Level Design: Collaborative Document Editor — Operational Transform, CRDT, and Conflict Resolution

Core Requirements

A collaborative document editor lets multiple users edit the same document simultaneously, with changes reflected on all clients in near real time. Key challenges: (1) Conflict resolution: two users edit the same part of the document at the same time. (2) Consistency: all clients converge to the same document state. (3) Offline support: users can edit while disconnected, and changes sync on reconnect. (4) Latency: local edits feel instantaneous (optimistic updates), while remote edits arrive with a slight delay. Two main approaches: Operational Transformation (OT), used by Google Docs, and Conflict-free Replicated Data Types (CRDTs), used by libraries such as Yjs and Automerge, in CRDT-inspired form by Figma and, reportedly, in parts of Notion.

Data Model

Document: doc_id, title, owner_id, created_at, updated_at, version (monotonic counter).

Operation: op_id (UUID), doc_id, user_id, op_type (INSERT, DELETE, RETAIN), position, content (for INSERT), length (for DELETE), parent_version (the document version this op was based on), applied_at.

Snapshot: snapshot_id, doc_id, version, content (full document text at that version), created_at.

Collaborator: user_id, doc_id, permission (VIEW, COMMENT, EDIT), cursor_position, last_seen_at.

Revision history: the sequence of operations, which can reconstruct any historical version.
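
As a rough illustration, the Document and Operation entities above could be sketched as Python dataclasses. The names `OpType` and `apply_op` are hypothetical, and keeping `content` directly on `Document` is an assumption made here for brevity (the data model stores full text in Snapshot).

```python
from dataclasses import dataclass, field
from enum import Enum
import uuid


class OpType(Enum):
    INSERT = "INSERT"
    DELETE = "DELETE"
    RETAIN = "RETAIN"


@dataclass
class Operation:
    doc_id: str
    user_id: str
    op_type: OpType
    position: int
    parent_version: int   # document version this op was based on
    content: str = ""     # only meaningful for INSERT
    length: int = 0       # only meaningful for DELETE
    op_id: str = field(default_factory=lambda: str(uuid.uuid4()))


@dataclass
class Document:
    doc_id: str
    title: str
    owner_id: str
    content: str = ""     # assumption: in-memory text kept next to the metadata
    version: int = 0      # monotonic counter, bumped once per applied op


def apply_op(doc: Document, op: Operation) -> None:
    """Apply one already-transformed operation and advance the version."""
    if op.op_type is OpType.INSERT:
        doc.content = doc.content[:op.position] + op.content + doc.content[op.position:]
    elif op.op_type is OpType.DELETE:
        doc.content = doc.content[:op.position] + doc.content[op.position + op.length:]
    # RETAIN leaves the text unchanged.
    doc.version += 1


doc = Document("d1", "Notes", "u1", content="Hello")
apply_op(doc, Operation("d1", "u1", OpType.INSERT, 5, 0, content=" World"))
apply_op(doc, Operation("d1", "u2", OpType.DELETE, 0, 1, length=1))
print(doc.content, doc.version)
```

Bumping `version` once per applied operation is what lets a later operation's `parent_version` identify exactly which operations it has not yet seen.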

Operational Transformation

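OT resolves conflicts by transforming each operation against the operations that were applied concurrently with it. Example: the document is “Hello”. User A inserts “ World” at position 5. Concurrently, User B deletes “H” at position 0. Both operations were written against “Hello”. If the server applies B’s delete first, the text becomes “ello”, and A’s untransformed insert at position 5 now points past the end of the document. With OT, A’s insert is transformed against B’s delete: one character before the insertion point was removed, so the position shifts from 5 to 4 and the result is “ello World”. In the opposite order, B’s delete is transformed against A’s insert; the insert happened after position 0, so the delete needs no adjustment and the result is again “ello World”.

The server applies all operations sequentially. Every client sends its operations tagged with a base version. The server transforms each incoming operation against all operations applied since that version, applies the transformed operation, and broadcasts it. Clients apply server-confirmed operations. The server is the arbiter: it defines the canonical order.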

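def transform_insert_against_insert(op1, op2):
    # op1, op2: concurrent INSERTs based on the same document version.
    # Returns op1 adjusted as if op2 had already been applied.
    # Equal positions are ordered by user_id so every replica breaks the
    # tie the same way; shifting on a plain <= in both directions diverges.
    if (op2.position < op1.position or
            (op2.position == op1.position and op2.user_id < op1.user_id)):
        op1.position += len(op2.content)
    return op1

def transform_insert_against_delete(op1, op2):
    # op1: INSERT at pos1; op2: DELETE of op2.length characters at pos2.
    # Returns op1 adjusted as if op2 had already been applied.
    if op2.position < op1.position:
        # Shift left by the deleted length, but never past the start of
        # the deleted range (an insert inside the range lands at its start).
        op1.position = max(op2.position,
                           op1.position - op2.length)
    return op1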
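
To see the server-as-arbiter flow end to end, here is a minimal sketch that replays the “Hello” example in both arrival orders. The `Op`, `transform`, and `Server` names are hypothetical. The sketch deliberately covers only insert-versus-delete pairs and raises for the cases it omits.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Op:
    kind: str            # "insert" or "delete"
    position: int
    content: str = ""    # text for inserts
    length: int = 0      # character count for deletes


def apply_op(text: str, op: Op) -> str:
    if op.kind == "insert":
        return text[:op.position] + op.content + text[op.position:]
    return text[:op.position] + text[op.position + op.length:]


def transform(op: Op, applied: Op) -> Op:
    """Rewrite `op` as if `applied` had already been applied to the text."""
    if op.kind == "insert" and applied.kind == "delete":
        if applied.position < op.position:
            new_pos = max(applied.position, op.position - applied.length)
            return replace(op, position=new_pos)
        return op
    if op.kind == "delete" and applied.kind == "insert":
        if applied.position <= op.position:
            return replace(op, position=op.position + len(applied.content))
        if applied.position < op.position + op.length:
            raise NotImplementedError("insert inside a deleted range needs a split")
        return op
    raise NotImplementedError("same-kind pairs are omitted from this sketch")


class Server:
    """Orders operations; each incoming op is rebased onto the current version."""

    def __init__(self, text: str):
        self.text = text
        self.log = []    # log[i] took the document from version i to i + 1

    def submit(self, op: Op, parent_version: int) -> Op:
        for applied in self.log[parent_version:]:
            op = transform(op, applied)
        self.text = apply_op(self.text, op)
        self.log.append(op)
        return op        # the transformed op is what gets broadcast


insert_world = Op("insert", 5, content=" World")   # User A, based on version 0
delete_h = Op("delete", 0, length=1)               # User B, based on version 0

# Order 1: A's insert reaches the server first.
s1 = Server("Hello")
s1.submit(insert_world, 0)
s1.submit(delete_h, 0)

# Order 2: B's delete reaches the server first.
s2 = Server("Hello")
s2.submit(delete_h, 0)
rebased = s2.submit(insert_world, 0)

print(s1.text, "|", s2.text, "|", rebased.position)
```

Both orders end at the same text, and in the second order the insert is rebased from position 5 to position 4 before it is applied.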

CRDT Alternative

CRDTs (Conflict-free Replicated Data Types) are data structures designed so that replicas can be merged without conflicts. For text editing, Logoot and LSEQ assign each character a unique, densely ordered position identifier. Conceptually it behaves like a fraction: to insert between positions 1 and 2, use 1.5; between 1.5 and 2, use 1.75. Each identifier also embeds the ID of the site that created it, so positions are globally unique and totally ordered, and two clients inserting at the same spot still get different identifiers. Delete = mark the character as a tombstone (do not remove it from the position list immediately). Merge = union of all character sets, sorted by position. CRDT pros: no central server is needed for conflict resolution (works peer-to-peer), offline edits merge naturally, and convergence follows from the data structure itself rather than from transformation functions. CRDT cons: the document representation grows (tombstones accumulate), position identifiers can become very long (more precision is needed after many inserts at the same spot), and the data structures are harder to implement efficiently. Widely used libraries such as Yjs and Automerge are based on different sequence CRDT algorithms than Logoot and rely on compact encodings to keep this metadata overhead manageable.
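
The fractional-position idea can be sketched in a few lines of Python. This is a simplified illustration, not Logoot or LSEQ: the class name `TextCRDT` is hypothetical, and each character ID is assumed to be a `(fraction, site_id, counter)` tuple, with the site ID breaking ties when two replicas pick the same fraction. Real algorithms use variable-length identifiers so that later inserts between two tied characters still land where the user intended.

```python
from fractions import Fraction


class TextCRDT:
    """State-based text CRDT sketch: char id = (fraction, site_id, counter)."""

    def __init__(self, site_id: str):
        self.site_id = site_id
        self.counter = 0
        self.chars = {}          # char id -> character
        self.tombstones = set()  # ids of deleted characters (never removed here)

    def _visible_ids(self):
        return sorted(i for i in self.chars if i not in self.tombstones)

    def insert(self, index: int, ch: str):
        """Insert `ch` so that it appears at visible position `index`."""
        ids = self._visible_ids()
        left = ids[index - 1][0] if index > 0 else Fraction(0)
        right = ids[index][0] if index < len(ids) else Fraction(1)
        self.counter += 1
        char_id = ((left + right) / 2, self.site_id, self.counter)
        self.chars[char_id] = ch
        return char_id

    def delete(self, index: int):
        """Mark the character at visible position `index` as a tombstone."""
        self.tombstones.add(self._visible_ids()[index])

    def merge(self, other: "TextCRDT"):
        """Union of characters and tombstones; safe to repeat or reorder."""
        self.chars.update(other.chars)
        self.tombstones |= other.tombstones

    def text(self) -> str:
        return "".join(self.chars[i] for i in self._visible_ids())


a = TextCRDT("A")
a.insert(0, "H")
a.insert(1, "i")
b = TextCRDT("B")
b.merge(a)            # both replicas now hold "Hi"

a.insert(2, "!")      # concurrent edits: A appends "!"
b.insert(2, "?")      # B appends "?" at the same spot
b.delete(0)           # and B deletes "H"

a.merge(b)
b.merge(a)
print(a.text(), b.text())
```

Because merge is a set union and the sort order depends only on the IDs, applying merges in any order, or more than once, leaves both replicas with the same text.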

Real-Time Synchronization

Each active collaborator holds one WebSocket connection. The server maintains a presence map: doc_id → {user_id → {cursor_position, last_active}}. On edit: the client sends the operation over the WebSocket → the server applies the OT transform → the server broadcasts the transformed operation to all other clients in the document room. Cursor broadcast: send cursor position updates throttled to roughly one every 100 ms, and show collaborator cursors in the editor UI with their name and color. Reconnection: the client keeps a local queue of unacknowledged operations. On reconnect, it sends all pending operations along with the last acknowledged version; the server transforms them against the operations applied since that version and returns the operations the client missed. Snapshot compaction: after every N operations (e.g., 1000), create a snapshot of the full document text. This bounds the replay cost on reconnect, and operations older than a snapshot can be archived or dropped if full revision history is not required, which bounds storage.
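
The snapshot and replay mechanics might look like the following sketch. The `DocumentHistory` class and the tuple encoding of operations are assumptions made for illustration, and `SNAPSHOT_EVERY` is set to 3 only to keep the example short. The sketch keeps the full operation log; a real system would decide separately whether to archive operations older than a snapshot.

```python
SNAPSHOT_EVERY = 3   # hypothetical value; the text suggests something like 1000


def apply_op(text, op):
    """op is ("insert", position, text) or ("delete", position, length)."""
    kind, position, arg = op
    if kind == "insert":
        return text[:position] + arg + text[position:]
    return text[:position] + text[position + arg:]


class DocumentHistory:
    def __init__(self, text=""):
        self.text = text
        self.ops = []                  # ops[i] moves the doc from version i to i + 1
        self.snapshots = {0: text}     # version -> full document text

    @property
    def version(self):
        return len(self.ops)

    def append(self, op):
        """Record an already-transformed op; snapshot every SNAPSHOT_EVERY ops."""
        self.text = apply_op(self.text, op)
        self.ops.append(op)
        if self.version % SNAPSHOT_EVERY == 0:
            self.snapshots[self.version] = self.text

    def ops_since(self, version):
        """Operations a reconnecting client at `version` has missed."""
        return self.ops[version:]

    def text_at(self, version):
        """Rebuild a version from the nearest earlier snapshot plus a short replay."""
        base = max(v for v in self.snapshots if v <= version)
        text = self.snapshots[base]
        for op in self.ops[base:version]:
            text = apply_op(text, op)
        return text


history = DocumentHistory("Hello")
history.append(("insert", 5, " World"))
history.append(("delete", 0, 1))
history.append(("insert", 0, "J"))      # version 3 triggers a snapshot
history.append(("insert", 11, "!"))

print(history.text_at(2), "|", history.text_at(4), "|", history.ops_since(3))
```

With snapshots in place, rebuilding any version replays at most `SNAPSHOT_EVERY - 1` operations instead of the whole log.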

Frequently Asked Questions

What is the core problem that Operational Transformation solves?

When two users edit a document simultaneously, each operation is based on the same earlier document state and knows nothing about the other. User A (at version 5) inserts “X” at position 3. User B (also at version 5) deletes the character at position 1. If both operations are applied in sequence without transformation, the second operation may act on the wrong position, because the first operation has already changed the document. OT transforms the second operation to account for the effects of the first. Here, if the insert at position 3 is applied first, the delete at position 1 needs no adjustment, because position 1 is before position 3. But if the insert had been at position 0, the delete’s position would shift to 2. The transformation function (transform(op1, op2) → op1′) adjusts op1 as if op2 had already been applied. The server defines the canonical order of operations and transforms each incoming operation against all concurrent operations that were applied before it. The result: all clients converge to the same document state.

What is the difference between OT and CRDT for collaborative editing?

OT (Operational Transformation): in its common form it requires a central server to define the canonical order of operations and perform transformations. Operations are position-based (insert at index 5). Transformation functions are complex and error-prone (there are known bugs in published OT algorithms). Used by: Google Docs. CRDT (Conflict-free Replicated Data Type): operations are commutative and idempotent by design, so any order of application yields the same result. No central server is needed (peer-to-peer is possible). For text: each character gets a globally unique ID that also encodes its position. Operations refer to character IDs, not indexes. Merge = union of characters sorted by position. No transformation is needed. CRDT or CRDT-inspired designs are reported for Figma (multiplayer), Notion (partial), and Linear. Trade-offs: OT has lower storage overhead (operations are compact); a CRDT accumulates tombstones (deleted characters remain in the data structure). OT transformations are harder to implement correctly; CRDT convergence is guaranteed by construction once the merge rules are implemented correctly. For new systems, CRDTs are often preferred because their correctness guarantees are easier to reason about.

How do you implement undo/redo in a collaborative document editor?

Single-user undo/redo: maintain an undo stack of operations. Undo = apply the inverse operation (insert → delete, delete → insert). Collaborative undo is much harder: you are undoing your own operation in a document that others have modified since. The naive approach (revert to a previous snapshot) is unacceptable because it would also undo collaborators’ changes. Selective undo: undo only your own operations, preserving others’ changes. Algorithm: to undo operation A (which was applied at version V), generate the inverse of A and transform it against all operations applied after V (by any user). Apply the transformed inverse to the current document. This is complex to implement correctly with OT. In most collaborative editors, undo only reverts your own changes, and the undo history is per-user. If collaborators have made changes that conflict with your undo, the undo is blocked or results in a merge that requires acknowledgment. Simpler approach: treat the document as an append-only log; undo = append a compensating operation rather than reverting state. This keeps the operation log intact.

How do you handle cursor positions when remote edits change the document?

A collaborator’s cursor is a position in the document. When a remote insert of N characters at position P is received, all cursors at positions >= P shift right by N. When a remote deletion of N characters at position P is received, cursors at positions >= P+N shift left by N, and cursors inside the deleted range are clamped to P (they were in deleted content). Implementation: maintain each collaborator’s cursor as an index. On receiving a remote operation, transform all remote cursor positions using the same transformation logic as the text itself. Broadcast cursor positions on a separate real-time channel (WebSocket). Do not include cursor updates in the operation log (they are ephemeral). Presence awareness: show each collaborator’s cursor with their name and a unique color. Throttle cursor broadcasts to one every 50-100 ms to avoid flooding the channel. Remove a collaborator’s cursor when they disconnect or have been inactive for 30 seconds.

How do you handle offline editing and sync-on-reconnect in a collaborative editor?

Offline editing: the client stores all locally generated operations in an operation log kept in IndexedDB (browser) or local SQLite (mobile). The user continues editing, and changes appear immediately in the local UI (optimistic update). The local document state = server-confirmed document + all pending local operations. On reconnect: send all pending operations to the server with the base version (the last confirmed version before going offline). The server applies OT: it transforms each pending operation against all server operations applied since the base version, then returns the transformed confirmations and the server operations the client missed. The client applies the missed server operations to its confirmed state, then re-applies the (now transformed) local operations. The client and server converge. Conflict display: for severe conflicts (e.g., someone deleted a paragraph the offline user heavily edited), show a diff/merge UI and let the user reconcile manually. Sync status indicator: show “synced”, “saving…”, or “offline, changes saved locally” in the UI so users always know the state of their edits.
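
The cursor-adjustment rules described above (shift right on insert, shift left or clamp on delete) can be sketched as follows. The function names and the tuple encoding of operations are hypothetical. Moving a cursor that sits exactly at the insertion point follows the “>= P” rule from the text; some editors choose to leave it in place instead.

```python
def cursor_after_insert(cursor: int, position: int, length: int) -> int:
    """Cursors at or after the insertion point shift right by the inserted length."""
    return cursor + length if cursor >= position else cursor


def cursor_after_delete(cursor: int, position: int, length: int) -> int:
    """Cursors past the deleted range shift left; cursors inside it clamp to its start."""
    if cursor >= position + length:
        return cursor - length
    if cursor > position:
        return position
    return cursor


def update_presence(cursors: dict, op: tuple) -> dict:
    """Return a new user_id -> cursor map after a remote operation.

    `op` is ("insert", position, text) or ("delete", position, length).
    """
    kind, position, arg = op
    if kind == "insert":
        return {u: cursor_after_insert(c, position, len(arg)) for u, c in cursors.items()}
    return {u: cursor_after_delete(c, position, arg) for u, c in cursors.items()}


cursors = {"alice": 1, "bob": 4, "carol": 8}
after_insert = update_presence(cursors, ("insert", 2, "abc"))
after_delete = update_presence(cursors, ("delete", 2, 4))
print(after_insert)
print(after_delete)
```

In the delete case the removed range covers positions 2 through 5, so bob’s cursor at 4 is clamped to 2 while carol’s cursor at 8 moves left by 4.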
