Low Level Design: Collaborative Document Editor

Introduction

Collaborative editors in the style of Google Docs allow multiple users to edit the same document simultaneously with real-time sync and conflict-free merging. The core challenge is resolving concurrent edits from different clients so that all replicas converge to the same document state without data loss or corruption.

Operational Transformation (OT)

Each edit is represented as an operation: insert a character at index N, or delete the character at index N. When two clients make concurrent edits and both submit to the server, the server transforms each operation against the other so that both can be applied in sequence and produce the same final document regardless of application order. The server acts as the arbiter of operation ordering. OT is the approach used by early Google Docs and requires careful implementation to handle all concurrent edit scenarios correctly, particularly for complex rich text documents.
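The transform step can be sketched for plain text as follows. The `Op` shape, function names, and the tie-break rule (concurrent inserts at the same index ordered by client id) are illustrative assumptions, not any particular library's API:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Op:
    kind: str        # "insert", "delete", or "noop"
    pos: int         # character index the op targets
    ch: str = ""     # character to insert (unused for delete)
    client: int = 0  # used only to break ties deterministically

def transform(a: Op, b: Op) -> Op:
    """Rewrite op a so it can be applied after op b and still have
    its originally intended effect (the core OT step)."""
    if b.kind == "insert":
        # b inserted at or before a's position: shift a to the right.
        if b.pos < a.pos or (b.pos == a.pos and (a.kind == "delete" or b.client < a.client)):
            return Op(a.kind, a.pos + 1, a.ch, a.client)
        return a
    if b.kind == "delete":
        if b.pos < a.pos:
            return Op(a.kind, a.pos - 1, a.ch, a.client)
        if b.pos == a.pos and a.kind == "delete":
            return Op("noop", a.pos)  # both sides deleted the same character
    return a

def apply(doc: str, op: Op) -> str:
    if op.kind == "insert":
        return doc[: op.pos] + op.ch + doc[op.pos :]
    if op.kind == "delete":
        return doc[: op.pos] + doc[op.pos + 1 :]
    return doc  # noop
```

Applying `a` then `transform(b, a)` yields the same text as applying `b` then `transform(a, b)`; this convergence property is exactly what lets the server apply concurrent ops in either order.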

CRDT Alternative

Conflict-free Replicated Data Types (CRDTs) achieve eventual consistency without a central arbiter. For text, sequence CRDTs such as YATA (used in Yjs) or RGA assign a globally unique ID to each character along with a causal ordering reference. Any two replicas converge to the same state given the same set of operations, regardless of the order in which they receive those operations. CRDTs are generally simpler to implement correctly than OT but carry higher per-character metadata overhead, which can increase memory usage for large documents.
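A minimal RGA-style sequence CRDT can be sketched as below. This is a simplification for illustration (it is not Yjs's actual data format): IDs are (Lamport clock, site) pairs, concurrent inserts after the same origin are ordered deterministically by skipping over larger IDs, and deletes leave tombstones:

```python
from dataclasses import dataclass

@dataclass
class Char:
    id: tuple    # (lamport_clock, site_id) — globally unique
    value: str
    deleted: bool = False  # deletes leave tombstones; text() filters them

class RGA:
    def __init__(self, site: int):
        self.site = site
        self.clock = 0
        self.chars = []

    def local_insert(self, index: int, value: str):
        """Insert at a visible index; returns the op to broadcast."""
        self.clock += 1
        visible = [c for c in self.chars if not c.deleted]
        origin = visible[index - 1].id if index > 0 else None
        op = ((self.clock, self.site), origin, value)
        self.remote_insert(op)
        return op

    def remote_insert(self, op):
        cid, origin, value = op
        self.clock = max(self.clock, cid[0])  # Lamport clock update
        i = 0
        if origin is not None:  # assumes causal delivery: origin exists
            i = next(k for k, c in enumerate(self.chars) if c.id == origin) + 1
        # Deterministic placement of concurrent inserts after the same
        # origin: skip over elements whose ID is larger than ours.
        while i < len(self.chars) and self.chars[i].id > cid:
            i += 1
        self.chars.insert(i, Char(cid, value))

    def local_delete(self, index: int):
        target = [c for c in self.chars if not c.deleted][index].id
        self.remote_delete(target)
        return target

    def remote_delete(self, cid):
        for c in self.chars:
            if c.id == cid:
                c.deleted = True

    def text(self) -> str:
        return "".join(c.value for c in self.chars if not c.deleted)
```

Because placement depends only on IDs and origins, any two replicas that have seen the same set of ops render the same text, with no central arbiter involved. The per-character `Char` node is also where the metadata overhead mentioned above comes from.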

Document Model

The document is stored as an ordered sequence of rich text nodes (paragraphs, headings, list items, inline spans). The server maintains the authoritative document state along with a full operation log. Each operation record contains: op_id, client_id, timestamp, op_type (insert/delete/format), position, content, and parent_op_id (the last op the client had seen when it generated this op). The document can be reconstructed at any point by replaying operations from the nearest base snapshot.
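The operation record and the snapshot-plus-replay reconstruction can be sketched as follows. Field names mirror the list above; the replay logic here treats the document as plain text and stubs out format ops, which is a deliberate simplification of the rich-text node model:

```python
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class OpRecord:
    op_id: str
    client_id: str
    timestamp: float
    op_type: str                        # "insert" | "delete" | "format"
    position: int
    content: str = ""
    parent_op_id: Optional[str] = None  # last op the client had seen

def replay(snapshot_text: str, ops) -> str:
    """Rebuild document state by replaying ops on top of a base snapshot."""
    doc = snapshot_text
    for op in ops:
        if op.op_type == "insert":
            doc = doc[: op.position] + op.content + doc[op.position :]
        elif op.op_type == "delete":
            doc = doc[: op.position] + doc[op.position + 1 :]
        # "format" ops would mutate span attributes; omitted in this sketch
    return doc
```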

Real-Time Sync

Clients connect to a collaboration server via WebSocket. On each edit, the client immediately sends the operation to the server without waiting for acknowledgment (optimistic local apply). The server applies the operation to the authoritative document state, assigns it a global sequence number, and broadcasts it to all other connected clients for that document. Each receiving client applies the incoming operation using OT or CRDT transformation against any pending unacknowledged local operations. The acknowledgment from the server confirms the operation has been integrated into the global log.
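The client-side bookkeeping described above can be sketched as a small state machine. The class and method names are made up for illustration, and the transform/apply pair is injected rather than tied to a specific OT implementation:

```python
class SyncClient:
    """Tracks optimistic local edits awaiting server acknowledgment and
    transforms incoming remote ops against them before applying."""

    def __init__(self, transform, apply_op, doc=""):
        self.doc = doc
        self.pending = []           # ops sent but not yet acked
        self.transform = transform  # OT transform function
        self.apply_op = apply_op

    def local_edit(self, op):
        self.doc = self.apply_op(self.doc, op)  # optimistic local apply
        self.pending.append(op)
        return op  # would be sent over the WebSocket here

    def on_remote(self, op):
        # Transform the incoming op over every unacknowledged local op,
        # and each pending op over the incoming one, then apply.
        new_pending = []
        for p in self.pending:
            op, p = self.transform(op, p), self.transform(p, op)
            new_pending.append(p)
        self.pending = new_pending
        self.doc = self.apply_op(self.doc, op)

    def on_ack(self):
        self.pending.pop(0)  # server integrated our oldest pending op
```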

Presence and Cursors

Each client sends its cursor position to the server on every keypress or selection change (typically throttled to avoid flooding the channel). The server broadcasts all cursor positions — user_id, selection_start, selection_end, and assigned color — to all collaborators in the document session. Online presence state is maintained in Redis with a TTL that is refreshed by a periodic heartbeat from each client. An avatar row displayed above the document shows all active collaborators and their cursor colors in real time.
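The TTL-plus-heartbeat pattern can be simulated without a live Redis instance. This in-memory stand-in (class and method names are invented for illustration) mirrors the semantics of a SETEX-style key with expiry: a user is present only while heartbeats keep the entry fresh:

```python
import json
import time

class PresenceStore:
    """In-memory stand-in for Redis-backed presence: each heartbeat
    refreshes an entry's TTL; expired entries disappear from reads."""

    def __init__(self, ttl_seconds=5.0, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock   # injectable for testing
        self.entries = {}    # (doc_id, user_id) -> (expires_at, payload)

    def heartbeat(self, doc_id, user_id, cursor):
        payload = json.dumps({"user_id": user_id, **cursor})
        self.entries[(doc_id, user_id)] = (self.clock() + self.ttl, payload)

    def active(self, doc_id):
        now = self.clock()
        return [json.loads(p) for (d, _), (exp, p) in self.entries.items()
                if d == doc_id and exp > now]
```

With real Redis the same behavior falls out of key expiry: a closed tab simply stops heartbeating and the key vanishes on its own, with no explicit cleanup path.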

Snapshots and Version History

The server creates a full document snapshot every N operations (e.g., every 500) or every hour, whichever comes first. A snapshot contains the complete serialized document state at that point in the operation log. New client sessions start from the latest snapshot and replay only the operations that occurred after it, rather than replaying the entire operation history. Version history exposes named snapshots that users can label (e.g., “Draft v2”). Diffs between any two versions are computed by replaying the operations between the two snapshot points.
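The snapshot trigger and the session bootstrap can be sketched as below; the thresholds use the example values from the text, and the function names are illustrative:

```python
SNAPSHOT_EVERY_N_OPS = 500      # example threshold from the design
SNAPSHOT_EVERY_SECONDS = 3600   # or every hour, whichever comes first

def should_snapshot(ops_since_last: int, seconds_since_last: float) -> bool:
    """Snapshot every N operations or every hour, whichever comes first."""
    return (ops_since_last >= SNAPSHOT_EVERY_N_OPS
            or seconds_since_last >= SNAPSHOT_EVERY_SECONDS)

def load_session(snapshots, op_log):
    """New sessions start from the latest snapshot and replay only the
    ops after it. snapshots: list of (seq, state); op_log: list of
    (seq, op) in sequence order."""
    base_seq, state = max(snapshots, key=lambda s: s[0])
    tail = [op for seq, op in op_log if seq > base_seq]
    return state, tail
```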

Offline Editing

When a client loses connectivity, it continues to queue operations locally in IndexedDB or equivalent persistent storage. On reconnect, the client sends all queued operations to the server in order, each carrying the parent_op_id of the last server-acknowledged operation the client had seen before going offline. The server OT-transforms the submitted operations against all operations that occurred on the server while the client was offline, then applies them. The client receives the missing server operations and applies them locally. The merged result is presented to the user without requiring any manual conflict resolution.
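The server-side half of the reconnect merge can be sketched as follows. This is a simplified sketch assuming a generic transform function; a production OT bridge must also transform the missed server ops against the client's queue before sending them back, which is omitted here:

```python
def reconnect(queued_ops, last_acked_seq, server_log, transform):
    """Merge a client's offline queue: transform each queued op against
    every server op the client missed, in order. Returns the merged ops
    to apply server-side and the missed ops to send back to the client."""
    missed = [op for seq, op in server_log if seq > last_acked_seq]
    merged = []
    for op in queued_ops:
        for server_op in missed:
            op = transform(op, server_op)
        merged.append(op)
    return merged, missed
```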

Frequently Asked Questions: Collaborative Document Editor

What are the tradeoffs between Operational Transformation (OT) and CRDTs for collaborative editing?
OT maintains a single authoritative document state on the server: clients send operations, the server transforms them against concurrent ops and broadcasts the resolved result. This gives strong consistency guarantees and compact wire representations, but requires a central server and a correct (notoriously difficult to implement) transform function for every operation type. CRDTs (Conflict-free Replicated Data Types) embed merge semantics into the data structure itself so any two replicas can merge without coordination, enabling true peer-to-peer and offline editing. The cost is metadata overhead — each character or element must carry a unique logical identifier (typically a Lamport timestamp or site ID tuple), inflating document size by 2-10x compared to plain text. OT is the right choice when you control the server and want minimal storage cost; CRDTs are right when you need offline-first or decentralized topologies.

How does the server authority model work in Operational Transformation?
Each client maintains a local copy of the document and a revision counter. When a user makes an edit, the client immediately applies it locally (optimistic update) and sends the operation along with its current revision to the server. The server serializes all incoming operations: if the client’s revision matches the server’s current revision, the operation is applied directly; if other operations were applied in the meantime, the server transforms the incoming operation against each intervening operation using the transform function (e.g., if client inserted at index 5 but a concurrent delete shifted content, adjust the index). The server then broadcasts the transformed operation to all other clients at the new revision. Clients apply broadcast operations using a symmetric transform against any locally pending (unacknowledged) operations.
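The server-side revision check described above can be sketched as below; the class name is invented, and the transform/apply pair is injected rather than tied to a specific OT implementation:

```python
class OTServer:
    """Serializes incoming ops: an op at a stale revision is transformed
    against every op applied since that revision before being applied."""

    def __init__(self, apply_op, transform, doc=""):
        self.doc = doc
        self.log = []           # ops in global order; index == revision
        self.revision = 0
        self.apply_op = apply_op
        self.transform = transform

    def receive(self, op, client_revision):
        # Transform against each intervening op the client hadn't seen.
        for concurrent in self.log[client_revision:]:
            op = self.transform(op, concurrent)
        self.doc = self.apply_op(self.doc, op)
        self.log.append(op)
        self.revision += 1
        return op, self.revision  # broadcast payload for other clients
```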

How significant is the metadata overhead in CRDT-based collaborative editors?
In sequence CRDTs like RGA (Replicated Growable Array) or YATA (used by Yjs), every character is stored as a node containing the character value, a unique ID (site ID + sequence number, typically 16-24 bytes), and pointers to its left and right neighbors. For a 10,000-character document this can mean 240-480 KB of metadata versus 10 KB for the raw text — a 24-48x overhead in the worst case. LSEQ and tree-based CRDTs reduce this by using variable-length positional identifiers, but overhead remains significant. In practice, implementations compress the identifier space and garbage-collect tombstoned deletions to keep memory manageable. The overhead is most painful during initial load of large documents; streaming the CRDT state incrementally mitigates perceived latency.
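The figures above follow from simple arithmetic on the worst-case per-character sizes:

```python
chars = 10_000
raw_bytes = chars * 1                  # plain single-byte text: ~10 KB
per_char_metadata = (24, 48)           # bytes: ID (16-24 B) + neighbor pointers
low, high = (chars * m for m in per_char_metadata)
ratio_low, ratio_high = low // raw_bytes, high // raw_bytes
# 240,000-480,000 bytes of metadata vs 10,000 bytes of text: 24-48x
```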

How should an offline editing queue work when a client reconnects?
While offline, the client persists all local operations to IndexedDB (or equivalent durable local storage) with monotonically increasing sequence numbers. On reconnect, the client sends its last-known server revision and its pending operation log to the server. For OT systems, the server replays each offline operation through the standard transform pipeline against all operations that were applied during the disconnect window, then broadcasts the resolved operations. For CRDT systems, the client simply merges its local state with the server’s current state using the CRDT merge function — no transform logic is needed. In both cases, the client must detect and surface unresolvable semantic conflicts (e.g., simultaneous deletion and modification of the same paragraph) to the user rather than silently discarding one side.

How do you implement cursor presence (showing other users’ cursors) using Redis TTL?
Store each user’s cursor position as a Redis key with a short TTL, for example presence:{doc_id}:{user_id}, with a JSON value containing the user’s display name, color, and cursor offset. The client sends a heartbeat (SETEX with TTL reset) every 2-3 seconds while the document is focused. When the TTL expires — because the user closed the tab, lost connectivity, or went idle — the key disappears automatically without any explicit cleanup. Other clients poll or subscribe via Redis keyspace notifications to detect presence changes. To avoid polling, broadcast cursor updates over the existing WebSocket channel and use Redis only as the authoritative fallback for clients that join mid-session; on join, HGETALL or SCAN the presence keys for the document to initialize the cursor layer.
