Real-time document collaboration lets multiple users edit the same document simultaneously, with each user’s changes appearing on others’ screens within milliseconds. The core problem is handling concurrent edits that conflict: if Alice deletes character 5 while Bob inserts at position 5, the resulting document depends on the order operations are applied. Operational Transformation (OT) and Conflict-free Replicated Data Types (CRDTs) are the two standard approaches, with CRDTs now preferred for distributed systems.
Core Data Model
CREATE TABLE Document (
doc_id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
owner_id BIGINT NOT NULL,
title VARCHAR(500) NOT NULL,
content TEXT NOT NULL DEFAULT '',
version BIGINT NOT NULL DEFAULT 0, -- increments on every edit
created_at TIMESTAMPTZ DEFAULT NOW(),
updated_at TIMESTAMPTZ DEFAULT NOW()
);
-- Append-only operation log (source of truth for collaboration)
CREATE TABLE DocumentOperation (
op_id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
doc_id UUID NOT NULL REFERENCES Document(doc_id),
user_id BIGINT NOT NULL,
client_id VARCHAR(50) NOT NULL, -- unique per browser tab
op_type VARCHAR(10) NOT NULL, -- 'insert', 'delete', 'retain'
position INT NOT NULL,
content TEXT, -- for insert operations
length INT, -- for delete operations
base_version BIGINT NOT NULL, -- version this op was based on
server_version BIGINT, -- assigned by server after OT
created_at TIMESTAMPTZ DEFAULT NOW()
);
CREATE INDEX idx_docop_doc_version ON DocumentOperation(doc_id, server_version);
-- Active sessions (for presence/cursor tracking)
CREATE TABLE DocumentSession (
session_id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
doc_id UUID NOT NULL,
user_id BIGINT NOT NULL,
client_id VARCHAR(50) NOT NULL UNIQUE,
cursor_pos INT,
color VARCHAR(7), -- user's cursor color hex
last_seen TIMESTAMPTZ DEFAULT NOW()
);
Operational Transformation (OT)
OT transforms an incoming operation against all operations that were applied concurrently (operations the client hadn’t seen yet when it generated its operation). The transform ensures that applying operations in different orders produces the same final document.
def transform_insert_against_insert(op1: dict, op2: dict) -> dict:
"""Transform op1 (insert) against op2 (insert) that was applied first."""
if op2['position'] dict:
"""Transform insert op1 against delete op2 that was applied first."""
if op2['position'] + op2['length'] = op1['position']:
# Delete was after insert — insert unchanged
return op1
else:
# Delete overlaps insert position — move insert to delete start
return {**op1, 'position': op2['position']}
def apply_operation_on_server(doc_id: str, incoming_op: dict) -> dict:
with db.transaction():
doc = db.fetchone("SELECT * FROM Document WHERE doc_id=%s FOR UPDATE", [doc_id])
# Get all operations applied after the client's base_version
concurrent_ops = db.fetchall("""
SELECT * FROM DocumentOperation
WHERE doc_id=%s AND server_version > %s
ORDER BY server_version ASC
""", [doc_id, incoming_op['base_version']])
# Transform incoming op against each concurrent op
transformed_op = incoming_op
for concurrent in concurrent_ops:
transformed_op = transform(transformed_op, concurrent)
# Apply the transformed op to the document
new_content = apply_to_string(doc['content'], transformed_op)
new_version = doc['version'] + 1
db.execute("""
UPDATE Document SET content=%s, version=%s, updated_at=NOW()
WHERE doc_id=%s
""", [new_content, new_version, doc_id])
db.execute("""
INSERT INTO DocumentOperation (..., server_version)
VALUES (..., %s)
""", [new_version])
return {**transformed_op, 'server_version': new_version}
CRDT Approach (Yjs / Automerge)
CRDTs assign a globally unique ID to every character insertion. Instead of tracking positions (which shift as text changes), the CRDT tracks “insert after character X.” Concurrent insertions at the same position are ordered by a deterministic tiebreaker (user ID, timestamp). Delete operations mark characters as “tombstoned” rather than removing them. The document state is the set of all non-tombstoned characters sorted by their position IDs.
# Yjs (JavaScript CRDT library) — Python server integration
# Server stores the Yjs document binary state
def apply_yjs_update(doc_id: str, update_bytes: bytes, client_id: str):
"""Apply a Yjs binary update to the server document state."""
import y_py as Y # Python bindings for Yjs
# Load or create the document state
state = redis.get(f"doc:yjs:{doc_id}")
ydoc = Y.YDoc()
if state:
Y.apply_update(ydoc, state)
# Apply the incoming update
Y.apply_update(ydoc, update_bytes)
# Persist updated state
new_state = Y.encode_state_as_update(ydoc)
redis.setex(f"doc:yjs:{doc_id}", 3600, new_state)
# Broadcast to other clients via WebSocket pub/sub
redis.publish(f"doc:room:{doc_id}", json.dumps({
'type': 'update',
'update': update_bytes.hex(),
'sender': client_id
}))
Cursor and Presence Tracking
def broadcast_cursor(doc_id: str, user_id: int, client_id: str, position: int):
"""Broadcast cursor position to all collaborators via WebSocket."""
redis.publish(f"doc:room:{doc_id}", json.dumps({
'type': 'cursor',
'user_id': user_id,
'client_id': client_id,
'position': position,
'timestamp': int(time.time() * 1000)
}))
# Update session record
db.execute("""
UPDATE DocumentSession SET cursor_pos=%s, last_seen=NOW()
WHERE client_id=%s
""", [position, client_id])
Key Interview Points
- OT requires a central server to serialize and transform operations — it cannot be fully peer-to-peer. CRDTs work without a central authority and are the basis of modern collaborative editors (Figma, Notion, Linear).
- The operation log is the source of truth — the document content field is a derived, cached value. On server restart, rebuild from the operation log.
- Tombstoning deletes (marking as deleted rather than removing) is essential for CRDTs — it prevents conflicts when two clients delete the same character simultaneously.
- Save document snapshots periodically — replaying 1 million operations on load is too slow. Store a full snapshot every N operations and replay only the delta.
- Awareness (presence/cursor) data is ephemeral and eventually consistent — use a separate pub/sub channel with no persistence, not the operation log.
- Conflict resolution for non-text content (moving shapes in a diagram, reordering list items) requires different CRDT types: LWW registers for single-value fields, RGA for ordered sequences.
Document collaboration and real-time editing design is discussed in Google system design interview questions.
Document collaboration and real-time co-editing design is covered in Atlassian system design interview preparation.
Document collaboration and CRDT-based sync design is discussed in Meta system design interview guide.