Build a Spreadsheet: Cell Editor, Formulas, and Large Grids

⏱ 3 min read

“Build a spreadsheet” is one of the harder senior-frontend interview prompts because it combines a non-trivial data model, a virtualized 2D grid, formula parsing and evaluation, and selection mechanics that mimic Excel. Even a partial answer in 60 minutes is a strong signal. This guide covers what interviewers actually probe.

Clarify scope

How large is the grid? (10K rows? 1M cells?)
Formulas — basic arithmetic only or full Excel-like (SUM, IF, INDEX, MATCH)?
Multi-sheet workbook or single sheet?
Multi-user collaboration?
Formatting (bold, color, borders) in scope?

The data model

Sparse map, not a 2D array. Most cells are empty in a typical sheet:

{
  cellsByAddress: Map<"A1", { value, formula?, format? }>,
  rowSizes: Map<rowIndex, height>,
  colSizes: Map<colIndex, width>,
  selection: { anchor: "A1", focus: "C5" }
}

Address as string (“B12”) is convenient; row/col integers under the hood. The sparse map keeps memory bounded.

Rendering 1M cells

Virtualization is mandatory. Approach:

Compute visible row range from scroll top + viewport height
Compute visible col range from scroll left + viewport width
Render only those cells (typical: ~50 rows × ~20 cols visible)
Use absolute positioning or CSS Grid with calculated offsets
Handle variable row heights and column widths via prefix sums

Formula parsing and evaluation

The hardest correctness problem. Steps:

Tokenize: =A1+B2*3 → tokens
Parse to AST: +(A1, *(B2, 3))
Build dependency graph from AST: A1 and B2 are dependencies of this cell
Topological sort: evaluate in dependency order
Detect cycles: A1 = B1; B1 = A1 → mark as #CIRCULAR

For an interview, hand-rolled recursive-descent parser is the move. SLY, peggy, or chevrotain for production.

Reactive recalculation

When a cell changes:

Find all cells that depend on it (reverse dependency map)
Re-evaluate them in topo order
Cascade — their dependents may also need re-evaluation
Memoize evaluated values; invalidate only what changed

Naive “re-evaluate everything” works for 100 cells; collapses past that.

Selection mechanics

Click — select single cell
Click + drag — range selection
Shift+click — extend from anchor to clicked
Cmd+click — discontinuous selection
Arrow keys — move focus; Shift+Arrow — extend
Cmd+A — select all (or current region)
Tab — move right; Shift+Tab — move left; Enter — move down

Cell editing

Double-click or F2 enters edit mode
Typing replaces existing content; Esc cancels; Enter commits
Formulas highlighted with referenced cells colored
Drag-select to insert range references in formulas

Copy / paste

Copy a range — serialize as TSV to clipboard (compatible with Excel/Sheets)
Paste — parse TSV, write to current selection
Internal copy/paste preserves formulas with relative references adjusted
Mixed clipboard: HTML for formatted, TSV as fallback

Undo / redo

Each action is a transaction with forward and inverse
Stack-based; cap depth to bound memory
Group small typing operations into a single undo step
Cmd+Z / Cmd+Shift+Z (or Cmd+Y) for redo

Performance pitfalls

Re-rendering all visible cells on every formula change — memoize per-cell components
Recomputing prefix sums for row/col sizes on every render — cache and invalidate
Storing full 2D array of cells — sparse map only
Naive scroll handler — throttle with rAF; do not query layout in hot path

Multi-user (if asked)

Operations as transactions; sent to server
Server applies and broadcasts
Conflict resolution: cell-level last-write-wins or operational transform
Cursors and selections shown for other users in their color
Same architecture as Google Docs — see the standalone Docs design question

Accessibility

Grid uses role="grid"
Each cell as role="gridcell" with row/col headers (aria-rowindex, aria-colindex)
Live region announces selection changes
Keyboard navigation full WAI-ARIA grid pattern

What separates senior from staff

Senior: builds the data model and a basic virtualized render. Staff: handles formula parsing, dependency graph, and reactive recalc. Principal: discusses multi-user collaboration, performance under formula-heavy sheets, and the Excel compatibility story.

Frequently Asked Questions

Library options for production?

HyperFormula (formula engine), AG Grid (commercial grid), Univer (newer OSS spreadsheet), Handsontable. For interview, build the basics yourself.

How do I handle very large sheets?

Virtualize aggressively. Lazy-load cell data per visible region. Compute formulas server-side for very heavy sheets and stream results.

What about LookOut / array formulas / lambda?

Out of scope unless the interviewer asks. Mention you would extend the formula engine to support these incrementally; the architecture supports it.