Low Level Design: Stock Trading System

Order Book Structure

The order book is the central data structure of any trading system. It maintains two sides: the bid side (buy orders) sorted in descending price order, and the ask side (sell orders) sorted in ascending price order. Each price level contains a FIFO queue of orders, preserving arrival order for time priority matching. The canonical implementation uses a sorted map keyed by price — a std::map in C++ or a TreeMap in Java. Insert and delete are O(log N) where N is the number of distinct price levels. Best bid and best ask retrieval are O(1) via rbegin() on the bid side and begin() on the ask side. Each map value is a doubly-linked list of orders at that price level, enabling O(1) removal of a specific order when given a pointer to its node. Orders are indexed separately in a hash map from order_id to order node pointer, allowing cancel and modify operations to locate and remove orders in O(1). The combination of sorted map for price traversal and hash map for direct access is the standard approach used by production exchange matching engines.

Order Types

Understanding order types is essential for modeling both client behavior and matching engine logic. A market order executes immediately at the best available price on the opposite side. It sweeps through price levels until filled or the book is exhausted. Partial fills are normal; any unfilled remainder may be cancelled or, in some systems, converted to a limit order at last traded price. A limit order executes at the specified price or better. A buy limit at $100 will match against asks at $100 or below. If not immediately matchable, it rests in the book. Limit orders are the primary building block of the order book. A stop order remains dormant until a trigger price is reached, at which point it converts to a market order. A stop-loss sell at $90 activates when the last trade price falls to $90, protecting the holder from further downside. IOC (Immediate-or-Cancel) orders execute as much as possible immediately and cancel any unfilled remainder. FOK (Fill-or-Kill) orders must execute in full immediately or be cancelled entirely — no partial fills allowed. FOK requires a pre-check pass through the book before any execution.

Matching Engine

The matching engine applies price-time priority: orders are matched against the opposite side in price order first, then in arrival order (FIFO) among orders at the same price. When a new order arrives, the engine checks if it crosses the spread — if a buy limit price >= best ask, or a sell limit price <= best bid. If it does, matching begins. The engine iterates through the best price level on the opposite side, filling against each order in queue order. If the incoming order is fully filled, it is discarded. If the opposite side's resting order is fully filled, it is removed from the book and queue. If the incoming order is partially filled against a resting order, the resting order's quantity is decremented in place. After exhausting a price level's queue, the engine removes the price level entry from the sorted map and moves to the next best price. Matching continues until the incoming order is filled, no crossing orders remain, or (for limit orders) the incoming order's limit price is no longer satisfied. Partial fills generate individual execution reports. Each fill records: trade_id, aggressor_order_id, resting_order_id, price, quantity, timestamp.

Market Data Feed

Market data feeds distribute order book state to subscribers. Two standard levels exist. Level 1 publishes best bid price, best bid quantity, best ask price, best ask quantity, and last trade price/size. This is sufficient for most retail and algorithmic traders. Level 2 publishes the full order book depth — all price levels and their aggregate quantities on both sides. Some feeds also provide Level 3 data showing individual orders, not just aggregated quantities per level. Feeds are published via multicast UDP for low latency. Each message carries a sequence number. Receivers detect gaps by checking for missing sequence numbers and request retransmission from a TCP-based snapshot server. To minimize retransmission load, receivers can also request a full book snapshot and replay the live feed from that point forward. Feed handlers decode binary protocols (ITCH, PITCH, proprietary FIX variants) and maintain an in-memory order book replica. The decoded book drives both trading algorithms and display systems.

Pre-Trade Risk Checks

Risk checks run on every order before it reaches the matching engine. They must be fast — typically sub-microsecond — to avoid adding latency. Position limits prevent a trader from exceeding maximum long or short exposure in a symbol or across the portfolio. The risk engine maintains a real-time position ledger updated on each fill. Buying power check ensures sufficient cash or margin is available for the order. For equities, this is straightforward. For derivatives, margin calculations may be more complex. Price collar rejects orders priced more than N% away from the last trade price. This prevents fat-finger errors where a trader accidentally submits an order with a misplaced decimal. Duplicate order detection checks a recent cache of client order IDs. If the same client_order_id appears twice within a window, the second is rejected to prevent double submission. A kill switch per trader or firm can halt all order submission immediately. This is activated manually or automatically on breach of loss limits.

Trade Settlement

After a match, an execution report is generated and sent to both the aggressor and resting order parties via FIX protocol (FIX 4.2 or later). The execution report contains: ExecID, OrderID, ClOrdID, ExecType (Trade), OrdStatus, Symbol, Side, LastQty, LastPx, CumQty, LeavesQty. The clearing house receives all matched trades and nets positions across participants at end of day. Netting reduces the number of actual securities transfers required. For equities in most markets, final settlement occurs on T+2 (trade date plus two business days). During the settlement window, the clearing house acts as central counterparty, guaranteeing delivery even if one side defaults. Failed trades trigger buy-in procedures where the clearing house purchases the securities in the open market and charges the failing party for any cost difference.

Latency Optimization

Exchange matching engines operate in the microsecond to nanosecond range. Standard OS networking adds too much latency for competitive co-location. Kernel bypass networking via RDMA (InfiniBand) or DPDK (Data Plane Development Kit) allows network packets to be processed entirely in user space, bypassing the kernel TCP/IP stack. Round-trip times drop from tens of microseconds to single-digit microseconds. Lock-free ring buffers (SPSC — single producer, single consumer) pass messages between components without mutex overhead. The producer writes to the tail, the consumer reads from the head, and cache line padding prevents false sharing. CPU affinity pins specific threads to specific cores, eliminating context switch overhead. NUMA awareness ensures memory allocations occur on the same NUMA node as the processing core, avoiding cross-node memory access latency. Co-location in the exchange’s data center reduces network round-trip to sub-millisecond, giving co-located participants a structural edge over remote participants.

Auction Mechanism

Most exchanges run call auctions at market open and close rather than continuous matching. During the auction period, orders accumulate without matching. At the end of the auction, a single clearing price is determined that maximizes the total tradeable volume. The algorithm iterates through candidate prices. At each price, it calculates: total buy volume willing to trade at-or-above that price, total sell volume willing to trade at-or-below that price. The clearing price is where min(buy_volume, sell_volume) is maximized. In case of ties, secondary criteria apply (e.g., minimize surplus, match reference price). All matched orders execute at the single clearing price regardless of their individual limit prices. This is important for price discovery and reduces the advantage of order submission timing. Unmatched orders may either rest in the book as the continuous session begins or be cancelled depending on their time-in-force. { “@context”: “https://schema.org”, “@type”: “FAQPage”, “mainEntity”: [ { “@type”: “Question”, “name”: “Why do matching engines use price-time priority over FIFO?”, “acceptedAnswer”: { “@type”: “Answer”, “text”: “Price-time priority ensures that the best-priced orders execute first, which incentivizes market makers to post competitive quotes. Within the same price level, FIFO ordering rewards orders placed earlier, encouraging liquidity provision. Pure FIFO without price priority would allow inferior prices to execute ahead of better ones, harming price discovery and overall market quality.” } }, { “@type”: “Question”, “name”: “How does market data multicast handle packet loss?”, “acceptedAnswer”: { “@type”: “Answer”, “text”: “Exchanges publish market data over UDP multicast with sequence numbers on every message. Receivers detect gaps in sequence numbers and trigger a retransmission request via a separate TCP-based recovery channel (e.g., NYSE Refresh or NASDAQ MoldUDP64 retransmit). For extremely low-latency feeds, some venues use redundant multicast groups so receivers can reconstruct data from a secondary stream without waiting for a retransmit round-trip.” } }, { “@type”: “Question”, “name”: “What pre-trade risk checks must complete in microseconds?”, “acceptedAnswer”: { “@type”: “Answer”, “text”: “Critical sub-microsecond pre-trade checks include: position and buying-power limits (to prevent oversized exposure), duplicate order detection (same order ID check), price collar validation (reject orders too far from the national best bid/offer), instrument status checks (halted or not tradable), and maximum order size limits. These are implemented in FPGA or lock-free in-memory data structures to stay within single-digit microsecond budgets.” } }, { “@type”: “Question”, “name”: “What are lock-free order book implementation techniques?”, “acceptedAnswer”: { “@type”: “Answer”, “text”: “Lock-free order books typically use compare-and-swap (CAS) operations on atomic pointers for queue manipulation, hazard pointers or epoch-based reclamation for safe memory deallocation, and a price-level map backed by a skip list or a flat array indexed by price tick. Single-writer multiple-reader designs pin the matching engine to one core so only reads need to be lock-free, eliminating CAS contention on the hot path entirely.” } }, { “@type”: “Question”, “name”: “Why do stock exchanges offer co-location for low latency?”, “acceptedAnswer”: { “@type”: “Answer”, “text”: “Co-location places a trading firm’s servers inside the same data center as the exchange’s matching engine, reducing round-trip network latency to under 100 microseconds compared to milliseconds over public internet. It also eliminates variable WAN jitter, gives predictable cross-connect fiber distances, and allows firms to use kernel-bypass networking (DPDK, RDMA, or FPGA NICs) directly connected to the exchange network. Regulators require exchanges to offer co-location at equal distances to all participants to ensure fair access.” } } ] }