A high-performance, multi-threaded Limit Order Book (LOB) engine optimized for sub-microsecond execution on ARM64 (Apple Silicon) and x86_64 architectures.
Tests were conducted on Apple M-series silicon (8 cores; the reported 24 MHz base most likely reflects the system timer frequency rather than the CPU core clock).
| Component | Metric | Result |
|---|---|---|
| Raw Match Latency | Avg Time per Order | 86.6 ns |
| End-to-End Latency | Producer-to-Consumer | 173.2 ns |
| Throughput | Orders per Second | ~5.8 Million/sec |
| Memory Allocation | Hot-path Allocations | 0 (Zero) |
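As a sanity check on the table, and assuming the throughput figure is simply the reciprocal of the end-to-end per-order latency (the source does not state how it was derived), the numbers are consistent:

$$
\text{throughput} \approx \frac{1}{173.2\ \text{ns}} = \frac{10^{9}}{173.2}\ \text{orders/s} \approx 5.77 \times 10^{6}\ \text{orders/s}
$$

By the same arithmetic, the 86.6 ns raw match latency alone would correspond to roughly $11.5 \times 10^{6}$ orders per second, so under this reading the producer-to-consumer hand-off, not the matching itself, bounds the end-to-end rate.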
- Custom Memory Pool: Implements a stack-based free-list to pre-allocate `Order` objects at startup. This ensures $O(1)$ allocation and eliminates OS kernel syscalls on the hot path.
- Cache Alignment: Critical atomic variables and data structures use `alignas(64)` to prevent false sharing, ensuring that the Producer and Consumer threads do not invalidate each other's L1 cache lines.
- SPSC Ring Buffer: Communication between the Network (Producer) and Engine (Consumer) threads is handled via a lock-free circular queue.
- Atomic Memory Barriers: Uses `std::memory_order_acquire` and `std::memory_order_release` instead of heavier sequential consistency or mutexes, keeping synchronization overhead close to the minimum the hardware allows.
- Data Structures: Combines `std::unordered_map` for average-case $O(1)$ order lookups by ID with a custom doubly-linked list for $O(1)$ removal from price levels, maintaining strict price-time priority.
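The cache-alignment, SPSC, and acquire/release points above can be sketched together as follows. This is a minimal illustration, not the project's `RingBuffer.hpp`: the names `SpscRing`, `try_push`, `try_pop`, and `spsc_demo_sum` are hypothetical, and a power-of-two capacity is assumed so that index wrapping reduces to a bit mask.

```cpp
#include <array>
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <thread>

// Hypothetical single-producer/single-consumer ring buffer (illustrative only).
template <typename T, std::size_t Capacity>
class SpscRing {
    static_assert((Capacity & (Capacity - 1)) == 0,
                  "Capacity must be a power of two");

public:
    // Call from the producer thread only.
    bool try_push(const T& value) {
        const std::size_t head = head_.load(std::memory_order_relaxed);
        const std::size_t tail = tail_.load(std::memory_order_acquire);
        if (head - tail == Capacity) return false;          // buffer full
        slots_[head & (Capacity - 1)] = value;
        head_.store(head + 1, std::memory_order_release);   // publish the slot
        return true;
    }

    // Call from the consumer thread only.
    std::optional<T> try_pop() {
        const std::size_t tail = tail_.load(std::memory_order_relaxed);
        const std::size_t head = head_.load(std::memory_order_acquire);
        if (head == tail) return std::nullopt;              // buffer empty
        T value = slots_[tail & (Capacity - 1)];
        tail_.store(tail + 1, std::memory_order_release);   // free the slot
        return value;
    }

private:
    // Each index sits on its own 64-byte boundary so the two threads
    // do not write to the same cache line (64 is the value the text uses).
    alignas(64) std::atomic<std::size_t> head_{0};  // written by producer
    alignas(64) std::atomic<std::size_t> tail_{0};  // written by consumer
    alignas(64) std::array<T, Capacity> slots_{};
};

// Pushes 1..count on one thread, pops on the caller's thread,
// and returns the sum of everything received.
std::uint64_t spsc_demo_sum(std::uint64_t count) {
    SpscRing<std::uint64_t, 1024> ring;
    std::thread producer([&ring, count] {
        for (std::uint64_t i = 1; i <= count; ++i) {
            while (!ring.try_push(i)) { /* spin until a slot frees up */ }
        }
    });
    std::uint64_t sum = 0;
    std::uint64_t received = 0;
    while (received < count) {
        if (auto v = ring.try_pop()) {
            sum += *v;
            ++received;
        }
    }
    producer.join();
    return sum;
}
```

The producer publishes a slot with a release store to `head_`, and the consumer's acquire load of `head_` guarantees it observes the written element; the reverse pairing on `tail_` protects slot reuse. The 64-byte alignment follows the text. On targets with larger cache lines (Apple Silicon reports a 128-byte line size via `sysctl hw.cachelinesize`), a larger value may be needed to fully avoid false sharing.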
```text
LOB-engine/
├── apps/
│   └── main.cpp             # End-to-end performance demonstration
├── benchmarks/
│   └── EngineBench.cpp      # Micro-benchmarking suite (Google Benchmark)
├── src/                     # Core Matching Engine logic
│   ├── MemoryPool.hpp       # Lock-free memory management
│   ├── Order.hpp            # Internal order representation
│   ├── OrderBook.hpp        # O(1) matching logic
│   ├── PriceLevel.hpp       # Doubly-linked list for price levels
│   └── RingBuffer.hpp       # SPSC lock-free communication
├── tests/
│   └── OrderBookTest.cpp    # Formal validation suite (GoogleTest)
├── scripts/
│   └── make_plot.py         # Python tool for latency visualization
├── plots/
│   └── latency_distribution.png
├── CMakeLists.txt           # Build system configuration
└── readme.md                # Project documentation
```
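To make the roles of `MemoryPool.hpp`, `Order.hpp`, and `PriceLevel.hpp` more concrete, here is a hedged sketch of a pre-allocated pool with a stack-based free list, an intrusive doubly-linked price level, and an ID index. All type and member names (`OrderPool`, `PriceLevel`, `MiniBook`, and the `Order` fields) are assumptions for illustration; the actual headers may be organised differently.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <unordered_map>
#include <vector>

// Hypothetical order record; the real Order.hpp may differ.
struct Order {
    std::uint64_t id = 0;
    std::uint32_t quantity = 0;
    Order* prev = nullptr;  // intrusive links used by PriceLevel
    Order* next = nullptr;
};

// Fixed-capacity pool: all storage is allocated once, in the constructor.
class OrderPool {
public:
    explicit OrderPool(std::size_t capacity) : storage_(capacity) {
        free_.reserve(capacity);
        for (auto& slot : storage_) free_.push_back(&slot);
    }
    // O(1): pop a slot off the free stack; nullptr when exhausted.
    Order* acquire() {
        if (free_.empty()) return nullptr;
        Order* o = free_.back();
        free_.pop_back();
        *o = Order{};  // reset any stale fields
        return o;
    }
    // O(1): push the slot back; stays within the reserved capacity.
    void release(Order* o) { free_.push_back(o); }
    std::size_t available() const { return free_.size(); }

private:
    std::vector<Order> storage_;  // never resized, so pointers stay valid
    std::vector<Order*> free_;    // stack of free slots
};

// FIFO queue of orders at one price with O(1) unlink from any position.
class PriceLevel {
public:
    void push_back(Order* o) {
        o->prev = tail_;
        o->next = nullptr;
        if (tail_) tail_->next = o; else head_ = o;
        tail_ = o;
    }
    void remove(Order* o) {
        if (o->prev) o->prev->next = o->next; else head_ = o->next;
        if (o->next) o->next->prev = o->prev; else tail_ = o->prev;
        o->prev = o->next = nullptr;
    }
    Order* front() const { return head_; }  // oldest order (time priority)

private:
    Order* head_ = nullptr;
    Order* tail_ = nullptr;
};

// Single-price-level toy book tying the pieces together (illustrative only).
struct MiniBook {
    explicit MiniBook(std::size_t capacity) : pool(capacity) {
        index.reserve(capacity);
    }
    bool add(std::uint64_t id, std::uint32_t qty) {
        if (index.count(id)) return false;  // reject duplicate IDs
        Order* o = pool.acquire();
        if (!o) return false;               // pool exhausted
        o->id = id;
        o->quantity = qty;
        level.push_back(o);
        index[id] = o;  // note: std::unordered_map may still allocate a node here
        return true;
    }
    bool cancel(std::uint64_t id) {
        auto it = index.find(id);
        if (it == index.end()) return false;
        level.remove(it->second);   // O(1) unlink, no list traversal
        pool.release(it->second);
        index.erase(it);
        return true;
    }
    OrderPool pool;
    PriceLevel level;
    std::unordered_map<std::uint64_t, Order*> index;
};
```

In this sketch, cancelling an order in the middle of the queue touches only its two neighbours, so the remaining orders keep their arrival order. The pool itself is single-threaded here; whether the project's pool needs to be thread-safe depends on which thread allocates and frees orders, which the source does not specify.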
- C++20 Compiler (Clang 15+ or GCC 11+)
- CMake (3.15+)
- GoogleTest & Google Benchmark (Automatically downloaded via CMake)
```bash
# Enter directory
cd LOB-engine

# Build the project
mkdir build && cd build
cmake -DCMAKE_BUILD_TYPE=Release ..
make -j4

# Run Performance Demo
./hft_app

# Run Unit Tests
./unit_tests

# Run Benchmarks
./benchmarks
```