dylanjayabahu/LOB-engine

Ultra-Low Latency C++20 Matching Engine

A high-performance, multi-threaded Limit Order Book (LOB) engine optimized for sub-microsecond execution on ARM64 (Apple Silicon) and x86_64 architectures.

🚀 Performance Results

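Tests were run on an 8-core Apple M-series chip. The reported "24MHz base" figure most likely reflects the system timer frequency shown by the benchmarking tooling rather than the CPU clock speed.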

| Component | Metric | Result |
|---|---|---|
| Raw Match Latency | Avg Time per Order | 86.6 ns |
| End-to-End Latency | Producer-to-Consumer | 173.2 ns |
| Throughput | Orders per Second | ~5.8 Million/sec |
| Memory Allocation | Hot-path Allocations | 0 (Zero) |
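
As a sanity check on how these figures relate, the throughput is consistent with one order completing every end-to-end latency interval. The README does not say how throughput was measured, so treat this as an observation rather than the method used:

$$
\text{throughput} \approx \frac{1}{173.2 \times 10^{-9}\ \text{s}} \approx 5.77 \times 10^{6}\ \text{orders/s}
$$

This rounds to the ~5.8 Million/sec reported above.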

Latency Distribution (figure: plots/latency_distribution.png)

🛠️ Technical Architecture

1. Mechanical Sympathy & Memory Optimization

  • Custom Memory Pool: Pre-allocates all Order objects at startup and tracks free slots with a stack-based free list. Allocation and release are $O(1)$, and the hot path performs no heap allocation, so it triggers no allocator-related system calls.
  • Cache Alignment: Critical atomic variables and data structures are declared with alignas(64) so that each starts on its own 64-byte boundary. This prevents false sharing: the Producer and Consumer threads do not invalidate each other's L1 cache lines when they write to unrelated variables.
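
The pool idea above can be sketched as follows. This is a minimal illustration, not the contents of MemoryPool.hpp: the `Order` fields, the `OrderPool` name, and the index-stack representation are all assumptions made for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical order record; the real Order.hpp layout is not shown in this README.
struct Order {
    std::uint64_t id;
    std::int64_t  price;
    std::uint32_t quantity;
};

// Fixed-capacity pool: all storage is reserved in the constructor, and free
// slots are tracked on a stack of indices, so acquire/release are O(1) and
// never call into the heap allocator afterwards.
class OrderPool {
public:
    explicit OrderPool(std::size_t capacity) : slots_(capacity) {
        free_.reserve(capacity);
        // Push indices in reverse so that slot 0 is handed out first.
        for (std::size_t i = capacity; i > 0; --i) free_.push_back(i - 1);
    }

    // Returns nullptr when the pool is exhausted instead of falling back to new.
    Order* acquire() {
        if (free_.empty()) return nullptr;
        const std::size_t idx = free_.back();
        free_.pop_back();
        return &slots_[idx];
    }

    // Assumes `order` came from this pool and is released at most once, so the
    // push_back below stays within the capacity reserved in the constructor.
    void release(Order* order) {
        free_.push_back(static_cast<std::size_t>(order - slots_.data()));
    }

    std::size_t available() const { return free_.size(); }

private:
    std::vector<Order> slots_;       // pre-allocated storage
    std::vector<std::size_t> free_;  // stack of free slot indices
};
```

Because a released slot goes back on top of the stack, the most recently freed Order is the next one handed out, which tends to keep recently touched memory in cache.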

2. Lock-Free Concurrency

  • SPSC Ring Buffer: Communication between the Network (Producer) and Engine (Consumer) threads is handled by a lock-free single-producer, single-consumer (SPSC) circular queue.
  • Atomic Memory Ordering: Atomic loads and stores use std::memory_order_acquire and std::memory_order_release instead of the default sequentially consistent ordering or mutexes, so the queue pays only for the synchronization it actually needs.
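
A minimal sketch of such a queue is shown below, assuming a power-of-two capacity and a default-constructible, copyable element type. The `SpscRing` name and its interface are hypothetical and are not taken from RingBuffer.hpp.

```cpp
#include <array>
#include <atomic>
#include <cassert>
#include <cstddef>
#include <optional>

template <typename T, std::size_t Capacity>
class SpscRing {
    static_assert(Capacity > 0 && (Capacity & (Capacity - 1)) == 0,
                  "Capacity must be a power of two");

public:
    // Call from the producer thread only.
    bool try_push(const T& value) {
        const std::size_t tail = tail_.load(std::memory_order_relaxed);
        // Acquire pairs with the consumer's release store to head_, so the
        // slot about to be overwritten has already been read.
        const std::size_t head = head_.load(std::memory_order_acquire);
        if (tail - head == Capacity) return false;  // full
        buffer_[tail & (Capacity - 1)] = value;
        // Release publishes the slot write before the new tail becomes visible.
        tail_.store(tail + 1, std::memory_order_release);
        return true;
    }

    // Call from the consumer thread only.
    std::optional<T> try_pop() {
        const std::size_t head = head_.load(std::memory_order_relaxed);
        // Acquire pairs with the producer's release store to tail_.
        const std::size_t tail = tail_.load(std::memory_order_acquire);
        if (head == tail) return std::nullopt;  // empty
        T value = buffer_[head & (Capacity - 1)];
        head_.store(head + 1, std::memory_order_release);
        return value;
    }

private:
    // 64 matches the alignment named in this README. Apple Silicon is commonly
    // reported to use 128-byte cache lines, so this value may need tuning there.
    alignas(64) std::atomic<std::size_t> head_{0};  // written by the consumer
    alignas(64) std::atomic<std::size_t> tail_{0};  // written by the producer
    alignas(64) std::array<T, Capacity> buffer_{};
};
```

Each index is written by exactly one thread, which is what lets relaxed loads of a thread's own index and acquire/release on the other thread's index suffice. Keeping head_ and tail_ on separate aligned boundaries is the false-sharing protection described in section 1.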

3. $O(1)$ Order Management

  • Data Structures: Combines std::unordered_map for average $O(1)$ order lookup by ID with a custom doubly-linked list per price level for $O(1)$ removal, while preserving strict price-time priority.
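
The combination can be illustrated with a short sketch of one side of the book. It uses std::list as a stand-in for the project's custom doubly-linked list and std::map to order price levels. The `BidSide`, `RestingOrder`, and `Locator` names are hypothetical, and order IDs are assumed to be unique.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <iterator>
#include <list>
#include <map>
#include <unordered_map>

struct RestingOrder {
    std::uint64_t id;
    std::uint32_t quantity;
};

class BidSide {
public:
    // Appends to the back of the price level, preserving time priority.
    // Finding or creating the level is O(log L) in the number of levels here.
    void add(std::uint64_t id, std::int64_t price, std::uint32_t quantity) {
        auto level_it = levels_.try_emplace(price).first;
        level_it->second.push_back({id, quantity});
        index_.emplace(id, Locator{level_it, std::prev(level_it->second.end())});
    }

    // Average O(1): one hash lookup, then unlinking a node via a stored iterator.
    bool cancel(std::uint64_t id) {
        auto it = index_.find(id);
        if (it == index_.end()) return false;
        auto level_it = it->second.level;
        level_it->second.erase(it->second.pos);
        if (level_it->second.empty()) levels_.erase(level_it);
        index_.erase(it);
        return true;
    }

    // Highest price first, oldest order within that price; nullptr when empty.
    const RestingOrder* best() const {
        if (levels_.empty()) return nullptr;
        return &levels_.begin()->second.front();
    }

private:
    using Level    = std::list<RestingOrder>;
    using LevelMap = std::map<std::int64_t, Level, std::greater<>>;

    // Both iterators stay valid until the element they point to is erased.
    struct Locator {
        LevelMap::iterator level;
        Level::iterator    pos;
    };

    LevelMap levels_;
    std::unordered_map<std::uint64_t, Locator> index_;
};
```

Storing iterators in the index is what makes cancellation independent of queue depth: no scan of the price level is needed to find the order being removed.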

📂 Project Structure

LOB-engine/
├── apps/
│   └── main.cpp           # End-to-end performance demonstration
├── benchmarks/
│   └── EngineBench.cpp    # Micro-benchmarking suite (Google Benchmark)
├── src/                   # Core Matching Engine logic
│   ├── MemoryPool.hpp     # Lock-free memory management
│   ├── Order.hpp          # Internal order representation
│   ├── OrderBook.hpp      # O(1) matching logic
│   ├── PriceLevel.hpp     # Doubly-linked list for price levels
│   └── RingBuffer.hpp     # SPSC lock-free communication
├── tests/
│   └── OrderBookTest.cpp  # Formal validation suite (GoogleTest)
├── scripts/
│   └── make_plot.py       # Python tool for latency visualization
├── plots/
│   └── latency_distribution.png 
├── CMakeLists.txt         # Build system configuration
└── readme.md              # Project documentation

Build & Execution

Dependencies

  • C++20 Compiler (Clang 15+ or GCC 11+)
  • CMake (3.15+)
  • GoogleTest & Google Benchmark (Automatically downloaded via CMake)

Build & Run

# Enter directory
cd LOB-engine

# Build the project
mkdir build && cd build
cmake -DCMAKE_BUILD_TYPE=Release ..
make -j4

# Run Performance Demo
./hft_app

# Run Unit Tests
./unit_tests

# Run Benchmarks
./benchmarks
