This repository is based on https://github.com/duckdb/extension-template, check it out if you want to build and ship your own DuckDB extension.
This extension, OPTBench, is a DuckDB extension for benchmarking query optimizers.
Install the following system dependencies before building:
# Ubuntu / Debian
sudo apt install libssl-dev libeigen3-devThis repo uses git submodules (duckdb and extension-ci-tools). Always clone with:
git clone --recurse-submodules <repo-url>If you already cloned without submodules, run:
git submodule update --init --recursiveThis project requires LibTorch (the C++ PyTorch distribution). If you do not have CUDA available, download the CPU-only build:
wget https://download.pytorch.org/libtorch/cpu/libtorch-cxx11-abi-shared-with-deps-2.1.2%2Bcpu.zip
unzip libtorch-cxx11-abi-shared-with-deps-2.1.2+cpu.zip -d ~/Note: LibTorch 2.7.x prebuilt CPU binaries have internal header inconsistencies and will fail to compile. Use 2.1.2.
Then set LIBTORCH_DIR before building:
export LIBTORCH_DIR=~/libtorchAlternatively, pass it directly to make:
LIBTORCH_DIR=~/libtorch makeNote: If you switch
LIBTORCH_DIRafter a previous build, clear the CMake cache first to avoid picking up stale paths:rm -rf build/release/CMakeCache.txt build/release/CMakeFiles
Now to build the extension, run:
makeThe main binaries that will be built are:
./build/release/duckdb
./build/release/test/unittest
./build/release/extension/cactusdb/cactusdb.duckdb_extensionduckdbis the binary for the duckdb shell with the extension code automatically loaded.unittestis the test runner of duckdb. Again, the extension is already linked into the binary.cactusdb.duckdb_extensionis the loadable binary as it would be distributed.
To run the extension code, simply start the shell with ./build/release/duckdb.
Different tests can be created for DuckDB extensions. The primary way of testing DuckDB extensions should be the SQL tests in ./test/sql. These SQL tests can be run using:
make testThe frontend is a Flask web app that lets you explore query plans, compare optimizers, run benchmarks, and upload custom optimizer passes — all from a browser.
Python 3.9+ is required. Install the Python dependencies:
pip install flask werkzeugNote: The frontend shells out to
./build/release/duckdb, so the extension must be built first (see Build steps above).
The Frontend/app.py file expects to be run from the project root so that relative paths to build/, queries/, and plugins/ resolve correctly:
cd <repo-root>
python Frontend/app.pyFlask will start on http://127.0.0.1:5000 by default.
If LIBTORCH_DIR is not already exported in your shell, set it before launching:
LIBTORCH_DIR=~/libtorch python Frontend/app.py| Section | Description |
|---|---|
| Query explorer | Browse the built-in SQL+ML queries (Expedia, Flights, UC01–UC10, synthetic ranker) |
| Plan comparison | Side-by-side physical plan diff for any two optimizer settings |
| Benchmark runner | Execute a query under every active optimizer and compare runtimes |
| Optimizer builder | Write IF-THEN rules using the metric catalog to create custom optimizers |
| Optimizer upload | Upload a .cpp pass file; the server rebuilds the extension and activates it live |
- Write your pass as a self-contained
.cppfile that callsREGISTER_OPT_PASS(<name>, check, apply). - Open the Upload Pass panel in the UI and select your
.cpp(and an optional.hppheader). - The server drops the file into
plugins/opt_passes/, regenerates the manifest, runsmake release, and activates the pass under the slot nameOPT#<name>— no manual rebuild needed. - Build progress is shown in real time via the Build Status indicator in the UI.
Allowed
#includeheaders: standard library headers (algorithm,vector,string, …) and project-internal headers underduckdb/,optimization/,ml_functions/, andcost_model/. System or third-party headers are rejected at upload time.
For demos or multi-user use, run Flask behind a production WSGI server:
pip install gunicorn
gunicorn -w 1 -b 0.0.0.0:5000 --chdir <repo-root> "Frontend.app:app"Single worker (
-w 1) is required because the optimizer state and build status are kept in-process memory; multiple workers would not share them.