Skip to content

asu-cactus/OPTBench

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

44 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

OPTBench

This repository is based on https://github.com/duckdb/extension-template, check it out if you want to build and ship your own DuckDB extension.


This extension, OPTBench, is a DuckDB extension for benchmarking query optimizers.

Building

Prerequisites

Install the following system dependencies before building:

# Ubuntu / Debian
sudo apt install libssl-dev libeigen3-dev

Clone with submodules

This repo uses git submodules (duckdb and extension-ci-tools). Always clone with:

git clone --recurse-submodules <repo-url>

If you already cloned without submodules, run:

git submodule update --init --recursive

LibTorch dependency

This project requires LibTorch (the C++ PyTorch distribution). If you do not have CUDA available, download the CPU-only build:

wget https://download.pytorch.org/libtorch/cpu/libtorch-cxx11-abi-shared-with-deps-2.1.2%2Bcpu.zip
unzip libtorch-cxx11-abi-shared-with-deps-2.1.2+cpu.zip -d ~/

Note: LibTorch 2.7.x prebuilt CPU binaries have internal header inconsistencies and will fail to compile. Use 2.1.2.

Then set LIBTORCH_DIR before building:

export LIBTORCH_DIR=~/libtorch

Alternatively, pass it directly to make:

LIBTORCH_DIR=~/libtorch make

Note: If you switch LIBTORCH_DIR after a previous build, clear the CMake cache first to avoid picking up stale paths:

rm -rf build/release/CMakeCache.txt build/release/CMakeFiles

Build steps

Now to build the extension, run:

make

The main binaries that will be built are:

./build/release/duckdb
./build/release/test/unittest
./build/release/extension/cactusdb/cactusdb.duckdb_extension
  • duckdb is the binary for the duckdb shell with the extension code automatically loaded.
  • unittest is the test runner of duckdb. Again, the extension is already linked into the binary.
  • cactusdb.duckdb_extension is the loadable binary as it would be distributed.

Running the extension

To run the extension code, simply start the shell with ./build/release/duckdb.

Running the tests

Different tests can be created for DuckDB extensions. The primary way of testing DuckDB extensions should be the SQL tests in ./test/sql. These SQL tests can be run using:

make test

Frontend (OPTBench Web UI)

The frontend is a Flask web app that lets you explore query plans, compare optimizers, run benchmarks, and upload custom optimizer passes — all from a browser.

Prerequisites

Python 3.9+ is required. Install the Python dependencies:

pip install flask werkzeug

Note: The frontend shells out to ./build/release/duckdb, so the extension must be built first (see Build steps above).

Starting the server

The Frontend/app.py file expects to be run from the project root so that relative paths to build/, queries/, and plugins/ resolve correctly:

cd <repo-root>
python Frontend/app.py

Flask will start on http://127.0.0.1:5000 by default.

If LIBTORCH_DIR is not already exported in your shell, set it before launching:

LIBTORCH_DIR=~/libtorch python Frontend/app.py

What the UI provides

Section Description
Query explorer Browse the built-in SQL+ML queries (Expedia, Flights, UC01–UC10, synthetic ranker)
Plan comparison Side-by-side physical plan diff for any two optimizer settings
Benchmark runner Execute a query under every active optimizer and compare runtimes
Optimizer builder Write IF-THEN rules using the metric catalog to create custom optimizers
Optimizer upload Upload a .cpp pass file; the server rebuilds the extension and activates it live

Uploading a custom optimizer pass

  1. Write your pass as a self-contained .cpp file that calls REGISTER_OPT_PASS(<name>, check, apply).
  2. Open the Upload Pass panel in the UI and select your .cpp (and an optional .hpp header).
  3. The server drops the file into plugins/opt_passes/, regenerates the manifest, runs make release, and activates the pass under the slot name OPT#<name> — no manual rebuild needed.
  4. Build progress is shown in real time via the Build Status indicator in the UI.

Allowed #include headers: standard library headers (algorithm, vector, string, …) and project-internal headers under duckdb/, optimization/, ml_functions/, and cost_model/. System or third-party headers are rejected at upload time.

Production / non-debug mode

For demos or multi-user use, run Flask behind a production WSGI server:

pip install gunicorn
gunicorn -w 1 -b 0.0.0.0:5000 --chdir <repo-root> "Frontend.app:app"

Single worker (-w 1) is required because the optimizer state and build status are kept in-process memory; multiple workers would not share them.

About

We port CactusDB SQL-ML co-optimization techniques as a DuckDB extension

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors

Generated from duckdb/extension-template