martin-majlis/probstructs

Probabilistic Structures

ProbStructs is an easy-to-use C++ library of probabilistic data structures.

Documentation

Full documentation is available at http://probstructs.readthedocs.io/en/latest/

Classes

Example

using namespace probstructs;

ExponentialCountMinSketch<int> sketch(100, 4, 8);

// All increments below are recorded at timestamp 0.
uint32_t ts = 0;
sketch.inc("aaa", ts, 1);
sketch.inc(std::string("bbb"), ts, 4);
sketch.inc("ccc", ts, 8);

std::cerr << sketch.get(std::string("aaa"), 4, ts) << std::endl;
// 1

std::cerr << sketch.get("bbb", 4, ts) << std::endl;
// 4

std::cerr << sketch.get("ccc", 4, ts) << std::endl;
// 8

std::cerr << sketch.get("ddd", 4, ts) << std::endl;
// 0

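// Advance the clock to 4. Per the outputs below, a window of 2 no longer reaches
// the increments made at ts = 0, while a window of 8 still does.
ts = 4;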
std::cerr << sketch.get("aaa", 2, ts) << std::endl;
// 0
std::cerr << sketch.get("bbb", 2, ts) << std::endl;
// 0
std::cerr << sketch.get(std::string("ccc"), 2, ts) << std::endl;
// 0
std::cerr << sketch.get("ddd", 2, ts) << std::endl;
// 0

std::cerr << sketch.get("aaa", 8, ts) << std::endl;
// 1
std::cerr << sketch.get("bbb", 8, ts) << std::endl;
// 4
std::cerr << sketch.get("ccc", 8, ts) << std::endl;
// 8
std::cerr << sketch.get("ddd", 8, ts) << std::endl;
// 0
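
For intuition about what the sketch in the example approximates, here is a minimal, standalone Count-Min Sketch. It is only an illustrative sketch and not the ProbStructs implementation: the class name TinyCountMin, its methods add and estimate, and the std::hash-based row hashing are all assumptions made for this example, and it does not model the time window that ExponentialCountMinSketch adds on top.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <limits>
#include <string>
#include <vector>

// Illustrative Count-Min Sketch. NOT the ProbStructs implementation:
// TinyCountMin, add, and estimate are hypothetical names for this example.
class TinyCountMin {
public:
    // width = counters per row, depth = number of independent rows.
    TinyCountMin(std::size_t width, std::size_t depth)
        : width_(width), depth_(depth), table_(width * depth, 0) {
        assert(width > 0 && depth > 0);
    }

    // Add `delta` to exactly one counter in every row.
    void add(const std::string& key, std::uint32_t delta) {
        for (std::size_t row = 0; row < depth_; ++row) {
            table_[row * width_ + slot(key, row)] += delta;
        }
    }

    // Return the smallest counter the key maps to across all rows.
    // Barring counter overflow, this is never below the true count;
    // hash collisions can only push it higher.
    std::uint32_t estimate(const std::string& key) const {
        std::uint32_t best = std::numeric_limits<std::uint32_t>::max();
        for (std::size_t row = 0; row < depth_; ++row) {
            best = std::min(best, table_[row * width_ + slot(key, row)]);
        }
        return best;
    }

private:
    // Assumption: derive a per-row hash by salting the key with the row index.
    std::size_t slot(const std::string& key, std::size_t row) const {
        return std::hash<std::string>{}(key + '#' + std::to_string(row)) % width_;
    }

    std::size_t width_;
    std::size_t depth_;
    std::vector<std::uint32_t> table_;
};
```

With this class, adding 1 for "aaa", 4 for "bbb", and 8 for "ccc" yields estimates of at least 1, 4, and 8 respectively. More rows make an overestimate less likely, and wider rows make collisions rarer, at the cost of memory.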

Building the docs

Prerequisites: CMake 3.11+, Doxygen, Graphviz, and the Python packages listed in docs/requirements.txt (breathe and sphinx-rtd-theme).

macOS:

brew install cmake doxygen graphviz
pip install -r docs/requirements.txt

Ubuntu/Debian:

sudo apt-get install cmake doxygen graphviz
pip install -r docs/requirements.txt

Then build:

make docs-build

The generated HTML lands in _docs/docs/sphinx/.

Benchmarks

Build and run the benchmark suite (requires CMake 3.11+ and a C++17 compiler; install on macOS with brew install cmake):

make bench-build    # fetches Google Benchmark, compiles
make bench-run      # runs and saves results to benchmark_results/local/<timestamp>.json
make bench-compare  # compares the two most-recent local result files

Results are stored in benchmark_results/local/ (gitignored, per-machine) so local runs never pollute the repo. CI results are committed to benchmark_results/ci/ and used as the regression baseline for pull requests.

See the full benchmark documentation for details on filtering, repeating runs, and comparing specific result files.

About

Collection of probabilistic data structures in C++: CM-Sketch, ECM-Sketch, and exponential histogram.
