Accelera is a hybrid Python/C++ machine learning framework for building graph-based pipelines, running independent branches in parallel, generating HTML reports, and experimenting with automated preprocessing and loop parallelization.
- Graph ML pipelines: build DAG-style workflows with preprocessing, model, predict, metric, merge, and branch nodes.
- Parallel branch execution: compare multiple preprocessing/model/metric combinations in one pipeline run through the C++ graph backend.
- Custom model support: plug in sklearn-compatible estimators or extend CustomClassifier, CustomRegressor, CustomClusterer, and CustomTransformer.
- Reporting: generate graph visualizations and HTML metric reports through GraphReport, ModelReport, and AutoML preprocessing reports.
- Auto preprocessing: tabular, text, image-classification, and segmentation preprocessing utilities with saved preprocessors and visual summaries.
- Dataset retriever: list and download shared CSV datasets into a local cache with accelera.src.utils.dataset_retriever.DatasetRetriever.
- C/C++ code parallelizer: extract loops with Clang AST, derive loop features, call an OpenMP classifier service, and inject OpenMP pragmas into parallelizable for loops. This module is Linux-only.
- Benchmark backend prototype: Express/MongoDB backend scaffolding for benchmarks, users, metrics, and submissions.

- The core DAG pipeline, custom estimator interfaces, reports, dataset retrieval, and preprocessing utilities are implemented in this repo.
- The AutoML search agent API exists, but the default search algorithm is still a placeholder.
- The benchmark backend is an early prototype.
- The code parallelizer requires Linux, LLVM/Clang, built pybind bindings, and the classifier endpoint configured in accelera/src/config.py.
git clone https://github.com/Mohamed-Ashraf273/accelera.git
cd accelera
python3 -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt
pip install psutil requests gdown graphviz
# Add Accelera to Python's import path for this terminal session.
# This is required before running examples, notebooks, or tests from the repo.
export PYTHONPATH="$PWD:${PYTHONPATH:-}"
# Linux only. Install LLVM before running CMake: it is needed to build the
# code-parallelizer bindings, and the current Linux CMake config expects it.
sudo bash shell/install_llvm.sh 18
cmake -S . -B build
cmake --build build -j"$(nproc)"
# After building the C++/pybind modules, also expose the generated bindings.
export PYTHONPATH="$PWD:$PWD/build/bindings:${PYTHONPATH:-}"

Run the export PYTHONPATH=... command again whenever you open a new terminal.
If you skip it, imports such as from accelera.src... or the native graph
binding may fail even when the package files exist locally.
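
A quick way to confirm that the export took effect is to ask Python whether it can locate the package before running any example. The sketch below uses only the standard library; the helper name is_importable is hypothetical, and the only Accelera-specific assumption is that the top-level package is named accelera, as in the import paths used throughout this README.

```python
import importlib.util


def is_importable(module_name: str) -> bool:
    """Return True if a top-level module can be located on the import path."""
    try:
        return importlib.util.find_spec(module_name) is not None
    except (ImportError, ValueError):
        # find_spec can raise for malformed names or half-initialized modules.
        return False


if __name__ == "__main__":
    # "accelera" is the package directory at the repository root.
    if is_importable("accelera"):
        print("accelera is importable")
    else:
        print("accelera not found; run the export PYTHONPATH command again")
```

The check only locates the package; it does not import it, so it will not detect a missing or stale native binding.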
# Parallel sklearn-vs-Accelera pipeline comparison
python examples/sklearn_comp.py
# Full branching pipeline demo with a custom PyTorch classifier and reports
python examples/demo.py
# Run tests
pytest accelera

For notebooks, open examples/dataset_retriever_demo.ipynb,
examples/code_optimizer_demo.ipynb,
examples/autopreprocessing-classification-v3.ipynb, or
examples/segmentation-training-gp.ipynb after exporting PYTHONPATH.
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import MinMaxScaler
from sklearn.preprocessing import StandardScaler
from accelera.src.accelera_pipe.core.pipeline import Pipeline
X, y = make_classification(
    n_samples=5000,
    n_features=20,
    n_informative=15,
    random_state=42,
)
X_test, y_test = X[:200], y[:200]
pipe = Pipeline()
pipe.branch(
    "preprocessing",
    pipe.preprocess("standard", StandardScaler(), branch=True),
    pipe.preprocess("minmax", MinMaxScaler(), branch=True),
).model(
    "logreg",
    LogisticRegression(max_iter=1000),
).predict(
    "predict",
    test_data=X_test,
).metric(
    "accuracy",
    "accuracy_score",
    y_true=y_test,
)
predictions, executed_graph = pipe(X, y, select_strategy="max")
best_result = executed_graph(X_test, y_test)
print(predictions)
print(best_result)

The examples below assume you already ran the Quick Start setup and exported
PYTHONPATH. For graph-backed pipeline examples, use:
export PYTHONPATH="$PWD:$PWD/build/bindings:${PYTHONPATH:-}"

If the native graph import fails, rebuild the C++ bindings with
cmake --build build -j"$(nproc)" and run the export command again.
Use Pipeline when you want to compare several preprocessing/model paths in a
single graph run. Each builder call adds a node. Passing branch=True creates
a branch candidate, and branch() groups those candidates under one split.
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import MinMaxScaler, StandardScaler
from accelera.src.accelera_pipe.core.pipeline import Pipeline
X, y = make_classification(n_samples=5000, n_features=20, random_state=42)
X_val, y_val = X[:500], y[:500]
X_test, y_test = X[500:1000], y[500:1000]
pipe = Pipeline()
pipe.branch(
    "preprocessing",
    pipe.preprocess("standard", StandardScaler(), branch=True),
    pipe.preprocess("minmax", MinMaxScaler(), branch=True),
).model(
    "model_lr",
    LogisticRegression(max_iter=1000),
).predict(
    "predict",
    test_data=X_val,
).metric(
    "metric",
    "accuracy_score",
    y_true=y_val,
)
results, executed_graph = pipe(X, y, select_strategy="max")
test_results = executed_graph(X_test, y_true=y_test)
print(results)
print(test_results)

Useful pipeline options:
- select_strategy="all" returns all graph paths.
- select_strategy="max" selects the path with the highest metric.
- select_strategy="min" selects the path with the lowest metric.
- custom_strategy= accepts a user-defined path selection function.
- pipe.disable_parallel_execution() forces serial graph execution.
- pipe.set_multicore_threshold(n) changes the backend threshold for multicore execution.
- cache=False is the default for preprocess/model nodes. Enable cache only when repeated runs reuse the same expensive node inputs.
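
To make the selection strategies concrete, here is a plain-Python sketch of what "all", "max", "min", and a custom strategy mean when applied to per-path metric values. It is an illustration only, not Accelera's implementation: the function select_paths, the dictionary layout, and the callable signature used for custom_strategy are all assumptions, and the metric values are made-up placeholders rather than measured results.

```python
from typing import Callable, Dict, Optional

PathMetrics = Dict[str, float]


def select_paths(
    path_metrics: PathMetrics,
    select_strategy: str = "all",
    custom_strategy: Optional[Callable[[PathMetrics], PathMetrics]] = None,
) -> PathMetrics:
    """Pick graph paths by metric value (hypothetical helper)."""
    if custom_strategy is not None:
        # Assumed signature: takes all path metrics, returns the chosen subset.
        return custom_strategy(path_metrics)
    if select_strategy == "all":
        return dict(path_metrics)
    if select_strategy == "max":
        best = max(path_metrics, key=path_metrics.get)
    elif select_strategy == "min":
        best = min(path_metrics, key=path_metrics.get)
    else:
        raise ValueError(f"unknown select_strategy: {select_strategy!r}")
    return {best: path_metrics[best]}


# Placeholder values for two branches of the example pipeline.
metrics = {"standard -> model_lr": 0.91, "minmax -> model_lr": 0.88}

print(select_paths(metrics, "max"))
print(select_paths(metrics, "min"))
print(select_paths(metrics, custom_strategy=lambda m: {k: v for k, v in m.items() if v > 0.9}))
```

Use "max" for scores where higher is better, such as accuracy, and "min" for error-style metrics where lower is better.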
Unexecuted pipelines and executed graphs can both be saved as one pickle file. Save an unexecuted pipeline when you want to store the graph recipe and train it later. Save an executed graph when you already trained the pipeline and want to reuse the fitted preprocessing/model path for inference.
from accelera.src.accelera_pipe.core.executed_graph import ExecutedGraph
from accelera.src.accelera_pipe.core.pipeline import Pipeline
pipe.save("pipeline.pkl")
loaded_pipe = Pipeline.load("pipeline.pkl")
results, executed_graph = loaded_pipe(X, y, select_strategy="max")
executed_graph.save("executed_graph.pkl")
loaded_graph = ExecutedGraph.load("executed_graph.pkl")
predictions = loaded_graph(X_test, y_true=y_test)

Notes:
- An unexecuted pipeline stores the graph structure and callable node objects.
- An executed graph stores the fitted objects needed for inference.
- Top-level custom preprocess functions and simple lambdas can be stored through source-backed wrappers when possible.
- Custom functions with closures are rejected because captured external variables cannot be reconstructed safely from source code alone.
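
Whether a custom function captures outside variables can be checked before saving. The sketch below uses standard Python introspection only; the helper names are hypothetical, and Accelera's own check may differ in detail.

```python
def captured_names(func):
    """Return the names a function captures from an enclosing scope."""
    return tuple(func.__code__.co_freevars)


def is_closure_free(func):
    """True when the function captures no enclosing-scope variables."""
    return len(captured_names(func)) == 0


def scale_by_two(value):
    # Top-level function: everything it needs is in its own source.
    return value * 2


def make_scaler(factor):
    def scale(value):
        # "factor" lives in the enclosing call, not in this function's source.
        return value * factor

    return scale


print(is_closure_free(scale_by_two))   # True
print(captured_names(make_scaler(3)))  # ('factor',)
```

A function like scale can usually be made saveable by moving it to module level and defining the captured value inside the function body.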
Use the dataset retriever when you want to pull one of the shared demo datasets
without manually downloading CSV files. Call available_datasets() first to
see the registered names, then connect, retrieve the dataset, and close the
connection when finished.
from accelera.src.utils.dataset_retriever import retriever
print(retriever.available_datasets())
retriever.connect()
housing_df = retriever.retrieve_dataset("Housing", df=True)
print(housing_df.head())
retriever.close()

Tabular preprocessing prepares classical machine-learning datasets. It handles
common cleaning, train/validation splitting, target handling, and report output
under the folder you pass in folder_path.
from accelera.src.automl.core.classical_training_preprocessing import (
    ClassicalTrainingPreprocessing,
)
from accelera.src.utils.dataset_retriever import retriever
retriever.connect()
df = retriever.retrieve_dataset("Titanic-Dataset", df=True)
preprocessor = ClassicalTrainingPreprocessing(
    df,
    target_col="Survived",
    problem_type="classification",
    folder_path="./titanic_preprocessing_report",
)
X_train, y_train, X_val, y_val = preprocessor.common_preprocessing()
retriever.close()

Text preprocessing prepares a text column and target column for NLP experiments. Pass the dataframe, the target column, and the text column, then use the returned train/validation arrays in your model code.
import pandas as pd
from accelera.src.automl.core.text_training_preprocessing import (
    TextTrainingPreprocessing,
)
reviews_df = pd.DataFrame(
    {
        "review": ["Great product", "Very bad experience", "I like it"],
        "class": [1, 0, 1],
    }
)
text_preprocessor = TextTrainingPreprocessing(
    reviews_df,
    target_col="class",
    text_col="review",
    folder_path="./reviews_report",
)
X_train, y_train, X_val, y_val = text_preprocessor.common_preprocessing()

Image preprocessing expects a folder structure that contains class folders.
When split_training=True, it creates a validation split from the training
folder. Use augment=True when you want training-time augmentation.
from accelera.src.automl.core.classification_image_training_preprocessing import (
    ClassificationImageTrainingPreprocessing,
)
image_preprocessor = ClassificationImageTrainingPreprocessing(
    training_folder_images="./PetImages",  # replace with your class folders
    folder_path="./PetImagesReport",
    split_training=True,
    val_size=0.2,
    images_size=(224, 224),
    augment=True,
)
training_loader, validation_loader = image_preprocessor.common_preprocessing()

Graph reports visualize a serialized pipeline graph together with the pipeline
results. Serialize the pipeline to XML first, then pass that XML file and the
results to GraphReport.
from accelera.src.utils.accelera_utils import serialize
from accelera.src.accelera_pipe.wrappers.graph_report import GraphReport
predictions, executed_graph = pipe(X, y, select_strategy="max")
serialize(pipe, "pipeline.xml")
report = GraphReport("pipeline_report", "pipeline.xml", predictions)
report.execute()

Use ModelReport when you already have metric results from a normal model and
want the same report format without building a full Accelera Pipe graph.
from sklearn.metrics import accuracy_score
from accelera.src.accelera_pipe.wrappers.model_report import ModelReport
accuracy = accuracy_score(y_test, model.predict(X_test))
results = [
    {
        "metric name": "accuracy",
        "result": accuracy,
        "plot_func": None,
        "labels_name": None,
        "headers_name": None,
    }
]
report = ModelReport("model_report", results=results)
report.execute()

Use the parallelizer when you want to analyze loop-heavy C/C++ code and emit
OpenMP pragmas. The module is Linux-only and needs the C++ bindings, LLVM/Clang,
and the classifier endpoint configured in accelera/src/config.py.
from accelera.src.utils.parallelizer import parallelizer
parallelizer.parallelize("examples/test_loops.c")
# Writes examples/parallelized_test_loops.c

For in-memory C/C++ code:
from accelera.src.utils.parallelizer import parallelizer
code = """
int main() {
    int total = 0;
    for (int i = 0; i < 1000; i++) {
        total += i;
    }
}
"""
parallelized_code = parallelizer.parallelize(code, file=False)
print(parallelized_code)

For supported Python code, the parallelizer first converts Python to C++ and then applies the same loop extraction and OpenMP insertion path:
code = """
total = 0
for i in range(1000):
    total += i
print(total)
"""
parallelized_code = parallelizer.parallelize(code, file=False)
print(parallelized_code)

The Python-to-C++ converter supports a restricted loop-friendly subset:
- constants, variables, arithmetic, comparisons, boolean operations;
- function calls and print;
- attribute access and indexing, but not slices;
- simple assignment and simple-name augmented assignment;
- if/else, return;
- for i in range(...) with one, two, or three arguments;
- simple def functions without decorators.
Unsupported Python syntax raises an error when the converter is called directly. When the converter runs through automatic pipeline optimization, Accelera falls back to the original Python function instead.
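
Some of the restrictions listed above can be checked up front with Python's ast module. The sketch below is a hypothetical pre-check, not the converter's own validation: it covers only four of the documented rules (no slices, augmented assignment to simple names only, range-based for loops, no decorators), so the real converter may accept or reject code that this check does not flag.

```python
import ast
from typing import List, Tuple


def _is_supported_range_loop(loop: ast.For) -> bool:
    """True for `for <name> in range(a[, b[, c]])` with positional args only."""
    call = loop.iter
    return (
        isinstance(loop.target, ast.Name)
        and isinstance(call, ast.Call)
        and isinstance(call.func, ast.Name)
        and call.func.id == "range"
        and 1 <= len(call.args) <= 3
        and not call.keywords
    )


def find_unsupported(source: str) -> List[Tuple[int, str]]:
    """Return sorted (line, reason) pairs for constructs outside the subset."""
    problems = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Subscript) and any(
            isinstance(part, ast.Slice) for part in ast.walk(node.slice)
        ):
            problems.add((node.lineno, "slice indexing"))
        elif isinstance(node, ast.AugAssign) and not isinstance(node.target, ast.Name):
            problems.add((node.lineno, "augmented assignment to a non-name target"))
        elif isinstance(node, ast.For) and not _is_supported_range_loop(node):
            problems.add((node.lineno, "for loop that is not a plain range(...) loop"))
        elif isinstance(node, ast.FunctionDef) and node.decorator_list:
            problems.add((node.lineno, "decorated function"))
    return sorted(problems)


supported = "def total(X):\n    s = 0\n    for i in range(len(X)):\n        s += X[i]\n    return s\n"
unsupported = "for x in items:\n    y = x[1:3]\n"

print(find_unsupported(supported))
for line, reason in find_unsupported(unsupported):
    print(f"line {line}: {reason}")
```

Running a check like this before calling the parallelizer gives a line-level hint about why a function might not be converted.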
Pipeline.preprocess() automatically tries to optimize custom preprocessing
functions through the Parallelizer when possible.
from accelera.src.accelera_pipe.core.pipeline import Pipeline
def normalize_rows(X):
    for i in range(len(X)):
        s = 0
        for j in range(len(X[i])):
            s += X[i][j] * X[i][j]
        norm = s ** 0.5
        for j in range(len(X[i])):
            X[i][j] = X[i][j] / norm
    return X
pipe = Pipeline()
pipe.preprocess("normalize", normalize_rows)

The automatic path is:
Python custom function
-> py2cpp_converter
-> parallelizer OpenMP pragma insertion
-> cpp_compiler.py / pybind11 native module
-> Accelera Pipe preprocess node
If conversion, classification, OpenMP insertion, compilation, or import fails, Accelera keeps the original Python function so the pipeline remains correct.
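
The fallback behaviour described above follows a common pattern: attempt the optimized path, and keep the original callable if any stage fails. The sketch below illustrates that pattern in isolation. The helper optimize_or_fallback and the optimizer argument are hypothetical stand-ins for the convert, compile, and import chain; they are not Accelera functions.

```python
from typing import Callable, Tuple


def optimize_or_fallback(
    func: Callable, optimizer: Callable[[Callable], Callable]
) -> Tuple[Callable, bool]:
    """Return (callable_to_use, optimized_flag).

    The original function is kept whenever the optimizer raises, so callers
    always get something that computes the same result.
    """
    try:
        return optimizer(func), True
    except Exception:
        return func, False


def double(values):
    return [v * 2 for v in values]


def failing_optimizer(func):
    # Stand-in for a stage that cannot handle the input, e.g. unsupported syntax.
    raise NotImplementedError("conversion not supported")


chosen, optimized = optimize_or_fallback(double, failing_optimizer)
print(optimized)          # False
print(chosen([1, 2, 3]))  # [2, 4, 6]
```

Returning the flag alongside the callable makes it easy to log when the slower Python path is in use.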
Things that can prevent these modules from running:
- Missing C++ bindings: run cmake --build build and export PYTHONPATH="$PWD:$PWD/build/bindings".
- Not on Linux: the code parallelizer bindings are disabled on Windows and macOS in the current CMake configuration.
- LLVM/Clang missing: install LLVM/Clang before configuring CMake. The project script is sudo bash shell/install_llvm.sh 18.
- OpenMP compiler support missing: generated native code requires a compiler with OpenMP support. On Linux this usually means g++/clang++ plus OpenMP runtime libraries.
- Classifier endpoint unavailable: set ACCELERA_CLASSIFIER_ENDPOINT or ensure the default Hugging Face Space is reachable. Also check ACCELERA_REQUEST_TIMEOUT_S for slow networks.
- clang-format missing: output formatting is optional. Install clang-format or set ACCELERA_ENABLE_CPP_FORMATTING=0.
- Unsupported Python syntax: the converter is intentionally limited. Use simple numeric loops, range, scalar variables, and indexing.
- Custom function source unavailable: functions defined dynamically, interactively, or inside closures may not be inspectable or saveable.
- Closure variables in saved custom functions: source-backed save/load rejects closures because external captured values are not stored.
- Pickle limitations: custom classes/functions must be pickle-compatible unless wrapped by the source-backed function path.
- Large memory usage in branch-heavy searches: graph execution may use more memory than sklearn Pipeline because multiple branches and fitted states can be alive during selection.
- Cache confusion: cache is off by default. Enable it only when repeated identical node inputs justify the hashing and disk I/O cost.
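
When working through the list above, it can help to see which Accelera-related environment variables are actually set in the current process. The sketch below only reads the three variable names mentioned in that list; the helper read_accelera_env is hypothetical, and it deliberately reports unset variables as None instead of guessing at Accelera's built-in defaults.

```python
import os
from typing import Dict, Mapping, Optional

# Variable names taken from the troubleshooting list in this README.
ACCELERA_ENV_VARS = (
    "ACCELERA_CLASSIFIER_ENDPOINT",
    "ACCELERA_REQUEST_TIMEOUT_S",
    "ACCELERA_ENABLE_CPP_FORMATTING",
)


def read_accelera_env(
    environ: Mapping[str, str] = os.environ,
) -> Dict[str, Optional[str]]:
    """Return each variable's raw value, or None when it is not set."""
    return {name: environ.get(name) for name in ACCELERA_ENV_VARS}


if __name__ == "__main__":
    for name, value in read_accelera_env().items():
        print(f"{name} = {value if value is not None else '<unset>'}")
```

Passing a plain dictionary as environ makes the helper easy to exercise without touching the real environment.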
Useful environment variables:
export PYTHONPATH="$PWD:$PWD/build/bindings"
export ACCELERA_CLASSIFIER_ENDPOINT="https://accelera-ai-open-mp-classifier.hf.space/predict"
export ACCELERA_REQUEST_TIMEOUT_S=10
export ACCELERA_ENABLE_CPP_FORMATTING=0 # optional
export ACCELERA_CPP_OPT_LEVEL=-O0 # faster compile, default in cpp_compiler

Useful validation commands:
python examples/sklearn_comp.py
python examples/parallel_accpipe.py
python tools/evaluate_hard_parallelizer.py
pytest accelera/src/accelera_pipe/core/pipeline_test.py -q
pytest accelera/src/utils/parallelizer_test.py -q

accelera/
├── accelera/
│ ├── api/ # generated public API modules
│ ├── bindings/ # pybind11 bindings
│ └── src/
│ ├── accelera_pipe/ # DAG pipeline, execution graph
│ ├── automl/ # preprocessing, reports, AutoML agent scaffold
│ ├── benchmark/ # Node.js backend prototype
│ ├── custom/ # estimator base classes
│ ├── utils/ # dataset retriever, parallelizer and code utilities
│ └── wrappers/ # HTML/report helpers
├── src/ # C++ core, nodes, AST, and utility sources
├── include/ # C++ headers
├── examples/ # scripts and notebooks
├── docs/ # MkDocs documentation
├── shell/ # setup scripts
└── CMakeLists.txt
# Regenerate API exports after changing Python modules
python api_gen.py
# Run formatting/lint hooks
pre-commit run --all-files --hook-stage manual
# Serve docs locally
mkdocs serve
# Run Benchmark
## Run Backend
cd accelera/src/benchmark/backend
npm install
npm run dev
## Run Frontend
cd accelera/src/benchmark/frontend
npm install
npm run dev
Apache License 2.0. See LICENSE.
