transformers-lite

Transformer inference in C++. A header-only library providing tensor operations and transformer building blocks.

The library has zero external dependencies: it relies only on the C++ standard library. OpenMP #pragma directives are used in a few hot loops for optional parallelism. Compilers without OpenMP support (or with it disabled) ignore them, and the loops run serially.
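To illustrate the optional-parallelism point, here is a minimal sketch of a loop that compiles and behaves the same with or without OpenMP. The names scale_in_place and kOpenMPEnabled are hypothetical and not part of transformers-lite; the library's actual hot loops may look different.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper (not a transformers-lite function): scales a buffer in place.
// With OpenMP enabled (e.g. -fopenmp on GCC/Clang) iterations are split across
// threads; otherwise the pragma is ignored and the loop runs serially.
inline void scale_in_place(std::vector<float>& data, float factor) {
    // A signed index keeps the loop valid for older OpenMP implementations.
    const std::ptrdiff_t n = static_cast<std::ptrdiff_t>(data.size());
#pragma omp parallel for
    for (std::ptrdiff_t i = 0; i < n; ++i) {
        data[static_cast<std::size_t>(i)] *= factor;
    }
}

// True only when this translation unit was compiled with OpenMP enabled.
constexpr bool kOpenMPEnabled =
#ifdef _OPENMP
    true;
#else
    false;
#endif
```

Each iteration writes to a distinct element, so the result does not depend on whether the loop runs in parallel.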

Requirements

CMake 3.16+
C++20 compiler (GCC 11+, Clang 14+)

Build

cmake -S . -B build
cmake --build build

Build with tests

Tests use GoogleTest, which is fetched automatically via CMake's FetchContent, so no manual install is needed.

cmake -S . -B build -DBUILD_TESTS=ON
cmake --build build
cd build && ctest --output-on-failure

Usage

The library is header-only. Link against the transformers_lite CMake target:

add_subdirectory(transformers-lite)
target_link_libraries(your_target PRIVATE transformers_lite)

Then include the headers you need:

#include <transformers-lite/tensor.hpp>
#include <transformers-lite/ops.hpp>

using namespace transformers_lite;

int main() {
    Tensor<CPU, float> x(Shape(4));
    x(0) = 1.0f;
    x(1) = 2.0f;
    x(2) = 3.0f;
    x(3) = 4.0f;
}

Adding custom ops

Most custom ops only require a new expression struct — no changes to the library internals.

An expression must satisfy the TensorExpr<E, COMPUTE, T> concept:

Method	Description
`Shape outputShape() const`	Returns the shape of the result
`void evalInto(TensorView<T>& out) const`	Writes the computed result into `out`

Tensor::operator= detects this interface automatically, so the assignment syntax result = myExpr(...) just works.

Example: element-wise clamp

#include <algorithm>  // std::clamp
#include <transformers-lite/core/ops/exprs.hpp>

namespace transformers_lite {

template <template <class> class COMPUTE, typename T>
struct ClampExpr {
    using value_type = T;
    TensorView<T> x;
    T low, high;

    Shape outputShape() const { return x.shape(); }

    void evalInto(TensorView<T>& out) const {
        for (size_t i = 0; i < x.size(); ++i)
            out[i] = std::clamp(x[i], low, high);
    }
};

template <template <class> class COMPUTE, typename T>
auto clamp(const Tensor<COMPUTE, T>& x, T low, T high) -> ClampExpr<COMPUTE, T> {
    return {TensorView<T>{x}, low, high};
}

} // namespace transformers_lite

Usage:

Tensor<CPU, float> x(Shape(4), {-2.f, 0.5f, 1.5f, 3.f});
Tensor<CPU, float> y;
y = clamp(x, 0.f, 1.f); // y == [0, 0.5, 1, 1]

When you also need a new backend kernel

If the op is compute-intensive and needs an AVX-512 (or future CUDA) fast path, add the dispatch inside evalInto using the existing Ops<COMPUTE, T> pattern:

Add the method to Ops<CPU, T> in include/transformers-lite/core/ops/cpu_ops.hpp.
Add an AVX-512 implementation in include/transformers-lite/core/ops/avx512_ops.hpp under #ifdef __AVX512F__, then dispatch with if constexpr inside the Ops<CPU, T> method.
Call Ops<COMPUTE, T>::yourMethod(...) from evalInto.
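The steps above could be sketched roughly as follows for a clamp kernel. This is an illustration under assumptions, not the library's code: the sketch namespace, the empty CPU tag, the raw-pointer signature of Ops<CPU, T>::clamp, and the helper clampAvx512 are all hypothetical, and the real headers may organise the dispatch differently.

```cpp
#include <algorithm>
#include <cstddef>
#include <type_traits>
#ifdef __AVX512F__
#include <immintrin.h>
#endif

namespace sketch {  // stand-in namespace; not the library's code

template <class> struct CPU {};  // stand-in for the library's CPU backend tag

template <template <class> class COMPUTE, typename T>
struct Ops;  // primary template; one specialization per backend

#ifdef __AVX512F__
// Hypothetical AVX-512 kernel (would live in avx512_ops.hpp): 16 floats per step.
// NaN inputs may be handled differently from std::clamp on this path.
inline void clampAvx512(const float* x, float* out, std::size_t n,
                        float low, float high) {
    const __m512 lo = _mm512_set1_ps(low);
    const __m512 hi = _mm512_set1_ps(high);
    std::size_t i = 0;
    for (; i + 16 <= n; i += 16) {
        __m512 v = _mm512_loadu_ps(x + i);
        v = _mm512_min_ps(_mm512_max_ps(v, lo), hi);
        _mm512_storeu_ps(out + i, v);
    }
    for (; i < n; ++i) out[i] = std::clamp(x[i], low, high);  // scalar tail
}
#endif

// Hypothetical CPU specialization (would live in cpu_ops.hpp).
template <typename T>
struct Ops<CPU, T> {
    static void clamp(const T* x, T* out, std::size_t n, T low, T high) {
#ifdef __AVX512F__
        if constexpr (std::is_same_v<T, float>) {
            clampAvx512(x, out, n, low, high);  // vectorized fast path
        } else
#endif
        {
            // Portable fallback for other types or builds without AVX-512.
            for (std::size_t i = 0; i < n; ++i)
                out[i] = std::clamp(x[i], low, high);
        }
    }
};

}  // namespace sketch
```

An expression's evalInto would then forward its buffers to Ops<COMPUTE, T>::clamp(...), so the expression struct stays backend-agnostic while the kernel choice is made at compile time.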
