oudeis01/nayuta_bert

bert.c

A pure-C, fully serial, fully observable inference engine for BERT-base-uncased.

Every floating-point operation is written out in an explicit loop, so an operation-level hook can be inserted at any point in the forward pass. There is no OpenMP and no BLAS. The engine runs either as a standalone dev binary or in installation mode, where it streams every operation as an event for downstream visualization and audio.
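
As a rough illustration of what an operation-level hook on an explicit loop can look like, here is a minimal sketch of a matrix-vector product that reports each multiply-add to a callback. The names fma_event, fma_hook, matmul_hooked and count_hook are hypothetical and are not taken from bert.c; the actual hook interface may differ.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical event record: one multiply-add, acc += w * x. */
typedef struct {
    size_t row, col;  /* position in the weight matrix */
    float w, x;       /* the two operands */
    float acc;        /* accumulator value after the multiply-add */
} fma_event;

/* Hypothetical hook signature; ctx is caller-owned state. */
typedef void (*fma_hook)(const fma_event *ev, void *ctx);

/* out[i] = bias[i] + sum_j W[i*n + j] * x[j], with W row-major (d rows, n cols).
 * If hook is non-NULL it is called once per multiply-add, in loop order. */
static void matmul_hooked(float *out, const float *x, const float *W,
                          const float *bias, size_t n, size_t d,
                          fma_hook hook, void *ctx)
{
    assert(out && x && W);
    for (size_t i = 0; i < d; i++) {
        float acc = bias ? bias[i] : 0.0f;
        for (size_t j = 0; j < n; j++) {
            acc += W[i * n + j] * x[j];
            if (hook) {
                fma_event ev = { i, j, W[i * n + j], x[j], acc };
                hook(&ev, ctx);
            }
        }
        out[i] = acc;
    }
}

/* Example hook: count how many multiply-adds were observed. */
static void count_hook(const fma_event *ev, void *ctx)
{
    (void)ev;
    (*(size_t *)ctx)++;
}
```

Passing count_hook with a pointer to a size_t counter gives the number of multiply-adds in one product, which is n * d. Because the loop is serial and nothing is delegated to a BLAS call, the hook sees every operation in a deterministic order.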

Project context

This engine is the computation core of Nayuta: The Transformer, an art installation and computational study that runs a single BERT forward pass over roughly 62,400 institutional art texts at about 1,000 operations per second, a pace at which the full run would take on the order of 159,000 years. Every floating-point operation is surfaced as a visual or auditory event, so the model's computation is rendered as transparent arithmetic rather than an opaque result. (Nayuta is a Buddhist term for an immense, almost uncountable number, echoing that runtime.)
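
The figures above imply a total operation count that is easy to check by hand. The sketch below multiplies the stated rate by the stated duration; total_operations is a hypothetical helper, and a 365.25-day year is an assumption. With 1,000 operations per second over 159,000 years it gives about 5.0 × 10^15 operations, which would be on the order of 8 × 10^10 per text if spread evenly over 62,400 texts.

```c
#include <assert.h>

/* Total operations performed at a constant rate over a span of years.
 * Assumes a 365.25-day year (31,557,600 seconds). */
static double total_operations(double ops_per_second, double years)
{
    assert(ops_per_second > 0.0 && years >= 0.0);
    const double seconds_per_year = 365.25 * 86400.0;
    return ops_per_second * years * seconds_per_year;
}
```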

bert.c is factored out of that project as a reusable, self-contained component. The full installation mounts this repository as a submodule at bert_inference/engine/ and lives at github.com/oudeis01/nayuta.

Provenance

bert.c is based on llama2.c by Andrej Karpathy (MIT). The bidirectional attention pattern follows modernbert.c by Hardik Vala. Key differences from Llama: summed word, position, and token_type embeddings followed by LayerNorm; bidirectional N×N attention (no causal mask); no RoPE; GELU activation; a standard 2-layer FFN with bias; and post-norm (LayerNorm applied after each residual add). See the header comment in bert.c for the full list.

Build

Configure and build with CMake:

cmake -B build
cmake --build build

Two targets are produced from the single bert.c:

  • bert: dev mode, pure C, no dependencies beyond libm (-lm).
  • bert_install: installation mode (-DINSTALLATION_MODE). Streams FMA batches over ZeroMQ (PUSH) and structural events over OSC. Requires libzmq and liblo (resolved via pkg-config).

Inputs

The engine consumes a weights file exported from a Hugging Face BERT-base checkpoint plus a pre-tokenized corpus. The exporters and corpus builders are not part of this engine repo; they live alongside the installation that drives it.

Run

Dev mode takes an exported weights file and a token source and runs one forward pass:

# token IDs from a file
./build/bert bert_base.bin tokens.txt

# or pipe token IDs on stdin
echo "101 2057 2024 2204 102" | ./build/bert bert_base.bin
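
Both token sources above carry whitespace-separated decimal token IDs, so one reader can serve a file and stdin alike. The following is a minimal sketch of such a reader; read_token_ids is a hypothetical name, and how bert.c actually parses its input is not shown here. BERT-base has 512 position embeddings, so 512 is a natural value for max_tokens.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Read up to max_tokens whitespace-separated decimal token IDs from fp.
 * Stops at end of input, at the first field that is not an integer,
 * or when the buffer is full. Returns the number of IDs stored. */
static size_t read_token_ids(FILE *fp, int *tokens, size_t max_tokens)
{
    assert(fp && tokens);
    size_t n = 0;
    while (n < max_tokens && fscanf(fp, "%d", &tokens[n]) == 1) {
        n++;
    }
    return n;
}
```

A caller would pass stdin when no token file is given on the command line, and the result of fopen(path, "r") otherwise.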

The operation-level hook fires at each weight-matrix computation as the pass runs. In installation mode (bert_install), the same stream is pushed over ZeroMQ and OSC instead.
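
One way to connect a per-operation hook to a batched transport is a small fixed-size buffer that hands off its contents whenever it fills. The sketch below shows that pattern in plain C with a pluggable flush callback. All names and the capacity are hypothetical, and the batching scheme used by bert_install may differ.

```c
#include <assert.h>
#include <stddef.h>

#define BATCH_CAPACITY 4  /* deliberately tiny, for illustration only */

typedef struct { float w, x, acc; } fma_record;

/* Called with a full (or final partial) batch; ctx is caller-owned. */
typedef void (*flush_fn)(const fma_record *recs, size_t count, void *ctx);

typedef struct {
    fma_record recs[BATCH_CAPACITY];
    size_t count;
    flush_fn flush;
    void *ctx;
} fma_batcher;

static void batcher_init(fma_batcher *b, flush_fn flush, void *ctx)
{
    assert(b && flush);
    b->count = 0;
    b->flush = flush;
    b->ctx = ctx;
}

/* Hand any buffered records to the sink and empty the buffer. */
static void batcher_flush(fma_batcher *b)
{
    assert(b && b->flush);
    if (b->count > 0) {
        b->flush(b->recs, b->count, b->ctx);
        b->count = 0;
    }
}

/* Append one record; flush automatically when the buffer is full. */
static void batcher_push(fma_batcher *b, float w, float x, float acc)
{
    assert(b && b->count < BATCH_CAPACITY);
    fma_record r = { w, x, acc };
    b->recs[b->count++] = r;
    if (b->count == BATCH_CAPACITY) {
        batcher_flush(b);
    }
}

/* Example sink: tally batches and records instead of sending them. */
typedef struct { size_t batches, records; } tally;

static void tally_flush(const fma_record *recs, size_t count, void *ctx)
{
    (void)recs;
    tally *t = (tally *)ctx;
    t->batches++;
    t->records += count;
}
```

In an installation-mode build, the sink would presumably pass the buffer to a ZeroMQ PUSH socket (for example with zmq_send) rather than tallying it. A final batcher_flush at the end of the pass delivers the last partial batch.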

License

MIT. See LICENSE. Original llama2.c copyright retained.
