BrainAudio

LightBeam: An Accurate and Memory-Efficient CTC Decoder for Speech Neuroprostheses

General Installation Instructions

The following are general installation instructions for the main package. Instructions for the WFST-based language model environment (used in Willett et al., 2023 and Card et al., 2024) can be found at the bottom of this file.

1. Install the uv package. Instructions can be found here.
2. cd into the outer brainaudio directory and run uv sync.
3. uv sync creates a Python virtual environment, which can be activated with source .venv/bin/activate

Test Installation

Ensure your installation is correct by running the provided bash script.

Repository Structure

| Directory | Description |
|---|---|
| `src/brainaudio/` | Core Python package: models, training, inference, datasets |
| `scripts/` | End-to-end pipeline scripts |
| `auxiliary_folders/` | Supporting tools (LLM finetuning, GEC, shallow fusion assets) |
| `results/` | Decoder output CSVs and PER/WER summaries |
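
The WER summaries in results/ are word-level edit-distance rates. For reference, here is a minimal sketch of the standard corpus-level WER computation. The function names `edit_distance` and `word_error_rate` are hypothetical, and the repository's own scoring code may differ (for example in text normalization). PER is the same computation over phoneme tokens instead of words.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two token sequences."""
    # prev[j] holds the distance between the processed prefix of ref and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def word_error_rate(references, hypotheses):
    """Corpus-level WER: total word edits divided by total reference words."""
    total_edits = 0
    total_words = 0
    for ref, hyp in zip(references, hypotheses):
        ref_words = ref.split()
        total_edits += edit_distance(ref_words, hyp.split())
        total_words += len(ref_words)
    if total_words == 0:
        raise ValueError("references contain no words")
    return total_edits / total_words
```

Summing edits and reference words over the whole corpus, rather than averaging per-sentence rates, keeps short sentences from dominating the result.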

Pipeline

The full pipeline is documented in scripts/README.md. A summary of each step is below.

1. Format Data

Convert raw competition data into the trial-level pickle format expected by the training pipeline.

uv run scripts/dataset/brain2text_2025.py   # or brain2text_2024.py
uv run scripts/dataset/lazyload_format.py

2. Train the Encoder

Train a CTC acoustic encoder (Transformer or GRU) over multiple seeds. Edit config_path and device at the top of scripts/train.py, then:

uv run scripts/train.py

Configs are in src/brainaudio/training/utils/custom_configs/. Configs for the four models used in the paper (baseline GRUs and Transformers for B2T '24 and '25) are provided.

3. Finetune the LLM

Fine-tune a causal LLM on transcript data using supervised fine-tuning (SFT) with LoRA. The resulting adapter is used by the decoder for shallow fusion rescoring.

uv run scripts/finetune_llm.py \
    --model-name meta-llama/Llama-3.2-1B \
    --transcript-files /path/to/transcripts_merged_normalized.txt \
    --output-dir /path/to/save/adapter

The transcript file should have one sentence per line, with a # VALIDATION comment separating the training and validation splits.
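
To make the expected transcript layout concrete, here is a small sketch of how such a file could be split. It assumes the # VALIDATION marker sits on its own line and that blank lines can be ignored. The helper name `split_transcripts` is hypothetical, and scripts/finetune_llm.py may parse the file differently.

```python
def split_transcripts(lines, marker="# VALIDATION"):
    """Split transcript lines into (train, val) at the marker line."""
    train, val = [], []
    target = train
    for line in lines:
        text = line.strip()
        if not text:
            continue  # assumption: blank lines carry no sentence
        if text.startswith(marker):
            target = val  # everything after the marker is validation data
            continue
        target.append(text)
    return train, val


# Illustrative sentences only, not taken from the real transcripts.
sample = ["i am fine", "see you soon", "# VALIDATION", "thank you"]
train, val = split_transcripts(sample)
```

With a real file, pass the open file handle (or `f.read().splitlines()`) as `lines`.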

4. Generate Logits + Decode

Generate logits and decode with the LightBeam CTC beam search decoder across multiple seeds in one call:

uv run scripts/batch_decode.py \
    --dataset b2t_25 \
    --model-mode transformer \
    --base-path /home/user \
    --brain2text-dir /home/user/data2 \
    --model-template "neurips_b2t_25_causal_transformer_seed_{seed}" \
    --seeds 0 1 2 3 4 \
    --val \
    --device cuda:0

Logits are generated automatically before each decode. If logits already exist, pass --logits-base to skip generation. Hyperparameters are in scripts/decoder_config.py — tuned defaults for both benchmarks are already set. The token list and lexicon required for decoding are in auxiliary_folders/shallow_fusion/.
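
The --model-template argument contains a {seed} placeholder that is filled in once per value of --seeds. As a rough illustration, assuming the script substitutes the placeholder the way Python's `str.format` does (not confirmed by this README; the helper name `expand_model_names` is hypothetical), the command above would resolve to five model names:

```python
def expand_model_names(template, seeds):
    """Fill the {seed} placeholder once per seed."""
    return [template.format(seed=seed) for seed in seeds]


names = expand_model_names(
    "neurips_b2t_25_causal_transformer_seed_{seed}",
    [0, 1, 2, 3, 4],
)
for name in names:
    print(name)
```

Each resulting name should correspond to a trained encoder checkpoint from step 2.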

Note: LightBeam requires the Huge 4-gram LM, which is not included in this repo. Download it in ARPA format from imagineville.org and update word_lm in scripts/decoder_config.py.
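
In general, shallow fusion ranks beam hypotheses by a weighted sum of the encoder (CTC) log-probability and a language model log-probability. The sketch below shows only that generic scoring rule. It is not LightBeam's actual implementation, and the names `Hypothesis`, `fused_score`, `lm_weight`, and `word_bonus`, as well as the example scores, are hypothetical. The real decoder hyperparameters are in scripts/decoder_config.py.

```python
from dataclasses import dataclass


@dataclass
class Hypothesis:
    text: str
    ctc_logp: float  # log-probability assigned by the CTC encoder
    lm_logp: float   # log-probability assigned by the language model


def fused_score(hyp, lm_weight=0.5, word_bonus=0.0):
    """Generic shallow fusion: encoder score + weighted LM score + length bonus."""
    n_words = len(hyp.text.split())
    return hyp.ctc_logp + lm_weight * hyp.lm_logp + word_bonus * n_words


def rerank(hyps, lm_weight=0.5, word_bonus=0.0):
    """Return hypotheses sorted from best to worst fused score."""
    return sorted(
        hyps,
        key=lambda h: fused_score(h, lm_weight, word_bonus),
        reverse=True,
    )


# Made-up scores, chosen so that the LM changes the ranking.
beam = [
    Hypothesis("i scream", ctc_logp=-4.0, lm_logp=-9.0),
    Hypothesis("ice cream", ctc_logp=-4.2, lm_logp=-5.0),
]
best = rerank(beam)[0]
```

With `lm_weight=0.0` the encoder score alone decides the ranking. Raising the weight lets the language model overrule acoustically similar but less plausible word sequences.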

Generative Error Correction (GEC) (optional)

Fine-tune an instruction-tuned LLM to correct beam search hypotheses as a post-processing step. Edit the config block at the top of auxiliary_folders/finetune_llm/finetune_gec.py, then:

uv run auxiliary_folders/finetune_llm/finetune_gec.py

Ongoing Work

Test-Time Adaptation: We have not implemented DietCORP within this codebase. If you are interested in replicating the Feghhi et al., 2025 NeurIPS results, code is available in the older codebase associated with that paper, or you can reach out to us and we will work on integrating it within BrainAudio!

Installation Instructions for Language Model (WFST-based)

Required only for reproducing the WFST-based decoding results from Willett et al., 2023 and Card et al., 2024. This uses a separate Python environment from the main package.

1. cd into the outer brainaudio directory and run uv venv .wfst -p 3.9.
2. Activate the environment with source .wfst/bin/activate
3. Run uv pip install -r requirements.txt
4. Clone the following repository outside this repository: NEJM repo.
5. Create a directory called third_party. After creating this directory, your project structure should look like this:

    brainaudio/
    ├── src/
    │   └── brainaudio/
    └── third_party/ <-- Create this folder

6. Copy the language_model directory from the NEJM repo into third_party: cp -r nejm-brain-to-text/language_model brainaudio/third_party
7. Run cd third_party/language_model/runtime/server/x86 and then python setup.py install. Make sure this command is run in the .wfst venv.
