Bulbasaur is a coverage-guided greybox fuzzer that implements the core techniques described in the BULBASAUR paper.
Unlike conventional fuzzers that rely solely on generic mutation operators, Bulbasaur integrates an LLM into the fuzzing loop to generate branch-specific mutation functions on demand. When the fuzzer stalls on a hard branch constraint, it queries the LLM with the relevant source context; the LLM produces a targeted Rust mutation function that is compiled to a shared library and loaded at runtime. This allows the fuzzer to perform precise, reusable mutations tailored to individual branch conditions rather than relying on random byte flips.
Bulbasaur is built on top of AFL++ instrumentation and adopts a three-target execution model (fast / full / trace) paired with Thompson Sampling–based seed scheduling and TaintFuzz-style operand-guided mutation.
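As a rough illustration of Thompson Sampling for seed scheduling, the sketch below keeps a Beta posterior per seed and schedules the seed whose random draw is largest. It is not taken from the Bulbasaur sources: the `SeedStats` fields, the reward definition (a "hit" is assumed to mean that fuzzing the seed produced new coverage), and the small built-in random number generator are all assumptions made to keep the example dependency-free.

```rust
/// Tiny xorshift64* generator so the sketch needs no external crates.
pub struct Rng(u64);

impl Rng {
    pub fn new(seed: u64) -> Self {
        Rng(seed.max(1)) // xorshift state must be non-zero
    }

    /// Uniform sample in the open interval (0, 1).
    pub fn uniform(&mut self) -> f64 {
        self.0 ^= self.0 >> 12;
        self.0 ^= self.0 << 25;
        self.0 ^= self.0 >> 27;
        let bits = self.0.wrapping_mul(0x2545_F491_4F6C_DD1D) >> 12; // top 52 bits
        (bits as f64 + 0.5) / (1u64 << 52) as f64
    }
}

/// Gamma(k, 1) for integer k >= 1, as a sum of k Exp(1) draws.
fn gamma_int(rng: &mut Rng, k: u32) -> f64 {
    (0..k).map(|_| -rng.uniform().ln()).sum()
}

/// Beta(a, b) for integer a, b >= 1, built from two Gamma draws.
fn beta_int(rng: &mut Rng, a: u32, b: u32) -> f64 {
    let x = gamma_int(rng, a);
    let y = gamma_int(rng, b);
    x / (x + y)
}

/// Hypothetical per-seed record: rounds that did / did not yield new coverage.
pub struct SeedStats {
    pub hits: u32,
    pub misses: u32,
}

/// Thompson Sampling: draw once from each seed's Beta(hits + 1, misses + 1)
/// posterior and return the index of the seed with the largest draw.
pub fn pick_seed(rng: &mut Rng, seeds: &[SeedStats]) -> Option<usize> {
    let mut best: Option<(usize, f64)> = None;
    for (i, s) in seeds.iter().enumerate() {
        let draw = beta_int(rng, s.hits + 1, s.misses + 1);
        if best.map_or(true, |(_, d)| draw > d) {
            best = Some((i, draw));
        }
    }
    best.map(|(i, _)| i)
}
```

Because the choice is made from random posterior draws rather than from the posterior means, seeds with few observations still get scheduled occasionally, which is the usual reason for preferring this scheme over a greedy pick.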
The central challenge in coverage-guided greybox fuzzing is generating inputs that satisfy branch constraints in order to reach deep code regions. Existing techniques rely on general-purpose mutators that lack the ability to precisely manipulate the relevant input bytes.
Bulbasaur's key innovation is to use a Large Language Model (LLM) to generate customised mutation functions for specific branch constraints. These functions perform localised, reusable modifications on inputs that already reach the target branch, producing mutations that precisely satisfy the constraint.
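To make the idea of a branch-specific mutation function concrete, here is a minimal sketch of what such a function might look like. The target branch, the offset, the magic constant, and the function names are all hypothetical; the actual signature and ABI that Bulbasaur expects from LLM-generated functions are not described here and may differ.

```rust
// Hypothetical target branch in the fuzzed program (illustrative only):
//     if (len >= 8 && read_u32_be(buf + 4) == 0x4D546864) { /* deep code */ }

const MAGIC_OFFSET: usize = 4; // assumed position of the compared field
const MAGIC: u32 = 0x4D54_6864; // assumed constant, ASCII "MThd"

/// Mirror of the hypothetical branch condition, used to check the result.
pub fn branch_taken(buf: &[u8]) -> bool {
    buf.len() >= MAGIC_OFFSET + 4
        && buf[MAGIC_OFFSET..MAGIC_OFFSET + 4] == MAGIC.to_be_bytes()
}

/// Localised mutation: overwrite only the four bytes the branch compares and
/// leave every other byte of the seed untouched, so the mutated input should
/// still follow the same path up to the branch.
pub fn mutate_magic_branch(input: &[u8]) -> Vec<u8> {
    let mut out = input.to_vec();
    let end = MAGIC_OFFSET + 4;
    if out.len() < end {
        out.resize(end, 0); // pad short inputs so the field exists
    }
    out[MAGIC_OFFSET..end].copy_from_slice(&MAGIC.to_be_bytes());
    out
}
```

A generic byte-flipping mutator would need to guess all four bytes at once to pass this comparison, whereas the tailored function satisfies it in a single step and can be reapplied to any seed that reaches the same branch.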
Coverage-Guided Greybox Fuzzing (CGF) hinges on generating inputs that satisfy branch constraints in order to explore deep code regions. Existing CGF techniques propose various approaches to improve the probability of solving such constraints. However, they still rely on general-purpose mutators, which lack the ability to precisely manipulate relevant input bytes to satisfy complex constraints. Our analysis reveals that LLMs can more accurately solve branch constraints by generating tailored mutators. These mutators enable the fuzzer to perform localized and reusable modifications on inputs that reach target branches, producing precise mutations to satisfy target constraints. To achieve this, we propose BULBASAUR, a branch-guided framework for online LLM-based mutator generation. BULBASAUR employs hard frontier-guided branch selection to identify critical branches, continuously collects and organizes relevant static and dynamic context to support high-quality mutator generation, and adopts an efficient and adaptive strategy to apply generated mutators during fuzzing.
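One plausible reading of hard frontier-guided branch selection can be sketched as follows. The sketch assumes that a frontier branch is a conditional for which exactly one outcome has been covered, and that "hard" means the branch has been reached many times without the other outcome ever being observed. The `BranchState` type, its fields, and the `min_reach` threshold are hypothetical and not taken from the paper or the code.

```rust
/// Hypothetical per-branch bookkeeping collected during fuzzing.
pub struct BranchState {
    pub id: u32,
    pub taken_true: bool,  // the true outcome has been observed
    pub taken_false: bool, // the false outcome has been observed
    pub reach_count: u64,  // executions that reached this branch
}

impl BranchState {
    /// A frontier branch has exactly one of its two outcomes covered.
    fn is_frontier(&self) -> bool {
        self.taken_true != self.taken_false
    }
}

/// Pick the frontier branch that has been reached most often without flipping,
/// ignoring branches reached fewer than `min_reach` times.
pub fn select_hard_frontier(branches: &[BranchState], min_reach: u64) -> Option<u32> {
    branches
        .iter()
        .filter(|b| b.is_frontier() && b.reach_count >= min_reach)
        .max_by_key(|b| b.reach_count)
        .map(|b| b.id)
}
```

The threshold keeps the selector from spending LLM queries on branches that generic mutation may simply not have had enough attempts at yet.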
| Document | Contents |
|---|---|
| docs/architecture.md | System architecture: chunked bitmap, LLM mutation thread, seed scheduling, TaintFuzz, instrumentation pipeline |
| docs/build.md | Building Bulbasaur and compiling target programs |
| docs/usage.md | Running Bulbasaur: basic mode and LLM mode, with parameter reference |
| docs/tools.md | Utility tools: branch_analyser, test_mutation_function |
| fuzzer/README.md | Fuzzer internals: core components, mutation strategies, seed scheduling, multi-threading |
| afl_llvm_mode/instrumentation/README.bulbasaur-instrumentation.md | LLVM instrumentation passes: fast/full/trace/debug pass details, ELF section layout |
| llm_scripts/README.md | LLM bridge script detailed reference |
export PATH=/path/to/clang+llvm-13/bin:$PATH
cd afl_llvm_mode && make -j$(nproc) && cd ..
cargo build --release

export CC=/path/to/Bulbasaur/afl_llvm_mode/afl-cc
export CXX=/path/to/Bulbasaur/afl_llvm_mode/afl-c++
./configure --disable-shared
# Basic mode requires 3 variants
make clean && BULBASAUR_INST_MODE=FAST make -j$(nproc) && cp <binary> targets/<program>_fast
make clean && BULBASAUR_INST_MODE=FULL make -j$(nproc) && cp <binary> targets/<program>_full
make clean && BULBASAUR_INST_MODE=TRACE make -j$(nproc) && cp <binary> targets/<program>_trace
# LLM mode additionally requires a debug variant (embeds source location information)
make clean && BULBASAUR_INST_MODE=DEBUG BULBASAUR_BRANCH_LOC_PATH=/path/to/output/ make -j$(nproc) \
&& cp <binary> targets/<program>_debug

Basic mode
target/release/fuzzer \
-i seeds/ -o output/ -j 4 \
-f targets/<program>_full \
-t targets/<program>_trace \
-- targets/<program>_fast @@

LLM mode
source /path/to/Bulbasaur/llm_scripts/.venv/bin/activate
export OPENAI_API_KEY="your-api-key"
export OPENAI_BASE_URL="https://api.openai.com/v1"
export OPENAI_MODEL="gpt-4o"
python3 llm_scripts/bulbasaur_llm_bridge.py \
--fuzzer target/release/fuzzer \
--fast-target targets/<program>_fast \
--full-target targets/<program>_full \
--trace-target targets/<program>_trace \
--debug-target targets/<program>_debug \
--corpus seeds/ \
--output-dir output/ \
--branch-mapping targets/branch_loc.csv \
--source-base-path /path/to/target/src \
--function-loc targets/function_loc.csv \
--callgraph targets/callgraph_final.dot \
--jobs 4 --cpu-id 0 \
--exec-args @@

See docs/build.md and docs/usage.md for full details.
Bulbasaur/
├── afl_llvm_mode/                  # LLVM instrumentation passes and compiler wrapper scripts
│   └── instrumentation/            # Source for the four passes (fast/full/trace/debug)
├── fuzzer/                         # Main fuzzer (Rust)
│   └── src/
│       ├── bin/                    # Entry points: main.rs, branch_analyser.rs, test_mutation_function.rs
│       ├── branches/               # GlobalBranches (chunked bitmap + frontier_branch_map)
│       ├── depot/                  # Seed corpus management, Thompson Sampling scheduling
│       ├── executor/               # Forkserver, three-target execution, ForkMutationExecutor
│       ├── search/                 # Mutation strategies (AFL havoc, TaintFuzz, trim)
│       ├── llm_loop.rs             # LLM mutation function loader thread
│       ├── fuzz_loop.rs            # Main fuzzing loop (per thread)
│       └── stats/                  # Statistics and terminal UI
├── common/                         # Shared Rust library (config, shared memory definitions, trace data structures)
├── llm_scripts/                    # LLM bridge (Python)
│   ├── bulbasaur_llm_bridge.py     # Bridge main program (launches fuzzer + socket server)
│   ├── llm_agent.py                # Multi-turn LLM interaction and code generation
│   ├── file_utils.py               # Branch mapping, ELF parsing, source file lookup
│   ├── compilation.py              # Generated Rust → cargo build → .so
│   └── README.md
├── docs/                           # Detailed documentation
├── Cargo.toml
└── README.md
Bulbasaur's framework is based on Angora (S&P 2018) and incorporates techniques from AFL++.
The source code will be made publicly available after the paper is presented at USENIX Security 2026. If you have any questions in the meantime, feel free to reach out at wangyiyi25@mails.tsinghua.edu.cn.
