A Clang/LLVM plugin for C memory safety, with two paths:
- Sound path: annotate your code with the pg(...) dialect and pagurus proves memory + thread safety, rejecting anything not provably safe. Activated automatically on any file that uses the dialect.
- Confirmed-bug path: on plain, un-annotated C, pagurus reports only high-confidence, confirmed memory bugs, never a heuristic or a false positive. This is the default for un-annotated code.

You pick the path by how much you annotate: nothing → a precise bug-finder on legacy C; full pg(...) annotations → a sound checker. A file is routed to the sound engine the moment it uses any pg(...) annotation (override with -Xclang -plugin-arg-pagurus -Xclang heuristic / … sound).
Only high-confidence, confirmed memory bugs — no lints, no heuristics, no false positives.
| Rule | Name | Description |
|---|---|---|
| E001 | use-after-free | Dereference or call after free() |
| E002 | double-free | free() called twice on the same pointer |
| E004 | return-of-local | return &local, return &s.f, return arr, or return p where p = &local (dangling reference) |
| E005 | null-deref | Dereference without null check after malloc |
| E006 | uninit-use | Variable read before initialisation |
| E011 | array-oob | Constant array index out of declared bounds |
| E022 | invalid-free | free() of a non-heap target: &x, p + n, a local array, or a pointer holding a stack address |
| E023 | use-after-realloc | Using the old pointer after q = realloc(p, …), which may have freed or moved it |
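
To make a few of these rows concrete, here is a minimal sketch of code written to stay clear of the E005 and E023 patterns. The helper names `make_range` and `grow_range` are hypothetical, and the comments only mark where the table's descriptions would apply; the exact diagnostics pagurus prints are not shown here.

```c
#include <assert.h>
#include <stdlib.h>

/* Returns a heap array {0, 1, ..., n-1}, or NULL if allocation fails. */
int *make_range(size_t n) {
    int *p = malloc(n * sizeof *p);
    if (p == NULL)              /* writing p[i] without this check is the E005 pattern */
        return NULL;
    for (size_t i = 0; i < n; i++)
        p[i] = (int)i;
    return p;
}

/* Grows *pp from old_n to new_n elements; returns 0 on success, -1 on failure. */
int grow_range(int **pp, size_t old_n, size_t new_n) {
    assert(pp != NULL && new_n >= old_n);
    int *q = realloc(*pp, new_n * sizeof *q);
    if (q == NULL)
        return -1;              /* on failure *pp is untouched and still valid */
    /* From here on only q is used: reading through the pre-realloc
       pointer value is the E023 pattern. */
    for (size_t i = old_n; i < new_n; i++)
        q[i] = (int)i;
    *pp = q;
    return 0;
}
```

A caller releases the array with a single `free()`; a second `free()` of the same pointer would be the E002 pattern, and reading an element afterwards the E001 pattern.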
| Rule | Name | Analysis method |
|---|---|---|
| IR-E001 | use-after-free | AliasAnalysis::isMustAlias + DominatorTree |
| IR-E001b | use-after-free (GEP) | GEP element of freed object |
| IR-E002 | double-free | Two free() calls that MustAlias |
| IR-E011 | array-oob | ScalarEvolution proves a loop index leaves a constant-size or symbolic (malloc) object's bounds; see IR_SCEV_OOB.md |
| IR-E022 | invalid-free | free() of a non-heap object (getUnderlyingObject → GlobalVariable/Alloca); module pass, runs at -O0 |
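
The IR-E011 row is easiest to picture with a loop over a constant-size array. The sketch below is an in-bounds loop; the comment describes the off-by-one variant that matches the row's description. `table` and `sum_table` are hypothetical names, and whether pagurus flags exactly this shape is an assumption rather than something documented here.

```c
#include <assert.h>
#include <stddef.h>

enum { TABLE_LEN = 8 };

static const int table[TABLE_LEN] = {1, 2, 3, 4, 5, 6, 7, 8};

/* Compile-time check that the array really has TABLE_LEN elements. */
static_assert(sizeof table / sizeof table[0] == TABLE_LEN,
              "table size must match TABLE_LEN");

/* Sums every element of table. */
int sum_table(void) {
    int total = 0;
    /* With `i <= TABLE_LEN` as the loop bound, the final iteration would
       read table[8], one element past the end of the 8-element array.
       The trip count is a compile-time constant, which is the kind of
       fact a loop-index range analysis can reason about. */
    for (size_t i = 0; i < TABLE_LEN; i++)
        total += table[i];
    return total;
}
```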
The IR-level checks route through Clang's DiagnosticsEngine (via the LLVMContext diagnostic handler), so they appear as real error: diagnostics with a source caret, are visible to IDEs, and are tested with -verify exactly like the AST checks (tests/run_ir_tests.sh, the pagurus_ir CTest). They need codegen + SSA, so the function-pass witnesses run at -O1; IR-E022 is a module pass and runs even at -O0.

The leak, lifetime, borrow-conflict, drop, lint (strcpy/format-string), and data-race checks the heuristic engine used to emit are not in this path. Those that are inherently heuristic (and misfire on idiomatic C) were retired; the rest moved to the sound engine.

The confirmed-bug path above is an unsound bug-finder (high precision, accepts false negatives). The sound engine is the opposite: over pg(...)-annotated code it rejects all in-scope UB — what is not provably safe is an error. It is a separate analysis (src/pagurus_sound.cpp); sound-dump prints its ownership IR.
The dialect adds capability/bounds/lock annotations through a single pg(...) macro — pg(owned), pg(mut, a), pg(ref, a), pg(count, n), pg(guarded_by, m), pg(requires, m), pg(send)/pg(sync), pg(drop, free_fn) — that expands to __attribute__((annotate("pagurus::…"))). It is real C, ignored by plain compilers, and (since only pg is a macro) hijacks no common identifiers. Example: pg(owned) char *dup(pg(ref, a) const char *src pg(count, n), size_t n). See DIALECT.md for the spec and SOUND_REDESIGN.md for the engine and the staged build.
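
As a rough illustration of the example signature, the sketch below gives it a body. It makes two assumptions that are labeled in the code: the `pg` macro is replaced by an empty stand-in (the real dialect header is not shown here), and the function is renamed `dup_bytes` so it does not clash with POSIX `dup()`. The reading of each annotation in the comments follows the description above; DIALECT.md is the authority.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Assumption: stand-in for the real dialect header. Expanding pg(...) to
   nothing mimics how a plain compiler ignores the annotations. */
#ifndef pg
#define pg(...)
#endif

/* Copies n bytes from src into a new NUL-terminated heap buffer.
   pg(owned) on the return: the caller receives ownership.
   pg(ref, a) / pg(count, n) on src: a shared borrow of n readable bytes. */
pg(owned) char *dup_bytes(pg(ref, a) const char *src pg(count, n), size_t n) {
    assert(src != NULL);
    char *out = malloc(n + 1);
    if (out == NULL)
        return NULL;
    memcpy(out, src, n);
    out[n] = '\0';
    return out;  /* ownership passes to the caller, who must free() it */
}
```

Because the annotations vanish under the stand-in macro, this compiles as ordinary C; with the actual dialect header in place, the same source would be routed to the sound engine.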
| Rule | Name | Domain |
|---|---|---|
| E500–E505 | use-after-free / double-free / invalid-free / use-after-move / uninit-use / leak (incl. pg(drop) types) | temporal |
| E520–E525 | alias-violation / mutate-through-shared / use-while-borrowed / dangling-borrow / lifetime-mismatch / missing-capability | borrow & lifetime (mut-XOR-ref) |
| E540–E543 | out-of-bounds / null-deref / bad-bounds / ptr-past-end | spatial (static-only) |
| E560–E564 | unguarded-access / missing-capability-at-call / non-send / non-sync / lock-order | concurrency (lockset + threads) |

    clang-18 -fsyntax-only -I std -fplugin=./build/pagurus_plugin.so \
      -Xclang -plugin-arg-pagurus -Xclang sound my_dialect_code.c

The sound engine uses a CFG ownership IR, a monotone fixpoint state-dataflow (sound "unsafe-if-any-predecessor" join), location-sensitive borrow loans, a static bounds check, and a flow-sensitive lockset. Its oracle suite lives in tests/sound/.

Routing: a translation unit that uses the dialect — any pg(...) annotation — is checked by the sound engine automatically, no flag needed; un-annotated C goes to the confirmed-bug engine (so existing code is unaffected). sound forces the sound engine; heuristic forces the confirmed-bug engine even on annotated code.

    # Ubuntu 24.04
    sudo apt install clang-18 llvm-18-dev libclang-18-dev cmake
    mkdir build && cd build
    cmake .. \
      -DLLVM_DIR=$(llvm-config-18 --cmakedir) \
      -DClang_DIR=/usr/lib/llvm-18/lib/cmake/clang
    make -j$(nproc)

See BUILDING.md for detailed build instructions and runtime dependencies.


    # AST checks only (E001–E023):
    clang -fplugin=./build/pagurus_plugin.so -c your_file.c


    # AST + LLVM IR analysis (adds GEP/bitcast/loop-carried checks + drop injection).
    # The function-pass IR checks need -O1; at -O0 only the IR-E022 module pass runs.
    clang -fplugin=./build/pagurus_plugin.so \
      -fpass-plugin=./build/pagurus_plugin.so \
      -g -O1 -c your_file.c

Note: #pragma pagurus is retired. Annotations are written with the pg(...) dialect macro (sound path); plain C needs no annotations (confirmed-bug path). See DIALECT.md.

Run pagurus across an entire codebase using the included tools:

    # Check all files under src/ with 4 parallel jobs
    ./pagurus-check --plugin=./build/pagurus_plugin.so \
      --cflags="-Iinclude" \
      --jobs=4 --dir=src

    # Or use a compilation database (Bear 3.x syntax; Bear 2.x used `bear make`)
    bear -- make
    ./pagurus-check --plugin=./build/pagurus_plugin.so \
      --compile-db=compile_commands.json

Integrate into Makefiles:


    # myproject/Makefile
    PAGURUS_PLUGIN = /path/to/build/pagurus_plugin.so
    include /path/to/pagurus.mk

    # Then run:
    #   make pagurus-check

See INTEGRATION.md for the complete integration guide.

- Non-lexical lifetimes (NLL): Precise borrow tracking with loan release at last use
- Control flow analysis: Conditional and loop-aware borrow propagation
- Inter-procedural: Function summaries for return-alias and parameter effects
- Move semantics: Rust-style ownership transfer for drop-annotated types
- Drop injection: Automatic RAII-style cleanup at IR level with -fpass-plugin=
- Source transformation: Produces plain C code without pagurus annotations
- Two-tier analysis: AST for precision + IR for patterns invisible at source level

- BUILDING.md: Build instructions and runtime dependencies
- ANNOTATIONS.md: Complete #pragma pagurus reference (retired syntax; see DIALECT.md for the current dialect)
- INTEGRATION.md: Multi-file project integration with pagurus-check and pagurus.mk
- ARCHITECTURE.md: Technical architecture and implementation details

MIT