Verifiable AI-build governance you run on yourself. Four governance modules — an autonomy gate, build discipline, model fitness, and agent-diff code review — that prove themselves with one offline command, where every claim prints its own pass count. Not advice. Not a PDF. A runnable artifact whose trust mechanism is a reproducible exit code on your machine.
git clone https://github.com/AkuchiS/AgentProof.git
cd AgentProof
python3 verify.py # or: make verify

Running `python3 verify.py` on a clean machine (no install, no network):
================================================================
AgentProof -- verify.py
Running the REAL selftests on THIS machine. No network. Stdlib only.
python: 3.10.12
================================================================
[PASS] autonomy gate autonomy_gate.py -> 19/19 (exit=0)
[PASS] build discipline build_discipline.py -> 22/22 (exit=0)
[PASS] model fitness model_fitness.py -> all passed (exit=0)
[PASS] aegis codereview aegis_codereview.py -> 23/23 (exit=0)
================================================================
RESULT: PASS (every governance module passed its own selftest)
================================================================
The pass counts above are printed by the modules themselves, not asserted by this README. If your numbers differ, trust your machine, not this file. Your exit code is the product.
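To make the aggregation step concrete, here is a minimal sketch of how a verifier of this kind could run a selftest as a subprocess and read the pass count the module itself prints. It is not the code in `verify.py`: the names `PASS_COUNT`, `parse_pass_count`, `run_selftest`, and `verify`, and the `N/M` output pattern, are assumptions made for illustration.

```python
import re
import subprocess
import sys

# Assumed output shape: a selftest prints a count such as "19/19" somewhere.
PASS_COUNT = re.compile(r"(\d+)\s*/\s*(\d+)")


def parse_pass_count(output):
    """Return the last (passed, total) pair found in the text, or None."""
    matches = PASS_COUNT.findall(output)
    if not matches:
        return None
    passed, total = matches[-1]
    return int(passed), int(total)


def run_selftest(path):
    """Run one module as a subprocess and return (ok, summary)."""
    proc = subprocess.run(
        [sys.executable, path], capture_output=True, text=True
    )
    count = parse_pass_count(proc.stdout)
    if count is None:
        # No count printed: fall back to the exit code alone.
        ok = proc.returncode == 0
        return ok, "all passed" if ok else "failed"
    passed, total = count
    return proc.returncode == 0 and passed == total, f"{passed}/{total}"


def verify(paths):
    """Return 0 only if every selftest passed, else 1."""
    results = [run_selftest(p) for p in paths]
    return 0 if all(ok for ok, _ in results) else 1
```

The point of the design is that the number comes from the subprocess output and the exit code, so the aggregator has nothing of its own to assert.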
Four real governance modules — the same governance files we run internally — plus a single verifier.
| File | What it does | Its own selftest |
|---|---|---|
| `modules/autonomy_gate.py` | Scores an action AUTO / ESCALATE / DEFER on 4 axes (reversibility, cost-of-failure, process-maturity, requirement-stability) + a hard allowlist (email/DM/payment/secrets/legal are always ESCALATE) + a goal-vs-task win-condition verifier that blocks "done at 80%". | 19/19 |
| `modules/build_discipline.py` | The laziness ladder: steers agent code generation toward skip > stdlib > native feature > one-liner > minimal code, and flags any diff over a line budget that lacks a named `# CEILING:` justification. | 22/22 |
| `modules/model_fitness.py` | Scores a candidate model against 5 fixed in-house tasks on 6 normalized columns, plus a sovereignty gate (license / open-weights / on-prem-feasible / ≥2 independent benchmark boards). Promotes only on fitness ≥ 60 AND sovereignty pass. | all passed |
| `modules/aegis_codereview.py` | Reviews agent-generated diffs for (a) newly-added dependencies and (b) silent bugs a type-checker misses, e.g. INSERT/UPDATE/DELETE `.execute()` with no `.commit()`, build-then-return `None`, wrong-arity helper reuse, view-cleared-but-backing-store-not. | 23/23 |
| `verify.py` | Runs all four selftests as subprocesses, parses the real pass counts, prints a combined PASS/FAIL and exit code. This is the trust mechanism. | n/a |
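As an illustration of the autonomy gate's shape (a hard allowlist checked first, then a score across four axes), here is a minimal sketch. It is not the code in `modules/autonomy_gate.py`: the `Action` type, the 0 to 2 scale per axis, the threshold of 7, and the rule that unstable requirements yield DEFER are all assumptions chosen for the example.

```python
from dataclasses import dataclass

# Categories the description says are always escalated, whatever the score.
ALWAYS_ESCALATE = {"email", "dm", "payment", "secrets", "legal"}


@dataclass
class Action:
    category: str
    # Each axis is scored 0 (worst) to 2 (best); the scale is an assumption.
    reversibility: int
    cost_of_failure: int
    process_maturity: int
    requirement_stability: int


def gate(action):
    """Classify an action as AUTO, ESCALATE, or DEFER."""
    # Hard allowlist first: no score can override it.
    if action.category.lower() in ALWAYS_ESCALATE:
        return "ESCALATE"
    # Assumed rule: requirements still moving means wait, not act.
    if action.requirement_stability == 0:
        return "DEFER"
    total = (
        action.reversibility
        + action.cost_of_failure
        + action.process_maturity
        + action.requirement_stability
    )
    # Assumed threshold: near-perfect scores run unattended.
    return "AUTO" if total >= 7 else "ESCALATE"
```

Checking the allowlist before scoring means a payment action with perfect scores still returns ESCALATE, which is the property the hard allowlist exists to guarantee.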
There is nothing to install.

- Requirement: Python 3 (developed on 3.10; uses only the standard library: `re`, `json`, `sys`, `subprocess`, `os`). No pip, no venv, no internet.
- Run everything: `python3 verify.py`
- Run via make (if you have it): `make verify`
- Run the raw, unaggregated selftests: `make selftests`, or directly `python3 modules/autonomy_gate.py` (each module is its own runnable selftest).
Note on `make`: `make verify` and `make selftests` are convenience wrappers. If `make` is not on your machine, `python3 verify.py` is the universal entrypoint and does exactly the same thing. The Makefile recipe body is `python3 verify.py`.
- These are governance modules, not a turnkey platform. They classify, score, and flag. Wiring them into your pipeline (calling the gate before an action, running the codereview on your diffs) is your integration work. See `WIRING.md` for exactly how much of this is wired into our own live systems today; the honest answer is 3 of 4 (all advisory), 1 staged.
- `aegis_codereview` ships with a stubbed CVE check (`check_cve` returns `unknown` offline by design) so the core stays network-free. Dependency detection is real; CVE lookup is left to you / AEGIS Scan.
- No revenue, results, or P&L claims appear anywhere in this product. The only numbers we publish are integers a selftest on your own machine prints.
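To show what network-free dependency detection on a diff can look like, here is a minimal sketch that scans the added lines of a unified diff for imports outside a known set. It is not the logic in `modules/aegis_codereview.py`: the function `new_dependencies`, the `ADDED_IMPORT` pattern, and the caller-supplied `known` set are assumptions for illustration, and the sketch only handles Python-style `import` and `from ... import` lines.

```python
import re

# Matches an added line ("+" prefix) that imports a top-level module.
ADDED_IMPORT = re.compile(r"^\+\s*(?:import|from)\s+([A-Za-z_]\w*)")


def new_dependencies(diff_text, known):
    """Return sorted top-level modules imported on added lines and not in `known`."""
    found = set()
    for line in diff_text.splitlines():
        # Skip the "+++ b/file" header, which is not an added line.
        if line.startswith("+++"):
            continue
        match = ADDED_IMPORT.match(line)
        if match and match.group(1) not in known:
            found.add(match.group(1))
    return sorted(found)
```

A short usage example under the same assumptions:

```python
diff = "\n".join([
    "+++ b/app.py",
    "+import requests",
    "+import os",
    "-import yaml",
    " import json",
])
print(new_dependencies(diff, known={"os", "json"}))  # ['requests']
```

Because the check is pure text matching against a local set, it needs no network access; any CVE lookup on the flagged names would be a separate step.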
AgentProof is one of AkuchiS's open trust artifacts — tools whose credibility comes from running, not asserting. Its sibling AEGIS Guard is our open input/output safety guard for AI agents, public and runnable: https://github.com/AkuchiS/aegis-guard. Clone it, run its selftest, see the numbers yourself. The same "a tool that runs beats prose that asserts" rule that governs this repo governs that one. See the rest of the toolkit at github.akuchis.com.
- `README.md`: this file
- `verify.py`: the one-command verifier (the trust mechanism)
- `Makefile`: `make verify` / `make selftests` wrappers
- `modules/`: the 4 governance modules, copied in verbatim with selftests intact
- `RECEIPTS.md`: every claim mapped to the exact command + integer that proves it
- `METHODOLOGY.md`: the kill-gate rubric + written method, as docs for the code
- `WIRING.md`: the honesty ledger: which modules are wired live vs runnable-but-unwired
An AkuchiS tool · MIT · part of the Proof family → github.akuchis.com · Sponsor the work