Proof, not guesses, for agent-caused regressions.
RegressProof is a standalone CLI and GitHub Action utility for detecting measurable AI coding regressions. It compares a baseline against a changed state, runs verification commands, maps failures to diffs, and produces evidence-focused reports instead of intuition-only blame.
If you are new to the repository, use this order:
- npm install && npm run build
- npm run verify
- inspect /tmp/regressproof-mvp-*/regressproof-mvp-summary.json
- inspect examples/external-runs.json and docs/REGRESSPROOF_CASE_STUDIES.md
That path shows the current self-proof surface first, then the pinned outside-repository evidence.
Current verified surface:
- 10 curated external validation runs across 9 public repositories
- 11/11 tracked fixtures passing
- standalone committed trust scenario: successful_change / high
- standalone deep trust scenario: successful_change / high
- external corpus includes TypeScript and Python repositories
- first-class conservative classifications for preexisting and environment failures
Proof artifacts:
- external run catalog: examples/external-runs.json
- case studies: docs/REGRESSPROOF_CASE_STUDIES.md
- proof ledger: docs/REGRESSPROOF_PROOF_LEDGER.md
- validation plan: docs/REGRESSPROOF_VALIDATION_PLAN.md
Repository planning and positioning:
- positioning: docs/REGRESSPROOF_POSITIONING.md
- product brief: docs/REGRESSPROOF_PRODUCT_BRIEF.md
- standalone plan: docs/REGRESSPROOF_STANDALONE_PLAN.md
For an external reviewer, the strongest compact demo today is:
- run npm run verify
- open the generated MVP summary in /tmp/regressproof-mvp-*/regressproof-mvp-summary.json
- confirm that fixtures, trust scenario, and deep trust scenario all complete
- open examples/external-runs.json
- inspect one pinned public-repository run and its changed-file evidence
One current reviewer-facing example:
- catalog record: examples/external-runs.json -> sindresorhus-is-type-guards-2026-05-05
- repository: sindresorhus/is
- committed range: 13febb6b01e24863ced3847a7ee112a48c154e0e~1..13febb6b01e24863ced3847a7ee112a48c154e0e
- verdict: successful_change / high
- changed files: source/index.ts, source/types.ts, test/type-tests.ts
- artifact path recorded in catalog: /tmp/regressproof-sindresorhus-is-type-guards-artifacts/regressproof-report.json
This is the proof surface to optimize for: a reviewer can see the diff range, the touched files, the verdict, and where the full artifact lives.
RegressProof is most useful when the output answers review questions quickly:
| Reviewer question | RegressProof surface |
|---|---|
| Did a new failure appear, or was it already there? | introducedFailures vs preexistingFailures |
| Does the failure map back to the diff? | changed files, matchedChangedFiles, changed-file match fields |
| Is the tool making a strong claim or a cautious one? | verdict plus confidence |
| What should CI do? | configurable CI exit policy driven by verdict class |
| Where is the raw evidence? | JSON report, Markdown summary, PR summary, PR comment body, JSONL ledger |
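As a rough illustration of how the table maps onto a report, the sketch below turns a report object into one-line answers. The names introducedFailures, preexistingFailures, matchedChangedFiles, verdict, and confidence come from the table above. Treating them as top-level arrays and strings in regressproof-report.json is an assumption made for illustration and has not been checked against the actual report schema; summarizeReport is a hypothetical helper.

```javascript
// Hypothetical helper: the report layout (top-level fields, array-valued
// failure lists) is an assumption for illustration only.
function summarizeReport(report) {
  const introduced = report.introducedFailures ?? [];
  const preexisting = report.preexistingFailures ?? [];
  const matched = report.matchedChangedFiles ?? [];
  return [
    `new failures: ${introduced.length}, already failing: ${preexisting.length}`,
    `failures matched to changed files: ${matched.length > 0 ? matched.join(", ") : "none"}`,
    `claim: ${report.verdict} / ${report.confidence}`,
  ];
}

// Example input shaped like the assumed report.
const sampleReport = {
  verdict: "successful_change",
  confidence: "high",
  introducedFailures: [],
  preexistingFailures: [],
  matchedChangedFiles: [],
};
console.log(summarizeReport(sampleReport).join("\n"));
```

If the real report nests these fields differently, only the three property lookups at the top of the function would need to change.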
Normal CI is necessary, but it usually answers only:
- did a check fail?
- is the branch red or green?
RegressProof is trying to answer the next layer:
- was the failure newly introduced relative to a baseline?
- does the failure point back to files changed in the patch?
- is this better classified as confirmed_agent_fault, preexisting_failure, environment_failure, or insufficient_evidence?
So the value is not "tests, but again." The value is baseline-aware interpretation, diff-aware evidence, and conservative fault classification around the tests you already trust.
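The baseline-aware part can be sketched as a small decision function. This is a simplified illustration, not RegressProof's actual classifier: the input shape (baselineFailed, currentFailed, environmentError, matchedChangedFiles as used here) and the function name are hypothetical, and only the verdict names come from this README.

```javascript
// Simplified, hypothetical classifier for a single check.
// Blame is assigned only when the failure is new relative to the baseline
// AND it maps back to at least one file changed in the patch.
function classifyCheck({ baselineFailed, currentFailed, environmentError, matchedChangedFiles = [] }) {
  if (environmentError) return "environment_failure"; // the check could not run reliably
  if (!currentFailed) return "successful_change";     // nothing is failing now
  if (baselineFailed) return "preexisting_failure";   // it was already red before the patch
  return matchedChangedFiles.length > 0
    ? "confirmed_agent_fault"   // new failure that points at the diff
    : "insufficient_evidence";  // new failure with no link to the diff
}

// A failure that appears only after the change and touches a changed file.
console.log(
  classifyCheck({
    baselineFailed: false,
    currentFailed: true,
    environmentError: false,
    matchedChangedFiles: ["src/parser.js"],
  })
);
```

The ordering of the checks is the design point: every cheaper explanation (environment, already broken) is ruled out before the strongest claim is made.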
RegressProof is proprietary source-available software unless a separate written agreement says otherwise.
See LICENSE and NOTICE.md.
Current MVP capabilities:
- local CLI execution
- GitHub Action execution
- baseline vs current quick-check verification
- diff-aware changed-file mapping
- conservative verdict classes
- JSON and Markdown artifacts
- append-only internal ledger output
- tracked fixture packs for reproducible validation
- committed real-repo trust scenarios
- curated public-repository validation catalog
Preferred top-level commands:
npm run verify
npm run trust
npm run trust:deep
npm run readiness
npm run standalone:export
These aliases keep the public workflow easier to remember while preserving the lower-level scripts used by the validation harness.
RegressProof is now beyond fixture-only proof.
What is already confirmed:
- full verify-mvp passes end-to-end
- fixture suite passes 11/11
- committed trust scenario passes on the standalone repository
- committed deep trust scenario passes on the standalone repository
- external public-repository validation has been exercised on:
  - docs/plugin repositories
  - larger docs/configuration repositories
  - code-plus-test repositories
  - Python code-plus-test repositories
  - larger code repositories with provider-oriented tests
Most recent external run:
- repository: sindresorhus/is
- pinned range: 13febb6b01e24863ced3847a7ee112a48c154e0e~1..13febb6b01e24863ced3847a7ee112a48c154e0e
- repo-specific result: successful_change / high on a type-guard narrowing slice
- changed-file evidence includes source/index.ts, source/types.ts, test/test.ts, and test/type-tests.ts
Further reading:
- docs/REGRESSPROOF_CASE_STUDIES.md
- examples/README.md
Current verdict classes:
- successful_change
- confirmed_agent_fault
- preexisting_failure
- environment_failure
- insufficient_evidence
Install dependencies and build:
npm install
npm run build
Run the main MVP verification flow:
npm run verify:mvp
If the default system temp directory is low on space, use an explicit output directory:
npm run verify:mvp -- --out-dir /private/tmp/regressproof-verify
This runs:
- the tracked fixture suite
- the committed trust scenario
- the deeper committed trust scenario
The final summary is written to:
/tmp/regressproof-mvp-*/regressproof-mvp-summary.json
The most honest first run is a narrow, stable verification slice on a real committed diff.
Recommended first pass:
- start from this repository's regressproof.config.json shape
- keep the first check set small and meaningful
- use checks that already pass reliably on both HEAD~1 and HEAD
- inspect the artifacts before widening scope
Build RegressProof once:
npm install
npm run build
Run a local report against your repository:
node dist/cli.js run \
--repo /path/to/your-repo \
--config /path/to/your-repo/regressproof.config.json \
--artifact-dir /tmp/regressproof-your-repo \
--format json
Check whether the current committed range is ready for diff-aware validation:
npm run real:readiness -- --repo /path/to/your-repo
Run the committed validation flow on your repository:
node scripts/run-committed-real-repo-validation.js \
--repo /path/to/your-repo \
--config /path/to/your-repo/regressproof.config.json \
--head-ref HEAD \
--artifact-dir /tmp/regressproof-your-repo-committed
If you want a public-repository-style temporary clone flow instead, adapt one of the configs in examples/ and use:
npm run real:public -- \
--url https://github.com/owner/repo.git \
--config ./examples/external-doc-plugin.config.json \
--head-ref HEAD \
--artifact-dir /tmp/regressproof-public-demo
Good first targets are:
- one fast build command
- one targeted test command
- changed files that stay inside that verification boundary
RegressProof can also run as a composite GitHub Action from this repository.
Minimal PR workflow:
name: RegressProof
on:
  pull_request:
jobs:
  regressproof:
    runs-on: ubuntu-latest
    permissions:
      contents: read
    steps:
      - uses: actions/checkout@v4
        with:
          fetch-depth: 0
      - uses: tc7kxsszs5-cloud/RegressProof-cli@main
        with:
          repo: ${{ github.workspace }}
          config: regressproof.config.json
          head-ref: HEAD
          artifact-dir: regressproof-artifacts
          ci: "true"
      - uses: actions/upload-artifact@v4
        if: always()
        with:
          name: regressproof-artifacts
          path: regressproof-artifacts/
The action emits these outputs for later workflow steps:
- verdict
- confidence
- changed_file_match
- ci_should_fail
- report_json
- report_markdown
- report_pr_markdown
- report_pr_comment
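A later workflow step, or a local wrapper script, could turn the verdict output into an exit code along these lines. This is only a sketch: the failOn list, its default value, and the function name are hypothetical. The README states only that the CI exit policy is configurable and driven by verdict class.

```javascript
// Hypothetical exit policy: fail the job only for verdict classes that the
// caller has opted into. The default list here is an assumption.
function exitCodeFor(verdict, failOn = ["confirmed_agent_fault"]) {
  return failOn.includes(verdict) ? 1 : 0;
}

// A cautious verdict passes under the assumed default policy;
// a stricter policy can opt in to failing on it.
console.log(exitCodeFor("insufficient_evidence"));
console.log(exitCodeFor("insufficient_evidence", ["confirmed_agent_fault", "insufficient_evidence"]));
```

Keeping the policy as data rather than hard-coded branches means a team can tighten or relax CI behavior without touching the classification itself.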
Run the fixture suite:
npm run fixtures:run-all
Run the committed trust scenario against the current repository:
npm run real:scenario
Run the deeper committed trust scenario:
npm run real:scenario:deep
Run committed validation directly:
npm run real:committed -- --repo .
List curated public-repository validation evidence:
npm run external:runs
Validate the external-run catalog schema:
npm run external:check
Plan the next external corpus pass:
npm run external:run-corpus
The corpus runner intentionally treats queue entries as candidates until they are promoted with pinned execution metadata.
Use --execute --id <candidate-id> only after adding a repository-specific config, pinned headRef, and artifact directory.
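The promotion rule above can be pictured as a small pre-flight check on a candidate entry. This sketch is not the corpus runner's code: the entry shape and the config and artifactDir field names are hypothetical (only headRef appears in this README), and reading "pinned" as "a full 40-character commit SHA" is an assumption based on the pinned examples elsewhere in this document.

```javascript
// Assumption: a pinned headRef is a full 40-character hexadecimal commit SHA.
const FULL_SHA = /^[0-9a-f]{40}$/;

// Hypothetical check: returns the names of the fields still missing before a
// candidate could be promoted and executed.
function missingPromotionFields(candidate) {
  const missing = [];
  if (!candidate.config) missing.push("config");
  if (!FULL_SHA.test(candidate.headRef ?? "")) missing.push("headRef");
  if (!candidate.artifactDir) missing.push("artifactDir");
  return missing;
}

// A moving ref such as HEAD is not pinned, so this candidate is not ready.
console.log(
  missingPromotionFields({
    config: "./examples/external-doc-plugin.config.json",
    headRef: "HEAD",
    artifactDir: "/tmp/regressproof-public-demo",
  })
);
```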
Check whether a committed range is ready for trust validation:
npm run real:readiness -- --repo .
Build the distributable dist/ output:
npm run build
Run the CLI directly from source:
node src/cli.js run --repo ./fixtures/simple-js --format json
The standalone layout is:
- src/
- scripts/
- fixtures/
- docs/REGRESSPROOF_*.md
- regressproof.config.json
- regressproof.real-repo.config.json
The standalone real-repo config assumes verification is executed from the repository root.
Fixtures use tracked scenario packs:
- tracked/baseline
- tracked/current
- fixture.materializer.json
Preferred flow:
- materialize a fixture into a temporary git repository
- run RegressProof against that temporary repo
Example:
node scripts/materialize-fixture.js \
--fixture ./fixtures/lint-js \
--out-dir /tmp/regressproof-materialized-lint
node src/cli.js run \
--repo /tmp/regressproof-materialized-lint/repo \
--config /tmp/regressproof-materialized-lint/repo/regressproof.config.json \
--format json
Helpers:
npm run fixture:materialize -- --fixture ./fixtures/lint-js
npm run fixture:export-pack -- --fixture ./fixtures/lint-js
npm run fixtures:export-all-packs
The committed trust flow is meant to prove that RegressProof can validate itself or another repository through a real HEAD~1..HEAD range instead of only through isolated fixtures.
What the trust flow asserts:
- committed readiness is ready
- diffRange is HEAD~1..HEAD
- baseline mode is path_snapshot
- current mode is snapshot
- verdict is successful_change
- confidence is high
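The assertions above can be expressed as a table of expected values compared against a result object. The expected values are taken from the list; the result field names readiness, baselineMode, and currentMode are hypothetical stand-ins (diffRange, verdict, and confidence follow the names used in this README), and the helper is a sketch rather than the trust harness itself.

```javascript
// Expected values from the trust-flow list; field names are partly assumed.
const EXPECTED_TRUST = {
  readiness: "ready",
  diffRange: "HEAD~1..HEAD",
  baselineMode: "path_snapshot",
  currentMode: "snapshot",
  verdict: "successful_change",
  confidence: "high",
};

// Returns one message per field that differs from the expectation.
function trustMismatches(result) {
  return Object.entries(EXPECTED_TRUST)
    .filter(([key, expected]) => result[key] !== expected)
    .map(([key, expected]) => `${key}: expected ${expected}, got ${result[key]}`);
}

// A run that downgrades the verdict produces exactly one mismatch message.
console.log(trustMismatches({ ...EXPECTED_TRUST, verdict: "insufficient_evidence" }));
```

Reporting every mismatch at once, rather than stopping at the first, makes a failed trust run easier to diagnose from a single log.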
The deep trust flow uses a broader nested subset:
- lint-js
- preexisting-js
- parser-js
- python-js
RegressProof can now be demonstrated on public repositories outside its own codebase.
That does not mean every repository can be judged with one universal config. For code-heavy repositories, the most honest path is:
- choose a committed range such as HEAD~1..HEAD
- define a repository-appropriate build/test slice
- classify the result conservatively
This is already enough to show that RegressProof works as a real validation layer, not only as an internal demo.
Each run can emit:
- regressproof-report.json
- regressproof-summary.md
- regressproof-pr-summary.md
- regressproof-pr-comment.md
- append-only ledger JSONL
Use --artifact-dir to control where they are written.
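For readers unfamiliar with the JSONL ledger format, the sketch below shows the general append-only pattern: one JSON object per line, added with an append call and never rewritten. The file name ledger.jsonl and the entry fields are hypothetical; RegressProof's real ledger name and schema are not specified in this README.

```javascript
const fs = require("node:fs");
const os = require("node:os");
const path = require("node:path");

// Append one entry as a single JSON line; existing lines are never modified.
function appendLedgerEntry(ledgerPath, entry) {
  fs.appendFileSync(ledgerPath, JSON.stringify(entry) + "\n", "utf8");
}

// Read the ledger back as an array of objects, skipping blank lines.
function readLedger(ledgerPath) {
  return fs
    .readFileSync(ledgerPath, "utf8")
    .split("\n")
    .filter((line) => line.trim() !== "")
    .map((line) => JSON.parse(line));
}

// Demo in a throwaway temp directory (hypothetical file name and fields).
const demoDir = fs.mkdtempSync(path.join(os.tmpdir(), "ledger-sketch-"));
const demoLedger = path.join(demoDir, "ledger.jsonl");
appendLedgerEntry(demoLedger, { verdict: "successful_change", confidence: "high" });
appendLedgerEntry(demoLedger, { verdict: "preexisting_failure", confidence: "high" });
console.log(readLedger(demoLedger).length);
```

Because each run only appends, earlier entries stay intact as an audit trail, and the file can be processed line by line without loading a single large JSON document.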
Example:
node src/cli.js run \
--repo /tmp/regressproof-materialized-lint/repo \
--config /tmp/regressproof-materialized-lint/repo/regressproof.config.json \
--format json \
--artifact-dir /tmp/regressproof-artifacts
In CI mode, RegressProof exits non-zero only for configured verdict classes:
node src/cli.js run \
--repo /tmp/regressproof-materialized-lint/repo \
--config /tmp/regressproof-materialized-lint/repo/regressproof.config.json \
--format json \
--artifact-dir /tmp/regressproof-artifacts \
--ci
Example configs for validating external repositories live at:
- examples/external-doc-plugin.config.json
- examples/external-click-flag-value.config.json
- examples/external-ky-hooks.config.json
- examples/external-nanostores-global-epoch.config.json
- examples/external-ofetch-timeout-signal.config.json
- examples/external-openclaw-code.config.json
- examples/external-pluggy-pluginmanager.config.json
- examples/external-scqos-python.config.json
- examples/external-oh-my-codex-stable-slice.config.json
- examples/README.md
Example:
node scripts/run-committed-real-repo-validation.js \
--repo /tmp/andrej-karpathy-skills \
--config ./examples/external-doc-plugin.config.json \
--head-ref HEAD \
--artifact-dir /tmp/regressproof-external-doc-plugin
For public GitHub repositories that should be cloned into a temporary workspace first, use:
npm run real:public -- \
--url https://github.com/openclaw/openclaw.git \
--config ./examples/external-openclaw-code.config.json \
--head-ref 97534372f858b5f67a98619a3fed8790edb00cc7 \
--artifact-dir /tmp/regressproof-openclaw-pinned-artifacts
For compact self-checking Python repositories, the SCQOS example is useful as an exploratory config and synthetic regression target:
npm run real:public -- \
--url https://github.com/KnowledgeeKZA3224/scqos-reference-implementation.git \
--config ./examples/external-scqos-python.config.json \
--head-ref 4a384ad08139c4311aaefc84bfc6d05f0ae1fa41 \
--artifact-dir /tmp/regressproof-scqos-artifacts
Treat that SCQOS target conservatively: the upstream repository currently exposes a single public commit, so it does not count as completed corpus proof until a real pinned baseline/head range exists.
This repository currently lives inside a larger workspace, so there is also a workspace-oriented config:
regressproof.workspace-repo.config.json
That config exists to keep local development inside the larger workspace working. The default product direction is still standalone-first.
Read these first when resuming work:
- docs/REGRESSPROOF_INDEX.md
- docs/REGRESSPROOF_PRODUCT_BRIEF.md
- docs/REGRESSPROOF_SPEC.md
- docs/REGRESSPROOF_IMPLEMENTATION_PLAN.md
- docs/REGRESSPROOF_MVP_TASK_BREAKDOWN.md
- docs/REGRESSPROOF_VALIDATION_PLAN.md
- docs/REGRESSPROOF_DECISION_LOG.md