A maintainability ratchet for AI-assisted Python. The bar can only move down.
AI coding agents are very good at writing code that compiles, runs, and passes the tests they ship with it. They are less good at:
- writing meaningful tests for the new code,
- noticing a 30-line function quietly became 130 lines,
- catching that the public API now exposes a function with no callers in tests,
- realising a small refactor turned an
ifladder into a 14-way cyclomatic monster.
A traditional review catches some of this. A ratchet catches all of it, mechanically, every time. riskratchet computes a per-function risk score from coverage gaps, cyclomatic complexity, churn, public surface, and sprawl, then fails CI or blocks the commit whenever risk grows past a baseline. Nobody has to play complexity cop.
The review workflow is inspired by
cargo-crap (which made the CRAP
metric practical in CI with baselines, PR comments, and JSON output) and
Cursor's thermo-nuclear-code-quality-review
agent prompt (which emphasises maintainability, structure, sprawl, and
explicit boundaries). riskratchet is neither a Python port of cargo-crap nor an
agent prompt: it reports CRAP and adds Python-specific signals on top (branch
gaps, churn, public surface, sprawl).
pip install riskratchet
# or run without installing
uvx riskratchet --help# 1. run your tests with coverage in JSON form
pytest --cov --cov-report=json:coverage.json
# 2. snapshot the current risk profile
riskratchet baseline src --coverage coverage.json --output .riskratchet.json
# 3. inspect what was captured
riskratchet scan src --coverage coverage.json
# 4. fail the build when risk regresses
riskratchet check src --coverage coverage.json --baseline .riskratchet.jsonriskratchet check exits 1 on regressions, 2 on usage errors (e.g. missing
baseline), and 0 otherwise.
For early adoption before a baseline exists, check --fail-above N gates
on an absolute threshold without requiring a baseline (baseline gating
remains the recommended mode for mature codebases):
# No baseline yet: fail if any function scores above 60.
riskratchet check src --coverage coverage.json --fail-above 60
# scan also exposes a no-baseline gate (different exit/output shape).
riskratchet scan src --coverage coverage.json --fail-above 75
riskratchet scan src --coverage coverage.json --fail-severity highWhen --baseline and --fail-above are both given, the baseline gate
is authoritative and --fail-above is ignored with a stderr warning.
riskratchet init scaffolds a [tool.riskratchet] section in
pyproject.toml and prints a ready-to-paste CI snippet. With
--with-baseline (or by saying yes to the interactive prompt on a
TTY when pytest is detected), it also runs pytest --cov and creates
the baseline in one go:
riskratchet init # write config, print snippet
riskratchet init --with-baseline # also run pytest --cov + baseline
riskratchet init --force # replace existing [tool.riskratchet]riskratchet doctor is a six-check pre-flight that names whatever
would make check fail to start (missing paths, missing/malformed
baseline, missing/stale coverage, no git history, unknown config
keys, invalid suppressions) and prints the exact fix command for
each. The status table goes to stdout; the → fix: remediations go
to stderr so you can pipe them separately:
riskratchet doctor # human-readable table + remediation
riskratchet doctor --json # validates against schemas/doctor.schema.json
riskratchet doctor 2>/dev/null # status table only
riskratchet doctor >/dev/null # remediation commands onlydoctor exits 0 only when every check is pass or warn; a single
fail exits 1. The intended workflow is init → doctor → fix the
warnings → baseline → check.
The composite action ships in action.yml so adopters don't have to
copy a workflow file — uses: KayhanB21/riskratchet@v0.2.8 is the
canonical reference. The action installs riskratchet via uv tool install, runs check (--format pr-comment in both baseline and
no-baseline modes), upserts a sticky PR comment, and surfaces the
check exit status so PR checks reflect regressions.
# .github/workflows/riskratchet.yml
on: [pull_request]
jobs:
riskratchet:
runs-on: ubuntu-latest
permissions:
contents: read
pull-requests: write
steps:
- uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2
- uses: KayhanB21/riskratchet@v0.2.8
with:
coverage: coverage.jsonInputs (defaults in parentheses): paths ([tool.riskratchet] paths), coverage (auto-detected), baseline (.riskratchet.json
— when the file is missing, the action runs in --fail-above mode),
fail-above (60), comment (true), python-version (3.12),
riskratchet-version (latest from PyPI), github-token
(${{ github.token }}).
For Marketplace discovery, the KayhanB21/riskratchet-action
wrapper repo is the recommended entry point; it delegates to the
root action.yml so both shapes share one source of truth.
Every tagged release ships supply-chain provenance you can inspect: a CycloneDX
SBOM of the wheel's runtime dependency closure (the sbom workflow artifact), a
signed GitHub build-provenance attestation on the wheel and sdist, and PEP 740
PyPI attestations from Trusted Publishing. To confirm a downloaded wheel was built
by this repo:
gh attestation verify riskratchet-<version>-py3-none-any.whl --owner KayhanB21See docs/threat-model.md for
what each artifact does and does not vouch for.
You've been vibe-coding a FastAPI backend with an AI agent for eight months.
It works, tests are green-ish (62% coverage), but you just noticed
services/billing.py::reconcile_subscriptions quietly grew to 180 lines and
an 11-way match statement you don't remember writing.
pip install riskratchet
pytest --cov --cov-branch --cov-report=json:coverage.json
riskratchet scan src --coverage coverage.json --top 10reconcile_subscriptions shows up at score 71 (high) with
structural_complexity: 90, sprawl: 55, coverage_gap: 60. You also spot a
surprise: a 12-line public utility _normalize_plan_id scoring 48 because it
has zero tests. Snapshot the bar:
riskratchet baseline src --coverage coverage.json --output .riskratchet.json
git add .riskratchet.json && git commit -m "Add riskratchet baseline"From here, every time the agent adds a webhook handler or "refactors the
billing flow," run riskratchet check before committing. If it quietly bloated
reconcile_subscriptions from 180 to 220 lines, the check exits 1 and names
the regression. You stop having to remember to look.
Why this is the canonical use case: AI agents are excellent at adding code, mediocre at noticing they've made things worse. The baseline is your memory.
- Team gating PRs in CI. Run
pytest --covandriskratchet check --format pr-commentin GitHub Actions; pipe togh pr comment. The PR-comment format starts with<!-- riskratchet-report -->so the bot updates the same comment on each push instead of spamming. The ratchet is mechanical and unowned, so nobody has to play "the complexity cop" in code review. See Using riskratchet from an AI coding agent. - Pre-commit hook for a solo repo. Wire
pytest-covandriskratchetinto.pre-commit-config.yamlso every commit regenerates coverage and gates the commit on no regressions. See Pre-commit integration. - Investigating one ugly function. Use
riskratchet explain path/to/file.py::qualnameto dump the six component scores and find the driver (complexity vs. coverage vs. sprawl). After refactoring, runriskratchet diff --json | jq '.improved[], .regressed[]'to prove the change was net-positive, not just rearranging deck chairs.
The classic CRAP score (CC^2 * (1 - line_coverage)^3 + CC) catches one
shape of bad code: complex and poorly tested. That's a real problem, but
it misses several others that ship to production just as often:
- A function with low complexity and zero tests. CRAP gives it
CC(a single digit). Risk is real but invisible. - A function with full line coverage but every branch covered the same way. CRAP only looks at line coverage.
- A function in a 2,000-line module everyone is afraid to touch. Sprawl is invisible to CRAP.
- A function that changed in 40 of the last 90 commits. Churn is invisible to CRAP.
riskratchet keeps CRAP as a reported metric and computes its own composite score from six weighted components so those other risks show up too.
Two things about pre-commit matter for riskratchet:
- Pre-commit hides your unstaged edits before running hooks. Hooks only see the code you're actually about to commit. Useful in general, but it means riskratchet sees a different source tree than the one open in your editor.
- Each
language: pythonhook runs in its own isolated virtualenv that contains riskratchet and its declared deps, not your project's pytest, application code, or test plugins.
Together these create one requirement: the coverage.json riskratchet reads
must reflect the same stashed source tree it's analyzing. Reusing an old
coverage.json from before pre-commit stashed your edits drifts source and
coverage out of sync.
That's why the published hook ships with --no-auto-cov --allow-missing-coverage by default: safe but limited. Pick one of the
patterns below to make it useful.
Run pytest --cov inside the same pre-commit chain so the coverage matches
the stashed tree exactly.
repos:
- repo: local
hooks:
- id: pytest-cov
name: pytest --cov (produces coverage.json for riskratchet)
entry: pytest --cov --cov-branch --cov-report=json:coverage.json -q
language: system
pass_filenames: false
always_run: true
- repo: https://github.com/KayhanB21/riskratchet
rev: v0.2.8
hooks:
- id: riskratchet
args:
- "src"
- "--coverage"
- "coverage.json"
- "--baseline"
- ".riskratchet.json"Skip the isolated venv entirely and run both hooks inside your project's environment. This is what riskratchet itself uses:
repos:
- repo: local
hooks:
- id: pytest-cov
entry: uv run pytest --cov --cov-branch --cov-report=json:coverage.json -q
language: system
pass_filenames: false
always_run: true
- id: riskratchet
entry: uv run riskratchet check src --coverage coverage.json --baseline .riskratchet.json --no-auto-cov
language: system
pass_filenames: false
always_run: trueTwo upsides: single env for both hooks (no isolated-venv surprises), and
uv run resolves the same Python and deps uv sync set up. Downside:
contributors must have uv installed locally.
Override the hook to language: system so it inherits your shell PATH (and
finds your real pytest):
repos:
- repo: local
hooks:
- id: riskratchet
entry: riskratchet check src --baseline .riskratchet.json
language: system
pass_filenames: false
always_run: trueriskratchet runs the configured [tool.riskratchet] test_command (default
pytest --cov --cov-branch --cov-report=json:{output} -q) and caches the
result under .riskratchet/coverage.json. The cache is reused until any .py
file under the scan paths is newer.
For local development outside pre-commit, auto-coverage applies to plain
riskratchet scan|baseline|check too; pass --no-auto-cov to opt out.
riskratchet is designed to be called from agents and parsed without
screen-scraping. See AGENTS.md for the full operational
contract; the recipes below cover the common cases.
Top three highest-risk functions:
riskratchet scan src --coverage coverage.json --json \
| jq '.functions[:3] | .[] | {qualname, score, severity}'Full baseline diff including improvements and removed functions:
riskratchet diff src --coverage coverage.json \
--baseline .riskratchet.json --jsonGate a CI job on regressions:
riskratchet check src --coverage coverage.json \
--baseline .riskratchet.json --json > regressions.json
status=$?
if [ "$status" -eq 1 ]; then
jq -r '.regressions[] | "- \(.qualname): \(.reason)"' regressions.json
exit 1
fi
exit "$status"Post regressions as a PR comment (use --format pr-comment for a sticky body
that updates in place via the <!-- riskratchet-report --> marker; use
--format github for inline workflow warnings):
riskratchet check src --coverage coverage.json \
--baseline .riskratchet.json --format markdown \
| gh pr comment --body-file -Markdown and PR-comment output can link each row back to source:
riskratchet scan src --format pr-comment \
--repo-url https://github.com/acme/project \
--commit-ref "$GITHUB_SHA"In GitHub Actions, those values are filled from GITHUB_SERVER_URL,
GITHUB_REPOSITORY, and GITHUB_SHA when available.
JSON output is validated against the schemas under
schemas/ on every release:
report.schema.json:scan --jsonregressions.schema.json:check --jsondiff.schema.json:diff --jsonbaseline.schema.json:.riskratchet.jsonon disksummary.schema.json:scan|check|diff --summary --jsonconfig.schema.json:config show --json
Native JSON output includes $schema and version fields so consumers can
pin parsing behavior.
- Running
checkwithout a baseline.riskratchet baselinemust run first (typically onmain) and the resulting.riskratchet.jsonchecked in. Exits2when missing. - Passing
coverage.xmlto--coverage. riskratchet readscoverage.json. Generate it withpytest --cov --cov-report=json:coverage.json. - Parsing stdout as both prose and JSON. Pick a format. With
--json, stdout is a single JSON object; status messages go to stderr. Whencheckexits1, a short hint with the two escape hatches (regenerate baseline, or loosen the per-component gate) is written to stderr, so stdout stays clean. - Bumping the baseline to silence a regression. The baseline is the bar;
if it has to move up, do it in a dedicated PR with a written justification.
In
checkoutput, "new" means absent from the baseline, so a function added in an earlier commit can still appear as new until the baseline intentionally accepts it.
For the broader trust boundaries and non-goals, see
docs/threat-model.md.
--exclude skips files at discovery time. --allow analyzes a file but
suppresses matching functions from reporting and gating:
riskratchet check src --baseline .riskratchet.json \
--allow "GeneratedModel.*" \
--allow "src/generated/**"Function patterns match dotted qualified names. Patterns containing / or
** match repo-relative POSIX paths.
The default missing-coverage policy is pessimistic: unmapped functions are treated as uncovered. For partial local runs:
riskratchet scan src --coverage coverage.json --missing-coverage optimistic
riskratchet scan src --coverage coverage.json --missing-coverage skipoptimistic treats missing file coverage as fully covered. skip drops
functions from unmapped files and reports the skipped count.
riskratchet ships a pytest plugin that runs check as part of your test
session:
pytest \
--cov --cov-report=json:coverage.json \
--riskratchet \
--riskratchet-paths src \
--riskratchet-baseline .riskratchet.jsonThe session exits non-zero when riskratchet finds regressions, so CI can gate
on pytest alone. Available flags:
--riskratchet(required to enable)--riskratchet-paths(default:src, repeatable)--riskratchet-baseline(default:.riskratchet.json)--riskratchet-coverage(default:coverage.json)--riskratchet-fail-new-above(default:50)--riskratchet-fail-regression-above(default:5)--riskratchet-fail-existing-above(default: unset)--riskratchet-fail-component-regression-above(default:15)--riskratchet-no-component-regression-gate
Each function gets six component scores in [0, 100]:
| Component | Weight | What it measures |
|---|---|---|
| coverage_gap | 30% | 1 - line_coverage |
| structural_complexity | 25% | cyclomatic complexity, saturating at CC=20 |
| branch_gap | 15% | 1 - branch_coverage when branch coverage is known |
| churn | 10% | commits in the last 90 days, saturating at 10 |
| public_surface | 10% | coverage gap penalised harder when the function is public |
| sprawl | 10% | function length and file length blended |
Total risk is the weighted sum. Severity bands: 0-24 low, 25-49 medium, 50-74 high, 75-100 critical.
is_public is determined statically from the AST: by qualname when no
__all__ is declared (leading-underscore is private, dunders are public);
by additive promotion from a static __all__ (omission never demotes);
fall back to the naming rule when __all__ is dynamic. Full rules in
AGENTS.md.
Each component is rescaled to [0, 100] (where 100 = maximum risk for that
signal) before being weighted into the total. Here's what each one actually
means, with a concrete example.
coverage_gap: "is this function tested at all?"
The fraction of lines in the function that your test suite never executes.
A function with 100% line coverage scores 0; a function with 0% line
coverage scores 100.
Example: a 40-line
parse_invoicewhere your tests only exercise the happy path (28 lines covered, 12 missed) givescoverage_gap = 30. A brand-newmigrate_to_v2with no tests at all givescoverage_gap = 100.
structural_complexity: "how many ways can this function go?"
Cyclomatic complexity, which roughly counts independent paths through the
function (each if, elif, and, or, for, except adds one).
Saturates at CC=20; anything past that is already "very complex" and
there's no value in keeping count.
Example: a getter with one return statement is
CC=1, score 0. Avalidate_user_inputwith 6 chainedif/elifbranches isCC=7, score ~35. A 14-waymatchstatement isCC=15, score ~75.
branch_gap: "are both sides of every if tested?"
Like coverage_gap, but for branches. A function whose tests only ever
take the if True path of an if/else will have full line coverage but
only 50% branch coverage. Only counts when your coverage run included
--cov-branch.
Example:
def discount(user): return 0.2 if user.is_premium else 0.0. A test that only passes premium users gets 100% line coverage but 50% branch coverage, sobranch_gap = 50.
churn: "how often does this function change?"
Number of git commits touching the function's line range in the configured
churn window (default 90 days, set with --churn-days or [tool.riskratchet] churn_window_days). Saturates at 10 commits. High churn means many people
have edited it recently, which correlates with bugs.
Example: a stable
parse_iso_datelast touched two years ago ischurn = 0. Apricing_engine.calculate_totaledited in 14 of the last 90 commits saturates at 10, sochurn = 100.
public_surface: "if this breaks, do callers we can't see break too?"
A multiplier on coverage gap: when a function is part of your public API,
its missing coverage is penalised harder than the same gap on a private
helper. A private helper with 40% coverage is a problem you can fix
locally; a public function with 40% coverage is a contract problem.
Example:
_normalize_pathwith 50% coverage givespublic_surface = 25. Publicformat_currencywith 50% coverage givespublic_surface = 50._LegacyExposedlisted in__all__with 50% coverage givespublic_surface = 50(promoted to public despite the underscore).
sprawl: "is this function (or its file) just too big?"
A blend of function length and the surrounding file's length. Long
functions are harder to hold in your head; long files mean any function in
them has more neighbors competing for attention. Both contribute.
Example: a 12-line function in a 200-line file gives
sprawl = 5. A 180-line function in a 2,000-line module givessprawl = 85.
Suppose services/billing.py::reconcile_subscriptions is 180 lines, public,
has CC=14, 55% line coverage, 40% branch coverage, no recent churn, and
lives in a 900-line file. Its components might look like:
| Component | Raw signal | Score | Weight | Contribution |
|---|---|---|---|---|
| coverage_gap | 45% uncovered | 45 | 0.30 | 13.5 |
| structural_complexity | CC=14 of 20 saturating | 70 | 0.25 | 17.5 |
| branch_gap | 60% uncovered branches | 60 | 0.15 | 9.0 |
| churn | 0 commits in 90 days | 0 | 0.10 | 0.0 |
| public_surface | public + 45% gap | 45 | 0.10 | 4.5 |
| sprawl | long function, big file | 65 | 0.10 | 6.5 |
| total | 51.0 |
Score 51 puts this in the high severity band. The dominant drivers are complexity and branch coverage; if you wanted to lower it without rewriting the function, the cheapest path is adding branch tests, not deleting lines.
Drop a [tool.riskratchet.weights] table in pyproject.toml to override any
subset; the remaining components keep their defaults and the whole vector is
renormalized. For example, to ignore churn entirely and double-weight
coverage:
[tool.riskratchet.weights]
coverage_gap = 0.6
churn = 0.0Unknown keys and negative values are rejected at startup so a typo cannot silently weaken the score.
riskratchet scan src --coverage coverage.json --format table # default
riskratchet scan src --coverage coverage.json --json # shortcut for --format json
riskratchet scan src --coverage coverage.json --format markdown # for PR comments
riskratchet scan src --coverage coverage.json --format sarif # for SARIF consumers
riskratchet scan src --coverage coverage.json --format github # GitHub Actions annotations
riskratchet scan src --coverage coverage.json --format pr-comment
riskratchet scan src --coverage coverage.json --summary # aggregate lines only
riskratchet scan src --coverage coverage.json --summary --json # schema-backed summary envelope
riskratchet scan src --coverage coverage.json --quiet # drops the trailing summary line
riskratchet scan src --coverage coverage.json --min-score 50 # hide lower-risk functions
riskratchet scan src --coverage coverage.json --top 10 # emit only the top NSARIF intentionally has a narrower contract than native JSON: scan --format sarif emits current findings after the same score filter used for
annotations, while check --format sarif and diff --format sarif emit only
failing regressions. A clean baseline still produces valid SARIF with an
empty results array.
Native JSON output (truncated):
{
"$schema": "https://github.com/KayhanB21/riskratchet/schemas/report.schema.json",
"version": "0.2",
"summary": {
"total_functions": 10,
"analyzed_functions": 42,
"emitted_functions": 10,
"total_files": 6,
"coverage_status": "present",
"suppressed_functions": 1,
"skipped_missing_coverage": 0,
"by_severity": { "low": 1, "medium": 6, "high": 3, "critical": 0 }
},
"functions": [
{
"path": "src/foo.py",
"qualname": "Foo.bar",
"score": 62.3,
"severity": "high",
"components": {
"coverage_gap": 80.0, "structural_complexity": 55.0,
"branch_gap": 70.0, "churn": 30.0,
"public_surface": 80.0, "sprawl": 10.0
},
"crap": 12.4
}
]
}Diagnostics never touch stdout — they go to stderr (or a file), so --json
consumers and pipes stay clean:
riskratchet scan src --verbose # human-readable run diagnostics on stderr
riskratchet scan src --debug-json # same diagnostics as a JSON envelope on stderr
riskratchet scan src --debug-json-file diag.json # ...or written to a fileThe --debug-json envelope reports the coverage source (single / map / auto,
including whether the auto-coverage cache was reused or regenerated), git/churn
settings, include/exclude/allow filter effects, the analysis tallies, and (for
check/diff) the resolved baseline. It is validated against
schemas/debug.schema.json and is its own versioned contract.
When redaction is active, the diagnostics surfaces above (banner, --verbose,
--debug-json) hash their paths too, so a --private-comment run does not leak
through diagnostics.
For closed-source repos, redaction hashes identifiers in every output format while leaving the ratchet decision unchanged (redaction runs after baseline matching):
riskratchet check src --coverage coverage.json --redact-paths # hash file paths
riskratchet check src --coverage coverage.json --redact-qualnames # hash function names
riskratchet check src --coverage coverage.json --private-comment # both + drop source linksSalt. Hashes are salted, with this precedence: --redact-salt TEXT, then
RISKRATCHET_REDACT_SALT, then [tool.riskratchet] redact_salt. With none set,
the salt is derived from the commit (GITHUB_REPOSITORY@GITHUB_SHA, else
git rev-parse HEAD); riskratchet warns only when there is no salt source at
all, because unsalted hashes over known paths are guessable. So hashes are
stable within a commit (scan/check/diff in one run correlate) and intentionally
unlinkable across commits and repos — set an explicit --redact-salt if you
need a fixed mapping across commits.
The baseline command does not accept redaction flags — the persisted baseline
is the source of truth for future rename matching and is never hashed.
Validate project config before relying on it in CI:
riskratchet config validate --config pyproject.toml
riskratchet config show --config pyproject.toml --jsonconfig validate exits 2 for malformed TOML, unknown keys, invalid value
types, or invalid groups.
riskratchet finds config by walking upward from the working directory for the
nearest pyproject.toml containing [tool.riskratchet] (the nearest one wins
if several ancestors define it; pass --config to point at a specific file).
Relative config paths (paths, coverage, baseline, the coverage map, the
coverage cache) resolve against that file's directory, and auto-generated
coverage runs from there too, so running from a nested package directory gives
the same result as running from the project root. An explicit --coverage,
positional path arguments, and the no-argument default all stay relative to
your current directory. The scanning commands only warn on an unknown
[tool.riskratchet] key (and on a pyproject.toml that fails to parse during
the walk), so a config written for a newer version still runs; reach for
config validate when you want that typo to fail (exit 2) in CI.
Roll function-level results up by package or workspace area with
[tool.riskratchet.groups]. Each function is assigned to the longest
matching repo-relative prefix; ungrouped functions are reported as null in
JSON and ungrouped in text or markdown.
[tool.riskratchet.groups]
core = "src/core"
api = ["src/api", "src/public_api"]For packages/* / services/* layouts where one coverage.json is not
practical, declare a per-prefix coverage map (or pass --coverage-map on the
CLI; longest matching prefix wins):
[tool.riskratchet]
paths = ["packages/alpha", "packages/beta"]
[tool.riskratchet.coverage_map]
"packages/alpha" = "packages/alpha/coverage.json"
"packages/beta" = "packages/beta/coverage.json"
[tool.riskratchet.groups]
alpha = "packages/alpha"
beta = "packages/beta"One repo-level baseline (recommended for tight coupling) is global; one baseline per package is useful when packages release independently. Every command prints a diagnostic banner to stderr summarizing the resolved root, scan paths, and coverage source.
I ran riskratchet against four widely-used Python libraries to show what its
output looks like on production code. Each was cloned fresh, its own test
suite run with pytest --cov --cov-report=json:coverage.json, then scanned.
Top findings:
| Library | Function | Score | CC | Line cov |
|---|---|---|---|---|
| python-slugify | __main__::main |
53.1 (high) | 3 | 11% (0% branch) |
| python-slugify | slugify |
33.3 | 27 | 88% |
| tabulate | _CustomTextWrap._wrap_chunks |
44.4 | 31 | 60% |
| tabulate | _normalize_tabular_data |
42.6 | 76 | 78% |
| tabulate | tabulate (entry) |
37.1 | 62 | 97% |
| humanize | precisedelta |
32.9 | 26 | 100% |
| humanize | naturaldelta |
32.4 | 33 | 100% |
| inflect | engine._sinoun |
36.7 | 108 | 98% |
| inflect | engine._plnoun |
36.2 | 100 | 99% |
The point is not that these libraries are bad. They have all-green CI and many users. The point is that even mature, well-tested code accumulates functions where complexity, coverage, and sprawl combine into something worth a second pair of eyes. A CC=108 function with 98% coverage is not on fire; it is a function that works and is tested. The ratchet's job is to keep those numbers from getting worse over time.
| Tool | Per-function risk | Baseline / ratchet | Combines complexity + coverage + churn |
|---|---|---|---|
| coverage.py | line / branch only | no | no |
| radon | complexity only | no | no |
| xenon | complexity only | yes (threshold) | no |
| pytest-crap | yes (CRAP) | no | partial (CC + line coverage) |
| riskratchet | yes | yes | yes |
The same commands run in GitHub Actions:
uv sync --locked
uv run ruff check .
uv run ruff format --check .
uv run mypy src tests
uv run pytest --cov=src/riskratchet --cov-branch --cov-report=term-missing
uv build --clearStrict typing covers both src/ and tests/.
