The asymmetry
tilth/validators.py:
- run_pytest(workspace, task_ids) filters to the test globs (test_t<NNN>_*.py) of the current task and of completed tasks. This is the ratchet.
- run_ruff(workspace) runs ruff check . across the whole workspace, with no task scoping (validators.py:80).
So at an early-stage task, ruff lints the future tasks' seed test files, which legitimately have unsorted imports / references to modules that don't exist yet. The worker can't fix them (touching another task's files is a hard scope_creep reject — correctly). It's cornered: ruff fails, and the only files it's allowed to change are its own.
This is F5 (proposals/frictions-2026-05-26.md — "worker affordance bleed: validators filter, worker doesn't") realised specifically for the lint floor.
Observed cost
Demo session 20260529-134013 (Phase 3 validation), T-001:
- 27 iterations / 226k tokens — ~15 spent on the ruff-vs-future-task-seed-files dance, vs T-002 (6 iters) and T-003 (12) which were clean.
- iter 16: worker ran ruff --fix on the future-task test files → modified them → rejected scope_creep (correct).
- iter 27 (accepted): worker added [tool.ruff.lint] per-file-ignores for I001 on the future-task seed files, in its own pyproject.toml, so workspace ruff passes without touching them.
The per-file-ignore is the same shape as the [tool.ruff] exclude the frictions doc flagged as F12 gaming (session 173822). Phase 3's work_arounds made it declared and adjudicated (the evaluator accepted it as legit scoping) rather than hidden — a real improvement — but whether it's "legit scoping" or "papering over a harness gap" is exactly the ambiguity this asymmetry creates. The worker shouldn't have to make that call.
Proposed fix
Scope run_ruff to the task's owned files the way run_pytest is scoped — at minimum, exclude future-task seed test files from the lint at earlier stages (the harness knows the task ids and the test_t<NNN> convention). Ruff should see the same "live at this stage" set pytest does.
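A minimal sketch of how the harness might work out which seed test files are not yet live, so they can be kept out of the lint. Everything here is illustrative: the helper names (task_number, future_task_tests) are hypothetical, and the mapping from a task id like T-001 to test_t001_*.py is an assumption based on the ids and the test_t<NNN> convention mentioned above, not the actual tilth code.

```python
import re
from pathlib import Path

# Assumption: task ids look like "T-001" and own test files named test_t001_*.py.
_TASK_ID = re.compile(r"^T-(\d{3})$")
_SEED_TEST = re.compile(r"^test_t(\d{3})_.*\.py$")


def task_number(task_id: str) -> str:
    """Extract the zero-padded number from a task id such as "T-001"."""
    match = _TASK_ID.match(task_id)
    if match is None:
        raise ValueError(f"unrecognised task id: {task_id!r}")
    return match.group(1)


def future_task_tests(workspace: Path, live_task_ids: list[str]) -> list[Path]:
    """Return seed test files whose task is neither current nor completed."""
    live = {task_number(task_id) for task_id in live_task_ids}
    future = []
    for path in sorted(workspace.rglob("test_t*_*.py")):
        match = _SEED_TEST.match(path.name)
        if match and match.group(1) not in live:
            future.append(path)
    return future
```

Here live_task_ids would be the same list run_pytest already receives (current task plus completed tasks), so both validators would agree on what is live at this stage.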
Open questions for the fix:
- Lint scope: the worker's diff, or task-owned source + this task's test glob + completed tasks' globs (mirroring the pytest ratchet)? The latter keeps the regression-guard intent.
- Passing explicit paths to ruff check <paths> vs. a generated --exclude for future-task globs. Explicit paths are cleaner but must include pyproject.toml etc.
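To make the second question concrete, here is a hedged sketch of the two command lines run_ruff could build. The function names are hypothetical and this is not the current validators.py code. The exclude variant uses ruff's --extend-exclude rather than --exclude, on the assumption that the generated patterns should add to any excludes already configured in the workspace rather than replace them.

```python
from pathlib import Path


def ruff_cmd_explicit_paths(live_paths: list[Path]) -> list[str]:
    """Variant A: lint only the files and directories live at this stage."""
    return ["ruff", "check", *[str(path) for path in live_paths]]


def ruff_cmd_with_excludes(future_globs: list[str]) -> list[str]:
    """Variant B: lint the whole workspace minus future-task globs."""
    cmd = ["ruff", "check", "."]
    if future_globs:
        # ruff accepts a comma-separated list of file patterns here.
        cmd += ["--extend-exclude", ",".join(future_globs)]
    return cmd
```

Variant A requires the harness to enumerate everything that should be linted, so a forgotten path silently goes unchecked. Variant B keeps the workspace-wide default and only subtracts the future-task globs, which stays closer to today's behaviour.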
Why it matters
- Removes a recurring per-task token tax (inflates OQ#6 in the v1 plan).
- Removes the F12-gaming-vs-legit ambiguity at the source, so the evaluator isn't asked to bless lint-silencing workarounds.
- Tightens the mechanical floor: the worker is judged on lint of what it owns, not on noise from work it isn't allowed to do yet.
Related
20260529-134013)