feat(verification): discriminated-union spec + deadline runner + VERIFIERS registry #27

Draft
pradeepvrd wants to merge 1 commit into submit/1-models-loop from submit/2-verification

pradeepvrd commented Jun 20, 2026

Owner

Verification used to live in pkg/agents/verifier/ (verifier.py, spec.py, hand-maintained leaf checks); this migrates it into devops_bench/verification/ with a registry-driven spec, a deadline-bounded runner, and a VERIFIERS registry.
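
A minimal sketch of what a registry-driven spec could look like, using only the standard library. The `Registry` class, the `register("...")` decorator signature, the `parse` and `known_types` helpers, and the fields on `PodHealthySpec` are all assumptions made for illustration; the PR only names `VERIFIERS.register()` and a `pod_healthy` verifier module.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List, Type


class Registry:
    """Hypothetical registry mapping a spec's `type` tag to its spec class."""

    def __init__(self) -> None:
        self._by_type: Dict[str, Type[Any]] = {}

    def register(self, type_name: str) -> Callable[[Type[Any]], Type[Any]]:
        """Class decorator: record `cls` under `type_name` and return it unchanged."""

        def decorator(cls: Type[Any]) -> Type[Any]:
            if type_name in self._by_type:
                raise ValueError(f"duplicate verifier type: {type_name!r}")
            self._by_type[type_name] = cls
            return cls

        return decorator

    def parse(self, node: Any) -> Any:
        """Build a spec object from a dict; an explicit `type` key is required."""
        if not isinstance(node, dict) or "type" not in node:
            # No inference from bare lists/dicts: the tag must be present.
            raise ValueError("spec node needs an explicit 'type'")
        fields = dict(node)
        type_name = fields.pop("type")
        try:
            cls = self._by_type[type_name]
        except KeyError:
            raise ValueError(f"unknown verifier type: {type_name!r}") from None
        return cls(**fields)

    def known_types(self) -> List[str]:
        """Sorted list of registered tags, e.g. for generating a schema."""
        return sorted(self._by_type)


VERIFIERS = Registry()


@VERIFIERS.register("pod_healthy")
@dataclass
class PodHealthySpec:
    # Field names are illustrative, not taken from the PR.
    namespace: str
    selector: str
    timeout_sec: float = 60.0
```

With this shape, adding a verifier means adding one decorated class; `VERIFIERS.parse({"type": "pod_healthy", "namespace": "default", "selector": "app=web"})` returns a `PodHealthySpec`, and no central `Union` has to be edited.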

Behavior changes

  • Spec parsing is registry-driven (VERIFIERS.register()): new verifiers register themselves and appear in the schema without editing a central discriminated Union.
  • The runner computes one time.monotonic() deadline shared across the whole spec tree: sequence consumes it serially and fails fast on the first failing child; parallel hands each child the full remaining budget and ANDs the results.
  • timeout_sec is a float clamped to the deadline (was an int); a leaf short-circuits when under ~1s of budget remains instead of issuing a doomed kubectl wait.
  • Compound results nest typed VerificationResult children (children + leaf raw) instead of dicts/lists; nodes require an explicit type (no compound inference from bare lists/dicts).
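
The runner behavior described in the list above can be sketched roughly as follows. This is an illustration under stated assumptions, not the PR's code: the `Leaf`, `Sequence`, and `Parallel` classes, the `run` and `verify` functions, the `MIN_BUDGET_SEC` constant, and the use of a thread pool for `parallel` are all hypothetical. Only `VerificationResult`, `children`, `raw`, `timeout_sec`, and the shared `time.monotonic()` deadline come from the description.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass, field
from typing import Any, Callable, List

MIN_BUDGET_SEC = 1.0  # assumed threshold for the "~1s" short-circuit


@dataclass
class VerificationResult:
    ok: bool
    reason: str = ""
    raw: Any = None  # leaf payload
    children: List["VerificationResult"] = field(default_factory=list)


@dataclass
class Leaf:
    check: Callable[[float], bool]  # receives the effective timeout in seconds
    timeout_sec: float = 30.0


@dataclass
class Sequence:
    children: List[Any]


@dataclass
class Parallel:
    children: List[Any]


def run(node: Any, deadline: float) -> VerificationResult:
    """Evaluate `node` against one absolute deadline shared by the whole tree."""
    if isinstance(node, Leaf):
        remaining = deadline - time.monotonic()
        if remaining < MIN_BUDGET_SEC:
            # Too little budget left: skip the check rather than start it.
            return VerificationResult(False, "deadline exhausted")
        timeout = min(node.timeout_sec, remaining)  # clamp to the deadline
        return VerificationResult(node.check(timeout), raw={"timeout_sec": timeout})
    if isinstance(node, Sequence):
        results: List[VerificationResult] = []
        for child in node.children:
            result = run(child, deadline)
            results.append(result)
            if not result.ok:
                break  # fail fast: later children are not run
        return VerificationResult(all(r.ok for r in results), children=results)
    if isinstance(node, Parallel):
        if not node.children:
            return VerificationResult(True)
        # Every child sees the same deadline, so each gets the full remaining
        # budget; map() returns once all children have finished.
        with ThreadPoolExecutor(max_workers=len(node.children)) as pool:
            results = list(pool.map(lambda c: run(c, deadline), node.children))
        return VerificationResult(all(r.ok for r in results), children=results)
    raise TypeError(f"unsupported spec node: {type(node).__name__}")


def verify(spec: Any, budget_sec: float) -> VerificationResult:
    """Compute the deadline once, then walk the tree."""
    return run(spec, time.monotonic() + budget_sec)
```

Passing an absolute deadline down the tree, rather than a relative timeout, is what keeps nested `sequence` and `parallel` nodes from each restarting the clock.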

Bugs fixed

  • parallel checks used to block until the timeout even when all children finished early; they now return as soon as the children complete.
  • Per-child timeout rebudgeting (initial − elapsed) let repeated children drift and extend the overall budget; the shared monotonic deadline removes that.
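
The old runner is not shown in this PR description, so the following is only an assumed reconstruction of how per-child rebudgeting can overshoot, contrasted with a shared deadline. `FakeClock`, `run_rebudgeted`, and `run_with_deadline` are hypothetical names, and the worst case is modeled by having every child use its whole timeout.

```python
from typing import List


class FakeClock:
    """Deterministic stand-in for time.monotonic() / time.sleep()."""

    def __init__(self) -> None:
        self.now = 0.0

    def monotonic(self) -> float:
        return self.now

    def sleep(self, seconds: float) -> None:
        self.now += seconds


def run_rebudgeted(child_timeouts: List[float], clock: FakeClock) -> float:
    """Assumed old style: each child measures `initial - elapsed` from its own start."""
    for initial in child_timeouts:
        started = clock.monotonic()
        while True:
            remaining = initial - (clock.monotonic() - started)
            if remaining <= 0:
                break
            clock.sleep(min(1.0, remaining))  # poll in 1s steps
    return clock.monotonic()


def run_with_deadline(child_timeouts: List[float], budget: float, clock: FakeClock) -> float:
    """Shared deadline: every child's timeout is clamped to what is left overall."""
    deadline = clock.monotonic() + budget
    for initial in child_timeouts:
        timeout = min(initial, deadline - clock.monotonic())
        if timeout <= 0:
            break  # budget is gone; remaining children are not started
        clock.sleep(timeout)
    return clock.monotonic()


print(run_rebudgeted([30.0, 30.0, 30.0], FakeClock()))           # 90.0
print(run_with_deadline([30.0, 30.0, 30.0], 60.0, FakeClock()))  # 60.0
```

In the first variant nothing ties the children to a common end time, so three 30-second children can take 90 seconds; in the second, total elapsed time cannot exceed the 60-second budget.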

Comment thread devops_bench/verification/registry.py Outdated
Comment thread devops_bench/verification/verifiers/pod_healthy.py
Comment thread devops_bench/verification/runner.py
Comment thread devops_bench/verification/runner.py
Comment thread devops_bench/verification/schema.py Outdated
pradeepvrd force-pushed the submit/1-models-loop branch from 39c87ff to 88c9939 (June 21, 2026 01:30)
pradeepvrd force-pushed the submit/2-verification branch from 6592b55 to a510df6 (June 21, 2026 01:30)
pradeepvrd force-pushed the submit/1-models-loop branch from 88c9939 to 2827366 (June 23, 2026 06:37)
pradeepvrd force-pushed the submit/2-verification branch 2 times, most recently from 9500fba to 2da6431 (June 23, 2026 07:19)
pradeepvrd force-pushed the submit/1-models-loop branch from 2827366 to 49f0968 (June 23, 2026 17:59)
pradeepvrd force-pushed the submit/2-verification branch from 2da6431 to 415e316 (June 23, 2026 18:02)