feat(verification): outcome verification engine (Stage 2b) #6
Closed
pradeepvrd wants to merge 3 commits into
398f7fc to 12eb352
Modules moved/refactored:
- pkg/agents/verifier/base.py -> devops_bench/verification/base.py
- pkg/agents/verifier/verifier.py -> devops_bench/verification/runner.py
- pkg/agents/verifier/spec.py -> devops_bench/verification/spec.py
- pkg/agents/verifier/pod_healthy.py -> devops_bench/verification/verifiers/pod_healthy.py
- pkg/agents/verifier/scaling_complete.py -> devops_bench/verification/verifiers/scaling_complete.py
- pkg/agents/verifier/test_verifier.py -> tests/unit/verification/ (8 unittest
cases ported to pytest, split across runner/pod_healthy/scaling_complete)
Bugs fixed vs legacy:
- (none in this commit; see fix(verification) commits)
Improvements vs legacy:
- Route all kubectl through the Stage-1 devops_bench.k8s wrappers (wait,
get_json, poll_until) instead of raw subprocess; preserves pod Ready/Running
and deployment readyReplicas>=min_replicas semantics with backoff polling.
- Replace print()/stderr noise with structured get_logger diagnostics in the
pod_healthy fallback and scaling_complete except paths.
- Apache headers, Google-style docstrings, __all__, from __future__ annotations,
and pydantic v2 models throughout; minimal package __init__ (no eager heavy
imports).
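For readers unfamiliar with backoff polling, the following is a minimal sketch of the idea, not the actual devops_bench.k8s.poll_until (its signature is not shown in this PR). The parameter names, the default delays, and the injectable clock/sleep hooks are all assumptions made for illustration.

```python
import time
from typing import Callable


def poll_until(
    check: Callable[[], bool],
    timeout_sec: float,
    initial_delay: float = 0.5,
    max_delay: float = 8.0,
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Hypothetical helper: call check() until it passes or the timeout elapses.

    The delay between attempts doubles after each failure, capped at
    max_delay, and never sleeps past the deadline.
    """
    deadline = clock() + timeout_sec
    delay = initial_delay
    while True:
        if check():
            return True
        remaining = deadline - clock()
        if remaining <= 0:
            return False
        # Sleep for the current backoff step, but not beyond the deadline.
        sleep(min(delay, remaining))
        delay = min(delay * 2, max_delay)
```

Injecting clock and sleep keeps the helper testable without real waiting; a verifier would pass a check that reads the current resource state.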
Modules moved/refactored:
- see base move commit (devops_bench/verification/*; verifiers/{pod_healthy,scaling_complete}.py, runner.py)
Bugs fixed vs legacy:
- Null k8s status AttributeError (both verifiers): when the API returns
"status": null, x.get("status", {}) returns None and the chained .get(...)
raised AttributeError. Guard with (x.get("status") or {}).get(...) in
pod_healthy._check_pods_status and scaling_complete._check_scaling.
- Timeout-budget overrun: VerifierAgent._remaining clamped to max(1, ...), so
once the shared budget was exhausted every remaining list/dict member still
got a borrowed 1s and ran. _remaining may now return <=0, and both compound
loops short-circuit when remaining <= 0, recording the unrun member as timed
out via the new _timed_out_result helper.
- Runner re-wrap: the compound loops re-wrapped each member with
VerificationSpec(sub_spec); members are passed through to wait_for_condition
directly (its guard handles raw dict/list), avoiding a needless re-validation
that breaks once members are already VerificationSpec instances.
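The null-status bug in the first item can be reproduced in isolation. This is a small sketch, not the code from pod_healthy._check_pods_status or scaling_complete._check_scaling; the function names below are hypothetical and only the guard expression is taken from the commit message.

```python
from typing import Optional


def naive_phase(pod: dict) -> Optional[str]:
    # The default {} only applies when the key is missing. With
    # "status": null the key exists, so .get returns None and the
    # chained .get raises AttributeError.
    return pod.get("status", {}).get("phase")


def guarded_phase(pod: dict) -> Optional[str]:
    # "or {}" also covers an explicit None, so the chained lookup is safe.
    return (pod.get("status") or {}).get("phase")


def guarded_ready_replicas(deployment: dict) -> int:
    # Same guard for deployments; a missing or null readyReplicas counts as 0.
    status = deployment.get("status") or {}
    return status.get("readyReplicas") or 0
```

The guard treats a missing status and a null status the same way, which matches how both verifiers are described as behaving after the fix.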
Improvements vs legacy:
- (none in this commit; see feat(verification) commit)
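The timeout-budget fix above can be sketched as follows. This is an illustration under assumptions, not VerifierAgent itself: run_members, timed_out_result, the result dict shape, and the injectable clock are hypothetical stand-ins for the runner's compound loop, _timed_out_result, and _remaining.

```python
import time
from typing import Callable


def timed_out_result(name: str) -> dict:
    # Hypothetical stand-in for the _timed_out_result helper.
    return {"name": name, "success": False, "reason": "shared timeout budget exhausted"}


def run_members(
    members: dict[str, Callable[[float], bool]],
    timeout_sec: float,
    clock: Callable[[], float] = time.monotonic,
) -> list[dict]:
    """Run each member against one shared deadline."""
    deadline = clock() + timeout_sec
    results = []
    for name, run in members.items():
        # No max(1, ...) clamp: the remaining budget may be <= 0.
        remaining = deadline - clock()
        if remaining <= 0:
            # Short-circuit: record the member as timed out without running it.
            results.append(timed_out_result(name))
            continue
        results.append({"name": name, "success": run(remaining), "reason": ""})
    return results
```

With the old clamp, every member after the budget ran out would still receive one second, so a long compound spec could overrun the caller's timeout by roughly one second per remaining member.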
…timeouts
Modules moved/refactored:
- see base move commit (devops_bench/verification/{base,spec,runner}.py;
verifiers/{pod_healthy,scaling_complete}.py)
Bugs fixed vs legacy:
- (none in this commit; see fix(verification) commit)
Improvements vs legacy:
- Recursive VerificationSpec: the RootModel union now references VerificationSpec
for its list members and dict values (forward ref + model_rebuild()), so
compound specs nest arbitrarily (list-of-lists, dict-of-lists, ...) instead of
raising ValidationError. The dispatcher's recursion is now actually reachable.
- Optional kubeconfig on BaseVerifier (kubeconfig: str | None = None), forwarded
to the devops_bench.k8s wrappers (wait/get_json) in both verifiers for
multi-cluster targeting; None keeps the ambient kubeconfig.
- Widen timeout_sec int -> float through verify(), wait_for_condition, and
_remaining for precision parity with the k8s wrappers.
- Tests: nested-spec dispatch (dict-of-list, list-of-lists) and kubeconfig
forwarding for both verifiers.
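The nested dispatch that the recursive spec makes reachable can be sketched with plain dicts and lists. This does not use the pydantic models from spec.py; the dispatch function, the path labels, and the rule that a leaf is a dict with a string "type" key are assumptions chosen for illustration.

```python
from typing import Any, Callable


def dispatch(
    spec: Any,
    run_leaf: Callable[[dict], bool],
    path: str = "root",
) -> dict[str, bool]:
    """Walk a nested spec and run every leaf, keyed by its position."""
    if isinstance(spec, list):
        results: dict[str, bool] = {}
        for index, member in enumerate(spec):
            # Recurse so that list-of-lists and list-of-dicts both work.
            results.update(dispatch(member, run_leaf, f"{path}[{index}]"))
        return results
    if isinstance(spec, dict):
        # Assumption: a leaf verifier spec carries a string "type" key.
        if isinstance(spec.get("type"), str):
            return {path: run_leaf(spec)}
        results = {}
        for key, member in spec.items():
            results.update(dispatch(member, run_leaf, f"{path}.{key}"))
        return results
    raise TypeError(f"unsupported spec at {path}: {type(spec).__name__}")
```

A dict-of-list such as {"pods": [{"type": "pod_healthy"}, [{"type": "scaling_complete"}]]} yields one result per leaf, keyed root.pods[0] and root.pods[1][0].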
12eb352 to 147dad4
Superseded by the reconciled cross-cutting refactor (see docs/refactor/e2e-refactor-sequencing-plan.md). Reworked into the layered devops_bench/ package on branch refactor/integration; replaced by the reworked component PRs and capstone #23. Closing as superseded.
Restructures the legacy verifier into devops_bench/verification/ (← pkg/agents/verifier/*).
- base.py (VerificationResult, BaseVerifier ABC), spec.py (discriminated union), runner.py (recursive dispatcher), verifiers/{pod_healthy,scaling_complete}.py.
- devops_bench.k8s wrappers (no raw subprocess).
- tests/unit/verification/.

Stacked draft PR: part of the in-place Stage 2/3 restructure (see docs/migration/pr-plan.md). Base is the fork branch shown above; it will be retargeted to gke-labs/main once Stage 1 (gke-labs#89–92) merges. PRs are intended to be reviewed and merged in stage order.

Status: peer-reviewed by 2 teammates + senior sign-off on the full integration branch; full suite green (ruff + 374 unit tests). Do NOT mark ready until its stage is up for merge.