feat(verification): outcome verification engine (Stage 2b)#6

Closed
pradeepvrd wants to merge 3 commits into integration/devops-bench-stage1 from feat/devops-bench-verification

@pradeepvrd

Restructures the legacy verifier into devops_bench/verification/ (← pkg/agents/verifier/*).

  • base.py (VerificationResult, BaseVerifier ABC), spec.py (discriminated union), runner.py (recursive dispatcher), verifiers/{pod_healthy,scaling_complete}.py.
  • kubectl routed through the Stage-1 devops_bench.k8s wrappers (no raw subprocess).
  • All 8 legacy verifier cases ported to pytest under tests/unit/verification/.

Stacked draft PR — part of the in-place Stage 2/3 restructure (see docs/migration/pr-plan.md). Base is the fork branch shown above; it will be retargeted to gke-labs/main once Stage 1 (gke-labs#89–92) merges. PRs are intended to be reviewed and merged in stage order.

Status: peer-reviewed by 2 teammates + senior sign-off on the full integration branch; full suite green (ruff + 374 unit tests). Do NOT mark ready until its stage is up for merge.

@pradeepvrd force-pushed the feat/devops-bench-verification branch from 398f7fc to 12eb352 on June 18, 2026 07:57
Modules moved/refactored:
- pkg/agents/verifier/base.py             -> devops_bench/verification/base.py
- pkg/agents/verifier/verifier.py         -> devops_bench/verification/runner.py
- pkg/agents/verifier/spec.py             -> devops_bench/verification/spec.py
- pkg/agents/verifier/pod_healthy.py      -> devops_bench/verification/verifiers/pod_healthy.py
- pkg/agents/verifier/scaling_complete.py -> devops_bench/verification/verifiers/scaling_complete.py
- pkg/agents/verifier/test_verifier.py    -> tests/unit/verification/ (8 unittest cases ported to pytest, split across runner/pod_healthy/scaling_complete)

Bugs fixed vs legacy:
- (none in this commit; see fix(verification) commits)

Improvements vs legacy:
- Route all kubectl through the Stage-1 devops_bench.k8s wrappers (wait, get_json,
  poll_until) instead of raw subprocess; preserves pod Ready/Running and
  deployment readyReplicas>=min_replicas semantics with backoff polling.
- Replace print()/stderr noise with structured get_logger diagnostics in the
  pod_healthy fallback and scaling_complete except paths.
- Apache headers, Google-style docstrings, __all__, from __future__ annotations,
  and pydantic v2 models throughout; minimal package __init__ (no eager heavy imports).
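
The backoff polling mentioned above can be illustrated with a minimal, standard-library-only sketch. It is not the devops_bench.k8s implementation: `poll_with_backoff`, its parameters, and the `scaling_complete` helper are hypothetical names chosen for illustration, and the real `poll_until` signature may differ. Only the `readyReplicas >= min_replicas` condition is taken from the commit message.

```python
from __future__ import annotations

import time
from typing import Any, Callable


def poll_with_backoff(
    check: Callable[[], bool],
    timeout_sec: float,
    initial_delay: float = 0.5,
    max_delay: float = 8.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> bool:
    """Call check() until it returns True or timeout_sec elapses.

    Hypothetical helper: the delay doubles after each failed attempt,
    capped at max_delay, and never sleeps past the deadline.
    """
    deadline = clock() + timeout_sec
    delay = initial_delay
    while True:
        if check():
            return True
        remaining = deadline - clock()
        if remaining <= 0:
            return False
        sleep(min(delay, remaining))
        delay = min(delay * 2, max_delay)


def scaling_complete(deployment: dict[str, Any], min_replicas: int) -> bool:
    """True once the deployment reports at least min_replicas ready replicas."""
    status = deployment.get("status") or {}
    return (status.get("readyReplicas") or 0) >= min_replicas
```

Injecting `sleep` and `clock` is a design choice for testability: a unit test can advance a fake clock instead of waiting in real time.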
Modules moved/refactored:
- see base move commit (devops_bench/verification/*; verifiers/{pod_healthy,scaling_complete}.py, runner.py)

Bugs fixed vs legacy:
- Null k8s status AttributeError (both verifiers): when the API returns
  "status": null, x.get("status", {}) returns None and the chained .get(...)
  raised AttributeError. Guard with (x.get("status") or {}).get(...) in
  pod_healthy._check_pods_status and scaling_complete._check_scaling.
- Timeout-budget overrun: VerifierAgent._remaining clamped to max(1, ...), so
  once the shared budget was exhausted every remaining list/dict member still
  got a borrowed 1s and ran. _remaining may now return <=0, and both compound
  loops short-circuit when remaining <= 0, recording the unrun member as timed
  out via the new _timed_out_result helper.
- Runner re-wrap: the compound loops re-wrapped each member with
  VerificationSpec(sub_spec). Members are now passed through to
  wait_for_condition directly (its guard handles raw dict/list), avoiding a
  needless re-validation that breaks once members are already
  VerificationSpec instances.
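
The null-status bug in the first bullet above comes down to how `dict.get` treats a key that is present with a `None` value: the default is used only when the key is missing. The sketch below reproduces the failure and the guard on a plain dict. The function names are hypothetical; only the two `.get` expressions mirror the commit message.

```python
from __future__ import annotations

from typing import Any


def ready_replicas_legacy(obj: dict[str, Any]) -> int:
    # Legacy pattern: the {} default applies only when "status" is absent.
    # With "status": null the first .get returns None and the chained
    # .get raises AttributeError.
    return obj.get("status", {}).get("readyReplicas", 0)


def ready_replicas_guarded(obj: dict[str, Any]) -> int:
    # Guarded pattern: `or {}` also replaces an explicit None.
    return (obj.get("status") or {}).get("readyReplicas", 0)


if __name__ == "__main__":
    null_status = {"status": None}
    print(ready_replicas_guarded(null_status))
    try:
        ready_replicas_legacy(null_status)
    except AttributeError as exc:
        print(f"legacy path failed: {exc}")
```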

Improvements vs legacy:
- (none in this commit; see feat(verification) commit)
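
The timeout-budget fix described in this commit message can be sketched as a loop over members that share one deadline. This is an assumption-laden illustration, not the runner's code: `Result`, `run_with_shared_budget`, and the member callables are hypothetical stand-ins for VerificationResult, the compound loops, and the verifiers. The point shown is that the remaining time is not clamped to a minimum, so a member that finds the budget exhausted is recorded as timed out without running.

```python
from __future__ import annotations

import time
from dataclasses import dataclass
from typing import Callable


@dataclass
class Result:
    """Hypothetical stand-in for a verification result."""

    name: str
    success: bool
    reason: str = ""


def run_with_shared_budget(
    members: dict[str, Callable[[float], bool]],
    timeout_sec: float,
    clock: Callable[[], float] = time.monotonic,
) -> list[Result]:
    """Run each member with whatever is left of one shared timeout."""
    deadline = clock() + timeout_sec
    results: list[Result] = []
    for name, verify in members.items():
        remaining = deadline - clock()  # may be <= 0; no max(1, ...) clamp
        if remaining <= 0:
            # Budget exhausted: record the unrun member as timed out.
            results.append(Result(name, False, "timed out before running"))
            continue
        results.append(Result(name, verify(remaining)))
    return results
```

With the legacy `max(1, ...)` clamp, N unrun members could overrun the budget by up to N seconds; without it, the overall call stays within `timeout_sec` plus the overrun of whichever member was running when the deadline passed.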
…timeouts

Modules moved/refactored:
- see base move commit (devops_bench/verification/{base,spec,runner}.py;
  verifiers/{pod_healthy,scaling_complete}.py)

Bugs fixed vs legacy:
- (none in this commit; see fix(verification) commit)

Improvements vs legacy:
- Recursive VerificationSpec: the RootModel union now references VerificationSpec
  for its list members and dict values (forward ref + model_rebuild()), so
  compound specs nest arbitrarily (list-of-lists, dict-of-lists, ...) instead of
  raising ValidationError. The dispatcher's recursion is now actually reachable.
- Optional kubeconfig on BaseVerifier (kubeconfig: str | None = None), forwarded
  to the devops_bench.k8s wrappers (wait/get_json) in both verifiers for
  multi-cluster targeting; None keeps the ambient kubeconfig.
- Widen timeout_sec int -> float through verify(), wait_for_condition, and
  _remaining for precision parity with the k8s wrappers.
- Tests: nested-spec dispatch (dict-of-list, list-of-lists) and kubeconfig
  forwarding for both verifiers.
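
To make the nesting behaviour above concrete, here is a plain-Python sketch of a recursive dispatcher over list and dict specs. It deliberately avoids pydantic and is not the VerificationSpec or runner code. Three things are assumptions for illustration: that a leaf spec is a dict carrying a "type" key, the `handlers` mapping, and the "every member must pass" aggregation rule.

```python
from __future__ import annotations

from typing import Any, Callable

Handler = Callable[[dict[str, Any]], bool]


def dispatch(spec: Any, handlers: dict[str, Handler]) -> bool:
    """Recursively verify a leaf spec, a list of specs, or a dict of specs."""
    if isinstance(spec, list):
        # Build the full list first so every member runs (no short-circuit).
        return all([dispatch(member, handlers) for member in spec])
    if isinstance(spec, dict) and "type" not in spec:
        # Compound dict: keys are member names, values are nested specs.
        return all([dispatch(member, handlers) for member in spec.values()])
    # Leaf spec: look up the verifier by its (assumed) "type" field.
    return handlers[spec["type"]](spec)


if __name__ == "__main__":
    handlers: dict[str, Handler] = {
        "pod_healthy": lambda spec: True,
        "scaling_complete": lambda spec: spec.get("min_replicas", 1) <= 3,
    }
    nested = {
        "pods": [{"type": "pod_healthy"}, [{"type": "pod_healthy"}]],
        "scale": {"type": "scaling_complete", "min_replicas": 2},
    }
    print(dispatch(nested, handlers))
```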
@pradeepvrd

Superseded by the reconciled cross-cutting refactor (see docs/refactor/e2e-refactor-sequencing-plan.md). Reworked into the layered devops_bench/ package on branch refactor/integration; replaced by the reworked component PRs and capstone #23. Closing as superseded.

@pradeepvrd closed this on Jun 20, 2026