feat(chaos): chaos injector with Trigger/Fault registries (Stage 2c) by pradeepvrd · Pull Request #7 · pradeepvrd/devops-bench

pradeepvrd · 2026-06-18T07:06:42Z

Splits the legacy chaos module into devops_bench/chaos/ (← pkg/agents/chaos/chaos.py).

base.py (Trigger/Fault ABCs + FAULTS/TRIGGERS registries), agent.py (ChaosAgent loop), faults/generate_load.py.
Model-agnostic: the chaos LLM loop runs through devops_bench.models (get_model/LLMClient tool-calling) — no provider SDK. chaos_active_event signaling preserved.
Tests under tests/unit/chaos/.
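
To make the base.py split concrete for reviewers, here is a minimal sketch of what an ABC-plus-registry layout could look like. Only the names Fault, Trigger, FAULTS and TRIGGERS come from this PR; the `register_fault` decorator, the `inject`/`should_fire` method names, and the `GenerateLoad` body are hypothetical stand-ins, not the code in devops_bench/chaos/base.py.

```python
from abc import ABC, abstractmethod
from typing import Callable, Dict, Type


class Fault(ABC):
    """A disruption the chaos injector can apply (method name is assumed)."""

    @abstractmethod
    def inject(self, spec: dict) -> str:
        """Apply the fault and return a short human-readable result."""


class Trigger(ABC):
    """Decides when a fault should fire (method name is assumed)."""

    @abstractmethod
    def should_fire(self, state: dict) -> bool:
        """Return True when the fault should be injected."""


# Name -> class registries, replacing ad-hoc dispatch on an action "type".
FAULTS: Dict[str, Type[Fault]] = {}
TRIGGERS: Dict[str, Type[Trigger]] = {}


def register_fault(name: str) -> Callable[[Type[Fault]], Type[Fault]]:
    """Hypothetical decorator that adds a Fault subclass to FAULTS."""

    def decorator(cls: Type[Fault]) -> Type[Fault]:
        if name in FAULTS:
            raise ValueError(f"fault {name!r} is already registered")
        FAULTS[name] = cls
        return cls

    return decorator


@register_fault("generate_load")
class GenerateLoad(Fault):
    """Stand-in fault: only describes what it would do."""

    def inject(self, spec: dict) -> str:
        url = spec.get("target", {}).get("service_url")
        return f"would generate load against {url}"


# Look the fault up by name instead of branching on a type string.
fault = FAULTS["generate_load"]()
result = fault.inject({"target": {"service_url": "http://localhost:8080"}})
print(result)
```

With this shape, adding a new fault means adding one registered class; the orchestration layer does not need to change.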

Stacked draft PR — part of the in-place Stage 2/3 restructure (see docs/migration/pr-plan.md). Base is the fork branch shown above; it will be retargeted to gke-labs/main once Stage 1 (gke-labs#89–92) merges. PRs are intended to be reviewed and merged in stage order.

Status: peer-reviewed by 2 teammates + senior sign-off on the full integration branch; full suite green (ruff + 374 unit tests). Do NOT mark ready until its stage is up for merge.

…BCs and registries (2c)

Modules moved/refactored:
- pkg/agents/chaos/chaos.py -> devops_bench/chaos/agent.py (ChaosAgent loop) + devops_bench/chaos/faults/generate_load.py (fault exec)
- new devops_bench/chaos/base.py (Fault/Trigger ABCs + FAULTS/TRIGGERS registries)
- new devops_bench/chaos/__init__.py + devops_bench/chaos/faults/__init__.py (light re-exports; no SDK imports)
- new tests/unit/chaos/test_chaos_agent.py + test_chaos_generate_load.py (legacy chaos_test.py ported to pytest)

Bugs fixed vs legacy:
- none (pure structural move; behavioral fixes land in the following fix(chaos) commit)

Improvements vs legacy:
- split the monolithic ChaosAgent into an orchestration layer (agent.py) and a registered fault (faults/generate_load.py), so faults are pluggable
- added Fault/Trigger ABCs and the FAULTS/TRIGGERS registries (base.py) per the component design, replacing ad-hoc dispatch on action "type"
- made the LLM loop model-agnostic: drive it through the neutral devops_bench.models LLMClient interface (get_model + format_tools/generate_content/extract_function_calls/get_text_content) instead of the hardcoded google.genai chat client, with provider/model from CHAOS_PROVIDER/CHAOS_MODEL falling back to AGENT_PROVIDER/AGENT_MODEL
- preserved the chaos_active_event signaling so the harness can detect an active load spike
- exposed command execution as a single run_command tool and bounded the loop with a turn cap
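
The provider/model fallback mentioned above can be sketched as follows. The four environment variable names are from the commit message; the `resolve_chaos_model` helper is hypothetical, and treating an empty value as unset and falling back per variable (rather than as a pair) are assumptions.

```python
import os
from typing import Mapping, Optional, Tuple


def resolve_chaos_model(
    env: Mapping[str, str] = os.environ,
) -> Tuple[Optional[str], Optional[str]]:
    """Pick the chaos provider/model, falling back to the agent settings.

    Hypothetical helper: an empty or missing CHAOS_* value falls back to the
    matching AGENT_* value; None is returned when neither is set.
    """
    provider = env.get("CHAOS_PROVIDER") or env.get("AGENT_PROVIDER")
    model = env.get("CHAOS_MODEL") or env.get("AGENT_MODEL")
    return provider or None, model or None


# Placeholder values, not real provider or model names.
example_env = {
    "AGENT_PROVIDER": "provider-a",
    "AGENT_MODEL": "model-a",
    "CHAOS_MODEL": "model-b",
}
print(resolve_chaos_model(example_env))
```

Passing the environment in as a mapping keeps the helper easy to unit test without patching os.environ.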

…s, and event ordering

Modules moved/refactored:
- see base move commit (devops_bench/chaos/agent.py, devops_bench/chaos/faults/generate_load.py)

Bugs fixed vs legacy:
- ChaosAgent._run_async dropped the model's final text when a tool call landed on the last turn (or the turn cap): final_text was only assigned when there were no function calls. Now set final_text on every turn so an accompanying summary is never lost.
- _execute_tool raised AttributeError when the model returned non-dict tool args (str/list/None): args.get(...) was called unconditionally. Now guard with isinstance(args, dict) and return "Error: tool args must be an object"; the caller passes raw args so the guard fires.
- run_chaos_command raised IndexError on an empty command string (shlex.split -> [] -> run([])). Now short-circuit with "Error: command string is empty" before parsing.
- run_chaos_command set chaos_active_event BEFORE parsing, so a command that failed shlex.split still told the harness "load active". Now signal the event only after a successful parse, immediately before execution.

Improvements vs legacy:
- none (behavioral bug fixes only; further improvements land in the following feat(chaos) commit)
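
The three guards in run_chaos_command and _execute_tool can be sketched as below. The function names, the event name and the two "Error: ..." strings for empty input and non-dict args are from the commit message; the `runner` parameter, the parse-error wording, the "command" argument key and the return value are assumptions added so the ordering can be exercised without running a real process.

```python
import shlex
import subprocess
import threading

# Set while a chaos command is (about to be) running; the harness watches it.
chaos_active_event = threading.Event()


def run_chaos_command(command: str, runner=subprocess.run) -> str:
    """Parse and run a single command, signaling only after a good parse."""
    # Guard 1: short-circuit on an empty command before any parsing.
    if not command or not command.strip():
        return "Error: command string is empty"
    try:
        argv = shlex.split(command)
    except ValueError as exc:  # e.g. an unbalanced quote
        # The event has not been set, so the harness never sees "load active".
        return f"Error: could not parse command: {exc}"
    # Guard 2: signal only after a successful parse, right before execution.
    chaos_active_event.set()
    completed = runner(argv, capture_output=True, text=True)
    return completed.stdout


def execute_tool(args) -> str:
    """Dispatch a run_command tool call; the caller passes the raw args."""
    # Guard 3: the model may return str/list/None instead of an object.
    if not isinstance(args, dict):
        return "Error: tool args must be an object"
    return run_chaos_command(args.get("command", ""))


print(execute_tool(None))
print(run_chaos_command(""))
```

Clearing the event after the command finishes is left out of this sketch.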

…ndency injection

Modules moved/refactored:
- see base move commit (devops_bench/chaos/agent.py, devops_bench/chaos/faults/generate_load.py)

Bugs fixed vs legacy:
- none (fixes landed in the preceding fix(chaos) commit)

Improvements vs legacy:
- expand a leading ~ in each command token (os.path.expanduser) so model-emitted paths like ~/go/bin/fortio resolve under the shell-free argv executor instead of failing execvp; document that only single, non-piped commands are supported (no pipes/redirection/$VAR) in the run_command prompt and docstring.
- drive the fortio target URL from the spec: read target.service_url (rewritten by the harness to the local port-forward) via target_url_from_spec() with a single _DEFAULT_TARGET_URL fallback, and inject it into both the goal and the system instruction (build_system_instruction(target_url)), removing the hardcoded http://localhost:8080 from SYSTEM_INSTRUCTION and goal().
- ChaosAgent.__init__ now accepts optional system_instruction and tools (defaulting to the module constants), used throughout the loop, so the agent is reusable for other faults.
- decouple the orchestrator from the concrete fault: drop the top-level import of run_chaos_command and inject a tool_handler callable into the ctor (lazily defaulting to run_chaos_command); _execute_tool dispatches via self._tool_handler.

pradeepvrd · 2026-06-20T08:07:28Z

Superseded by the reconciled cross-cutting refactor (see docs/refactor/e2e-refactor-sequencing-plan.md). Reworked into the layered devops_bench/ package on branch refactor/integration; replaced by the reworked component PRs and capstone #23. Closing as superseded.

pradeepvrd force-pushed the feat/devops-bench-chaos branch from 2ab9ff8 to 3819a8e on June 18, 2026 07:57

pradeepvrd added 3 commits June 18, 2026 01:13

pradeepvrd force-pushed the feat/devops-bench-chaos branch from 3819a8e to 4a10e71 on June 18, 2026 08:23

This was referenced Jun 20, 2026

feat(agents): ApiAgent on run_tool_loop; skills⊥MCP; no env-smuggling (Phase 3 PR2) #14

Merged

feat(chaos): typed ChaosResult/Fault/Trigger; loop reuse; authoring contract (Phase 3) #15

Merged

pradeepvrd closed this Jun 20, 2026

pradeepvrd wants to merge 3 commits into integration/devops-bench-stage1 from feat/devops-bench-chaos
