feat(agents): API/MCP loop agent (Stage 2e) by pradeepvrd · Pull Request #9 · pradeepvrd/devops-bench

pradeepvrd · 2026-06-18T07:06:47Z

Restructures the legacy API/MCP agent into devops_bench/agents/api/ (← pkg/agents/runner/api/{api,mcp_client}.py).

Adds loop.py (ApiAgent(AgentHarness), registered as "api") and mcp.py (MCPClient).
Model-agnostic via devops_bench.models; mcp SDK imported lazily; bounded turn cap (AGENT_MAX_TURNS, default 50). Legacy utils.filter_schema_for_gemini dropped (lives in the gemini adapter).
Stacked on the 2a agents PR (needs agents/base).
Tests under tests/unit/agents/api/.
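
As a rough illustration of the bounded turn cap described above, the sketch below reads AGENT_MAX_TURNS (default 50) and stops a tool loop once the cap is hit. The helper names `max_turns` and `run_loop`, the `step` callable, and the stop message are hypothetical and are not taken from loop.py.

```python
import os
from typing import Callable


def max_turns(default: int = 50) -> int:
    """Read the cap from AGENT_MAX_TURNS; fall back to the default on bad input."""
    raw = os.environ.get("AGENT_MAX_TURNS", "")
    try:
        value = int(raw)
    except ValueError:
        return default
    return value if value > 0 else default


def run_loop(step: Callable[[], tuple[bool, str]], cap: int) -> tuple[str, int]:
    """Call step() until it reports completion or the cap is reached.

    `step` is a hypothetical stand-in for one model turn; it returns
    (done, text). The function returns (text, turns_used).
    """
    for turn in range(1, cap + 1):
        done, text = step()
        if done:
            return text, turn
    # The model kept requesting tools; give up instead of looping forever.
    return "stopped: turn cap reached", cap
```

A model that never stops requesting tools then ends after `cap` turns rather than spinning in a `while True`.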

Stacked draft PR — part of the in-place Stage 2/3 restructure (see docs/migration/pr-plan.md). Base is the fork branch shown above; it will be retargeted to gke-labs/main once Stage 1 (gke-labs#89–92) merges. PRs are intended to be reviewed and merged in stage order.

Status: peer-reviewed by 2 teammates + senior sign-off on the full integration branch; full suite green (ruff + 374 unit tests). Do NOT mark ready until its stage is up for merge.

@observe

Modules moved/refactored:
- pkg/agents/runner/api/api.py -> devops_bench/agents/api/loop.py (ApiAgent, registered "api")
- pkg/agents/runner/api/mcp_client.py -> devops_bench/agents/api/mcp.py
- pkg/agents/runner/api/utils.py -> dropped (filter_schema_for_gemini)
- pkg/agents/runner/api/llm_client.py + llm_adapters.py -> dropped (replaced by devops_bench.models)

Bugs fixed vs legacy:
- none in this commit; see the follow-up fix(agents) commit.

Improvements vs legacy:
- ApiAgent subclasses AgentHarness and self-registers via @AGENTS.register("api"), conforming to the standardized result contract and adding the previously-missing "skills" key.
- Model-/provider-agnostic: the loop drives the neutral models-layer LLMClient obtained from get_model(provider, model_name) (AGENT_PROVIDER/AGENT_MODEL) instead of the legacy provider-specific LLMClient/adapters; no provider SDK is imported.
- Lazy heavy imports: the `mcp` SDK and deepeval @observe are imported lazily inside functions, never at package import; __init__.py stays light.
- Dropped legacy utils.filter_schema_for_gemini: tool formatting goes through LLMClient.format_tools(...), and Gemini schema filtering lives in models.google, so a provider-specific filter does not belong in the model-agnostic loop.
- Bounded turn cap (AGENT_MAX_TURNS, default 50) replaces the legacy unbounded `while True`, guarding against a model that never stops requesting tools.
- All legacy bare imports package-qualified; prints replaced with the core logger.
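
The self-registration and lazy-import pattern mentioned in this commit might look roughly like the sketch below. The `Registry` class, the `output` key, and the use of the stdlib `json` module as a stand-in for a heavy dependency are assumptions for illustration; the real AGENTS registry, AgentHarness base, and result contract in devops_bench may differ.

```python
import importlib
from typing import Callable


class Registry:
    """Hypothetical name -> class registry exposing a decorator."""

    def __init__(self) -> None:
        self._items: dict[str, type] = {}

    def register(self, name: str) -> Callable[[type], type]:
        def decorator(cls: type) -> type:
            if name in self._items:
                raise ValueError(f"duplicate registration: {name}")
            self._items[name] = cls
            return cls
        return decorator

    def get(self, name: str) -> type:
        return self._items[name]


AGENTS = Registry()


class AgentHarness:
    """Hypothetical base class; subclasses implement run()."""

    def run(self, prompt: str) -> dict:
        raise NotImplementedError


@AGENTS.register("api")
class ApiAgent(AgentHarness):
    def run(self, prompt: str) -> dict:
        # Lazy import: resolved on first call, not when the package is imported.
        # `json` stands in here for a heavy optional dependency such as an SDK.
        json = importlib.import_module("json")
        # "skills" is the key named in the commit; "output" is illustrative only.
        return {"output": json.dumps({"prompt": prompt}), "skills": []}
```

Looking up `AGENTS.get("api")` then returns the class without the caller importing the module's heavy dependencies up front.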

…parsing (api loop)

Modules moved/refactored:
- pkg/agents/runner/api/api.py -> devops_bench/agents/api/loop.py (ApiAgent, registered "api") (see base move commit)
- pkg/agents/runner/api/mcp_client.py -> devops_bench/agents/api/mcp.py (see base move commit)

Bugs fixed vs legacy:
- MCPClient.__aenter__ passed the whole server_path to StdioServerParameters(command=...), so a multi-word command (e.g. "uv run mcp-server") was treated as a single executable name and failed with FileNotFoundError (the stdio transport spawns without a shell). Now shlex.split the command into command + args, with a clear ValueError for an empty/whitespace-only server_path.
- process_query called mcp_client.call_tool(...) even when mcp_client is None (bench_use_mcp=False) but the model still requested a tool, raising AttributeError. Added an explicit `elif mcp_client is None` guard that returns a clear error string instead.
- Tool-result extraction only read content[0], dropping later content blocks. Added _extract_tool_text to aggregate the text of all content blocks (newline-joined), falling back to str(result) when none carry text.

Improvements vs legacy:
- none in this commit; see the follow-up feat(agents) commit.

Tests: None-client tool request, multi-block aggregation, _extract_tool_text fallback, multi-word server_path split, and empty server_path ValueError.
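
Two of the fixes above can be sketched in isolation. This is a minimal approximation, not the code in mcp.py or loop.py: `split_server_command` and `extract_tool_text` are hypothetical names, and the result object is assumed to expose a `content` list whose blocks may carry a string `text` attribute.

```python
import shlex
from types import SimpleNamespace


def split_server_command(server_path: str) -> tuple[str, list[str]]:
    """Split a command line into (command, args) without invoking a shell."""
    parts = shlex.split(server_path)
    if not parts:
        raise ValueError("server_path must not be empty or whitespace-only")
    return parts[0], parts[1:]


def extract_tool_text(result: object) -> str:
    """Join the text of every content block; fall back to str(result)."""
    blocks = getattr(result, "content", None) or []
    texts = [b.text for b in blocks if isinstance(getattr(b, "text", None), str)]
    return "\n".join(texts) if texts else str(result)


# Usage: a multi-word command and a two-block tool result.
command, args = split_server_command("uv run mcp-server")
demo_result = SimpleNamespace(
    content=[SimpleNamespace(text="first"), SimpleNamespace(text="second")]
)
joined = extract_tool_text(demo_result)
```

Here `command` is `"uv"` and `args` is `["run", "mcp-server"]`, which is the shape a stdio launcher that spawns without a shell needs.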

… test (api loop)

Modules moved/refactored:
- pkg/agents/runner/api/api.py -> devops_bench/agents/api/loop.py (ApiAgent, registered "api") (see base move commit)
- pkg/agents/runner/api/mcp_client.py -> devops_bench/agents/api/mcp.py (see base move commit)

Bugs fixed vs legacy:
- none in this commit; see the preceding fix(agents) commit.

Improvements vs legacy:
- The async loop no longer performs blocking filesystem I/O on the event loop. The skill-file read in process_query is extracted into _read_skill_file and run via asyncio.to_thread; the skill discovery in run_api_agent (os.path.exists + recursive glob.glob + parse_skill_md) is extracted into _discover_skill_tools and likewise run via asyncio.to_thread. Behavior is unchanged; only the threading boundary moves.

Tests: a happy-path MCPClient lifecycle test that mocks StdioServerParameters, stdio_client, and ClientSession and asserts connect (session set + initialize), list_tools/call_tool, and a clean __aexit__ (session and transport both closed).
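
The threading boundary described in this commit can be approximated as follows. The function names and the SKILL.md file name are illustrative assumptions rather than the actual helpers in loop.py; only `asyncio.to_thread` (Python 3.9+) is taken from the commit message.

```python
import asyncio
import tempfile
from pathlib import Path


def read_skill_file(path: str) -> str:
    """Blocking read, meant to run in a worker thread."""
    return Path(path).read_text(encoding="utf-8")


async def load_skill(path: str) -> str:
    # Offload the blocking read so the event loop stays responsive.
    return await asyncio.to_thread(read_skill_file, path)


def demo() -> str:
    """Write a throwaway skill file and read it back through the async path."""
    with tempfile.TemporaryDirectory() as tmp:
        skill = Path(tmp) / "SKILL.md"
        skill.write_text("# demo skill\n", encoding="utf-8")
        return asyncio.run(load_skill(str(skill)))
```

The synchronous helper stays trivially unit-testable on its own, while the coroutine only decides where it runs.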

pradeepvrd · 2026-06-20T08:07:32Z

Superseded by the reconciled cross-cutting refactor (see docs/refactor/e2e-refactor-sequencing-plan.md). Reworked into the layered devops_bench/ package on branch refactor/integration; replaced by the reworked component PRs and capstone #23. Closing as superseded.

pradeepvrd force-pushed the feat/devops-bench-agents branch from e5d7e5a to 3901757 Compare June 18, 2026 07:57

pradeepvrd force-pushed the feat/devops-bench-agents-api branch from d9dbe47 to ba52e26 Compare June 18, 2026 07:57

pradeepvrd added 3 commits June 18, 2026 01:22

pradeepvrd force-pushed the feat/devops-bench-agents branch from 3901757 to 99ba7ff Compare June 18, 2026 08:23

pradeepvrd force-pushed the feat/devops-bench-agents-api branch from ba52e26 to caddb25 Compare June 18, 2026 08:23

pradeepvrd mentioned this pull request Jun 20, 2026

feat(agents): ApiAgent on run_tool_loop; skills⊥MCP; no env-smuggling (Phase 3 PR2) #14

Merged

pradeepvrd closed this Jun 20, 2026

pradeepvrd wants to merge 3 commits into feat/devops-bench-agents from feat/devops-bench-agents-api
