feat(agents): base harness + CLI agents (gemini, openclaw) #29

Draft
pradeepvrd wants to merge 1 commit into submit/3-agents-capabilities from submit/4-agents-base-cli

Conversation

pradeepvrd (Owner) commented Jun 20, 2026

The agent-execution layer used to live in pkg/agents/runner/ (gcli.py dispatch + openclaw.py), driven by evaluate.py. This PR adds devops_bench/agents/: a typed AgentHarness base (base.py, config.py, result.py) plus the two CLI agents, cli/gemini.py and cli/openclaw.py, built on the capability bindings.

Behavior changes

  • Agent selection is a registry (AGENTS.get) keyed by a canonical name (with cli/binary aliases), replacing the substring match on AGENT_TARGET.
  • Both agents return a typed AgentResult (output, canonical ToolCall trajectory, tokens, errors) instead of an ad-hoc dict.
  • Gemini trajectory is parsed from the official --output-format stream-json event stream; it used to glob and scrape ~/.gemini/tmp session files.
  • OpenClaw runs oc locally (the SSH transport is gone) and reads its trajectory from the oc sessions export-trajectory bundle (events.jsonl); it used to scrape the sessionFile= path out of debug stdout.
  • An unmatched tool_result is dropped from the trajectory and surfaced on AgentResult.errors rather than silently discarded, uniformly across both agents.
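
A minimal sketch of the unmatched-result policy from the last bullet, assuming each agent's parser first normalizes its native stream into simple dicts. The event keys (`type`, `id`, `name`, `args`, `output`), the `pair_tool_events` helper, and every field name other than `errors` are hypothetical; they are not taken from this PR's code.

```python
from dataclasses import dataclass, field
from typing import Any, Optional


@dataclass
class ToolCall:
    """Illustrative stand-in for the canonical ToolCall; fields are assumed."""
    call_id: str
    name: str
    args: dict[str, Any]
    result: Optional[str] = None


@dataclass
class AgentResult:
    """Illustrative stand-in for the typed result; only `errors` is named in the PR."""
    output: str = ""
    trajectory: list[ToolCall] = field(default_factory=list)
    tokens: int = 0
    errors: list[str] = field(default_factory=list)


def pair_tool_events(events: list[dict[str, Any]]) -> AgentResult:
    """Attach each tool_result to its tool_call; report results with no call."""
    result = AgentResult()
    pending: dict[str, ToolCall] = {}  # calls still waiting for a result
    for ev in events:
        kind = ev.get("type")
        if kind == "tool_call":
            call = ToolCall(ev["id"], ev["name"], ev.get("args", {}))
            pending[call.call_id] = call
            result.trajectory.append(call)
        elif kind == "tool_result":
            call = pending.pop(ev.get("id"), None)
            if call is None:
                # Unmatched result: keep it out of the trajectory, but record it.
                result.errors.append(f"unmatched tool_result id={ev.get('id')!r}")
                continue
            call.result = ev.get("output")
    return result
```

Because the pairing runs on normalized events rather than on either CLI's raw output, the same function could serve both agents, which is one way to get the uniform behavior the bullet describes.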

Comment thread devops_bench/agents/cli/gemini.py
Comment thread devops_bench/agents/cli/openclaw.py
pradeepvrd force-pushed the submit/3-agents-capabilities branch from 822d901 to 51db598, June 21, 2026 01:30
pradeepvrd force-pushed the submit/4-agents-base-cli branch from d789dcd to c4a1093, June 21, 2026 01:30
pradeepvrd added a commit that referenced this pull request Jun 23, 2026
…nels

Deliver MCP + skills to the OpenClaw agent the same way the Gemini agent does,
through oc's own per-run channels so nothing touches ~/.openclaw and concurrent
runs never race:

- State: OPENCLAW_STATE_DIR -> <run>/state isolates sessions + the skills root;
  replaces the global ~/.openclaw/.../sessions wipe.
- MCP: each binding with a launch command becomes an mcp.servers entry in
  <run>/openclaw.json, selected via OPENCLAW_CONFIG_PATH (command[0]->command,
  rest->args; command-less bindings skipped).
- Skills: SKILL.md files discovered under the bound paths (reusing parse_skill_md)
  are materialized to <OPENCLAW_STATE_DIR>/skills/<name>/SKILL.md.
- Model auth: config.api_key threaded into the provider env var.
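
The State and MCP items above could be sketched roughly as follows. The `OPENCLAW_STATE_DIR` and `OPENCLAW_CONFIG_PATH` variables and the `command[0]` -> `command`, rest -> `args` split come from this commit message; the `McpBinding` class, both helper functions, and the choice to key `mcp.servers` by binding name are assumptions made for illustration only.

```python
import json
from dataclasses import dataclass, field
from pathlib import Path


@dataclass
class McpBinding:
    """Hypothetical stand-in for a capability binding, not the real type."""
    name: str
    command: list[str] = field(default_factory=list)


def build_mcp_servers(bindings: list[McpBinding]) -> dict[str, dict]:
    """Map each binding that has a launch command to a server entry."""
    servers: dict[str, dict] = {}
    for binding in bindings:
        if not binding.command:
            continue  # command-less bindings are skipped
        servers[binding.name] = {
            "command": binding.command[0],
            "args": list(binding.command[1:]),
        }
    return servers


def write_run_config(run_dir: Path, bindings: list[McpBinding]) -> dict[str, str]:
    """Write <run>/openclaw.json and return the env vars that select it."""
    state_dir = run_dir / "state"
    state_dir.mkdir(parents=True, exist_ok=True)
    config_path = run_dir / "openclaw.json"
    # Assumed file layout: {"mcp": {"servers": {<name>: {...}}}}
    config = {"mcp": {"servers": build_mcp_servers(bindings)}}
    config_path.write_text(json.dumps(config, indent=2))
    return {
        "OPENCLAW_STATE_DIR": str(state_dir),
        "OPENCLAW_CONFIG_PATH": str(config_path),
    }
```

A caller would merge the returned mapping into the subprocess environment for that run, so every path stays under the run directory and concurrent runs do not share state.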

OpenClawAgent now assigns self.mcp_servers + self.skills, so it structurally
satisfies SupportsMcp/SupportsSkills alongside SupportsRules. Default agent id is
"main" (oc's built-in default, present in every config incl. the isolated one).

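To illustrate what "structurally satisfies" means in the paragraph above, here is a small sketch using `typing.Protocol`. Only the protocol names and the two attribute names come from this commit message; the protocol bodies, the `runtime_checkable` decorator, and the `OpenClawAgentSketch`, `BareAgent`, and `granted_capabilities` names are assumptions and may not match the real definitions.

```python
from typing import Protocol, runtime_checkable


@runtime_checkable
class SupportsMcp(Protocol):
    """Assumed shape: any object exposing an `mcp_servers` attribute."""
    mcp_servers: list


@runtime_checkable
class SupportsSkills(Protocol):
    """Assumed shape: any object exposing a `skills` attribute."""
    skills: list


class OpenClawAgentSketch:
    """Hypothetical agent: no inheritance from the protocols is needed."""

    def __init__(self) -> None:
        self.mcp_servers = []  # assigning the attribute is what satisfies SupportsMcp
        self.skills = []       # likewise for SupportsSkills


class BareAgent:
    """Hypothetical agent that assigns neither attribute."""


def granted_capabilities(agent: object) -> list[str]:
    """Report which capability protocols an agent satisfies at runtime."""
    found = []
    if isinstance(agent, SupportsMcp):
        found.append("mcp")
    if isinstance(agent, SupportsSkills):
        found.append("skills")
    return found
```

With runtime-checkable protocols, `isinstance` only checks that the attributes exist, so a harness can decide which capabilities to deliver without the agent classes sharing a base class.
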
Builds on the events.jsonl trajectory parser from PR #29 (this branch is rebased
on the fixed base): the per-run export-trajectory bundle is parsed via the same
dotted tool.call/tool.result/model.completed schema. Verified e2e on a real GKE
secret-rotation run (rules+MCP+skills granted; ToolInvocation 1.0). Unit suite green.

Completes the "Openclaw needs equivalent wiring" follow-up from the Gemini change.
pradeepvrd force-pushed the submit/4-agents-base-cli branch from 43c5d8f to 1962e0d, June 23, 2026 06:08
pradeepvrd changed the title from "feat(agents): base harness + CLI agents (gemini, openclaw) [+#20 orphan-result policy]" to "feat(agents): base harness + CLI agents (gemini, openclaw)", Jun 23, 2026
pradeepvrd force-pushed the submit/3-agents-capabilities branch from 51db598 to 887c755, June 23, 2026 06:37
pradeepvrd force-pushed the submit/4-agents-base-cli branch from 1962e0d to 09c113b, June 23, 2026 06:37
pradeepvrd force-pushed the submit/3-agents-capabilities branch from 887c755 to 6e9a5ac, June 23, 2026 07:20
pradeepvrd force-pushed the submit/4-agents-base-cli branch from 09c113b to c785a87, June 23, 2026 07:21
pradeepvrd force-pushed the submit/3-agents-capabilities branch from 6e9a5ac to 1f3c053, June 23, 2026 18:05
pradeepvrd force-pushed the submit/4-agents-base-cli branch from c785a87 to 4585a19, June 23, 2026 18:09