feat: add experimental Hermes and Pi agents#1032

Open: XcluEzy7 wants to merge 3 commits into RunMaestro:main from XcluEzy7:auto-run-main-0521

Conversation

@XcluEzy7

@XcluEzy7 XcluEzy7 commented May 21, 2026

Summary

  • Add experimental Hermes and Pi agent IDs, display metadata, context-window defaults, and beta badges.
  • Register first-pass CLI definitions and conservative capabilities for Hermes and Pi.
  • Add promptArgs handling so agents with prompt flags receive one-shot prompts through buildAgentArgs.
  • Add research notes documenting the conservative Hermes/Pi parity baseline.

Validation

  • npm run build:prompts && npm run lint
  • npm run lint:eslint
  • npm run format:check
  • git diff --check

Notes

  • Focused Vitest run was attempted, but the current dependency tree fails before tests start with: ERR_PACKAGE_PATH_NOT_EXPORTED for vite ./module-runner imported by vitest 4.1.7 against vite 5.4.21.

Summary by CodeRabbit

  • New Features

    • Added Hermes and Pi as beta agents with resume, model selection, image input, large context windows, and probeable Phase‑01 parity UI (lazy-loaded).
  • Documentation

    • Added capability maps, shared-parity baseline, Phase‑01 findings, and an integration design note for Hermes/Pi.
  • Tests

    • Expanded coverage for agent IDs, capabilities, argument building, detection, and cross-registry completeness.
  • Chores

    • Added a browser stub to support local/dev agent detection.


@coderabbitai

coderabbitai Bot commented May 21, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 609c57e1-105d-44ff-85cd-c97d400bb055

📥 Commits

Reviewing files that changed from the base of the PR and between e0966da and e43838c.

📒 Files selected for processing (1)
  • docs/architecture/agent-parity/hermes-integration-design.md

📝 Walkthrough


Adds two experimental agents (Hermes and Pi): registers IDs, metadata, capabilities, CLI definitions and prompt arg handling; updates detection/args/registry tests; adds research docs; and provides a dev harness, browser stub, and renderer wiring to exercise parity.

Changes

Hermes and Pi agent integration

| Layer | File(s) | Summary |
|---|---|---|
| Agent ID registration | src/shared/agentIds.ts, src/__tests__/shared/agentIds.test.ts | Canonical agent ID strings 'hermes' and 'pi' are added to AGENT_IDS, expanding the derived AgentId union and isValidAgentId type guard. |
| Agent metadata and beta flags | src/shared/agentMetadata.ts, src/__tests__/shared/agentMetadata.test.ts | Display names are mapped for both agents and both are included in BETA_AGENTS; tests assert display-name presence and isBetaAgent behavior. |
| Default context window configuration | src/shared/agentConstants.ts, src/__tests__/shared/agentConstants.test.ts | Context window default of 200000 is added for both Hermes and Pi in DEFAULT_CONTEXT_WINDOWS. |
| Agent capabilities definition | src/main/agents/capabilities.ts | Hermes and Pi capability entries enable session/resume, streaming/thinking, result messages, and model selection while keeping session storage/export and structured output disabled. |
| Agent CLI definitions and UI config | src/main/agents/definitions.ts | Registers Hermes (chat batch prefix, -Q, --yolo, -q prompt, --resume, -m, --image) and Pi (--mode json, -p prompt, --session, -m, -i) with argument builders and UI model/contextWindow config. |
| Prompt argument builder enhancement | src/main/utils/agent-args.ts, src/__tests__/main/utils/agent-args.test.ts | buildAgentArgs now appends agent.promptArgs when options.prompt is provided; tests cover inclusion, omission, correct ordering after resume args, and full constructed launch args for Hermes/Pi. |
| Agent detection tests | src/__tests__/main/agents/detector.test.ts | Detector tests derive expected counts from getAgentIds(), include Hermes in mixed-availability mocks, assert Pi unavailability when its binary is missing, and update concurrent edge-case expectations. |
| Registry completeness test | src/__tests__/shared/agent-completeness.test.ts | New Vitest suite enforces alignment across AGENT_DEFINITIONS, AGENT_IDS, display names, capabilities, DEFAULT_CONTEXT_WINDOWS, and BETA_AGENTS, including Hermes/Pi. |
| Capability research and Phase 01 findings | docs/research/agent-parity/* | Adds Hermes and Pi capability maps, a shared parity baseline, and Phase 01 findings documenting scope, shipped items, gated checklist steps, and deferred parity work. |
| Dev harness, browser stub, renderer wiring | src/renderer/Phase01AgentParityHarness.tsx, src/renderer/installBrowserMaestroStub.ts, src/renderer/main.tsx | Adds a dev-only parity harness that probes agent capabilities, installs a browser window.maestro stub for local testing, and lazy-loads the renderer to show the harness when phase01=agent-parity is present. |
| Hermes integration design | docs/architecture/agent-parity/hermes-integration-design.md | Architecture doc defining Phase 02 rules for Hermes: session storage, structured-output parser strategy, model discovery, image behavior, fallback matrix, and implementation order. |
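
The registration pattern summarized in the table (a canonical ID list, a union derived from it, a type guard, and a completeness check across per-agent registries) can be sketched roughly as follows. This is a minimal illustration, not the repository's code: the ID list is trimmed to the two new agents, and `missingIds` is a hypothetical helper standing in for what the completeness suite is described as enforcing. Only the 200000 context-window default is taken from the PR.

```typescript
// Trimmed, hypothetical registry: the real AGENT_IDS list contains more entries.
const AGENT_IDS = ['hermes', 'pi'] as const;

// The union is derived from the tuple, so adding an ID widens the type automatically.
type AgentId = (typeof AGENT_IDS)[number];

// Type guard: narrows an arbitrary string to AgentId when it is in the list.
function isValidAgentId(value: string): value is AgentId {
	return (AGENT_IDS as readonly string[]).includes(value);
}

// Assumed shape for one per-agent registry; 200000 is the default named in the PR.
const DEFAULT_CONTEXT_WINDOWS: Partial<Record<AgentId, number>> = {
	hermes: 200000,
	pi: 200000,
};

// Hypothetical completeness helper: lists every canonical ID missing from a registry.
function missingIds(registry: Partial<Record<AgentId, unknown>>): AgentId[] {
	return AGENT_IDS.filter((id) => !(id in registry));
}

const missingFromDefaults = missingIds(DEFAULT_CONTEXT_WINDOWS);
const missingFromPartial = missingIds({ hermes: 200000 });
```

A completeness test in this style would assert that `missingIds` returns an empty array for each registry, so a newly added ID without matching metadata fails fast.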

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Possibly related PRs

  • RunMaestro/Maestro#546: Related to shared agent ID/context-window infrastructure extended here by adding hermes/pi.
  • RunMaestro/Maestro#521: Related to buildAgentArgs behavior that this PR modifies (promptArgs insertion).

Suggested reviewers

  • reachrazamair

Poem

🐰 Two agents hop, Hermes and Pi in tow,
Mapping flags where the CLI winds blow,
Tests and docs stitched, a harness to try,
A stub in the browser, a probe and a sigh—
The parity garden grows, soft and slow.

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title 'feat: add experimental Hermes and Pi agents' clearly and specifically summarizes the main change: introducing two new experimental agents to the codebase. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |


@greptile-apps

greptile-apps Bot commented May 21, 2026

Greptile Summary

This PR registers Hermes and Pi as experimental agents in Maestro's shared catalog (IDs, display metadata, context-window defaults, capabilities, and CLI definitions), and adds promptArgs dispatch inside buildAgentArgs so agents that use a flag-based one-shot prompt (e.g. -q for Hermes, -p for Pi) carry the prompt through the argument-building layer.

  • Catalog additions (agentIds, agentMetadata, agentConstants, capabilities) follow the established pattern and are conservative/beta-gated.
  • The new promptArgs block in buildAgentArgs conflicts with the pre-existing promptArgs call in ChildProcessSpawner.spawn(), causing the prompt to be appended twice for every agent with promptArgs defined — this includes the already-shipping gemini-cli definition.
  • Pi's batchModeArgs: ['--mode', 'json'] forces Pi into JSON event output mode while supportsJsonOutput: false leaves Maestro without a parser, so any batch-mode Pi session would display raw JSON to the user.

Confidence Score: 3/5

Not safe to merge as-is: the promptArgs change in buildAgentArgs will double-apply the prompt for Gemini CLI on every local non-SSH batch run, and Pi's batch mode will emit unreadable JSON output.

The catalog-only files (agentIds, agentMetadata, agentConstants) are straightforward and low-risk. The regression lives in agent-args.ts: buildAgentArgs now calls promptArgs(prompt), but ChildProcessSpawner already calls the same function when it builds its own finalArgs from the pre-built args. Gemini CLI ships promptArgs today, so this bug would surface for every Gemini CLI batch launch on the current release, not just for the new agents. The Pi batchModeArgs/supportsJsonOutput mismatch compounds the risk on the Pi side.

src/main/utils/agent-args.ts and src/main/agents/definitions.ts (Pi entry) need the most attention before merging.

Important Files Changed

| Filename | Overview |
|---|---|
| src/main/utils/agent-args.ts | Adds promptArgs call inside buildAgentArgs, but ChildProcessSpawner already calls promptArgs at spawn time, causing any agent with promptArgs (including existing Gemini CLI) to receive the prompt twice on the CLI. |
| src/main/agents/definitions.ts | Adds Hermes and Pi definitions with correct arg builders; Pi's batchModeArgs: ['--mode', 'json'] conflicts with its supportsJsonOutput: false capability, producing unparseable JSON output in batch mode. |
| src/main/agents/capabilities.ts | Conservative capability registration for Hermes and Pi; both set supportsStreaming: true which may be optimistic given that structured streaming events were not confirmed for Hermes per the research notes. |
| src/__tests__/main/utils/agent-args.test.ts | New tests cover promptArgs in isolation and in combination with resume args; the combined scenario test does not verify behavior when promptArgs is defined alongside other agents' existing flows, leaving the double-prompt regression untested. |
| src/shared/agentIds.ts | Adds hermes and pi to the canonical ID list; follows the existing pattern correctly. |
| src/shared/agentMetadata.ts | Adds display names and beta badge entries for Hermes and Pi; follows existing patterns. |
| src/shared/agentConstants.ts | Adds 200k conservative context-window defaults for Hermes and Pi with explanatory comments. |

Sequence Diagram

sequenceDiagram
    participant P as process.ts
    participant BA as buildAgentArgs
    participant CS as ChildProcessSpawner

    P->>BA: "buildAgentArgs(agent, { prompt, ... })"
    Note over BA: batchModePrefix/Args added (gated on prompt)
    BA->>BA: promptArgs(prompt) appended NEW
    BA-->>P: finalArgs with prompt already included

    P->>CS: "spawn({ args: finalArgs, promptArgs: agent.promptArgs, prompt })"
    Note over CS: else if prompt and not promptViaStdin
    CS->>CS: promptArgs(prompt) appended AGAIN
    Note over CS: finalArgs now has prompt twice
    CS->>CS: exec agent with doubled prompt

Reviews (1): Last reviewed commit: "feat: add experimental Hermes and Pi age..." | Re-trigger Greptile

Comment on lines +104 to +106
if (options.prompt && agent.promptArgs) {
	finalArgs = [...finalArgs, ...agent.promptArgs(options.prompt)];
}

P1 Double-prompt regression for every agent with promptArgs defined

buildAgentArgs now appends promptArgs(prompt) to finalArgs, but ChildProcessSpawner.spawn() already does the same at its own line 173–174: finalArgs = [...args, ...promptArgs(prompt)]. Because args in the spawner is exactly what buildAgentArgs returns, any agent with promptArgs defined — including the existing gemini-cli and the new hermes/pi — will have the prompt appended twice. For example, Gemini CLI would be invoked as gemini -y --output-format stream-json -p "prompt" -p "prompt". The deduplication at the end of buildAgentArgs only runs on flags produced within that function and cannot prevent the second insertion in the spawner.

The fix depends on intent: either remove the promptArgs call from ChildProcessSpawner and let buildAgentArgs own prompt injection, or remove it from buildAgentArgs and keep the spawner as the sole owner. Because context-groomer.ts and group-chat-router.ts also invoke promptArgs directly on the spawner config, removing it from the spawner would require coordinated changes across multiple callers.
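
To make the double append concrete, here is a minimal sketch of the two stages described above. The names `buildArgs`, `spawnArgs`, and the `appendPrompt` switch are hypothetical stand-ins for `buildAgentArgs` and `ChildProcessSpawner.spawn()`, not the repository's actual signatures; only the `-p` prompt flag for Gemini CLI comes from the review.

```typescript
type PromptArgs = (prompt: string) => string[];

interface AgentLike {
	batchModeArgs?: string[];
	promptArgs?: PromptArgs;
}

// Stage 1 (stand-in for the arg builder). With appendPrompt = true it mirrors
// the PR's new behavior of appending promptArgs inside the builder.
function buildArgs(agent: AgentLike, prompt: string | undefined, appendPrompt: boolean): string[] {
	const batch = prompt ? agent.batchModeArgs ?? [] : [];
	let args = [...batch];
	if (appendPrompt && prompt && agent.promptArgs) {
		args = [...args, ...agent.promptArgs(prompt)];
	}
	return args;
}

// Stage 2 (stand-in for the spawner): appends promptArgs to whatever it receives.
function spawnArgs(args: string[], prompt: string | undefined, promptArgs?: PromptArgs): string[] {
	return prompt && promptArgs ? [...args, ...promptArgs(prompt)] : args;
}

// Trimmed sample definition using the '-p' prompt flag mentioned in the review.
const gemini: AgentLike = { promptArgs: (p) => ['-p', p] };

// Both stages append: the prompt flag pair appears twice.
const doubled = spawnArgs(buildArgs(gemini, 'hello', true), 'hello', gemini.promptArgs);

// Only the spawner appends: the prompt flag pair appears once.
const single = spawnArgs(buildArgs(gemini, 'hello', false), 'hello', gemini.promptArgs);
```

Whichever layer ends up owning prompt injection, the point of the sketch is that exactly one of the two stages should call `promptArgs`.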

Comment on lines +284 to +313
{
	id: 'pi',
	name: 'Pi',
	binaryName: 'pi',
	command: 'pi',
	args: [],
	batchModeArgs: ['--mode', 'json'],
	promptArgs: (prompt: string) => ['-p', prompt],
	resumeArgs: (sessionId: string) => ['--session', sessionId],
	modelArgs: (modelId: string) => ['-m', modelId],
	imageArgs: (imagePath: string) => ['-i', imagePath],
	configOptions: [
		{
			key: 'model',
			type: 'text',
			label: 'Model',
			description:
				'Documented Pi model override (for example, claude-sonnet-4.5). Leave empty for the CLI default.',
			default: '',
			argBuilder: (value: string) => (value.trim() ? ['-m', value.trim()] : []),
		},
		{
			key: 'contextWindow',
			type: 'number',
			label: 'Context Window Size',
			description:
				'Fallback context window size in tokens until Pi reports a runtime-specific value.',
			default: 200000,
		},
	],

P1 batchModeArgs: ['--mode', 'json'] produces unreadable output when supportsJsonOutput is false

Pi's definition adds --mode json via batchModeArgs whenever a prompt is present, which causes the Pi CLI to emit structured JSON events on stdout. The corresponding capability entry sets both supportsJsonOutput: false and usesJsonLineOutput: false, meaning Maestro has no JSON parser wired up for Pi's output. Every batch-mode Pi session would therefore stream raw JSON event objects to the user instead of readable text.

If --mode json is required to make Pi exit non-interactively, that should be resolved before exposing Pi in any experimental UI. If -p/--print alone is sufficient to guarantee non-interactive exit, removing batchModeArgs (or setting it to []) until the JSON parser is ready would be the safer choice.
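
One way to keep this kind of mismatch from recurring is a small consistency check between a definition's batch args and its declared capabilities. The sketch below is an assumption about how such a check could look, not something in the PR: `AgentDefinitionLike`, `AgentCapabilitiesLike`, `requestsJsonMode`, and `findOutputModeMismatch` are hypothetical names, and the detection only recognizes the `--mode json` form that Pi uses.

```typescript
// Hypothetical, trimmed shapes; the real definition and capability types have more fields.
interface AgentDefinitionLike {
	id: string;
	batchModeArgs?: string[];
}

interface AgentCapabilitiesLike {
	supportsJsonOutput: boolean;
	usesJsonLineOutput: boolean;
}

// True when the args contain the adjacent pair '--mode', 'json'.
function requestsJsonMode(args: string[] = []): boolean {
	return args.some((arg, i) => arg === '--mode' && args[i + 1] === 'json');
}

// Returns a description of the mismatch, or null when the pair is consistent.
function findOutputModeMismatch(
	def: AgentDefinitionLike,
	caps: AgentCapabilitiesLike
): string | null {
	const canParseJson = caps.supportsJsonOutput || caps.usesJsonLineOutput;
	if (requestsJsonMode(def.batchModeArgs) && !canParseJson) {
		return `${def.id}: batchModeArgs request JSON output but no JSON capability is declared`;
	}
	return null;
}

// Values taken from the review: Pi requests JSON mode but declares no JSON support.
const piMismatch = findOutputModeMismatch(
	{ id: 'pi', batchModeArgs: ['--mode', 'json'] },
	{ supportsJsonOutput: false, usesJsonLineOutput: false }
);

// Dropping batchModeArgs, as suggested above, clears the mismatch.
const piWithoutBatchArgs = findOutputModeMismatch(
	{ id: 'pi', batchModeArgs: [] },
	{ supportsJsonOutput: false, usesJsonLineOutput: false }
);
```

A check like this could sit alongside the registry completeness suite so that a definition and its capability entry cannot drift apart silently.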

@pedramamini
Collaborator

Thanks for the contribution @XcluEzy7! Conservative catalog/capability gating looks great, and the research notes are a nice touch. Before this lands, two issues from Greptile that I verified by walking the spawn path — both need addressing:

1. promptArgs will be appended twice for any agent that defines it

src/main/utils/agent-args.ts now appends agent.promptArgs(options.prompt) inside buildAgentArgs. But ChildProcessSpawner.spawn() already calls promptArgs(prompt) itself (see src/main/process-manager/spawners/ChildProcessSpawner.ts lines 129–130, 154–155, and 173–174). The IPC handler in src/main/ipc/handlers/process.ts calls buildAgentArgs at line 157 and then forwards both the prompt-containing argsToSpawn AND promptArgs: agent?.promptArgs plus prompt: config.prompt to spawn() at line 519. Net effect: the spawner re-appends the same prompt args on top of args that already include them.

This regression is not limited to Hermes/Pi — existing gemini-cli (src/main/agents/definitions.ts:212) defines promptArgs: (prompt) => ['-p', prompt], so every Gemini CLI batch launch on the current release would ship the prompt twice (gemini -p <prompt> -p <prompt>). The same is true on the group-chat path via src/main/group-chat/group-chat-router.ts.

The clean fix is to drop the new promptArgs block from buildAgentArgs and rely on ChildProcessSpawner.spawn() to handle prompt placement (the SSH path in src/main/utils/ssh-spawn-wrapper.ts:130-142 already does this consistently). The third new test in src/__tests__/main/utils/agent-args.test.ts (appends promptArgs after resume args for single-shot resume flows) would need to be removed/rewritten against the actual spawn path.

2. Pi forces --mode json but has supportsJsonOutput: false

In definitions.ts, Pi sets batchModeArgs: ['--mode', 'json'] — so every batch-mode Pi launch will go into JSON event mode. But capabilities.ts declares supportsJsonOutput: false (and usesJsonLineOutput: false), meaning Maestro has no parser wired up for that stream. Users would see raw JSON in the terminal. Until the Pi parser lands, please use a non-JSON batch mode (e.g. drop batchModeArgs, or use --mode text if that's supported) — or alternatively flip the capability flags and add the parser in this PR.

Bonus nits (not blocking, your call)

  • Hermes supportsStreaming: true is a little optimistic per your own research note ("Structured event output for Maestro was not confirmed"). Worth double-checking before users get a half-rendered stream.
  • Once issue 1 above (the promptArgs double append) is fixed, please also drop or rewrite the new agent-args.test.ts cases so they reflect the actual ordering.

No merge conflicts as of now, so once those two are addressed I'm happy to take another look. Thanks again!


@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 3

🧹 Nitpick comments (2)
src/renderer/installBrowserMaestroStub.ts (1)

1-14: 💤 Low value

Type the fixture array against AgentId to catch ID renames.

DEV_RENDERER_AGENT_FIXTURES uses bare string literals; if any of the canonical agent IDs (e.g., hermes, pi) get renamed in src/shared/agentIds.ts, this stub silently drifts. A typed annotation will surface the mismatch at compile time.

♻️ Proposed annotation
-const DEV_RENDERER_AGENT_FIXTURES = [
+import type { AgentId } from '../shared/agentIds';
+
+type DevFixture = {
+	id: AgentId;
+	name: string;
+	available: boolean;
+	hidden: boolean;
+	supportsBatch: boolean;
+};
+
+const DEV_RENDERER_AGENT_FIXTURES: readonly DevFixture[] = [
 	{ id: 'claude-code', name: 'Claude Code', available: true, hidden: false, supportsBatch: true },
 	...
-] as const;
+];
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/renderer/installBrowserMaestroStub.ts` around lines 1 - 14,
DEV_RENDERER_AGENT_FIXTURES is typed with bare string literals so renaming
canonical agent IDs won't be caught; annotate the fixture array with the
AgentId-based type (e.g., use something like ReadonlyArray<{ id: AgentId; ... }>
or const assertion with id: AgentId) so the compiler verifies each id against
the AgentId union, updating the declaration for DEV_RENDERER_AGENT_FIXTURES to
use AgentId for the id property and keep the rest of the shape the same.
src/renderer/main.tsx (1)

6-6: ⚡ Quick win

Harness is eagerly imported in the production bundle.

Phase01AgentParityHarness is dev-only (Phase 01 quality gate), but the static import at line 6 forces it (and transitively AGENT_CAPABILITIES, metadata, Tailwind classes used only here) into the initial renderer chunk for every user. Lazy-load it the same way as MaestroConsole so production builds don't ship the harness when the URL flag isn't set.

♻️ Proposed fix
-import { Phase01AgentParityHarness } from './Phase01AgentParityHarness';
@@
-const MaestroConsole = lazy(() => import('./App'));
+const MaestroConsole = lazy(() => import('./App'));
+const Phase01AgentParityHarness = lazy(() =>
+	import('./Phase01AgentParityHarness').then((m) => ({ default: m.Phase01AgentParityHarness }))
+);
@@
-					{isPhase01AgentParityHarness ? (
-						<Phase01AgentParityHarness />
-					) : (
-						<Suspense fallback={null}>
-							<MaestroConsole />
-						</Suspense>
-					)}
+					<Suspense fallback={null}>
+						{isPhase01AgentParityHarness ? <Phase01AgentParityHarness /> : <MaestroConsole />}
+					</Suspense>

Also applies to: 105-111

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/renderer/main.tsx` at line 6, The dev-only Phase01AgentParityHarness is
statically imported and pulled into the production renderer bundle; change its
static import to a lazy dynamic import like MaestroConsole (use React.lazy(() =>
import('./Phase01AgentParityHarness')) and wrap with <Suspense>) so it only
loads when the URL/dev flag requires it; update any references to
Phase01AgentParityHarness accordingly (including the other occurrences around
the block noted 105-111) to use the lazy component and ensure Tailwind classes
and AGENT_CAPABILITIES metadata aren't shipped unless the harness is actually
rendered.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 2130e34d-2972-4b97-91c4-77122387f6c6

📥 Commits

Reviewing files that changed from the base of the PR and between ba4ddb6 and e0966da.

📒 Files selected for processing (7)
  • docs/research/agent-parity/phase-01-findings.md
  • src/__tests__/main/agents/detector.test.ts
  • src/__tests__/main/utils/agent-args.test.ts
  • src/__tests__/shared/agent-completeness.test.ts
  • src/renderer/Phase01AgentParityHarness.tsx
  • src/renderer/installBrowserMaestroStub.ts
  • src/renderer/main.tsx
✅ Files skipped from review due to trivial changes (1)
  • docs/research/agent-parity/phase-01-findings.md

Comment on lines +330 to +368
it('builds Hermes batch args for the documented Maestro launch path', () => {
	const hermes = AGENT_DEFINITIONS.find((agent) => agent.id === 'hermes');
	expect(hermes).toBeDefined();

	const result = buildAgentArgs(hermes!, {
		baseArgs: [],
		prompt: 'Summarize the current branch status',
		modelId: 'anthropic/claude-sonnet-4-20250514',
	});

	expect(result).toEqual([
		'chat',
		'-Q',
		'-m',
		'anthropic/claude-sonnet-4-20250514',
		'-q',
		'Summarize the current branch status',
	]);
});

it('builds Pi batch args for the documented Maestro launch path', () => {
	const pi = AGENT_DEFINITIONS.find((agent) => agent.id === 'pi');
	expect(pi).toBeDefined();

	const result = buildAgentArgs(pi!, {
		baseArgs: [],
		prompt: 'Plan the next implementation step',
		modelId: 'claude-sonnet-4.5',
	});

	expect(result).toEqual([
		'--mode',
		'json',
		'-m',
		'claude-sonnet-4.5',
		'-p',
		'Plan the next implementation step',
	]);
});

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

These tests lock in the duplicated promptArgs behavior and Pi's unsupported JSON mode.

Both new assertions encode promptArgs being appended by buildAgentArgs (the trailing -q <prompt> for Hermes and -p <prompt> for Pi). Per the PR objectives, ChildProcessSpawner.spawn() already appends promptArgs(prompt) to the args it receives, so once the duplicate append is removed from buildAgentArgs, these expectations will be wrong — and until then, the actual spawned argv for both agents will contain the prompt flags twice. The Hermes assertion (line 340-347) and the Pi assertion (line 360-367) need to drop the trailing -q/-p pair once the source fix lands.

The Pi case also bakes in the second blocking issue: expect(result).toEqual(['--mode', 'json', ...]) asserts the documented launch path emits --mode json, but AGENT_CAPABILITIES.pi declares supportsJsonOutput: false / usesJsonLineOutput: false with no parser wired up, so this is asserting behavior the rest of Maestro can't consume. Whichever direction the fix goes (drop --mode json from batchModeArgs, switch to a text mode, or flip the capability flags and add a parser), this expected array will need to change.

Recommend deferring/removing these two cases until both the promptArgs double-append and the Pi JSON-mode mismatch are resolved in src/main/utils/agent-args.ts and src/main/agents/definitions.ts, then re-deriving the expected argv from the actual spawn path (i.e., what ChildProcessSpawner.spawn() produces end-to-end).

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/__tests__/main/utils/agent-args.test.ts` around lines 330 - 368, The
tests assert duplicated prompt flags and Pi JSON mode that conflict with
ChildProcessSpawner.spawn behavior and AGENT_CAPABILITIES; update the tests to
stop expecting buildAgentArgs to append promptArgs (remove the trailing '-q
<prompt>' in the Hermes spec and '-p <prompt>' in the Pi spec) and remove or
change the Pi expectation that includes '--mode','json' until you resolve the
source issues in buildAgentArgs, ChildProcessSpawner.spawn, batchModeArgs, and
AGENT_CAPABILITIES (or re-derive expected argv from the end-to-end spawn path);
reference AGENT_DEFINITIONS and buildAgentArgs to locate the subjects to adjust.

listDocs: async () => ({ success: true, docs: [] }),
},
fs: {
homeDir: async () => '/home/egsox',

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Hardcoded user-specific home path.

'/home/egsox' looks like a personal username and will be confusing/misleading for other developers who use this stub. Replace with a generic placeholder.

🛠️ Proposed fix
-			homeDir: async () => '/home/egsox',
+			homeDir: async () => '/home/dev',
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/renderer/installBrowserMaestroStub.ts` at line 53, The stub's hardcoded
home path (the homeDir property in installBrowserMaestroStub currently returning
'/home/egsox') should be replaced with a generic or dynamic value; change the
homeDir implementation to return a generic placeholder like '/home/user' or,
better, use a dynamic resolver such as os.homedir() or process.env.HOME so it
isn't tied to a personal username.

Comment on lines +20 to +35
useEffect(() => {
	let mounted = true;
	void window.maestro.agents.detect().then((detected) => {
		if (!mounted) return;
		const normalized = detected.filter((agent) =>
			HARNESS_AGENT_IDS.includes(agent.id as AgentId)
		) as DetectedAgent[];
		setAgents(normalized);
		setStatus(
			'Stubbed detection loaded. Probe an agent below to validate copy and fallback behavior.'
		);
	});
	return () => {
		mounted = false;
	};
}, []);

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Loading state never recovers on detect() rejection.

If window.maestro.agents.detect() rejects, the status remains stuck on "Loading stubbed agent detection…" forever (the global unhandledrejection handler in main.tsx reports to Sentry but the harness UI gives no feedback). Add a .catch that updates status so the harness reports failure visibly.

🛡️ Proposed fix
-		void window.maestro.agents.detect().then((detected) => {
-			if (!mounted) return;
-			const normalized = detected.filter((agent) =>
-				HARNESS_AGENT_IDS.includes(agent.id as AgentId)
-			) as DetectedAgent[];
-			setAgents(normalized);
-			setStatus(
-				'Stubbed detection loaded. Probe an agent below to validate copy and fallback behavior.'
-			);
-		});
+		window.maestro.agents
+			.detect()
+			.then((detected) => {
+				if (!mounted) return;
+				const normalized = detected.filter((agent) =>
+					HARNESS_AGENT_IDS.includes(agent.id as AgentId)
+				) as DetectedAgent[];
+				setAgents(normalized);
+				setStatus(
+					'Stubbed detection loaded. Probe an agent below to validate copy and fallback behavior.'
+				);
+			})
+			.catch((err: unknown) => {
+				if (!mounted) return;
+				setStatus(
+					`Stubbed detection failed: ${err instanceof Error ? err.message : String(err)}`
+				);
+			});
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/renderer/Phase01AgentParityHarness.tsx` around lines 20 - 35, The
detect() promise chain in the useEffect (inside Phase01AgentParityHarness
component) doesn’t handle rejections, so add a .catch handler on
window.maestro.agents.detect() that checks mounted, sets an error status via
setStatus (e.g., "Failed to detect agents: <brief message>"), clears or leaves
agents empty with setAgents([]) as appropriate, and optionally logs the error;
keep the existing mounted guard and return cleanup unchanged so the UI recovers
from rejection instead of remaining stuck on "Loading stubbed agent detection…".

2 participants