feat: add experimental Hermes and Pi agents#1032
Conversation
|
No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review info⚙️ Run configurationConfiguration used: Path: .coderabbit.yaml Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (1)
📝 WalkthroughWalkthroughAdds two experimental agents (Hermes and Pi): registers IDs, metadata, capabilities, CLI definitions and prompt arg handling; updates detection/args/registry tests; adds research docs; and provides a dev harness, browser stub, and renderer wiring to exercise parity. ChangesHermes and Pi agent integration
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes Possibly related PRs
Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
Greptile SummaryThis PR registers Hermes and Pi as experimental agents in Maestro's shared catalog (IDs, display metadata, context-window defaults, capabilities, and CLI definitions), and adds
Confidence Score: 3/5Not safe to merge as-is: the The catalog-only files (agentIds, agentMetadata, agentConstants) are straightforward and low-risk. The regression lives in
Important Files Changed
Sequence DiagramsequenceDiagram
participant P as process.ts
participant BA as buildAgentArgs
participant CS as ChildProcessSpawner
P->>BA: "buildAgentArgs(agent, { prompt, ... })"
Note over BA: batchModePrefix/Args added (gated on prompt)
BA->>BA: promptArgs(prompt) appended NEW
BA-->>P: finalArgs with prompt already included
P->>CS: "spawn({ args: finalArgs, promptArgs: agent.promptArgs, prompt })"
Note over CS: else if prompt and not promptViaStdin
CS->>CS: promptArgs(prompt) appended AGAIN
Note over CS: finalArgs now has prompt twice
CS->>CS: exec agent with doubled prompt
Reviews (1): Last reviewed commit: "feat: add experimental Hermes and Pi age..." | Re-trigger Greptile |
| if (options.prompt && agent.promptArgs) { | ||
| finalArgs = [...finalArgs, ...agent.promptArgs(options.prompt)]; | ||
| } |
There was a problem hiding this comment.
Double-prompt regression for every agent with
promptArgs defined
buildAgentArgs now appends promptArgs(prompt) to finalArgs, but ChildProcessSpawner.spawn() already does the same at its own line 173–174: finalArgs = [...args, ...promptArgs(prompt)]. Because args in the spawner is exactly what buildAgentArgs returns, any agent with promptArgs defined — including the existing gemini-cli and the new hermes/pi — will have the prompt appended twice. For example, Gemini CLI would be invoked as gemini -y --output-format stream-json -p "prompt" -p "prompt". The deduplication at the end of buildAgentArgs only runs on flags produced within that function and cannot prevent the second insertion in the spawner.
The fix depends on intent: either remove the promptArgs call from ChildProcessSpawner and let buildAgentArgs own prompt injection, or remove it from buildAgentArgs and keep the spawner as the sole owner. Because context-groomer.ts and group-chat-router.ts also invoke promptArgs directly on the spawner config, removing it from the spawner would require coordinated changes across multiple callers.
| { | ||
| id: 'pi', | ||
| name: 'Pi', | ||
| binaryName: 'pi', | ||
| command: 'pi', | ||
| args: [], | ||
| batchModeArgs: ['--mode', 'json'], | ||
| promptArgs: (prompt: string) => ['-p', prompt], | ||
| resumeArgs: (sessionId: string) => ['--session', sessionId], | ||
| modelArgs: (modelId: string) => ['-m', modelId], | ||
| imageArgs: (imagePath: string) => ['-i', imagePath], | ||
| configOptions: [ | ||
| { | ||
| key: 'model', | ||
| type: 'text', | ||
| label: 'Model', | ||
| description: | ||
| 'Documented Pi model override (for example, claude-sonnet-4.5). Leave empty for the CLI default.', | ||
| default: '', | ||
| argBuilder: (value: string) => (value.trim() ? ['-m', value.trim()] : []), | ||
| }, | ||
| { | ||
| key: 'contextWindow', | ||
| type: 'number', | ||
| label: 'Context Window Size', | ||
| description: | ||
| 'Fallback context window size in tokens until Pi reports a runtime-specific value.', | ||
| default: 200000, | ||
| }, | ||
| ], |
There was a problem hiding this comment.
batchModeArgs: ['--mode', 'json'] produces unreadable output when supportsJsonOutput is false
Pi's definition adds --mode json via batchModeArgs whenever a prompt is present, which causes the Pi CLI to emit structured JSON events on stdout. The corresponding capability entry sets both supportsJsonOutput: false and usesJsonLineOutput: false, meaning Maestro has no JSON parser wired up for Pi's output. Every batch-mode Pi session would therefore stream raw JSON event objects to the user instead of readable text.
If --mode json is required to make Pi exit non-interactively, that should be resolved before exposing Pi in any experimental UI. If -p/--print alone is sufficient to guarantee non-interactive exit, removing batchModeArgs (or setting it to []) until the JSON parser is ready would be the safer choice.
|
Thanks for the contribution @XcluEzy7! Conservative catalog/capability gating looks great, and the research notes are a nice touch. Before this lands, two issues from Greptile that I verified by walking the spawn path — both need addressing: 1.
|
There was a problem hiding this comment.
Actionable comments posted: 3
🧹 Nitpick comments (2)
src/renderer/installBrowserMaestroStub.ts (1)
1-14: 💤 Low valueType the fixture array against
AgentIdto catch ID renames.
DEV_RENDERER_AGENT_FIXTURESuses bare string literals; if any of the canonical agent IDs (e.g.,hermes,pi) get renamed insrc/shared/agentIds.ts, this stub silently drifts. A typed annotation will surface the mismatch at compile time.♻️ Proposed annotation
-const DEV_RENDERER_AGENT_FIXTURES = [ +import type { AgentId } from '../shared/agentIds'; + +type DevFixture = { + id: AgentId; + name: string; + available: boolean; + hidden: boolean; + supportsBatch: boolean; +}; + +const DEV_RENDERER_AGENT_FIXTURES: readonly DevFixture[] = [ { id: 'claude-code', name: 'Claude Code', available: true, hidden: false, supportsBatch: true }, ... -] as const; +];🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/renderer/installBrowserMaestroStub.ts` around lines 1 - 14, DEV_RENDERER_AGENT_FIXTURES is typed with bare string literals so renaming canonical agent IDs won't be caught; annotate the fixture array with the AgentId-based type (e.g., use something like ReadonlyArray<{ id: AgentId; ... }> or const assertion with id: AgentId) so the compiler verifies each id against the AgentId union, updating the declaration for DEV_RENDERER_AGENT_FIXTURES to use AgentId for the id property and keep the rest of the shape the same.src/renderer/main.tsx (1)
6-6: ⚡ Quick winHarness is eagerly imported in the production bundle.
Phase01AgentParityHarnessis dev-only (Phase 01 quality gate), but the static import at line 6 forces it (and transitivelyAGENT_CAPABILITIES, metadata, Tailwind classes used only here) into the initial renderer chunk for every user. Lazy-load it the same way asMaestroConsoleso production builds don't ship the harness when the URL flag isn't set.♻️ Proposed fix
-import { Phase01AgentParityHarness } from './Phase01AgentParityHarness'; @@ -const MaestroConsole = lazy(() => import('./App')); +const MaestroConsole = lazy(() => import('./App')); +const Phase01AgentParityHarness = lazy(() => + import('./Phase01AgentParityHarness').then((m) => ({ default: m.Phase01AgentParityHarness })) +); @@ - {isPhase01AgentParityHarness ? ( - <Phase01AgentParityHarness /> - ) : ( - <Suspense fallback={null}> - <MaestroConsole /> - </Suspense> - )} + <Suspense fallback={null}> + {isPhase01AgentParityHarness ? <Phase01AgentParityHarness /> : <MaestroConsole />} + </Suspense>Also applies to: 105-111
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/renderer/main.tsx` at line 6, The dev-only Phase01AgentParityHarness is statically imported and pulled into the production renderer bundle; change its static import to a lazy dynamic import like MaestroConsole (use React.lazy(() => import('./Phase01AgentParityHarness')) and wrap with <Suspense>) so it only loads when the URL/dev flag requires it; update any references to Phase01AgentParityHarness accordingly (including the other occurrences around the block noted 105-111) to use the lazy component and ensure Tailwind classes and AGENT_CAPABILITIES metadata aren't shipped unless the harness is actually rendered.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@src/__tests__/main/utils/agent-args.test.ts`:
- Around line 330-368: The tests assert duplicated prompt flags and Pi JSON mode
that conflict with ChildProcessSpawner.spawn behavior and AGENT_CAPABILITIES;
update the tests to stop expecting buildAgentArgs to append promptArgs (remove
the trailing '-q <prompt>' in the Hermes spec and '-p <prompt>' in the Pi spec)
and remove or change the Pi expectation that includes '--mode','json' until you
resolve the source issues in buildAgentArgs, ChildProcessSpawner.spawn,
batchModeArgs, and AGENT_CAPABILITIES (or re-derive expected argv from the
end-to-end spawn path); reference AGENT_DEFINITIONS and buildAgentArgs to locate
the subjects to adjust.
In `@src/renderer/installBrowserMaestroStub.ts`:
- Line 53: The stub's hardcoded home path (the homeDir property in
installBrowserMaestroStub currently returning '/home/egsox') should be replaced
with a generic or dynamic value; change the homeDir implementation to return a
generic placeholder like '/home/user' or, better, use a dynamic resolver such as
os.homedir() or process.env.HOME so it isn't tied to a personal username.
In `@src/renderer/Phase01AgentParityHarness.tsx`:
- Around line 20-35: The detect() promise chain in the useEffect (inside
Phase01AgentParityHarness component) doesn’t handle rejections, so add a .catch
handler on window.maestro.agents.detect() that checks mounted, sets an error
status via setStatus (e.g., "Failed to detect agents: <brief message>"), clears
or leaves agents empty with setAgents([]) as appropriate, and optionally logs
the error; keep the existing mounted guard and return cleanup unchanged so the
UI recovers from rejection instead of remaining stuck on "Loading stubbed agent
detection…".
---
Nitpick comments:
In `@src/renderer/installBrowserMaestroStub.ts`:
- Around line 1-14: DEV_RENDERER_AGENT_FIXTURES is typed with bare string
literals so renaming canonical agent IDs won't be caught; annotate the fixture
array with the AgentId-based type (e.g., use something like ReadonlyArray<{ id:
AgentId; ... }> or const assertion with id: AgentId) so the compiler verifies
each id against the AgentId union, updating the declaration for
DEV_RENDERER_AGENT_FIXTURES to use AgentId for the id property and keep the rest
of the shape the same.
In `@src/renderer/main.tsx`:
- Line 6: The dev-only Phase01AgentParityHarness is statically imported and
pulled into the production renderer bundle; change its static import to a lazy
dynamic import like MaestroConsole (use React.lazy(() =>
import('./Phase01AgentParityHarness')) and wrap with <Suspense>) so it only
loads when the URL/dev flag requires it; update any references to
Phase01AgentParityHarness accordingly (including the other occurrences around
the block noted 105-111) to use the lazy component and ensure Tailwind classes
and AGENT_CAPABILITIES metadata aren't shipped unless the harness is actually
rendered.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 2130e34d-2972-4b97-91c4-77122387f6c6
📒 Files selected for processing (7)
docs/research/agent-parity/phase-01-findings.mdsrc/__tests__/main/agents/detector.test.tssrc/__tests__/main/utils/agent-args.test.tssrc/__tests__/shared/agent-completeness.test.tssrc/renderer/Phase01AgentParityHarness.tsxsrc/renderer/installBrowserMaestroStub.tssrc/renderer/main.tsx
✅ Files skipped from review due to trivial changes (1)
- docs/research/agent-parity/phase-01-findings.md
| it('builds Hermes batch args for the documented Maestro launch path', () => { | ||
| const hermes = AGENT_DEFINITIONS.find((agent) => agent.id === 'hermes'); | ||
| expect(hermes).toBeDefined(); | ||
|
|
||
| const result = buildAgentArgs(hermes!, { | ||
| baseArgs: [], | ||
| prompt: 'Summarize the current branch status', | ||
| modelId: 'anthropic/claude-sonnet-4-20250514', | ||
| }); | ||
|
|
||
| expect(result).toEqual([ | ||
| 'chat', | ||
| '-Q', | ||
| '-m', | ||
| 'anthropic/claude-sonnet-4-20250514', | ||
| '-q', | ||
| 'Summarize the current branch status', | ||
| ]); | ||
| }); | ||
|
|
||
| it('builds Pi batch args for the documented Maestro launch path', () => { | ||
| const pi = AGENT_DEFINITIONS.find((agent) => agent.id === 'pi'); | ||
| expect(pi).toBeDefined(); | ||
|
|
||
| const result = buildAgentArgs(pi!, { | ||
| baseArgs: [], | ||
| prompt: 'Plan the next implementation step', | ||
| modelId: 'claude-sonnet-4.5', | ||
| }); | ||
|
|
||
| expect(result).toEqual([ | ||
| '--mode', | ||
| 'json', | ||
| '-m', | ||
| 'claude-sonnet-4.5', | ||
| '-p', | ||
| 'Plan the next implementation step', | ||
| ]); | ||
| }); |
There was a problem hiding this comment.
These tests lock in the duplicated promptArgs behavior and Pi's unsupported JSON mode.
Both new assertions encode promptArgs being appended by buildAgentArgs (the trailing -q <prompt> for Hermes and -p <prompt> for Pi). Per the PR objectives, ChildProcessSpawner.spawn() already appends promptArgs(prompt) to the args it receives, so once the duplicate append is removed from buildAgentArgs, these expectations will be wrong — and until then, the actual spawned argv for both agents will contain the prompt flags twice. The Hermes assertion (line 340-347) and the Pi assertion (line 360-367) need to drop the trailing -q/-p pair once the source fix lands.
The Pi case also bakes in the second blocking issue: expect(result).toEqual(['--mode', 'json', ...]) asserts the documented launch path emits --mode json, but AGENT_CAPABILITIES.pi declares supportsJsonOutput: false / usesJsonLineOutput: false with no parser wired up, so this is asserting behavior the rest of Maestro can't consume. Whichever direction the fix goes (drop --mode json from batchModeArgs, switch to a text mode, or flip the capability flags and add a parser), this expected array will need to change.
Recommend deferring/removing these two cases until both the promptArgs double-append and the Pi JSON-mode mismatch are resolved in src/main/utils/agent-args.ts and src/main/agents/definitions.ts, then re-deriving the expected argv from the actual spawn path (i.e., what ChildProcessSpawner.spawn() produces end-to-end).
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@src/__tests__/main/utils/agent-args.test.ts` around lines 330 - 368, The
tests assert duplicated prompt flags and Pi JSON mode that conflict with
ChildProcessSpawner.spawn behavior and AGENT_CAPABILITIES; update the tests to
stop expecting buildAgentArgs to append promptArgs (remove the trailing '-q
<prompt>' in the Hermes spec and '-p <prompt>' in the Pi spec) and remove or
change the Pi expectation that includes '--mode','json' until you resolve the
source issues in buildAgentArgs, ChildProcessSpawner.spawn, batchModeArgs, and
AGENT_CAPABILITIES (or re-derive expected argv from the end-to-end spawn path);
reference AGENT_DEFINITIONS and buildAgentArgs to locate the subjects to adjust.
| listDocs: async () => ({ success: true, docs: [] }), | ||
| }, | ||
| fs: { | ||
| homeDir: async () => '/home/egsox', |
There was a problem hiding this comment.
Hardcoded user-specific home path.
'/home/egsox' looks like a personal username and will be confusing/misleading for other developers who use this stub. Replace with a generic placeholder.
🛠️ Proposed fix
- homeDir: async () => '/home/egsox',
+ homeDir: async () => '/home/dev',📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| homeDir: async () => '/home/egsox', | |
| homeDir: async () => '/home/dev', |
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@src/renderer/installBrowserMaestroStub.ts` at line 53, The stub's hardcoded
home path (the homeDir property in installBrowserMaestroStub currently returning
'/home/egsox') should be replaced with a generic or dynamic value; change the
homeDir implementation to return a generic placeholder like '/home/user' or,
better, use a dynamic resolver such as os.homedir() or process.env.HOME so it
isn't tied to a personal username.
| useEffect(() => { | ||
| let mounted = true; | ||
| void window.maestro.agents.detect().then((detected) => { | ||
| if (!mounted) return; | ||
| const normalized = detected.filter((agent) => | ||
| HARNESS_AGENT_IDS.includes(agent.id as AgentId) | ||
| ) as DetectedAgent[]; | ||
| setAgents(normalized); | ||
| setStatus( | ||
| 'Stubbed detection loaded. Probe an agent below to validate copy and fallback behavior.' | ||
| ); | ||
| }); | ||
| return () => { | ||
| mounted = false; | ||
| }; | ||
| }, []); |
There was a problem hiding this comment.
Loading state never recovers on detect() rejection.
If window.maestro.agents.detect() rejects, the status remains stuck on "Loading stubbed agent detection…" forever (the global unhandledrejection handler in main.tsx reports to Sentry but the harness UI gives no feedback). Add a .catch that updates status so the harness reports failure visibly.
🛡️ Proposed fix
- void window.maestro.agents.detect().then((detected) => {
- if (!mounted) return;
- const normalized = detected.filter((agent) =>
- HARNESS_AGENT_IDS.includes(agent.id as AgentId)
- ) as DetectedAgent[];
- setAgents(normalized);
- setStatus(
- 'Stubbed detection loaded. Probe an agent below to validate copy and fallback behavior.'
- );
- });
+ window.maestro.agents
+ .detect()
+ .then((detected) => {
+ if (!mounted) return;
+ const normalized = detected.filter((agent) =>
+ HARNESS_AGENT_IDS.includes(agent.id as AgentId)
+ ) as DetectedAgent[];
+ setAgents(normalized);
+ setStatus(
+ 'Stubbed detection loaded. Probe an agent below to validate copy and fallback behavior.'
+ );
+ })
+ .catch((err: unknown) => {
+ if (!mounted) return;
+ setStatus(
+ `Stubbed detection failed: ${err instanceof Error ? err.message : String(err)}`
+ );
+ });📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| useEffect(() => { | |
| let mounted = true; | |
| void window.maestro.agents.detect().then((detected) => { | |
| if (!mounted) return; | |
| const normalized = detected.filter((agent) => | |
| HARNESS_AGENT_IDS.includes(agent.id as AgentId) | |
| ) as DetectedAgent[]; | |
| setAgents(normalized); | |
| setStatus( | |
| 'Stubbed detection loaded. Probe an agent below to validate copy and fallback behavior.' | |
| ); | |
| }); | |
| return () => { | |
| mounted = false; | |
| }; | |
| }, []); | |
| useEffect(() => { | |
| let mounted = true; | |
| window.maestro.agents | |
| .detect() | |
| .then((detected) => { | |
| if (!mounted) return; | |
| const normalized = detected.filter((agent) => | |
| HARNESS_AGENT_IDS.includes(agent.id as AgentId) | |
| ) as DetectedAgent[]; | |
| setAgents(normalized); | |
| setStatus( | |
| 'Stubbed detection loaded. Probe an agent below to validate copy and fallback behavior.' | |
| ); | |
| }) | |
| .catch((err: unknown) => { | |
| if (!mounted) return; | |
| setStatus( | |
| `Stubbed detection failed: ${err instanceof Error ? err.message : String(err)}` | |
| ); | |
| }); | |
| return () => { | |
| mounted = false; | |
| }; | |
| }, []); |
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@src/renderer/Phase01AgentParityHarness.tsx` around lines 20 - 35, The
detect() promise chain in the useEffect (inside Phase01AgentParityHarness
component) doesn’t handle rejections, so add a .catch handler on
window.maestro.agents.detect() that checks mounted, sets an error status via
setStatus (e.g., "Failed to detect agents: <brief message>"), clears or leaves
agents empty with setAgents([]) as appropriate, and optionally logs the error;
keep the existing mounted guard and return cleanup unchanged so the UI recovers
from rejection instead of remaining stuck on "Loading stubbed agent detection…".
Summary
Validation
Notes
Summary by CodeRabbit
New Features
Documentation
Tests
Chores