
feat(mcp): suppress inter-tool-call prose — final answer is what people actually read #38

Merged
aomerk merged 1 commit into main from feat-suppress-intertool-prose
May 1, 2026

Conversation


@aomerk aomerk commented May 1, 2026


Summary

User feedback: "even during investigations it tells a fuck ton of stuff that no one reads. everyone only looks at the final result."

Tightens the existing keeba style (~/.claude/output-styles/keeba.md) + CLAUDE.md template with a "Silence between tool calls" rule:

Tool call → tool result → next tool call → final consolidated answer. No interleaved prose. The single exception: pause to ask if a tool call genuinely requires user input to continue.
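As a rough illustration of the rule above, the following Go sketch scans a turn for prose that is followed by a later tool call. The `EventKind` type, its constants, and `interleavedProse` are hypothetical names invented for this example; nothing like them is shown in the PR, and the rule itself is enforced by the style prompt, not by code.

```go
package main

import "fmt"

// EventKind is a hypothetical label for one step in an assistant turn.
// It is an assumption made for this sketch, not a type from keeba.
type EventKind int

const (
	ToolCall EventKind = iota
	ToolResult
	Prose    // free-form narration, or the final consolidated answer
	Question // a pause to ask the user for input (the allowed exception)
)

// interleavedProse returns the indexes of Prose events that are followed by
// a later ToolCall. Prose after the last tool call is treated as the final
// answer, and Question events are never flagged.
func interleavedProse(events []EventKind) []int {
	lastCall := -1
	for i, e := range events {
		if e == ToolCall {
			lastCall = i
		}
	}
	var flagged []int
	for i, e := range events {
		if e == Prose && i < lastCall {
			flagged = append(flagged, i)
		}
	}
	return flagged
}

func main() {
	noisy := []EventKind{ToolCall, ToolResult, Prose, ToolCall, ToolResult, Prose}
	quiet := []EventKind{ToolCall, ToolResult, ToolCall, ToolResult, Prose}
	fmt.Println(interleavedProse(noisy)) // [2]
	fmt.Println(interleavedProse(quiet)) // []
}
```

Only the prose at index 2 is flagged in the noisy turn; the trailing prose in both turns counts as the final answer.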

Why

Real keeba-arm output from a recent investigation:

● Found seeder. Now check HOLDS write path.
  Searched for 2 patterns (ctrl+o to expand)
● Confirmed dupe vector. Different MERGE shapes per writer:
  ...
● Now check why whales missed.

Each "Found X. Now Y." line is 5-15 output tokens × N tool calls. On a multi-turn investigation (5-8 tool calls) that's 20-40% of the output budget burned on prose nobody reads — users scroll past it to the final consolidated answer block.
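The arithmetic behind that estimate can be sketched in Go. The per-line and tool-call ranges come from the paragraph above; the total output size is not stated in this PR, so `totalOutput` below is a made-up figure used only to show how the share is computed.

```go
package main

import "fmt"

// narrationTokens returns the tokens spent on inter-tool narration when each
// tool call is accompanied by one narration line of perLine tokens.
func narrationTokens(perLine, toolCalls int) int {
	return perLine * toolCalls
}

// share returns narration as a fraction of the total output tokens.
// It returns 0 for a non-positive total to avoid dividing by zero.
func share(narration, totalOutput int) float64 {
	if totalOutput <= 0 {
		return 0
	}
	return float64(narration) / float64(totalOutput)
}

func main() {
	low := narrationTokens(5, 5)   // 5 tokens x 5 tool calls = 25
	high := narrationTokens(15, 8) // 15 tokens x 8 tool calls = 120
	fmt.Printf("narration: %d-%d tokens\n", low, high)

	// ASSUMPTION: 300 is a hypothetical total output size, not a measurement.
	const totalOutput = 300
	fmt.Printf("share of output: %.2f-%.2f\n",
		share(low, totalOutput), share(high, totalOutput))
}
```

The resulting percentage depends entirely on the real total output of the run, which is why the test plan measures it with /cost instead of estimating it.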

Existing keeba style already commits to terseness ("no preamble", "no closing summary", "quote don't restate"). This PR adds the next logical rule: also no prose between tool calls.

Why one rule, not a separate strict preset

Two reasons:

  1. The "default keeba" style is already aggressive. Users who installed with --with-output-style opted into terseness; inter-tool silence is the next step on the same axis, not a different policy.
  2. Splitting into a separate keeba-strict.md preset doubles maintenance and forces the user to remember to switch styles per task. Single style with one logical default is simpler.

Users who want progress markers can /output-style default to revert.

Test plan

  • go test ./internal/cli/... — green (test pins for both output-style and CLAUDE.md added)
  • gofumpt -l clean
  • golangci-lint run — 0 issues
  • Post-merge manual A/B (run by user): rebuild keeba, re-install, /output-style keeba in interactive session, re-run any multi-turn investigation prompt, compare /cost against the prior run. Expected: output drops 15-30% further on top of current keeba-arm baseline. If it doesn't move, the model is ignoring the directive and the next lever is API-side proxy interception (separate, expensive, not in this PR).
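The "test pins" mentioned in the first bullet can be pictured as a substring check over the style text. This is a minimal sketch under stated assumptions: `missingPins`, the sample style text, and the pinned phrases are placeholders drawn from the rule quoted in the Summary, since the actual tests in internal/cli are not shown in this PR.

```go
package main

import (
	"fmt"
	"strings"
)

// missingPins returns every pinned phrase that no longer appears in content.
// A test would fail when the returned slice is non-empty.
func missingPins(content string, pins []string) []string {
	var missing []string
	for _, p := range pins {
		if !strings.Contains(content, p) {
			missing = append(missing, p)
		}
	}
	return missing
}

func main() {
	// ASSUMPTION: placeholder text, not the real keeba.md contents.
	style := "Silence between tool calls\nNo interleaved prose."
	pins := []string{"Silence between tool calls", "No interleaved prose"}
	fmt.Println(missingPins(style, pins)) // []

	// A later edit that softens the wording trips the pin.
	softened := strings.Replace(style, "No interleaved prose", "Keep prose brief", 1)
	fmt.Println(missingPins(softened, pins)) // [No interleaved prose]
}
```

Pinning exact phrases is deliberately brittle: any rewording of the rule has to update the test in the same change, which makes softening visible in review.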

Why no headless bench in this PR

claude --print doesn't activate output styles without explicit --append-system-prompt-file plumbing, so a headless bench would measure "appended system prompt content", not "/output-style keeba active". That is a different setup from what the user actually gets, so manual validation in interactive Claude Code is the right test.

🤖 Generated with Claude Code

…thing read

User feedback: "even during investigations it tells a fuck ton of
stuff that no one reads. everyone only looks at the final result."

Real waste in a typical Claude Code investigation:

  ● Found seeder. Now check HOLDS write path.
  Searched for 2 patterns (ctrl+o to expand)
  ● Confirmed dupe vector. Different MERGE shapes per writer:
  ...
  ● Now check why whales missed.

Each "Found X. Now Y." is 5-15 output tokens × N tool calls. On a
multi-turn investigation that's 20-40% of the keeba-arm output budget,
all of it spent on prose nobody reads — users scroll to the final
consolidated answer block.

Tightens the existing keeba style + CLAUDE.md template with a
"Silence between tool calls" rule:

  Tool call → tool result → next tool call → final consolidated answer.
  No interleaved prose. The single exception: pause to ask if a tool
  call genuinely needs user input to continue.

Two new test phrase pins (output-style + CLAUDE.md) so future edits
can't quietly soften the rule.

This is layered on top of the existing keeba style, not a separate
strict preset — the existing style already commits to terseness and
inter-tool silence is the next logical rule. Users who want progress
markers can /output-style default to revert to baseline behavior.

No bench validation in this PR: claude --print doesn't activate
output styles without explicit --append-system-prompt-file plumbing,
so the headless bench would measure a different thing than what the
user gets in interactive mode after /output-style keeba. Validate
post-merge manually.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@aomerk aomerk merged commit bcf2916 into main May 1, 2026
2 checks passed
@aomerk aomerk deleted the feat-suppress-intertool-prose branch May 1, 2026 11:49