Skip to content

feat(mcp): --with-output-style + stop-condition CLAUDE.md additions (banger phase 16)#35

Merged
aomerk merged 1 commit into
mainfrom
feat-output-style-and-stop-conditions
May 1, 2026
Merged

feat(mcp): --with-output-style + stop-condition CLAUDE.md additions (banger phase 16)#35
aomerk merged 1 commit into
mainfrom
feat-output-style-and-stop-conditions

Conversation

@aomerk

@aomerk aomerk commented May 1, 2026

Copy link
Copy Markdown
Owner

Summary

  • Ships keeba mcp install --with-output-style — drops ~/.claude/output-styles/keeba.md, a terse engineering style that suppresses preamble + tool-result restatement + closing summaries (the three biggest output-token sinks in a typical Claude Code investigation). Activate per-session with /output-style keeba.
  • Strengthens the --with-claude-md template with an Answer discipline (output-token savings) subsection: one-find_def-hit-stops, quote-don't-restate, no-preamble, no-closing-summary, conclusion-first.
  • README install snippet now leads with all four UX patches (--patch-agents --with-claude-md --with-hook --with-output-style).

Why

Real-world A/B (Slack-thread investigation prompt) showed our existing codec layers ceiling at ~30% session-cost reduction:

arm cost cache_read output
no-keeba $3.73 2.6M 26.2k
keeba (lean codec L2) $2.73 1.3M 19.2k

L1 (hook codec) + L2 (lean MCP responses) both shrink cache_read (priced ~$0.30/M). But session cost is dominated by output tokens (priced ~$15/M — 50× more) and output didn't shrink. To break the ceiling we have to attack the output side.

Output tokens come from five places. We control three:

source controllable? lever
Preamble ("I'll investigate by...") yes output-style: No preamble directive
Tool-use args no (small)
Restatement of tool results yes output-style + CLAUDE.md: Quote, don't restate
Wandering / "let me also check" reasoning yes CLAUDE.md Answer discipline stop conditions
Internal thinking blocks no (Opus extended thinking, user-toggled)

This PR ships the levers for the three controllable sources. They are opt-in (existing installs unaffected) and idempotent (re-running keeba mcp install prints "no change").

Why opt-in, not default

Output styles in Claude Code replace the default engineering system prompt. Mutating the user's defaultOutputStyle from an install is too invasive — they may have their own style or rely on the default. Drop the file, tell the user to /output-style keeba. They opt in.

What's NOT in this PR

  • Lever 4 (vertical agent — keeba investigate runs the loop in our control plane and returns one finished answer for Claude to synthesize). That's a much bigger build; defer until we measure whether output-style + Answer discipline already break the ceiling.
  • Mutation of ~/.claude/settings.json defaultOutputStyle. Opt-in only.

Test plan

  • go test ./... — all green
  • gofumpt -l clean on changed files
  • go vet ./... clean
  • golangci-lint run ./internal/cli/... — 0 issues
  • End-to-end install dry run against a temp HOME:
    • --with-output-style lands ~/.claude/output-styles/keeba.md with valid frontmatter (name: keeba)
    • --with-claude-md injects ## Code investigation in keeba-indexed repos with the new Answer discipline subsection containing NO preamble, NO closing summary, Quote tool-result rows verbatim, 50× cache_read
    • re-run prints "no change" for both
  • Verification A/B (post-merge, run by user): install with --with-output-style, activate /output-style keeba, re-run the same Slack-thread prompt, compare cost vs the $2.73 lean-codec baseline. Expected: <$2.00 if the lever is real; ~$2.73 if output bloat is structural and we need lever 4.

🤖 Generated with Claude Code

…banger phase 16)

Attack output-token bloat — the dominant cost component our codec
layers don't touch. Real-world A/B showed L1 (hook) + L2 (lean codec)
ceiling at ~30% session-cost reduction because they only shrink
cache_read; output tokens (preamble + tool-result restatement +
closing summaries) sit on a different axis at ~50× cache_read
pricing. This is the lever that pushes savings past the ceiling.

Two new install patches, both opt-in, both idempotent:

  --with-output-style  drops ~/.claude/output-styles/keeba.md, a terse
                       engineering style suppressing preamble +
                       restatement + summaries while leaving
                       tool-calling competence intact. Activated
                       per-session with /output-style keeba (we don't
                       mutate the user's defaultOutputStyle setting —
                       too invasive for an install).

  CLAUDE.md template   gains an "Answer discipline (output-token
                       savings)" subsection: one-find_def-hit-stops,
                       quote-don't-restate, no-preamble,
                       no-closing-summary, conclusion-first. Same
                       sentinels keep --with-claude-md re-runs
                       idempotent; new test pins the rules so future
                       edits can't quietly soften them.

Tests: full-file/idempotent/overwrite-stale for installKeebaOutputStyle,
phrase pins for both the output-style content and the new CLAUDE.md
subsection. End-to-end install dry run against a temp HOME confirms
both files land at the right paths and re-runs print "no change".

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@aomerk aomerk merged commit 133d9ab into main May 1, 2026
2 checks passed
@aomerk aomerk deleted the feat-output-style-and-stop-conditions branch May 1, 2026 10:14
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant