feat(mcp): --with-output-style + stop-condition CLAUDE.md additions (banger phase 16) #35
Merged
Attack output-token bloat — the dominant cost component our codec
layers don't touch. Real-world A/B showed L1 (hook) + L2 (lean codec)
ceiling at ~30% session-cost reduction because they only shrink
cache_read; output tokens (preamble + tool-result restatement +
closing summaries) sit on a different axis at ~50× cache_read
pricing. This is the lever that pushes savings past the ceiling.
Two new install patches, both opt-in, both idempotent:
- --with-output-style drops ~/.claude/output-styles/keeba.md, a terse engineering style suppressing preamble + restatement + summaries while leaving tool-calling competence intact. Activated per-session with /output-style keeba (we don't mutate the user's defaultOutputStyle setting; that is too invasive for an install).
- CLAUDE.md template gains an "Answer discipline (output-token savings)" subsection: one-find_def-hit-stops, quote-don't-restate, no-preamble, no-closing-summary, conclusion-first. Same sentinels keep --with-claude-md re-runs idempotent; new test pins the rules so future edits can't quietly soften them.
Tests: full-file/idempotent/overwrite-stale for installKeebaOutputStyle,
phrase pins for both the output-style content and the new CLAUDE.md
subsection. End-to-end install dry run against a temp HOME confirms
both files land at the right paths and re-runs print "no change".
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Summary
- `keeba mcp install --with-output-style`: drops `~/.claude/output-styles/keeba.md`, a terse engineering style that suppresses preamble + tool-result restatement + closing summaries (the three biggest output-token sinks in a typical Claude Code investigation). Activate per-session with `/output-style keeba`.
- The `--with-claude-md` template gains an Answer discipline (output-token savings) subsection: one-find_def-hit-stops, quote-don't-restate, no-preamble, no-closing-summary, conclusion-first.
- … (`--patch-agents --with-claude-md --with-hook --with-output-style`).

Why
Real-world A/B (Slack-thread investigation prompt) showed our existing codec layers ceiling at ~30% session-cost reduction:
L1 (hook codec) + L2 (lean MCP responses) both shrink cache_read (priced ~$0.30/M). But session cost is dominated by output tokens (priced ~$15/M — 50× more) and output didn't shrink. To break the ceiling we have to attack the output side.
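To make the pricing argument concrete, here is a small Go sketch of the arithmetic. The per-million prices are the approximate figures quoted above; the token counts are invented purely for illustration and are not measurements from the A/B run. Other cost components (uncached input, cache writes) are ignored.

```go
package main

import "fmt"

// Approximate prices quoted in this PR, in USD per million tokens.
const (
	cacheReadPerM = 0.30
	outputPerM    = 15.00
)

// sessionCost returns the USD cost of a session, counting only the
// cache_read and output components.
func sessionCost(cacheReadTokens, outputTokens float64) float64 {
	return cacheReadTokens/1e6*cacheReadPerM + outputTokens/1e6*outputPerM
}

func main() {
	// Hypothetical token counts, chosen only to show the shape of the problem.
	const cacheRead, output = 2_000_000.0, 100_000.0

	base := sessionCost(cacheRead, output)
	halfCache := sessionCost(cacheRead/2, output)  // codec layers halve cache_read
	halfOutput := sessionCost(cacheRead, output/2) // output discipline halves output

	fmt.Printf("baseline:        $%.2f\n", base)
	fmt.Printf("half cache_read: $%.2f (%.0f%% saved)\n", halfCache, 100*(1-halfCache/base))
	fmt.Printf("half output:     $%.2f (%.0f%% saved)\n", halfOutput, 100*(1-halfOutput/base))
}
```

With these made-up counts, output is one twentieth of the token volume yet most of the bill, so halving cache_read saves far less than halving output. The exact percentages depend entirely on the real token mix.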
Output tokens come from five places. We control three:
- `No preamble` directive
- `Quote, don't restate`
- `Answer discipline` stop conditions

This PR ships the levers for the three controllable sources. They are opt-in (existing installs unaffected) and idempotent (re-running `keeba mcp install` prints "no change").

Why opt-in, not default
Output styles in Claude Code replace the default engineering system prompt. Mutating the user's `defaultOutputStyle` from an install is too invasive: they may have their own style or rely on the default. Drop the file, tell the user to `/output-style keeba`. They opt in.

What's NOT in this PR
- … (`keeba investigate` runs the loop in our control plane and returns one finished answer for Claude to synthesize). That's a much bigger build; defer until we measure whether output-style + Answer discipline already break the ceiling.
- … `~/.claude/settings.json` `defaultOutputStyle`. Opt-in only.

Test plan
- `go test ./...`: all green
- `gofumpt -l` clean on changed files
- `go vet ./...` clean
- `golangci-lint run ./internal/cli/...`: 0 issues
- `--with-output-style` lands `~/.claude/output-styles/keeba.md` with valid frontmatter (`name: keeba`)
- `--with-claude-md` injects `## Code investigation in keeba-indexed repos` with the new `Answer discipline` subsection containing `NO preamble`, `NO closing summary`, `Quote tool-result rows verbatim`, `50× cache_read`
- … `--with-output-style`, activate `/output-style keeba`, re-run the same Slack-thread prompt, compare cost vs the $2.73 lean-codec baseline. Expected: <$2.00 if the lever is real; ~$2.73 if output bloat is structural and we need lever 4.

🤖 Generated with Claude Code