Implement spec 33 pipeline vocabulary patterns by bnomei · Pull Request #7 · bnomei/muninn

bnomei · 2026-03-18T09:50:13Z

Summary

- Document the bounded vocabulary-JSON pattern as prompt shaping for refine, including global and contextual examples in the README and sample config.
- Add a generic `system_prompt_append` composition helper so base config, voices, and profile transcript overrides can layer hint blocks without a dedicated vocabulary subsystem.
- Add config/refine coverage proving appended prompt fragments resolve into the refine hint path while baseline behavior stays intact.

Validation

cargo fmt
PATH=/opt/homebrew/bin:$PATH CARGO_HOME=/tmp/muninn-cargo-home cargo test -q config
PATH=/opt/homebrew/bin:$PATH CARGO_HOME=/tmp/muninn-cargo-home cargo test -q refine
PATH=/opt/homebrew/bin:$PATH CARGO_HOME=/tmp/muninn-cargo-home cargo test -q
PATH=/opt/homebrew/bin:$PATH CARGO_HOME=/tmp/muninn-cargo-home cargo clippy -q --all-targets -- -D warnings

Summary by CodeRabbit

New Features
- Optional ability to append bounded vocabulary JSON to system prompts for best-effort vocabulary biasing (disabled by default; no change to default behavior).
Documentation
- Expanded docs and examples showing how to configure, layer, and use system_prompt_append vocabulary hints across transcripts and profiles.

coderabbitai · 2026-03-18T09:50:34Z

Caution

Review failed

The pull request is closed.

ℹ️ Recent review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: e91c8de6-f558-4ba3-bf8c-b7d9d2a96a06

📥 Commits

Reviewing files that changed from the base of the PR and between 94446eb and a48eb1a.

📒 Files selected for processing (1)

src/config.rs

📝 Walkthrough

Adds optional system_prompt_append across configs, validation, composition, and materialization; exposes prompt-fragment helpers; preserves vocabulary JSON blocks in refine prompt construction; updates docs, samples, benchmarks, tests, and task tracking metadata.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| Config core `src/config.rs` | Added `system_prompt_append: Option<String>` to `TranscriptConfig`, `VoiceConfig`, and `TranscriptOverrides`. Added validation helpers (`validate_prompt_fragment`, `append_prompt_fragment`, `compose_prompt_text`), materialization (`materialize_system_prompt`), and updated apply/validation flows to handle replace vs append semantics. Defaults set to `None`. |
| Refine prompt construction `src/refine.rs` | Extracted user-prompt build into `build_refine_user_prompt(hint_prompt, raw_text)` and added unit test to ensure vocabulary JSON blocks are preserved and formatting remains correct. |
| Config samples & benches `configs/config.sample.toml`, `benches/runtime_bottlenecks.rs` | Added commented examples showing `system_prompt_append` usage (vocabulary JSON blocks) in various config sections; benchmark initialization updated to include `system_prompt_append: None`. |
| Documentation `README.md` | Documented `system_prompt_append` usage and bounded vocabulary JSON structure, clarifying that Muninn forwards the JSON to the refine pass without parsing and how to layer hints. |
| Spec / Tasks `specs/33-pipeline-vocabulary-patterns/tasks.md` | Moved tasks T001–T004 from Todo to Done with owner `codex` and timestamps (2026-03-18T00:00:00Z). |
| Minor formatting `src/stt_deepgram_tool.rs` | Whitespace/indentation tweak in an error-handling closure; no behavior change. |
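
The replace-versus-append semantics summarized in the table can be illustrated with a minimal sketch. The helper names come from the walkthrough, but the signature of `compose_prompt_text`, the blank-line separator, and the trimming behavior are assumptions made for illustration, not the code in `src/config.rs`.

```rust
/// Accumulates `fragment` into an optional append block.
/// Assumption: fragments are trimmed and joined with one blank line.
fn append_prompt_fragment(target: &mut Option<String>, fragment: &str) {
    let fragment = fragment.trim();
    if fragment.is_empty() {
        return; // nothing to add, leave the accumulated block untouched
    }
    let block = target.get_or_insert_with(String::new);
    if !block.is_empty() {
        block.push_str("\n\n");
    }
    block.push_str(fragment);
}

/// Joins the base prompt and the accumulated append block into one string.
/// Assumption: an empty base prompt yields just the append block.
fn compose_prompt_text(base: &str, append: Option<&str>) -> String {
    match append.map(str::trim).filter(|s| !s.is_empty()) {
        Some(extra) if base.trim().is_empty() => extra.to_string(),
        Some(extra) => format!("{}\n\n{}", base.trim_end(), extra),
        None => base.to_string(),
    }
}
```

Under these assumptions, each layer (base config, voice, profile) would call `append_prompt_fragment` in order, and the final prompt would be produced once by `compose_prompt_text`.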

Sequence Diagram(s)

sequenceDiagram
    actor User
    participant AppConfig as AppConfig
    participant TranscriptCfg as TranscriptConfig
    participant Resolver as ResolvedBuiltinStepConfig
    participant Refine as Refine Module

    User->>AppConfig: Load config (may include system_prompt_append)
    AppConfig->>TranscriptCfg: Validate system_prompt and system_prompt_append
    TranscriptCfg-->>AppConfig: Validation result
    AppConfig->>Resolver: Build ResolvedBuiltinStepConfig
    Resolver->>TranscriptCfg: materialize_system_prompt()
    TranscriptCfg->>TranscriptCfg: Compose base system_prompt + system_prompt_append
    TranscriptCfg-->>Resolver: Return composed system_prompt (append cleared)
    Resolver-->>AppConfig: Resolved config ready
    AppConfig->>Refine: Provide composed system_prompt
    Refine->>Refine: build_refine_user_prompt(hint_prompt, raw_text)
    Refine-->>User: Send refine request with preserved vocabulary JSON
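
The materialization step in the diagram might look roughly like the following. The struct is a two-field stand-in for the real `TranscriptConfig`, and the blank-line join is an assumption; only the "compose, then clear the append block" behavior is taken from the diagram.

```rust
/// Two-field stand-in for the real `TranscriptConfig` (hypothetical subset).
#[derive(Debug, Clone, Default)]
struct TranscriptConfig {
    system_prompt: String,
    system_prompt_append: Option<String>,
}

impl TranscriptConfig {
    /// Folds the accumulated append block into `system_prompt` and clears it.
    /// `take()` leaves `None` behind, so a second call is a no-op.
    fn materialize_system_prompt(&mut self) {
        if let Some(extra) = self.system_prompt_append.take() {
            if self.system_prompt.is_empty() {
                self.system_prompt = extra;
            } else {
                // Assumption: base and append are separated by a blank line.
                self.system_prompt.push_str("\n\n");
                self.system_prompt.push_str(&extra);
            }
        }
    }
}
```

Clearing the append field on materialization keeps the operation idempotent, which matters if both effective-config resolution and built-in step construction call it.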

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~50 minutes

Possibly related PRs

Implement spec 33 pipeline vocabulary patterns #7 — Implements the spec-33 vocabulary append feature: adds system_prompt_append fields, validation/materialization logic, and refine prompt helper to preserve appended vocabulary JSON blocks.

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 33.33% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (2 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title accurately summarizes the main change: implementing a specification for pipeline vocabulary patterns. The changeset adds vocabulary JSON pattern support, configuration examples, documentation, and the underlying composition logic across multiple files. |

bnomei · 2026-03-18T09:52:42Z

@Auggie review

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 94446eb185

chatgpt-codex-connector · 2026-03-18T09:54:43Z

        if let Some(system_prompt) = self.system_prompt.as_ref() {
            transcript.system_prompt = system_prompt.clone();
        }
+        if let Some(system_prompt_append) = self.system_prompt_append.as_deref() {
+            transcript.append_system_prompt(system_prompt_append);


Clear inherited append blocks on system_prompt replacement

When a later voice/profile switches to a different system_prompt, any previously accumulated system_prompt_append text is left intact until materialize_system_prompt() runs. In configs that set a global vocabulary JSON block and then use a contextual system_prompt as a full replacement, the old append block still gets sent to refine, so a mail/profile-specific prompt can unexpectedly inherit developer-only vocabulary hints from the base config. The same replacement path in TranscriptOverrides::apply_to below has this behavior too.

chatgpt-codex-connector · 2026-03-18T09:54:43Z


+impl TranscriptConfig {
+    fn validate(&self, field_prefix: &str) -> Result<(), ConfigValidationError> {
+        validate_prompt_fragment(&self.system_prompt, format!("{field_prefix}.system_prompt"))?;


Allow blank transcript.system_prompt for opt-out configs

This now treats [transcript].system_prompt = "" as a validation error, even though the refine step already has its own built-in system prompt and previously accepted an empty per-project hint. Existing configs that intentionally blank the hint prompt to opt out of extra guidance will now fail AppConfig::load(); for the standalone refine tool, load_refine_config_from_config() will even discard the whole config and fall back to defaults. That is a backward-compatibility regression unrelated to the new append helper.
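
One possible way to keep the opt-out working is to validate only the append fragment and accept an empty base prompt. The sketch below is an assumption about how that could look, not the fix that landed: `ConfigValidationError` is a hypothetical stand-in with made-up fields, and rejecting a blank `system_prompt_append` is a design choice assumed here.

```rust
/// Hypothetical stand-in for the crate's error type; the real fields are unknown.
#[derive(Debug, PartialEq)]
struct ConfigValidationError {
    field: String,
    message: String,
}

/// Reduced stand-in for the real `TranscriptConfig`.
struct TranscriptConfig {
    system_prompt: String,
    system_prompt_append: Option<String>,
}

/// Rejects fragments that are empty or whitespace-only.
fn validate_prompt_fragment(value: &str, field: String) -> Result<(), ConfigValidationError> {
    if value.trim().is_empty() {
        return Err(ConfigValidationError {
            field,
            message: "must not be blank".to_string(),
        });
    }
    Ok(())
}

impl TranscriptConfig {
    fn validate(&self, field_prefix: &str) -> Result<(), ConfigValidationError> {
        // The base prompt is deliberately not checked: `system_prompt = ""`
        // stays a valid way to opt out of extra refine guidance.
        let _ = &self.system_prompt;
        if let Some(append) = self.system_prompt_append.as_deref() {
            validate_prompt_fragment(append, format!("{field_prefix}.system_prompt_append"))?;
        }
        Ok(())
    }
}
```

With this shape, an existing config that blanks the hint prompt would still load, while an explicitly configured but blank append block would be reported against its own field path.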

augmentcode · 2026-03-18T09:55:28Z

🤖 Augment PR Summary

Summary: This PR implements the “spec 33” pipeline vocabulary pattern by letting users layer bounded vocabulary-hint blocks into the existing refine prompt surface.

Changes:

- Documented a vocabulary-JSON prompt pattern in README.md and configs/config.sample.toml using a new system_prompt_append field.
- Extended config schema to support system_prompt_append on base transcript config, voices, and profile transcript overrides.
- Added generic prompt composition utilities (append_prompt_fragment / compose_prompt_text) and prompt-fragment validation.
- Materialized the composed transcript prompt during effective-config resolution and when building built-in step configs.
- Added tests proving layered fragments reach refine and that baseline defaults remain unchanged.
- Refactored OpenAI refine user prompt formatting into build_refine_user_prompt and added a regression test for vocabulary JSON blocks.

Technical Notes: Muninn forwards appended blocks as plain text to refine (best-effort prompt biasing); it does not parse JSON or integrate provider-native vocabulary/adaptation APIs.
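
The "plain text, no parsing" behavior described in the technical notes can be sketched as follows. The function name and parameter order follow the walkthrough; the section labels (`Context hints:`, `Transcript:`), the empty-hint handling, and the sample vocabulary JSON are assumptions for illustration only.

```rust
/// Builds the refine user prompt from an optional hint block and the raw
/// transcript. The hint is treated as opaque text: any vocabulary JSON inside
/// it is forwarded verbatim and never parsed.
fn build_refine_user_prompt(hint_prompt: &str, raw_text: &str) -> String {
    let hint = hint_prompt.trim();
    if hint.is_empty() {
        // Assumption: with no hint, the context section is skipped entirely.
        return format!("Transcript:\n{}", raw_text);
    }
    format!("Context hints:\n{}\n\nTranscript:\n{}", hint, raw_text)
}
```

Because the hint is only trimmed at its edges, braces, quotes, and inner whitespace of an appended JSON block reach the model unchanged.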

augmentcode

Review completed. 1 suggestion posted.

augmentcode · 2026-03-18T09:55:40Z


+impl TranscriptConfig {
+    fn validate(&self, field_prefix: &str) -> Result<(), ConfigValidationError> {
+        validate_prompt_fragment(&self.system_prompt, format!("{field_prefix}.system_prompt"))?;


TranscriptConfig::validate now rejects an empty transcript.system_prompt, which seems like it could prevent intentionally disabling refine hints (or using only system_prompt_append), even though other code paths treat an empty prompt as meaningful (e.g., skipping refine context). Was tightening this validation intended to be a breaking config change?

Severity: medium

augmentcode

Review completed. No suggestions at this time.

coderabbitai

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)

src/config.rs (1)

721-727: ⚠️ Potential issue | 🟠 Major

Clear inherited append fragments when system_prompt is replaced.

Line 723 and Line 898 overwrite system_prompt but keep any previously accumulated system_prompt_append. That means a voice/profile “full replacement” still inherits earlier vocabulary blocks, so the final refine prompt can include stale hints from the base config or prior voice layer.

Suggested fix

 impl TranscriptConfig {
+    fn replace_system_prompt(&mut self, prompt: &str) {
+        self.system_prompt = prompt.to_string();
+        self.system_prompt_append = None;
+    }
+
     fn append_system_prompt(&mut self, fragment: &str) {
         append_prompt_fragment(&mut self.system_prompt_append, fragment);
     }
 }
@@
     fn apply_to(&self, transcript: &mut TranscriptConfig, refine: &mut RefineConfig) {
         if let Some(system_prompt) = self.system_prompt.as_ref() {
-            transcript.system_prompt = system_prompt.clone();
+            transcript.replace_system_prompt(system_prompt);
         }
         if let Some(system_prompt_append) = self.system_prompt_append.as_deref() {
             transcript.append_system_prompt(system_prompt_append);
         }
@@
     fn apply_to(&self, transcript: &mut TranscriptConfig) {
         if let Some(system_prompt) = self.system_prompt.as_ref() {
-            transcript.system_prompt = system_prompt.clone();
+            transcript.replace_system_prompt(system_prompt);
         }
         if let Some(system_prompt_append) = self.system_prompt_append.as_deref() {
             transcript.append_system_prompt(system_prompt_append);
         }
     }

Please add a regression test for “base append + overriding system_prompt” at both the voice and profile layers.
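
A regression test along these lines could be sketched as below. `PromptOverride` is a hypothetical stand-in for both `VoiceConfig` and `TranscriptOverrides`, `final_prompt` is an illustrative helper, and the blank-line join is assumed; only `replace_system_prompt` and `append_system_prompt` mirror the suggested fix above.

```rust
#[derive(Debug, Clone, Default)]
struct TranscriptConfig {
    system_prompt: String,
    system_prompt_append: Option<String>,
}

impl TranscriptConfig {
    /// Full replacement: drops any append block inherited from earlier layers.
    fn replace_system_prompt(&mut self, prompt: &str) {
        self.system_prompt = prompt.to_string();
        self.system_prompt_append = None;
    }

    fn append_system_prompt(&mut self, fragment: &str) {
        let block = self.system_prompt_append.get_or_insert_with(String::new);
        if !block.is_empty() {
            block.push_str("\n\n");
        }
        block.push_str(fragment);
    }
}

/// Hypothetical stand-in for a voice or profile layer.
#[derive(Debug, Clone, Default)]
struct PromptOverride {
    system_prompt: Option<String>,
    system_prompt_append: Option<String>,
}

impl PromptOverride {
    fn apply_to(&self, transcript: &mut TranscriptConfig) {
        if let Some(prompt) = self.system_prompt.as_deref() {
            transcript.replace_system_prompt(prompt);
        }
        if let Some(fragment) = self.system_prompt_append.as_deref() {
            transcript.append_system_prompt(fragment);
        }
    }
}

/// Illustrative helper: the prompt text that would reach refine.
fn final_prompt(transcript: &TranscriptConfig) -> String {
    match transcript.system_prompt_append.as_deref() {
        Some(extra) => format!("{}\n\n{}", transcript.system_prompt, extra),
        None => transcript.system_prompt.clone(),
    }
}

#[test]
fn replacement_drops_inherited_append() {
    let mut transcript = TranscriptConfig {
        system_prompt: "Base".to_string(),
        system_prompt_append: Some("DEV_VOCAB".to_string()),
    };
    let voice = PromptOverride {
        system_prompt: Some("Mail prompt".to_string()),
        system_prompt_append: None,
    };
    voice.apply_to(&mut transcript);
    assert_eq!(final_prompt(&transcript), "Mail prompt");
}
```

The same assertion would be repeated with the override applied at the profile layer, so both replacement paths are covered.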

Also applies to: 896-902

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@src/config.rs` around lines 721 - 727, When replacing system_prompt in
apply_to, clear any accumulated append fragments so a full replacement does not
inherit prior appends: in apply_to (the block that checks
self.system_prompt.as_ref()) after setting transcript.system_prompt =
system_prompt.clone(), reset the transcript's system-prompt-append storage
(e.g., clear or set to empty the field used by transcript.append_system_prompt);
do the same for the analogous replacement block that handles
RefineConfig/profile (the other block referenced around lines 896–902). Also add
regression tests that create a base config with system_prompt_append, then apply
a voice-layer and a profile-layer config that sets system_prompt (full
replacement) and assert that no prior append fragments remain in the final
refine/system prompt.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 4b0aaa57-6ba7-4967-afc4-e0bf5922bc3c

📥 Commits

Reviewing files that changed from the base of the PR and between ba7b3da and 94446eb.

📒 Files selected for processing (7)

README.md
benches/runtime_bottlenecks.rs
configs/config.sample.toml
specs/33-pipeline-vocabulary-patterns/tasks.md
src/config.rs
src/refine.rs
src/stt_deepgram_tool.rs

bnomei added 2 commits March 18, 2026 09:49

Implement spec 33 pipeline vocabulary patterns

b1c75eb

Normalize Deepgram tool formatting

94446eb

chatgpt-codex-connector Bot reviewed Mar 18, 2026

augmentcode Bot reviewed Mar 18, 2026

coderabbitai Bot reviewed Mar 18, 2026

Address spec 33 review feedback

a48eb1a

bnomei merged commit 0368d06 into main Mar 18, 2026
4 of 5 checks passed

bnomei deleted the codex/spec-33-pipeline-vocabulary-patterns branch March 18, 2026 10:01
