
Implement spec 33 pipeline vocabulary patterns #7

Merged
bnomei merged 3 commits into main from codex/spec-33-pipeline-vocabulary-patterns
Mar 18, 2026
Conversation


@bnomei bnomei commented Mar 18, 2026

Owner

Summary

  • document the bounded vocabulary-JSON pattern as prompt shaping for refine, including global and contextual examples in the README and sample config
  • add a generic system_prompt_append composition helper so base config, voices, and profile transcript overrides can layer hint blocks without a dedicated vocabulary subsystem
  • add config/refine coverage proving appended prompt fragments resolve into the refine hint path while baseline behavior stays intact
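The layering described above can be sketched as follows. This is a minimal illustration, not the code from this PR: the function names `append_prompt_fragment` and `compose_prompt_text` come from the review walkthrough, but their signatures, the trimming behavior, and the blank-line separator are assumptions.

```rust
/// Hypothetical sketch: add one hint block to an optional accumulator.
/// Assumption: fragments are trimmed, blank fragments are ignored, and
/// fragments are separated by a blank line.
fn append_prompt_fragment(target: &mut Option<String>, fragment: &str) {
    let fragment = fragment.trim();
    if fragment.is_empty() {
        return;
    }
    // Take the accumulated text (or start from an empty string).
    let mut combined = target.take().unwrap_or_default();
    if combined.trim().is_empty() {
        combined.clear();
    } else {
        combined.push_str("\n\n");
    }
    combined.push_str(fragment);
    *target = Some(combined);
}

/// Hypothetical sketch: join the base prompt with the accumulated append text.
fn compose_prompt_text(base: &str, append: Option<&str>) -> String {
    let base = base.trim();
    let append = append.map(str::trim).filter(|s| !s.is_empty());
    match (base.is_empty(), append) {
        (_, None) => base.to_string(),
        (true, Some(a)) => a.to_string(),
        (false, Some(a)) => format!("{}\n\n{}", base, a),
    }
}
```

Under these assumptions, a base config, a voice, and a profile override can each call `append_prompt_fragment` on the same accumulator, and the result is composed once at the end, so no dedicated vocabulary subsystem is needed.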

Validation

  • cargo fmt
  • PATH=/opt/homebrew/bin:$PATH CARGO_HOME=/tmp/muninn-cargo-home cargo test -q config
  • PATH=/opt/homebrew/bin:$PATH CARGO_HOME=/tmp/muninn-cargo-home cargo test -q refine
  • PATH=/opt/homebrew/bin:$PATH CARGO_HOME=/tmp/muninn-cargo-home cargo test -q
  • PATH=/opt/homebrew/bin:$PATH CARGO_HOME=/tmp/muninn-cargo-home cargo clippy -q --all-targets -- -D warnings

Summary by CodeRabbit

  • New Features

    • Optional ability to append bounded vocabulary JSON to system prompts for best-effort vocabulary biasing (disabled by default; no change to default behavior).
  • Documentation

    • Expanded docs and examples showing how to configure, layer, and use system_prompt_append vocabulary hints across transcripts and profiles.


coderabbitai Bot commented Mar 18, 2026


Caution

Review failed

The pull request is closed.

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: e91c8de6-f558-4ba3-bf8c-b7d9d2a96a06

📥 Commits

Reviewing files that changed from the base of the PR and between 94446eb and a48eb1a.

📒 Files selected for processing (1)
  • src/config.rs

Disabled knowledge base sources:

  • Linear integration is disabled

You can enable these sources in your CodeRabbit configuration.


📝 Walkthrough


Adds optional system_prompt_append across configs, validation, composition, and materialization; exposes prompt-fragment helpers; preserves vocabulary JSON blocks in refine prompt construction; updates docs, samples, benchmarks, tests, and task tracking metadata.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| Config core: src/config.rs | Added system_prompt_append: Option<String> to TranscriptConfig, VoiceConfig, and TranscriptOverrides. Added validation helpers (validate_prompt_fragment, append_prompt_fragment, compose_prompt_text), materialization (materialize_system_prompt), and updated apply/validation flows to handle replace vs append semantics. Defaults set to None. |
| Refine prompt construction: src/refine.rs | Extracted user-prompt build into build_refine_user_prompt(hint_prompt, raw_text) and added unit test to ensure vocabulary JSON blocks are preserved and formatting remains correct. |
| Config samples & benches: configs/config.sample.toml, benches/runtime_bottlenecks.rs | Added commented examples showing system_prompt_append usage (vocabulary JSON blocks) in various config sections; benchmark initialization updated to include system_prompt_append: None. |
| Documentation: README.md | Documented system_prompt_append usage and bounded vocabulary JSON structure, clarifying that Muninn forwards the JSON to the refine pass without parsing and how to layer hints. |
| Spec / Tasks: specs/33-pipeline-vocabulary-patterns/tasks.md | Moved tasks T001–T004 from Todo to Done with owner codex and timestamps (2026-03-18T00:00:00Z). |
| Minor formatting: src/stt_deepgram_tool.rs | Whitespace/indentation tweak in an error-handling closure; no behavior change. |
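To make the "Refine prompt construction" row concrete, here is a rough sketch of what a helper with the signature `build_refine_user_prompt(hint_prompt, raw_text)` might do. The section labels `Hints:` and `Transcript:` and the JSON shape in the usage note are hypothetical; only the function name and its two parameters appear in the walkthrough.

```rust
/// Hypothetical sketch: build the user message for the refine pass.
/// The hint prompt is forwarded verbatim (after trimming), so an embedded
/// vocabulary JSON block is never parsed or rewritten.
fn build_refine_user_prompt(hint_prompt: &str, raw_text: &str) -> String {
    let hint = hint_prompt.trim();
    if hint.is_empty() {
        // No hints configured: send only the raw transcript.
        format!("Transcript:\n{}", raw_text)
    } else {
        // Assumed layout: hints first, then the transcript to refine.
        format!("Hints:\n{}\n\nTranscript:\n{}", hint, raw_text)
    }
}
```

A test in the spirit of the one described in the table would pass a hint containing an illustrative block such as `{"vocabulary": ["Muninn", "Deepgram"]}` and assert that the exact same substring appears in the returned prompt.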

Sequence Diagram(s)

sequenceDiagram
    actor User
    participant AppConfig as AppConfig
    participant TranscriptCfg as TranscriptConfig
    participant Resolver as ResolvedBuiltinStepConfig
    participant Refine as Refine Module

    User->>AppConfig: Load config (may include system_prompt_append)
    AppConfig->>TranscriptCfg: Validate system_prompt and system_prompt_append
    TranscriptCfg-->>AppConfig: Validation result
    AppConfig->>Resolver: Build ResolvedBuiltinStepConfig
    Resolver->>TranscriptCfg: materialize_system_prompt()
    TranscriptCfg->>TranscriptCfg: Compose base system_prompt + system_prompt_append
    TranscriptCfg-->>Resolver: Return composed system_prompt (append cleared)
    Resolver-->>AppConfig: Resolved config ready
    AppConfig->>Refine: Provide composed system_prompt
    Refine->>Refine: build_refine_user_prompt(hint_prompt, raw_text)
    Refine-->>User: Send refine request with preserved vocabulary JSON
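The materialization step in the diagram ("Return composed system_prompt (append cleared)") could look roughly like the sketch below. The struct is reduced to the two fields named in the walkthrough; the method body, the separator, and the trimming are assumptions rather than the merged implementation.

```rust
/// Reduced stand-in for the real TranscriptConfig (only the prompt fields).
#[derive(Debug, Clone, Default)]
struct TranscriptConfig {
    system_prompt: String,
    system_prompt_append: Option<String>,
}

impl TranscriptConfig {
    /// Hypothetical sketch: fold the accumulated append text into
    /// `system_prompt` and clear the accumulator, so calling this twice
    /// does not duplicate the hint block.
    fn materialize_system_prompt(&mut self) {
        if let Some(append) = self.system_prompt_append.take() {
            let append = append.trim();
            if !append.is_empty() {
                if !self.system_prompt.trim().is_empty() {
                    self.system_prompt.push_str("\n\n");
                }
                self.system_prompt.push_str(append);
            }
        }
    }
}
```

Clearing the accumulator with `Option::take` keeps the operation idempotent, which matters if both effective-config resolution and built-in step resolution call it.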

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~50 minutes


🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 33.33% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (2 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title accurately summarizes the main change: implementing a specification for pipeline vocabulary patterns. The changeset adds vocabulary JSON pattern support, configuration examples, documentation, and the underlying composition logic across multiple files. |

✏️ Tip: You can configure your own custom pre-merge checks in the settings.



bnomei commented Mar 18, 2026

Owner Author

@Auggie review

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 94446eb185


Comment thread src/config.rs
Comment on lines 722 to +726
    if let Some(system_prompt) = self.system_prompt.as_ref() {
        transcript.system_prompt = system_prompt.clone();
    }
    if let Some(system_prompt_append) = self.system_prompt_append.as_deref() {
        transcript.append_system_prompt(system_prompt_append);


[P2] Clear inherited append blocks on system_prompt replacement

When a later voice/profile switches to a different system_prompt, any previously accumulated system_prompt_append text is left intact until materialize_system_prompt() runs. In configs that set a global vocabulary JSON block and then use a contextual system_prompt as a full replacement, the old append block still gets sent to refine, so a mail/profile-specific prompt can unexpectedly inherit developer-only vocabulary hints from the base config. The same replacement path in TranscriptOverrides::apply_to below has this behavior too.


Comment thread src/config.rs Outdated

    impl TranscriptConfig {
        fn validate(&self, field_prefix: &str) -> Result<(), ConfigValidationError> {
            validate_prompt_fragment(&self.system_prompt, format!("{field_prefix}.system_prompt"))?;


[P2] Allow blank transcript.system_prompt for opt-out configs

This now treats [transcript].system_prompt = "" as a validation error, even though the refine step already has its own built-in system prompt and previously accepted an empty per-project hint. Existing configs that intentionally blank the hint prompt to opt out of extra guidance will now fail AppConfig::load(); for the standalone refine tool, load_refine_config_from_config() will even discard the whole config and fall back to defaults. That is a backward-compatibility regression unrelated to the new append helper.
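One possible way to address this concern is to apply the non-blank check only to the optional append field and leave a blank `system_prompt` legal. The sketch below is an assumption about how that could look, not the fix that was merged; the `ConfigValidationError` fields and the error message are hypothetical, and only the function names and the call shape come from the reviewed snippet.

```rust
/// Hypothetical error type; the real ConfigValidationError is not shown in this PR.
#[derive(Debug, PartialEq)]
struct ConfigValidationError {
    field: String,
    message: String,
}

/// Rejects fragments that are empty or whitespace-only.
fn validate_prompt_fragment(value: &str, field: String) -> Result<(), ConfigValidationError> {
    if value.trim().is_empty() {
        return Err(ConfigValidationError {
            field,
            message: "must not be blank".to_string(),
        });
    }
    Ok(())
}

/// Reduced stand-in for the real TranscriptConfig (only the prompt fields).
struct TranscriptConfig {
    system_prompt: String,
    system_prompt_append: Option<String>,
}

impl TranscriptConfig {
    fn validate(&self, field_prefix: &str) -> Result<(), ConfigValidationError> {
        // Assumption: `system_prompt` is deliberately not checked, so
        // `system_prompt = ""` keeps working as an opt-out.
        if let Some(append) = self.system_prompt_append.as_deref() {
            validate_prompt_fragment(append, format!("{}.system_prompt_append", field_prefix))?;
        }
        Ok(())
    }
}
```

With this shape, an existing config that blanks the hint prompt still loads, while a whitespace-only `system_prompt_append` is reported against its full field path.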



augmentcode Bot commented Mar 18, 2026

🤖 Augment PR Summary

Summary: This PR implements the “spec 33” pipeline vocabulary pattern by letting users layer bounded vocabulary-hint blocks into the existing refine prompt surface.

Changes:

  • Documented a vocabulary-JSON prompt pattern in README.md and configs/config.sample.toml using a new system_prompt_append field.
  • Extended config schema to support system_prompt_append on base transcript config, voices, and profile transcript overrides.
  • Added generic prompt composition utilities (append_prompt_fragment / compose_prompt_text) and prompt-fragment validation.
  • Materialized the composed transcript prompt during effective-config resolution and when building built-in step configs.
  • Added tests proving layered fragments reach refine and that baseline defaults remain unchanged.
  • Refactored OpenAI refine user prompt formatting into build_refine_user_prompt and added a regression test for vocabulary JSON blocks.

Technical Notes: Muninn forwards appended blocks as plain text to refine (best-effort prompt biasing); it does not parse JSON or integrate provider-native vocabulary/adaptation APIs.


@augmentcode augmentcode Bot left a comment


Review completed. 1 suggestion posted.


Comment augment review to trigger a new review at any time.

Comment thread src/config.rs Outdated

    impl TranscriptConfig {
        fn validate(&self, field_prefix: &str) -> Result<(), ConfigValidationError> {
            validate_prompt_fragment(&self.system_prompt, format!("{field_prefix}.system_prompt"))?;


TranscriptConfig::validate now rejects an empty transcript.system_prompt, which seems like it could prevent intentionally disabling refine hints (or using only system_prompt_append), even though other code paths treat an empty prompt as meaningful (e.g., skipping refine context). Was tightening this validation intended to be a breaking config change?

Severity: medium


@augmentcode augmentcode Bot left a comment


Review completed. No suggestions at this time.

Comment augment review to trigger a new review at any time.

@coderabbitai coderabbitai Bot left a comment


Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
src/config.rs (1)

721-727: ⚠️ Potential issue | 🟠 Major

Clear inherited append fragments when system_prompt is replaced.

Line 723 and Line 898 overwrite system_prompt but keep any previously accumulated system_prompt_append. That means a voice/profile “full replacement” still inherits earlier vocabulary blocks, so the final refine prompt can include stale hints from the base config or prior voice layer.

Suggested fix
 impl TranscriptConfig {
+    fn replace_system_prompt(&mut self, prompt: &str) {
+        self.system_prompt = prompt.to_string();
+        self.system_prompt_append = None;
+    }
+
     fn append_system_prompt(&mut self, fragment: &str) {
         append_prompt_fragment(&mut self.system_prompt_append, fragment);
     }
 }
@@
     fn apply_to(&self, transcript: &mut TranscriptConfig, refine: &mut RefineConfig) {
         if let Some(system_prompt) = self.system_prompt.as_ref() {
-            transcript.system_prompt = system_prompt.clone();
+            transcript.replace_system_prompt(system_prompt);
         }
         if let Some(system_prompt_append) = self.system_prompt_append.as_deref() {
             transcript.append_system_prompt(system_prompt_append);
         }
@@
     fn apply_to(&self, transcript: &mut TranscriptConfig) {
         if let Some(system_prompt) = self.system_prompt.as_ref() {
-            transcript.system_prompt = system_prompt.clone();
+            transcript.replace_system_prompt(system_prompt);
         }
         if let Some(system_prompt_append) = self.system_prompt_append.as_deref() {
             transcript.append_system_prompt(system_prompt_append);
         }
     }

Please add a regression test for “base append + overriding system_prompt” at both the voice and profile layers.
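The requested regression scenario can be sketched in isolation as below. `replace_system_prompt` and `append_system_prompt` follow the suggested diff; `PromptLayer`, `composed`, and `resolve` are hypothetical helpers introduced only for this illustration, with `PromptLayer` standing in for both the voice and the profile layer.

```rust
/// Reduced stand-in for the real TranscriptConfig (only the prompt fields).
#[derive(Debug, Clone, Default)]
struct TranscriptConfig {
    system_prompt: String,
    system_prompt_append: Option<String>,
}

impl TranscriptConfig {
    /// As in the suggested fix: a full replacement drops inherited appends.
    fn replace_system_prompt(&mut self, prompt: &str) {
        self.system_prompt = prompt.to_string();
        self.system_prompt_append = None;
    }

    /// Assumed accumulation rule: fragments joined by a blank line.
    fn append_system_prompt(&mut self, fragment: &str) {
        let acc = self.system_prompt_append.get_or_insert_with(String::new);
        if !acc.is_empty() {
            acc.push_str("\n\n");
        }
        acc.push_str(fragment);
    }

    /// Hypothetical helper: the prompt text that would reach refine.
    fn composed(&self) -> String {
        match self.system_prompt_append.as_deref() {
            Some(a) if !a.is_empty() => format!("{}\n\n{}", self.system_prompt, a),
            _ => self.system_prompt.clone(),
        }
    }
}

/// Hypothetical stand-in for a voice or profile override layer.
#[derive(Debug, Clone, Default)]
struct PromptLayer {
    system_prompt: Option<String>,
    system_prompt_append: Option<String>,
}

impl PromptLayer {
    fn apply_to(&self, transcript: &mut TranscriptConfig) {
        if let Some(prompt) = self.system_prompt.as_deref() {
            transcript.replace_system_prompt(prompt);
        }
        if let Some(fragment) = self.system_prompt_append.as_deref() {
            transcript.append_system_prompt(fragment);
        }
    }
}

/// Applies layers in order on a copy of the base config.
fn resolve(base: &TranscriptConfig, layers: &[PromptLayer]) -> String {
    let mut transcript = base.clone();
    for layer in layers {
        layer.apply_to(&mut transcript);
    }
    transcript.composed()
}
```

A regression test built on this shape would start from a base config that carries an append block, apply a layer that sets only `system_prompt`, and assert that the resolved prompt equals the replacement text alone. Because the append is applied after the replacement inside `apply_to`, a layer that sets both fields still keeps its own fragment.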

Also applies to: 896-902

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/config.rs` around lines 721 - 727, When replacing system_prompt in
apply_to, clear any accumulated append fragments so a full replacement does not
inherit prior appends: in apply_to (the block that checks
self.system_prompt.as_ref()) after setting transcript.system_prompt =
system_prompt.clone(), reset the transcript's system-prompt-append storage
(e.g., clear or set to empty the field used by transcript.append_system_prompt);
do the same for the analogous replacement block that handles
RefineConfig/profile (the other block referenced around lines 896–902). Also add
regression tests that create a base config with system_prompt_append, then apply
a voice-layer and a profile-layer config that sets system_prompt (full
replacement) and assert that no prior append fragments remain in the final
refine/system prompt.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 4b0aaa57-6ba7-4967-afc4-e0bf5922bc3c

📥 Commits

Reviewing files that changed from the base of the PR and between ba7b3da and 94446eb.

📒 Files selected for processing (7)
  • README.md
  • benches/runtime_bottlenecks.rs
  • configs/config.sample.toml
  • specs/33-pipeline-vocabulary-patterns/tasks.md
  • src/config.rs
  • src/refine.rs
  • src/stt_deepgram_tool.rs

@bnomei bnomei merged commit 0368d06 into main Mar 18, 2026
4 of 5 checks passed
@bnomei bnomei deleted the codex/spec-33-pipeline-vocabulary-patterns branch March 18, 2026 10:01