fix(plan): drop low-relevance citations + word-boundary snippets #645
Co-Authored-By: SageOx <ox@sageox.ai> SageOx-Session: https://sageox.ai/repo/repo_019c5812-01e9-7b7d-b5b1-321c471c9777/sessions/2026-06-05T01-33-ryan-OxBJh7/view
What broke
A rendered enriched plan (`/ox-plan`) surfaced low-quality citations: snippets clipped mid-word at both ends ("...HLS retry semantics", "...ive intelligence_1"), an unfilled ox starter template (`glossary.md`: "Define your team's critical domain-specific terms…") cited as a DISCUSSION, and the raw `context[]` bundle pasted verbatim as a flat card list with no provenance affordance. Investigation confirmed the "garbage" was real ledger content, badly retrieved and badly rendered, not a test fixture (e.g. `scribe-y7r8` is a real beads issue ID), so the fix is relevance + snippet quality + render discipline, not a blocklist.

What this PR ships
Two independent layers — stop the noise at the source (CLI), and stop dumping it (render spec).
A — CLI retrieval quality
- `internal/plan/context_bundle.go`: `minBundleScore = 0.55` applied in `rankAndCap`. Drops single-token team-doc matches (~0.43) and barely-relevant sessions (~0.48); a topic-only murmur (0.60) and solid session/doc hits still clear.
- `internal/teamdocs/discover.go`, `frontmatter.go`: skip unedited ox starter docs via a `template: true` frontmatter flag (forward-compatible) and body sentinel detection. Guarded so a filled-in doc survives.
- `internal/ledgersearch/ledgersearch.go`: `snippetAround` now snaps both edges to whole-word boundaries; cutting only at ASCII whitespace is inherently UTF-8 safe, with a rune-aligned fallback for over-long single tokens.
- `rankAndCap` collapses duplicate `(kind, ref)` items, keeping the highest score.

B — render spec (`extensions/claude/commands/ox-plan.md`)

- A rule against pasting the `context[]` bundle verbatim, plus a worked bundle-item to cited-badge example. The bundle is reasoning input, not render output.
- A provenance affordance, `file://`-safe and theme-aware, showing source kind, author, when, cleaned snippet, and resolved link.

Test Plan
- Tests in `internal/plan`, `internal/ledgersearch`, `internal/teamdocs`: duplicate `(kind, ref)` collapsed to highest score; `template: true` excluded; filled glossary survives.
- `make lint` → 0 issues.
- `make format` clean (unrelated repo-wide gofmt drift reverted to keep this PR focused).

Checklist (expand if needed)
- `/security-review`: N/A (no `internal/auth/`, `internal/mcp/`, `raw_writer.go`, `prepush_scan.go`, `internal/upgrade/`, or `go.sum` touched)
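
For reviewers who want a concrete picture of the relevance floor and duplicate collapse from section A, here is a minimal sketch. It is not the code in `internal/plan/context_bundle.go`: the `Item` type, the kind names, the `limit` parameter, and the tie-break ordering are assumptions made for illustration. Only `minBundleScore = 0.55`, the name `rankAndCap`, and the example scores come from the description above.

```go
package main

import (
	"fmt"
	"sort"
)

// Item is a hypothetical stand-in for one context bundle entry.
type Item struct {
	Kind  string
	Ref   string
	Score float64
}

// minBundleScore mirrors the 0.55 floor named in the PR description.
const minBundleScore = 0.55

// rankAndCap (illustrative) drops items below the floor, collapses duplicate
// (Kind, Ref) pairs to the highest-scoring one, sorts by descending score,
// and caps the result at limit items.
func rankAndCap(items []Item, limit int) []Item {
	type key struct{ kind, ref string }
	best := make(map[key]Item)
	for _, it := range items {
		if it.Score < minBundleScore {
			continue // below the relevance floor
		}
		k := key{it.Kind, it.Ref}
		if cur, ok := best[k]; !ok || it.Score > cur.Score {
			best[k] = it // keep the highest score per (kind, ref)
		}
	}
	out := make([]Item, 0, len(best))
	for _, it := range best {
		out = append(out, it)
	}
	// Sort by score, then by kind and ref so the order is deterministic.
	sort.Slice(out, func(i, j int) bool {
		if out[i].Score != out[j].Score {
			return out[i].Score > out[j].Score
		}
		if out[i].Kind != out[j].Kind {
			return out[i].Kind < out[j].Kind
		}
		return out[i].Ref < out[j].Ref
	})
	if limit >= 0 && len(out) > limit {
		out = out[:limit]
	}
	return out
}

func main() {
	items := []Item{
		{"team-doc", "glossary.md", 0.43}, // single-token match: dropped
		{"session", "s1", 0.48},           // barely relevant: dropped
		{"murmur", "m1", 0.60},            // topic-only murmur: kept
		{"session", "s2", 0.72},           // duplicate, lower score
		{"session", "s2", 0.81},           // duplicate, kept
	}
	for _, it := range rankAndCap(items, 10) {
		fmt.Printf("%s %s %.2f\n", it.Kind, it.Ref, it.Score)
	}
}
```

With these inputs the sketch keeps two items: the `s2` session at 0.81 and the murmur at 0.60. Whether the real floor is inclusive or exclusive at exactly 0.55 is not stated above; the sketch keeps items at exactly 0.55.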
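
The word-boundary snapping described for `snippetAround` can be sketched as below. This is an approximation under stated assumptions, not the function in `internal/ledgersearch/ledgersearch.go`: the signature `(text, pos, radius)`, the byte-based window, and the `isASCIISpace` helper are hypothetical. It illustrates the two points made above. ASCII whitespace bytes never occur inside a multi-byte UTF-8 sequence, so cutting there is safe. When a window sits inside one over-long token, the edges fall back to the nearest rune start.

```go
package main

import (
	"fmt"
	"strings"
	"unicode/utf8"
)

// isASCIISpace is a hypothetical helper; it reports ASCII whitespace only.
func isASCIISpace(b byte) bool {
	return b == ' ' || b == '\t' || b == '\n' || b == '\r'
}

// snippetAround (illustrative) returns text near byte offset pos, with both
// edges snapped inward to whole-word boundaries. pos is assumed to be a
// rune-aligned match offset; radius is a byte count on each side.
func snippetAround(text string, pos, radius int) string {
	if pos < 0 {
		pos = 0
	}
	if pos > len(text) {
		pos = len(text)
	}
	start, end := pos-radius, pos+radius
	if start < 0 {
		start = 0
	}
	if end > len(text) {
		end = len(text)
	}
	// Left edge: if the window opens mid-word, advance to the next whitespace.
	if start > 0 && !isASCIISpace(text[start-1]) {
		s := start
		for s < pos && !isASCIISpace(text[s]) {
			s++
		}
		if s < pos {
			start = s
		} else {
			// No whitespace before pos: over-long token, align to a rune start.
			for start < pos && !utf8.RuneStart(text[start]) {
				start++
			}
		}
	}
	// Right edge: if the window closes mid-word, back up to the previous whitespace.
	if end < len(text) && !isASCIISpace(text[end]) {
		e := end
		for e > pos && !isASCIISpace(text[e-1]) {
			e--
		}
		if e > pos {
			end = e
		} else {
			// No whitespace after pos: over-long token, align to a rune start.
			for end > pos && !utf8.RuneStart(text[end]) {
				end--
			}
		}
	}
	return strings.TrimSpace(text[start:end])
}

func main() {
	text := "the player applies HLS retry semantics to adaptive intelligence streams"
	pos := strings.Index(text, "retry")
	fmt.Printf("%q\n", snippetAround(text, pos, 10))
}
```

A naive byte window of radius 10 around "retry" would yield "lies HLS retry seman"; the sketch returns "HLS retry" instead. The trade-off is that snapping inward shortens snippets slightly rather than widening them past the requested radius.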
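
The starter-doc skip (a `template: true` frontmatter flag plus body sentinel detection, guarded so a filled-in doc survives) might look roughly like the following. Everything here is a sketch under assumptions: `isStarterDoc`, `splitFrontmatter`, `hasTemplateFlag`, and the `starterSentinels` list are hypothetical names. The single sentinel is the truncated phrase quoted in "What broke". The guard shown, any non-heading line that is not a sentinel counts as real content, is one plausible reading of "guarded" and not necessarily what `internal/teamdocs` does. Frontmatter is scanned line by line to stay stdlib-only; a real implementation would more likely use a YAML parser.

```go
package main

import (
	"fmt"
	"strings"
)

// starterSentinels holds placeholder phrases that mark an unedited starter doc.
// Hypothetical list: the one entry is the prefix quoted in the PR description.
var starterSentinels = []string{
	"Define your team's critical domain-specific terms",
}

// splitFrontmatter separates a leading "---" delimited block from the body.
func splitFrontmatter(doc string) (front, body string) {
	lines := strings.Split(doc, "\n")
	if strings.TrimSpace(lines[0]) != "---" {
		return "", doc
	}
	for i := 1; i < len(lines); i++ {
		if strings.TrimSpace(lines[i]) == "---" {
			return strings.Join(lines[1:i], "\n"), strings.Join(lines[i+1:], "\n")
		}
	}
	return "", doc // unterminated frontmatter: treat everything as body
}

// hasTemplateFlag reports whether the frontmatter contains "template: true".
func hasTemplateFlag(front string) bool {
	for _, line := range strings.Split(front, "\n") {
		k, v, ok := strings.Cut(line, ":")
		if ok && strings.TrimSpace(k) == "template" && strings.TrimSpace(v) == "true" {
			return true
		}
	}
	return false
}

// isSentinel reports whether a line contains a known starter phrase.
func isSentinel(line string) bool {
	for _, s := range starterSentinels {
		if strings.Contains(line, s) {
			return true
		}
	}
	return false
}

// isStarterDoc (illustrative) is true for docs flagged as templates, or for
// docs whose body holds a starter sentinel and no other substantive line.
func isStarterDoc(doc string) bool {
	front, body := splitFrontmatter(doc)
	if hasTemplateFlag(front) {
		return true
	}
	sawSentinel := false
	for _, line := range strings.Split(body, "\n") {
		line = strings.TrimSpace(line)
		if line == "" || strings.HasPrefix(line, "#") {
			continue // blank lines and headings carry no content
		}
		if isSentinel(line) {
			sawSentinel = true
			continue
		}
		return false // real content: a filled-in doc survives
	}
	return sawSentinel
}

func main() {
	unedited := "# Glossary\n\nDefine your team's critical domain-specific terms here.\n"
	filled := unedited + "\nHLS: HTTP Live Streaming.\n"
	fmt.Println(isStarterDoc(unedited), isStarterDoc(filled))
}
```

In this sketch the frontmatter flag wins outright, so a doc that is filled in but still carries `template: true` would be skipped; whether the shipped code treats that case the same way is not stated in the description.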