feat: add stale label automation for closed bugs#8
Conversation
WalkthroughAdds stale Jira issue detection to the triage bot. New stale-label configuration is wired through Helm, env bindings, and scanner config. The scanner now runs a stale scan each cycle, builds stale JQL from progression labels, and relabels matching Jira issues with test coverage. ChangesStale Issue Scan Feature
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes Suggested labels
Suggested reviewers
Poem
Important Pre-merge checks failedPlease resolve all errors before merging. Addressing warnings is optional. ❌ Failed checks (1 error, 1 warning)
✅ Passed checks (11 passed)
✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
adalton
left a comment
There was a problem hiding this comment.
Review: feature makes sense in triage-bot (label hygiene is a triage concern, not an autofix concern). One bug to fix before merge, plus design feedback.
| autofixLabel := s.cfg.Triage.AutoFixLabel | ||
| staleLabel := s.cfg.Triage.StaleLabel | ||
|
|
||
| allExcluded := append(s.cfg.Triage.ProgressionLabels, staleLabel) |
There was a problem hiding this comment.
Bug: append may mutate the config slice.
append(s.cfg.Triage.ProgressionLabels, staleLabel) will modify the underlying array of ProgressionLabels if it has spare capacity. Since buildStaleJQL() is called on every tick, the second call could see a corrupted config.
Safe alternative:
allExcluded := make([]string, len(s.cfg.Triage.ProgressionLabels)+1)
copy(allExcluded, s.cfg.Triage.ProgressionLabels)
allExcluded[len(allExcluded)-1] = staleLabelOr with Go 1.21+: allExcluded := slices.Concat(s.cfg.Triage.ProgressionLabels, []string{staleLabel})
There was a problem hiding this comment.
Fixed — using slices.Concat(s.cfg.Triage.ProgressionLabels, []string{staleLabel}) which always allocates a new slice. Go 1.26 so slices is available.
| wg.Wait() | ||
| } | ||
|
|
||
| func (s *Scanner) scanStale(ctx context.Context) { |
There was a problem hiding this comment.
Design: scanStale bypasses the InFlight semaphore.
The main scan() method respects ai.max_concurrent via the semaphore and worker pool. scanStale() issues Jira API calls (search + label add/remove) sequentially in the scanner goroutine with no concurrency bounding.
This is probably fine since these are lightweight label mutations (not AI invocations), but it blocks the next scan() tick if there are many stale issues. Worth a short comment explaining why the semaphore is intentionally not used here, so a future reader doesn't "fix" it.
There was a problem hiding this comment.
Added a comment above the for loop explaining this is intentional — label mutations are lightweight API calls, not AI invocations.
| continue | ||
| } | ||
|
|
||
| if err := s.jiraClient.AddLabel(ctx, issue.Key, staleLabel); err != nil { |
There was a problem hiding this comment.
Design: scanner uses concrete *jira.Client, making scanStale untestable with mocks.
The existing scan() delegates Jira interactions to IssueProcessor (which has its own JiraClient interface, enabling mocks). scanStale() calls s.jiraClient.AddLabel / RemoveLabel / SearchTickets directly on the concrete client.
This means the stale scan path can't be unit-tested without a real HTTP server. Consider extracting a StaleLabelClient interface (or extending IssueProcessor) so the interesting logic — partial failure handling, dry-run, pagination — can be tested.
There was a problem hiding this comment.
Extracted ScannerJiraClient interface with SearchTickets, AddLabel, RemoveLabel. Scanner field and NewScanner param now use the interface. Tests use a mockJiraClient that implements it — 7 test cases covering guard clauses, happy path, dry-run, and partial failure.
| - "jira-autofix-ci-failing" | ||
| - "jira-autofix-rejected" | ||
| - "jira-autofix-blocked" | ||
| - "jira-autofix-max-retries" |
There was a problem hiding this comment.
Question: does jira-autofix-max-retries actually exist?
Bug Buddy's config model defines FailureLabels (ci-failing, rejected, blocked) and LifecycleLabels (queued, review, merged) — I don't see a max-retries label in its model or anywhere it's applied. Is this label actually set by Bug Buddy today, or is it aspirational?
If it's never applied, it's harmless in the JQL NOT IN clause but misleading in the config — it suggests the stale scan accounts for a state that doesn't exist yet.
There was a problem hiding this comment.
Good catch — removed jira-autofix-max-retries from the OSAC config. Can add it back if Bug Buddy starts using it.
| "triage-bot/config" | ||
| ) | ||
|
|
||
| func TestBuildStaleJQL(t *testing.T) { |
There was a problem hiding this comment.
Test coverage: buildStaleJQL is well-tested, but scanStale has no tests.
The interesting behavior lives in scanStale: dry-run logging, partial failure (add stale succeeds but remove autofix fails), empty-result early return, pagination, and the "stale label empty → skip" guard. None of these are tested.
This is partly a consequence of the concrete *jira.Client usage (see other comment). If an interface is extracted, these paths become straightforward to cover with table-driven tests.
There was a problem hiding this comment.
Added TestScanStale with 7 cases via the new mockJiraClient: skips when stale_label empty, skips when autofix_label empty, skips when progression_labels empty, no issues found, marks stale issue (verifies both AddLabel and RemoveLabel), dry-run (no mutations), partial failure (AddLabel fails → no RemoveLabel).
There was a problem hiding this comment.
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@scanner/scanner_test.go`:
- Around line 29-31: The skip-case tests currently only assert outputs, so they
do not prove that scanStale avoids Jira lookups when guard conditions are met.
Update mockJiraClient and the scanStale-related tests to track whether
SearchTickets is invoked, then add explicit “no Jira call” assertions in the
guard-clause cases while keeping the existing output checks. Use the
mockJiraClient.SearchTickets method and the scanStale tests around the skip
scenarios to locate the changes.
In `@scanner/scanner.go`:
- Around line 206-219: The stale-issue logging in scanner.go is misleading
because Marked issue as stale is emitted even when jiraClient.RemoveLabel fails.
Update the stale-label flow in the scanner logic so the success log only runs
after both AddLabel and RemoveLabel succeed, and either return or continue on
RemoveLabel failure in the same way as AddLabel; use the existing scanner method
and jiraClient.AddLabel/RemoveLabel calls to keep the partial-failure state from
being logged as complete.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Enterprise
Run ID: 50405679-13eb-410b-8a1e-4f1234e17251
📒 Files selected for processing (6)
chart/triage-bot/values.yamlconfig/config.godeploy/osac/values.yamldeploy/shared-values.yamlscanner/scanner.goscanner/scanner_test.go
Adds a periodic scan that detects closed OSAC bugs with the autofix label but no progression label (merged/review/rejected/etc.) and relabels them as stale. All labels are config-driven via progression_labels to avoid hardcoded coupling to Bug Buddy's label scheme. Includes the stale label in the JQL exclusion list so partial failures (add stale succeeds, remove autofix fails) don't cause infinite re-processing. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Fix slice mutation bug: use slices.Concat instead of append on config-owned ProgressionLabels slice (Andy's review) - Extract ScannerJiraClient interface so scanStale is testable with mocks, matching the IssueProcessor pattern used by scan() - Add comment explaining why scanStale skips the concurrency semaphore - Remove jira-autofix-max-retries from OSAC config (not set by Bug Buddy) - Add stale_label and progression_labels to chart values with Helm docs - Add TestScanStale with 7 test cases: guard clauses, happy path, dry-run, and partial failure Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Add continue after RemoveLabel failure so success log only fires when both label operations succeed - Track searchCalls in mock to prove guard-clause tests skip all Jira interactions, not just label mutations Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
863bbc6 to
e8aa5d1
Compare
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@scanner/scanner_test.go`:
- Around line 21-22: Add a focused subtest for the partial-failure relabel path
in scanStale where AddLabel succeeds but RemoveLabel returns an error, using
mockJiraClient with removeErr set and asserting the RemoveLabel attempt occurs;
place it alongside the existing Scanner test cases so the error branch is
covered in addition to the happy path. Use the existing mockJiraClient,
Scanner.scanStale, addLabelCalls, and removeCalls fields to verify the flow, and
keep the setup consistent with the current jira searchResults-based tests.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Enterprise
Run ID: b1f749ba-c009-4855-968f-ce2357a14d27
📒 Files selected for processing (6)
chart/triage-bot/values.yamlconfig/config.godeploy/osac/values.yamldeploy/shared-values.yamlscanner/scanner.goscanner/scanner_test.go
| removeCalls []labelCall | ||
| removeErr error |
There was a problem hiding this comment.
📐 Maintainability & Code Quality | 🔵 Trivial | ⚡ Quick win
Add a test for the RemoveLabel-failure branch.
removeErr is wired into mockJiraClient but no subtest sets it, so the RemoveLabel error path in scanStale (where AddLabel succeeds but RemoveLabel fails, leaving the issue with both labels) is never exercised. This is the more interesting partial-failure case for the relabel flow. As per coding guidelines, error cases must be covered alongside the happy path.
💚 Suggested subtest
t.Run("add succeeds but remove fails", func(t *testing.T) {
mock := &mockJiraClient{
searchResults: &jira.JiraSearchResponse{
Issues: []jira.JiraIssue{{Key: "OSAC-400"}},
IsLast: true,
},
removeErr: fmt.Errorf("503 Service Unavailable"),
}
s := &Scanner{jiraClient: mock, cfg: baseCfg, logger: zap.NewNop()}
s.scanStale(context.Background())
if len(mock.addLabelCalls) != 1 {
t.Errorf("expected 1 AddLabel call, got %d", len(mock.addLabelCalls))
}
if len(mock.removeCalls) != 1 {
t.Errorf("expected 1 RemoveLabel attempt, got %d", len(mock.removeCalls))
}
})Also applies to: 261-282
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@scanner/scanner_test.go` around lines 21 - 22, Add a focused subtest for the
partial-failure relabel path in scanStale where AddLabel succeeds but
RemoveLabel returns an error, using mockJiraClient with removeErr set and
asserting the RemoveLabel attempt occurs; place it alongside the existing
Scanner test cases so the error branch is covered in addition to the happy path.
Use the existing mockJiraClient, Scanner.scanStale, addLabelCalls, and
removeCalls fields to verify the flow, and keep the setup consistent with the
current jira searchResults-based tests.
Source: Coding guidelines
Summary
progression_labels— no hardcoded coupling to Bug Buddy's label schemestale_label: ""in shared-values); enabled for OSAC withjira-triage-staleChanges
config/config.go— addStaleLabelandProgressionLabelstoTriageConfigscanner/scanner.go— addscanStale()andbuildStaleJQL()methodsscanner/scanner_test.go— addTestBuildStaleJQLwith 3 test cases (first test file for scanner package)deploy/osac/values.yaml— set stale label + progression labels for OSACdeploy/shared-values.yaml— add disabled defaults for new fieldsTest plan
go build ./...passesgo test ./scanner/ ./config/— all pass (3 new scanner tests)gofmtcleandry_run: true: verify stale scan logs correct JQL and "DRY RUN: would mark stale" for expected ticketsdry_run: false: verify closed autofix tickets get relabeled to stale🤖 Generated with Claude Code
Affected packages: config/, scanner/ (plus Helm chart/deployment values under chart/triage-bot and deploy/)
Control plane impact: yes — the scanner polling loop now runs an additional “stale-issue” pass alongside the existing closed-bug triage scan, using Jira label search + mutations to relabel closed autofix issues as stale. No changes to webhooks, comment state machine, or the AI invocation/executor path.
Configuration: yes — adds
triage.stale_labelandtriage.progression_labelsto config/env bindings; stale detection is driven byprogression_labels(no hardcoded coupling), and stale label is included in the JQL exclusion set to avoid repeated reprocessing after partial failures.Helm/deployment: yes — updates shared defaults to disable stale scanning (
stale_label: "", emptyprogression_labels) and enables it for OSAC viadeploy/osac/values.yamlusingstale_label: "jira-triage-stale"and progression label list.Testing: yes — adds/updates scanner tests for
buildStaleJQLandscanStale, including guard clauses (skip when config is empty), exact JQL expectations, dry-run behavior, happy path label updates, and partial-failure handling (continue onAddLabelerrors).