Skip to content

fix(test): resolve flaky integration tests with polling helper#654

Open
mvillmow wants to merge 5 commits into
mainfrom
428-auto-impl
Open

fix(test): resolve flaky integration tests with polling helper#654
mvillmow wants to merge 5 commits into
mainfrom
428-auto-impl

Conversation

@mvillmow

@mvillmow mvillmow commented Jun 4, 2026

Copy link
Copy Markdown
Contributor

Summary

Fix three flaky integration tests by replacing fixed-duration asyncio.sleep() calls with a deterministic polling helper that waits for messages to arrive via NATS callbacks.

  • test_publish_task_event_delivers_message — sleep 0.1s → poll up to 5s
  • test_concurrent_webhooks — sleep 0.2s → poll up to 5s + add missing flush
  • test_consecutive_publishes_both_succeed — sleep 0.1s → poll up to 5s + add missing flush

Root cause: under full-suite CPU contention, 0.1–0.2s sleeps are insufficient for NATS callbacks to fire. The polling helper checks len(received) every 25ms with a 5s timeout, eliminating the race while remaining bounded. Two tests also lacked explicit nats_client.flush() calls that would ensure subscriptions are ACK'd before publishing begins.

Test plan

  • All 541 tests pass (unit + integration)
  • Three targeted tests pass in isolation
  • Integration tests pass 5× back-to-back under full load (confirms race is gone)
  • Code passes ruff lint and format checks
  • All commits are GPG-signed

Closes #428

🤖 Generated with Claude Code

@mvillmow mvillmow left a comment

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

GO: 3 named flaky tests fixed correctly (poll helper + flush); rest is ruff-forced formatting. Remove stray .claude-followup-428.md artifact.

Comment thread .claude-followup-428.md Outdated

@mvillmow mvillmow left a comment

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Test fix is correct; flush()+poll helper match root cause. Blocker: committed coverage.xml build artifact (absolute path leak, unrelated to #428) must be removed.

Comment thread coverage.xml Outdated

@mvillmow mvillmow left a comment

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

NOGO: #428 test fixes are correct, but coverage.xml (851-line generated artifact w/ absolute machine path) is committed and not gitignored. Remove + ignore it.

Comment thread .gitignore
@mvillmow mvillmow enabled auto-merge (squash) June 28, 2026 18:48
mvillmow and others added 5 commits June 28, 2026 17:34
Document identified safety concern: 8 additional integration tests share the
same fixed-sleep race condition as the 3 fixed tests. Recommend mechanical
follow-on to apply wait_for_messages() helper to all 8 call sites.

Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>
Signed-off-by: mvillmow <4211002+mvillmow@users.noreply.github.com>
The followup analysis file should contain raw JSON without markdown code fence
wrappers per the specification.

Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>
Signed-off-by: mvillmow <4211002+mvillmow@users.noreply.github.com>
Drop .claude-followup-428.md (a Claude pipeline artifact with no
repo consumer) and add patterns for .claude-followup-*.md,
.claude-pr-review-*.md, and .claude-address-review-*.md to .gitignore
so these scratch files are never committed again.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Signed-off-by: mvillmow <4211002+mvillmow@users.noreply.github.com>
3 pre-existing flaky integration tests need investigation

Closes #428

Implemented-By: claude-sonnet-4-6
Co-Authored-By: Claude Code <noreply@anthropic.com>
Signed-off-by: mvillmow <4211002+mvillmow@users.noreply.github.com>
Signed-off-by: mvillmow <4211002+mvillmow@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 pre-existing flaky integration tests need investigation

1 participant