Skip to content

Add probing, announcing and conflict resolution (RFC 6762 §8-9)#271

Open
backkem wants to merge 6 commits into
mainfrom
probing-conflict-resolution
Open

Add probing, announcing and conflict resolution (RFC 6762 §8-9)#271
backkem wants to merge 6 commits into
mainfrom
probing-conflict-resolution

Conversation

@backkem

@backkem backkem commented Mar 4, 2026

Copy link
Copy Markdown
Member

Summary

  • probe.go: State machine for name probing (delay→probing→announcing→established), simultaneous probe tiebreaking (§8.2), conflict detection across all RR sections (§9), rename + re-probe with rate limiting
  • config.go: WithProbing(), WithConflictHandler() options
  • conn.go: WaitReady(), atomic.Value conflict handler (unexported setter, ready to expose as OnConflict later)
  • server.go: probeManager integration, buildProbeSessions, onProbeRenamed callback
  • 37 unit tests in probe_test.go, deterministic via fake timers + sync.Cond

Opening as draft for early review feedback.

Open questions

  1. probe.go is ~950 lines. Should we split it further, e.g.:

    • probe.go — state machine + manager
    • probe_tiebreak.golexicographicCompare, sortResources, compareResource, packRData
    • probe_rename.godefaultRename, defaultRenameHost, defaultRenameServiceInstance

    Or is one file fine given the tight coupling?

  2. WaitReady(ctx) error is currently exported on Conn. Is this the right API surface, or should readiness be signaled differently (e.g. callback, channel field)?

@backkem backkem force-pushed the probing-conflict-resolution branch from c604b8b to c1b926f Compare March 4, 2026 21:17
@backkem

backkem commented Mar 4, 2026

Copy link
Copy Markdown
Member Author

cc @JoTurk early feedback appreciated.

@codecov

codecov Bot commented Mar 4, 2026

Copy link
Copy Markdown

Codecov Report

❌ Patch coverage is 93.47079% with 38 lines in your changes missing coverage. Please review.
✅ Project coverage is 85.54%. Comparing base (84f698c) to head (dfb946c).

Files with missing lines Patch % Lines
probe.go 93.99% 15 Missing and 10 partials ⚠️
server.go 93.22% 4 Missing and 4 partials ⚠️
conn.go 86.48% 4 Missing and 1 partial ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #271      +/-   ##
==========================================
+ Coverage   84.11%   85.54%   +1.43%     
==========================================
  Files           8        9       +1     
  Lines        1970     2525     +555     
==========================================
+ Hits         1657     2160     +503     
- Misses        202      241      +39     
- Partials      111      124      +13     
Flag Coverage Δ
go 85.54% <93.47%> (+1.43%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Harness.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@backkem backkem force-pushed the probing-conflict-resolution branch from c1b926f to 161bfe1 Compare June 3, 2026 19:31
@backkem backkem marked this pull request as ready for review June 3, 2026 19:31
@backkem backkem force-pushed the probing-conflict-resolution branch from 5794c46 to b28f669 Compare June 10, 2026 13:13
@backkem backkem requested review from JoTurk and jwetzell June 10, 2026 13:19
backkem added 6 commits June 10, 2026 16:43
RFC 6762 §8-9 implementation:
- probe.go: state machine (delay→probing→announcing→established)
- Simultaneous probe tiebreaking with lexicographic compare (§8.2)
- Conflict detection in all resource record sections (§9)
- Rename strategy: "-N" for hosts, " (N)" for services
- Rate limiting: 15 conflicts/10s → 5s backoff, 60s give-up
- Cache-flush bit on unique records in announcements (§10.2)
- WithProbing, WithConflictHandler options
- WaitReady() blocks until initial probing completes
- Conn.setConflictHandler/conflictHandler via atomic.Value
- Name inbound channel buffer size as constant
- Test onProbeRenamed for host, service instance, escaped dots,
  case-insensitive match, and no-match no-op
- Test WaitReady for nil probes, context cancellation, conn close
- Test sequential conflicts (two rename cycles with record header
  verification)
- Test full conflict-rename-announce cycle with wire content
  verification (renamed name, cache-flush bit, preserved rdata)
- Answer qtype ANY for established names (§6) so probes from
  other hosts see a conflict; previously we stayed silent and
  lost the name to the newcomer
- Re-probe the same name on post-probing conflicts (§9) instead
  of renaming immediately; rename only if the re-probe conflicts
- Announce the service-type PTR as a shared record (§8.3),
  without cache-flush and excluded from probe/tiebreak
- Derive default renames from the original base name so user
  names ending in "-2" or " (2)" are not mangled
- Iterate RR sections in place in processResponse; appending
  aliased the message shared with other handlers
- Keep ready open on shutdown so WaitReady deterministically
  returns errConnectionClosed; drop dead probeManager.ttl
Only link-local answers carry a zone. The second QueryAddr
assertion required a zone on any non-4in6 address and failed on
hosts with a global IPv6 adapter (e.g. Cloudflare WARP). Mirror
the link-local guard used by the first assertion.
@backkem backkem force-pushed the probing-conflict-resolution branch from b28f669 to dfb946c Compare June 10, 2026 15:15
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant