
fix: elevate broadcast hostname resolution errors from DEBUG to ERROR #884

Closed
285729101 wants to merge 1 commit into permitio:master from 285729101:fix/broadcast-hostname-error-logging

Conversation

@285729101


Description

Fixes the silent swallowing of DNS resolution errors for the broadcast URI. The gaierror when resolving the broadcast hostname was previously logged at DEBUG level, making it extremely difficult for operators to diagnose misconfigured broadcast URIs.

Closes #716

Changes

  • Added _validate_broadcast_uri() in PubSub.__init__() that performs an early DNS resolution check at startup and logs at ERROR level if it fails, with a descriptive message pointing operators to check their OPAL_BROADCAST_URI configuration
  • Replaced the anonymous lambda broadcaster disconnect callback in OpalServer with _on_broadcaster_disconnected() that inspects the task exception (e.g. gaierror) and logs it at ERROR level instead of silently swallowing it
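
The early DNS check described in the first bullet could look roughly like the sketch below. It is not the code from this PR: the function name `validate_broadcast_uri`, the logger name, and the message wording are all hypothetical, and only the hostname is logged so that credentials embedded in the URI are not written to the logs. Note that `socket.getaddrinfo()` is a blocking call.

```python
import logging
import socket
from urllib.parse import urlparse

# Hypothetical logger name, chosen for illustration only.
logger = logging.getLogger("opal.broadcast")


def validate_broadcast_uri(broadcast_uri: str) -> bool:
    """Return True if the URI's hostname resolves; otherwise log at ERROR.

    Advisory only: the function never raises on a resolution failure, so the
    caller can continue with startup regardless of the result.
    """
    parsed = urlparse(broadcast_uri)
    hostname = parsed.hostname
    if not hostname:
        logger.error(
            "Broadcast URI has no hostname; check OPAL_BROADCAST_URI"
        )
        return False
    try:
        # Blocking DNS lookup; raises socket.gaierror if resolution fails.
        socket.getaddrinfo(hostname, parsed.port)
    except socket.gaierror as exc:
        logger.error(
            "Failed to resolve broadcast hostname %r (%s); "
            "check OPAL_BROADCAST_URI",
            hostname,
            exc,
        )
        return False
    return True
```

Because the result is only a boolean plus a log line, a caller such as a constructor can invoke it and ignore the return value without changing existing control flow.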

Notes

  • Both changes are advisory only -- they log errors but do not prevent startup or alter existing control flow, preserving full backward compatibility
  • Minimal change footprint: only pubsub.py and server.py are modified
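
As an illustration of the advisory behaviour, here is a minimal sketch of a done-callback that inspects the task exception and logs it without re-raising. The function name `on_broadcaster_disconnected`, the logger name, and the messages are assumptions for illustration and are not taken from the PR's diff.

```python
import asyncio
import logging
import socket

# Hypothetical logger name, chosen for illustration only.
logger = logging.getLogger("opal.broadcast")


def on_broadcaster_disconnected(task: asyncio.Task) -> None:
    """Log why the broadcaster reader task ended; never re-raise."""
    # task.exception() raises CancelledError on a cancelled task,
    # so cancellation has to be checked first.
    if task.cancelled():
        logger.info("Broadcaster reader task was cancelled")
        return
    exc = task.exception()
    if exc is None:
        logger.info("Broadcaster reader task finished without error")
        return
    if isinstance(exc, socket.gaierror):
        logger.error(
            "Could not resolve broadcast hostname (%s); "
            "check OPAL_BROADCAST_URI",
            exc,
        )
    else:
        logger.error("Broadcaster disconnected with error: %r", exc)
```

A callback like this would be attached with `task.add_done_callback(on_broadcaster_disconnected)` in place of an anonymous lambda. Since it only logs, whatever reconnect or shutdown logic already follows the disconnect is unaffected.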

/claim #716

When the OPAL server is configured with a broadcast URI containing an
unresolvable hostname, the gaierror exception was silently swallowed and
only logged at DEBUG level by the underlying fastapi_websocket_pubsub
library. This made it extremely difficult for operators to diagnose
misconfigured broadcast URIs.

Changes:
- Add _validate_broadcast_uri() in PubSub that performs an early DNS
  resolution check at startup and logs at ERROR level if it fails
- Replace the anonymous lambda broadcaster disconnect callback with
  _on_broadcaster_disconnected() that inspects the task exception and
  logs it at ERROR level (catches the gaierror at runtime too)

Both changes are advisory only -- they do not prevent startup or change
existing control flow, preserving backward compatibility.

Closes permitio#716

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@netlify

netlify Bot commented Feb 17, 2026


Deploy Preview for opal-docs canceled.

| Name | Link |
| --- | --- |
| 🔨 Latest commit | f49f8c5 |
| 🔍 Latest deploy log | https://app.netlify.com/projects/opal-docs/deploys/6994385eea562b0008a68c2f |

@285729101

Author

@asafc @orweis This PR elevates broadcast hostname resolution errors from DEBUG to ERROR so that failed resolution is visible in the logs.

@zeevmoney

Contributor

Thanks for this fix, @285729101 — the diagnosis is correct and the implementation is clean and minimal. Unfortunately the linked issue #716 was closed as stale on 2026-02-17 ("the bounty is more than a year old"), so we're winding down the bounty work tied to it, and the reader-task error path is being reworked in the broadcaster reconnect effort (#915). Given the issue is now closed and the work is duplicated, we're going to close this PR. For the record, your approach here — inspecting task.exception() in the reader-task done-callback plus an early getaddrinfo() validation — is exactly the right one. Appreciate the contribution.


Development

Successfully merging this pull request may close these issues.

Error resolving broadcast hostname being swallowed

2 participants