Fix: don't mark connections unhealthy due to a shared destination by chrisdoehring · Pull Request #434 · PADAS/cdip

chrisdoehring · 2026-06-19T16:34:26Z

This is a draft/RFC because it changes application behavior. It highlights the need for more granular health statuses (e.g. destination unhealthy rather than connection unhealthy).

Problem

Several connections show as Unhealthy in the Gundi portal even though they have no error activity logs of their own within the last few hours.

Root cause

A Connection's health is derived live from the stored IntegrationStatus of its provider and all of its destinations (ConnectionRetrieveSerializer.get_status, and the queryset equivalent filter_connections_by_status).

A destination integration is shared across every connection that routes to it (Integration.destinations is an M2M through routing rules). So when one destination was marked UNHEALTHY (typically from its own dispatcher/custom-log errors), every connection routing to that destination was reported as UNHEALTHY, regardless of whether that connection's own provider had any errors.

From the operator's point of view this looks broken: the connection is Unhealthy but its activity log is clean, because the errors live under the shared destination, not the connection.

Note: connection-level delivery failures are already attributed to the provider (event_consumers/dispatcher_events_consumer.py), so a genuinely failing connection still surfaces as Unhealthy via its provider status. The destination's aggregate health was the source of the false positives.
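The fan-out described above can be illustrated with a small standalone sketch. It does not reproduce the code in ConnectionRetrieveSerializer.get_status; the function name legacy_connection_status, the connection and destination names, and the plain-string statuses are all hypothetical, and the rule is inferred only from the "Before" behavior described in this PR.

```python
from typing import Dict, Iterable, List, Tuple


def legacy_connection_status(provider_status: str, destination_statuses: Iterable[str]) -> str:
    """Hypothetical model of the pre-fix rule: any unhealthy member wins."""
    destination_statuses = list(destination_statuses)
    if provider_status == "unhealthy" or "unhealthy" in destination_statuses:
        return "unhealthy"
    if "disabled" in destination_statuses:
        return "needs_review"
    return "healthy"


# Stored status per destination integration (illustrative data only).
destination_status: Dict[str, str] = {
    "dest-shared": "unhealthy",
    "dest-other": "healthy",
}

# Each connection: (provider status, destinations it routes to).
connections: Dict[str, Tuple[str, List[str]]] = {
    "conn-a": ("healthy", ["dest-shared"]),
    "conn-b": ("healthy", ["dest-shared", "dest-other"]),
    "conn-c": ("healthy", ["dest-other"]),
}

legacy_statuses = {
    name: legacy_connection_status(provider, [destination_status[d] for d in dests])
    for name, (provider, dests) in connections.items()
}

# conn-a and conn-b are both reported unhealthy although their providers are healthy.
print(legacy_statuses)
```

In this toy data every provider is healthy, yet both connections that route to dest-shared come out as unhealthy, which matches the false positives operators saw in the portal.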

Fix

A healthy provider with an unhealthy or disabled shared destination is now surfaced as Needs review instead of Unhealthy, consistently in both code paths:

ConnectionRetrieveSerializer.get_status — the live status shown in the UI
filter_connections_by_status — the ?status= filter and the daily unhealthy-connections email

A connection is reported UNHEALTHY only when its own provider is unhealthy.

Provider	Destination	Before	After
unhealthy	any	unhealthy	unhealthy
healthy	unhealthy	unhealthy	needs_review
healthy	disabled	needs_review	needs_review
healthy	healthy	healthy	healthy
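The "After" column of the table can be read as a small pure function. The following is a minimal sketch under stated assumptions, not the implementation in this PR: derive_connection_status is a hypothetical name, statuses are modeled as plain strings, and only the provider/destination combinations listed in the table are covered.

```python
from typing import Iterable


def derive_connection_status(provider_status: str, destination_statuses: Iterable[str]) -> str:
    """Hypothetical model of the post-fix rule from the table above.

    Assumes provider_status is "healthy" or "unhealthy"; other provider
    states are not described in the table and are not modeled here.
    """
    # Only the connection's own provider can make it unhealthy.
    if provider_status == "unhealthy":
        return "unhealthy"
    # A problem on a (possibly shared) destination is downgraded to a review flag.
    if any(status in ("unhealthy", "disabled") for status in destination_statuses):
        return "needs_review"
    return "healthy"


# (provider, destination, expected "After" value) taken from the table rows;
# the "any" destination row is expanded into two concrete cases.
TABLE_ROWS = [
    ("unhealthy", "healthy", "unhealthy"),
    ("unhealthy", "unhealthy", "unhealthy"),
    ("healthy", "unhealthy", "needs_review"),
    ("healthy", "disabled", "needs_review"),
    ("healthy", "healthy", "healthy"),
]

for provider, destination, expected in TABLE_ROWS:
    assert derive_connection_status(provider, [destination]) == expected
```

Keeping the rule in one function of this shape is one way to make the serializer path and the queryset filter easier to check against the same table, although the PR itself implements the two paths separately.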

New management command

Adds recalculate_integration_statuses — an on-demand way to run the same health calculation as the hourly "Calculate Integration Statuses" beat task (previously only triggerable via the schedule or an enable/disable save; there was no management command for it).

# Recalculate every integration, synchronously, printing each result
python manage.py recalculate_integration_statuses

# Limit scope to specific integration(s) (--integration-id is repeatable)
python manage.py recalculate_integration_statuses --integration-id <uuid> --integration-id <uuid>

# Enqueue the Celery task instead of running inline
python manage.py recalculate_integration_statuses --integration-id <uuid> --async

Unknown ids are reported to stderr rather than silently doing nothing. Useful for ops/debugging and for forcing a recalc after this fix lands.
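For readers unfamiliar with how such flags are usually declared, here is a hedged sketch of the option handling using plain argparse (Django passes an argparse parser to a command's add_arguments). It is not the command's actual source: build_parser, split_known_ids, the dest names, and the sample ids are hypothetical. One detail worth noting is that async is a reserved word in Python, so a flag named --async needs an explicit dest to be read as a normal attribute.

```python
import argparse
import sys
from typing import Iterable, List, Set, Tuple


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="recalculate_integration_statuses")
    # action="append" makes the flag repeatable; each use adds one id to the list.
    parser.add_argument(
        "--integration-id",
        action="append",
        dest="integration_ids",
        default=None,
        help="Limit the recalculation to this integration (repeatable).",
    )
    # "async" is a Python keyword, so store the flag under a different attribute name.
    parser.add_argument(
        "--async",
        action="store_true",
        dest="run_async",
        help="Enqueue the task instead of running inline.",
    )
    return parser


def split_known_ids(requested: Iterable[str], known: Set[str]) -> Tuple[List[str], List[str]]:
    """Partition requested ids into those that exist and those that do not."""
    requested = list(requested)
    found = [i for i in requested if i in known]
    missing = [i for i in requested if i not in known]
    return found, missing


args = build_parser().parse_args(
    ["--integration-id", "id-1", "--integration-id", "id-404", "--async"]
)
found, missing = split_known_ids(args.integration_ids or [], known={"id-1", "id-2"})
for unknown_id in missing:
    # Report unknown ids instead of silently ignoring them.
    print(f"Unknown integration id: {unknown_id}", file=sys.stderr)
```

With no --integration-id given, integration_ids stays None, which a command can treat as "recalculate everything".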

Tests

New integrations/tests/test_connection_status_derivation.py — function-level coverage of both filter_connections_by_status and get_status for the shared-destination case.
Updated test_filter_connections_by_status_unhealthy/needs_review_as_superuser to the new buckets, plus a new HTTP-level assertion on the serialized status field.
New tests in integrations/tests/test_commands.py covering the management command's sync path (stale status recomputed to unhealthy from error logs) and --async path (enqueues the Celery task).
Existing test_email_alerts.py and test_calc_integration_status.py remain green (17 passed).

🤖 Generated with Claude Code

A Connection's health was derived from the stored status of its provider AND all of its destinations. Because a destination integration is shared across every connection that routes to it, an unhealthy destination marked ALL of its connections "Unhealthy", even connections whose own provider had no errors.

This produced connections shown as Unhealthy in the portal with no error activity logs of their own (the destination's errors live under the destination, not the connection).

Connection-level delivery failures are already attributed to the provider, so a genuinely failing connection still surfaces as Unhealthy.

A healthy provider with an unhealthy (or disabled) shared destination is now surfaced as "Needs review" instead of "Unhealthy", in both:
- ConnectionRetrieveSerializer.get_status (live UI status)
- filter_connections_by_status (status filter + unhealthy-connections email)

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

Provides an on-demand way to run the same health calculation as the hourly "Calculate Integration Statuses" beat task. Recalculates all integrations by default, or specific ones via --integration-id (repeatable). Runs inline and reports each resulting status; --async enqueues the Celery task instead.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

victorlujanearthranger

Thanks!

Chris Doehring and others added 3 commits June 19, 2026 10:52

Add CHANGELOG with shared-destination health fix and new command

4f97a39

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

chrisdoehring force-pushed the gundi-fix-shared-destination-health branch from 378a558 to 4f97a39 on June 19, 2026 17:54

chrisdoehring requested review from marianobrc and victorlujanearthranger June 19, 2026 18:07

chrisdoehring marked this pull request as draft June 19, 2026 18:09

victorlujanearthranger approved these changes Jun 19, 2026

chrisdoehring wants to merge 3 commits into main from gundi-fix-shared-destination-health
