212 bug centroid nans sources not removed #213

Merged
ChaitanyaChawak merged 5 commits into develop from 212-bug-centroid-nans-sources-not-removed
Apr 28, 2026
@jeipollack jeipollack commented Apr 24, 2026

Contributor

Summary

Fixes an issue where sources with invalid centroid estimates (NaNs/Infs) were not removed, leading to downstream errors during misalignment calculations that use centroid positions.

Introduces a robust batch filtering utility to enforce alignment and prevent silent data corruption.

Closes #212

What’s changed

  • Added safe_batch utilities to compute validity masks from anchor arrays (e.g. centroids) and filter aligned datasets.
  • Implemented safe_batch_builder to consistently apply masking across all sample-aligned arrays (images, SEDs, masks, object IDs, etc.).
  • Enforced strict validation:
    • Raises on misaligned array lengths
    • Raises on invalid array type
    • Raises when all samples are invalid (prevents silent downstream failures)
  • Extended support to sequence-based metadata (e.g. lists of object_id) to ensure proper alignment.
  • Added logging utility to trace filtered samples.
  • Added comprehensive unit tests covering:
    • Valid / partially invalid / fully invalid inputs
    • Alignment guarantees
    • Failure modes
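
The masking and filtering behaviour listed above could look roughly like the sketch below. It is not the code from this PR: `compute_valid_mask` and `filter_aligned` are hypothetical names, the real `safe_batch` / `safe_batch_builder` signatures may differ, and the exception types used for misaligned lengths and unsupported types are assumptions (only the `ValueError` for a fully invalid batch is stated in this description).

```python
import logging

import numpy as np

logger = logging.getLogger(__name__)


def compute_valid_mask(anchor):
    """Return a boolean mask, True where every anchor value of a sample is finite.

    Hypothetical helper: `anchor` is a sample-first array such as centroids
    with shape (n_samples, 2).
    """
    anchor = np.asarray(anchor, dtype=float)
    if anchor.ndim == 0:
        raise ValueError("anchor must have a leading sample axis")
    finite = np.isfinite(anchor)
    if finite.ndim > 1:
        # One NaN/Inf anywhere in a sample invalidates that sample.
        finite = finite.all(axis=tuple(range(1, finite.ndim)))
    return finite


def filter_aligned(mask, **fields):
    """Apply one mask to every sample-aligned field (hypothetical helper)."""
    mask = np.asarray(mask, dtype=bool)
    n_samples = mask.shape[0]
    if not mask.any():
        raise ValueError("all samples are invalid; nothing left to process")

    filtered = {}
    for name, values in fields.items():
        if isinstance(values, np.ndarray):
            if values.ndim == 0 or values.shape[0] != n_samples:
                raise ValueError(f"{name!r} is not aligned with the mask")
            filtered[name] = values[mask]
        elif isinstance(values, (list, tuple)):
            # Sequence-based metadata such as a list of object IDs.
            if len(values) != n_samples:
                raise ValueError(f"{name!r} is not aligned with the mask")
            filtered[name] = [v for v, keep in zip(values, mask) if keep]
        else:
            # Assumption: unsupported containers are rejected with TypeError.
            raise TypeError(f"{name!r} has unsupported type {type(values).__name__}")

    n_removed = n_samples - int(mask.sum())
    if n_removed:
        logger.info("Filtered %d of %d samples", n_removed, n_samples)
    return filtered
```

Computing the mask once from the anchor array and applying it to every field in a single call is what keeps images, SEDs, masks and object IDs in step with each other.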

How to test / verify

  • Run unit tests:
pytest
  • Specific checks:
    • Verify that datasets containing NaN centroids correctly remove affected sources.
    • Confirm that all output arrays remain aligned after filtering.
    • Confirm that fully invalid batches raise a ValueError.
    • Confirm that misaligned inputs raise errors.
  • (Optional) Re-run a pipeline stage using real data where centroid failures previously occurred and verify:
    • No silent misalignment
    • Correct number of sources after filtering
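
For a quick manual check outside the test suite, the specific checks above can be run on a small synthetic batch. The sketch below uses `drop_invalid`, a hypothetical stand-in written only for this illustration; to exercise the real code, swap in the project's filtering utility, whose name and signature are not shown in this description.

```python
import numpy as np


def drop_invalid(centroids, images, object_ids):
    """Stand-in for the real filtering utility (hypothetical, illustration only)."""
    if not (len(centroids) == len(images) == len(object_ids)):
        raise ValueError("inputs are not aligned along the sample axis")
    keep = np.isfinite(centroids).all(axis=1)
    if not keep.any():
        raise ValueError("every sample has an invalid centroid")
    kept_ids = [oid for oid, k in zip(object_ids, keep) if k]
    return centroids[keep], images[keep], kept_ids


def raises_value_error(func, *args):
    """Return True if calling func(*args) raises ValueError."""
    try:
        func(*args)
    except ValueError:
        return True
    return False


# Synthetic batch: the second source has a NaN centroid.
centroids = np.array([[10.0, 12.0], [np.nan, 7.0], [3.5, 4.5]])
images = np.arange(12, dtype=float).reshape(3, 2, 2)
object_ids = ["src-0", "src-1", "src-2"]

kept_centroids, kept_images, kept_ids = drop_invalid(centroids, images, object_ids)

# Check 1: the source with the NaN centroid is removed.
assert kept_ids == ["src-0", "src-2"]

# Check 2: outputs stay aligned, and each image still belongs to its ID.
assert len(kept_centroids) == len(kept_images) == len(kept_ids) == 2
assert np.array_equal(kept_images[1], images[2])

# Check 3: a fully invalid batch raises ValueError.
assert raises_value_error(drop_invalid, np.full((3, 2), np.nan), images, object_ids)

# Check 4: misaligned inputs raise (ValueError is assumed here).
assert raises_value_error(drop_invalid, centroids, images[:2], object_ids)
```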

Scope

Indicate the type of PR:

  • Feature
  • Bug fix
  • Hotfix
  • Documentation / process change
  • Internal / refactor
  • Release

Optionally, note if this PR is part of a larger milestone or set of related PRs.

Changelog

  • Changelog fragment added (if applicable)

Reviewer Checklist

Reviewers should confirm the following before approving and merging:

  • The PR targets the correct base branch (develop, or main for release PRs)
  • The PR is assigned to the developer
  • Appropriate labels are applied
  • The PR is included in relevant projects and/or milestones
  • Description clearly explains what has changed
  • Issue references included, if applicable
  • Code and documentation adhere to current standards (ruff)
  • Documentation updates included, if relevant
  • CI tests are passing
  • All reviewer comments have been addressed

Next Steps / Notes (if applicable)

Jennifer Pollack added 5 commits April 22, 2026 13:21
… tests

- Added support for filtering both NumPy arrays and sequence-based metadata (e.g. object IDs) based on mask alignment.
- Added comprehensive unit tests covering edge cases such as partial invalid samples, fully invalid batches, and misaligned inputs.
- Enforced stricter validation to prevent silent misalignment (e.g. raising on inconsistent array lengths or fully invalid batches).
- Improved handling of array-like inputs and clarified expected behavior for aligned vs. non-aligned data.
- Added missing changelog fragment for the new utilities module.

pick 8b0e16b Update pull request template to include release prep
@jeipollack jeipollack self-assigned this Apr 24, 2026
@jeipollack jeipollack added the bug Something isn't working label Apr 24, 2026
@ChaitanyaChawak ChaitanyaChawak merged commit b1b8e56 into develop Apr 28, 2026
2 checks passed
@ChaitanyaChawak ChaitanyaChawak deleted the 212-bug-centroid-nans-sources-not-removed branch April 28, 2026 09:31