Add filename pattern cross-validation grouping #398
Merged
Pull request overview
Adds a new leave-one-group-out cross-validation grouping strategy, Filename Pattern, allowing users to group videos into CV folds based on a regex-derived key from filenames. This extends the existing Individual Animal and Video grouping modes and threads the new setting through UI, project plumbing, reporting, CLI, export, docs, and tests.
Changes:
- Add `FILENAME_PATTERN` grouping strategy plus core helpers (`compile_grouping_regex`, `filename_group_key`) and a persisted project setting `cv_grouping_regex`.
- Implement filename-pattern grouping in `Project._assign_cv_group_ids` and update downstream consumers (excluded-video handling, CV labels, threshold gating, exports, reports, CLI).
- Add UI controls with inline regex validation and a live “group preview”, plus comprehensive tests and doc updates (both in-app and online copies).
Reviewed changes
Copilot reviewed 26 out of 26 changed files in this pull request and generated no comments.
| File | Description |
|---|---|
| tests/ui/test_settings_dialog.py | Adds UI tests for strategy/regex roundtrip, validation behavior, and preview rendering/visibility. |
| tests/ui/_fakes.py | Extends UI fakes to include cv_grouping_regex. |
| tests/project/test_settings_manager.py | Verifies defaulting and persistence reading for cv_grouping_regex. |
| tests/project/test_cv_grouping.py | Adds focused tests for Project._assign_cv_group_ids under filename-pattern grouping. |
| tests/classifier/test_multi_class_classifier.py | Tests multi-class threshold counting behavior under filename-pattern grouping (including invalid regex). |
| tests/classifier/test_classifier.py | Tests binary threshold counting and gating under filename-pattern grouping (including invalid regex). |
| src/jabs/ui/training_thread.py | Threads cv_grouping_regex into training report construction. |
| src/jabs/ui/training_strategy.py | Extends report-building API to carry cv_grouping_regex into TrainingReportData. |
| src/jabs/ui/settings_dialog/settings_group.py | Adds a validate() hook to settings groups to block saving invalid input. |
| src/jabs/ui/settings_dialog/settings_dialog.py | Adds “validate all groups before save” behavior and passes project videos into the CV settings group for preview. |
| src/jabs/ui/settings_dialog/cross_validation_settings_group.py | Implements regex field visibility, inline validation, debounced live preview, and updated help text. |
| src/jabs/ui/main_window/central_widget.py | Threads cv_grouping_regex into train-button gating counters for both binary and multi-class. |
| src/jabs/scripts/cli/cross_validation.py | Includes cv_grouping_regex in CLI-generated training reports. |
| src/jabs/resources/docs/user_guide/gui.md | Updates in-app user guide with Filename Pattern explanation and example. |
| src/jabs/project/settings_manager.py | Adds cv_grouping_regex property reading from project settings. |
| src/jabs/project/project.py | Implements filename-pattern group assignment; updates excluded-group logic; threads regex into feature extraction grouping. |
| src/jabs/project/export_training.py | Updates training export to store group labels appropriately for filename-pattern groups. |
| src/jabs/classifier/training_report.py | Records cv_grouping_regex in markdown and JSON report outputs. |
| src/jabs/classifier/multi_class_classifier.py | Adds filename-pattern aggregation support to multi-class threshold counting/gating. |
| src/jabs/classifier/cross_validation.py | Updates CV test-group label rendering to prefer regex label when present. |
| src/jabs/classifier/classifier.py | Adds filename-pattern aggregation support to binary threshold counting/gating. |
| packages/jabs-core/tests/test_cv_grouping.py | Adds unit tests for the new enum member and helper functions. |
| packages/jabs-core/src/jabs/core/enums/cv_grouping.py | Adds FILENAME_PATTERN and regex helper utilities. |
| packages/jabs-core/src/jabs/core/enums/__init__.py | Re-exports new helpers for consumers. |
| packages/jabs-core/src/jabs/core/constants.py | Adds CV_GROUPING_REGEX_KEY constant. |
| docs/user-guide/gui.md | Updates online user guide with Filename Pattern explanation and example. |
…attern-grouping # Conflicts: # tests/ui/test_settings_dialog.py
Adds a third leave-one-group-out cross-validation grouping option, Filename Pattern, alongside the existing Individual Animal and Video options (configured in Project Settings → Cross-Validation).
What it does
- … `cage_(\d+)` → `0042`); otherwise the whole match is used. Files that don't match are each placed in their own group.
- … `cv_grouping_regex` project setting (stored in `project.json`); only affects new training runs.
UI
Plumbing
- … `Project._assign_cv_group_ids`; the regex is threaded through the binary/multiclass Train-button gating counters, the training report, and the CLI.
Docs & tests
- … `docs/` and in-app help).
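The grouping rule described under "What it does" can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the helpers added in this PR: `sketch_group_key` and `sketch_assign_groups` are hypothetical names, the use of `re.search` (rather than `match` or `fullmatch`) is an assumption, and the `unmatched:` prefix is only one possible way to give each non-matching file its own group.

```python
import re
from collections import defaultdict
from typing import Dict, List, Optional


def sketch_group_key(pattern: "re.Pattern", filename: str) -> Optional[str]:
    """Return a grouping key for filename, or None if the pattern does not match.

    Hypothetical helper; assumes the first capture group is preferred
    and the whole match is used when no capture group is available.
    """
    match = pattern.search(filename)  # assumption: search anywhere in the name
    if match is None:
        return None
    if pattern.groups >= 1 and match.group(1) is not None:
        return match.group(1)  # first capture group, e.g. "0042"
    return match.group(0)  # no capture group: use the whole match


def sketch_assign_groups(filenames: List[str], regex: str) -> Dict[str, List[str]]:
    """Bucket filenames by key; each non-matching file gets its own group."""
    pattern = re.compile(regex)
    groups: Dict[str, List[str]] = defaultdict(list)
    for name in filenames:
        key = sketch_group_key(pattern, name)
        if key is None:
            key = f"unmatched:{name}"  # hypothetical singleton-group label
        groups[key].append(name)
    return dict(groups)


# Hypothetical filenames for illustration.
videos = ["cage_0042_day1.avi", "cage_0042_day2.avi", "cage_0007_day1.avi", "notes.avi"]
by_capture = sketch_assign_groups(videos, r"cage_(\d+)")
by_whole_match = sketch_assign_groups(videos, r"cage_\d+")
```

With the capture-group pattern the two `cage_0042` recordings share the key `0042`; without a capture group the key becomes the whole match (`cage_0042`), and `notes.avi` ends up alone in either case.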