
Add filename pattern cross-validation grouping #398

Merged
gbeane merged 3 commits into main from feature/cv-filename-pattern-grouping on Jun 19, 2026

@gbeane gbeane commented Jun 16, 2026

Collaborator

Adds a third leave-one-group-out cross-validation grouping option, Filename Pattern, alongside the existing Individual Animal and Video options (configured in Project Settings → Cross-Validation).

What it does

  • A regular expression is applied to each video filename; videos that produce the same key form one CV group. If the pattern has a capture group, the captured text is the key (for example, cage_(\d+) yields a key such as 0042); otherwise the whole match is used. Files that don't match are each placed in their own group.
  • New cv_grouping_regex project setting (stored in project.json); only affects new training runs.
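
The grouping rule can be sketched roughly as follows. This is an illustration of the behaviour described above, not the code in this PR: `group_key` and `assign_groups` are hypothetical names, the use of an unanchored `re.search` is an assumption (the description does not say how the pattern is anchored), and the filenames are made up.

```python
from __future__ import annotations

import re


def group_key(pattern: re.Pattern[str], filename: str) -> str | None:
    """Return the grouping key for a filename, or None if there is no key."""
    match = pattern.search(filename)  # assumption: unanchored search
    if match is None:
        return None
    if pattern.groups >= 1:
        # Capture group present: the first captured text is the key.
        # An optional group that did not participate gives None, which
        # this sketch treats like a non-matching file (an assumption).
        return match.group(1)
    # No capture group: the whole match is the key.
    return match.group(0)


def assign_groups(pattern_text: str, filenames: list[str]) -> dict[str, int]:
    """Map each filename to an integer group id (hypothetical helper)."""
    pattern = re.compile(pattern_text)
    key_to_id: dict[str, int] = {}
    assignment: dict[str, int] = {}
    next_id = 0
    for name in filenames:
        key = group_key(pattern, name)
        if key is not None and key in key_to_id:
            # Same key as an earlier video: join that group.
            assignment[name] = key_to_id[key]
            continue
        # New key, or a non-matching file that gets a group of its own.
        if key is not None:
            key_to_id[key] = next_id
        assignment[name] = next_id
        next_id += 1
    return assignment


files = ["cage_0042_day1.avi", "cage_0042_day2.avi", "cage_0007_day1.avi", "notes.avi"]
groups = assign_groups(r"cage_(\d+)", files)
```

Here the two `cage_0042` videos share a group, `cage_0007_day1.avi` gets a second group, and `notes.avi`, which does not match, is placed alone in a third.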

UI

  • Regex field shown only for the Filename Pattern strategy, with live validation that blocks saving an empty or invalid pattern.
  • Live preview: a summary line plus a collapsible breakdown of how the project's videos partition into groups, with videos excluded from training marked.
  • Inline help updated with a cage-ID example.
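
As a rough sketch of what the validation and preview amount to (not the dialog code itself): `validate_pattern` and `preview_groups` are hypothetical names, the summary wording and the "(unmatched)" label are invented for illustration, and the sketch assumes any capture group participates in the match.

```python
from __future__ import annotations

import re
from collections import defaultdict


def validate_pattern(text: str) -> str | None:
    """Return an error message for an unusable pattern, or None if it is valid."""
    if not text.strip():
        return "Pattern must not be empty."
    try:
        re.compile(text)
    except re.error as exc:
        return f"Invalid regular expression: {exc}"
    return None


def preview_groups(text: str, filenames: list[str]) -> tuple[str, dict[str, list[str]]]:
    """Build a one-line summary plus a per-group breakdown of the filenames."""
    pattern = re.compile(text)
    groups: dict[str, list[str]] = defaultdict(list)
    unmatched = 0
    for name in filenames:
        match = pattern.search(name)
        if match is None:
            # Non-matching files each form their own single-video group.
            unmatched += 1
            groups[f"(unmatched) {name}"].append(name)
        else:
            key = match.group(1) if pattern.groups else match.group(0)
            groups[key].append(name)
    summary = f"{len(filenames)} videos -> {len(groups)} groups, {unmatched} unmatched"
    return summary, dict(groups)
```

A save handler would call `validate_pattern` first and refuse to persist the setting while it returns a message; the preview only makes sense once the pattern compiles.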

Plumbing

  • Grouping implemented in Project._assign_cv_group_ids; the regex is threaded through the binary/multiclass Train-button gating counters, the training report, and the CLI.
  • Group-mapping consumers (CV result labels, training export, excluded-video logic) updated to handle groups that span multiple videos.
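
To illustrate why consumers need to handle multi-video groups, here is a minimal leave-one-group-out sketch. It is not the implementation in `Project` or the classifiers: `leave_one_group_out` is a hypothetical helper and the video names and group ids are made up.

```python
from collections import defaultdict
from collections.abc import Iterator


def leave_one_group_out(
    video_groups: dict[str, int],
) -> Iterator[tuple[int, list[str], list[str]]]:
    """Yield (held-out group id, training videos, test videos) for each group."""
    members: dict[int, list[str]] = defaultdict(list)
    for video, group_id in video_groups.items():
        members[group_id].append(video)
    for held_out in sorted(members):
        # Every video in the held-out group is tested together, so a test
        # fold can no longer be labelled with a single video name.
        test = members[held_out]
        train = [
            video
            for group_id, videos in sorted(members.items())
            if group_id != held_out
            for video in videos
        ]
        yield held_out, train, test


folds = list(leave_one_group_out({"a1.avi": 0, "a2.avi": 0, "b1.avi": 1, "c.avi": 2}))
```

With this input there are three folds rather than four, and the first fold holds out both `a1.avi` and `a2.avi` at once.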

Docs & tests

  • User guide updated in both copies (online docs/ and in-app help).
  • New tests for the helpers, group assignment, threshold counters, settings persistence, and the dialog preview/validation. Full suite green (796 tests); lint and format clean.

Copilot AI left a comment

Contributor

Pull request overview

Adds a new leave-one-group-out cross-validation grouping strategy, Filename Pattern, allowing users to group videos into CV folds based on a regex-derived key from filenames. This extends the existing Individual Animal and Video grouping modes and threads the new setting through UI, project plumbing, reporting, CLI, export, docs, and tests.

Changes:

  • Add FILENAME_PATTERN grouping strategy plus core helpers (compile_grouping_regex, filename_group_key) and a persisted project setting cv_grouping_regex.
  • Implement filename-pattern grouping in Project._assign_cv_group_ids and update downstream consumers (excluded-video handling, CV labels, threshold gating, exports, reports, CLI).
  • Add UI controls with inline regex validation and a live “group preview”, plus comprehensive tests and doc updates (both in-app and online copies).

Reviewed changes

Copilot reviewed 26 out of 26 changed files in this pull request and generated no comments.

Show a summary per file
| File | Description |
| --- | --- |
| `tests/ui/test_settings_dialog.py` | Adds UI tests for strategy/regex roundtrip, validation behavior, and preview rendering/visibility. |
| `tests/ui/_fakes.py` | Extends UI fakes to include `cv_grouping_regex`. |
| `tests/project/test_settings_manager.py` | Verifies defaulting and persistence reading for `cv_grouping_regex`. |
| `tests/project/test_cv_grouping.py` | Adds focused tests for `Project._assign_cv_group_ids` under filename-pattern grouping. |
| `tests/classifier/test_multi_class_classifier.py` | Tests multi-class threshold counting behavior under filename-pattern grouping (including invalid regex). |
| `tests/classifier/test_classifier.py` | Tests binary threshold counting and gating under filename-pattern grouping (including invalid regex). |
| `src/jabs/ui/training_thread.py` | Threads `cv_grouping_regex` into training report construction. |
| `src/jabs/ui/training_strategy.py` | Extends report-building API to carry `cv_grouping_regex` into `TrainingReportData`. |
| `src/jabs/ui/settings_dialog/settings_group.py` | Adds a `validate()` hook to settings groups to block saving invalid input. |
| `src/jabs/ui/settings_dialog/settings_dialog.py` | Adds "validate all groups before save" behavior and passes project videos into the CV settings group for preview. |
| `src/jabs/ui/settings_dialog/cross_validation_settings_group.py` | Implements regex field visibility, inline validation, debounced live preview, and updated help text. |
| `src/jabs/ui/main_window/central_widget.py` | Threads `cv_grouping_regex` into train-button gating counters for both binary and multi-class. |
| `src/jabs/scripts/cli/cross_validation.py` | Includes `cv_grouping_regex` in CLI-generated training reports. |
| `src/jabs/resources/docs/user_guide/gui.md` | Updates in-app user guide with Filename Pattern explanation and example. |
| `src/jabs/project/settings_manager.py` | Adds `cv_grouping_regex` property reading from project settings. |
| `src/jabs/project/project.py` | Implements filename-pattern group assignment; updates excluded-group logic; threads regex into feature extraction grouping. |
| `src/jabs/project/export_training.py` | Updates training export to store group labels appropriately for filename-pattern groups. |
| `src/jabs/classifier/training_report.py` | Records `cv_grouping_regex` in markdown and JSON report outputs. |
| `src/jabs/classifier/multi_class_classifier.py` | Adds filename-pattern aggregation support to multi-class threshold counting/gating. |
| `src/jabs/classifier/cross_validation.py` | Updates CV test-group label rendering to prefer regex label when present. |
| `src/jabs/classifier/classifier.py` | Adds filename-pattern aggregation support to binary threshold counting/gating. |
| `packages/jabs-core/tests/test_cv_grouping.py` | Adds unit tests for the new enum member and helper functions. |
| `packages/jabs-core/src/jabs/core/enums/cv_grouping.py` | Adds `FILENAME_PATTERN` and regex helper utilities. |
| `packages/jabs-core/src/jabs/core/enums/__init__.py` | Re-exports new helpers for consumers. |
| `packages/jabs-core/src/jabs/core/constants.py` | Adds `CV_GROUPING_REGEX_KEY` constant. |
| `docs/user-guide/gui.md` | Updates online user guide with Filename Pattern explanation and example. |

@gbeane gbeane self-assigned this Jun 17, 2026
…attern-grouping

# Conflicts:
#	tests/ui/test_settings_dialog.py
@gbeane gbeane merged commit 3dfb667 into main Jun 19, 2026
5 checks passed
@gbeane gbeane deleted the feature/cv-filename-pattern-grouping branch June 19, 2026 20:08