Fix IndexError in mut_param_dataset_correlation for 1-column groups #239
Open
jaredgalloway wants to merge 1 commit into
`ModelCollection.mut_param_dataset_correlation` crashed with `IndexError: index 1 is out of bounds for axis 0 with size 1` when a `(mut_param, x)` group reduced to a single column, i.e. when only one of the two replicates had a non-zero entry for that mutation. This surfaces under sparse shift solutions (e.g. continuation-strategy fits with strong fusion regularization).

Extract the per-group correlation reduction into a helper `_pairwise_correlation` that returns NaN for 1-column groups instead of indexing past the matrix bounds. Add unit tests covering both the 2-column happy path and the 1-column regression case.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
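For reviewers who want to see the failure in isolation, here is a minimal reproduction sketch using only pandas. The frame and its column names (`rep_0`, `rep_1`) are invented for illustration and assume replicates are laid out as columns; it is not the code from `multidms`.

```python
import pandas as pd

# Hypothetical stand-in for one (mut_param, x) group:
# rows are mutations, columns are replicates (names are made up).
two_reps = pd.DataFrame(
    {"rep_0": [0.1, 0.4, 0.9], "rep_1": [0.2, 0.5, 0.7]}
)
# Simulate a group where only one replicate survived.
one_rep = two_reps[["rep_0"]]

# With two columns the correlation matrix is 2x2, so [0, 1] exists.
two_shape = two_reps.corr().shape
off_diagonal = two_reps.corr().iloc[0, 1]

# With one column the matrix is 1x1, so [0, 1] is out of bounds.
one_shape = one_rep.corr().shape
try:
    one_rep.corr().iloc[0, 1]
    raised = False
except IndexError as err:
    raised = True
    print(f"IndexError: {err}")

print(two_shape, one_shape, raised)
```

The exact wording of the `IndexError` message may vary between pandas versions, so the sketch only relies on the exception type.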
Summary
Fixes a latent `IndexError` in `ModelCollection.mut_param_dataset_correlation` that surfaces when a `(mut_param, x)` group has only one replicate with a non-zero entry, a regime that is reached under sparse shift solutions (e.g. continuation-strategy fits with strong fusion regularization).

The bug was at `multidms/model_collection.py:1424`. When `replicate_params_df.T` has only one column, `.corr()` returns a 1×1 DataFrame and `iloc[0, 1]` raises `IndexError: index 1 is out of bounds for axis 0 with size 1`.

What changed

- Extracted the per-group correlation reduction into a helper `_pairwise_correlation(replicate_params_df, r)` that returns `NaN` for 1-column groups instead of indexing past the matrix bounds.

How this was discovered

Running the spike pipeline with `strategy: continuation`, `tol=1e-5`, `maxiter=100`, `beta0_ridge=1e-3` against the prod fusionreg grid produced sparse shift solutions where some `(mut_param, fusionreg)` cells had a single surviving replicate, tripping the latent bug. The independent-strategy run on the same hyperparameters did not hit it because its converged solutions are denser. Fit pickles from the failed run were preserved, so re-running only the `evaluate` rule reproduces the issue cheaply.

Test plan

- `pixi run fmt-check`: clean
- `pixi run lint`: clean
- `pixi run pytest tests/test_model_collection.py -k "pairwise_correlation or mut_param_dataset_correlation"`: 5 passed
  - `test_pairwise_correlation_two_replicates` (new)
  - `test_pairwise_correlation_one_replicate_returns_nan` (new regression test)
  - `test_mut_param_dataset_correlation` (existing, still green)
  - `test_mut_param_dataset_correlation_return_data_r1` (existing, still green)
  - `test_mut_param_dataset_correlation_return_data_r2` (existing, still green)
- Re-run the `evaluate` rule against the cached continuation `fit_collection.pkl` and confirm it now completes

🤖 Generated with Claude Code
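As an illustration of the guard described under "What changed", the following is a rough sketch of a NaN-returning reduction together with the two cases the new tests cover. It is not the helper from this PR: the name `pairwise_correlation_sketch` is hypothetical, the frame is assumed to hold replicates as columns, and `r` is assumed to be an exponent applied to the Pearson correlation, which may not match its meaning in `multidms`.

```python
import math

import pandas as pd


def pairwise_correlation_sketch(replicate_params_df: pd.DataFrame, r: int = 1) -> float:
    """Return corr(col_0, col_1) ** r, or NaN when fewer than two columns exist.

    Hypothetical sketch; the orientation of the frame and the role of `r`
    are assumptions, not taken from the multidms source.
    """
    corr = replicate_params_df.corr()
    if corr.shape[0] < 2:
        # A 1x1 matrix has no off-diagonal entry, so report "undefined".
        return float("nan")
    return float(corr.iloc[0, 1]) ** r


# Two replicates that are exactly linearly related (rep_1 = 2 * rep_0 + 1).
both = pd.DataFrame({"rep_0": [0.0, 1.0, 2.0], "rep_1": [1.0, 3.0, 5.0]})
# The regression case: only one replicate column survives.
single = both[["rep_0"]]

two_col_result = pairwise_correlation_sketch(both, r=2)
one_col_result = pairwise_correlation_sketch(single, r=2)

print(two_col_result, one_col_result)
```

Returning NaN rather than raising keeps the per-group reduction total, so groups with a single surviving replicate can be carried through to the result and filtered or reported downstream.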