Add cross_cov_matrix to transition data #13416

Open
xjules wants to merge 10 commits into equinor:main from xjules:store_cross_cov_mtx


Conversation

@xjules
Contributor

@xjules xjules commented Apr 24, 2026

Issue
Resolves #13296
Relates to #13378

Approach
This introduces:

  • AnalysisMatrixEvent - for sending the matrix to update_run_model.
  • AnalysisStorageEvent - for storing the event
  • ensemble endpoint updated to account for blobs

Update: I might need to rethink this a bit, since it is not yet clear how the endpoint should look when loading the data back.

(Screenshot of new behavior in GUI if applicable)

  • PR title captures the intent of the changes, and is fitting for release notes.
  • Added appropriate release note label
  • Commit history is consistent and clean, in line with the contribution guidelines.
  • Make sure unit tests pass locally after every commit (git rebase -i main --exec 'just rapid-tests')

When applicable

  • When there are user facing changes: Updated documentation
  • New behavior or changes to existing untested code: Ensured that unit tests are added (See Ground Rules).
  • Large PR: Prepare changes in small commits for more convenient review
  • Bug fix: Add regression test for the bug
  • Bug fix: Add backport label to latest release (format: 'backport release-branch-name')

@xjules xjules self-assigned this Apr 24, 2026
@codecov-commenter

codecov-commenter commented Apr 24, 2026

Codecov Report

❌ Patch coverage is 97.46835% with 2 lines in your changes missing coverage. Please review.
✅ Project coverage is 89.54%. Comparing base (9def804) to head (1e551ee).
⚠️ Report is 8 commits behind head on main.
✅ All tests successful. No failed tests found.

Files with missing lines                      Patch %   Lines
src/ert/dark_storage/endpoints/ensembles.py   84.61%    2 Missing ⚠️
Additional details and impacted files
@@           Coverage Diff           @@
##             main   #13416   +/-   ##
=======================================
  Coverage   89.54%   89.54%           
=======================================
  Files         464      464           
  Lines       32776    32845   +69     
=======================================
+ Hits        29349    29411   +62     
- Misses       3427     3434    +7     
Flag                         Coverage Δ
cli-tests                    35.83% <68.35%> (+0.07%) ⬆️
fuzz                         43.93% <45.56%> (+0.11%) ⬆️
gui-tests                    59.81% <64.55%> (-0.01%) ⬇️
performance-and-unit-tests   78.10% <97.46%> (+0.02%) ⬆️
test                         45.38% <39.24%> (-0.02%) ⬇️

Flags with carried forward coverage won't be shown.

Files with missing lines                            Coverage Δ
src/ert/analysis/__init__.py                        100.00% <ø> (ø)
src/ert/analysis/_enif_update.py                    96.19% <ø> (ø)
src/ert/analysis/_es_update.py                      94.35% <ø> (ø)
src/ert/analysis/_update_strategies/_adaptive.py    98.24% <100.00%> (+0.32%) ⬆️
src/ert/analysis/event.py                           98.27% <100.00%> (+0.09%) ⬆️
src/ert/dark_storage/json_schema/__init__.py        100.00% <100.00%> (ø)
src/ert/dark_storage/json_schema/ensemble.py        100.00% <100.00%> (ø)
src/ert/run_models/ensemble_information_filter.py   100.00% <ø> (ø)
src/ert/run_models/event.py                         100.00% <100.00%> (ø)
src/ert/run_models/manual_update_enif.py            93.75% <ø> (ø)
... and 5 more

... and 4 files with indirect coverage changes

@codspeed-hq

codspeed-hq Bot commented Apr 24, 2026

Merging this PR will not alter performance

✅ 36 untouched benchmarks


Comparing xjules:store_cross_cov_mtx (1e551ee) with main (c81c829)


@xjules xjules force-pushed the store_cross_cov_mtx branch 3 times, most recently from 903391e to 71010a6 Compare April 29, 2026 10:57
@xjules xjules marked this pull request as ready for review April 29, 2026 10:57
@xjules xjules force-pushed the store_cross_cov_mtx branch from 7fcb860 to 77e71a6 Compare May 4, 2026 13:38
@xjules
Contributor Author

xjules commented May 5, 2026

Locally the test passes.
I've created an issue for the flaky test: #13475

@xjules xjules force-pushed the store_cross_cov_mtx branch 2 times, most recently from db79d7f to 291f99c Compare May 6, 2026 07:04
@xjules xjules force-pushed the store_cross_cov_mtx branch 2 times, most recently from c050e3d to 41e1e07 Compare May 21, 2026 12:54
This adds a new update event, AnalysisMatrixEvent, which sends the
correlation matrix in the callback as part of the event.
It is saved to the posterior / transition storage section together with the
serialized event.

Add AnalysisStorageEvent and sparse flag
Rename posterior_id to ensemble_id

Add artifacts endpoint

This returns all the AnalysisStorageEvents as a list

Add update endpoint

Save matrix after threshold being applied

Replace transition with blob

Rebase with main

Fixup for corr matrix to bytes conv

Make progress_callback a partial function to provide ensemble automatically

Fixups for posterior ensemble

Fixup test
@xjules xjules force-pushed the store_cross_cov_mtx branch from 427c302 to db39c86 Compare May 21, 2026 21:00
Comment thread src/ert/analysis/event.py Outdated
Comment thread src/ert/dark_storage/endpoints/update.py Outdated
Comment thread src/ert/dark_storage/json_schema/update.py Outdated
Comment thread src/ert/storage/local_ensemble.py Outdated
Comment on lines +176 to +177
buf = io.BytesIO()
np.save(buf, corr_XY_matrix)
Contributor

Do we have to do this manually, or can we use numpy.ndarray.tobytes() directly?

Contributor Author

The difference is that tobytes stores only the raw data, while np.save also writes the header (dtype and shape), which we may want.
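
For reference, a minimal sketch of the difference discussed in this thread. It is not code from the PR: the small `matrix` here is a made-up stand-in for `corr_XY_matrix`, and only standard NumPy calls are used.

```python
import io

import numpy as np

# Hypothetical stand-in for corr_XY_matrix.
matrix = np.arange(6, dtype=np.float32).reshape(2, 3)

# tobytes(): raw element data only. Shape and dtype are lost, so the
# reader has to know them from somewhere else (e.g. separate metadata).
raw = matrix.tobytes()
restored_raw = np.frombuffer(raw, dtype=np.float32).reshape(2, 3)

# np.save(): the .npy format, i.e. a header describing dtype, shape and
# memory order, followed by the data. np.load needs no extra information.
buf = io.BytesIO()
np.save(buf, matrix)
npy_bytes = buf.getvalue()
restored_npy = np.load(io.BytesIO(npy_bytes))

print(len(raw))                          # 24 (6 elements * 4 bytes)
print(len(npy_bytes) > len(raw))         # True: header adds overhead
print(restored_npy.dtype, restored_npy.shape)
```

With `tobytes()` the dtype and shape would have to travel in the stored event metadata instead, which ties in with the `dtype` field suggested further down in the review.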

Comment thread src/ert/dark_storage/endpoints/update.py Outdated
Comment thread src/ert/storage/local_ensemble.py Outdated
    sp.sparse.save_npz(blob_path, sparse_blob)
else:
    blob_path = blob_dir / f"{stem}.npy"
    np.save(blob_path, blob)
Contributor Author

this should be bytes.

userdata: Mapping[str, Any] = {}


class BlobOut(BaseModel):
Contributor Author

Replace with the actual BlobStorageData | BlobStorageMatrix and init with validate_python

    uri: str
    file_size: int
    ensemble_id: str

Contributor Author

@xjules xjules May 27, 2026

update_algorithm: ...



class MatrixStorageData(BlobStorageData):
    sparse: bool = False
Contributor Author

dtype: float64, ...

@@ -15,3 +16,8 @@ class BlobStorageData(BaseModel):
uri: str
Contributor Author

{uuid}.blob <- bytes

Comment thread src/ert/storage/blob_data.py Outdated
@@ -15,3 +16,8 @@ class BlobStorageData(BaseModel):
    uri: str
    file_size: int
    ensemble_id: str
Contributor Author

file_type: "parquet", "numpy"

data_type = str(matrix.dtype)

sparsity = 1.0 - (np.count_nonzero(matrix) / matrix.size)
sparse = bool(sparsity > 0.5)
Collaborator

Is there a good reason for 0.5?
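
One way to reason about the threshold is to compare raw storage sizes. The sketch below is an illustration, not code from the PR: the helper names are hypothetical, it assumes a CSR layout (values, column indices, row pointers), and it ignores both the compression that `save_npz` applies by default and any file headers.

```python
import numpy as np


def sparsity(matrix: np.ndarray) -> float:
    """Fraction of entries that are exactly zero (same formula as the diff)."""
    return 1.0 - np.count_nonzero(matrix) / matrix.size


def dense_nbytes(matrix: np.ndarray) -> int:
    """Raw size of the dense array data."""
    return matrix.size * matrix.itemsize


def csr_nbytes(matrix: np.ndarray, index_itemsize: int = 4) -> int:
    """Estimated raw CSR size: values + column indices + row pointers."""
    nnz = int(np.count_nonzero(matrix))
    n_rows = matrix.shape[0]
    return nnz * (matrix.itemsize + index_itemsize) + (n_rows + 1) * index_itemsize


# Hypothetical 100 x 100 float64 matrix with 60 fully populated rows,
# so 40% of the entries are zero.
matrix = np.zeros((100, 100), dtype=np.float64)
matrix[:60, :] = 1.0

print(sparsity(matrix) > 0.5)   # False: the 0.5 rule stores this densely
print(dense_nbytes(matrix))     # 80000
print(csr_nbytes(matrix))       # 72404 with 32-bit indices
print(csr_nbytes(matrix, 8))    # 96808 with 64-bit indices
```

Under these assumptions, 0.5 is roughly the break-even point for float64 values with 64-bit indices, while 32-bit indices move the break-even down to a sparsity of about one third. Compression and load-time cost would shift this further, so measuring on real matrices would be the safer basis for a threshold.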

@xjules xjules added the blocked label May 29, 2026
Development

Successfully merging this pull request may close these issues.

Transition data: Add local storage API for cross-correlation matrix in Adaptive localization

4 participants