Draft: Make fof compare work with radar fof files#103
Conversation
huppd
left a comment
There was a problem hiding this comment.
Thanks a lot for this new feature. It is minimal invasive, what I like very much.
I only have some minor comments. Final verdict should come from an expert and should be checked if it is compatible with the letkf pipeline.
| # A value present (non-NaN) in one file but missing (NaN) in the other is a | ||
| # real difference -- an observation appeared or disappeared. The relative | ||
| # diff is NaN there and check_variable would treat NaN as within tolerance, | ||
| # silently passing it; force those cells to fail. Both-NaN stays NaN (and | ||
| # passes), since both being missing means they are equal. | ||
| only_one_nan = df_ref.isna() ^ df_cur.isna() |
There was a problem hiding this comment.
Wouldn't this be useful for every compute_rel_diff_dataframe and coud be used there?
There was a problem hiding this comment.
Yes, should we move this out of the fof file only code path and into the general dataframe code path?
There was a problem hiding this comment.
I can't see any reason why not. Do you see any drawbacks?
| n_rows_file1 = xr.open_dataset(file1_path).attrs["n_body"] | ||
| n_rows_file2 = xr.open_dataset(file2_path).attrs["n_body"] |
There was a problem hiding this comment.
I guess that needs to be discussed with an expert.
cghielmini
left a comment
There was a problem hiding this comment.
Thanks for your work, I just have a little modificaiton to suggest to avoid the LETKF check from failing
| if ds.sizes.get("d_veri", 1) != 1: | ||
| raise ValueError( | ||
| f"fof file has d_veri = {ds.sizes['d_veri']} verification runs; " | ||
| "comparison supports exactly one. Select a single run before comparing." | ||
| ) |
There was a problem hiding this comment.
These lines are causing an error in the LETKF check. If they are not needed, could we please remove them?
- Strip the over-allocated padding by the real n_hdr/n_body counts (radar files allocate d_* larger than the real data), replacing the previous zero-append hack with an isel slice - Sort radar observations by the per-observation position dlat/dlon instead of the radar-station lat/lon, so the row order is deterministic - Size the fof_compare consistency gate and tolerance vector by n_body (real count) rather than the padded d_body - Flag veri_data cells that are NaN in one file but real in the other as differences instead of silently passing them - Guard against non-front-packed padding (via the dlat discriminator) and against multiple verification runs (d_veri > 1), failing loudly - Skip empty data frames in fof_compare's detailed log via log_dataframe - Add a realistic radar fixture (2-D veri_data, float fields, NaN padding) and unit/integration tests: padding strip, sort determinism, scattered-padding guard, and fof_compare consistent / not-consistent / different-padding / NaN-vs-real cases
Make fof comparison work for radar fof files
files allocate d_* larger than the real data), replacing the previous
zero-append hack with an isel slice
of the radar-station lat/lon, so the row order is deterministic
count) rather than the padded d_body
differences instead of silently passing them
against multiple verification runs (d_veri > 1), failing loudly
unit/integration tests: padding strip, sort determinism, scattered-padding
guard, and fof_compare consistent / not-consistent / different-padding /
NaN-vs-real cases
Co-authored-by: Alina Yapparova alina.yapparova@meteoswiss.ch