Add grid_mapping exception for ocean grids#40
Conversation
See ESGF#36 Co-authored-by: Copilot <copilot@github.com>
| # The allowed exceptions for omitting the grid_mapping attribute, based on the global attribute 'grid' | ||
| gmomittedif = ["tripolar",] | ||
| grid_description = str(getattr(CheckerObject.ds, "grid", "") or "").lower() | ||
| grid_mapping_optional = any( | ||
| token in grid_description for token in gmomittedif | ||
| ) |
There was a problem hiding this comment.
For ICON-CLM EUR-12 we use the following grid description:
Rotated-pole latitude-longitude with 0.11 degree grid spacing, interpolated by nearest neighbor remapping from the original unstructured icosahedral ICON grid R13B05 (~12.1 km).
For the coupled ICON-CLM+NEMO, for the ocean data, this could as well be
Rotated-pole latitude-longitude with 0.04 degree grid spacing, interpolated by first order conservative remapping from the original tripolar NEMO grid XYZ (~4 km).
Which would register as an exception despite potentially lacking the proper grid_mapping?!
Maybe the way to go would be via the variable metadata (cell_measures containing areacello or volcello as you also mentioned as a possibility in the issue thread). What do you think?
There was a problem hiding this comment.
Good counter-example, thanks!
but this only reinforces the idea that we cannot use variable properties to trigger the need for a grid_mapping. In your example, ocean variables have a planar projection and must have a grid_mapping. So the requirement of having a grid_mapping relies entirely on the grid actually used, which is to be described in the grid attribute. We could have a special systematic keyword to invoke the exception (instead of having to collect popular words likely to appear, such as tripolar). E.g. start with "Ocean", like in "Ocean NEMO ORCA tripolar grid with a 1/12 degree (6-8km) grid spacing; Mediterranean Sea only".
We could have a similar keyword for other grids which are being debated (e.g. rivers WCRP-CORDEX/data-request-table#69).
| if grid_mapping_optional: | ||
| testctx.add_pass() | ||
| elif ("rlat" in CheckerObject.xrds and "rlon" in CheckerObject.xrds) or ( |
There was a problem hiding this comment.
If I'm correct, check_grid_mapping is currently the only function that verifies whether coordinate variables are present in the file. In that case, even when grid_mapping attribute is omitted, we should still check for native coordinate variables using common naming conventions (e.g. lat/lon or latitude/longitude or y/x).
So lines R130-132 could be updated to something like:
coord_pairs = [
("lat", "lon"),
("latitude", "longitude"),
("y", "x"),
]
if (
(grid_mapping_optional and any(
a in CheckerObject.xrds and b in CheckerObject.xrds
for a, b in coord_pairs
))
or ("rlat" in CheckerObject.xrds and "rlon" in CheckerObject.xrds) or (
There was a problem hiding this comment.
according to the example by @ssirviente in #36 (comment), the tripolar grid is not defining 1D coordinate variables. x and y are defined just as dummy dimensions and only 2D lat and lon variables are provided, right?
There was a problem hiding this comment.
Ok, I thought there needed to be at least a check for the existence of a coordinate variable in the file, regardless of whether it is 1D or 2D or the grid_mapping_name we use. As done in lines L118-127
There was a problem hiding this comment.
you are right. I'm not sure if this is explicitly tested. 2D lat/lon variables are tested here:
cc-plugin-wcrp/checks/variable_checks/check_coords_cordex_cmip6.py
Lines 28 to 43 in 54398a0
but if the variables do not exist, the check is passed without failure (L33). @sol1105?
There was a problem hiding this comment.
@jesusff @gkara00 Sorry for my long response time. I checked this by removing lat and lon coordinates from a file using ncks, and got:
- Required
- [CDXA001] domain_id
- Cannot check 'domain_id' as latitude and longitude coordinate variables could not be identified.
- [CDXA001] domain_id
- Recommended
- [CDXV001] Existence of latitude and longitude bounds
- It is recommended for the variables 'lat' and 'lon' to have bounds defined.
- [CDXV001] Existence of latitude and longitude bounds
Also the CF-checker then reports the following in that case:
- Error (i.e. required / mandatory)
- §5.6 Horizontal Coordinate Reference Systems, Grid Mappings, Projections
- tas's coordinate variable "rlat" is not one of the variables identifying true latitude/longitude and its dimensions are not a subset of tas's dimensions
- tas's coordinate variable "rlon" is not one of the variables identifying true latitude/longitude and its dimensions are not a subset of tas's dimensions
- §5.6 Horizontal Coordinate Reference Systems, Grid Mappings, Projections
So I think we are covered in that regard.
There was a problem hiding this comment.
The part of its dimensions are not a subset of tas's dimensions is not correct however as tas in my test case has dimensions (time, rlat, rlon) and rlat and rlon have dimensions rlat and rlon, respectively.
to the literal text "(no grid_mapping)" included in the `grid` attribute, as suggested in ESGF#36 and WCRP-CORDEX/archive-specifications#44 Co-authored-by: Copilot <copilot@github.com>
sol1105
left a comment
There was a problem hiding this comment.
Thanks for setting up the PR. I left one comment requesting a change and one plain comment.
| testctx.add_pass() | ||
| else: | ||
| testctx.add_failure( | ||
| f"The grid_mapping variable '{crs}', describing the coordinate reference system," |
There was a problem hiding this comment.
If the grid_mapping variable does not exist crs is "False", so this should be omitted here (was my mistake originally). Alternatively, the failure message could be, eg.:
"No grid_mapping variable, describing the coordinate reference system, could be found in the file."
| # The allowed exception for omitting the grid_mapping attribute, based on the global attribute 'grid' | ||
| gmomittedtext = "(no grid_mapping)" | ||
| grid_description = str(getattr(CheckerObject.ds, "grid", "") or "").lower() | ||
| grid_mapping_optional = gmomittedtext in grid_description |
There was a problem hiding this comment.
I think it is a good solution because assessing if a grid_mapping is mandatory, would be hard, but: it also allows anyone to bypass this check by specifying "(no grid_mapping)" 😉
One thought however: for geographic lat-lon grids (whether tripolar, gaussian, regular, ...), there is a grid_mapping name (which is "latitude_longitude"), and this one could be optionally set for a tripolar ocean grid. So we could alternatively (or additionally, since my suggestion comes too late with regards to the arch spec v3 release) add "latitude_longitude" to the list of allowed grid_mapping names. This would also allow to specify the shape of the Earth as part of the grid_mapping-variable's metadata, which is also an arch. spec. requirement for cases where grid_mapping is mandatory (according to CF and the archive specs.), but certainly would also be nice to have for other grids.
See #36