Skip to content

Add grid_mapping exception for ocean grids#40

Open
jesusff wants to merge 2 commits into
ESGF:developfrom
jesusff:add-cordex-grid-exception
Open

Add grid_mapping exception for ocean grids#40
jesusff wants to merge 2 commits into
ESGF:developfrom
jesusff:add-cordex-grid-exception

Conversation

@jesusff
Copy link
Copy Markdown

@jesusff jesusff commented Apr 30, 2026

See #36

See ESGF#36

Co-authored-by: Copilot <copilot@github.com>
Comment on lines +33 to +38
# The allowed exceptions for omitting the grid_mapping attribute, based on the global attribute 'grid'
gmomittedif = ["tripolar",]
grid_description = str(getattr(CheckerObject.ds, "grid", "") or "").lower()
grid_mapping_optional = any(
token in grid_description for token in gmomittedif
)
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For ICON-CLM EUR-12 we use the following grid description:

Rotated-pole latitude-longitude with 0.11 degree grid spacing, interpolated by nearest neighbor remapping from the original unstructured icosahedral ICON grid R13B05 (~12.1 km).

For the coupled ICON-CLM+NEMO, for the ocean data, this could as well be

Rotated-pole latitude-longitude with 0.04 degree grid spacing, interpolated by first order conservative remapping from the original tripolar NEMO grid XYZ (~4 km).

Which would register as an exception despite potentially lacking the proper grid_mapping?!

Maybe the way to go would be via the variable metadata (cell_measures containing areacello or volcello as you also mentioned as a possibility in the issue thread). What do you think?

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good counter-example, thanks!

but this only reinforces the idea that we cannot use variable properties to trigger the need for a grid_mapping. In your example, ocean variables have a planar projection and must have a grid_mapping. So the requirement of having a grid_mapping relies entirely on the grid actually used, which is to be described in the grid attribute. We could have a special systematic keyword to invoke the exception (instead of having to collect popular words likely to appear, such as tripolar). E.g. start with "Ocean", like in "Ocean NEMO ORCA tripolar grid with a 1/12 degree (6-8km) grid spacing; Mediterranean Sea only".

We could have a similar keyword for other grids which are being debated (e.g. rivers WCRP-CORDEX/data-request-table#69).

Comment on lines +130 to +132
if grid_mapping_optional:
testctx.add_pass()
elif ("rlat" in CheckerObject.xrds and "rlon" in CheckerObject.xrds) or (
Copy link
Copy Markdown

@gkara00 gkara00 May 8, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I'm correct, check_grid_mapping is currently the only function that verifies whether coordinate variables are present in the file. In that case, even when grid_mapping attribute is omitted, we should still check for native coordinate variables using common naming conventions (e.g. lat/lon or latitude/longitude or y/x).

So lines R130-132 could be updated to something like:
coord_pairs = [
("lat", "lon"),
("latitude", "longitude"),
("y", "x"),
]

if (
(grid_mapping_optional and any(
a in CheckerObject.xrds and b in CheckerObject.xrds
for a, b in coord_pairs
))
or ("rlat" in CheckerObject.xrds and "rlon" in CheckerObject.xrds) or (

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

according to the example by @ssirviente in #36 (comment), the tripolar grid is not defining 1D coordinate variables. x and y are defined just as dummy dimensions and only 2D lat and lon variables are provided, right?

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, I thought there needed to be at least a check for the existence of a coordinate variable in the file, regardless of whether it is 1D or 2D or the grid_mapping_name we use. As done in lines L118-127

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you are right. I'm not sure if this is explicitly tested. 2D lat/lon variables are tested here:

if "longitude" in CheckerObject.xrds.cf.coordinates:
lon = CheckerObject.xrds[CheckerObject.xrds.cf.coordinates["longitude"][0]]
elif "lon" in CheckerObject.xrds:
lon = CheckerObject.xrds["lon"]
else:
testctx.add_pass()
return [testctx.to_result()]
# Get domain_id from global attributes
domain_id = CheckerObject._get_attr("domain_id", default="")
if not isinstance(domain_id, str):
domain_id = ""
# Check if longitude coordinates are strictly monotonically increasing
if lon.ndim != 2:
testctx.add_failure("The longitude coordinate should have two dimensions.")

but if the variables do not exist, the check is passed without failure (L33). @sol1105?

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jesusff @gkara00 Sorry for my long response time. I checked this by removing lat and lon coordinates from a file using ncks, and got:

  • Required
    • [CDXA001] domain_id
      • Cannot check 'domain_id' as latitude and longitude coordinate variables could not be identified.
  • Recommended
    • [CDXV001] Existence of latitude and longitude bounds
      • It is recommended for the variables 'lat' and 'lon' to have bounds defined.

Also the CF-checker then reports the following in that case:

  • Error (i.e. required / mandatory)
    • §5.6 Horizontal Coordinate Reference Systems, Grid Mappings, Projections
      • tas's coordinate variable "rlat" is not one of the variables identifying true latitude/longitude and its dimensions are not a subset of tas's dimensions
      • tas's coordinate variable "rlon" is not one of the variables identifying true latitude/longitude and its dimensions are not a subset of tas's dimensions

So I think we are covered in that regard.

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The part of its dimensions are not a subset of tas's dimensions is not correct however as tas in my test case has dimensions (time, rlat, rlon) and rlat and rlon have dimensions rlat and rlon, respectively.

to the literal text "(no grid_mapping)" included in the `grid` attribute, as suggested in ESGF#36 and WCRP-CORDEX/archive-specifications#44

Co-authored-by: Copilot <copilot@github.com>
@jesusff jesusff requested a review from sol1105 May 18, 2026 18:15
Copy link
Copy Markdown
Collaborator

@sol1105 sol1105 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for setting up the PR. I left one comment requesting a change and one plain comment.

testctx.add_pass()
else:
testctx.add_failure(
f"The grid_mapping variable '{crs}', describing the coordinate reference system,"
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the grid_mapping variable does not exist crs is "False", so this should be omitted here (was my mistake originally). Alternatively, the failure message could be, eg.:

"No grid_mapping variable, describing the coordinate reference system, could be found in the file."

# The allowed exception for omitting the grid_mapping attribute, based on the global attribute 'grid'
gmomittedtext = "(no grid_mapping)"
grid_description = str(getattr(CheckerObject.ds, "grid", "") or "").lower()
grid_mapping_optional = gmomittedtext in grid_description
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it is a good solution because assessing if a grid_mapping is mandatory, would be hard, but: it also allows anyone to bypass this check by specifying "(no grid_mapping)" 😉

One thought however: for geographic lat-lon grids (whether tripolar, gaussian, regular, ...), there is a grid_mapping name (which is "latitude_longitude"), and this one could be optionally set for a tripolar ocean grid. So we could alternatively (or additionally, since my suggestion comes too late with regards to the arch spec v3 release) add "latitude_longitude" to the list of allowed grid_mapping names. This would also allow to specify the shape of the Earth as part of the grid_mapping-variable's metadata, which is also an arch. spec. requirement for cases where grid_mapping is mandatory (according to CF and the archive specs.), but certainly would also be nice to have for other grids.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants