From ed7984cffb2881f9dbd8e2fdd27ca0ac8cfb55c1 Mon Sep 17 00:00:00 2001 From: Vecko <36369090+VeckoTheGecko@users.noreply.github.com> Date: Wed, 17 Jun 2026 15:26:12 +0200 Subject: [PATCH 1/2] Add data cataloging tutorial --- intermediate/cataloging.ipynb | 61 +++++++++++++++++++++++++++++++++++ 1 file changed, 61 insertions(+) create mode 100644 intermediate/cataloging.ipynb diff --git a/intermediate/cataloging.ipynb b/intermediate/cataloging.ipynb new file mode 100644 index 00000000..827a0e7f --- /dev/null +++ b/intermediate/cataloging.ipynb @@ -0,0 +1,61 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "id": "1a1c33fb", + "metadata": { + "vscode": { + "languageId": "plaintext" + } + }, + "source": [ + "# Data cataloging for Xarray\n", + "\n", + ":::{admonition} Under construction\n", + "This notebook is very much still under construction\n", + ":::\n", + "\n", + "**Goals:** By the end of this tutorial, you'll have an overview of what data cataloging is, why it is done, and what tools are available. TODO: Refine goal\n", + "\n", + "## What is cataloging? Why is it useful?\n", + "\n", + "- Many different ways to open Xarray datasets\n", + " - From file\n", + " - NetCDF\n", + " - Zarr\n", + " - From Icechunk store\n", + " - From remote URLs\n", + " - From custom engines (see tutorial x)\n", + "- Copying all of this from script to script is TIRING\n", + "- It's a data engineering problem (and one that's at the org level)\n", + "- What if we could map out the datasets that we use on an org level, and expose that as a collection of datasets? 
A ✨catalogue✨ if you will 🧐\n", + "\n", + "\n", + "## \n", + "\n", + "\n", + "## Packages\n", + "\n", + "- odc-stac\n", + "- stackstac\n", + "- xpystac\n", + "- lazycogs(?)\n", + "- intake v2\n", + "\n", + "\n", + "## More resources\n", + "\n", + "https://guide.cloudnativegeo.org/cookbooks/zarr-stac-report/data-consumers/ \n", + "\n", + "\n" + ] + } + ], + "metadata": { + "language_info": { + "name": "python" + } + }, + "nbformat": 4, + "nbformat_minor": 5 +} From 2840202800432dfbc7c08b2c8e45f5474d37c4aa Mon Sep 17 00:00:00 2001 From: "pre-commit-ci[bot]" <66853113+pre-commit-ci[bot]@users.noreply.github.com> Date: Wed, 17 Jun 2026 14:34:12 +0000 Subject: [PATCH 2/2] [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --- intermediate/cataloging.ipynb | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/intermediate/cataloging.ipynb b/intermediate/cataloging.ipynb index 827a0e7f..89225128 100644 --- a/intermediate/cataloging.ipynb +++ b/intermediate/cataloging.ipynb @@ -2,7 +2,7 @@ "cells": [ { "cell_type": "markdown", - "id": "1a1c33fb", + "id": "0", "metadata": { "vscode": { "languageId": "plaintext"