61 changes: 61 additions & 0 deletions intermediate/cataloging.ipynb
@@ -0,0 +1,61 @@
{
"cells": [
{
"cell_type": "markdown",
"id": "0",
"metadata": {
"vscode": {
"languageId": "plaintext"
}
},
"source": [
"# Data cataloging for Xarray\n",
"\n",
":::{admonition} Under construction\n",
"This notebook is very much still under construction.\n",
":::\n",
"\n",
"**Goals:** At the end of this tutorial, you'll have an overview of what data cataloging is, why it is done, and what tools are available. TODO: Refine goal\n",
"\n",
"## What is cataloging? Why is it useful?\n",
"\n",
"- Many different ways to open Xarray datasets\n",
" - From file\n",
" - NetCDF\n",
" - Zarr\n",
" - From an Icechunk store\n",
" - From remote URLs\n",
" - From custom engines (see tutorial x)\n",
"- Copying all of this from script to script is TIRING\n",
"- It's a data engineering problem (and one that sits at the org level)\n",
"- What if we could map out the datasets that we use at the org level, and expose them as a collection of datasets? A ✨catalogue✨ if you will 🧐\n",
"\n",
"\n",
"## \n",
"\n",
"\n",
"## Packages\n",
"\n",
"- odc-stac\n",
"- stackstac\n",
"- xpystac\n",
"- lazycogs(?)\n",
"- intake v2\n",
"\n",
"\n",
"## More resources\n",
"\n",
"https://guide.cloudnativegeo.org/cookbooks/zarr-stac-report/data-consumers/ \n",
"\n",
"\n"
]
}
],
"metadata": {
"language_info": {
"name": "python"
}
},
"nbformat": 4,
"nbformat_minor": 5
}