Backend Modernization #47
Open
manuvanegas wants to merge 20 commits into
Move all deploy YAML files into a single directory next to timeseries. The Dockerfile copies only the YAML configs it needs.
Drop Numba; add anyio, httpx, aioboto3. Pin pydantic==2.x and pydantic-settings. Replace custom yaml_config_settings_source with YamlConfigSettingsSource.
JSON-serializable models for the background task pipeline. Add TimeseriesAnalyzeRequest to support a workflow that decouples raster data extraction and summarization from downstream analysis (extract-once/analyze-many).
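A minimal sketch of the extract-once/analyze-many idea, assuming a stored base series that later analysis requests transform without touching the rasters again. The `BaseSeries` shape, the `analyze` signature, the population standard deviation, and the `None` padding at the edges of the rolling window are illustrative assumptions, not the PR's actual models.

```python
import json
import statistics
from dataclasses import asdict, dataclass
from typing import List, Optional


@dataclass
class BaseSeries:
    """Hypothetical JSON-serializable result of a single extraction."""
    timesteps: List[str]
    values: List[float]


def zscore(values: List[float]) -> List[float]:
    """Standardize values (population std is an assumption)."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    if sd == 0:
        return [0.0 for _ in values]
    return [(v - mean) / sd for v in values]


def rolling_average(values: List[float], width: int) -> List[Optional[float]]:
    """Centered rolling mean; positions without a full window become None."""
    if width < 1 or width % 2 == 0:
        raise ValueError("width must be a positive odd integer")
    half = width // 2
    out: List[Optional[float]] = []
    for i in range(len(values)):
        if i < half or i + half >= len(values):
            out.append(None)
        else:
            out.append(sum(values[i - half:i + half + 1]) / width)
    return out


def analyze(stored_json: str, transform: str = "none",
            smoother_width: Optional[int] = None) -> dict:
    """Apply an optional transform and smoother to an already-extracted series."""
    series = BaseSeries(**json.loads(stored_json))
    values = series.values
    if transform == "zscore":
        values = zscore(values)
    elif transform != "none":
        raise ValueError(f"unknown transform: {transform}")
    if smoother_width is not None:
        values = rolling_average(values, smoother_width)
    return {"timesteps": series.timesteps, "values": values}


# Extract once (simulated here), then analyze as many times as needed.
stored = json.dumps(asdict(BaseSeries(
    ["2001", "2002", "2003", "2004", "2005"], [1.0, 2.0, 3.0, 4.0, 5.0])))
raw = analyze(stored)
standardized = analyze(stored, transform="zscore")
smoothed = analyze(stored, smoother_width=3)
```

Because the stored series is plain JSON, each call to `analyze` is cheap compared with a new raster extraction.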
Add a job store for background tasks, registry and lookup-dict loaders, and a data reader abstraction that supports loading JSON dicts from S3 or local disk.
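A filesystem job store with atomic writes could look roughly like the sketch below. The class name `FileJobStore`, the one-file-per-job layout, and the `update`/`get` method names are assumptions for illustration; the sketch also assumes job IDs have already been validated, so they are safe to use in a file name.

```python
import json
import os
import tempfile
from pathlib import Path
from typing import Optional


class FileJobStore:
    """Hypothetical filesystem job store: one JSON file per job."""

    def __init__(self, root: str) -> None:
        self.root = Path(root)
        self.root.mkdir(parents=True, exist_ok=True)

    def _path(self, job_id: str) -> Path:
        return self.root / f"{job_id}.json"

    def update(self, job_id: str, payload: dict) -> None:
        # Write to a temp file in the same directory, then rename over the
        # target. os.replace is atomic on the same filesystem, so readers
        # never observe a half-written job file.
        fd, tmp = tempfile.mkstemp(dir=str(self.root), suffix=".tmp")
        try:
            with os.fdopen(fd, "w", encoding="utf-8") as fh:
                json.dump(payload, fh)
            os.replace(tmp, self._path(job_id))
        except BaseException:
            if os.path.exists(tmp):
                os.unlink(tmp)
            raise

    def get(self, job_id: str) -> Optional[dict]:
        try:
            with open(self._path(job_id), encoding="utf-8") as fh:
                return json.load(fh)
        except FileNotFoundError:
            return None


# Usage: a job moves from PENDING to SUCCESS; unknown IDs return None.
with tempfile.TemporaryDirectory() as tmp_dir:
    store = FileJobStore(tmp_dir)
    store.update("job-1", {"status": "PENDING"})
    store.update("job-1", {"status": "SUCCESS", "results": [1.5, 2.5]})
    final = store.get("job-1")
    missing = store.get("does-not-exist")
```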
…ny logic validation: regex + registry ID filtering, geometry cell-count estimation.
- slice_resolver: temporal range → file+band mapping via lookup dict.
- tiles: async TiTiler proxy using app-level httpx.AsyncClient.
- timeseries_processing: anyio CapacityLimiter(10) + to_thread for concurrent rasterio reads; z-score and rolling-average transforms.
- timeseries_tasks: background task orchestrator for full pipeline lifecycle.
…s` and `transform` to metadata
Add:
- store/: test_jobs (atomic writes, stale cleanup), test_data_reader (S3/local factory), test_lookup_dicts (schema + ISO-8601 ordering validation, cache)
- core/: test_registry (YAML validation, transform lengths, temporal resolution), test_validation (CRS detection heuristics, cell-count estimation, geom size), test_timeseries_processing (band chunking, z-score/rolling transforms, execute_analyze_request edge cases)
- schemas/: test_geometry_schemas (bbox bounds, DE-9IM, reprojection), test_timeseries_schemas (pattern matching, path-traversal/injection rejection, smoother width constraints)

pytest.ini: asyncio_mode=auto; integration marker for tests needing raster I/O.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
graph LR
UI["skopeui"]
subgraph offline["skope-datasets (offline pipeline)"]
Pipeline["'convert to stac' pipeline"]
end
subgraph skope-api["skope-api (FastAPI)"]
subgraph endpoints["v3 Router endpoints"]
EP1["GET /metadata"]
EP2["GET /tiles/{ds}/{var}/{t}/{z}/{x}/{y}"]
EP3["POST /timeseries/extract"]
EP4["POST /timeseries/analyze"]
EP5["GET /timeseries/status/{id}"]
end
Lookup[("Lookup Cache\n(JSON/Dict)")]
BgTask["Background Task\n(Worker Process)"]
Jobs[("Job Store\n(JSON/JobStore)")]
Titiler["TiTiler (internal)"]
end
subgraph storage["Storage"]
subgraph localst["local"]
MetaYAML["metadata.yml\n(local registry)"]
end
subgraph s3st["AWS S3"]
COGs["COGs + STAC\n(paleocar_v3/)"]
lookupdict["Lookup Dict"]
end
end
%% ==========================================
%% ACTUAL CONNECTIONS (No Hacks)
%% ==========================================
UI --> EP1
UI --> EP2
UI --> EP3
UI --> EP4
UI --> EP5
EP1 -.->|"registry (loaded at startup)"| MetaYAML
EP2 -->|"A2: request then proxy tile"| Titiler
EP2 -->|"A1: resolve URI + band"| Lookup
Titiler -->|"A3: read COG tile"| COGs
EP3 -->|"B2: dispatch"| BgTask
EP3 -->|"B1: write PENDING"| Jobs
BgTask -->|"B3: resolve URIs + bands"| Lookup
BgTask -->|"B4: rasterio reads\n(anyio ≤10 concurrent)"| COGs
Lookup -.->|"cache miss"| lookupdict
BgTask -->|"B5: compute zonal stats, write results + SUCCESS"| Jobs
EP4 -->|"C: read base_series and apply transform/smoothing"| Jobs
EP5 -->|"B6: poll job status + get results"| Jobs
Pipeline ---|"write COG STAC + lookup.json"| Junc(( ))
Junc --> COGs
Junc --> lookupdict
%% Style the junction to look like a small gray dot
style Junc fill:#888,stroke:#888,stroke-width:1px
style MetaYAML fill:#fff,stroke:#333,stroke-dasharray: 5 5
style COGs fill:#fff,stroke:#333,stroke-dasharray: 5 5
style lookupdict fill:#fff,stroke:#333,stroke-dasharray: 5 5
Introduce a Redis-backed job store and enable Redis in deployment. Adds a Redis service to docker-compose and exposes REDIS_URL to the app, adds redis dependency to requirements, and adds redis_url to app settings. Implements RedisJobStore (with 24h TTL) and updates get_job_store to prefer Redis when redis_url is configured. Also bumps default_max_cells to 1,000,000.
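A rough sketch of how a Redis-backed store with a 24h TTL and a "prefer Redis when configured" factory might fit together. The `job:` key prefix, the method names, the `make_client` and `fallback` parameters, and the `_FakeRedis` stand-in are all assumptions for illustration. The store only relies on a client exposing redis-py style `set(name, value, ex=...)` and `get(name)`.

```python
import json
from typing import Any, Callable, Optional

_JOB_TTL_SECONDS = 24 * 60 * 60  # 24h TTL, as described in the PR
_KEY_PREFIX = "job:"             # hypothetical key namespace


class RedisJobStore:
    """Sketch of a job store backed by a redis-py style client."""

    def __init__(self, client: Any) -> None:
        self._client = client

    def update(self, job_id: str, payload: dict) -> None:
        # Every write refreshes the expiry so finished jobs age out.
        self._client.set(_KEY_PREFIX + job_id, json.dumps(payload),
                         ex=_JOB_TTL_SECONDS)

    def get(self, job_id: str) -> Optional[dict]:
        raw = self._client.get(_KEY_PREFIX + job_id)
        return None if raw is None else json.loads(raw)


class _FakeRedis:
    """In-memory stand-in so the sketch runs without a Redis server."""

    def __init__(self) -> None:
        self.values: dict = {}
        self.ttls: dict = {}

    def set(self, name: str, value: str, ex: Optional[int] = None) -> None:
        self.values[name] = value.encode("utf-8")  # Redis returns bytes
        self.ttls[name] = ex

    def get(self, name: str) -> Optional[bytes]:
        return self.values.get(name)


def get_job_store(redis_url: Optional[str],
                  make_client: Callable[[str], Any],
                  fallback: Any) -> Any:
    """Prefer Redis when a URL is configured; otherwise use the fallback."""
    if redis_url:
        return RedisJobStore(make_client(redis_url))
    return fallback


fake = _FakeRedis()
store = get_job_store("redis://redis:6379/0", lambda url: fake, fallback=None)
store.update("abc123", {"status": "PENDING"})
```

With the real redis package, `make_client` could be `redis.Redis.from_url`; the fake client here only exists to keep the example self-contained.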
Introduce a pytest fixture for RedisJobStore and add comprehensive tests for Redis-backed job storage. Updates conftest to import RedisJobStore and provide redis_job_store with teardown flushing. Tests cover update/get semantics, overwrite behavior, missing keys, TTL enforcement (using _JOB_TTL_SECONDS), key namespacing, and handling of full success payloads to ensure Redis store correctness.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Add fallback and safeguards for geometries that don't cover pixel centers. In timeseries_processing.py, use geometry_mask(all_touched=True) when the initial mask has zero covered pixels, and ensure the computed window is at least 1x1. In validation.py, treat lists of Shapely Point geometries as always within size limits by returning early. These changes avoid zero-coverage masks for points and very small polygons and prevent zero-dimension windows.
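The fallback logic can be illustrated without rasterio. The sketch below is a pure-Python stand-in, assuming unit-sized pixels and an axis-aligned box as the geometry; the function names are hypothetical and the real code works on rasterio masks and windows instead.

```python
import math
from typing import List, Tuple

Box = Tuple[float, float, float, float]   # (minx, miny, maxx, maxy)
Window = Tuple[int, int, int, int]        # (col_off, row_off, width, height)


def window_for_bounds(box: Box) -> Window:
    """Pixel window covering the box, clamped to at least 1x1."""
    minx, miny, maxx, maxy = box
    col_off, row_off = math.floor(minx), math.floor(miny)
    width = max(1, math.ceil(maxx) - col_off)
    height = max(1, math.ceil(maxy) - row_off)
    return col_off, row_off, width, height


def covered_pixels(box: Box, window: Window,
                   all_touched: bool = False) -> List[Tuple[int, int]]:
    """Pixels whose center lies in the box, or any touched pixel if requested."""
    minx, miny, maxx, maxy = box
    col_off, row_off, width, height = window
    covered = []
    for row in range(row_off, row_off + height):
        for col in range(col_off, col_off + width):
            if all_touched:
                hit = (col <= maxx and col + 1 >= minx
                       and row <= maxy and row + 1 >= miny)
            else:
                cx, cy = col + 0.5, row + 0.5
                hit = minx <= cx <= maxx and miny <= cy <= maxy
            if hit:
                covered.append((row, col))
    return covered


def mask_with_fallback(box: Box) -> Tuple[Window, List[Tuple[int, int]]]:
    """Center-based mask first; fall back to all-touched if nothing is covered."""
    window = window_for_bounds(box)
    covered = covered_pixels(box, window)
    if not covered:
        covered = covered_pixels(box, window, all_touched=True)
    return window, covered


large = mask_with_fallback((0.0, 0.0, 3.0, 2.0))   # covers six pixel centers
tiny = mask_with_fallback((2.1, 3.1, 2.3, 3.3))    # misses every pixel center
point = mask_with_fallback((2.0, 3.0, 2.0, 3.0))   # zero-area geometry
```

The tiny polygon and the point both miss every pixel center, so only the all-touched pass and the 1x1 minimum keep them from producing an empty result.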
Replace heuristic geographic detection with pyproj.CRS and compute geometry area robustly. Add dataset_crs parameter to calculate_spatial_coverage, union shapes with shapely.unary_union and use geodetic area for geographic CRSs (fall back to planar area otherwise). Update caller to pass dataset_crs. Remove legacy COMMON_GEOGRAPHIC_EPSG/is_geographic heuristic and update estimate_cell_count/validate_geom_size to accept CRS strings and rely on CRS.is_geographic().
Add timestep precision detection and normalization helpers to verify and coerce incoming ISO-8601 timesteps to the dataset's resolution (supports year/month/day/datetime). Integrate normalization into slice resolvers (resolve_temporal_slice and resolve_uri_single_band) to enforce canonical suffixes and provide clearer errors for mismatched precisions. Also remove the internal 'base_series' field from job payloads returned by the get_job_status endpoint.
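As a hedged illustration of precision detection and normalization, the sketch below classifies a timestep by its ISO-8601 shape and coerces it to a dataset resolution. The function names and the policy (truncate finer timesteps, reject coarser ones) are assumptions; the PR may resolve mismatches differently.

```python
import re

_ORDER = ["year", "month", "day", "datetime"]  # coarse to fine
_PATTERNS = {
    "year": re.compile(r"^\d{4}$"),
    "month": re.compile(r"^\d{4}-\d{2}$"),
    "day": re.compile(r"^\d{4}-\d{2}-\d{2}$"),
    "datetime": re.compile(
        r"^\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}(Z|[+-]\d{2}:\d{2})?$"),
}
_LENGTH = {"year": 4, "month": 7, "day": 10}


def detect_precision(timestep: str) -> str:
    """Return 'year', 'month', 'day', or 'datetime' for an ISO-8601 string."""
    for name in _ORDER:
        if _PATTERNS[name].match(timestep):
            return name
    raise ValueError(f"unrecognized timestep format: {timestep!r}")


def normalize_timestep(timestep: str, resolution: str) -> str:
    """Coerce a timestep to the dataset resolution (assumed policy)."""
    if resolution not in _ORDER:
        raise ValueError(f"unknown resolution: {resolution!r}")
    precision = detect_precision(timestep)
    if _ORDER.index(precision) < _ORDER.index(resolution):
        # A coarser timestep cannot be refined without guessing.
        raise ValueError(
            f"timestep {timestep!r} has {precision} precision, "
            f"but the dataset resolution is {resolution}")
    if resolution == "datetime":
        return timestep
    return timestep[:_LENGTH[resolution]]


monthly = normalize_timestep("2001-03-15", "month")
yearly = normalize_timestep("2001", "year")
```

Passing `"2001"` to a monthly dataset raises a `ValueError` that names both precisions, which is the kind of clearer mismatch error the commit describes.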
Custom colormaps are defined once in deploy/colormaps/custom.json. A custom TiTiler image bakes them into rio-tiler's registry at build time so tile requests use colormap_name without per-request overhead. resolve_colormaps() reads the same file at startup and injects colormap_stops into the registry so the frontend colorbar can read color stops from /metadata with no additional endpoint. Colormap names are declared per-variable in metadata.yml. If a colormap is not built into TiTiler, it must be defined in deploy/colormaps/custom.json.
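The startup injection step might look something like the sketch below. The JSON shape (colormap name mapped to a list of position/color stops), the registry layout, and the function name `inject_colormap_stops` are all hypothetical; only the idea of copying stops from the custom colormap file into per-variable metadata comes from the description above.

```python
import json

# Hypothetical contents of a custom colormap file.
CUSTOM_JSON = """
{
  "example_precip": [[0.0, "#ffffff"], [0.5, "#74a9cf"], [1.0, "#023858"]]
}
"""

# Hypothetical registry, as it might look after loading metadata.yml.
registry = {
    "example_dataset": {
        "variables": {
            "precipitation": {"colormap": "example_precip"},
            "temperature": {"colormap": "viridis"},
        }
    }
}


def inject_colormap_stops(registry: dict, custom_colormaps: dict) -> dict:
    """Attach colormap_stops to every variable that names a custom colormap."""
    for dataset in registry.values():
        for variable in dataset.get("variables", {}).values():
            name = variable.get("colormap")
            if name in custom_colormaps:
                variable["colormap_stops"] = custom_colormaps[name]
    return registry


inject_colormap_stops(registry, json.loads(CUSTOM_JSON))
variables = registry["example_dataset"]["variables"]
```

In this sketch, variables that reference a colormap not present in the custom file (here `viridis`) are left without stops.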
Pull request overview
This PR modernizes the backend by introducing an internal TiTiler tile server, migrating the timeseries API to a new v3 pipeline design (registry + lookup dicts + job store), and restructuring deployment/configuration and tests accordingly.
Changes:
- Add TiTiler customization + build-time colormap baking for internal tile streaming.
- Replace prior timeseries service layers/routers with v3 endpoints (/metadata, /tiles, /timeseries/extract + /timeseries/analyze + /timeseries/status) backed by a job store and cached lookup dictionaries.
- Rework deployment files (Dockerfiles, compose, settings, requirements) and add an expanded unit/integration test suite with raster fixtures.
Reviewed changes
Copilot reviewed 72 out of 87 changed files in this pull request and generated 7 comments.
| File | Description |
|---|---|
| titiler_custom/main.py | Exposes TiTiler app entrypoint for custom image. |
| titiler_custom/build_colormaps.py | Build-time script to bake custom colormaps into rio-tiler. |
| timeseries/metadata.yml | Adds crs/transform and variable colormap references for datasets. |
| timeseries/docker/.gitignore | Removed docker ignore rules under timeseries/docker. |
| timeseries/deploy/settings/dev.yml | Removed legacy timeseries deploy settings (moved to deploy/). |
| timeseries/deploy/settings/base.yml | Removed legacy base settings (moved to deploy/). |
| timeseries/deploy/requirements/prod.txt | Removed legacy prod requirements (moved to deploy/). |
| timeseries/deploy/requirements/dev.txt | Removed legacy dev requirements (moved to deploy/). |
| timeseries/deploy/requirements/base.txt | Removed legacy base requirements (moved to deploy/). |
| timeseries/deploy/metadata/prod.yml | Removed legacy metadata split (now uses metadata.yml registry). |
| timeseries/deploy/metadata/dev.yml | Removed legacy dev metadata. |
| timeseries/deploy/Dockerfile | Removed legacy timeseries Dockerfile (replaced by deploy/Dockerfile). |
| timeseries/data/requests/yearly_prod.json | Removed legacy request sample. |
| timeseries/data/requests/yearly.json | Removed legacy request sample. |
| timeseries/data/requests/timeseriesv1.json | Removed legacy request sample. |
| timeseries/data/requests/monthly.json | Removed legacy request sample. |
| timeseries/app/tests/test_stores.py | Removed legacy store tests. |
| timeseries/app/tests/store/test_lookup_dicts.py | New tests for lookup-dict caching + validation. |
| timeseries/app/tests/store/test_jobs.py | New tests for filesystem/redis job stores + cleanup. |
| timeseries/app/tests/store/test_data_reader.py | New tests for Local/S3 DataReader and factory. |
| timeseries/app/tests/store/__init__.py | Test package init. |
| timeseries/app/tests/schemas/test_timeseries_schemas.py | New schema validation tests (TimeRange, smoother, request ID patterns). |
| timeseries/app/tests/schemas/test_geometry_schemas.py | New geometry validation + reprojection tests. |
| timeseries/app/tests/schemas/__init__.py | Test package init. |
| timeseries/app/tests/routers/test_datasets.py | Removed legacy router tests (v1/v2). |
| timeseries/app/tests/pipeline/test_pipeline.py | New end-to-end pipeline tests for extract/analyze/status. |
| timeseries/app/tests/pipeline/data/test-monthly/lookup.json | Monthly lookup fixture for pipeline tests. |
| timeseries/app/tests/pipeline/data/test-annual/lookup.json | Annual lookup fixture for pipeline tests. |
| timeseries/app/tests/pipeline/data/monthly_5x5x60_dataset_int16_variable.tif | Raster fixture for pipeline tests. |
| timeseries/app/tests/pipeline/data/monthly_5x5x60_dataset_float32_variable.tif | Raster fixture for pipeline tests. |
| timeseries/app/tests/pipeline/data/annual_5x5x5_dataset_uint16_variable.tif | Raster fixture for pipeline tests. |
| timeseries/app/tests/pipeline/data/annual_5x5x5_dataset_float32_variable_uncertainty.tif.aux.xml | Raster aux fixture for pipeline tests. |
| timeseries/app/tests/pipeline/data/annual_5x5x5_dataset_float32_variable_uncertainty.tif | Raster fixture for pipeline tests. |
| timeseries/app/tests/pipeline/data/annual_5x5x5_dataset_float32_variable.tif.aux.xml | Raster aux fixture for pipeline tests. |
| timeseries/app/tests/pipeline/data/annual_5x5x5_dataset_float32_variable.tif | Raster fixture for pipeline tests. |
| timeseries/app/tests/pipeline/conftest.py | Pipeline fixtures and dependency overrides for integration tests. |
| timeseries/app/tests/pipeline/__init__.py | Test package init. |
| timeseries/app/tests/core/test_validation.py | New validation tests (dataset/variable + geom sizing). |
| timeseries/app/tests/core/test_timeseries_processing.py | New unit tests for processing pipeline utilities/transforms. |
| timeseries/app/tests/core/test_registry.py | New unit tests for registry loading + slice/URI resolution. |
| timeseries/app/tests/core/__init__.py | Test package init. |
| timeseries/app/tests/conftest.py | Global fixtures for registry/lookup/series/job store/data reader. |
| timeseries/app/store/jobs.py | New filesystem + redis job store implementation + cleanup. |
| timeseries/app/store/index_loaders.py | New registry loader, colormap resolver, lookup cache/fetch/validate. |
| timeseries/app/store/data_reader.py | New Local/S3 JSON readers and reader factory. |
| timeseries/app/schemas/timeseries.py | Pydantic v2 schema refactor and request/response models. |
| timeseries/app/schemas/geometry.py | Geometry model refactor; validation + reprojection helpers. |
| timeseries/app/schemas/dataset.py | Removed legacy dataset manager/metadata resolver. |
| timeseries/app/schemas/common.py | Removed legacy common schemas (BandRange/TimeRange/date-based). |
| timeseries/app/routers/v3/api.py | New v3 API router (metadata, tiles, extract/analyze/status). |
| timeseries/app/routers/v3/__init__.py | Router package init. |
| timeseries/app/routers/v2/api.py | Removed legacy v2 API. |
| timeseries/app/routers/v1/api.py | Removed legacy v1 API. |
| timeseries/app/pytest.ini | Adds pytest configuration and integration marker. |
| timeseries/app/main.py | App lifecycle initialization + v3 router wiring + httpx client + registry load. |
| timeseries/app/exceptions.py | Updates validation error handling for FastAPI/Pydantic v2. |
| timeseries/app/core/validation.py | New request validation helpers (IDs + geometry sizing). |
| timeseries/app/core/timeseries_tasks.py | New background task for extract job execution + JobStore updates. |
| timeseries/app/core/timeseries_processing.py | New extraction/analyze processing pipeline (chunking, masking, transforms). |
| timeseries/app/core/tiles.py | New TiTiler proxy streaming for XYZ tiles. |
| timeseries/app/core/slice_resolver.py | New timestep normalization + lookup slice/URI resolution. |
| timeseries/app/core/services.py | Removed legacy service layer. |
| timeseries/app/config.py | Migrates settings to pydantic-settings YAML source + new config fields. |
| geoserver/settings/web.xml | Removed GeoServer settings from repo. |
| geoserver/settings/server.xml | Removed GeoServer settings from repo. |
| geoserver/.gitignore | Removed GeoServer ignore file. |
| docker/README.md | Removed legacy docker shared mount README. |
| deploy/settings/prod.yml | Updates prod settings with tile server + storage base URL. |
| deploy/settings/dev.yml | Adds new dev settings file. |
| deploy/requirements/prod.txt | Adds new pinned prod requirements. |
| deploy/requirements/dev.txt | Adds new pinned dev requirements. |
| deploy/requirements/base.txt | Adds new pinned base requirements (FastAPI + Pydantic v2 + geo stack). |
| deploy/logging/prod.yml | Adds prod logging config. |
| deploy/logging/dev.yml | Adds dev logging config. |
| deploy/dev.yml | Removed legacy compose fragment. |
| deploy/compose/prod.yml | Removes GeoServer service from prod compose. |
| deploy/compose/dev.yml | Adds dev compose definition (server + titiler volumes). |
| deploy/compose/base.yml | Adds base compose (server + redis + titiler + networks). |
| deploy/colormaps/custom.json | Adds custom colormap definitions. |
| deploy/base.yml | Removed legacy base compose. |
| deploy/Dockerfile.titiler | Adds TiTiler image build with baked colormaps. |
| deploy/Dockerfile | Adds new server image build for refactored app. |
| configure | Removes legacy configure script. |
| README.md | Updates request examples/paths. |
| Makefile | Updates build/deploy flow and default config generation. |
| .gitignore | Adds macOS/log paths. |
| .dockerignore | Adds docker ignore rules (excludes tests and local artifacts). |
Add environment-specific metadata files (deploy/metadata/dev.yml and deploy/metadata/prod.yml) and update Dockerfile to COPY deploy/metadata/${ENVIRONMENT}.yml to /code/metadata.yml. Remove the previous COPY of timeseries/metadata.yml so the image uses the environment-specific metadata baked into the container.