Jm/add model by Mijan · Pull Request #19 · Marconi-Lab/SolarIrradiation

Mijan · 2026-06-10T07:34:53Z

No description provided.


Captures the state of the jm/add_model branch on 2026-05-08 prior to a structured refactor driven by numbered notebooks (01–07). Preserves the mid-flight warehouse_ops reorganization (src/susse/io → warehouse_ops/io) and the population ingestion subsystem (jobs, loaders, validators) so they remain available as reference material while the notebook track rebuilds the model, training, validation, and inference layers cleanly.

Also: add .idea/ and .venv/ to .gitignore; remove stale Kampala MERRA sample under notebooks/U10M.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Replace module-level PROJECT_ID/DATASET constants and the implicit TableRefs() defaults with a structured set of frozen dataclasses:

* WarehouseConfig: GCP project + dataset + region.
* TableSchema + TableSchemas: registry of every warehouse table with its table_id, MERGE-key contract, and description. Single source of truth for both loaders and coverage queries.
* TableRefs: FQTN properties derived from a WarehouseConfig.
* MatchStrategy StrEnum and validated WarehouseOptions.

BigQueryClient now takes a WarehouseConfig directly (no implicit module imports) and gains an existing_keys() method used by ingest jobs to implement idempotent re-runs.

Adds dependencies: google-cloud-bigquery, db-dtypes, pyarrow, pygeohash. Server-side geohash5 in the existing warehouse matches pygeohash output (verified against kampala station).

Also adds idempotent DDL for cams_daily_vars_long (the long-format CAMS table introduced for symmetry with nasa_daily_vars_long) and dim_variable so the schema definitions live in version control.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
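For readers unfamiliar with the pattern, the configuration layout described in this commit could look roughly like the sketch below. It is an illustration only, not the code in the PR: the class names come from the commit message, but every field name, the MatchStrategy members, the merge-key columns, and the rows_to_insert helper are assumptions. A str-mixin Enum stands in for StrEnum so the sketch also runs on Python versions before 3.11.

```python
from dataclasses import dataclass
from enum import Enum


class MatchStrategy(str, Enum):
    # Hypothetical members; the commit only names the enum itself.
    EXACT = "exact"
    NEAREST = "nearest"


@dataclass(frozen=True)
class WarehouseConfig:
    # Assumed field names for "GCP project + dataset + region".
    project_id: str
    dataset: str
    region: str


@dataclass(frozen=True)
class TableSchema:
    # One registry entry: table id, MERGE-key contract, description.
    table_id: str
    merge_keys: tuple[str, ...]
    description: str


@dataclass(frozen=True)
class TableRefs:
    # Derives fully qualified table names (FQTN) from a config.
    config: WarehouseConfig

    def fqtn(self, schema: TableSchema) -> str:
        return f"{self.config.project_id}.{self.config.dataset}.{schema.table_id}"


def rows_to_insert(rows, existing_keys, schema):
    """Drop rows whose MERGE key is already present, so a re-run is a no-op.

    `existing_keys` is assumed to be a set of key tuples, e.g. the result
    of something like the existing_keys() method the commit mentions.
    """
    return [
        row for row in rows
        if tuple(row[k] for k in schema.merge_keys) not in existing_keys
    ]
```

Freezing the dataclasses means a config cannot be mutated after construction, which is what lets it be passed around as a single source of truth; the key filter shows one way an ingest job might use a set of existing keys to stay idempotent.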

The 6 CrossBoundary Energy daily-GHI CSVs under data/ground_measurements/CBE_Data/ are covered by an NDA and must not be redistributed publicly. They remain ingested in the BigQuery warehouse and the trained model bundle the portal serves; originals live on Google Drive for internal use.

Changes
- Delete the 6 CBE CSVs (egypt/ghana/kenya/madagascar/nigeria/somalia).
- Add /data/ground_measurements/CBE_Data/ to .gitignore so a fresh copy cannot be accidentally re-committed.
- Update data/ground_measurements/README.md: drop the CBE row, add an NDA note, tweak the Schema-row wording.
- warehouse/extending_the_warehouse.ipynb: swap the Pattern-3 demo from CBE somalia.csv to ministry_energy_ug/soroti.csv; switch the Pattern 1/2 example from kenya_location3 / kenya_locations to Uganda-only (Makerere + Min. of Energy); clear all cell outputs.
- notebooks/tutorial/0[1-7]*.ipynb: clear cell outputs (they contained CBE station coords / GHI values baked in from prior runs). NB 04 swaps a single LOSO-example station name (kenya_location13 -> tororo).
- notebooks/papers/mukiibi_mikelson_2026/01_recomputation.ipynb + _build_notebook.py: clear outputs; anonymise two ghana_location3 mentions to "stations in the Gulf of Guinea". CrossBoundary partner-name acknowledgements (paper context) kept.

Followups not handled here
- Cell outputs are now empty; re-execute with whatever CBE-source filter you settle on so the public repo has rich outputs again.
- Older commits on this branch and on main still contain CBE data; a history rewrite (git filter-repo + force-push) is a separate decision.
- Local-only branches (cn/*, jm/new_model, rm/data_cleaning) still carry CBE files at tip; prune or rewrite before any push.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
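The commit does not say which tool was used to clear the notebook outputs. As one possible approach, the sketch below resets outputs using only the standard library, assuming the usual nbformat 4 JSON layout (a top-level "cells" list whose code cells carry "outputs" and "execution_count"). The function names are hypothetical.

```python
import json
from pathlib import Path


def clear_outputs(notebook: dict) -> dict:
    """Reset outputs and execution counts of code cells in place; return the dict."""
    for cell in notebook.get("cells", []):
        if cell.get("cell_type") == "code":
            cell["outputs"] = []
            cell["execution_count"] = None
    return notebook


def clear_file(path: Path) -> None:
    """Rewrite one .ipynb file with its code-cell outputs removed."""
    notebook = json.loads(path.read_text(encoding="utf-8"))
    cleaned = clear_outputs(notebook)
    path.write_text(json.dumps(cleaned, indent=1) + "\n", encoding="utf-8")
```

Applied to a glob such as Path("notebooks/tutorial").glob("0[1-7]*.ipynb"), this would cover the tutorial notebooks listed above. It only affects the working tree; values already committed in earlier revisions stay in history, which is the separate history-rewrite decision the commit flags.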

Each DerivedFeature now declares a DerivedColumnMetadata (column, label, unit, description) per output column, exposed via the new output_metadata abstract property. Derived features have no entry in the warehouse VariableCatalog, so this gives the portal's upcoming variable inspector a single source of truth for their labels and units rather than a hardcoded parallel table that can drift.

Commit 2 (Predictor feature catalog) must reconcile DerivedColumnMetadata.label with VariableSpec.display_name and the unit vocabularies of the two value objects.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
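A minimal sketch of the abstract-property pattern this commit describes might look like the following. The DerivedFeature, DerivedColumnMetadata, and output_metadata names and the four metadata fields come from the commit message; the return type, the DayOfYearFeature subclass, and its column names are hypothetical and only illustrate the shape.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass


@dataclass(frozen=True)
class DerivedColumnMetadata:
    # The four fields named in the commit message.
    column: str
    label: str
    unit: str
    description: str


class DerivedFeature(ABC):
    @property
    @abstractmethod
    def output_metadata(self) -> tuple[DerivedColumnMetadata, ...]:
        """One metadata entry per output column (tuple return is an assumption)."""


class DayOfYearFeature(DerivedFeature):
    # Hypothetical derived feature, used only for illustration.
    @property
    def output_metadata(self) -> tuple[DerivedColumnMetadata, ...]:
        return (
            DerivedColumnMetadata(
                "doy_sin", "Day of year (sine)", "dimensionless",
                "Sine of the annual cycle phase.",
            ),
            DerivedColumnMetadata(
                "doy_cos", "Day of year (cosine)", "dimensionless",
                "Cosine of the annual cycle phase.",
            ),
        )
```

Because the property is abstract, a subclass that forgets to declare its metadata cannot be instantiated, so a missing label surfaces at construction time rather than in the inspector UI.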

Predictor.feature_catalog exposes a FeatureCatalog: one FeatureMetadata (column, label, unit, description, presentation group) per model-input column predict() returns. It composes warehouse VariableCatalog metadata for NASA POWER / CAMS variables with derived features' output_metadata, reconciling display_name onto the shared `label` field, so a variable-inspector UI can show what every input is and where it comes from. Built purely from the bundle and cached.

The new aux_column_prefix / aux_feature_column helpers in warehouse_ops.population.types are now the single source of truth for the <prefix>_<variable_id> aux-column naming; FeatureService and FeatureSelection.aux_columns route through them instead of each hardcoding the per-source prefixes.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
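To make the <prefix>_<variable_id> convention concrete, here is a hedged sketch of what such naming helpers could look like. Only the helper names and the naming pattern come from the commit message; the signatures, the source keys, the prefix strings, and the example variable id are assumptions.

```python
# Hypothetical source-to-prefix table; the real prefixes are not shown in the PR.
_AUX_PREFIXES = {
    "nasa_power": "nasa",
    "cams": "cams",
}


def aux_column_prefix(source: str) -> str:
    """Return the column prefix for an auxiliary data source."""
    try:
        return _AUX_PREFIXES[source]
    except KeyError:
        raise ValueError(f"unknown aux source: {source!r}") from None


def aux_feature_column(source: str, variable_id: str) -> str:
    """Build the <prefix>_<variable_id> column name for one variable."""
    return f"{aux_column_prefix(source)}_{variable_id}"


def aux_columns(source: str, variable_ids: list[str]) -> list[str]:
    """Column names for several variables of one source, in input order."""
    return [aux_feature_column(source, v) for v in variable_ids]
```

With these assumed prefixes, aux_feature_column("nasa_power", "T2M") yields "nasa_T2M". Routing every caller through one function means a prefix change happens in one place instead of in each service that builds column names.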


Mijan and others added 30 commits October 27, 2025 13:52

created training data dataset, added examples for pulling them and im…

28cd6fc

…plemented first model architecture

warehouse_ops population refactored

4279eb9

warehouse tests added

d92ccc4

removed api client example

73216eb

first warehouse population notebook added

8e8baab

fixed bugs in loader code and modified Enum

7c97551

added warehouse migrations scripts, readme and setup

7785f35

fixed variable typos

e7f50f8

additional warehouse migrations performed

3292f3b

added timeout and logging to nasa and cams

e7853ad

ground station nasa cams ingestion step added

8b6b0cc

added fix for geolocation query bug

e072bf1

additional sanity tests added

825ffb3

loaders updated

ed74a33

adding merra 2 code

bbd3f2b

adding merra 2 code

6b0d7a2

adding merra schema

d32e856

dropped external relics

be8adc4

fixed scaling for cams rad satellite data

cb94e7c

solar zenith angle removed

1456a98

a9 and a10 applied

b77a944

minor bug fixed in features_service

12c47cb

finally a useful version of the notebook

565a89c

merra data ingestion

4c91528

Readme updated

1b68329

parallel merra fetcher

8c281f4

added various fixes for merra 2

8cf9a70

next little bug fixed

78fe3c1

Mijan and others added 28 commits May 13, 2026 08:04

cleanup and duplicated code removal

e11aa77

.env credentials added

134ba78

cleanups, unified satellite-feature assembly and other minor cleanups

696c43b

cleanups, unified satellite-feature assembly and other minor cleanups

737298c

added schema and removed hard-coded dataframe columns

0ac28ea

track notebook build scripts as source-of-truth for tutorial chain

90f2248

formats fixed

f2273f6

notebooks updated

f059289

updated docs and readme

2f39751

updated notebooks and readme

01034d6

fixed NASA ingestions and snapping to grid

316e6c8

feature test added

99c079c

satellite loading added

2ffacfb

satellite loading refactored

0413c0e

minor cleanup and fix

d9c4837

added unit docstrings

06779df

Adding updated notebook

7d58d25

updated notebooks

09eb7bb

Adding implementation notebook

33cf692

Created using Colab

61edeb3

updating notebook

d969e46

Merge branch 'jm/add_model' of https://github.com/Marconi-Lab/Solar_i…

86bd290

…rradiation into rm/repo_navigation

Resolving merge conflict

ea57732

Merge pull request #20 from Marconi-Lab/rm/repo_navigation

cc39129

Rm/repo navigation

reformatting with tox

df54ae3

rogerzmukiibi approved these changes Jun 12, 2026


rogerzmukiibi merged commit c42d0a6 into main Jun 12, 2026
2 of 3 checks passed

rogerzmukiibi merged 98 commits into main from jm/add_model
