RA Toolkit end-to-end pipeline enhancements #1355
Closed
allisonmcampbell wants to merge 3 commits into
…nario configs

Add hydro_balancing_type parameter to control hydro balancing granularity, aggregate all projects to BA level for RA studies, fix pandas 2.x dtype compatibility, and provide complete e2e scenario configuration files (temporal definitions, iterations, user-defined configs) so the full pipeline from PUDL download through reliability metrics is reproducible via gridpath_run_data_toolkit.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…t names

AGG_PROJECT_NAME_STR now falls back to gridpath_technology when agg_project is NULL, fixing a DuckDB struct type mismatch in the open_data test (test_data_toolkit_open_data). Technologies without agg_project (e.g. BA, CT) now get proper aggregated names like Batteries_Zone1 instead of NULL.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
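The NULL fallback described in this commit can be illustrated with a small SQL `COALESCE` sketch. This is not the toolkit's actual `AGG_PROJECT_NAME_STR` definition: the table layout, the `zone` column, and the sample rows are hypothetical, and the standard-library `sqlite3` module stands in for DuckDB so the example runs without extra dependencies.

```python
import sqlite3

# Hypothetical table: one row per project, with an optional agg_project label.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE projects (agg_project TEXT, gridpath_technology TEXT, zone TEXT)"
)
conn.executemany(
    "INSERT INTO projects VALUES (?, ?, ?)",
    [
        ("Gas_CC", "Gas", "Zone1"),     # agg_project present: used as-is
        (None, "Batteries", "Zone1"),   # agg_project NULL: fall back to technology
    ],
)

# Assumed shape of the name expression: prefer agg_project, otherwise
# fall back to gridpath_technology, then append the zone.
AGG_NAME_SQL = "COALESCE(agg_project, gridpath_technology) || '_' || zone"

names = [
    row[0]
    for row in conn.execute(f"SELECT {AGG_NAME_SQL} FROM projects ORDER BY rowid")
]
print(names)
```

Without the `COALESCE`, concatenating a NULL `agg_project` would yield NULL for the whole name, which is the situation the commit message describes.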
Adds battery_duration (4h) and pumped_storage_duration (12h) defaults so storage projects without EIA-860 energy capacity data get filled in. Includes empty copy files CSV required by the manual_adjustments step.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
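A rough sketch of how such duration defaults might fill missing energy capacity is shown below. It assumes energy capacity is derived as power capacity times duration, which the commit message does not state explicitly; the function name, dictionary keys, and sample values are hypothetical. Only the 4 h and 12 h defaults come from the commit.

```python
from typing import Optional

# Defaults from the commit message: 4 h for batteries, 12 h for pumped storage.
DEFAULT_DURATION_HOURS = {"battery": 4.0, "pumped_storage": 12.0}


def fill_energy_capacity(
    power_mw: float, energy_mwh: Optional[float], storage_type: str
) -> float:
    """Return reported energy capacity, or an estimate when it is missing.

    Assumption: missing energy capacity = power capacity * default duration.
    """
    if energy_mwh is not None:
        return energy_mwh  # keep reported data untouched
    return power_mw * DEFAULT_DURATION_HOURS[storage_type]


print(fill_energy_capacity(100.0, None, "battery"))          # estimated
print(fill_energy_capacity(50.0, None, "pumped_storage"))    # estimated
print(fill_energy_capacity(100.0, 250.0, "battery"))         # reported value kept
```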
a43af9c to
40decbe
Closing this PR in favor of three smaller, focused PRs:
This split makes each easier to review independently. The original bundled approach made it harder to evaluate the design changes separately from the bug fixes and configuration files.
Summary
- Add `hydro_balancing_type` parameter to data toolkit scripts for controlling hydro balancing granularity (day/week/month)
- Fix `LossySetitemError` in PUDL extraction (`int64` casts for datetime columns)
- (`script`, `setting`, `value`, `script_true_false_arg`, `reverse_default_behavior`)
- Scenario configuration files (`raw_data_ra_toolkit_e2e/`), including temporal definitions for a 28-subproblem synchronized run (14 weather years × 2 hydro years), Monte Carlo iteration configs, and all user-defined mapping tables
- `ra_toolkit_e2e_settings_sample.csv` for running the full e2e pipeline
- Guide (`docs/ra_toolkit_e2e_guide.md`) and detailed changelog (`docs/ra_toolkit_e2e_changes.md`)

Test plan
- Run `gridpath_run_data_toolkit --settings_csv data_toolkit/ra_toolkit_e2e_settings_sample.csv` with PUDL and RA Toolkit raw data in place
- `db/csvs_ra_toolkit_e2e/`
- `gridpath_create_database`, `gridpath_load_csvs`, `gridpath_load_scenarios`
- Run `gridpath_run_e2e --scenario ra_toolkit_e2e_sync --solver cbc` and confirm 28 subproblems solve
- Confirm `open_data_toolkit_settings_sample.csv` still works with its pipeline

🤖 Generated with Claude Code
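The 28-subproblem count mentioned in the summary follows from pairing every weather year with every hydro year. A minimal sketch of that enumeration is below; the year labels are placeholder indices, not the years used in the PR's temporal definitions, and the numbering scheme is an assumption for illustration.

```python
from itertools import product

# Placeholder labels only: the actual weather and hydro years are not
# given in the PR text.
weather_years = [f"weather_{i}" for i in range(1, 15)]  # 14 weather years
hydro_years = [f"hydro_{i}" for i in range(1, 3)]       # 2 hydro years

# One subproblem per (weather year, hydro year) pair, numbered from 1.
subproblems = {
    idx: pair for idx, pair in enumerate(product(weather_years, hydro_years), start=1)
}

print(len(subproblems))
print(subproblems[1], subproblems[28])
```

A check like `len(subproblems) == 28` is a cheap sanity test to run against the generated temporal definitions before launching the solver.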