REF Unify data shapes in forecasting solvers#28
Conversation
|
I'm currently still testing it locally to make sure it runs nicely. |
|
TODOs:
|
|
cc @kalebphipps :) |
|
I'm OK with this PR, but there are a few conflicts to be resolved before merging. |
|
Thanks for checking! I can take of them later this evening :) |
Resolve conflicts and converge the forecasting data layout to the branch convention (n_cutoffs, H, C, Q) — Q last — everywhere: - outputs.py: keep Q-last point/__post_init__, add main's flatten() adapted to (M, H, C, Q). - chronos.py: take Chronos-2 docstring/References; AD adapter uses the build_adapter() return style without the invalid prediction_length kwarg. Rewrite _ChronosForecaster to consume Chronos-2's native quantile output (list[(C, Q, H)]) instead of v1 sample draws — fixes "AttributeError: 'list' object has no attribute 'float'". - chronos2.py: fix the Chronos2Pipeline import path and make the module self-contained (no unresolvable cross-solver import under benchopt); _Chronos2Forecaster produces Q-last output. - toto2.py, moment.py, metrics.py, leakage tests: align to Q-last. - objective.py: expose a 'value' key for benchopt's stopping criterion on the non-leaky forecasting path. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
chronos.py had been switched to Chronos 2 (Chronos2Pipeline) during the BaseTSFMSolver iteration, leaving both solver files on the same model. Restore chronos.py to the Chronos v1 ChronosPipeline (amazon/chronos-t5-*) on plain BaseSolver — mirroring upstream/main and chronos2.py's structure. Three adaptations beyond upstream's v1 forecaster were required by this branch's evolved contracts: - _assemble_output emits the branch's (n_cutoffs, H, C, Q) quantile layout. - Seed before Monte-Carlo sampling so the behavioural leakage probe (benchmark_utils.leakage), which compares two predict() calls on identical history, does not read sampling noise as a leak (would force value=inf). - Predict in bounded chunks so the full-batch T5 forward (O(L^2) attention) does not exhaust memory on datasets with many long series (e.g. m4_weekly). Verified end-to-end on CPU: benchopt run -s chronos -d "Monash[dataset_name=m4_weekly_dataset]" benchopt run -s chronos2 -d "Monash[dataset_name=m4_weekly_dataset]" both produce finite metrics with leakage=0 (chronos v1 mase=0.43). Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Resolve conflicts from main's "FIX benchopt tests (benchopt#40)". All conflicts were the quantile-axis convention: the branch uses quantiles-last (n_cutoffs, H, C, Q) while main only reformatted the old quantiles-second (n_cutoffs, Q, H, C) layout — kept the branch's convention everywhere (naive, seasonal_naive, tfc_api, metrics, outputs, base_solver, leakage tests). For chronos2.py both sides independently rebased onto BaseTSFMAdapter; kept the branch's typed predict + quantiles-last _assemble_output and switched to the centralized POOLERS from benchmark_utils.adapters (dropping the duplicate local dict / imports). Wrapped two over-length lines in base_solver.py to satisfy ruff E501. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
|
I merged the main branch into this to resolve any conflicts + have the CI pipeline run checks before merging. @tomMoral could you allow the CI pipeline to run? |
|
Ah yeah, I realized removing |
|
@rtavenar I think this PR is ready to be merged. Do you have the time to review it? |
tomMoral
left a comment
There was a problem hiding this comment.
LGTM! a few nitpicks but good to go otherwise :)
Co-authored-by: Thomas Moreau <thomas.moreau.2010@gmail.com>
#22 was merged a bit early. This PR now changes: