Skip to content

REF Unify data shapes in forecasting solvers#28

Open
felixdivo wants to merge 10 commits into
benchopt:mainfrom
felixdivo:base-tsfm-solver-iteration
Open

REF Unify data shapes in forecasting solvers#28
felixdivo wants to merge 10 commits into
benchopt:mainfrom
felixdivo:base-tsfm-solver-iteration

Conversation

@felixdivo

@felixdivo felixdivo commented May 29, 2026

Copy link
Copy Markdown

#22 was merged a bit early. This PR now changes:

  • Deduplicate type signatures and docs on types
  • Add some consistency checks in data classes
  • Consolidate docs
  • Unify shapes for forecasting and document them
  • Confirmed Chronos 1 and 2 to be working with the new API

@felixdivo

Copy link
Copy Markdown
Author

I'm currently still testing it locally to make sure it runs nicely.

Comment thread benchmark_utils/base_solver.py
@felixdivo

Copy link
Copy Markdown
Author

TODOs:

  • Test thoroughly by hand (e.g., Toto 2 and metrics are likely broken now)

@felixdivo

Copy link
Copy Markdown
Author

cc @kalebphipps :)

@rtavenar

rtavenar commented Jun 1, 2026

Copy link
Copy Markdown
Contributor

I'm OK with this PR, but there are a few conflicts to be resolved before merging.

@felixdivo

Copy link
Copy Markdown
Author

Thanks for checking! I can take of them later this evening :)

@felixdivo felixdivo changed the title REF Improve over #22 REF Unify data shapes in forecasting solvers Jun 6, 2026
Resolve conflicts and converge the forecasting data layout to the
branch convention (n_cutoffs, H, C, Q) — Q last — everywhere:

- outputs.py: keep Q-last point/__post_init__, add main's flatten()
  adapted to (M, H, C, Q).
- chronos.py: take Chronos-2 docstring/References; AD adapter uses the
  build_adapter() return style without the invalid prediction_length
  kwarg. Rewrite _ChronosForecaster to consume Chronos-2's native
  quantile output (list[(C, Q, H)]) instead of v1 sample draws — fixes
  "AttributeError: 'list' object has no attribute 'float'".
- chronos2.py: fix the Chronos2Pipeline import path and make the module
  self-contained (no unresolvable cross-solver import under benchopt);
  _Chronos2Forecaster produces Q-last output.
- toto2.py, moment.py, metrics.py, leakage tests: align to Q-last.
- objective.py: expose a 'value' key for benchopt's stopping criterion
  on the non-leaky forecasting path.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Comment thread benchmark_utils/base_solver.py
Comment thread benchmark_utils/inputs.py
felixdivo and others added 2 commits June 7, 2026 06:13
chronos.py had been switched to Chronos 2 (Chronos2Pipeline) during the
BaseTSFMSolver iteration, leaving both solver files on the same model.
Restore chronos.py to the Chronos v1 ChronosPipeline (amazon/chronos-t5-*)
on plain BaseSolver — mirroring upstream/main and chronos2.py's structure.

Three adaptations beyond upstream's v1 forecaster were required by this
branch's evolved contracts:
- _assemble_output emits the branch's (n_cutoffs, H, C, Q) quantile layout.
- Seed before Monte-Carlo sampling so the behavioural leakage probe
  (benchmark_utils.leakage), which compares two predict() calls on identical
  history, does not read sampling noise as a leak (would force value=inf).
- Predict in bounded chunks so the full-batch T5 forward (O(L^2) attention)
  does not exhaust memory on datasets with many long series (e.g. m4_weekly).

Verified end-to-end on CPU:
  benchopt run -s chronos  -d "Monash[dataset_name=m4_weekly_dataset]"
  benchopt run -s chronos2 -d "Monash[dataset_name=m4_weekly_dataset]"
both produce finite metrics with leakage=0 (chronos v1 mase=0.43).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Resolve conflicts from main's "FIX benchopt tests (benchopt#40)". All conflicts
were the quantile-axis convention: the branch uses quantiles-last
(n_cutoffs, H, C, Q) while main only reformatted the old quantiles-second
(n_cutoffs, Q, H, C) layout — kept the branch's convention everywhere
(naive, seasonal_naive, tfc_api, metrics, outputs, base_solver, leakage
tests). For chronos2.py both sides independently rebased onto
BaseTSFMAdapter; kept the branch's typed predict + quantiles-last
_assemble_output and switched to the centralized POOLERS from
benchmark_utils.adapters (dropping the duplicate local dict / imports).
Wrapped two over-length lines in base_solver.py to satisfy ruff E501.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
@felixdivo

felixdivo commented Jun 8, 2026

Copy link
Copy Markdown
Author

I merged the main branch into this to resolve any conflicts + have the CI pipeline run checks before merging.

@tomMoral could you allow the CI pipeline to run?

Comment thread objective.py
@felixdivo

felixdivo commented Jun 9, 2026

Copy link
Copy Markdown
Author

Ah yeah, I realized removing sampling_strategy = "run_once" was a mistake while improving the CI pipeline on another branch I'm currently working on. I'll undo that specific change in a minute (was in multiple locations). Your change looks good to me.

@felixdivo felixdivo self-assigned this Jun 9, 2026
@felixdivo

Copy link
Copy Markdown
Author

@rtavenar I think this PR is ready to be merged. Do you have the time to review it?

@tomMoral tomMoral left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! a few nitpicks but good to go otherwise :)

Comment thread benchmark_utils/base_solver.py Outdated
Comment thread benchmark_utils/base_solver.py Outdated
Co-authored-by: Thomas Moreau <thomas.moreau.2010@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants