Skip to content

feat(model_zoo): weekly rotation scheduler — train the N stalest specs per run (L4488g)#234

Merged
cipher813 merged 1 commit into
mainfrom
feat/model-zoo-weekly-rotation
Jun 2, 2026
Merged

feat(model_zoo): weekly rotation scheduler — train the N stalest specs per run (L4488g)#234
cipher813 merged 1 commit into
mainfrom
feat/model-zoo-weekly-rotation

Conversation

@cipher813
Copy link
Copy Markdown
Owner

What

Implements the rotation mechanism for the model zoo per your 2026-06-02 decision: keep a zoo of N challenger specs but train only 2–3 each Saturday to keep compute tenable.

Scope (the option you picked): scheduler + config + tests only. This does not touch the live Saturday SF — nothing auto-fires; --weekly-rotation is runnable on demand. The SF/EventBridge wiring + per-spec leaderboard aggregation remain follow-ups, to be designed against a real run (the Plan B you accepted).

How

  • select_rotation_specs(specs, registered_versions, budget) — pure, unit-testable policy. Picks the budget stalest active specs by their newest registered version date; a never-registered spec is maximally stale (trained first); id tiebreak for determinism. Maps a registered version back to its spec via model_version == the spec's label (manifest["version"] = MODEL_VERSION_LABEL, set by train_spec — verified meta_trainer.py:3537).
  • train_weekly_rotation(...) — trains the selected specs. Registry read is best-effort (+ injectable for tests); registry-unreachable falls back to id-ordered selection (logged, never fatal). One spec's failure never aborts the rest.
  • MODEL_ZOO_WEEKLY_BUDGET config (default 3) + sample.yaml knob (validated it parses — CI copies the sample).
  • CLI: python -m training.model_zoo --weekly-rotation [--budget N].

The design rule it embodies

Rotation only controls version freshness — a frozen version keeps shadow-scoring daily regardless of whether its spec was retrained this week, so it never starves the GATE 2 maturity clock. The zoo refreshes round-robin over ~ceil(active/N) weeks. The champion retrain is separate + always-on; this budget is the challenger zoo on top. (Full spec: ROADMAP L4488g.)

Tests

6 new: label resolution (overrides → declared → spec-<id>), never-trained-first, oldest-version-first via newest-per-spec, budget cap, retired excluded, rotation trains only the budget + continues past failure. Full predictor suite: 1382 passed.

Not in this PR (follow-ups)

  • Wire --weekly-rotation into a parallel, off-critical-path Saturday spot (production behavior + weekly compute commit).
  • Rebuild the dashboard leaderboard + GATE 2 to aggregate realized α per spec across versions (best designed against real multi-version output — prove with one manual --all-active after 6/6 first).

🤖 Generated with Claude Code

…s per run (L4488g)

Operator decision (2026-06-02): keep a zoo of N challenger specs but train
only 2-3 each Saturday to bound compute. This adds the rotation MECHANISM
(scheduler + config + tests); it does NOT touch the live Saturday SF —
nothing auto-fires, `--weekly-rotation` is runnable on demand and the SF
wiring + per-spec leaderboard stay as follow-ups (designed against a real
run, per the accepted Plan B).

- select_rotation_specs(specs, registered_versions, budget): pure policy —
  picks the `budget` STALEST active specs by their newest registered
  version date (never-registered = maximally stale → trained first; id
  tiebreak for determinism). Maps a registered version back to its spec via
  model_version == the spec's label (manifest["version"]=MODEL_VERSION_LABEL,
  set by train_spec; verified meta_trainer.py:3537).
- train_weekly_rotation(...): trains the selected specs (registry read
  best-effort + injectable for tests; one spec's failure never aborts the
  rest). Registry-unreachable → id-ordered fallback, logged, never fatal.
- MODEL_ZOO_WEEKLY_BUDGET config (default 3) + sample.yaml knob (validated
  parses — CI copies the sample).
- CLI: `--weekly-rotation [--budget N]`.

Design rule it embodies (from the ROADMAP L4488g re-spec): rotation only
controls version FRESHNESS — a frozen version keeps shadow-scoring daily
regardless of retrain, so it never starves the GATE 2 maturity clock; the
zoo refreshes round-robin over ~ceil(active/N) weeks. Champion retrain is
separate + always-on; this budget is the challenger zoo on top.

Tests: 6 new (label resolution, never-trained-first, oldest-version-first
via newest-per-spec, budget cap, retired excluded, rotation trains only the
budget + continues past failure). Full predictor suite: 1382 passed.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@cipher813 cipher813 merged commit 15fa34f into main Jun 2, 2026
1 check passed
@cipher813 cipher813 deleted the feat/model-zoo-weekly-rotation branch June 2, 2026 22:50
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant