mp: add zero-hyperparameter auto-MP calibration #19
Open
Polarisyjr wants to merge 1 commit into
Adds an offline, gradient-free mixed-precision calibrator that finds per-(block, op) free boundaries minimizing the post-QwT-compensation residual via forward-only coordinate descent.

- config: FreeBoundaryMPConfig (per-(block, op) k-1 free boundaries, fixed-level pinning), per-block context (_CURRENT_BLOCK_IDX + set/get_current_block_idx), AutoMPBudgetLogger, and two new guarded branches in adaptive_classify_rows (free-boundary + target_fractions quantile path). AdaptiveMPConfig gains an optional target_fractions field. No existing default path changed (additive only).
- auto_calibrator: oracle_search_op/oracle_search_block/auto_calibrate_mp plus RidgeFitter closed-form compensation fit.
- __init__: export the new public API.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
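To make the "k-1 free boundaries" and "target_fractions quantile path" wording concrete, here is a minimal pure-Python sketch. It is not the code from this PR: the helper names `boundaries_from_fractions` and `classify_rows`, the use of a per-row scalar score, and the convention that a higher score maps to a higher level index are all assumptions made for illustration.

```python
from bisect import bisect_right


def boundaries_from_fractions(scores, target_fractions):
    """Hypothetical helper: derive k-1 ascending boundaries from k fractions.

    Roughly target_fractions[j] of the rows end up in level j.
    """
    if abs(sum(target_fractions) - 1.0) > 1e-9:
        raise ValueError("target_fractions must sum to 1")
    ordered = sorted(scores)
    n = len(ordered)
    boundaries = []
    cumulative = 0.0
    for frac in target_fractions[:-1]:  # k fractions give k-1 cut points
        cumulative += frac
        # Index of the first row that belongs to the next level.
        cut = min(n - 1, max(0, round(cumulative * n)))
        boundaries.append(ordered[cut])
    return boundaries


def classify_rows(scores, boundaries):
    """Map each score to a level index in [0, k-1] given k-1 boundaries."""
    return [bisect_right(boundaries, s) for s in scores]


scores = list(range(10))
bounds = boundaries_from_fractions(scores, [0.5, 0.3, 0.2])
levels = classify_rows(scores, bounds)
```

In the free-boundary case the same `classify_rows` step would apply, with the boundaries supplied by the calibrator instead of being derived from quantiles.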
Contributor
Is this for QwT or for mixed precision?

Collaborator, Author
MP
Comment on lines +542 to +548

    for it in range(comp_refit_iters):
        tmp_comp = comp_factory(W.detach().cpu(),
                                b_vec.detach().cpu()).to(device)
        with torch.no_grad():
            c_actual_chunks = []
            for s in range(0, X_flat.size(0), fwd_chunk * 257):
                e = min(s + fwd_chunk * 257, X_flat.size(0))
Comment on lines +29 to +34

    _CURRENT_BLOCK_IDX: int = 0


    def set_current_block_idx(i: int) -> None:
        global _CURRENT_BLOCK_IDX
        _CURRENT_BLOCK_IDX = int(i)
Comment on lines +442 to +458

    AutoMPBudgetLogger.enable()
    try:
        y0 = _run_block_sc()
    finally:
        AutoMPBudgetLogger.disable()
    budget_entries = AutoMPBudgetLogger.snapshot(clear=True)
    budget_stats = {
        "baseline": sum(e["baseline"] for e in budget_entries),
        "actual": sum(e["actual"] for e in budget_entries),
        "entries": budget_entries,
    }
    if avg_sc_draws <= 1:
        return y0, budget_stats
    acc = y0
    for _ in range(avg_sc_draws - 1):
        acc = acc + _run_block_sc()
    return acc / avg_sc_draws, budget_stats
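The excerpt above enables the budget logger only around the first forward pass and then averages the remaining draws without logging, so the reported budget reflects a single pass regardless of `avg_sc_draws`. The following toy reproduction shows that behavior in isolation. `ToyBudgetLogger`, its `record` method, and `run_avg_with_budget` are hypothetical stand-ins, and plain lists of floats replace tensors.

```python
class ToyBudgetLogger:
    """Hypothetical stand-in for AutoMPBudgetLogger (class-level state)."""

    _enabled = False
    _log = []

    @classmethod
    def enable(cls):
        cls._enabled = True

    @classmethod
    def disable(cls):
        cls._enabled = False

    @classmethod
    def record(cls, baseline, actual):
        # Entries are only kept while logging is enabled.
        if cls._enabled:
            cls._log.append({"baseline": baseline, "actual": actual})

    @classmethod
    def snapshot(cls, clear=False):
        entries = list(cls._log)
        if clear:
            cls._log.clear()
        return entries


def run_avg_with_budget(run_once, n_draws):
    """Average n_draws passes; log the budget for the first pass only."""
    ToyBudgetLogger.enable()
    try:
        y0 = run_once()
    finally:
        ToyBudgetLogger.disable()  # always switch logging off again
    entries = ToyBudgetLogger.snapshot(clear=True)
    stats = {
        "baseline": sum(e["baseline"] for e in entries),
        "actual": sum(e["actual"] for e in entries),
        "entries": entries,
    }
    acc = list(y0)
    for _ in range(n_draws - 1):
        acc = [a + y for a, y in zip(acc, run_once())]
    return [a / n_draws for a in acc], stats


draws = iter([[1.0], [2.0], [3.0]])


def fake_forward():
    ToyBudgetLogger.record(baseline=8, actual=6)  # called on every pass
    return next(draws)


mean_out, stats = run_avg_with_budget(fake_forward, n_draws=3)
```

Here `fake_forward` attempts to record a budget entry on all three passes, but only the first one is kept.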
Comment on lines +489 to +514

    per_op: dict[str, dict[str, Any]] = {}
    if search_objective == "raw_mse":
        score_fn = lambda Y_sc_eval: float(  # noqa: E731
            (Y_fp_flat - Y_sc_eval).pow(2).mean().item())
    else:
        score_fn = None
    for outer in range(n_outer_block):
        for op in ops_per_block:
            b_init = cfg.get_boundaries(op, i)
            b_new, score = oracle_search_op(
                run_block_sc=_run_block_sc_avg,
                Y_fp=Y_fp_flat,
                fitter=fitter,
                cfg=cfg,
                block_idx=i,
                op=op,
                init_boundaries=b_init,
                n_candidates=n_candidates,
                n_outer=n_outer_per_op,
                run_block_sc_with_budget=(
                    _run_block_sc_avg_with_budget if budget_ratio is not None else None),
                budget_target_actual=block_budget_target,
                score_fn=score_fn,
                log_fn=log_fn,
            )
            per_op[op] = {"boundaries": b_new.tolist(), "score": score}
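The loop above sweeps ops one at a time and hands each to `oracle_search_op` with `n_candidates` and `n_outer` arguments. The body of `oracle_search_op` is not shown here, so the sketch below only illustrates the general technique the PR description names, gradient-free coordinate descent over ordered boundaries. The function `coordinate_descent`, the evenly spaced candidate grid, and the quadratic `toy_score` are assumptions. In the real calibrator the score would presumably come from re-running the block and evaluating the residual.

```python
def coordinate_descent(score, init, n_candidates=9, n_outer=3, lo=0.0, hi=1.0):
    """Minimise score(boundaries) by sweeping one boundary at a time.

    Only function evaluations are used, no gradients.
    """
    b = list(init)
    best = score(b)
    for _ in range(n_outer):
        for j in range(len(b)):
            # Keep boundaries ordered: search strictly between the neighbours.
            left = b[j - 1] if j > 0 else lo
            right = b[j + 1] if j + 1 < len(b) else hi
            for c in range(n_candidates):
                cand = left + (right - left) * (c + 1) / (n_candidates + 1)
                trial = b[:j] + [cand] + b[j + 1:]
                s = score(trial)
                if s < best:  # accept only strict improvements
                    best, b = s, trial
    return b, best


def toy_score(b):
    # Hypothetical objective with its minimum at boundaries (0.25, 0.75).
    return (b[0] - 0.25) ** 2 + (b[1] - 0.75) ** 2


init = [0.4, 0.6]
b_new, final_score = coordinate_descent(toy_score, init)
```

Because a candidate is accepted only when it lowers the score, the result is never worse than the initial boundaries, which matches the role `init_boundaries` plays in the call above.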
Comment on lines +615 to +620

    boundaries, so it sees the true block-local compute for that candidate.
    """

    _enabled: bool = False
    _log: list[dict] = []
Comment on lines +610 to +617

    # pre-hook was attached to blk_sc; re-attach to new_block
    # (old hook still fires on blk_sc if it's still referenced, but
    # forwarding now goes through new_block; add a fresh hook).
    def _make_hook(idx):
        def _hook(_m, _args):
            set_current_block_idx(idx)
        return _hook
    new_block.register_forward_pre_hook(_make_hook(i))