Skip to content

Remove stale preconditioner config names from __init__.py#256

Closed
runame wants to merge 1 commit into
facebookresearch:mainfrom
runame:fix/init-stale-rename-imports
Closed

Remove stale preconditioner config names from __init__.py#256
runame wants to merge 1 commit into
facebookresearch:mainfrom
runame:fix/init-stale-rename-imports

Conversation

@runame

@runame runame commented Apr 11, 2026

Copy link
Copy Markdown
Contributor

Summary

  • shampoo_types.py renamed AmortizedPreconditionerConfigBaseShampooPreconditionerConfig and ShampooPreconditionerConfigClassicShampooPreconditionerConfig (cb150a9), but
    distributed_shampoo/__init__.py still imported and re-exported the old names, so import distributed_shampoo fails at HEAD with ImportError.
  • Replace the stale names in the imports and __all__ entries.

Test plan

  • python -c "import distributed_shampoo" succeeds
  • distributed_shampoo.BaseShampooPreconditionerConfig and distributed_shampoo.ClassicShampooPreconditionerConfig are accessible
  • Every entry in __all__ resolves to an actual module attribute
  • Inheritance chain intact (Classic→Base, RootInv→Classic, Eigendecomposed→Classic, EigenvalueCorrected→Base)

`AmortizedPreconditionerConfig` was renamed to
`BaseShampooPreconditionerConfig` and `ShampooPreconditionerConfig` to
`ClassicShampooPreconditionerConfig` in `shampoo_types.py`, but
`__init__.py` still imported and re-exported the old names. Replace
them with the new names in the imports and `__all__`.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@meta-cla meta-cla Bot added the CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. label Apr 11, 2026

@wz337 wz337 left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for fixing this!

@meta-codesync

meta-codesync Bot commented Apr 17, 2026

Copy link
Copy Markdown

@hjmshi has imported this pull request. If you are a Meta employee, you can view this in D101275252.

@meta-codesync meta-codesync Bot closed this in 75f77f6 May 5, 2026
meta-codesync Bot pushed a commit that referenced this pull request May 5, 2026
Summary:
The four grafting CLI overrides in `.github/workflows/examples.yaml` use the form
`'optimizer.grafting_config={_target_:...,...}'`. With OmegaConf's default merge
semantics, this *merges* the new mapping into the inherited
`grafting_config` from `distributed_shampoo/examples/configs/optimizer/shampoo.yaml`,
which currently defines:

```yaml
grafting_config:
  _target_: distributed_shampoo.AdamPreconditionerConfig
  beta2: 0.999
  epsilon: 1e-8
```

So an override with `_target_: AdaGradPreconditionerConfig` ends up as
`{_target_: AdaGrad, beta2: 0.999, epsilon: 1e-08}` and Hydra
instantiation fails with:

```
TypeError: AdaGradPreconditionerConfig.__init__() got an unexpected keyword argument 'beta2'
```

`SGDPreconditionerConfig` has the same latent bug (it would inherit both
`beta2` and `epsilon`), but execution stops at the AdaGrad line. Adam
and RMSprop happen to work because their `_target_` accepts `beta2`.

This regression was introduced in commit `405622d` ("Consolidate CIFAR-10
examples in Shampoo with hydra configs"), which switched the workflow
from argparse-based scripts to Hydra overrides without accounting for
dict-merge semantics.

## Fix

Replace each `optimizer.grafting_config={...}` override with the
standard Hydra delete-then-add idiom:

\`\`\`
'~optimizer.grafting_config' '+optimizer.grafting_config={...}'
\`\`\`

This forces the inherited mapping to be discarded and the new mapping
inserted whole, so only the explicitly-listed keys are passed to the
target's constructor.

Verified locally with \`hydra.compose\`:
- old override → \`{_target_: AdaGrad, beta2: 0.999, epsilon: 1e-08}\` (broken)
- new override → \`{_target_: AdaGrad, epsilon: 1e-08}\` (correct)

Applied to all six grafting-override invocations in the file (CPU
single-GPU loop, GPU single-GPU, DDP CPU, DDP GPU) for consistency.

Pull Request resolved: #257

Test Plan:
- [x] CI examples job passes on this branch
- [x] AdaGrad, Adam, RMSprop, and SGD grafting cases all run to completion

## Notes

Stacked on top of #256 (\`fix/init-stale-rename-imports\`). That PR's
import fix unblocked the \`examples\` job enough to surface this latent
merge bug. The branch is rebased on \`fix/init-stale-rename-imports\`,
so until #256 merges this diff will appear to include #256 commit.
Merge order: #256 first, then this.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Reviewed By: wz337

Differential Revision: D103081868

Pulled By: hjmshi

fbshipit-source-id: 069f887a3cc701316d6c486b6a278474155481e4
meta-codesync Bot pushed a commit that referenced this pull request May 5, 2026
Summary:
Resolves the remaining `pre-commit` and `type-check` job failures on `main` after PRs #256 (init imports) and #257 (examples workflow). Three commits, each separately reviewable:

### Commit 1: lint fixes (ruff E402 + E731) and copyright header normalization

- **E402 (~92 violations across 12 files)**: every affected file had a duplicate triple-quoted copyright block at the top followed by the *real* module docstring (or, in `gpa/`, a `#`-comment copyright). Python only treats the first triple-quoted string as a docstring, so the duplicate counted as a non-import statement and triggered E402 for every subsequent import. Drop the duplicate so the real module docstring is the first statement.
- **E731 (2 violations)** in `_shampoo_fully_shard_lossless_distributor.py:69` and `_shampoo_hybrid_shard_lossless_distributor.py:75`: rewrite `should_assign_param_idx = lambda i: ...` as a `def`.
- **Copyright convention**: 79/79 untouched files use a single triple-quoted block. The 6 gpa files I touched (plus `gpa/gpa_adamw.py`, which had both styles) are normalized to that convention by dropping their `#`-comment copyright in favor of the triple-quoted form.

### Commit 2: ruff-format pass

The pinned `ruff-format` (v0.8.0 in `.pre-commit-config.yaml`) considers 14 unrelated files mis-formatted (purely whitespace around `assert ..., (...)` wrapping). Applying the formatter so the hook passes on the first run instead of failing after auto-modifying files.

### Commit 3: mypy fixes (33 errors → 0)

All 27 errors reported by CI plus 6 local-only errors that newer mypy/torch stubs surface:

- **gpa/gpa_adamw.py** (10 errors):
  - `step()`: add `overload` pair to match `torch.optim.Optimizer`
  - `step()`: annotate per-group buffer lists as `list[Tensor]`
  - `step()`: skip empty parameter groups via `continue`; rename group-local `first_param` to `group_first_param` to avoid mypy union with the `Optional[Parameter]` from the train-mode check loop
- **gpa/tests/gpa_test_utils.py:42**: annotate `devices` as `tuple[device, ...]`
- **gpa/gpu_tests/gpa_adamw_numerics_test.py:31**: ignore missing `parameterized` stubs (no public stub package)
- **gpa/examples/cifar10_example.py:115**: ignore `len()` on `Dataset[Any]`
- **distributed_shampoo/distributed_shampoo.py**:
  - lines 669-671: drop redundant re-annotations on the `AdaGradPreconditionerConfig` branch (no-redef)
  - line 1632: move stray `# type: ignore` onto the `_pre_load_train_modes` assignment line so it actually applies
- **distributed_shampoo/distributor/_shampoo_hybrid_shard_lossless_distributor.py:160**: replace `filter(lambda, ...)` with a generator expression to dodge a confusing `TypeGuard` overload mypy picks for `filter()`
- **distributed_shampoo/distributor/shampoo_{ddp,hsdp,hybrid_shard}_distributor.py**: ignore `attr-defined` for the private `DeviceMesh._get_all_submeshes` — local-only, surfaces with current torch stubs
- **distributed_shampoo/distributor/shampoo_distributor.py:228**: ignore narrowing of `map(partial(tuple), ...)` assignment — local-only
- **distributed_shampoo/distributor/gpu_tests/shampoo_checkpoint_test.py:376,411** and **examples/parallelism.py:177,206**: ignore `arg-type` for `partial[FSDPModule]` / `FSDPModule` passed to a `Module` parameter (FSDPModule mixes in but mypy can't see it)
- **distributed_shampoo/examples/utils.py:161**: annotate `sampler`
- **distributed_shampoo/examples/tests/utils_test.py:228,240**: replace `assertIsNotNone` with `assert ... is not None` so mypy narrows `_fmt` before `assertIn`
- **distributed_shampoo/utils/load_balancing_utils.py:57**: extend existing `# type: ignore[misc]` to also cover `call-overload` from `max(float, floating[Any])`

## Local verification

```
$ ruff check .
All checks passed!
$ ruff format --check .
93 files already formatted
$ mypy .
Success: no issues found in 93 source files
```

Pull Request resolved: #258

Test Plan:
- [x] CI `pre-commit` job passes
- [x] CI `type-check` job passes
- [x] Other CI jobs (`tests`, `gpu-tests`, `examples`) unchanged

## Notes

Stacked on top of #256 (`fix/init-stale-rename-imports`) and #257 (`fix/examples-beta2-config`). Until those merge, the diff for this PR will appear to include their commits. Merge order: #256#257 → this.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Reviewed By: wz337

Differential Revision: D103082000

Pulled By: hjmshi

fbshipit-source-id: d4bf91517bc7426019a3fde42420e4408b2bbc59
@meta-codesync

meta-codesync Bot commented May 5, 2026

Copy link
Copy Markdown

@hjmshi merged this pull request in 75f77f6.

@runame runame deleted the fix/init-stale-rename-imports branch May 7, 2026 08:29
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. Merged

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants