chore(beep boop 🤖): Bump uv.lock (r0.5.0, mcore-core_r0.18.0) (2026-06-27)#4541

Open
svcnvidia-nemo-ci wants to merge 1 commit into r0.5.0 from bump-ci-container-2026-06-27-r0.5.0-core_r0.18.0

@svcnvidia-nemo-ci

Contributor

🚀 PR to bump uv.lock in r0.5.0.

🤖 This PR will be merged automatically once CI passes.

…-06-27)

Signed-off-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
@svcnvidia-nemo-ci

Contributor Author

/ok to test a2139f9

@copy-pr-bot

copy-pr-bot Bot commented Jun 27, 2026

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

@yaoyu-33

Contributor

MCore bump auto-fix status for release-r0.5.0:

Classification: MCore broke Bridge
Evidence: PR #4541 advances 3rdparty/Megatron-LM and .main.commit from d30c93ffae858b22eece3fa71c734c8f43161eff to 22d950d25a5fdea8e423cad11bd65640ea71d63e. The MCore compare is one commit, 22d950d25a5f (cp: build: bump transformer-engine to release_v2.16.post into core_r0.18.0 (#5518)), changing MCore pyproject.toml and uv.lock to TransformerEngine b9d690e042b1c4e455214e7dab65d6d3512c05d6.

Both failed quantization jobs hit the same stack:
tests/functional_tests/test_groups/quantization/models/qwen/test_qwen3_moe_quantization_workflow.py::TestQwen3MoeQuantizationWorkflow::test_qwen3_moe_quantization_and_generation_with_expert_parallelism
-> examples/quantization/quantize.py
-> mtq.quantize(...)
-> modelopt/torch/quantization/plugins/transformer_engine.py:178
At that line ModelOpt evaluates len(args[sig_params.index("non_tensor_args") - ctx_offset][0]) and raises TypeError: object of type 'bool' has no len().

Direct failed jobs:
H100 L2_Launch_models_qwen_quantization: https://github.com/NVIDIA-NeMo/Megatron-Bridge/actions/runs/28286485574/job/83815579172
GB200 gb200_L2_Launch_models_qwen_quantization: https://github.com/NVIDIA-NeMo/Megatron-Bridge/actions/runs/28286485574/job/83815579123

The same TransformerEngine revision also fails the same jobs on the open main TE bump PR #4536, so this is not release-branch-only behavior.
Fix PR: not opened
Guards: none added or removed
Validation: no code validation run because no fix branch was opened. Evidence gathered on 2026-06-27 07:16 PDT from Linear MB-592, PR #4541 metadata/diff/checks, failed job logs downloaded through the GitHub Actions job-log endpoint, MCore compare d30c93ffae858b22eece3fa71c734c8f43161eff...22d950d25a5fdea8e423cad11bd65640ea71d63e, and TransformerEngine compare 4220403e831d29e93868f7793693ea83f6b8b05b...b9d690e042b1c4e455214e7dab65d6d3512c05d6.
Next action: maintainer decision needed. Preferred path is for ModelOpt/quantization and TransformerEngine owners to update the TE grouped-linear plugin/API compatibility, or for MCore core_r0.18.0 to revert/patch the TE bump before merging this release-line bump. A Bridge-side monkeypatch of ModelOpt internals would be brittle, and changing the required nvidia-modelopt==0.44.0rc5 pin requires maintainer approval. After the upstream dependency fix or revert, rerun the H100 and GB200 Qwen quantization jobs on PR #4541.
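
For readers unfamiliar with the failure mode, here is a minimal sketch of how positional indexing into non_tensor_args can break when the tuple layout changes. The two forward functions, the argument tuples, and the count_groups helper are hypothetical stand-ins, not the real TransformerEngine or ModelOpt code. The assumption that a bool now sits where a sequence used to be is inferred only from the TypeError quoted in the evidence.

```python
import inspect


# Hypothetical stand-ins; these are NOT the real TransformerEngine signatures.
def forward_old(ctx, inp, non_tensor_args, *weights):
    """Assumed older layout: non_tensor_args[0] is a list of per-expert splits."""


def forward_new(ctx, inp, non_tensor_args, *weights):
    """Assumed newer layout: a bool now occupies non_tensor_args[0]."""


def count_groups(forward, args, ctx_offset=1):
    """Mirror the indexing pattern quoted in the evidence.

    args excludes ctx, so the signature index is shifted by ctx_offset.
    """
    sig_params = list(inspect.signature(forward).parameters)
    return len(args[sig_params.index("non_tensor_args") - ctx_offset][0])


# Illustrative argument tuples (made up for this sketch).
old_args = ("inp", ([4, 4, 8], True), "w0", "w1", "w2")
new_args = ("inp", (True, [4, 4, 8]), "w0", "w1", "w2")

# With the assumed older layout the lookup finds a list and len() works.
old_count = count_groups(forward_old, old_args)

# With the assumed newer layout the same lookup lands on a bool.
try:
    count_groups(forward_new, new_args)
    error_message = None
except TypeError as exc:
    error_message = str(exc)

print(old_count, error_message)
```

The sketch only shows why a consumer that depends on the position of an element inside non_tensor_args is sensitive to upstream reordering. It does not propose a Bridge-side workaround, which the note above already calls brittle.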

3 participants