[GB300][SGLang] Bump SGLang image for dsv4-fp4-gb300-dynamo-sglang-mtp by Fridge003 · Pull Request #1559 · SemiAnalysisAI/InferenceX

Fridge003 · 2026-05-22T22:55:14Z

Summary

Bump SGLang container image from lmsysorg/sglang:nightly-dev-cu13-20260510-2473659e to lmsysorg/sglang:nightly-dev-20260522-c9153da5 across all six DeepSeek-V4 8k1k disagg recipes.

Update SGLang container image from nightly-dev-cu13-20260510-2473659e to nightly-dev-20260522-c9153da5 across all DeepSeek-V4 8k1k disagg recipes.

github-actions · 2026-05-22T22:55:21Z

Thanks for the contribution! For vLLM & SGLang, please ensure that your recipes are similar to the official vLLM recipes and/or the SGLang cookbook.

If it is not, please create a PR first before we can merge your single node PR into the master branch. Let's ensure that the documentation is first class such that the entire ML community can benefit from your hard work! Thank you

PR authors are responsible for ensuring that after merging, all GitHub Action jobs fully pass. A lot of the time, failures are just flakes and simply re-running the failed jobs will fix it. If re-running failed jobs is attempted, PR authors are responsible for ensuring it passes. See GitHub's docs on re-running failed jobs: https://docs.github.com/en/actions/how-tos/manage-workflow-runs/re-run-workflows-and-jobs#re-running-failed-jobs-in-a-workflow

As a rule of thumb, generally, PR authors should request a review & get a PR approval from the respective companies' CODEOWNERS before requesting a review from core maintainers.

If additional help is needed, PR authors can reach out to core maintainers over Slack.

claude · 2026-05-22T23:04:43Z

 model:
  path: "deepseek-v4-pro"
-  container: "lmsysorg/sglang:nightly-dev-cu13-20260510-2473659e"
+  container: "lmsysorg/sglang:nightly-dev-20260522-c9153da5"


🔴 The 6 recipe YAMLs are bumped to lmsysorg/sglang:nightly-dev-20260522-c9153da5, but the matching image: field on the dsv4-fp4-gb300-dynamo-sglang-mtp block in .github/configs/nvidia-master.yaml (line 9073) is left at the stale lmsysorg/sglang:nightly-dev-cu13-20260509-9ee83034. Per AGENTS.md the two must be bumped in lockstep — the launcher uses image: as the container-alias key, so without this update CI will still import/run the old image and the perf-changelog claim is untrue. Fix: bump line 9073 of nvidia-master.yaml to the same nightly-dev-20260522-c9153da5 tag.

Extended reasoning...

What's wrong

This PR bumps model.container in all six benchmarks/multi_node/srt-slurm-recipes/sglang/deepseek-v4/8k1k/*-mtp.yaml files from lmsysorg/sglang:nightly-dev-cu13-20260510-2473659e to lmsysorg/sglang:nightly-dev-20260522-c9153da5, and adds a perf-changelog entry that explicitly claims the image was updated for dsv4-fp4-gb300-dynamo-sglang-mtp. However .github/configs/nvidia-master.yaml line 9073 (the image: field on the dsv4-fp4-gb300-dynamo-sglang-mtp block) still reads lmsysorg/sglang:nightly-dev-cu13-20260509-9ee83034 — an even older 20260509 tag from before the previous bump.

Why this matters

AGENTS.md line 115 documents the invariant explicitly: multi-node srt-slurm changes must edit the recipe yaml AND nvidia-master.yaml together, and for image bumps model.container must equal image: because the launcher uses the latter as the container-alias key. Concretely, .github/workflows/profile.yml reads matrix.config.image from nvidia-master.yaml into the IMAGE env var, and runners/launch_gb300-cw.sh uses it both to build/import the enroot squash file (enroot import -o ... docker://$image) and to register the alias in the generated srtslurm.yaml containers map (${IMAGE}: ${SQUASH_FILE}). The recipe's container: is then matched against that alias by srtctl.
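The lockstep invariant described above lends itself to a mechanical check. The following is a minimal sketch, not the repository's actual tooling: `find_lockstep_mismatches` is a hypothetical helper, and its inputs are plain values standing in for the parsed `image:` field and the recipes' `model.container` fields. The recipe file name used below is illustrative only.

```python
from typing import Dict, List


def find_lockstep_mismatches(master_image: str, recipe_containers: Dict[str, str]) -> List[str]:
    """Return recipe names whose container differs from the master image.

    Hypothetical helper: `master_image` stands in for the `image:` value in
    nvidia-master.yaml, and `recipe_containers` maps a recipe file name to its
    `model.container` value. Parsing the YAML files is left out on purpose.
    """
    return sorted(
        name
        for name, container in recipe_containers.items()
        if container != master_image
    )


# Image tags are taken from this PR; the recipe file name is made up.
master_image = "lmsysorg/sglang:nightly-dev-cu13-20260509-9ee83034"
recipes = {"example-mtp.yaml": "lmsysorg/sglang:nightly-dev-20260522-c9153da5"}

print(find_lockstep_mismatches(master_image, recipes))
```

A check along these lines could run before the sweep is launched, so that a diverged tag is reported as a list of recipe names rather than surfacing later as a launch problem.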

Precedent

The sibling non-MTP PR #1528 (commit 59980fe) for dsv4-fp4-gb300-dynamo-sglang updated BOTH .github/configs/nvidia-master.yaml AND the recipe YAMLs in lockstep. After that PR, the non-MTP block at line 8760 sits at nightly-dev-cu13-20260520-425dffbd matching its recipe — a consistent lockstep. The MTP variant has now diverged 13 days from its recipe, and the new tag has dropped the cu13 prefix.

Step-by-step proof of impact

CI launches the sweep, and profile.yml reads matrix.config.image from nvidia-master.yaml → IMAGE=lmsysorg/sglang:nightly-dev-cu13-20260509-9ee83034.

runners/launch_gb300-cw.sh runs enroot import -o $SQUASH_FILE docker://$IMAGE — squashing the 20260509 image.

The generated srtslurm.yaml registers containers: { "${IMAGE}": ${SQUASH_FILE} } — keyed by the 20260509 tag.

srtctl loads the recipe yaml, sees model.container: lmsysorg/sglang:nightly-dev-20260522-c9153da5 — does not match the alias.

Result is one of: (a) srtctl falls back to a fresh docker pull of the 20260522 image at runtime (defeating the pre-stage), (b) the alias mismatch causes a launch failure, or (c) the bench actually runs against the 20260509 squash file — invalidating the perf-changelog claim. All three are bad outcomes.
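The mismatch in the steps above can be illustrated with a small model of the alias map. This is a sketch under stated assumptions: alias resolution is modeled as an exact-match dictionary lookup, `resolve_container` is a hypothetical name, and the squash file path is a placeholder. How srtctl actually behaves on a miss is not modeled, which is why the comment lists three possible outcomes.

```python
from typing import Dict, Optional


def resolve_container(aliases: Dict[str, str], recipe_container: str) -> Optional[str]:
    """Look up the squash file registered for a recipe's container string.

    Assumption: resolution is an exact-match dict lookup. What srtctl does on
    a miss is not modeled here; None only marks that no alias was found.
    """
    return aliases.get(recipe_container)


image = "lmsysorg/sglang:nightly-dev-cu13-20260509-9ee83034"  # image: in nvidia-master.yaml
squash_file = "/path/to/sglang.sqsh"  # placeholder path, not the launcher's real one
aliases = {image: squash_file}  # mirrors containers: { "${IMAGE}": ${SQUASH_FILE} }

recipe_container = "lmsysorg/sglang:nightly-dev-20260522-c9153da5"  # model.container in the recipes

print(resolve_container(aliases, recipe_container))  # no alias registered for the new tag
print(resolve_container(aliases, image))  # the stale tag is the only registered key
```

Once the `image:` field carries the same tag as the recipes, the first lookup would hit the registered squash file instead of missing.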

Fix

Bump .github/configs/nvidia-master.yaml line 9073 from lmsysorg/sglang:nightly-dev-cu13-20260509-9ee83034 to lmsysorg/sglang:nightly-dev-20260522-c9153da5 in this PR, matching the recipe container: values and the lockstep pattern established by PR #1528.

claude · 2026-05-22T23:04:43Z

+    - dsv4-fp4-gb300-dynamo-sglang-mtp
+  description:
+    - "Update SGLang image from nightly-dev-cu13-20260510-2473659e to nightly-dev-20260522-c9153da5"
+  pr-link: https://github.com/SemiAnalysisAI/InferenceX/pull/1559


🟡 The new perf-changelog.yaml entry for dsv4-fp4-gb300-dynamo-sglang-mtp has an unresolved template placeholder: pr-link: https://github.com/SemiAnalysisAI/InferenceX/pull/PLACEHOLDER. This link will 404 when clicked — please substitute the real PR number (1559) before merging.

Extended reasoning...

What the bug is

The newly-added entry in perf-changelog.yaml at line 3137 ends with:

- config-keys:
    - dsv4-fp4-gb300-dynamo-sglang-mtp
  description:
    - "Update SGLang image from nightly-dev-cu13-20260510-2473659e to nightly-dev-20260522-c9153da5"
  pr-link: https://github.com/SemiAnalysisAI/InferenceX/pull/PLACEHOLDER

The literal string PLACEHOLDER was never replaced with the actual PR number. (Note: the rendered PR diff in some GitHub views may show pull/1559, but the actual committed file content — what reviewers will merge — contains PLACEHOLDER. git show f66004e -- perf-changelog.yaml confirms the committed diff added the literal PLACEHOLDER text.)

Why existing code doesn't catch it

A repo-wide grep for PLACEHOLDER returns only this one location, so there is no post-merge template-substitution step that would replace it. Schema validation (e.g. utils/matrix_logic/validation.py) only checks that pr-link is a string — it does not validate URL format or that the path resolves, so the entry passes validation cleanly while still producing a broken link.
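A stricter check on `pr-link` would catch this class of mistake. The sketch below is an assumption about what such a check could look like, not the contents of utils/matrix_logic/validation.py: `PR_LINK_RE` and `is_valid_pr_link` are hypothetical names, and the check covers URL format only, not whether the PR exists.

```python
import re

# Assumed pattern: a pull-request URL in this repository ending in a numeric ID.
PR_LINK_RE = re.compile(r"https://github\.com/SemiAnalysisAI/InferenceX/pull/\d+")


def is_valid_pr_link(value: object) -> bool:
    """Return True only for a string that is a numeric PR URL for this repo."""
    return isinstance(value, str) and PR_LINK_RE.fullmatch(value) is not None


base = "https://github.com/SemiAnalysisAI/InferenceX/pull/"
print(is_valid_pr_link(base + "1559"))         # True
print(is_valid_pr_link(base + "PLACEHOLDER"))  # False
```

Using `fullmatch` rather than `match` means trailing text after the PR number is rejected as well.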

Impact

This is documentation-only — there is no runtime effect. Anyone clicking the link from the changelog to find the originating PR for the dsv4-fp4-gb300-dynamo-sglang-mtp image bump will hit a 404. Every other recent entry in this file uses the real PR number (e.g. pull/1554, pull/1555, pull/1516, pull/1514), so this entry breaks the established convention.

Fix

Replace PLACEHOLDER with 1559 (this PR's number):

pr-link: https://github.com/SemiAnalysisAI/InferenceX/pull/1559

Step-by-step proof

Open perf-changelog.yaml and look at line 3137. Actual content: pr-link: https://github.com/SemiAnalysisAI/InferenceX/pull/PLACEHOLDER.

Run git show f66004e -- perf-changelog.yaml — the added line is + pr-link: https://github.com/SemiAnalysisAI/InferenceX/pull/PLACEHOLDER, i.e. the committed diff contains the placeholder literally.

Run grep -r PLACEHOLDER over the repo — PLACEHOLDER appears in no other file and in no workflow under .github/, so there is no substitution mechanism that would replace it post-merge.

Navigate to https://github.com/SemiAnalysisAI/InferenceX/pull/PLACEHOLDER — GitHub returns 404 (path is not a valid PR number).
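The repo-wide search in the steps above can also be expressed in Python, for example as part of a pre-merge script. This is a rough equivalent of `grep -rn`, written as a sketch: `find_literal` is a hypothetical helper, and skipping the `.git` directory and unreadable files is a choice made here, not something the review specifies.

```python
from pathlib import Path
from typing import List, Tuple


def find_literal(root: Path, needle: str) -> List[Tuple[str, int]]:
    """Return (relative path, line number) for every line containing `needle`."""
    hits: List[Tuple[str, int]] = []
    for path in sorted(root.rglob("*")):
        # Skip directories and anything inside the .git metadata directory.
        if not path.is_file() or ".git" in path.relative_to(root).parts:
            continue
        try:
            text = path.read_text(encoding="utf-8")
        except (UnicodeDecodeError, OSError):
            continue  # skip binary or unreadable files
        for lineno, line in enumerate(text.splitlines(), start=1):
            if needle in line:
                hits.append((path.relative_to(root).as_posix(), lineno))
    return hits


if __name__ == "__main__":
    # Scan the current working directory, assumed to be the repository root.
    for rel_path, lineno in find_literal(Path("."), "PLACEHOLDER"):
        print(f"{rel_path}:{lineno}")
```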

- Drop SGLANG_OPT_USE_JIT_NORM, SGLANG_OPT_USE_JIT_INDEXER_METADATA, SGLANG_OPT_USE_TOPK_V2 (now default-on in latest sglang).
- Drop the MegaMoE companion envs that sglang now auto-sets when SGLANG_OPT_USE_DEEPGEMM_MEGA_MOE is enabled: SGLANG_OPT_USE_DEEPGEMM_MEGA_MOE, SGLANG_OPT_FIX_HASH_MEGA_MOE, SGLANG_OPT_FIX_MEGA_MOE_MEMORY, SGLANG_OPT_FIX_NEXTN_MEGA_MOE, SGLANG_DEEPEP_NUM_MAX_DISPATCH_TOKENS_PER_RANK.
- Drop SGLANG_RADIX_DISABLE_REUSE and SGLANG_OPT_USE_FAST_MASK_EP which no longer exist in sglang's environ.py.

github-actions · 2026-05-22T23:34:06Z

see unofficial run visualizer at https://inferencex.semianalysis.com/inference?unofficialRun=26315949011
see unofficial run visualizer at https://inferencex.semianalysis.com/evaluation?unofficialRun=26315949011

github-actions · 2026-05-22T23:35:46Z

see unofficial run visualizer at https://inferencex.semianalysis.com/inference?unofficialRun=26316947560
see unofficial run visualizer at https://inferencex.semianalysis.com/evaluation?unofficialRun=26316947560

github-actions · 2026-05-22T23:37:10Z

see unofficial run visualizer at https://inferencex.semianalysis.com/inference?unofficialRun=26317022901
see unofficial run visualizer at https://inferencex.semianalysis.com/evaluation?unofficialRun=26317022901

- Update dynamo commit hash to 81d0555ee23519cea80a42b4fe824e30368b7300 across all 6 dsv4 8k1k disagg recipes.
- Quote moe-a2a-backend value as "megamoe" for consistency with other string fields.
- Remove the now-unused deepep-config entries; megamoe doesn't read them.

github-actions · 2026-05-23T00:41:20Z

see unofficial run visualizer at https://inferencex.semianalysis.com/inference?unofficialRun=26318696409
see unofficial run visualizer at https://inferencex.semianalysis.com/evaluation?unofficialRun=26318696409

github-actions · 2026-05-23T01:40:02Z

see unofficial run visualizer at https://inferencex.semianalysis.com/inference?unofficialRun=26318696409
see unofficial run visualizer at https://inferencex.semianalysis.com/evaluation?unofficialRun=26318696409

github-actions · 2026-05-24T05:35:05Z

see unofficial run visualizer at https://inferencex.semianalysis.com/inference?unofficialRun=26318696409
see unofficial run visualizer at https://inferencex.semianalysis.com/evaluation?unofficialRun=26318696409

[GB300][SGLang] Bump SGLang image for dsv4-fp4-gb300-dynamo-sglang-mtp

f66004e

Update SGLang container image from nightly-dev-cu13-20260510-2473659e to nightly-dev-20260522-c9153da5 across all DeepSeek-V4 8k1k disagg recipes.

Fridge003 requested a review from a team May 22, 2026 22:55

github-project-automation Bot added this to InferenceMAX Board May 22, 2026

Update perf-changelog.yaml with PR #1559 link

a6666cc

Fridge003 added the full-sweep-enabled label May 22, 2026

claude Bot reviewed May 22, 2026

Switch moe-a2a-backend from deepep to megamoe in MegaMoE blocks

0be6580

Fridge003 wants to merge 5 commits into main from sgl_image_bump_dsv4
