fix(sglang): honor ModelExpress model name override#441

Open
Broduker wants to merge 2 commits into ai-dynamo:main from Broduker:feat/sglang-model-name-override

@Broduker Broduker commented Jun 17, 2026

Summary

  • Honor load_config.modelexpress_model_name when SGLang builds SourceIdentity.model_name.
  • Preserve the existing fallback to model_config.model_path / model_config.model.
  • Add a regression test for the SGLang ModelExpress model-name override.
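
The resolution order described in the bullets above can be sketched as follows. This is a minimal illustration, not the adapter's actual code: `get_model_name` and the `SimpleNamespace` stand-ins for `model_config` and `load_config` are hypothetical, and raising `ValueError` when no name is available is an assumption.

```python
from types import SimpleNamespace


def get_model_name(model_config, load_config=None) -> str:
    """Resolve the identity model name: override first, then model_path, then model."""
    if load_config is not None:
        override = getattr(load_config, "modelexpress_model_name", None)
        if override:
            return override
    # Fallback order follows the PR summary: model_path, then model.
    name = getattr(model_config, "model_path", None) or getattr(model_config, "model", None)
    if not name:
        # Assumption: fail loudly when no name can be resolved.
        raise ValueError("model_config has neither model_path nor model")
    return name


# Hypothetical stand-ins for the real SGLang config objects.
model_config = SimpleNamespace(model_path="/mnt/local/llama")
load_config = SimpleNamespace(modelexpress_model_name="org/llama")

print(get_model_name(model_config, load_config))  # override wins
print(get_model_name(model_config))               # falls back to model_path
```

Using `getattr` with a default keeps the lookup safe when the running SGLang version's `LoadConfig` does not define the field at all.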

Problem

Without this, SGLang uses the local model path as SourceIdentity.model_name, which changes the computed mx_source_id and can prevent ModelExpress from matching the intended source.
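
To make the mismatch concrete, here is a toy sketch of an identity hash. It is not how ModelExpress computes mx_source_id: `toy_source_id`, its `dtype` and `tp_size` fields, and the example paths are all assumptions chosen only to show that any hash covering the model name diverges when two instances mount the same model at different local paths.

```python
import hashlib
import json


def toy_source_id(model_name: str, dtype: str = "bfloat16", tp_size: int = 1) -> str:
    """Hypothetical stand-in for an identity hash; NOT the real mx_source_id."""
    payload = json.dumps(
        {"model_name": model_name, "dtype": dtype, "tp_size": tp_size},
        sort_keys=True,
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()[:16]


# Same model, different local mount points: the ids diverge.
source_id = toy_source_id("/data/models/llama")
target_id = toy_source_id("/mnt/cache/llama")
print(source_id == target_id)

# With a shared logical name on both sides, the ids agree.
shared_id_a = toy_source_id("org/llama")
shared_id_b = toy_source_id("org/llama")
print(shared_id_a == shared_id_b)
```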

Tests

  • uv run --extra dev pytest tests/test_sglang_loader.py -q

Summary by CodeRabbit

  • New Features

    • SGLang engine adapter now supports optional model name override via configuration.
  • Bug Fixes

    • Improved tensor discovery in SGLang adapter to include hidden tensors and buffers, with proper deduplication for shared storage.
    • Enhanced tensor attribute capture during weight processing.
  • Tests

    • Added test coverage for model name override functionality and tensor collection behavior.

Broduker added 2 commits June 17, 2026 17:17
Signed-off-by: shenls <shenlinshan@kanzhun.com>
Signed-off-by: shenls <shenlinshan@kanzhun.com>

copy-pr-bot Bot commented Jun 17, 2026

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

coderabbitai Bot commented Jun 17, 2026

Contributor

Review Change Stack

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 4cdd1be1-f6b0-4eda-83cd-61485c2deec0

📥 Commits

Reviewing files that changed from the base of the PR and between 0c45617 and f824830.

📒 Files selected for processing (3)
  • .github/copy-pr-bot.yaml
  • modelexpress_client/python/modelexpress/engines/sglang/adapter.py
  • modelexpress_client/python/tests/test_sglang_loader.py

Walkthrough

The SGLang adapter is updated to collect model buffers in addition to parameters for NIXL tensor registration (with storage-pointer deduplication), adopt hidden tensors during discovery, and wrap quantization post-load processing inside capture_tensor_attrs. build_sglang_source_identity and _get_model_name gain an optional load_config parameter that allows overriding the identity model name. Three new unit tests cover these behaviors. The copy-pr-bot CI config is toggled to enabled: true.
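
The parameter-plus-buffer collection with pointer-based deduplication described above might look roughly like this. It is a sketch under assumptions: `collect_tensors` is a hypothetical name, and it keys on `data_ptr()`, whereas the adapter is described as deduplicating by storage pointer, which can differ for views with a nonzero offset. `FakeTensor` and `FakeModule` are stand-ins so the example runs without PyTorch; the function relies only on `named_parameters()`, `named_buffers()`, `.data`, and `.data_ptr()`, which `torch.nn.Module` and `torch.Tensor` also provide.

```python
def collect_tensors(module) -> dict:
    """Collect parameters and buffers by name, skipping memory already seen."""
    tensors = {}
    seen_ptrs = set()
    # Parameters go first, so a buffer aliasing a parameter is the one dropped.
    for named in (module.named_parameters(), module.named_buffers()):
        for name, tensor in named:
            data = tensor.data  # unwrap to the underlying tensor
            ptr = data.data_ptr()
            if ptr in seen_ptrs:
                continue
            seen_ptrs.add(ptr)
            tensors[name] = data
    return tensors


class FakeTensor:
    """Stand-in for torch.Tensor exposing only .data and .data_ptr()."""

    def __init__(self, ptr):
        self._ptr = ptr
        self.data = self

    def data_ptr(self):
        return self._ptr


class FakeModule:
    """Stand-in module with one parameter, one buffer, and one aliasing buffer."""

    def __init__(self):
        shared = FakeTensor(0x1000)
        self._params = [("weight", shared)]
        self._buffers = [("scale", FakeTensor(0x2000)), ("weight_alias", shared)]

    def named_parameters(self):
        return iter(self._params)

    def named_buffers(self):
        return iter(self._buffers)


collected = collect_tensors(FakeModule())
print(sorted(collected))  # ['scale', 'weight']
```

The buffer `weight_alias` is skipped because it points at the same memory as the `weight` parameter, so that memory would be registered only once.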

Changes

SGLang Adapter: tensor collection and identity changes

Layer / File(s) | Summary
load_config model name override in SourceIdentity: modelexpress_client/python/modelexpress/engines/sglang/adapter.py, modelexpress_client/python/tests/test_sglang_loader.py | SglangAdapter.build_identity() passes load_config into build_sglang_source_identity; _get_model_name returns load_config.modelexpress_model_name when set, otherwise falls back to model_config.model_path/model_config.model; test asserts the override is preferred over model_path.
Tensor discovery: buffers, hidden tensors, and quant attrs: modelexpress_client/python/modelexpress/engines/sglang/adapter.py, modelexpress_client/python/tests/test_sglang_loader.py | Imports adopt_hidden_tensors and capture_tensor_attrs; discover_tensors calls adopt_hidden_tensors before collection; quant post-load loop is wrapped in capture_tensor_attrs; collect_sglang_tensors now iterates both named_parameters and named_buffers with .data extraction and storage-pointer deduplication; tests cover buffer inclusion and parameter-vs-buffer deduplication.

CI copy-pr-bot config toggle

Layer / File(s) | Summary
copy-pr-bot enabled flag: .github/copy-pr-bot.yaml | Sets enabled: true at the top level.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Poem

🐇 Hopping through tensors, parameters and buffers too,
Dedup by storage — no duplicates will do!
Hidden tensors adopted, quant attrs wrapped neat,
Model name overrides make identity complete.
The bot is now enabled: true, hooray!
This bunny approves — what a wonderful day! 🎉

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name | Status | Explanation | Resolution
Docstring Coverage | ⚠️ Warning | Docstring coverage is 12.50%, which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name | Status | Explanation
Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled.
Title check | ✅ Passed | The title accurately describes the main change: honoring the ModelExpress model name override in SGLang, which is the core purpose of this PR.
Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.


Comment @coderabbitai help to get the list of available commands and usage tips.

@Broduker Broduker force-pushed the feat/sglang-model-name-override branch from f824830 to 6d9c865 Compare June 17, 2026 11:21

@zhengluo-nv zhengluo-nv left a comment

Contributor

I found one integration-contract issue to address.

    load_config: LoadConfig | None = None,
) -> str:
    if load_config is not None:
        override = getattr(load_config, "modelexpress_model_name", None)

Contributor

This override does not appear reachable from the current SGLang runtime contract. Upstream SGLang main only carries modelexpress_url and modelexpress_transport on LoadConfig, and --modelexpress-config only parses url / transport, so the documented launch path never sets load_config.modelexpress_model_name. That means SourceIdentity.model_name still falls back to model_config.model_path, leaving the local-path mismatch described in the PR unresolved unless a matching SGLang change also adds and passes this field. Please either add/land that SGLang plumbing and document the key, or derive the intended model name from an existing runtime field.
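
The reachability concern can be illustrated with a small sketch. `LoadConfigStandIn` is a hypothetical stand-in that carries only the two fields the comment above attributes to upstream `LoadConfig`; the URL, transport value, and model path are made-up example values, not SGLang defaults.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class LoadConfigStandIn:
    """Hypothetical stand-in with only the two fields named in the review."""

    modelexpress_url: Optional[str] = None
    modelexpress_transport: Optional[str] = None


# Illustrative values only.
load_config = LoadConfigStandIn(
    modelexpress_url="mx-server:8001",
    modelexpress_transport="nixl",
)

# The field is never defined, so the getattr default is always returned.
override = getattr(load_config, "modelexpress_model_name", None)
model_path = "/mnt/local/llama"
resolved = override or model_path
print(resolved)  # the local path, because no override is reachable
```

As long as nothing upstream sets the attribute, the override branch never executes and the identity keeps using the local path.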

@Broduker Broduker Jun 26, 2026

Author

Thanks, @zhengluo-nv.

Originally, we wanted to solve the SGLang same-configuration loading issue for ModelExpress: source and target instances should be able to start with the same SGLang/Dynamo configuration, without explicitly separating source/target roles or requiring seed instance settings. In our custom SGLang branch, this was implemented by delegating the adaptive loading policy to the ModelExpress package, and modelexpress_config.model_name was passed down as load_config.modelexpress_model_name to keep source/target identity stable when their local model paths differed.

This PR was created to support that custom SGLang behavior on the ModelExpress side, by making the SGLang adapter honor load_config.modelexpress_model_name when building SourceIdentity.

However, upstream SGLang has since been updated to delegate ModelExpress loading to the ModelExpress package. The upstream implementation also supports same-configuration adaptive loading for the ModelExpress backend, but it does not expose or pass modelexpress_model_name; it derives identity from model_config.model_path.

So this PR is no longer necessary for the upstream SGLang path.
