fix(sglang): honor ModelExpress model name override#441
Conversation
Signed-off-by: shenls <shenlinshan@kanzhun.com>
Signed-off-by: shenls <shenlinshan@kanzhun.com>
|
No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review info⚙️ Run configurationConfiguration used: Path: .coderabbit.yaml Review profile: CHILL Plan: Enterprise Run ID: 📒 Files selected for processing (3)
WalkthroughThe SGLang adapter is updated to collect model buffers in addition to parameters for NIXL tensor registration (with storage-pointer deduplication), adopt hidden tensors during discovery, and wrap quantization post-load processing inside ChangesSGLang Adapter: tensor collection and identity changes
CI copy-pr-bot config toggle
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes Poem
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. Comment |
f824830 to
6d9c865
Compare
zhengluo-nv
left a comment
There was a problem hiding this comment.
I found one integration-contract issue to address.
| load_config: LoadConfig | None = None, | ||
| ) -> str: | ||
| if load_config is not None: | ||
| override = getattr(load_config, "modelexpress_model_name", None) |
There was a problem hiding this comment.
This override does not appear reachable from the current SGLang runtime contract. Upstream SGLang main only carries modelexpress_url and modelexpress_transport on LoadConfig, and --modelexpress-config only parses url / transport, so the documented launch path never sets load_config.modelexpress_model_name. That means SourceIdentity.model_name still falls back to model_config.model_path, leaving the local-path mismatch described in the PR unresolved unless a matching SGLang change also adds and passes this field. Please either add/land that SGLang plumbing and document the key, or derive the intended model name from an existing runtime field.
There was a problem hiding this comment.
Thanks, @zhengluo-nv.
Originally, we wanted to solve the SGLang same-configuration loading issue for ModelExpress: source and target instances should be able to start with the same SGLang/Dynamo configuration, without explicitly separating source/target roles or requiring seed instance settings. In our custom SGLang branch, this was implemented by delegating the adaptive loading policy to the ModelExpress package, and modelexpress_config.model_name was passed down as load_config.modelexpress_model_name to keep source/target identity stable when their local model paths differed.
This PR was created to support that custom SGLang behavior on the ModelExpress side, by making the SGLang adapter honor load_config.modelexpress_model_name when building SourceIdentity.
However, SGLang has now been addressed that delegates ModelExpress loading to the ModelExpress package. The upstream implementation also supports same-configuration adaptive loading for the ModelExpress backend, but it does not expose or pass modelexpress_model_name; it derives identity from model_config.model_path.
So this PR is no longer necessary for the upstream SGLang path.
Summary
load_config.modelexpress_model_namewhen SGLang buildsSourceIdentity.model_name.model_config.model_path/model_config.model.Problem
Without this, SGLang uses the local model path as
SourceIdentity.model_name, which changes the computedmx_source_idand can prevent ModelExpress from matching the intended source.Tests
uv run --extra dev pytest tests/test_sglang_loader.py -qSummary by CodeRabbit
New Features
Bug Fixes
Tests