fix(disagg): NVFP4 + publish-only rollout fixes for disaggregated tra… #1
Open
jvmncs wants to merge 1 commit into
…ining

Found bringing up disaggregated NVFP4 rollout (cookbook/miles_disagg, Moonlight/Kimi-K2.6) on Modal:

- megatron_to_hf/processors: route quant_algo=="NVFP4" to quantize_params_nvfp4. modelopt NVFP4 checkpoints advertise quant_method="modelopt", so dispatch on quant_algo; NVFP4 export was never reached before.
- rollout/sglang_rollout: GenerateState.semaphore was Semaphore(0) in publish-only mode (rollout_num_gpus==0) -> every rollout deadlocked. Bound generation concurrency by sglang_server_concurrency when rollout_endpoint_url is set.
- utils/http_utils: init_http_client early-returned when rollout_num_gpus==0, leaving _http_client=None -> "'NoneType' has no attribute 'post'". Initialize it and bound _client_concurrency by sglang_server_concurrency in publish-only mode.
- update_weight/update_weight_from_disk_delta: flatten before .view(torch.uint8) (.contiguous().reshape(-1).view) so 0-dim NVFP4 scalar tensors (weight_scale_2, input_scale) don't crash the disk-delta encode/snapshot.
- chat_template_utils/deepseek_v4: guard the `encoding_dsv4` import so non-V4 models load on sglang-miles builds without that symbol.
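The publish-only deadlock described above can be reproduced in isolation. The following is a minimal sketch, not the code from this PR: `concurrency_limit` and `run_rollouts` are hypothetical helpers, the "GPUs times per-GPU concurrency" sizing is an assumption about how the limit was derived, and the endpoint URL is a placeholder. Only the option names (`rollout_num_gpus`, `rollout_endpoint_url`, `sglang_server_concurrency`) come from the description.

```python
import asyncio


def concurrency_limit(rollout_num_gpus, per_gpu_concurrency,
                      rollout_endpoint_url=None, sglang_server_concurrency=1):
    """Hypothetical helper: choose how many generations may run at once."""
    if rollout_num_gpus == 0 and rollout_endpoint_url is not None:
        # Publish-only mode: no local rollout GPUs, so bound by the
        # remote server's concurrency instead of a GPU-derived value.
        return sglang_server_concurrency
    # Assumed sizing rule; it evaluates to 0 when rollout_num_gpus == 0.
    return rollout_num_gpus * per_gpu_concurrency


async def run_rollouts(limit, n_requests, timeout=0.2):
    """Run n_requests dummy generations under a semaphore of size `limit`."""
    sem = asyncio.Semaphore(limit)

    async def generate_one(i):
        async with sem:  # with limit == 0 this never acquires
            await asyncio.sleep(0)
            return i

    try:
        return await asyncio.wait_for(
            asyncio.gather(*(generate_one(i) for i in range(n_requests))),
            timeout,
        )
    except asyncio.TimeoutError:
        return None  # every task was still waiting on the semaphore


def demo():
    old_limit = concurrency_limit(0, 8)
    # Placeholder URL, used only to mark "an external endpoint is set".
    new_limit = concurrency_limit(0, 8, "http://localhost:30000", 4)
    stuck = asyncio.run(run_rollouts(old_limit, 3))
    done = asyncio.run(run_rollouts(new_limit, 3))
    return old_limit, new_limit, stuck, done
```

With a limit of 0 the gather times out because no task can enter the `async with` block; with the server-derived limit the same requests complete.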
NOTE: Requires Megatron-LM patch here
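The guarded `encoding_dsv4` import can be sketched generically as follows. This is an illustration under stated assumptions, not the change in this PR: `optional_import` and `require_symbol` are hypothetical helpers, and the module path passed in at the bottom is a made-up placeholder, since the real location of `encoding_dsv4` is not given here.

```python
import importlib


def optional_import(module_name, symbol):
    """Return module.symbol, or None if the module or the symbol is missing."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        return None
    return getattr(module, symbol, None)


def require_symbol(value, name):
    """Fail only at the point of use, so unrelated models still load."""
    if value is None:
        raise RuntimeError(f"{name} is not available in this build")
    return value


# Placeholder module path (an assumption); resolves to None when absent,
# so importing this file never fails on builds without the symbol.
encoding_dsv4 = optional_import("hypothetical_sglang_pkg.encoding", "encoding_dsv4")
```

Code paths that need the V4 encoding would call `require_symbol(encoding_dsv4, "encoding_dsv4")` and get a clear error, while non-V4 models never touch it.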