Fix MLite microbatch loss and forward-only output contracts#68
Open
ISEEKYAN wants to merge 4 commits into
e9ca87c to 8a5d864
Mirror VERL’s Megatron loss-reduction hook so schedules retain standard microbatch averaging while logical-batch PPO gradients and per-micro reporting remain correct. Preserve loss context propagation, all-micro metric aggregation, and PP1 forward-only outputs.
8a5d864 to 5d2f0c9
Summary
logical_loss * num_microbatches to the schedule while retaining MLite’s standard schedule-side microbatch averaging.
Why
VERL PPO/SFT losses are already contributions normalized against the logical global batch. Megatron does not change its runtime API for this case: its VERL postprocess hook compensates for the schedule’s fixed microbatch averaging and reports the unscaled reduction payload separately. MLite now follows the same contract, keeping connector-specific normalization out of the public runtime interface.
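The arithmetic behind this contract can be sketched in plain Python. This is a minimal illustration under stated assumptions, not MLite's or Megatron's actual code: `loss_hook` and `run_schedule` are hypothetical names, and the sketch assumes the schedule divides each microbatch loss by the number of microbatches before accumulating.

```python
def loss_hook(logical_loss, num_microbatches):
    """Hypothetical postprocess hook (illustrative, not the real API).

    Returns the value handed to the schedule, pre-multiplied so that the
    schedule's fixed division cancels out, plus the unscaled value that
    is reported separately.
    """
    return logical_loss * num_microbatches, logical_loss


def run_schedule(microbatch_losses, hook):
    """Toy schedule that averages over microbatches, as assumed above."""
    n = len(microbatch_losses)
    accumulated = 0.0
    reported = []
    for loss in microbatch_losses:
        scaled, payload = hook(loss, n)
        accumulated += scaled / n  # schedule-side microbatch averaging
        reported.append(payload)   # unscaled per-microbatch report
    return accumulated, reported


# Per-token losses for one logical batch split into unequal microbatches.
token_losses = [[1.0, 2.0, 3.0], [4.0]]
total_tokens = sum(len(mb) for mb in token_losses)

# Each microbatch loss is already normalized against the logical batch.
logical = [sum(mb) / total_tokens for mb in token_losses]

accumulated, reported = run_schedule(logical, loss_hook)
global_mean = sum(sum(mb) for mb in token_losses) / total_tokens

# Without the compensation the schedule would shrink the loss by 1/n.
uncompensated = sum(loss / len(logical) for loss in logical)
```

With the compensation, `accumulated` equals the token-level mean over the whole logical batch, while `uncompensated` is smaller by a factor of the microbatch count. The `reported` list keeps the unscaled per-microbatch contributions for metrics.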
Scope
This PR contains only MLite runtime/connector code and focused tests. It does not include launch scripts, training configurations, or changes to the external VERL repository.
Validation
test_loss_microbatch_contract, test_ops_data_trainstep_unit, test_runtime_backend_unit, test_bridge_backend, and test_mlite_engine_forward_only).
COMPLETED, exit code 0:0.
git diff --check passed.
main.