Migrate GPT-OSS to HybridModel by Phlip79 · Pull Request #4476 · NVIDIA-NeMo/Megatron-Bridge

Phlip79 · 2026-06-24T03:36:45Z

What does this PR do?

Migrates GPT-OSS from GPTModel to HybridModel. To accomplish this, I updated HybridModelProvider to support YaRN, which HybridModel already supports in MCore.
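As a rough illustration of what "supporting YaRN" in a provider can involve, the sketch below maps a Hugging Face style `rope_scaling` dictionary onto provider-style YaRN fields. It is not the code from this PR: `YarnSettings`, `yarn_settings_from_rope_scaling`, the `yarn_*` field names, and the example values are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class YarnSettings:
    """Hypothetical stand-in for the YaRN fields a provider might carry.

    The field names are assumptions for this sketch, not the PR's API.
    """

    position_embedding_type: str
    yarn_rotary_scaling_factor: float
    yarn_original_max_position_embeddings: int
    yarn_beta_fast: float
    yarn_beta_slow: float


def yarn_settings_from_rope_scaling(rope_scaling: dict) -> YarnSettings:
    """Translate an HF-style rope_scaling dict into YaRN settings (illustrative)."""
    rope_type = rope_scaling.get("rope_type")
    if rope_type != "yarn":
        raise ValueError(f"expected rope_type 'yarn', got {rope_type!r}")
    return YarnSettings(
        position_embedding_type="yarn",
        yarn_rotary_scaling_factor=float(rope_scaling["factor"]),
        yarn_original_max_position_embeddings=int(
            rope_scaling["original_max_position_embeddings"]
        ),
        yarn_beta_fast=float(rope_scaling["beta_fast"]),
        yarn_beta_slow=float(rope_scaling["beta_slow"]),
    )


# Illustrative input only; these values are not taken from this PR.
example = yarn_settings_from_rope_scaling(
    {
        "rope_type": "yarn",
        "factor": 32.0,
        "original_max_position_embeddings": 4096,
        "beta_fast": 32.0,
        "beta_slow": 1.0,
    }
)
```

Keeping the translation in one small function makes it easy to assert every YaRN field in a unit test rather than only the model type.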

Perf comparison results

Needs #4508 for correct TFLOPS calculation

Signed-off-by: Philip Petrakian <ppetrakian@nvidia.com>

copy-pr-bot · 2026-06-24T03:36:48Z

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

Phlip79 · 2026-06-24T16:26:15Z

/ok to test c5e00e4

Signed-off-by: Philip Petrakian <ppetrakian@nvidia.com>

Phlip79 · 2026-06-24T16:37:16Z

/ok to test 8e4d0fa

Signed-off-by: Philip Petrakian <ppetrakian@nvidia.com>

claude · 2026-06-25T21:32:52Z

Light Code Review

Overall the migration from GPTModel/GPTModelProvider to HybridModel/HybridModelProvider looks correct and well-tested for the core paths. A few observations:

Missing test for megatron_to_hf_config - The new classmethod that recovers num_hidden_layers from the hybrid layer pattern has no unit test. A small test with a known pattern would guard against regressions in the Symbols.PIPE/MTP_SEPARATOR stripping logic.
Dual model.norm.weight mapping - Intentional per the test, but deserves a brief inline comment explaining why both decoder.final_layernorm.weight and decoder.final_norm.weight are registered.
YARN settings not verified in test - test_provider_bridge_maps_config does not assert position_embedding_type == yarn or any YARN-specific field, even though extending HybridModelProvider with YARN fields is a key part of this PR.
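The first observation could be covered by a test along the following lines. This is a minimal sketch, not the PR's implementation: the helper `count_hidden_layers` is hypothetical, the separator characters `"|"` and `"/"` and the attention symbol `"*"` are assumed values for `Symbols.PIPE`, `Symbols.MTP_SEPARATOR`, and the attention layer symbol, and it assumes one attention symbol per transformer layer.

```python
# Assumed symbol values; the real ones live in MCore's Symbols class.
PIPE = "|"            # assumed pipeline-stage separator
MTP_SEPARATOR = "/"   # assumed separator before the MTP portion of the pattern
ATTENTION = "*"       # assumed symbol for an attention layer


def count_hidden_layers(hybrid_layer_pattern: str) -> int:
    """Recover a layer count from a hybrid layer pattern (hypothetical helper).

    Drops everything after the MTP separator, removes pipeline separators,
    then counts attention symbols, assuming each transformer layer
    contributes exactly one.
    """
    main_pattern = hybrid_layer_pattern.split(MTP_SEPARATOR, 1)[0]
    main_pattern = main_pattern.replace(PIPE, "")
    return main_pattern.count(ATTENTION)


def test_count_hidden_layers_strips_separators() -> None:
    # Four attention+MoE layers split across two pipeline stages.
    assert count_hidden_layers("*E*E|*E*E") == 4
    # The MTP portion after the separator must not be counted.
    assert count_hidden_layers("*E*E/*E") == 2
    # Both separators present at once.
    assert count_hidden_layers("*E|*E/*E") == 2


test_count_hidden_layers_strips_separators()
```

A test against the real `megatron_to_hf_config` would follow the same shape: feed in a known pattern and assert the recovered `num_hidden_layers`.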

Suggested test cases

No perf/recipe config files are changed in this PR, but the migration affects all GPT-OSS perf tests because the underlying provider type changed. All GPT-OSS perf configs (20B and 120B, all GPU targets and precisions) should be validated:

gpt_oss_20b_8gpu_pretrain_perf (all GPU/precision combos)
gpt_oss_120b_pretrain_perf (all GPU/precision combos)
test_gpt_oss_120b_perf_config_instantiation
L1 functional L1_Launch_recipes_gpt_oss (pretrain + finetune)
L1 functional L1_Launch_models_gpt_oss (model conversion tests)

Signed-off-by: Philip Petrakian <ppetrakian@nvidia.com>

refactor(models): migrate GPT-OSS to HybridModel

c5e00e4

Signed-off-by: Philip Petrakian <ppetrakian@nvidia.com>

copy-pr-bot Bot temporarily deployed to public June 24, 2026 16:26 Inactive

style(models): format GPT-OSS bridge

8e4d0fa

Signed-off-by: Philip Petrakian <ppetrakian@nvidia.com>

copy-pr-bot Bot temporarily deployed to public June 24, 2026 16:38 Inactive

copy-pr-bot Bot temporarily deployed to test June 24, 2026 16:38 Inactive

copy-pr-bot Bot temporarily deployed to public June 24, 2026 16:47 Inactive

copy-pr-bot Bot temporarily deployed to public June 24, 2026 17:10 Inactive

Merge remote-tracking branch 'origin/main' into philip/gpt-oss-hybrid

46fb300

Signed-off-by: Philip Petrakian <ppetrakian@nvidia.com>

Phlip79 force-pushed the philip/gpt-oss-hybrid branch from d3165d6 to 46fb300 Compare June 25, 2026 20:45

Phlip79 changed the base branch from main to philip/hybrid-flop-accounting June 25, 2026 21:25

Phlip79 mentioned this pull request Jun 25, 2026

Correct Hybrid FLOP Calculation #4508

Open

Phlip79 marked this pull request as ready for review June 25, 2026 21:27

Phlip79 requested a review from yaoyu-33 June 25, 2026 21:27

claude Bot reviewed Jun 25, 2026

Comment thread src/megatron/bridge/models/gpt_oss/gpt_oss_bridge.py

claude Bot reviewed Jun 25, 2026

Comment thread src/megatron/bridge/models/gpt_oss/gpt_oss_bridge.py

claude Bot reviewed Jun 25, 2026

Comment thread tests/unit_tests/models/gpt_oss/test_gpt_oss_bridges.py

test(gpt-oss): cover yarn hybrid config

422b6df

Signed-off-by: Philip Petrakian <ppetrakian@nvidia.com>

Phlip79 mentioned this pull request Jun 25, 2026

PoR: migrate to MCore HybridModel #4510

Open

5 tasks

Phlip79 wants to merge 4 commits into philip/hybrid-flop-accounting from philip/gpt-oss-hybrid

2 participants
