Issues
is:issue state:open
Search results
#5529: get_grad_norm_fp32 crashes on empty gradient list with inf norm or custom p-norm. Labels: bug. Status: Open. In NVIDIA/Megatron-LM.
#5525: -. Status: Open. In NVIDIA/Megatron-LM.
#5488: Native Liger-Kernel integration in Megatron-LM. Labels: enhancement, waiting-on-maintainers. Status: Open. In NVIDIA/Megatron-LM.
#5474: 🐛 CI failure: test_cuda_graphs.py::TestParallelTransformerBlockCudagraphs::test_gpu_cudagraph (NCCL_GRAPH_REGISTER/expandable_segments env guard). Labels: bug. Status: Open. In NVIDIA/Megatron-LM.
#5473: 🐛 CI failure: tests/unit_tests/ssm/test_gated_delta_net.py::TestGatedDeltaNet::test_selective_recompute_gdn (flaky NCCL hang). Labels: bug. Status: Open. In NVIDIA/Megatron-LM.
#5467: Fix flaky unit test test_save_verify_integrity_manifest_directly. Labels: bug. Status: Open. In NVIDIA/Megatron-LM.
#5457: Weekly CI failure: GB200 Nemotron3 Super mem-max-allocated mismatch. Labels: bug. Status: Open. In NVIDIA/Megatron-LM.
#5456: Weekly CI failure: H100 Mixtral 8x7B iteration-time mismatch. Labels: bug. Status: Open. In NVIDIA/Megatron-LM.
#5455: Weekly CI failure: GPT3 weekly GB200/H100 metric mismatches. Labels: bug. Status: Open. In NVIDIA/Megatron-LM.
#5444: [BUG] Special-id property aliases on MegatronTokenizerTextAbstract infinitely recurse (RecursionError). Labels: waiting-on-customer. Status: Open. In NVIDIA/Megatron-LM.
#5413: [ENHANCEMENT] Gefen optimizer support. Labels: enhancement. Status: Open. In NVIDIA/Megatron-LM.
#5394: [BUG] ChainedOptimizer applies global grad-norm clipping to Muon (orthogonalizing) param groups, silently stalling training. Labels: bug, waiting-on-maintainers. Status: Open. In NVIDIA/Megatron-LM.