fix(checkpoint): ship nemotron_h remote-code in converted HF dirs for vLLM by Kyle1668 · Pull Request #11 · GeodesicResearch/geodesic-megatron

Kyle1668 · 2026-06-16T04:05:49Z

Problem

fixup_hf_output deliberately removes auto_map and the configuration_nemotron_h.py / modeling_nemotron_h.py files after conversion, assuming "transformers >= 5.3.0 has native NemotronH support." That holds for the megatron env (transformers 5.10.x) but breaks every vLLM-based eval/inference consumer: vLLM 0.18.x is hard-pinned to transformers<5,>=4.56.0 (4.57.x), which has no native NemotronH. vLLM's config parse therefore fails with "Transformers does not recognize this architecture" and the model never loads.

This silently broke the GEOD-147 MQ capability evals (few-shot tasks load via vLLM). The manual workaround was to copy the modeling files into each converted dir and set auto_map in its config.json.
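The version split described above can be sketched as a small predicate. This is only an illustration: the 5.3.0 threshold is the one quoted in this description, and the names NATIVE_SINCE, version_tuple and needs_remote_code are hypothetical, not part of the repo. Pre-release suffixes are ignored for simplicity.

```python
import re

# Threshold quoted in the PR description; treated here as an assumption.
NATIVE_SINCE = (5, 3, 0)


def version_tuple(version):
    """Return the first three numeric components of a version string."""
    # Drop any local version suffix such as "+cu121" before parsing.
    numbers = [int(p) for p in re.findall(r"\d+", version.split("+")[0])[:3]]
    # Pad short versions like "5.3" to three components.
    return tuple(numbers + [0] * (3 - len(numbers)))


def needs_remote_code(transformers_version):
    """True if this transformers version predates native NemotronH support."""
    return version_tuple(transformers_version) < NATIVE_SINCE


if __name__ == "__main__":
    for v in ("4.57.1", "5.10.0"):
        print(v, "needs remote code:", needs_remote_code(v))
```

Because the converted dir cannot know which consumer will load it, the fix ships the remote code unconditionally rather than branching on a check like this.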

Fix

Flip the step from remove to ensure-present: fetch the upstream model's remote code (configuration_nemotron_h.py, modeling_nemotron_h.py) and auto_map via hf_hub_download and write them into the converted dir. transformers >= 5.3.0 ignores the remote code (uses native), so it's safe for both consumers. One file, +41/−16.

Note: the export's HF snapshot has config.json (with auto_map) but not the .py remote code. from_hf_pretrained doesn't set trust_remote_code, so the files are never pulled locally, hence hf_hub_download rather than a snapshot copy.
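A minimal sketch of the ensure-present step, under stated assumptions: it is not the actual diff, the helper name ensure_remote_code and the download parameter are hypothetical, and the upstream repo id is passed in by the caller because it is elided in this description. The only real library call is huggingface_hub.hf_hub_download(repo_id=..., filename=...), which returns a local cache path.

```python
import json
import shutil
from pathlib import Path

REMOTE_CODE_FILES = ("configuration_nemotron_h.py", "modeling_nemotron_h.py")


def ensure_remote_code(converted_dir, repo_id, download=None):
    """Copy upstream remote-code files and auto_map into converted_dir."""
    if download is None:
        # Imported lazily so the helper can be exercised offline with a stub.
        from huggingface_hub import hf_hub_download
        download = hf_hub_download

    converted_dir = Path(converted_dir)

    # Fetch each .py file into the HF cache, then copy it next to the weights.
    for name in REMOTE_CODE_FILES:
        cached = download(repo_id=repo_id, filename=name)
        shutil.copyfile(cached, converted_dir / name)

    # Take auto_map from the upstream config rather than hard-coding it.
    upstream_cfg = json.loads(
        Path(download(repo_id=repo_id, filename="config.json")).read_text()
    )
    auto_map = upstream_cfg.get("auto_map")
    if auto_map is None:
        raise ValueError(f"{repo_id} config.json has no auto_map")

    # Merge auto_map into the converted config, leaving other keys untouched.
    cfg_path = converted_dir / "config.json"
    cfg = json.loads(cfg_path.read_text())
    cfg["auto_map"] = auto_map
    cfg_path.write_text(json.dumps(cfg, indent=2) + "\n")
    return cfg
```

Copying auto_map from upstream keeps the class paths in sync with whatever the upstream modeling files define, at the cost of one extra small download.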

Validation

hf_hub_download confirmed to fetch both files from nvidia/...Super-120B-A12B-BF16 (config 19.8 KB, modeling 82.3 KB).
py_compile clean; ruff adds no new lint (only 2 pre-existing repo-debt errors remain).
Not yet run end-to-end through a full conversion; the logic mirrors the manual patch that unblocked the MQ evals.
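Until a full conversion has been run, a converted dir could be spot-checked with something like the sketch below. This is a suggested check, not one that was run for this PR; the function name check_converted_dir is hypothetical, and it uses the built-in compile() for a syntax check so that no __pycache__ files are written into the checkpoint dir.

```python
import json
from pathlib import Path

REMOTE_CODE_FILES = ("configuration_nemotron_h.py", "modeling_nemotron_h.py")


def check_converted_dir(converted_dir):
    """Return a list of problems that would likely block a vLLM load."""
    converted_dir = Path(converted_dir)
    problems = []

    for name in REMOTE_CODE_FILES:
        path = converted_dir / name
        if not path.is_file():
            problems.append(f"missing {name}")
            continue
        try:
            # Syntax check only; the module is not imported or executed.
            compile(path.read_text(), str(path), "exec")
        except SyntaxError:
            problems.append(f"{name} does not compile")

    cfg_path = converted_dir / "config.json"
    if not cfg_path.is_file():
        problems.append("missing config.json")
    elif not json.loads(cfg_path.read_text()).get("auto_map"):
        problems.append("config.json has no auto_map")

    return problems
```

An empty list only shows that the files are in place; it does not replace loading the checkpoint through vLLM.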

🤖 Generated with Claude Code

… vLLM fixup_hf_output previously REMOVED auto_map + the configuration/modeling_nemotron_h.py files, assuming transformers >= 5.3.0 has native NemotronH support. That holds for the megatron env (transformers 5.10.x) but breaks every vLLM consumer: vLLM 0.18.x is hard-pinned to transformers<5 (4.57.x), which has NO native NemotronH -- so its config parse fails ("Transformers does not recognize this architecture") and the model never loads. This silently broke the MQ capability evals. Flip remove -> ensure-present: copy the upstream model's remote-code modeling files + auto_map (from the HF cache snapshot the export already downloaded) into the converted dir. transformers >= 5.3.0 ignores the remote code (uses native), so it's safe for both. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

Kyle1668 wants to merge 1 commit into main from conv-modeling-fix
