Bug Description
When I try to convert MiniMax M2.7's HF checkpoint to torch_dist format with the following command,
source /nfs-152/disk6/tujie/train_env_slime/slime/scripts/models/minimax-m2.7.sh
echo "Starting conversion from HuggingFace format to torch-dist format..."
echo "Input path: ${HF_MODEL_PATH}"
echo "Output path: ${TORCH_DIST_OUTPUT_PATH}"
# Run the conversion (single node, 8 GPUs)
PYTHONPATH=/root/Megatron-LM/:$(pwd) torchrun \
--nproc-per-node 8 \
--master-addr localhost \
--master-port 12345 \
--nnodes=1 \
--node-rank 0 \
tools/convert_hf_to_torch_dist.py \
${MODEL_ARGS[@]} \
--hf-checkpoint ${HF_MODEL_PATH} \
--save ${TORCH_DIST_OUTPUT_PATH}
I ran into the following error:
[rank3]: Traceback (most recent call last):
[rank3]: File "/nfs-152/disk6/tujie/train_env_slime/slime/tools/convert_hf_to_torch_dist.py", line 146, in <module>
[rank3]: main()
[rank3]: File "/nfs-152/disk6/tujie/train_env_slime/slime/tools/convert_hf_to_torch_dist.py", line 119, in main
[rank3]: bridge = AutoBridge.from_pretrained(hf_model_path, trust_remote_code=True)
[rank3]: ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank3]: File "/usr/local/lib/python3.12/dist-packages/mbridge/core/auto_bridge.py", line 30, in from_pretrained
[rank3]: return cls.from_config(config, **kwargs)
[rank3]: ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank3]: File "/usr/local/lib/python3.12/dist-packages/mbridge/core/auto_bridge.py", line 48, in from_config
[rank3]: return _MODEL_REGISTRY[model_type](hf_config, **kwargs)
[rank3]: ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank3]: File "/usr/local/lib/python3.12/dist-packages/mbridge/core/bridge.py", line 51, in __init__
[rank3]: self.config = self._build_config()
[rank3]: ^^^^^^^^^^^^^^^^^^^^
[rank3]: File "/nfs-152/disk6/tujie/train_env_slime/slime/slime_plugins/mbridge/minimax_m2.py", line 42, in _build_config
[rank3]: return self._build_base_config(
[rank3]: ^^^^^^^^^^^^^^^^^^^^^^^^
[rank3]: File "/usr/local/lib/python3.12/dist-packages/mbridge/core/llm_bridge.py", line 108, in _build_base_config
[rank3]: return self.TransformerConfigClass(**base_config)
[rank3]: ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank3]: TypeError: TransformerConfig.__init__() got an unexpected keyword argument 'rotary_percent'
I pulled the latest Docker image, slimerl/slime:latest, and cloned the main branch of the repo. What is the likely cause of this error, and how can I fix it? I appreciate the help!
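One way to narrow this down is to check which keyword arguments the installed TransformerConfig actually accepts. The sketch below is a minimal illustration under assumptions: `unexpected_kwargs` is a hypothetical helper, and `StandInConfig` is a stand-in dataclass used only so the example runs without Megatron-LM installed. It is not the real TransformerConfig, and the `base_config` dict is an illustrative subset, not what mbridge actually builds.

```python
import inspect
from dataclasses import dataclass


def unexpected_kwargs(cls, kwargs):
    """Return the keys of `kwargs` that `cls.__init__` would reject."""
    params = inspect.signature(cls.__init__).parameters
    # A **kwargs parameter accepts any key, so nothing can be unexpected.
    if any(p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values()):
        return []
    return sorted(k for k in kwargs if k not in params)


# Hypothetical stand-in for the installed TransformerConfig (assumption:
# the real class is a dataclass whose __init__ has no `rotary_percent`).
@dataclass
class StandInConfig:
    num_layers: int = 1
    hidden_size: int = 8


# Illustrative subset of the kwargs a bridge might pass (assumed values).
base_config = {"num_layers": 2, "hidden_size": 16, "rotary_percent": 0.5}
print(unexpected_kwargs(StandInConfig, base_config))  # ['rotary_percent']
```

Inside the container, the same helper could be pointed at the real class (assumed import path: `megatron.core.transformer.transformer_config.TransformerConfig`). If `rotary_percent` shows up in the result, the installed Megatron-LM and the mbridge plugin likely disagree on the config fields, which would point to a version mismatch rather than a problem with the checkpoint.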
Steps to Reproduce
- docker pull slimerl/slime:latest
- Run the conversion command above
Expected Behavior
The conversion succeeds.
Actual Behavior
The conversion fails with the TypeError shown above.
Environment
- slime version:
- Python version:
- PyTorch version:
- CUDA/ROCm version:
- GPU type and count:
- OS:
- SGLang version (if relevant):
- Megatron-LM version (if relevant):
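The version fields above can be collected from inside the container with a short script. This is a sketch under assumptions: the distribution names passed in (`torch`, `megatron-core`, `mbridge`, `sglang`) are guesses at how the packages are registered in the image, and anything not found is reported as "not installed" rather than raising.

```python
import platform
import sys
from importlib import metadata


def package_version(name):
    """Return the installed version of a distribution, or a placeholder."""
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        return "not installed"


def collect_versions(packages):
    """Gather Python, OS, and package versions into one dict."""
    info = {"python": sys.version.split()[0], "os": platform.platform()}
    for name in packages:
        info[name] = package_version(name)
    return info


# Distribution names are assumptions; adjust them to match the image.
for key, value in collect_versions(
    ["torch", "megatron-core", "mbridge", "sglang"]
).items():
    print(f"{key}: {value}")
```

A package installed from a source checkout via PYTHONPATH (as /root/Megatron-LM is in the command above) may not appear in the package metadata, so its git commit would need to be reported separately.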
Logs
Additional Context
No response
Pre-submission Checklist