XMUDeepLIT/CSST-SSA

Dataset

For Common Voice, download from: https://commonvoice.mozilla.org/en/datasets

Some audio files in Common Voice are broken. Run validated_common_voice.py to keep only the validated ones, after setting root_dir, language, and split in that Python file to match your download.
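The filtering idea can be sketched as follows. This is not the contents of validated_common_voice.py; it is a minimal illustration that assumes a clip counts as valid when its header can be parsed by the soundfile package. The names probe_with_soundfile and keep_readable are hypothetical, and a stricter check could decode each file fully with soundfile.read instead.

```python
from pathlib import Path
from typing import Callable, Iterable, List


def probe_with_soundfile(path: Path) -> None:
    """Raise an exception if the audio header cannot be parsed (hypothetical helper)."""
    import soundfile as sf  # imported lazily so this module loads without soundfile

    info = sf.info(str(path))  # reads the header only, not the samples
    if info.frames == 0:
        raise ValueError(f"empty audio file: {path}")


def keep_readable(
    paths: Iterable[Path],
    probe: Callable[[Path], None] = probe_with_soundfile,
) -> List[Path]:
    """Return the paths for which `probe` does not raise, in input order."""
    readable = []
    for path in paths:
        try:
            probe(path)
        except Exception:
            continue  # treat any failure as a broken clip and skip it
        readable.append(path)
    return readable
```

For example, keep_readable(sorted(Path(root_dir).rglob("*.mp3"))) would list the clips under a directory that pass the probe, where root_dir is your own download location.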

For NTUML2021, download from: https://huggingface.co/datasets/ky552/ML2021_ASR_ST

For Fisher, download from: https://catalog.ldc.upenn.edu/LDC2010S01

Installation

A Python 3.10 virtual environment built with conda is recommended:

conda create --name csstllm python=3.10 -y
conda activate csstllm
cd xtuner
pip install -e '.[all]'
pip install -U openai-whisper
pip install evaluate
pip install sacrebleu
pip install jiwer==3.1.0
pip install peft==0.12.0
pip install torch==2.4.0
pip install torchvision==0.19.0
pip install datasets==2.21.0
pip install librosa==0.11.0 soundfile==0.13.0
pip install deepspeed==0.17.4
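Because several packages are pinned, it can help to confirm that the environment actually matches the pins after installation. The following is a small optional sketch, not part of the repository. The function name find_mismatches is hypothetical, and the comparison ignores local version suffixes such as "+cu121" that some torch wheels carry.

```python
from importlib import metadata
from typing import Callable, Dict, List

# Versions copied from the pip commands above.
PINNED: Dict[str, str] = {
    "jiwer": "3.1.0",
    "peft": "0.12.0",
    "torch": "2.4.0",
    "torchvision": "0.19.0",
    "datasets": "2.21.0",
    "librosa": "0.11.0",
    "soundfile": "0.13.0",
    "deepspeed": "0.17.4",
}


def find_mismatches(
    pins: Dict[str, str],
    get_version: Callable[[str], str] = metadata.version,
) -> List[str]:
    """Return one message per package that is missing or has another version."""
    problems = []
    for name, wanted in pins.items():
        try:
            found = get_version(name)
        except metadata.PackageNotFoundError:
            problems.append(f"{name}: not installed (want {wanted})")
            continue
        # Strip a local suffix like "+cu121" before comparing.
        if found.split("+")[0] != wanted:
            problems.append(f"{name}: found {found} (want {wanted})")
    return problems


if __name__ == "__main__":
    for line in find_mismatches(PINNED):
        print(line)
```

Running the script inside the csstllm environment prints nothing when every pinned package matches.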

Training

Taking NTUML2021 as an example:

NPROC_PER_NODE=4 xtuner train workspace/9b_llama3_chat_stage1_ntuml.py --deepspeed deepspeed_zero2
NPROC_PER_NODE=4 xtuner train workspace/9b_llama3_chat_stage2_ntuml.py --deepspeed deepspeed_zero2
NPROC_PER_NODE=4 xtuner train workspace/9b_llama3_chat_stage3_ntuml.py --deepspeed deepspeed_zero2
NPROC_PER_NODE=4 xtuner train workspace/9b_llama3_chat_stage4_ntuml.py --deepspeed deepspeed_zero2

Make sure to replace root_dir in each of the Python config files above.
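The four stage commands can also be launched in order from a short script that stops at the first failure. This is a sketch under assumptions: it presumes the stages are meant to run sequentially, and the dataset_tag parameter is hypothetical, since only the ntuml config names appear above. The helpers stage_commands and run_stages are illustrative names, not part of xtuner.

```python
import os
import subprocess
from typing import Callable, List, Sequence


def stage_commands(
    dataset_tag: str = "ntuml",
    stages: Sequence[int] = (1, 2, 3, 4),
) -> List[List[str]]:
    """Build one `xtuner train` command per stage, mirroring the commands above."""
    return [
        [
            "xtuner",
            "train",
            f"workspace/9b_llama3_chat_stage{stage}_{dataset_tag}.py",
            "--deepspeed",
            "deepspeed_zero2",
        ]
        for stage in stages
    ]


def run_stages(
    commands: List[List[str]],
    nproc_per_node: int = 4,
    runner: Callable = subprocess.run,
) -> None:
    """Run the commands one after another with NPROC_PER_NODE set."""
    env = dict(os.environ, NPROC_PER_NODE=str(nproc_per_node))
    for command in commands:
        # check=True raises CalledProcessError on a non-zero exit,
        # so later stages are not started after a failed one.
        runner(command, env=env, check=True)
```

Calling run_stages(stage_commands()) from the repository root would then launch stages 1 to 4 on four processes.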

Evaluation

NPROC_PER_NODE=4 xtuner test workspace/9b_llama3_chat_stage4_ntuml.py --checkpoint work_dir/9b_llama3_chat_stage4_ntuml/epoch_1.pth/mp_rank_00_model_states.pt
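The checkpoint path in the command above follows the pattern work_dir/<config name>/epoch_<n>.pth/mp_rank_00_model_states.pt. A small helper can derive it from a config path. This sketch assumes the same layout holds for other configs and epochs, which the README does not state, and checkpoint_for is a hypothetical name.

```python
from pathlib import PurePosixPath


def checkpoint_for(config_path: str, epoch: int = 1, work_dir: str = "work_dir") -> str:
    """Derive the checkpoint path for a config, following the pattern above."""
    stem = PurePosixPath(config_path).stem  # config file name without ".py"
    path = PurePosixPath(work_dir) / stem / f"epoch_{epoch}.pth" / "mp_rank_00_model_states.pt"
    return str(path)


if __name__ == "__main__":
    print(checkpoint_for("workspace/9b_llama3_chat_stage4_ntuml.py"))
```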

About

Code for "Towards Fine-Grained Code-Switch Speech Translation with Semantic Space Alignment" (IJCAI 2026 Main Track)
