MM-Speech/DualAxisRM

DualAxisRM

This repository contains training utilities for DualAxisRM.

Overview

The code evaluates spoken dialogue along two axes:

  • Response Relevance: whether the reply is logically consistent and topically appropriate
  • Interactional Fluency: whether turn-taking is natural, without long pauses or extended overlap

The final label is binary:

  • 0: poor interaction
  • 1: strong interaction

Repository Layout

DualAxisRM/
├── examples/
│   └── data/
├── scripts/
├── src/
│   └── dual_axis_rm/
└── tools/

Installation

pip install -r requirements.txt
pip install -e .

Data Format

Each input line in examples/data/source.example.jsonl follows this schema:

{
  "audio": "relative/or/absolute/path/to/dialogue.wav",
  "overall_score": 1,
  "response_think": "The response stays coherent and answers the previous turn directly.",
  "fluency_think": "Turn-taking is natural, with no harmful overlap or long silence."
}
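Before building datasets, it can help to check that every source line matches the schema above. The following is a minimal sketch, not part of the repository: the helper names `validate_record` and `validate_jsonl` are hypothetical, and the checks assume that all four fields are required and that `overall_score` is the binary label (0 or 1) described in the Overview.

```python
import json
from typing import Any, Iterable

# Field names come from the schema above; treating all four as required
# is an assumption of this sketch.
REQUIRED_FIELDS = {
    "audio": str,
    "overall_score": int,
    "response_think": str,
    "fluency_think": str,
}


def validate_record(record: dict[str, Any]) -> list[str]:
    """Return a list of problems; an empty list means the record looks valid."""
    problems = []
    for name, expected in REQUIRED_FIELDS.items():
        if name not in record:
            problems.append(f"missing field: {name}")
        # bool is a subclass of int in Python, so reject it explicitly.
        elif isinstance(record[name], bool) or not isinstance(record[name], expected):
            problems.append(f"{name} should be {expected.__name__}")
    score = record.get("overall_score")
    if isinstance(score, int) and not isinstance(score, bool) and score not in (0, 1):
        problems.append("overall_score must be 0 or 1")
    return problems


def validate_jsonl(lines: Iterable[str]) -> dict[int, list[str]]:
    """Map 1-based line numbers to problems, skipping blank lines."""
    report = {}
    for number, line in enumerate(lines, start=1):
        if not line.strip():
            continue
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            report[number] = ["invalid JSON"]
            continue
        if not isinstance(record, dict):
            report[number] = ["record is not a JSON object"]
            continue
        problems = validate_record(record)
        if problems:
            report[number] = problems
    return report
```

To check a file, open it and pass the file object directly, for example `validate_jsonl(open("examples/data/source.example.jsonl", encoding="utf-8"))`. An empty dictionary means no problems were found.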

Build SFT data:

python tools/build_dataset.py \
  --input examples/data/source.example.jsonl \
  --output data/train_sft.jsonl \
  --mode sft

Build GRPO data:

python tools/build_dataset.py \
  --input examples/data/source.example.jsonl \
  --output data/train_grpo.jsonl \
  --mode grpo

Training

Run supervised fine-tuning (SFT):

MODEL_PATH=Qwen/Qwen2.5-Omni-7B \
DATASET_PATH=data/train_sft.jsonl \
OUTPUT_DIR=outputs/sft \
bash scripts/train_sft.sh

Run GRPO, starting from an SFT checkpoint:

MODEL_PATH=outputs/sft/checkpoint-xxx \
DATASET_PATH=data/train_grpo.jsonl \
OUTPUT_DIR=outputs/grpo \
bash scripts/train_grpo.sh

Inference

MODEL_PATH=outputs/grpo/checkpoint-xxx \
VAL_DATASET=data/val.jsonl \
bash scripts/infer.sh
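The inference command reads data/val.jsonl, but this README does not say how that file is produced or what schema it uses. One possible approach is to hold out part of the source records before building the training sets. The sketch below assumes that approach; `split_records` is a hypothetical helper, not a repository tool, and the default 10% validation fraction is an arbitrary choice.

```python
import random
from typing import Sequence, TypeVar

T = TypeVar("T")


def split_records(
    records: Sequence[T], val_fraction: float = 0.1, seed: int = 0
) -> tuple[list[T], list[T]]:
    """Split records into (train, val) with a seeded, reproducible shuffle."""
    if not 0.0 < val_fraction < 1.0:
        raise ValueError("val_fraction must be strictly between 0 and 1")
    indices = list(range(len(records)))
    # A dedicated Random instance keeps the split independent of global state.
    random.Random(seed).shuffle(indices)
    # Hold out at least one record whenever there is any data at all.
    n_val = max(1, round(len(records) * val_fraction)) if records else 0
    val_indices = set(indices[:n_val])
    # Preserve the original ordering inside each split.
    train = [r for i, r in enumerate(records) if i not in val_indices]
    val = [r for i, r in enumerate(records) if i in val_indices]
    return train, val
```

Splitting at the source level, before running tools/build_dataset.py, keeps each dialogue out of both the SFT and GRPO training files when it is reserved for validation.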

About

[ACL 2026] Dual-Axis Generative Reward Model Toward Semantic and Turn-taking Robustness in Interactive Spoken Dialogue Models
