FocalPO: Enhancing Preference Optimizing by Focusing on Correct Preference Rankings

Offical code for the paper FocalPO: Enhancing Preference Optimizing by Focusing on Correct Preference Rankings (ACL 2025 long paper).

Core contribution

Chen et al., 2024 empirically finds that DPO training rarely improves these misranked preference, despite its gradient emphasizing on these cases. We add a simple factor to DPO loss to make DPO focus on "more correct" (see gradient curve) samples. With the introduced hyperparameter fixed (we do not want to over-rely on hyperparameter tuning), it consistently outperforms DPO on Arena-hard and Alpaca Eval.

Released Models

Mistral

We release the following model that are built on top of Mistral-Base SFT (7B) model by training FocalPO on UltraFeedback dataset.

models	Alpaca Eval 2.0 LC	AH WR
tongliuphysics/Mistral-7B-Base-SFT-FocalPO	23.9	17.1

Llama

We release the following model that are built on top of Llama-3-Instruct (8B) model by training FocalPO on the on-policy Llama3-ultrafeedbackarmorm dataset.

models	Alpaca Eval 2.0 LC	AH WR
tongliuphysics/Llama-3-8B-Instruct-FocalPO	54.7	34.6

BibTeX

@article{liu2025focalpo,
  title={FocalPO: Enhancing Preference Optimizing by Focusing on Correct Preference Rankings},
  author={Liu, Tong and Yu, Xiao and Zhou, Wenxuan and Gu, Jindong and Tresp, Volker},
  journal={arXiv preprint arXiv:2501.06645},
  year={2025}
}

Name		Name	Last commit message	Last commit date
Latest commit History 6 Commits
README.md		README.md

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

FocalPO: Enhancing Preference Optimizing by Focusing on Correct Preference Rankings

Core contribution

Released Models

Mistral

Llama

BibTeX

About

Uh oh!

Releases

Packages

Uh oh!

Contributors

Uh oh!

Folders and files

Latest commit

History

Repository files navigation

FocalPO: Enhancing Preference Optimizing by Focusing on Correct Preference Rankings

Core contribution

Released Models

Mistral

Llama

BibTeX

About

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Packages