TRIAGE: Ethical Benchmarking of AI Models Through Mass Casualty Simulations

We present the TRIAGE Benchmark: a novel machine ethics (ME) benchmark that incorporates triage training scenarios used to prepare medical professionals for ethical decision-making during mass casualty events. These scenarios are real-world ethical dilemmas whose solutions are derived from socially agreed-upon principles, offering a more realistic alternative to annotation-based ME benchmarks. By incorporating several prompting styles, TRIAGE allows us to test model performance across a range of contexts. Contrary to previous findings, our results indicate that ethics prompting does not enhance performance on this benchmark. Moreover, we observe that jailbreaking prompts can significantly degrade model performance and alter the models’ relative rankings. While we find that open-source models tend to make more morally grave errors, our comparison of models’ best- and worst-case performances suggests that general capability is not always a reliable predictor of good ethical decision-making. We argue that, given the safety implications of machine ethics benchmarks, it is essential to develop benchmarks that encompass a wide range of contexts.

Dataset available at https://huggingface.co/datasets/NLie2/TRIAGE
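
The best- and worst-case comparison mentioned in the abstract can be illustrated with a minimal sketch. It is not the repository's analysis code: the model names, the prompt-style labels (`neutral`, `ethics`, `jailbreak`), and every outcome below are invented toy values, and the helper functions are hypothetical. The sketch only shows how ranking models by their best prompt style can differ from ranking them by their worst.

```python
from collections import defaultdict

# Hypothetical toy records: (model, prompt_style, is_correct).
# Model names, style labels, and outcomes are invented for illustration;
# they are NOT results from the TRIAGE benchmark.
RECORDS = [
    ("model_a", "neutral", True), ("model_a", "neutral", True),
    ("model_a", "ethics", True), ("model_a", "ethics", False),
    ("model_a", "jailbreak", False), ("model_a", "jailbreak", False),
    ("model_b", "neutral", True), ("model_b", "neutral", False),
    ("model_b", "ethics", True), ("model_b", "ethics", False),
    ("model_b", "jailbreak", True), ("model_b", "jailbreak", False),
]


def accuracy_by_style(records):
    """Return {model: {prompt_style: accuracy}}."""
    # Each cell holds [number correct, number answered].
    counts = defaultdict(lambda: defaultdict(lambda: [0, 0]))
    for model, style, correct in records:
        cell = counts[model][style]
        cell[0] += int(correct)
        cell[1] += 1
    return {
        model: {style: c / t for style, (c, t) in styles.items()}
        for model, styles in counts.items()
    }


def best_and_worst(acc):
    """Return {model: (best_accuracy, worst_accuracy)} across prompt styles."""
    return {m: (max(s.values()), min(s.values())) for m, s in acc.items()}


def ranking(scores, index):
    """Rank models by best (index=0) or worst (index=1) accuracy, highest first."""
    return sorted(scores, key=lambda m: scores[m][index], reverse=True)


acc = accuracy_by_style(RECORDS)
scores = best_and_worst(acc)
print("ranked by best case: ", ranking(scores, 0))
print("ranked by worst case:", ranking(scores, 1))
```

In this toy data, `model_a` leads on best-case accuracy but falls behind `model_b` on worst-case accuracy, which is the kind of ranking change the abstract describes under jailbreaking prompts.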

Repository contents:
analysis
datasets/melted
experiments
gen_dataset
.gitignore
README.md
