taboo-rl

GRPO fine-tuning of a small open-weight LLM (Llama 3.2 3B) to play Taboo: the giver writes a clue for a target word without using any of the forbidden words, and a frozen guesser tries to recover the word from that clue.

The setup:

Giver: Llama 3.2 3B Instruct (the model being trained)
Guesser: Llama 3.1 8B Instruct (frozen)
Verifier: rule-based string matching with morphological stemming
Training: GRPO on Modal, 2× A100-40GB
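
To make the verifier line concrete, here is a minimal sketch of rule-based matching with stemming. It is not the code in verifier.py: the names `crude_stem` and `find_violations` and the suffix list are hypothetical, and the toy suffix stripper stands in for whatever morphological stemmer the project actually uses. It also assumes single-word forbidden entries.

```python
import re

# Hypothetical suffix list for illustration only; a real stemmer
# (for example a Porter-style stemmer) covers far more cases.
_SUFFIXES = ("ing", "ed", "es", "s", "er", "ly", "e")


def crude_stem(word: str) -> str:
    """Lowercase a word and strip at most one common English suffix."""
    word = word.lower()
    for suffix in _SUFFIXES:
        # Keep at least three characters so short words such as "use" survive.
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def find_violations(clue: str, target: str, forbidden: list[str]) -> list[str]:
    """Return the banned words (target included) whose stem occurs in the clue.

    Assumes each banned entry is a single word; multi-word entries would
    need to be split before stemming.
    """
    clue_stems = {crude_stem(token) for token in re.findall(r"[A-Za-z]+", clue)}
    banned = [target, *forbidden]
    return [word for word in banned if crude_stem(word) in clue_stems]


if __name__ == "__main__":
    # "baking" and "cakes" share stems with "bake" and "cake".
    print(find_violations("You use it when baking cakes", "oven", ["bake", "cake", "kitchen"]))
```

Comparing stems rather than raw strings is what lets a check like this catch inflected forms such as "baking" for the forbidden word "bake". The cost is occasional false positives when unrelated words collapse to the same stem.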

Results

With the help of Claude, I am compiling a living writeup of the project. I'll link to the chapters below as I add them:

results/ch1/ch1.md introduces the project and describes the first round of training runs.
results/ch2/ch2.md details ablations aimed at improving win rate while keeping violation rate down, culminating in a checkpoint that outperformed much larger models.

