Training Agents

Public Codex context for agentic post-training work with TRL.

This repository contains reusable instructions, sub-agent definitions, skills, and lightweight guides for planning, implementing, reviewing, and monitoring agent training workflows.

It is not a training codebase. Keep checkpoints, datasets, logs, and experiment outputs outside the tracked repo, usually under git-ignored workspaces/ directories or in separate project repositories.
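One way to follow this rule from a script is to route every run's outputs through a single helper. The sketch below is illustrative only: `run_output_dir` and the run name are hypothetical and not part of this repository, and it assumes workspaces/ sits at the repo root and is listed in .gitignore.

```python
from pathlib import Path


def run_output_dir(repo_root: Path, run_name: str) -> Path:
    """Return (and create) an output directory under <repo_root>/workspaces/.

    Hypothetical helper: assumes workspaces/ is git-ignored, so checkpoints,
    logs, and datasets written here never enter the tracked tree.
    """
    workspace = (repo_root / "workspaces").resolve()
    out = (workspace / run_name).resolve()
    # Reject names such as "../docs" that would escape workspaces/.
    if workspace not in out.parents:
        raise ValueError(f"{run_name!r} does not resolve inside {workspace}")
    out.mkdir(parents=True, exist_ok=True)
    return out
```

A training script could then call `run_output_dir(Path.cwd(), "sft-001")` and pass the result as its output directory.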

Examples

examples/gemma4-pi-mono-sft/: TRL SFT example that fine-tunes google/gemma-4-E2B-it on badlogicgames/pi-mono. It covers Hugging Face Jobs, LoRA, hosted Trackio logging, verified Job IDs, Inspect AI HumanEval/MBPP coding evals, and private adapter artifact repos.
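For orientation, a LoRA SFT run of this kind is typically described by two groups of settings: trainer arguments and adapter arguments. The sketch below is not taken from the example directory. `build_sft_settings` is a hypothetical helper, every hyperparameter value is a placeholder, and treating badlogicgames/pi-mono as a Hub dataset ID is an assumption. The keys are meant to match keyword arguments of `trl.SFTConfig` and `peft.LoraConfig`; check them against the installed versions before use.

```python
def build_sft_settings(output_dir: str = "workspaces/gemma4-pi-mono-sft") -> dict:
    """Collect placeholder settings for a LoRA SFT run (illustrative values only)."""
    sft_kwargs = {
        "output_dir": output_dir,           # kept under the ignored workspaces/ tree
        "per_device_train_batch_size": 1,   # placeholder
        "gradient_accumulation_steps": 8,   # placeholder
        "learning_rate": 2e-4,              # placeholder
        "num_train_epochs": 1,              # placeholder
        "report_to": "trackio",             # assumes a transformers release with Trackio support
        "push_to_hub": True,
        "hub_private_repo": True,           # private adapter artifact repo
    }
    lora_kwargs = {
        "r": 16,                            # placeholder rank
        "lora_alpha": 32,                   # placeholder
        "lora_dropout": 0.05,               # placeholder
        "target_modules": "all-linear",
        "task_type": "CAUSAL_LM",
    }
    return {
        "model": "google/gemma-4-E2B-it",
        "dataset": "badlogicgames/pi-mono",  # assumed to be a Hub dataset ID
        "sft": sft_kwargs,
        "lora": lora_kwargs,
    }


settings = build_sft_settings()
# With trl and peft installed, the groups would be unpacked roughly as:
#   args = SFTConfig(**settings["sft"])
#   peft_config = LoraConfig(**settings["lora"])
```

Keeping the settings as plain dictionaries lets them be reviewed or logged before any heavy dependency is imported. The example directory remains the reference for what was actually run.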

Guides

program.md: operating model for Training Agents.
docs/program.md: staged challenge ladder from SFT to environment GRPO and self-distillation.
docs/looping-rl.md: blog post on loop-shaped reinforcement learning for agent training systems.
docs/terminal-bench-loop.md: loop-shaped automation contract for training an approximately 2B open model toward Terminal-Bench performance above 40.

