THUDM · aoshen02 · Jun 26, 2026 · Jun 26, 2026
diff --git a/examples/README.md b/examples/README.md
@@ -4,6 +4,7 @@ These examples provide concrete examples to leverage slime in your own RL workfl
 
 ## Directory Structure
 
+- **[coding_agent_rl](./coding_agent_rl)**: End-to-end SWE coding-agent RL — a real coding agent (claude-code / codex) edits code in a per-sample sandbox, and the resulting `git diff` is graded against the dataset's test harness.
 - **[eval_multi_task](./eval_multi_task)**: Example for supporting evaluation multiple tasks with different configs.
 - **[fully_async](./fully_async)**: Demonstrates fully asynchronous rollout generation for higher efficiency.
 - **[geo3k_vlm](./geo3k_vlm)**: Training VLMs on a single-turn reasoning task using GRPO on the GEO3K dataset.

diff --git a/examples/fully_async/README.md b/examples/fully_async/README.md
@@ -57,7 +57,7 @@ work unchanged under fully-async:
 --custom-rm-path                your.module.reward      # (args, sample | list[Sample]) -> float | list[float]
 ```
 
-See `examples/swe_codex/` for a non-trivial example that plugs in a
+See `examples/coding_agent_rl/` for a non-trivial example that plugs in a
 multi-turn agent (Claude Code in a Docker-Proxy sandbox) this way.
 
 ## Worker Internals (Very Short)
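
Note on the `--custom-rm-path` context line in the hunk above: its comment gives the reward signature as `(args, sample | list[Sample]) -> float | list[float]`. A minimal sketch of a module matching that shape could look like the following. The `Sample` dataclass is a hypothetical stand-in rather than slime's real class, the `response` and `label` fields and the exact-match scoring are assumptions made for illustration, and the function is written synchronously only for simplicity; the diff does not show whether slime expects a coroutine.

```python
from dataclasses import dataclass
from typing import Any, Union


@dataclass
class Sample:
    """Hypothetical stand-in for slime's Sample; the real class differs."""
    response: str
    label: str


def _score(sample: Sample) -> float:
    # Toy exact-match reward: 1.0 when the response equals the label.
    return 1.0 if sample.response.strip() == sample.label.strip() else 0.0


def reward(args: Any, sample: Union[Sample, list]) -> Union[float, list]:
    # Accept either one sample or a batch, mirroring the signature comment.
    if isinstance(sample, list):
        return [_score(s) for s in sample]
    return _score(sample)
```

Under these assumptions, the flag would point at this function's dotted path, written as `your.module.reward` in the README.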