Add a Dataset Formats reference guide

## Problem
Dataset shape is easy to misunderstand. Raw GSM8K (`question`/`answer`) needs different formatting per algorithm: RL math uses `prompt`/`solutions`; SFT expects `messages`, `prompt`+`response`, or `text`. "GSM8K works with AReno" does not mean it works for every algorithm.

## Scope
Add a `Dataset Formats` page answering: "What columns must my dataset have for each AReno algorithm?"
- Mental model: raw datasets vs AReno training schemas vs loader functions.
- SFT schemas (`prompt`+`response`, `messages`, `text`) and their loss behavior.
- RL math schema (`prompt`+`solutions` as reward metadata, not an SFT target).
- DPO preference schema (`prompt`/`chosen`/`rejected`).
- Dataset loader function contract and when to use one.
- GSM8K examples for both RL and SFT; state plainly raw GSM8K is not SFT-ready.
- Working CLI examples per shape.

## Acceptance
- Short, exact, example-heavy; no marketing language.
- Cites existing `examples/math/` files.
- No code or CLI changes.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Add a Dataset Formats reference guide #70

Problem

Scope

Acceptance

Metadata

Assignees

Labels

Type

Fields

Projects

Milestone

Relationships

Development

Uh oh!

Add a Dataset Formats reference guide #70

Description

Problem

Scope

Acceptance

Metadata

Metadata

Assignees

Labels

Type

Fields

Projects

Milestone

Relationships

Development

Issue actions