Part of #1436 — Evaluate section restructure.
There is currently no dedicated page explaining how to configure models specifically for evaluation runs — model choice, sampling settings, repeat counts, and how these affect score reliability.
Tasks
- Create `fern/versions/latest/pages/evaluation/models.mdx`
- Cover:
  - Choosing a model for evaluation (policy model, reference model)
  - Relevant config fields (`model_name`, `sampling_params`, repeat count)
  - How sampling settings interact with pass@1 vs pass@k
  - Linking to Configure Models for server-level setup
- Add navigation card in evaluation index
- `fern check` passes
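
For the pass@1 vs pass@k item, the page could include a small worked example. Below is a minimal sketch of the commonly used unbiased pass@k estimator, assuming the repeat count gives `n` samples per task and `c` of them pass. The function name `pass_at_k` is illustrative and not taken from this project's code.

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Estimate pass@k from n samples of one task, c of which passed.

    Hypothetical helper for illustration; not part of the project's API.
    """
    if not 0 <= c <= n or not 1 <= k <= n:
        raise ValueError("require 0 <= c <= n and 1 <= k <= n")
    # comb(n - c, k) / comb(n, k) is the probability that a random
    # size-k subset of the n samples contains no passing sample.
    return 1.0 - comb(n - c, k) / comb(n, k)


if __name__ == "__main__":
    # 10 repeats of one task, 3 passing.
    print(pass_at_k(10, 3, 1))
    print(pass_at_k(10, 3, 5))
```

For `k = 1` this reduces to `c / n`, so pass@1 is just the mean pass rate over repeats. With deterministic decoding the repeats are expected to be near-identical, so pass@k for `k > 1` is mainly informative when sampling settings introduce variation between samples.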