Part of #1436 — Evaluate section restructure.
There is no dedicated page explaining how to select and configure agent harnesses for evaluation. Harness choice affects what the model can do during a run (single-step vs multi-step, tools available, system prompt), and comparing harnesses is a first-class use case.
Tasks
- Create `fern/versions/latest/pages/evaluation/harness.mdx`
- Cover:
  - What a harness is in the evaluation context (agent server config + skills)
  - Built-in harnesses and when to use each
  - Key config fields for harness setup in `gym eval run`
  - Harness comparison as an evaluation pattern (vary `agent.config` between runs, hold everything else fixed)
  - Linking to Configure Agents for server-level setup
- Add navigation card in evaluation index
- `fern check` passes
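
For the page author, the harness comparison pattern listed above (vary `agent.config`, hold everything else fixed) could be illustrated roughly as follows. This is a minimal sketch, assuming a run is described by a plain dictionary. Every field name and value other than `agent.config` is hypothetical, and it does not reflect the actual `gym eval run` config schema.

```python
import copy

# Hypothetical base run description. Only the "agent" -> "config" key mirrors
# the `agent.config` field named in this issue; all other keys are placeholders.
BASE_RUN = {
    "model": "example-model",
    "benchmark": "example-benchmark",
    "seed": 0,
    "agent": {"config": None},
}


def make_comparison_runs(base, harness_configs):
    """Return one run per harness config, identical except for agent.config."""
    runs = []
    for harness_config in harness_configs:
        run = copy.deepcopy(base)  # deep copy so the base and other runs stay untouched
        run["agent"]["config"] = harness_config
        runs.append(run)
    return runs


def differing_keys(a, b, prefix=""):
    """List the dotted key paths whose values differ between two run dicts."""
    diffs = []
    for key in sorted(set(a) | set(b)):
        path = prefix + key
        va, vb = a.get(key), b.get(key)
        if isinstance(va, dict) and isinstance(vb, dict):
            diffs.extend(differing_keys(va, vb, path + "."))
        elif va != vb:
            diffs.append(path)
    return diffs


if __name__ == "__main__":
    # Placeholder harness config names, not real files.
    runs = make_comparison_runs(BASE_RUN, ["harness_a.yaml", "harness_b.yaml"])
    print(differing_keys(runs[0], runs[1]))
```

Checking that the only differing key path is `agent.config` before launching runs is one way to confirm that any difference in scores is attributable to the harness rather than to another setting.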