A live benchmark comparing three model configurations on the same research query — streaming in parallel, with full cost, latency, and quality metrics.
Three agents running simultaneously. The center column shows an advisor call badge mid-stream — Sonnet escalating to Opus for a complex decision before continuing its research loop.
| Configuration | Cost | Quality | Latency | Notes |
|---|---|---|---|---|
| Sonnet solo | $0.17 | 7.8/10 | 88.8s | Baseline — full agentic loop, no advisor |
| Sonnet + Opus advisor | $0.83 | 8.5/10 | 187.5s | Sweet spot — Opus consulted 2× on hard decisions |
| Opus solo | $1.19 | 8.5/10 | 98.8s | Gold standard — full frontier cost |
Sonnet + Advisor matched Opus quality (8.5/10) at about 70% of the cost ($0.83 vs. $1.19), though it took roughly twice as long (187.5s vs. 98.8s).
The Advisor Strategy is a multi-model orchestration pattern where a capable executor model (Sonnet) drives the full agentic loop, but escalates to a more powerful model (Opus) via a dedicated tool call — only for decisions that actually warrant it.
{
  "type": "advisor_20260301",
  "name": "advisor",
  "model": "claude-opus-4-6",
  "max_uses": 5
}

Required beta header: `anthropic-beta: advisor-tool-2026-03-01`
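To make the wiring concrete, here is a minimal sketch of how the tool definition and beta header might be combined into a raw `fetch` request. It only builds the request and does not send it. The tool definition and beta header come from this README; the endpoint and the `x-api-key` / `anthropic-version` headers are the standard Messages API ones. `buildAdvisorRequest`, `EXECUTOR_MODEL`, and the `max_tokens` value are hypothetical names and values chosen for illustration, not taken from the project.

```typescript
// Placeholder executor model ID (assumption): substitute the Sonnet model you use.
const EXECUTOR_MODEL = "your-sonnet-model-id";

// Shape of the advisor tool definition shown in this README.
interface AdvisorTool {
  type: "advisor_20260301";
  name: "advisor";
  model: string;
  max_uses: number;
}

// Hypothetical helper: builds (but does not send) a streaming Messages API
// request with the advisor tool enabled.
function buildAdvisorRequest(apiKey: string, query: string, maxUses = 5) {
  const advisor: AdvisorTool = {
    type: "advisor_20260301",
    name: "advisor",
    model: "claude-opus-4-6",
    max_uses: maxUses,
  };
  return {
    url: "https://api.anthropic.com/v1/messages",
    init: {
      method: "POST",
      headers: {
        "content-type": "application/json",
        "x-api-key": apiKey,
        "anthropic-version": "2023-06-01",
        // Beta header required for the advisor tool.
        "anthropic-beta": "advisor-tool-2026-03-01",
      },
      body: JSON.stringify({
        model: EXECUTOR_MODEL,
        max_tokens: 4096, // illustrative value
        stream: true,
        tools: [advisor],
        messages: [{ role: "user", content: query }],
      }),
    },
  };
}
```

A caller would then pass the result to `fetch(url, init)` and read the streamed response body.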
The executor stays in control. The advisor provides targeted judgment exactly where it changes the result. Advisor calls surface in the stream as `server_tool_use`. Their token cost lands in `message_delta.usage.iterations[]` on entries where `type === "advisor_message"`, split into input and output tokens so they can be billed accurately at Opus rates ($15 input / $75 output per million tokens).
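The billing split described above could be computed along these lines. This is a sketch under assumptions: the `input_tokens` and `output_tokens` field names on each iteration entry are assumed (the README only says the usage is "split into input/output"), and `advisorCostUsd` is a hypothetical helper, not the project's `metrics.ts`.

```typescript
// Assumed shape of one entry in message_delta.usage.iterations[].
// Only `type` is confirmed by the README; the token field names are assumptions.
interface UsageIteration {
  type: string;
  input_tokens: number;
  output_tokens: number;
}

// Opus rates quoted in this README, in USD per million tokens.
const OPUS_INPUT_USD_PER_MTOK = 15;
const OPUS_OUTPUT_USD_PER_MTOK = 75;

// Sums tokens over advisor iterations only and prices them at Opus rates.
function advisorCostUsd(iterations: UsageIteration[]): number {
  let inputTokens = 0;
  let outputTokens = 0;
  for (const iteration of iterations) {
    if (iteration.type === "advisor_message") {
      inputTokens += iteration.input_tokens;
      outputTokens += iteration.output_tokens;
    }
  }
  return (
    (inputTokens * OPUS_INPUT_USD_PER_MTOK +
      outputTokens * OPUS_OUTPUT_USD_PER_MTOK) /
    1_000_000
  );
}
```

Executor iterations would be priced separately at the executor model's rates and added to this figure to get the total for a run.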
The bottom panel shows per-dimension quality scores (source depth, reasoning, completeness, accuracy) judged by a separate Opus call after all three runs complete. The summary bar compares cost side-by-side.
- Next.js 15 — App Router, Server Components, route handlers
- TypeScript — end to end
- Tailwind CSS v4 — custom design tokens via `@theme`
- Anthropic SDK — baseline and Opus agents via `@anthropic-ai/sdk`
- Raw fetch — advisor agent (SDK doesn't yet expose the advisor tool natively)
- Brave Search API — web search tool execution
- SSE — three parallel streaming agent runs to the client
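Because each agent run reaches the client as an SSE stream, the client has to split incoming text into events as chunks arrive. The following is a minimal sketch of that step, not the project's actual parser: `parseSseBuffer` is a hypothetical helper, and it assumes `\n` line endings and ignores the `event:`, `id:`, and `retry:` fields.

```typescript
// Splits buffered SSE text into complete events, returning the payload of each
// event's `data:` lines plus any trailing partial event to keep for later.
function parseSseBuffer(buffer: string): { events: string[]; rest: string } {
  // Events are separated by a blank line; the last piece may be incomplete.
  const frames = buffer.split("\n\n");
  const rest = frames.pop() ?? "";
  const events: string[] = [];
  for (const frame of frames) {
    const data = frame
      .split("\n")
      .filter((line) => line.startsWith("data:"))
      // Drop the field name and at most one leading space, per the SSE format.
      .map((line) => line.slice(5).replace(/^ /, ""))
      .join("\n");
    if (data !== "") events.push(data);
  }
  return { events, rest };
}
```

A consumer would append each decoded chunk to its buffer, call `parseSseBuffer`, handle the returned `events`, and carry `rest` into the next chunk. Running one such consumer per agent under `Promise.all` is one way to keep the three columns updating independently.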
app/
page.tsx # Main UI — query input, three-column grid, quality chart
api/
research/
baseline/route.ts # Sonnet solo agent — SSE stream
advisor/route.ts # Sonnet + Opus advisor agent — SSE stream
opus/route.ts # Opus solo agent — SSE stream
judge/route.ts # Quality scoring — Opus as judge
components/
ComparisonGrid.tsx # Three-column layout
AgentColumn.tsx # Per-agent streaming output + metrics
MetricsCard.tsx # Cost / tokens / latency display
QualityChart.tsx # Dimension breakdown bar chart
lib/
agents/
baseline-agent.ts # Sonnet agentic loop (SDK streaming)
advisor-agent.ts # Sonnet + advisor loop (raw fetch, beta header)
opus-agent.ts # Opus agentic loop (SDK streaming)
shared.ts # System prompts, tool definitions, web search/fetch
metrics.ts # Pricing constants, cost calculation, formatters
types.ts # Shared TypeScript types
git clone https://github.com/popand/advisor-strategy
cd advisor-strategy
npm install

Create `.env.local`:

ANTHROPIC_API_KEY=sk-ant-...
BRAVE_API_KEY=... # optional — falls back to placeholder results

npm run dev

Open http://localhost:3000, enter a research query, and click Run Comparison.
- The advisor feature requires beta access: `anthropic-beta: advisor-tool-2026-03-01`
- All three agents run in parallel, so the comparison finishes when the slowest agent does. Expect roughly 60–120 seconds depending on query complexity, though advisor runs can take longer (187.5s in the table above)
- Web search falls back to a placeholder if `BRAVE_API_KEY` is not set; agents still run using training knowledge
- Quality scores are judged by a separate Opus call after all three runs complete; expect some variance across runs
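The search fallback noted above might look roughly like this. It is a sketch rather than the project's `shared.ts`: `webSearch`, `placeholderResults`, the `SearchResult` shape, and the placeholder text are all hypothetical. The endpoint and `X-Subscription-Token` header follow Brave's public Web Search API, and the response fields read here (`web.results[].title`, `.url`, `.description`) are assumed from that API.

```typescript
interface SearchResult {
  title: string;
  url: string;
  snippet: string;
}

// Hypothetical placeholder returned when no Brave API key is configured.
function placeholderResults(query: string): SearchResult[] {
  return [
    {
      title: `Placeholder result for "${query}"`,
      url: "https://example.com/placeholder",
      snippet: "BRAVE_API_KEY is not set, so no live search was performed.",
    },
  ];
}

// Uses Brave Search when a key is provided; otherwise returns the placeholder.
async function webSearch(query: string, apiKey?: string): Promise<SearchResult[]> {
  if (!apiKey) return placeholderResults(query);

  const url = `https://api.search.brave.com/res/v1/web/search?q=${encodeURIComponent(query)}`;
  const res = await fetch(url, {
    headers: { Accept: "application/json", "X-Subscription-Token": apiKey },
  });
  if (!res.ok) throw new Error(`Brave Search request failed: ${res.status}`);

  // Assumed response shape; only the fields used below are declared.
  const json = (await res.json()) as {
    web?: { results?: Array<{ title?: string; url?: string; description?: string }> };
  };
  return (json.web?.results ?? []).map((r) => ({
    title: r.title ?? "",
    url: r.url ?? "",
    snippet: r.description ?? "",
  }));
}
```

Returning a placeholder instead of throwing keeps the agent loop running without a key, which matches the behavior described in the notes.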
Built by Andrei Pop · Principal Engineer, Alethia
Alethia Prism is the intelligence layer that identifies what is forming across systems, context, and time — so organizations can act before outcomes harden.

