Scale Studio run-list/detail to thousands of runs (entire-style indexed reads) #1259

Tracking issue for scaling Studio's eval-run display. Inspired by entireio/cli's "single ref + tree-reads" architecture.

## Today's bottleneck

`listResultFilesFromRunsDir` (`apps/cli/src/commands/inspect/utils.ts:572`) walks the runs directory and calls `readdir` + `statSync` + `loadResultFile` for every run. Studio polls `/api/runs` every 5s, and each poll repeats the full walk, so the cost is O(N runs × per-manifest read) per refresh. The list stalls at hundreds of runs and falls over at thousands.

## Sub-tasks

- [ ] **P1: Append-only run index in results repo** (~2d)
  - Write `index/runs.jsonl` on every push in `directPushResults` (`packages/core/src/evaluation/results-repo.ts:407`).
  - Each row: `{run_id, timestamp, experiment, target, test_count, passed, pass_rate, avg_score, tags, sha}`.
  - Studio list view reads ONE file instead of N manifest reads.
  - Ship an `agentv results reindex` CLI command to backfill existing repos.
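
A minimal sketch of the index row and its JSONL round trip, to make the P1 shape concrete. The field names come from the row list above; the value types, and the helper names `formatIndexRow` and `parseRunIndex`, are assumptions for illustration rather than existing agentv code.

```typescript
// Row shape from the P1 field list. Value types are assumed, not confirmed.
interface RunIndexRow {
  run_id: string;
  timestamp: string; // assumed to be an ISO 8601 string
  experiment: string;
  target: string;
  test_count: number;
  passed: number;
  pass_rate: number;
  avg_score: number;
  tags: string[];
  sha: string;
}

// Serialize one row as a single JSONL line, ready to append to index/runs.jsonl.
function formatIndexRow(row: RunIndexRow): string {
  return JSON.stringify(row) + "\n";
}

// Parse the whole index file. Blank lines and lines that are not valid JSON
// (for example a truncated final line) are skipped instead of failing the list view.
function parseRunIndex(text: string): RunIndexRow[] {
  const rows: RunIndexRow[] = [];
  for (const line of text.split("\n")) {
    if (line.trim() === "") continue;
    try {
      rows.push(JSON.parse(line) as RunIndexRow);
    } catch {
      // Unparseable line: ignore it and keep the rows that did parse.
    }
  }
  return rows;
}
```

With this shape the list view does one file read plus one pass over the lines, independent of how many manifests exist on disk.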

- [ ] **P2: Cache remote runs server-side, invalidate on manual sync** (~0.5d)
  - Keep `/api/runs` merged (clean URL — `source` stays per-row metadata for the badge, not a URL concern).
  - Server: cache the remote portion of `listMergedResultFiles` in memory; invalidate only on `POST /api/remote/sync`.
  - Local portion stays computed fresh per request (in-flight runs need freshness).
  - Stops per-poll readdir/`git ls-tree` of the remote cache (currently happens 12x/min for no reason).
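
The caching rule in P2 could look roughly like the sketch below: the remote list is loaded once and reused until a sync invalidates it, while the local list is recomputed on every request. `RemoteRunsCache` and `listMerged` are hypothetical names, and the loaders are synchronous only to keep the example short; the real `listMergedResultFiles` may well be async.

```typescript
// Holds the remote portion of the run list until explicitly invalidated.
class RemoteRunsCache<T> {
  private cached: T[] | null = null;
  private readonly loadRemote: () => T[];

  constructor(loadRemote: () => T[]) {
    this.loadRemote = loadRemote;
  }

  // Load on first use, then serve from memory.
  get(): T[] {
    if (this.cached === null) {
      this.cached = this.loadRemote();
    }
    return this.cached;
  }

  // Intended to be called from the POST /api/remote/sync handler.
  invalidate(): void {
    this.cached = null;
  }
}

// Local runs are always fresh; remote runs come from the cache.
function listMerged<T>(cache: RemoteRunsCache<T>, loadLocal: () => T[]): T[] {
  return [...loadLocal(), ...cache.get()];
}
```

Under this sketch, polling `/api/runs` any number of times between syncs touches the remote cache directory once.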

- [ ] **P3: Read remote runs via `git ls-tree` + `git cat-file`, not working-tree readdir** (~2d)
  - Drop `git checkout` + `git pull --ff-only` from `updateCacheRepo` (`results-repo.ts:167`) — just `git fetch origin --prune`.
  - New `listResultFilesFromGitTree(repoDir, treePath)` sibling to `listResultFilesFromRunsDir`.
  - File-content endpoints in `apps/cli/src/commands/results/serve.ts` swap `readFileSync` for `git cat-file -p` when source is remote.
  - Pairs naturally with the existing `--filter=blob:none` clone (`results-repo.ts:191`) — blobs are fetched only when a detail view opens.
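
For the proposed `listResultFilesFromGitTree`, the core step is parsing `git ls-tree` output. The sketch below assumes the command is run as `git ls-tree -r -z <ref> -- <treePath>`, which emits NUL-terminated records of the form `<mode> <type> <sha>\t<path>`. `TreeEntry` and `parseLsTree` are hypothetical names, and which ref to read after `git fetch` (for example a remote-tracking branch) is left open here.

```typescript
interface TreeEntry {
  mode: string;
  type: string; // "blob", "tree", or "commit"
  sha: string;
  path: string;
}

// Parse the output of `git ls-tree -r -z`. Each record is
// "<mode> <type> <sha>\t<path>" terminated by a NUL byte.
function parseLsTree(output: string): TreeEntry[] {
  const entries: TreeEntry[] = [];
  for (const record of output.split("\0")) {
    if (record === "") continue;
    // The first tab separates metadata from the path; with -z the path is
    // emitted verbatim, so it may contain spaces.
    const tab = record.indexOf("\t");
    if (tab === -1) continue;
    const [mode, type, sha] = record.slice(0, tab).split(" ");
    entries.push({ mode, type, sha, path: record.slice(tab + 1) });
  }
  return entries;
}
```

Each entry's `sha` can then be passed to `git cat-file -p <sha>` when a detail view needs the file content, so listing never touches the working tree.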

- [ ] **P4: Pagination/cursor on `/api/runs`** (~1d)
  - Plumb existing `limit?` param through `/api/runs?limit=50&cursor=<run_id>`.
  - Studio switches to `useInfiniteQuery` (`apps/studio/src/lib/api.ts:62`).
  - Sentinel-row infinite scroll in `RunList.tsx`.
  - Trivial after P1 (cursor = byte offset into index file or last `run_id` seen).
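
One way the "last `run_id` seen" cursor could work over the parsed index, as a sketch. It assumes rows are already in display order (for example newest first) and that `run_id` is unique; `paginateRuns` and `Page` are hypothetical names.

```typescript
interface Page<T> {
  items: T[];
  nextCursor: string | null; // null when there are no further rows
}

// Return up to `limit` rows that come after the row whose run_id equals `cursor`.
// An undefined or unknown cursor starts from the first row.
function paginateRuns<T extends { run_id: string }>(
  rows: T[],
  limit: number,
  cursor?: string,
): Page<T> {
  const start =
    cursor === undefined ? 0 : rows.findIndex((r) => r.run_id === cursor) + 1;
  const items = rows.slice(start, start + limit);
  const hasMore = start + items.length < rows.length;
  return {
    items,
    nextCursor: hasMore && items.length > 0 ? items[items.length - 1].run_id : null,
  };
}
```

On the Studio side, `nextCursor` is the value `useInfiniteQuery` would pass back as the next page parameter.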

- [ ] **P5: Zero-config same-repo mode** (~5d) — *strategic, defer until P1-P3 land*
  - When `results` is not configured, write run artifacts as commits on `refs/agentv/runs/v1` in the **source repo** (not under `refs/heads/` — keeps the ref out of default `git fetch`/`git push`, `git log`, `git branch`, and clone bloat).
  - Borrows entire's pattern; the non-`refs/heads/` namespace fixes entire's latent clone-bloat problem.
  - Studio reads from the local ref via a go-git equivalent.
  - Promotion path: `agentv results promote --to <org/repo>` copies the local ref history to a new separate repo when users outgrow solo mode.

- [ ] **P6: `Agentv-Run: <run-id>` commit trailer** (~2d)
  - Mirrors entire's `Entire-Checkpoint:` trailer.
  - At run start, record `git rev-parse HEAD` into the manifest.
  - `agentv results link` adds the trailer post-hoc on the source commit.
  - Studio RunDetail deep-links to the source commit via `results.repo`.
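
A sketch of the trailer handling behind P6, operating on the commit message text only. `addRunTrailer` and `parseRunIds` are hypothetical helpers; Git's own `git interpret-trailers` command could do the insertion instead. The parser here is deliberately simple and scans every line rather than only the final trailer block.

```typescript
const TRAILER_KEY = "Agentv-Run";

// Append "Agentv-Run: <runId>" to a commit message. The call is idempotent:
// a message that already carries the same trailer is returned unchanged
// (apart from normalized trailing whitespace).
function addRunTrailer(message: string, runId: string): string {
  const trailer = `${TRAILER_KEY}: ${runId}`;
  const body = message.replace(/\s+$/, "");
  if (body.split("\n").includes(trailer)) return body + "\n";

  // If the last paragraph already looks like a trailer block ("Key: value"
  // lines after the subject), extend it; otherwise start a new block.
  const paragraphs = body.split(/\n{2,}/);
  const last = paragraphs[paragraphs.length - 1];
  const isTrailerBlock =
    paragraphs.length > 1 &&
    last.split("\n").every((l) => /^[A-Za-z0-9-]+: \S/.test(l));
  return body + (isTrailerBlock ? "\n" : "\n\n") + trailer + "\n";
}

// Collect every run id referenced by an Agentv-Run trailer line.
function parseRunIds(message: string): string[] {
  const prefix = `${TRAILER_KEY}: `;
  return message
    .split("\n")
    .filter((l) => l.startsWith(prefix))
    .map((l) => l.slice(prefix.length).trim());
}
```

Studio could use `parseRunIds` on a source commit's message to link back to its runs, complementing the forward link stored in the manifest.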

## Recommended sequence

**P1 + P2 + P4 first** (~3.5d total) → unlocks "thousands of runs without UI lag", which is the original goal.

Then **P3** (~2d, cleaner internals, pairs with P1 to make all reads object-DB-only).

Then **P6** (~2d, low priority but well-bounded).

**P5** last (~5d) — it's a UX revolution, not a scale fix.

## Background

The premise question: "should Studio read from a cloned copy or from GitHub directly?" Answer: **the cloned copy is correct, and agentv already does this**. The fix is making the local reads cheap, not changing data location. Reading from the GitHub Contents API would hit rate limits and slow Monaco file-tree views to a crawl.

## References

- entire pattern: `entire/checkpoints/v1` ref, sharded `<id[:2]>/<id[2:]>/` paths (entireio/cli `docs/architecture/sessions-and-checkpoints.md`).
- Today's list path: `apps/cli/src/commands/inspect/utils.ts:572`, `apps/cli/src/commands/results/remote.ts:166`.
- Today's write path: `packages/core/src/evaluation/results-repo.ts:407`.
- Studio API surface: `apps/studio/src/lib/api.ts:62`.
