
Evals playground information architecture #2271

Open
prathmeshpatel wants to merge 2 commits into main from evals-ui-1

Conversation

@prathmeshpatel
Collaborator

@prathmeshpatel prathmeshpatel commented May 25, 2026

  • Adds the Evals Playground information architecture foundation for Dashboard / Cases / Runs / Settings
  • Keeps /evals/suite/:suiteId route-compatible as the Dashboard default
  • Adds tested helpers for future dashboard matrix data and canonical v2 surface sequencing
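The route compatibility described above can be illustrated with a small parser. This is a minimal sketch, not the PR's code: the function name `parseSuiteRoute`, the view identifiers, and the idea that the view is a trailing path segment are all assumptions made for illustration. The only behavior taken from the description is that a bare `/evals/suite/:suiteId` path resolves to the Dashboard.

```typescript
// Hypothetical view ids; the real constants live in the PR's route helpers and are not shown here.
type SuiteView = "dashboard" | "runs" | "test-cases" | "settings";

const SUITE_VIEWS: readonly SuiteView[] = [
  "dashboard",
  "runs",
  "test-cases",
  "settings",
];

interface SuiteRoute {
  suiteId: string;
  view: SuiteView;
}

// Parses "/evals/suite/:suiteId" with an optional trailing view segment (assumed layout).
// A missing or unrecognized view falls back to "dashboard", so older links keep working.
function parseSuiteRoute(path: string): SuiteRoute | null {
  const [root, kind, suiteId, rawView] = path
    .split("/")
    .filter((segment) => segment.length > 0);

  if (root !== "evals" || kind !== "suite" || suiteId === undefined) {
    return null;
  }

  const view = SUITE_VIEWS.find((candidate) => candidate === rawView) ?? "dashboard";
  return { suiteId, view };
}
```

Falling back to the Dashboard for an unknown segment is one possible design choice; a stricter parser could return `null` instead.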

@chelojimenez
Contributor

Snyk checks have passed. No issues have been found so far.

| Status | Scan Engine | Critical | High | Medium | Low | Total (0) |
| | Open Source Security | 0 | 0 | 0 | 0 | 0 issues |

💻 Catch issues earlier using the plugins for VS Code, JetBrains IDEs, Visual Studio, and Eclipse.

Collaborator Author

This stack of pull requests is managed by Graphite. Learn more about stacking.

@prathmeshpatel prathmeshpatel marked this pull request as ready for review May 25, 2026 08:12
@chatgpt-codex-connector

Codex usage limits have been reached for code reviews. Please check with the admins of this repo to increase the limits by adding credits.
Credits must be used to enable repository wide code reviews.

@dosubot dosubot Bot added the size:L (This PR changes 100-499 lines, ignoring generated files.) and enhancement (New feature or request) labels May 25, 2026
@prathmeshpatel prathmeshpatel changed the title from "hide client picker" to "Evals playground information architecture" May 25, 2026
@github-actions
Contributor

Internal preview

Preview URL: https://mcp-inspector-pr-2271.up.railway.app
Deployed commit: f167562
PR head commit: f87c3c4
Backend target: staging fallback.
Health: ❌ Convex unreachable — see upsert-preview job logs (staging may need convex deploy)
Access is employee-only in non-production environments.

@coderabbitai
Contributor

coderabbitai Bot commented May 25, 2026


Walkthrough

This PR introduces suite workspace navigation with a tabbed interface for dashboard, runs, test-cases, and settings views. It adds a suite dashboard data assembly layer that derives the latest completed run, aggregates case and model identifiers, and computes available metrics based on iteration signals. The changes integrate workspace navigation into the suite overview page, add an IA surface dependency model, centralize route view handling via shared constants, and disable the global host bar when on the evals playground. Comprehensive tests verify route parsing, workspace section mapping, dashboard data assembly, and view behavior across different navigation contexts.
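The dashboard data assembly described in the walkthrough could look roughly like the sketch below. Every type, field, and function name here (`SuiteRun`, `SuiteIteration`, `assembleSuiteDashboardData`, the `passed` and `latencyMs` signals, the metric names) is hypothetical; the actual shapes in `suite-dashboard-data.ts` are not shown in this conversation. The sketch only mirrors the three steps named in the walkthrough: pick the latest completed run, collect case and model identifiers, and report a metric only when some iteration carries its signal.

```typescript
// Hypothetical shapes, for illustration only.
type RunStatus = "pending" | "running" | "completed" | "failed";

interface SuiteRun {
  id: string;
  status: RunStatus;
  completedAt?: number; // epoch milliseconds (assumed)
}

interface SuiteIteration {
  runId: string;
  caseId: string;
  model: string;
  passed?: boolean; // assumed signal for a pass-rate metric
  latencyMs?: number; // assumed signal for a latency metric
}

type DashboardMetric = "passRate" | "latency";

interface SuiteDashboardData {
  latestCompletedRun: SuiteRun | null;
  caseIds: string[];
  models: string[];
  availableMetrics: DashboardMetric[];
}

function assembleSuiteDashboardData(
  runs: readonly SuiteRun[],
  iterations: readonly SuiteIteration[],
): SuiteDashboardData {
  // Step 1: the most recently completed run; unfinished runs are ignored.
  let latest: SuiteRun | null = null;
  let latestTime = -Infinity;
  for (const run of runs) {
    if (run.status !== "completed" || run.completedAt === undefined) continue;
    if (run.completedAt > latestTime) {
      latest = run;
      latestTime = run.completedAt;
    }
  }

  // Steps 2 and 3: scan only the iterations that belong to that run.
  const caseIds = new Set<string>();
  const models = new Set<string>();
  let hasPassSignal = false;
  let hasLatencySignal = false;
  if (latest !== null) {
    for (const iteration of iterations) {
      if (iteration.runId !== latest.id) continue;
      caseIds.add(iteration.caseId);
      models.add(iteration.model);
      if (iteration.passed !== undefined) hasPassSignal = true;
      if (iteration.latencyMs !== undefined) hasLatencySignal = true;
    }
  }

  const availableMetrics: DashboardMetric[] = [];
  if (hasPassSignal) availableMetrics.push("passRate");
  if (hasLatencySignal) availableMetrics.push("latency");

  return {
    latestCompletedRun: latest,
    caseIds: Array.from(caseIds),
    models: Array.from(models),
    availableMetrics,
  };
}
```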

Warning

There were issues while running some tools. Please review the errors and either fix the tool's configuration or disable the tool if it's a critical failure.

🔧 ESLint

If the error stems from missing dependencies, add them to the package.json file. For unrecoverable errors (e.g., due to private dependencies), disable the tool in the CodeRabbit configuration.

ESLint skipped: no ESLint configuration detected in root package.json. To enable, add eslint to devDependencies.



Contributor

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@mcpjam-inspector/client/src/components/evals/eval-ia-surfaces.ts`:
- Around line 56-60: The current getEvalIaSurfaceReference uses
EVAL_IA_SURFACE_SEQUENCE.find(... )! which relies on a non-null assertion;
instead build a Record<EvalIaSurfaceId, EvalIaSurfaceReference> (e.g.,
SURFACE_BY_ID) from EVAL_IA_SURFACE_SEQUENCE at module initialization,
validating at runtime that every EvalIaSurfaceId appears exactly once (throw an
Error if a missing or duplicate id is detected), and then have
getEvalIaSurfaceReference(id: EvalIaSurfaceId) simply return SURFACE_BY_ID[id]
without using !; reference EVAL_IA_SURFACE_SEQUENCE, EvalIaSurfaceId,
EvalIaSurfaceReference and the getEvalIaSurfaceReference function when making
the change.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: a0d12e49-89d8-4483-8c01-389adaf60b10

📥 Commits

Reviewing files that changed from the base of the PR and between ed86a77 and f87c3c4.

📒 Files selected for processing (15)
  • mcpjam-inspector/client/src/App.tsx
  • mcpjam-inspector/client/src/__tests__/App.hosted-oauth.test.tsx
  • mcpjam-inspector/client/src/components/evals/__tests__/create-suite-navigation.test.ts
  • mcpjam-inspector/client/src/components/evals/__tests__/eval-ia-surfaces.test.ts
  • mcpjam-inspector/client/src/components/evals/__tests__/suite-dashboard-data.test.ts
  • mcpjam-inspector/client/src/components/evals/__tests__/suite-iterations-master-detail.test.tsx
  • mcpjam-inspector/client/src/components/evals/eval-ia-surfaces.ts
  • mcpjam-inspector/client/src/components/evals/suite-dashboard-data.ts
  • mcpjam-inspector/client/src/components/evals/suite-iterations-view.tsx
  • mcpjam-inspector/client/src/components/evals/suite-workspace-nav.tsx
  • mcpjam-inspector/client/src/lib/__tests__/eval-route-url.test.ts
  • mcpjam-inspector/client/src/lib/__tests__/eval-suite-ia.test.ts
  • mcpjam-inspector/client/src/lib/app-navigation.ts
  • mcpjam-inspector/client/src/lib/eval-route-url.ts
  • mcpjam-inspector/client/src/lib/eval-suite-ia.ts

Comment on lines +56 to +60
export function getEvalIaSurfaceReference(
  id: EvalIaSurfaceId,
): EvalIaSurfaceReference {
  return EVAL_IA_SURFACE_SEQUENCE.find((surface) => surface.id === id)!;
}
Contributor


⚠️ Potential issue | 🟠 Major | ⚡ Quick win

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Verify there are no unsafe/dynamic call paths and that ids in sequence are unique.
set -euo pipefail

echo "Usages of getEvalIaSurfaceReference:"
rg -nP '\bgetEvalIaSurfaceReference\s*\(' --type ts --type tsx -C2

echo
echo "Check sequence id literals and duplicates in eval-ia-surfaces.ts:"
python - <<'PY'
import re, pathlib
p = pathlib.Path("mcpjam-inspector/client/src/components/evals/eval-ia-surfaces.ts")
s = p.read_text()
ids = re.findall(r'id:\s*"([^"]+)"', s)
print("ids:", ids)
print("unique:", sorted(set(ids)))
print("duplicate_count:", len(ids) - len(set(ids)))
PY

Repository: MCPJam/inspector

Length of output: 143


🏁 Script executed:

#!/bin/bash
set -euo pipefail

FILE="mcpjam-inspector/client/src/components/evals/eval-ia-surfaces.ts"

echo "== getEvalIaSurfaceReference call sites =="
rg -n --glob '*.ts' --glob '*.tsx' -C2 '\bgetEvalIaSurfaceReference\s*\(' .

echo
echo "== Inspect EVAL_IA_SURFACE_SEQUENCE + getEvalIaSurfaceReference =="
nl -ba "$FILE" | sed -n '1,140p'

echo
echo "== Parse IDs from EVAL_IA_SURFACE_SEQUENCE for duplicates =="
python3 - <<'PY'
import re, pathlib
p = pathlib.Path("mcpjam-inspector/client/src/components/evals/eval-ia-surfaces.ts")
s = p.read_text()
# grab id: "..." literals inside the sequence array (best-effort)
ids = re.findall(r'EVAL_IA_SURFACE_SEQUENCE\s*=\s*\[\s*[\s\S]*?id:\s*"([^"]+)"', s)
if not ids:
    ids = re.findall(r'\bid:\s*"([^"]+)"', s)
print("ids:", ids)
print("unique:", sorted(set(ids)))
print("duplicate_count:", len(ids) - len(set(ids)))
PY

echo
echo "== Find EvalIaSurfaceId type definition =="
rg -n --glob '*.ts' --glob '*.tsx' 'type\s+EvalIaSurfaceId|enum\s+EvalIaSurfaceId|export\s+type\s+EvalIaSurfaceId|export\s+type\s+.*EvalIaSurfaceId' mcpjam-inspector/client/src/components/evals -S || true

Repository: MCPJam/inspector

Length of output: 1587


🏁 Script executed:

#!/bin/bash
set -euo pipefail

FILE="mcpjam-inspector/client/src/components/evals/eval-ia-surfaces.ts"

echo "== File excerpt (sequence + helper) =="
sed -n '1,130p' "$FILE"

echo
echo "== Evaluate getEvalIaSurfaceReference call sites (ts/tsx) =="
rg -n --glob '*.ts' --glob '*.tsx' '\bgetEvalIaSurfaceReference\s*\(' mcpjam-inspector/client/src/components/evals

echo
echo "== Inspect unit tests for this helper =="
sed -n '1,120p' mcpjam-inspector/client/src/components/evals/__tests__/eval-ia-surfaces.test.ts

echo
echo "== Extract EvalIaSurfaceId definition =="
rg -n --glob '*.ts' --glob '*.tsx' 'EvalIaSurfaceId' mcpjam-inspector/client/src/components/evals/eval-ia-surfaces.ts

Repository: MCPJam/inspector

Length of output: 2894


Replace non-null assertion in getEvalIaSurfaceReference with a total keyed lookup (EvalIaSurfaceId).

mcpjam-inspector/client/src/components/evals/eval-ia-surfaces.ts:56-60 currently works because EvalIaSurfaceId is exactly "S1" | ... | "S6" and EVAL_IA_SURFACE_SEQUENCE includes all six ids (covered by tests). Still, EVAL_IA_SURFACE_SEQUENCE.find(...)! does not guarantee completeness or uniqueness. If a future edit drops an id, the function returns undefined despite its declared return type, and the first caller that reads a property of the result crashes at runtime. If an edit duplicates an id, find silently returns the first match and the later entry is never reachable.

  • Replace it with a Record<EvalIaSurfaceId, EvalIaSurfaceReference> derived from EVAL_IA_SURFACE_SEQUENCE (with a small runtime invariant for completeness/uniqueness), and return the map value—no !.
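The suggested change can be sketched as follows. This is an illustration under stated assumptions rather than the repository's code: the `label` field, the placeholder sequence entries, the `ALL_SURFACE_IDS` list, and the `buildSurfaceIndex` helper are hypothetical, while `EvalIaSurfaceId`, `EvalIaSurfaceReference`, `EVAL_IA_SURFACE_SEQUENCE`, `SURFACE_BY_ID`, and `getEvalIaSurfaceReference` are the names used in the review comment.

```typescript
type EvalIaSurfaceId = "S1" | "S2" | "S3" | "S4" | "S5" | "S6";

interface EvalIaSurfaceReference {
  id: EvalIaSurfaceId;
  label: string; // assumed field; the real interface is not shown in this review
}

// Hypothetical list of every id, used only for the completeness check.
const ALL_SURFACE_IDS: readonly EvalIaSurfaceId[] = [
  "S1", "S2", "S3", "S4", "S5", "S6",
];

// Placeholder entries; the real sequence carries the PR's own data.
const EVAL_IA_SURFACE_SEQUENCE: readonly EvalIaSurfaceReference[] = [
  { id: "S1", label: "Surface 1" },
  { id: "S2", label: "Surface 2" },
  { id: "S3", label: "Surface 3" },
  { id: "S4", label: "Surface 4" },
  { id: "S5", label: "Surface 5" },
  { id: "S6", label: "Surface 6" },
];

// Builds a total id -> surface map, throwing if an id is duplicated or missing.
function buildSurfaceIndex(
  sequence: readonly EvalIaSurfaceReference[],
): Record<EvalIaSurfaceId, EvalIaSurfaceReference> {
  const byId: Partial<Record<EvalIaSurfaceId, EvalIaSurfaceReference>> = {};
  for (const surface of sequence) {
    if (byId[surface.id] !== undefined) {
      throw new Error(`Duplicate eval IA surface id: ${surface.id}`);
    }
    byId[surface.id] = surface;
  }
  for (const id of ALL_SURFACE_IDS) {
    if (byId[id] === undefined) {
      throw new Error(`Missing eval IA surface id: ${id}`);
    }
  }
  // Safe to widen: the two loops above verified that every id is present exactly once.
  return byId as Record<EvalIaSurfaceId, EvalIaSurfaceReference>;
}

// Validated once at module initialization, so a bad edit fails on load rather than on lookup.
const SURFACE_BY_ID = buildSurfaceIndex(EVAL_IA_SURFACE_SEQUENCE);

// Exported in the real module; the lookup no longer needs a non-null assertion.
function getEvalIaSurfaceReference(id: EvalIaSurfaceId): EvalIaSurfaceReference {
  return SURFACE_BY_ID[id];
}
```

The trade-off is that the invariant moves from an implicit assumption at each call to an explicit check at import time, which surfaces a broken sequence in any test that loads the module.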


Labels

enhancement: New feature or request
size:L: This PR changes 100-499 lines, ignoring generated files.


2 participants