
usability: add first-run success, status, and troubleshooting guidance #172

@changliu2

Reporter perspective

As a first-time user, I can run the quickstart command, but I do not know what success looks like, how long the run should take, what the status output means, or where to go when the run fails. The docs mention artifacts and viewer docs, but not the expected first-run checkpoints or troubleshooting path.

Evidence

  • docs/getting-started.md:47-63 shows the quickstart run, status command, and artifact directory, but does not say:
    • expected runtime range,
    • which pipeline stages I should see,
    • how to tell the run completed successfully,
    • which files should exist after success.
  • docs/getting-started.md:55-57 labels the status command as powershell, even though the command itself is shell-agnostic:
    assert-eval results status travel-planner-langgraph-v1 demo-1
    
  • docs/getting-started.md:116-120 links to config/results/viewer docs but not docs/guides/troubleshooting.md, even though missing credentials and model/provider mismatch are common first-run failures.
  • docs/guides/local-viewer.md:20 says the dev server starts at http://localhost:5174, but does not explicitly tell the user to open that URL in a browser.

Recommended fix

Add a short "What success looks like" / "If this fails" block near the quickstart run:

  • Expected stages: systematize, test_set, inference, judge.
  • Expected artifacts: taxonomy.json, test_set.jsonl, inference_set.jsonl, scores.jsonl, metrics.json under artifacts/results/<suite>/<run>/.
  • Status command: label the code block as sh or text rather than powershell, since the command is shell-agnostic.
  • Troubleshooting link: add docs/guides/troubleshooting.md for credentials, model strings, and provider connectivity.
  • Local viewer: explicitly say "Open http://localhost:5174 in your browser."
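
To make the "expected artifacts" checkpoint concrete, here is a minimal sketch of a post-run check. It only looks for the five files listed above under artifacts/results/<suite>/<run>/. The helper name `missing_artifacts` is hypothetical and not part of assert-eval; treating an empty file as missing is my assumption, and so is reading travel-planner-langgraph-v1 as the suite and demo-1 as the run in the status command.

```python
from pathlib import Path

# Artifact names copied from the "Expected artifacts" bullet in this issue.
EXPECTED_ARTIFACTS = (
    "taxonomy.json",
    "test_set.jsonl",
    "inference_set.jsonl",
    "scores.jsonl",
    "metrics.json",
)


def missing_artifacts(suite: str, run: str, root: Path = Path("artifacts/results")) -> list[str]:
    """Return the expected artifact names that are absent or empty for a run.

    Hypothetical helper: assert-eval itself may expose a better success signal.
    """
    run_dir = root / suite / run
    missing = []
    for name in EXPECTED_ARTIFACTS:
        path = run_dir / name
        # Assumption: a zero-byte file means the stage did not finish writing.
        if not path.is_file() or path.stat().st_size == 0:
            missing.append(name)
    return missing


if __name__ == "__main__":
    # Assumed mapping: first argument of the status command is the suite,
    # second is the run.
    gaps = missing_artifacts("travel-planner-langgraph-v1", "demo-1")
    if gaps:
        print("Run looks incomplete; missing or empty:", ", ".join(gaps))
    else:
        print("All expected artifacts are present.")
```

A check like this could back the "What success looks like" block, but the docs should still state the file list in prose so users do not need a script to confirm a first run.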

This is separate from #53, which focuses on making Phoenix less distracting in the quickstart.

Slice rollup

Found by slice 1:

  • C:\Users\changliu2\.copilot\session-state\3714f9ab-3680-4990-a750-a80c932203f2\files\usability-slice-1-rollup.md

Labels: documentation (Improvements or additions to documentation), should-fix (Confusing or visibly rough but not launch-blocking)
