feat(viewer): interactive solution visualizations via per-task visualize.py #20

Merged: aktasbatuhan merged 1 commit into main from feat/viewer-solution-viz on Jun 4, 2026
@aktasbatuhan

Viewer roadmap item E. Adds a "see the solution" view — an interactive picture of what a run's best program actually produced, modeled on AlphaEvolve's solution panel.

The design (as agreed)

A task ships an optional visualize.py beside its evaluator.py — the visualizer is a per-task input, exactly parallel to the evaluator. It exposes one function:

def render(program_path: str) -> str:   # self-contained HTML/SVG fragment (vanilla JS allowed)
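
A minimal sketch of what a task's visualize.py could look like under this contract. The PR only fixes the `render` signature; everything else here is an assumption for illustration, in particular that the evolved program exposes a hypothetical `run()` returning `(x, y, r)` circles in the unit square, and the helper name `_load_program`.

```python
import importlib.util
from html import escape


def _load_program(program_path: str):
    """Import the evolved program from its file path as a throwaway module."""
    spec = importlib.util.spec_from_file_location("best_program", program_path)
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)
    return module


def render(program_path: str) -> str:
    """Return a self-contained HTML/SVG fragment picturing the program's output."""
    program = _load_program(program_path)
    # ASSUMPTION: the program exposes run() -> list of (x, y, r) in the unit square.
    circles = program.run()
    shapes = []
    for x, y, r in circles:
        label = escape(f"r={r:.4f} center=({x:.4f}, {y:.4f})")
        # SVG's y axis points down, so flip y to keep the origin bottom-left.
        shapes.append(
            f'<circle cx="{x:.4f}" cy="{1 - y:.4f}" r="{r:.4f}">'
            f"<title>{label}</title></circle>"
        )
    return (
        '<div class="kv-sol">'
        "<style>.kv-sol circle{fill:#8ab;stroke:#234;stroke-width:0.004}</style>"
        f'<svg viewBox="0 0 1 1" width="400" height="400">{"".join(shapes)}</svg>'
        "</div>"
    )
```

The `<title>` child gives a native hover tooltip without any JavaScript, and the style rule is scoped under `.kv-sol` so it cannot leak into the surrounding viewer page.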

Execution: lazy at view-time + cached. Since solutions aren't stored (only code + metrics), the viewer re-runs the best program through visualize.py in a sandboxed subprocess with a timeout (it execs evolved code; the viewer binds to localhost only), and caches the fragment to best/solution_viz.html (invalidated when the program is rewritten). First render ~2–5s, cached ~0.04s. Works retroactively on all existing runs — it only needs the stored best_program.py.
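
The render-then-cache cycle described above can be sketched roughly as follows. This is not the code in kaievolve/viewer/solution.py: the function name `render_cached`, its parameters, the mtime-based invalidation rule, and the error-message format are all assumptions. The sketch also gives only process isolation plus a timeout, not a real sandbox.

```python
import subprocess
import sys
from html import escape
from pathlib import Path

# Child-process script: load visualize.py by path and print render(program).
_RUNNER = (
    "import importlib.util, sys\n"
    "spec = importlib.util.spec_from_file_location('visualize', sys.argv[1])\n"
    "mod = importlib.util.module_from_spec(spec)\n"
    "spec.loader.exec_module(mod)\n"
    "sys.stdout.write(mod.render(sys.argv[2]))\n"
)


def render_cached(visualizer: Path, program: Path, cache: Path,
                  timeout: float = 30.0, force: bool = False) -> str:
    """Return the solution fragment, re-rendering only when the cache is stale."""
    if not program.exists():
        return "<p>no best program found for this run</p>"
    # ASSUMPTION: "invalidated when the program is rewritten" is modeled by mtime.
    fresh = cache.exists() and cache.stat().st_mtime >= program.stat().st_mtime
    if fresh and not force:
        return cache.read_text(encoding="utf-8")
    try:
        proc = subprocess.run(
            [sys.executable, "-c", _RUNNER, str(visualizer), str(program)],
            capture_output=True, text=True, timeout=timeout,
        )
    except subprocess.TimeoutExpired:
        return f"<p>visualizer timed out after {timeout:g}s</p>"
    if proc.returncode != 0:
        # Report the failure in the page instead of raising into the viewer.
        return "<pre>" + escape(proc.stderr[-2000:]) + "</pre>"
    cache.parent.mkdir(parents=True, exist_ok=True)
    cache.write_text(proc.stdout, encoding="utf-8")
    return proc.stdout
```

Failed renders are deliberately not cached, so fixing a broken visualize.py takes effect on the next page load.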

Discovery: by task-dir name under --tasks roots (default examples/), or a visualize.py dropped directly in the bench task dir.
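
A rough sketch of the two discovery paths. The function name `find_visualizer` and the precedence (direct file first, then by-name lookup) are assumptions; the PR text does not say which path wins when both exist.

```python
from pathlib import Path
from typing import Iterable, Optional


def find_visualizer(task_dir: Path,
                    task_roots: Iterable[Path] = (Path("examples"),)) -> Optional[Path]:
    """Locate visualize.py for a task, or return None when the task has none."""
    # Path 1: a visualize.py dropped directly in the bench task dir.
    direct = task_dir / "visualize.py"
    if direct.is_file():
        return direct
    # Path 2: a directory with the same name under one of the --tasks roots.
    for root in task_roots:
        candidate = Path(root) / task_dir.name / "visualize.py"
        if candidate.is_file():
            return candidate
    return None
```

Returning `None` rather than raising lets the caller hide the dashboard button and show the "no visualizer" message on the route.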

Two reference visualizers (interactive, vanilla JS + inline SVG, no deps, CSS scoped under .kv-sol)

  • circle packing — the packing in the unit square; hover a circle for radius/center/share, scroll-to-zoom, drag-to-pan, shading by radius so load-bearing circles read at a glance.
  • autocorrelation — the step function f plus its autoconvolution f∗f with the score-driving peak marked in red; hover to read any step's height.
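
For the autocorrelation visualizer, the peak of f∗f can be located exactly without sampling. Assuming f is a step function with equal-width steps on [0, n·width), f∗f is piecewise linear with knots at multiples of the step width, so its maximum sits at a knot. The sketch below computes those knots in plain Python; the function names are hypothetical, and how the task turns this peak into its score is not stated in the PR.

```python
from typing import List, Sequence, Tuple


def autoconvolution_knots(heights: Sequence[float],
                          width: float) -> List[Tuple[float, float]]:
    """Return (t, (f*f)(t)) at the knots t = width, 2*width, ..., (2n-1)*width."""
    n = len(heights)
    conv = [0.0] * (2 * n - 1)
    # Discrete convolution of the step heights: conv[k] = sum of h_i * h_j, i + j = k.
    for i, hi in enumerate(heights):
        for j, hj in enumerate(heights):
            conv[i + j] += hi * hj
    # Each pair of overlapping steps contributes a triangle of height width * h_i * h_j.
    return [((k + 1) * width, width * c) for k, c in enumerate(conv)]


def autoconvolution_peak(heights: Sequence[float], width: float) -> Tuple[float, float]:
    """Return the knot (t, value) where f*f is largest; needs at least one step."""
    return max(autoconvolution_knots(heights, width), key=lambda tv: tv[1])
```

For example, two unit-height steps of width 0.5 form the indicator of [0, 1), whose autoconvolution is the triangle peaking at t = 1 with value 1, which is what `autoconvolution_peak([1.0, 1.0], 0.5)` returns.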

Also

  • /setup/{label}/run/{idx}/solution route (graceful "no visualizer" message when absent).
  • Dashboard "see the solution" button, shown only when a visualizer is found.
  • --tasks CLI flag on kai viewer.
  • skills/visualization/SKILL.md documenting the contract for new tasks.

Verification

  • test_solution.py (9): visualizer discovery (direct / by-name / none), render+cache cycle (incl. cache invalidation, force, error-reported-not-raised, missing program), and both reference visualizers render hermetically from the committed initial_program.py files.
  • test_viewer.py (+2): solution route present/absent paths and the conditional dashboard button.
  • Full suite green (333 tests). Verified live in the viewer on real circle + autocorrelation runs.

🤖 Generated with Claude Code

…ize.py

Adds a "see the solution" view to kai viewer: an interactive picture of
what a run's best program actually produced, not just its score. A task
ships an optional visualize.py beside its evaluator.py (the visualizer is
a per-task input, parallel to the evaluator) exposing:

    render(program_path: str) -> str   # self-contained HTML/SVG fragment

Execution model (kaievolve/viewer/solution.py): lazy at view-time, run in
a sandboxed subprocess with a timeout (it execs evolved code; the viewer
binds to localhost), and cached to best/solution_viz.html, invalidated
when the program is rewritten. This works retroactively on existing runs
since it only needs the stored best_program.py. The visualizer is resolved
by task-dir name under --tasks roots (default examples/), or dropped
directly in the bench task dir.

Two reference visualizers, both interactive, vanilla JS + inline SVG, no
deps, CSS scoped under .kv-sol:
  - packing_circles_max_sum_of_radii: the packing in the unit square,
    hover a circle for radius/center/share, scroll-zoom, drag-pan, shading
    by radius.
  - autocorrelation_C1: the step function f plus its autoconvolution f*f
    with the score-driving peak marked; hover to read any step.

Plus: /setup/{label}/run/{idx}/solution route (graceful when no visualizer
exists), a dashboard "see the solution" button shown only when one is
found, a --tasks CLI flag, and skills/visualization/SKILL.md documenting
the contract.

Tests: test_solution.py covers visualizer discovery, the render+cache
cycle (incl. cache invalidation and error reporting), and both reference
visualizers; test_viewer.py covers the route present/absent paths. Full
suite green (333 tests).
@aktasbatuhan aktasbatuhan merged commit f52f43d into main Jun 4, 2026
1 check passed
@aktasbatuhan aktasbatuhan deleted the feat/viewer-solution-viz branch June 4, 2026 07:41
