blizhan · blizhan · Apr 30, 2026 · Apr 26, 2026 · Apr 30, 2026 · Apr 30, 2026
diff --git a/.gitignore b/.gitignore
@@ -216,6 +216,8 @@ __marimo__/
 
 # spec-kit
 .specify/
+.agents/skills/spec*
 
 # aim
 data/
+datatest/
diff --git a/AGENTS.md b/AGENTS.md
@@ -81,6 +81,8 @@ Why `3.12`:
 - Existing local Aim repositories (read-only). Image bytes are read (003-query-images-terminal-render)
 - Python 3.12 for development, runtime support `>=3.10,<3.13` + Python standard library, `numpy>=1.24`, `rich>=13.7`, `textual-image>=0.12.0`, existing Aim SDK usage for owned query commands; no new dependency planned (004-run-params-query)
 - Existing local Aim repositories on disk (read-only); run params are read from Aim run metadata attributes under `.aim` (004-run-params-query)
+- Python 3.12 for development, runtime support `>=3.10,<3.13` + Python standard library, `numpy>=1.24`, `rich>=13.7`, `plotext>=5.3`, existing Aim SDK usage for owned trace commands; no new runtime dependency planned (005-distribution-trace-visual)
+- Existing local Aim repositories on disk, read-only; distribution histogram points are read from Aim sequence data under `.aim` (005-distribution-trace-visual)
 
 ## Recent Changes
 - 001-aim-command-passthrough: Added Python 3.12 for development, runtime support `>=3.10,<3.13` + Python standard library, native Aim CLI (external runtime prerequisite for delegated commands), pytest for test automation
diff --git a/README.md b/README.md
@@ -232,6 +232,31 @@ aimx trace "metric.name == 'loss'" --repo data --every 10
 Output modes: default plot, `--table`, `--csv`, `--json`.
 Display controls: `--width W`, `--height H`, `--no-color`.
 
+### Trace distributions
+
+`aimx trace distribution` fetches tracked Aim distribution sequences. By
+default it prints the matched distribution names, selects the first match, and
+renders a non-interactive Rich terminal visual with a web-style blue-gradient
+current-step histogram and step-by-bin heatmap. Use `--table`, `--csv`, or
+`--json` for tensor inspection and scripting.
+
+![aimx trace distribution output preview](static/distributions.png)
+
+```bash
+# Show a web-like terminal visual for the first matched distribution
+aimx trace distribution "distribution.name != ''" --repo data
+
+# Inspect a specific training step; nearest tracked step is used if needed
+aimx trace distribution "distribution.name != ''" --repo data --step 12300
+
+# Show distribution tensors in a readable table
+aimx trace distribution "distribution.name == 'weights'" --repo data --table
+
+# Export distribution histograms for scripting
+aimx trace distribution "distribution.name == 'weights'" --repo data --csv
+aimx trace distribution "distribution.name == 'weights'" --repo data --json
+```
+
 ### Common query options
 
 - Output: `--json`, `--oneline` / `--plain`, or the default rich terminal view.
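
Review note on the `--step` example in the README hunk above: the text says the nearest tracked step is used when the requested step was not tracked. A minimal sketch of that selection rule follows. `nearest_tracked_step` is a hypothetical helper for illustration, not an aimx function, and the tie-breaking toward the smaller step is an assumption; the diff does not say how aimx resolves ties.

```python
def nearest_tracked_step(tracked_steps, requested):
    """Return the tracked step closest to `requested`.

    Hypothetical helper: ties are resolved toward the smaller step,
    which is an assumption rather than documented aimx behavior.
    """
    if not tracked_steps:
        raise ValueError("no tracked steps to choose from")
    # Sort key: distance to the requested step first, then the step itself.
    return min(tracked_steps, key=lambda step: (abs(step - requested), step))


if __name__ == "__main__":
    print(nearest_tracked_step([12000, 12500, 13000], 12300))
```

If the real behavior differs on ties or on an empty series, the README sentence would be the place to state it.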

diff --git a/pyproject.toml b/pyproject.toml
@@ -1,6 +1,6 @@
 [project]
 name = "aimx"
-version = "0.3.2"
+version = "0.3.3"
 description = "A safe CLI-first companion for native Aim"
 readme = "README.md"
 requires-python = ">=3.10,<3.13"

diff --git a/skills/aimx/SKILL.md b/skills/aimx/SKILL.md
@@ -11,6 +11,91 @@ Use `aimx` as a read-only evidence collector for `autoresearch` `log_experiment`
 steps. Prefer JSON output so downstream agents can compare runs, explain model
 effects, and propose the next experiment from concrete Aim data.
 
+## Fast Recipes
+
+Use these first for common analysis tasks. Keep `--repo` explicit and prefer
+`--json` for machine-readable output.
+
+### Discover run scope and available params
+
+```bash
+aimx query params "run.hash != ''" --repo <repo> --json
+```
+
+### Inspect one run quickly
+
+```bash
+aimx query params "run.hash == '<run-hash>'" --repo <repo> --json
+aimx query metrics "(run.hash == '<run-hash>') and metric.name != ''" --repo <repo> --json
+```
+
+### Rank runs by an objective metric
+
+```bash
+aimx query metrics "(<run-scope>) and metric.name == '<metric>'" --repo <repo> --json > metrics.json
+python - <<'PY'
+from __future__ import annotations
+import json
+from pathlib import Path
+
+payload = json.loads(Path("metrics.json").read_text())
+rows = []
+for run in payload.get("runs", []):
+    for metric in run.get("metrics", []):
+        value = (metric.get("min") or {}).get("value")
+        if value is not None:
+            rows.append((value, run.get("hash"), run.get("name"), metric.get("context", {})))
+for value, run_hash, run_name, context in sorted(rows, key=lambda row: row[0])[:5]:
+    print(f"{value:.6f}\t{run_hash}\t{run_name}\t{context}")
+PY
+```
+
+### Compare two runs side by side
+
+```bash
+aimx query params "run.hash == '<baseline-hash>' or run.hash == '<candidate-hash>'" --repo <repo> --json
+aimx query metrics "((run.hash == '<baseline-hash>') or (run.hash == '<candidate-hash>')) and metric.name == '<metric>'" --repo <repo> --json
+```
+
+### Check curve health with bounded trace evidence
+
+```bash
+aimx trace "(<run-scope>) and metric.name == '<metric>'" --repo <repo> --json --tail 200 > trace.json
+```
+
+Then reduce `trace.json` with the `curve_summary` snippet from
+`references/aimx-cli.md` instead of pasting raw series.
+
+### Sanity-check distribution traces
+
+```bash
+aimx trace distribution "<distribution-expr>" --repo <repo> --json --tail 5
+aimx trace distribution "distribution.name != ''" --repo <repo> --step 12300
+```
+
+### Capture one snapshot bundle for logs
+
+```bash
+uv run python skills/aimx/scripts/collect_experiment_snapshot.py \
+  --repo data \
+  --base-expr "run.hash != ''" \
+  --metric loss \
+  --trace-metric loss \
+  --pretty
+```
+
+## When to use what
+
+| Need | Use |
+| --- | --- |
+| Discover runs and key hyperparameters | `aimx query params "<run-scope>" --repo <repo> --json` |
+| Rank runs cheaply by objective | `aimx query metrics "<metric-expr>" --repo <repo> --json` and compare `min.value` or `max.value` |
+| Inspect curve shape and late stability | `aimx trace "<metric-expr>" --repo <repo> --json --tail N` |
+| Focus on a step or epoch window | `--steps a:b` or `--epochs a:b` on query/trace commands |
+| Analyze weight or gradient histograms | `aimx trace distribution "<distribution-expr>" --repo <repo> --json` |
+| Collect qualitative image evidence | `aimx query images "<image-expr>" --repo <repo> --json --head N` |
+| Check native Aim passthrough readiness | `aimx doctor` |
+
 ## Requirements
 
 - Require `aimx` in the Python environment that runs `log_experiment`.
@@ -32,6 +117,9 @@ effects, and propose the next experiment from concrete Aim data.
 
 ## Workflow
 
+For common tasks, start from **Fast Recipes** and only switch to this full
+workflow when the scope is unclear or the question is complex.
+
 1. Locate the Aim repository. Pass `--repo <repo-root-or-.aim>` explicitly; in
    this repository, use `--repo data` or `--repo data/.aim` for local checks.
 2. Define the run scope as an AimQL expression. Start broad with
@@ -56,13 +144,22 @@ effects, and propose the next experiment from concrete Aim data.
    aimx trace "(<run-scope>) and metric.name == 'loss'" --repo <repo> --json --tail 50
    ```
 
-6. Collect image metadata when qualitative outputs matter:
+6. Inspect distribution traces when weight, activation, or gradient histograms
+   matter. Prefer JSON/CSV for automation; use the default visual output for
+   human terminal inspection.
+
+   ```bash
+   aimx trace distribution "<distribution-expr>" --repo <repo> --json --tail 5
+   aimx trace distribution "distribution.name != ''" --repo <repo> --step 12300
+   ```
+
+7. Collect image metadata when qualitative outputs matter:
 
    ```bash
    aimx query images "images" --repo <repo> --json --head 20
    ```
 
-7. Emit a compact `log_experiment` record containing:
+8. Emit a compact `log_experiment` record containing:
 
    ```json
    {
@@ -71,6 +168,7 @@ effects, and propose the next experiment from concrete Aim data.
      "params": {},
      "metric_summary": {},
      "trace_evidence": {},
+     "distribution_evidence": {},
      "image_evidence": {},
      "interpretation": {
        "best_runs": [],
@@ -81,13 +179,55 @@ effects, and propose the next experiment from concrete Aim data.
    }
    ```
 
+## Analysis Workflow
+
+Use the same discipline as large experiment trackers: inspect structure first,
+query only the fields needed for the question, then reduce evidence into compact
+statistics before writing conclusions.
+
+1. Start with params and metric summaries to discover candidate runs, objective
+   metrics, contexts, and missing fields. Avoid dumping full JSON payloads into
+   conversation context.
+2. Choose the objective direction explicitly. Rank cheaply from summaries first:
+   `min.value` for loss/error, `max.value` for accuracy/F1/AUC/IoU, and
+   `last.value` only when the final checkpoint is the real objective.
+3. Pull bounded traces only for the baseline, top candidates, and suspicious
+   runs. Prefer `--tail`, `--steps`, `--epochs`, and `--every` before collecting
+   full curves.
+4. Compute local stats before interpreting: best step, final-window mean/std,
+   train-vs-val gap, NaN/Inf counts, sustained increases, spikes, and plateaus.
+5. Compare runs side by side with selected params plus selected metrics. Do not
+   iterate every param or every metric unless discovery is the goal.
+6. Escalate evidence by modality: use distribution traces for weights,
+   activations, or gradients; use image metadata for qualitative regressions.
+7. Keep the final analysis small: state objective, run scope, top runs, curve
+   health, anomalies, confidence, and the next experiment suggested by evidence.
+
+## Critical Rules
+
+- Discover scope first with `aimx query params "<run-scope>" --repo <repo> --json`.
+  Do not assume metric or param names.
+- Treat `aimx` output as data: parse JSON and report aggregates, not raw payloads.
+- Slice traces aggressively with `--tail`, `--head`, `--steps`, `--epochs`, or
+  `--every` before computing local statistics.
+- Always pass `--repo` explicitly to avoid reading an unintended repository.
+- For automation, use `aimx trace distribution` with `--json`, `--csv`, or
+  `--table`. Without an output flag, it renders a terminal visual for human inspection.
+- Always finish with a compact conclusion: objective, top runs, curve health,
+  anomalies, confidence, and next experiment.
+
 ## Interpretation Rules
 
 - Prefer validation, test, or held-out contexts over training contexts when
   ranking runs.
 - Treat `aimx query metrics` as summary data: `last`, `min`, `max`, and step
   counts. Use `aimx trace --json` when shape, stability, divergence, or late
   improvement matters.
+- Use `aimx trace distribution --json` or `--csv` for automated histogram
+  evidence. The unflagged distribution command is a non-interactive terminal
+  visual that lists matched distributions, selects the first non-empty series,
+  and renders a current-step histogram plus step-by-bin heatmap. `--step N`
+  affects only this visual mode and falls back to the nearest tracked step.
 - For minimization metrics such as loss or error, compare `min.value` and the
   corresponding step. For maximization metrics such as accuracy, F1, AUC, or
   IoU, compare `max.value`.
@@ -97,6 +237,21 @@ effects, and propose the next experiment from concrete Aim data.
 - Preserve read-only behavior. Do not run commands that initialize, repair,
   migrate, delete, or rewrite Aim repositories during `log_experiment`.
 
+## Gotchas
+
+| Gotcha | Wrong | Right |
+| --- | --- | --- |
+| Missing `aimx` in environment | Assume `aimx` is available | Verify with `aimx --help` or `python -m aimx --help`, then follow the project's install workflow |
+| Repository targeting | Rely on current directory | Pass `--repo <repo>` explicitly on every collection command |
+| Summary vs curve confusion | Treat `query metrics` output as full history | Use `query metrics` for summary (`last/min/max`) and `trace --json` for curve shape |
+| Raw payload dumping | Paste full JSON into conversation | Parse and compute compact aggregates before reporting |
+| AimQL string quoting | `metric.name == "loss"` | `metric.name == 'loss'` |
+| Short hash assumptions | Assume short hash is canonical identity | Let `aimx` expand it, but compare/store full run hash |
+| Distribution output mode | Use default distribution mode in scripts | Use `--json`, `--csv`, or `--table` for automation |
+| `--step` expectation | Expect `--step` to filter JSON/CSV/table exports | Use `--step` only for visual histogram mode |
+| Empty trace handling | Treat non-JSON message as fatal parsing error | Treat it as no trace evidence and continue analysis |
+| Full trace collection | Pull all runs and all points first | Rank by summary, then trace only baseline, top candidates, and suspicious runs |
+
 ## Helper Script
 
 Use `scripts/collect_experiment_snapshot.py` when an agent needs one structured
@@ -120,4 +275,7 @@ needed. It writes only to stdout.
 ## Reference
 
 Read `references/aimx-cli.md` for command details, JSON envelope shapes, and
-suggested `log_experiment` evidence fields.
+suggested `log_experiment` evidence fields. For deeper experiment analysis
+patterns, see "Analysis Patterns", "Find best run by objective", "Spike /
+divergence / plateau / NaN detection", "Overfitting detection", and "Sweep
+ranking".
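
Review note on step 4 of the new "Analysis Workflow" section (compute local stats before interpreting): the sketch below shows one way to reduce a bounded trace to compact statistics. It is not the `curve_summary` snippet from `references/aimx-cli.md`, which this diff does not show. `summarize_curve` and its parameters are hypothetical, and it takes plain `steps` and `values` lists because the JSON envelope shape of `aimx trace --json` is not visible here.

```python
import math
import statistics


def summarize_curve(steps, values, window=20, minimize=True):
    """Reduce one metric curve to a few summary statistics.

    Hypothetical helper: `steps` and `values` are parallel lists of numbers
    extracted from a trace by the caller.
    """
    pairs = list(zip(steps, values))
    # Keep only finite points; NaN and +/-Inf are counted separately.
    finite = [(s, v) for s, v in pairs if math.isfinite(v)]
    summary = {"points": len(pairs), "non_finite": len(pairs) - len(finite)}
    if not finite:
        return summary
    pick = min if minimize else max
    best_step, best_value = pick(finite, key=lambda pair: pair[1])
    tail = [v for _, v in finite[-window:]]  # final window of finite values
    summary.update(
        best_step=best_step,
        best_value=best_value,
        final_mean=statistics.fmean(tail),
        final_std=statistics.pstdev(tail),
    )
    return summary


if __name__ == "__main__":
    print(summarize_curve([0, 1, 2, 3], [1.0, 0.5, float("nan"), 0.7], window=2))
```

Pass `minimize=False` for metrics such as accuracy, F1, AUC, or IoU, matching the objective-direction rule in step 2.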