diff --git a/LICENSES.md b/LICENSES.md
index 259beb8..887e1f4 100644
--- a/LICENSES.md
+++ b/LICENSES.md
@@ -57,11 +57,14 @@ Phase 0 (this document) produces the initial map; Phase 9 verifies.
 
 | Dataset | Phase | License | Status | Notes |
 |---|---|---|---|---|
-| GuitarSet | 1.5 / 7 | CC-BY-4.0 | ✅ | https://guitarset.weebly.com — JAMS annotations, hexaphonic. Already used in v0 finetune work. Re-distribution requires attribution; not committed to repo. |
-| IDMT-SMT-Guitar | 1.5 / 7 | research-use, registration | ⚠️ | Training-only; not redistributed in our repo. Verify scope of "research use" for portfolio context. |
-| EGDB | 1.5 / 7 | TBD | ⚠️ | https://github.com/ss12f32v/GuitarTranscription — multi-amp distorted electric. Verify before relying on it for distorted-electric tier eval. |
-| DadaGP | 7 | TBD | ⚠️ | https://github.com/dada-bots/dadaGP — GuitarPro tabs as synthetic-data substrate. |
-| User clips (existing 11/20 self-recorded) | 1.5 (bonus) | self-owned | ✅ | iPhone OOD bonus tier per design doc §6. Owned by Patrick. |
+| GuitarSet | 1.5 / 7 / **Phase 0 (this PR)** | CC-BY-4.0 | ✅ | https://guitarset.weebly.com — JAMS annotations, hexaphonic. Already used in v0 finetune work. Re-distribution requires attribution; not committed to repo. **Used as the only data source for the 2026-05-13 composite baseline** (player 05 held-out validation; 60 tracks; 8 715 gold notes). |
+| Guitar-TECHS | Phase 0 (planned) / 1.5 / 7 | CC-BY-4.0 (paper §4 + Zenodo) | ⚠️ | arXiv:2501.03720 — 5h12m multi-mic + DI; per-string MIDI annotations. Acquisition planned per Phase 0 impl plan §3.2; on-disk scanner stub in `tabvision/tabvision/eval/manifest_builder.py:scan_guitar_techs`. Required attribution must appear in the public README. |
+| IDMT-SMT-Guitar | 1.5 / 7 | research-use, registration | ⚠️ | Training-only; not redistributed in our repo. Verified in the 2026-05-13 research pass; superseded by Guitar-TECHS for v1 acceptance but kept for potential future training augmentation. |
+| EGDB | 1.5 / 7 | **none on repo — author email pending** | ⚠️ | https://ss12f32v.github.io/Guitar-Transcription/ — 240 tracks, ~12h with multi-amp electric variants, GuitarPro tabs + aligned MIDI. **Portfolio-use written permission required** before any acquisition (LICENSE file is null per 2026-05-13 verification). Email `f08946011@ntu.edu.tw`; template in `docs/plans/2026-05-12-tab-f1-to-spec-design.md` §8.2. |
+| ~~GOAT~~ | DROPPED | request-only, research-only | ❌ | arXiv:2509.22655. Verified 2026-05-13: distribution gated per-use ("for research purposes only, upon request") due to copyrighted cover-song content. Not portfolio-compatible per SPEC §1.5; removed from the eval composite. |
+| ~~SynthTab~~ | DROPPED from default pipeline | dataset CC-BY-NC-4.0 (code CC-BY-4.0) | ❌ | github.com/yongyizang/SynthTab. Dataset NC clause taints derived weights (SynthTab paper treats trained models as derivative work). Not portfolio-compatible per SPEC §1.5; removed from the planned pretrain pipeline 2026-05-13. The repo code (CC-BY-4.0 per the License column) remains usable with attribution for our own renderers if needed. |
+| DadaGP | research/dev only — **not in default pipeline** | access-by-email; underlying GP tabs derive from copyrighted songs | ⚠️ | https://github.com/dada-bots/dadaGP. Per 2026-05-13 design plan §4.2, acceptable as internal training augmentation only. Synthetic-source clips are blocked from non-train manifest splits by `tabvision.eval.manifest.validate_manifest` (the `SYNTHETIC_IN_EVAL_SPLIT` guard). |
+| ~~User clips (the 20-clip self-recorded set)~~ | BANNED | self-owned | ⛔ | Banned from all roles per 2026-05-13 design plan D10: not to be used as accuracy gate, dev set, or label source. Replaced by the public-corpus composite. |
 | Roboflow `b101/guitar-3` | 3 (training) | **CC BY 4.0** | ✅ | **Verified 2026-05-05.** Source: https://universe.roboflow.com/b101/guitar-3. Forked into Patrick's workspace as `patricks-workspace-vozcg/guitar-3-4efcd` v2; YOLOv8-OBB export downloaded (926 images, 710/144/72 split, classes: fret / neck / nut). License declared in the dataset's README.dataset.txt: "License: CC BY 4.0". Attribution: "guitar 3" by b101 on Roboflow Universe (https://universe.roboflow.com/b101/guitar-3), CC BY 4.0; export downloaded May 5, 2026 via the Roboflow SDK. **Required attribution must appear in the public README and any blog post.** |
 
 ## Library dependencies (default pipeline)
diff --git a/docs/DECISIONS.md b/docs/DECISIONS.md
index 80df952..5c971d6 100644
--- a/docs/DECISIONS.md
+++ b/docs/DECISIONS.md
@@ -16,6 +16,62 @@ Format:
 
 ---
 
+## 2026-05-13 — Tab F1 v1 acceptance: per-tier targets + public-corpus composite
+
+**Phase:** Accuracy work (cross-cuts Phases 1, 2, 3, 5, 7, 8 of the SPEC)
+**Decision tree:** Design plan adoption + SPEC §1.4 amendment proposal
+**Branch taken:** Replace the aggregate 0.88 Tab F1 acceptance gate with
+a per-tier table; drop SynthTab (CC-BY-NC) and GOAT (request-only) from
+the default pipeline; rely on GuitarSet + Guitar-TECHS + EGDB
+(license-pending) for the public-corpus composite eval.
+
+**Evidence:**
+- Strategy / decision record: `docs/plans/2026-05-12-tab-f1-to-spec-design.md`
+- Phase 0 implementation plan: `docs/plans/2026-05-13-tab-f1-phase-0-implementation.md`
+- SPEC amendment block: `SPEC.md` §1.4.1 (per-tier table + composite test set)
+- First baseline artifact (2 of 4 tiers covered): `docs/EVAL_REPORTS/composite_baseline_2026-05-13.md`
+- Companion error decomposition: `docs/EVAL_REPORTS/tab_f1_error_decomposition_2026-05-13.md`
+- Implementation branch with the eval harness: `impl/tab-f1-phase-0`
+
+**Reasoning:** The 2026-05-08 GuitarSet validation showed aggregate Tab
+F1 = 0.6104 with comp tracks at 0.670 and solo tracks at 0.508. The
+aggregate target hid the dominant failure axis (string/fret assignment
+on single-line passages), and the SPEC §1.4 numbers (0.94 / 0.86 / 0.90
+/ 0.82) baked in implicit per-tier expectations that the project hadn't
+explicitly negotiated. The 2026-05-13 user conversation locked in
+relaxed v1 targets (0.85 / 0.90 / 0.87 / 0.80), kept the original SPEC
+numbers as the v1.1 / portfolio stretch reference, and committed to
+audio-only fusion priors + cheap pitch post-processing as the leverage
+path (no SynthTab pretrain → no NC license taint on shipped weights).
+
+**Per-tier acceptance gate (v1):**
+
+| Tier | v1 target | 2026-05-13 baseline (mean / lower 95% CI) |
+|---|---:|---:|
+| Clean acoustic single-line | 0.85 | 0.5076 / 0.4448 (fail) |
+| Clean acoustic strummed | 0.90 | 0.6708 / 0.6015 (fail) |
+| Clean electric | 0.87 | missing — pending Guitar-TECHS |
+| Distorted electric | 0.80 | missing — pending EGDB |
+
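+Both covered tiers fail. Mean Tab F1 sits ~34 pp (single-line) and
+~23 pp (strummed) below target; the gated lower 95% bound sits ~41 pp
+and ~30 pp below. Per the error decomposition, `wrong_position_same_pitch`
+is 77% of single-line loss and 50% of strummed loss; Phases 1-7 of the design plan target this bucket.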
+
+**Decisions inventoried in the design plan (D1–D11):**
+
+- D1 Per-tier replaces aggregate. D2 Targets table. D3 Composite eval.
+  D4 No SynthTab. D5 Video qualitative-only. D6 Free-tier compute first
+  (Local > Colab > Kaggle > Lightning > Modal). D7 1-2 month cadence.
+  D8 No stretch (bends/slides) in v1. D9 D2 numbers on top-1 only.
+  D10 Personal clips fully banned. D11 This is a SPEC §1.4 amendment,
+  not a SPEC-achievement plan.
+
+**Open Phase 0 user actions:** Lightning Studios / Kaggle / Colab / W&B
+account verification; EGDB author email; Guitar-TECHS Zenodo download.
+
+---
+
 ## 2026-05-05 — Project name kept as `tabvision` (not `tabify`)
 
 **Phase:** 0
diff --git a/docs/EVAL_REPORTS/composite_baseline_2026-05-13.md b/docs/EVAL_REPORTS/composite_baseline_2026-05-13.md
new file mode 100644
index 0000000..f700b90
--- /dev/null
+++ b/docs/EVAL_REPORTS/composite_baseline_2026-05-13.md
@@ -0,0 +1,41 @@
+# Composite per-tier baseline
+
+## Coverage
+
+**2 of 4 tiers measured.** Clean acoustic single-line + strummed covered
+via the GuitarSet validation split (held-out player 05, 60 tracks,
+8 715 gold notes). **Clean electric and distorted electric tiers
+pending Guitar-TECHS / EGDB acquisition** per the strategy doc §3.1 and
+Phase 0 implementation plan §3.2 — see the "missing" rows below.
+
+This is the first artifact of `impl/tab-f1-phase-0`. Companion
+6-bucket error decomposition: [`tab_f1_error_decomposition_2026-05-13.md`](tab_f1_error_decomposition_2026-05-13.md).
+
+## Per-tier results
+
+| Tier | Clips | Gold notes | Tab F1 mean | Tab F1 lower-95 | Target | Status | Onset F1 | Pitch F1 |
+|---|---:|---:|---:|---:|---:|---|---:|---:|
+| clean_acoustic_single_line | 30 | 2179 | 0.5076 | 0.4448 | 0.85 | fail | 0.9375 | 0.9304 |
+| clean_acoustic_strummed | 30 | 6536 | 0.6708 | 0.6015 | 0.90 | fail | 0.9229 | 0.9005 |
+| clean_electric | 0 | 0 | — | — | 0.87 | missing | — | — |
+| distorted_electric | 0 | 0 | — | — | 0.80 | missing | — | — |
+
+## Per-source breakdown
+
+| Tier | Source | Clips | Tab F1 mean | Onset F1 mean | Pitch F1 mean |
+|---|---|---:|---:|---:|---:|
+| clean_acoustic_single_line | GuitarSet | 30 | 0.5076 | 0.9375 | 0.9304 |
+| clean_acoustic_strummed | GuitarSet | 30 | 0.6708 | 0.9229 | 0.9005 |
+
+## Methodology
+
+- Manifest: `data/eval/composite.toml`
+- Audio backend: `highres`
+- Position prior: `guitarset-v1`
+- Eval-harness SHA: `9a7e957` (the commit that landed both this baseline
+  artifact and the chord-cluster matcher fix in
+  `tabvision.eval.error_decomposition.decompose_errors`)
+- Onset tolerance: 50 ms
+- Bootstrap: N=10,000, seed=42, 95% percentile interval
+- Acceptance gate: `lower_95_CI >= target` per design plan §5
+
diff --git a/docs/EVAL_REPORTS/tab_f1_error_decomposition_2026-05-13.md b/docs/EVAL_REPORTS/tab_f1_error_decomposition_2026-05-13.md
new file mode 100644
index 0000000..5ba1d8e
--- /dev/null
+++ b/docs/EVAL_REPORTS/tab_f1_error_decomposition_2026-05-13.md
@@ -0,0 +1,45 @@
+# Tab F1 error decomposition
+
+## Diagnostic summary
+
+**The dominant failure bucket on every covered tier is
+`wrong_position_same_pitch`**: the audio stage detected the right pitch
+within onset tolerance, but the system assigned it to the wrong
+(string, fret) position.
+
+| Tier | Loss share — wrong_position_same_pitch |
+|---|---:|
+| clean_acoustic_single_line | **77.5%** (910 / 1174 loss events) |
+| clean_acoustic_strummed | **49.7%** (1548 / 3112 loss events) |
+| Aggregate | **57.3%** (2458 / 4286 loss events) |
+
+This matches the strategy doc §2 diagnostic exactly. The audio side
+is at SPEC (Pitch F1 ≥ 0.90 on both covered tiers); the gap to D2
+per-tier targets is almost entirely string/fret assignment, and it
+gets worse on single-line passages where chord-cluster constraints
+can't help the fusion.
+
+Companion baseline report: [`composite_baseline_2026-05-13.md`](composite_baseline_2026-05-13.md).
+
+This report is a six-bucket port of the apr-28 7-bucket harness; the
+seventh apr-28 bucket (`muted_undetectable`) is deferred until the §8
+`TabEvent` contract carries a muted/X flag.
+
+## Aggregate (all tiers)
+
+| Bucket | Count | Share of loss |
+|---|---:|---:|
+| correct | 4986 | — |
+| wrong_position_same_pitch | 2458 | 57.3% |
+| pitch_off | 505 | 11.8% |
+| timing_only | 94 | 2.2% |
+| missed_onset | 672 | 15.7% |
+| extra_detection | 557 | 13.0% |
+
+## Per-tier breakdown
+
+| Tier | correct | wrong_position_same_pitch | pitch_off | timing_only | missed_onset | extra_detection |
+|---|---:|---:|---:|---:|---:|---:|
+| clean_acoustic_single_line | 1125 | 910 | 19 | 17 | 108 | 120 |
+| clean_acoustic_strummed | 3861 | 1548 | 486 | 77 | 564 | 437 |
+
diff --git a/docs/plans/2026-05-12-tab-f1-to-spec-design.md b/docs/plans/2026-05-12-tab-f1-to-spec-design.md
index ff1569b..78991a3 100644
--- a/docs/plans/2026-05-12-tab-f1-to-spec-design.md
+++ b/docs/plans/2026-05-12-tab-f1-to-spec-design.md
@@ -213,7 +213,7 @@ phase's evidence justifies starting it.
   the composite eval. Acquire Guitar-TECHS; send EGDB email; verify free
   compute accounts. **No production code changes.** Acceptance: per-tier
   baseline numbers exist for ≥ 3 of 4 tiers with bootstrap CIs;
-  per-tier 7-bucket error breakdown exists. [Companion:
+  per-tier six-bucket error breakdown exists. [Companion:
   `2026-05-13-tab-f1-phase-0-implementation.md`.]
 - **Phase 1 — Pitch ceiling lift (cheap moves).** Voicing/silence gate
   + peak-picking + Basic Pitch pitch-only ensemble. Acceptance: Pitch
diff --git a/docs/plans/2026-05-13-tab-f1-phase-0-implementation.md b/docs/plans/2026-05-13-tab-f1-phase-0-implementation.md
index 0a9cd5f..6d6b8cc 100644
--- a/docs/plans/2026-05-13-tab-f1-phase-0-implementation.md
+++ b/docs/plans/2026-05-13-tab-f1-phase-0-implementation.md
@@ -17,7 +17,9 @@ Acceptance, copied from the strategy doc §6:
 
 - Per-tier baseline numbers for ≥ 3 of 4 D2 tiers with **bootstrap
   95% CIs**, on the composite eval set.
-- Per-tier 7-bucket error decomposition on the same set.
+- Per-tier six-bucket error decomposition on the same set
+  (port of the apr-28 7-bucket harness; `muted_undetectable` deferred
+  until the §8 `TabEvent` contract carries a muted/X flag).
 - Free-tier compute accounts (Local / Colab / Kaggle / Lightning / W&B)
   verified.
 - EGDB author email sent; reply tracked in `docs/DECISIONS.md`.
@@ -43,10 +45,10 @@ Acceptance, copied from the strategy doc §6:
 | `tabvision/tests/unit/test_parser_guitarset_jams.py` | JAMS parser round-trip test |
 | `tabvision/tests/unit/test_parser_guitar_techs_midi.py` | MIDI parser round-trip test |
 | `tabvision/tests/unit/test_bootstrap_ci.py` | CI helper correctness on known distributions |
-| `tabvision/tests/unit/test_error_decomposition.py` | 7-bucket assignment correctness on synthetic predicted/gold pairs |
+| `tabvision/tests/unit/test_error_decomposition.py` | Per-bucket assignment correctness on synthetic predicted/gold pairs (six buckets populated) |
 | `tabvision/tests/integration/test_composite_eval_smoke.py` | End-to-end smoke: 5-clip manifest → tier numbers exist + CIs computed |
 | `docs/EVAL_REPORTS/composite_baseline_2026-05-13.md` | First baseline report (output of Phase 0E) |
-| `docs/EVAL_REPORTS/tab_f1_error_decomposition_2026-05-13.md` | First 7-bucket decomposition (output of Phase 0D) |
+| `docs/EVAL_REPORTS/tab_f1_error_decomposition_2026-05-13.md` | First six-bucket decomposition (output of Phase 0D) |
 
 ### 1.2 Modified files
 
@@ -215,8 +217,8 @@ Must contain:
 
 Must contain:
 
-- Aggregate 7-bucket table (counts + share-of-loss).
-- Per-tier 7-bucket table.
+- Aggregate six-bucket table (counts + share-of-loss).
+- Per-tier six-bucket table.
 - A "biggest lever per tier" callout: which bucket dominates each
   tier's loss. Phase 1+ priorities derive from this.
 
diff --git a/tabvision/data/eval/composite.toml b/tabvision/data/eval/composite.toml
new file mode 100644
index 0000000..399c6a6
--- /dev/null
+++ b/tabvision/data/eval/composite.toml
@@ -0,0 +1,542 @@
+# Composite-eval manifest generated by tabvision/scripts/eval/build_composite_manifest.py.
+# Re-generate with the same args to refresh; this file is intended to be auto-managed.
+
+[[clips]]
+id = "guitarset/05_BN1-129-Eb_comp"
+tier = "clean_acoustic_strummed"
+source = "GuitarSet"
+split = "validation"
+media_path = "$TABVISION_DATA_ROOT/guitarset/audio_mono-mic/05_BN1-129-Eb_comp_mic.wav"
+annotation_path = "$TABVISION_DATA_ROOT/guitarset/annotation/05_BN1-129-Eb_comp.jams"
+annotation_format = "guitarset_jams"
+
+[[clips]]
+id = "guitarset/05_BN1-129-Eb_solo"
+tier = "clean_acoustic_single_line"
+source = "GuitarSet"
+split = "validation"
+media_path = "$TABVISION_DATA_ROOT/guitarset/audio_mono-mic/05_BN1-129-Eb_solo_mic.wav"
+annotation_path = "$TABVISION_DATA_ROOT/guitarset/annotation/05_BN1-129-Eb_solo.jams"
+annotation_format = "guitarset_jams"
+
+[[clips]]
+id = "guitarset/05_BN1-147-Gb_comp"
+tier = "clean_acoustic_strummed"
+source = "GuitarSet"
+split = "validation"
+media_path = "$TABVISION_DATA_ROOT/guitarset/audio_mono-mic/05_BN1-147-Gb_comp_mic.wav"
+annotation_path = "$TABVISION_DATA_ROOT/guitarset/annotation/05_BN1-147-Gb_comp.jams"
+annotation_format = "guitarset_jams"
+
+[[clips]]
+id = "guitarset/05_BN1-147-Gb_solo"
+tier = "clean_acoustic_single_line"
+source = "GuitarSet"
+split = "validation"
+media_path = "$TABVISION_DATA_ROOT/guitarset/audio_mono-mic/05_BN1-147-Gb_solo_mic.wav"
+annotation_path = "$TABVISION_DATA_ROOT/guitarset/annotation/05_BN1-147-Gb_solo.jams"
+annotation_format = "guitarset_jams"
+
+[[clips]]
+id = "guitarset/05_BN2-131-B_comp"
+tier = "clean_acoustic_strummed"
+source = "GuitarSet"
+split = "validation"
+media_path = "$TABVISION_DATA_ROOT/guitarset/audio_mono-mic/05_BN2-131-B_comp_mic.wav"
+annotation_path = "$TABVISION_DATA_ROOT/guitarset/annotation/05_BN2-131-B_comp.jams"
+annotation_format = "guitarset_jams"
+
+[[clips]]
+id = "guitarset/05_BN2-131-B_solo"
+tier = "clean_acoustic_single_line"
+source = "GuitarSet"
+split = "validation"
+media_path = "$TABVISION_DATA_ROOT/guitarset/audio_mono-mic/05_BN2-131-B_solo_mic.wav"
+annotation_path = "$TABVISION_DATA_ROOT/guitarset/annotation/05_BN2-131-B_solo.jams"
+annotation_format = "guitarset_jams"
+
+[[clips]]
+id = "guitarset/05_BN2-166-Ab_comp"
+tier = "clean_acoustic_strummed"
+source = "GuitarSet"
+split = "validation"
+media_path = "$TABVISION_DATA_ROOT/guitarset/audio_mono-mic/05_BN2-166-Ab_comp_mic.wav"
+annotation_path = "$TABVISION_DATA_ROOT/guitarset/annotation/05_BN2-166-Ab_comp.jams"
+annotation_format = "guitarset_jams"
+
+[[clips]]
+id = "guitarset/05_BN2-166-Ab_solo"
+tier = "clean_acoustic_single_line"
+source = "GuitarSet"
+split = "validation"
+media_path = "$TABVISION_DATA_ROOT/guitarset/audio_mono-mic/05_BN2-166-Ab_solo_mic.wav"
+annotation_path = "$TABVISION_DATA_ROOT/guitarset/annotation/05_BN2-166-Ab_solo.jams"
+annotation_format = "guitarset_jams"
+
+[[clips]]
+id = "guitarset/05_BN3-119-G_comp"
+tier = "clean_acoustic_strummed"
+source = "GuitarSet"
+split = "validation"
+media_path = "$TABVISION_DATA_ROOT/guitarset/audio_mono-mic/05_BN3-119-G_comp_mic.wav"
+annotation_path = "$TABVISION_DATA_ROOT/guitarset/annotation/05_BN3-119-G_comp.jams"
+annotation_format = "guitarset_jams"
+
+[[clips]]
+id = "guitarset/05_BN3-119-G_solo"
+tier = "clean_acoustic_single_line"
+source = "GuitarSet"
+split = "validation"
+media_path = "$TABVISION_DATA_ROOT/guitarset/audio_mono-mic/05_BN3-119-G_solo_mic.wav"
+annotation_path = "$TABVISION_DATA_ROOT/guitarset/annotation/05_BN3-119-G_solo.jams"
+annotation_format = "guitarset_jams"
+
+[[clips]]
+id = "guitarset/05_BN3-154-E_comp"
+tier = "clean_acoustic_strummed"
+source = "GuitarSet"
+split = "validation"
+media_path = "$TABVISION_DATA_ROOT/guitarset/audio_mono-mic/05_BN3-154-E_comp_mic.wav"
+annotation_path = "$TABVISION_DATA_ROOT/guitarset/annotation/05_BN3-154-E_comp.jams"
+annotation_format = "guitarset_jams"
+
+[[clips]]
+id = "guitarset/05_BN3-154-E_solo"
+tier = "clean_acoustic_single_line"
+source = "GuitarSet"
+split = "validation"
+media_path = "$TABVISION_DATA_ROOT/guitarset/audio_mono-mic/05_BN3-154-E_solo_mic.wav"
+annotation_path = "$TABVISION_DATA_ROOT/guitarset/annotation/05_BN3-154-E_solo.jams"
+annotation_format = "guitarset_jams"
+
+[[clips]]
+id = "guitarset/05_Funk1-114-Ab_comp"
+tier = "clean_acoustic_strummed"
+source = "GuitarSet"
+split = "validation"
+media_path = "$TABVISION_DATA_ROOT/guitarset/audio_mono-mic/05_Funk1-114-Ab_comp_mic.wav"
+annotation_path = "$TABVISION_DATA_ROOT/guitarset/annotation/05_Funk1-114-Ab_comp.jams"
+annotation_format = "guitarset_jams"
+
+[[clips]]
+id = "guitarset/05_Funk1-114-Ab_solo"
+tier = "clean_acoustic_single_line"
+source = "GuitarSet"
+split = "validation"
+media_path = "$TABVISION_DATA_ROOT/guitarset/audio_mono-mic/05_Funk1-114-Ab_solo_mic.wav"
+annotation_path = "$TABVISION_DATA_ROOT/guitarset/annotation/05_Funk1-114-Ab_solo.jams"
+annotation_format = "guitarset_jams"
+
+[[clips]]
+id = "guitarset/05_Funk1-97-C_comp"
+tier = "clean_acoustic_strummed"
+source = "GuitarSet"
+split = "validation"
+media_path = "$TABVISION_DATA_ROOT/guitarset/audio_mono-mic/05_Funk1-97-C_comp_mic.wav"
+annotation_path = "$TABVISION_DATA_ROOT/guitarset/annotation/05_Funk1-97-C_comp.jams"
+annotation_format = "guitarset_jams"
+
+[[clips]]
+id = "guitarset/05_Funk1-97-C_solo"
+tier = "clean_acoustic_single_line"
+source = "GuitarSet"
+split = "validation"
+media_path = "$TABVISION_DATA_ROOT/guitarset/audio_mono-mic/05_Funk1-97-C_solo_mic.wav"
+annotation_path = "$TABVISION_DATA_ROOT/guitarset/annotation/05_Funk1-97-C_solo.jams"
+annotation_format = "guitarset_jams"
+
+[[clips]]
+id = "guitarset/05_Funk2-108-Eb_comp"
+tier = "clean_acoustic_strummed"
+source = "GuitarSet"
+split = "validation"
+media_path = "$TABVISION_DATA_ROOT/guitarset/audio_mono-mic/05_Funk2-108-Eb_comp_mic.wav"
+annotation_path = "$TABVISION_DATA_ROOT/guitarset/annotation/05_Funk2-108-Eb_comp.jams"
+annotation_format = "guitarset_jams"
+
+[[clips]]
+id = "guitarset/05_Funk2-108-Eb_solo"
+tier = "clean_acoustic_single_line"
+source = "GuitarSet"
+split = "validation"
+media_path = "$TABVISION_DATA_ROOT/guitarset/audio_mono-mic/05_Funk2-108-Eb_solo_mic.wav"
+annotation_path = "$TABVISION_DATA_ROOT/guitarset/annotation/05_Funk2-108-Eb_solo.jams"
+annotation_format = "guitarset_jams"
+
+[[clips]]
+id = "guitarset/05_Funk2-119-G_comp"
+tier = "clean_acoustic_strummed"
+source = "GuitarSet"
+split = "validation"
+media_path = "$TABVISION_DATA_ROOT/guitarset/audio_mono-mic/05_Funk2-119-G_comp_mic.wav"
+annotation_path = "$TABVISION_DATA_ROOT/guitarset/annotation/05_Funk2-119-G_comp.jams"
+annotation_format = "guitarset_jams"
+
+[[clips]]
+id = "guitarset/05_Funk2-119-G_solo"
+tier = "clean_acoustic_single_line"
+source = "GuitarSet"
+split = "validation"
+media_path = "$TABVISION_DATA_ROOT/guitarset/audio_mono-mic/05_Funk2-119-G_solo_mic.wav"
+annotation_path = "$TABVISION_DATA_ROOT/guitarset/annotation/05_Funk2-119-G_solo.jams"
+annotation_format = "guitarset_jams"
+
+[[clips]]
+id = "guitarset/05_Funk3-112-C#_comp"
+tier = "clean_acoustic_strummed"
+source = "GuitarSet"
+split = "validation"
+media_path = "$TABVISION_DATA_ROOT/guitarset/audio_mono-mic/05_Funk3-112-C#_comp_mic.wav"
+annotation_path = "$TABVISION_DATA_ROOT/guitarset/annotation/05_Funk3-112-C#_comp.jams"
+annotation_format = "guitarset_jams"
+
+[[clips]]
+id = "guitarset/05_Funk3-112-C#_solo"
+tier = "clean_acoustic_single_line"
+source = "GuitarSet"
+split = "validation"
+media_path = "$TABVISION_DATA_ROOT/guitarset/audio_mono-mic/05_Funk3-112-C#_solo_mic.wav"
+annotation_path = "$TABVISION_DATA_ROOT/guitarset/annotation/05_Funk3-112-C#_solo.jams"
+annotation_format = "guitarset_jams"
+
+[[clips]]
+id = "guitarset/05_Funk3-98-A_comp"
+tier = "clean_acoustic_strummed"
+source = "GuitarSet"
+split = "validation"
+media_path = "$TABVISION_DATA_ROOT/guitarset/audio_mono-mic/05_Funk3-98-A_comp_mic.wav"
+annotation_path = "$TABVISION_DATA_ROOT/guitarset/annotation/05_Funk3-98-A_comp.jams"
+annotation_format = "guitarset_jams"
+
+[[clips]]
+id = "guitarset/05_Funk3-98-A_solo"
+tier = "clean_acoustic_single_line"
+source = "GuitarSet"
+split = "validation"
+media_path = "$TABVISION_DATA_ROOT/guitarset/audio_mono-mic/05_Funk3-98-A_solo_mic.wav"
+annotation_path = "$TABVISION_DATA_ROOT/guitarset/annotation/05_Funk3-98-A_solo.jams"
+annotation_format = "guitarset_jams"
+
+[[clips]]
+id = "guitarset/05_Jazz1-130-D_comp"
+tier = "clean_acoustic_strummed"
+source = "GuitarSet"
+split = "validation"
+media_path = "$TABVISION_DATA_ROOT/guitarset/audio_mono-mic/05_Jazz1-130-D_comp_mic.wav"
+annotation_path = "$TABVISION_DATA_ROOT/guitarset/annotation/05_Jazz1-130-D_comp.jams"
+annotation_format = "guitarset_jams"
+
+[[clips]]
+id = "guitarset/05_Jazz1-130-D_solo"
+tier = "clean_acoustic_single_line"
+source = "GuitarSet"
+split = "validation"
+media_path = "$TABVISION_DATA_ROOT/guitarset/audio_mono-mic/05_Jazz1-130-D_solo_mic.wav"
+annotation_path = "$TABVISION_DATA_ROOT/guitarset/annotation/05_Jazz1-130-D_solo.jams"
+annotation_format = "guitarset_jams"
+
+[[clips]]
+id = "guitarset/05_Jazz1-200-B_comp"
+tier = "clean_acoustic_strummed"
+source = "GuitarSet"
+split = "validation"
+media_path = "$TABVISION_DATA_ROOT/guitarset/audio_mono-mic/05_Jazz1-200-B_comp_mic.wav"
+annotation_path = "$TABVISION_DATA_ROOT/guitarset/annotation/05_Jazz1-200-B_comp.jams"
+annotation_format = "guitarset_jams"
+
+[[clips]]
+id = "guitarset/05_Jazz1-200-B_solo"
+tier = "clean_acoustic_single_line"
+source = "GuitarSet"
+split = "validation"
+media_path = "$TABVISION_DATA_ROOT/guitarset/audio_mono-mic/05_Jazz1-200-B_solo_mic.wav"
+annotation_path = "$TABVISION_DATA_ROOT/guitarset/annotation/05_Jazz1-200-B_solo.jams"
+annotation_format = "guitarset_jams"
+
+[[clips]]
+id = "guitarset/05_Jazz2-110-Bb_comp"
+tier = "clean_acoustic_strummed"
+source = "GuitarSet"
+split = "validation"
+media_path = "$TABVISION_DATA_ROOT/guitarset/audio_mono-mic/05_Jazz2-110-Bb_comp_mic.wav"
+annotation_path = "$TABVISION_DATA_ROOT/guitarset/annotation/05_Jazz2-110-Bb_comp.jams"
+annotation_format = "guitarset_jams"
+
+[[clips]]
+id = "guitarset/05_Jazz2-110-Bb_solo"
+tier = "clean_acoustic_single_line"
+source = "GuitarSet"
+split = "validation"
+media_path = "$TABVISION_DATA_ROOT/guitarset/audio_mono-mic/05_Jazz2-110-Bb_solo_mic.wav"
+annotation_path = "$TABVISION_DATA_ROOT/guitarset/annotation/05_Jazz2-110-Bb_solo.jams"
+annotation_format = "guitarset_jams"
+
+[[clips]]
+id = "guitarset/05_Jazz2-187-F#_comp"
+tier = "clean_acoustic_strummed"
+source = "GuitarSet"
+split = "validation"
+media_path = "$TABVISION_DATA_ROOT/guitarset/audio_mono-mic/05_Jazz2-187-F#_comp_mic.wav"
+annotation_path = "$TABVISION_DATA_ROOT/guitarset/annotation/05_Jazz2-187-F#_comp.jams"
+annotation_format = "guitarset_jams"
+
+[[clips]]
+id = "guitarset/05_Jazz2-187-F#_solo"
+tier = "clean_acoustic_single_line"
+source = "GuitarSet"
+split = "validation"
+media_path = "$TABVISION_DATA_ROOT/guitarset/audio_mono-mic/05_Jazz2-187-F#_solo_mic.wav"
+annotation_path = "$TABVISION_DATA_ROOT/guitarset/annotation/05_Jazz2-187-F#_solo.jams"
+annotation_format = "guitarset_jams"
+
+[[clips]]
+id = "guitarset/05_Jazz3-137-Eb_comp"
+tier = "clean_acoustic_strummed"
+source = "GuitarSet"
+split = "validation"
+media_path = "$TABVISION_DATA_ROOT/guitarset/audio_mono-mic/05_Jazz3-137-Eb_comp_mic.wav"
+annotation_path = "$TABVISION_DATA_ROOT/guitarset/annotation/05_Jazz3-137-Eb_comp.jams"
+annotation_format = "guitarset_jams"
+
+[[clips]]
+id = "guitarset/05_Jazz3-137-Eb_solo"
+tier = "clean_acoustic_single_line"
+source = "GuitarSet"
+split = "validation"
+media_path = "$TABVISION_DATA_ROOT/guitarset/audio_mono-mic/05_Jazz3-137-Eb_solo_mic.wav"
+annotation_path = "$TABVISION_DATA_ROOT/guitarset/annotation/05_Jazz3-137-Eb_solo.jams"
+annotation_format = "guitarset_jams"
+
+[[clips]]
+id = "guitarset/05_Jazz3-150-C_comp"
+tier = "clean_acoustic_strummed"
+source = "GuitarSet"
+split = "validation"
+media_path = "$TABVISION_DATA_ROOT/guitarset/audio_mono-mic/05_Jazz3-150-C_comp_mic.wav"
+annotation_path = "$TABVISION_DATA_ROOT/guitarset/annotation/05_Jazz3-150-C_comp.jams"
+annotation_format = "guitarset_jams"
+
+[[clips]]
+id = "guitarset/05_Jazz3-150-C_solo"
+tier = "clean_acoustic_single_line"
+source = "GuitarSet"
+split = "validation"
+media_path = "$TABVISION_DATA_ROOT/guitarset/audio_mono-mic/05_Jazz3-150-C_solo_mic.wav"
+annotation_path = "$TABVISION_DATA_ROOT/guitarset/annotation/05_Jazz3-150-C_solo.jams"
+annotation_format = "guitarset_jams"
+
+[[clips]]
+id = "guitarset/05_Rock1-130-A_comp"
+tier = "clean_acoustic_strummed"
+source = "GuitarSet"
+split = "validation"
+media_path = "$TABVISION_DATA_ROOT/guitarset/audio_mono-mic/05_Rock1-130-A_comp_mic.wav"
+annotation_path = "$TABVISION_DATA_ROOT/guitarset/annotation/05_Rock1-130-A_comp.jams"
+annotation_format = "guitarset_jams"
+
+[[clips]]
+id = "guitarset/05_Rock1-130-A_solo"
+tier = "clean_acoustic_single_line"
+source = "GuitarSet"
+split = "validation"
+media_path = "$TABVISION_DATA_ROOT/guitarset/audio_mono-mic/05_Rock1-130-A_solo_mic.wav"
+annotation_path = "$TABVISION_DATA_ROOT/guitarset/annotation/05_Rock1-130-A_solo.jams"
+annotation_format = "guitarset_jams"
+
+[[clips]]
+id = "guitarset/05_Rock1-90-C#_comp"
+tier = "clean_acoustic_strummed"
+source = "GuitarSet"
+split = "validation"
+media_path = "$TABVISION_DATA_ROOT/guitarset/audio_mono-mic/05_Rock1-90-C#_comp_mic.wav"
+annotation_path = "$TABVISION_DATA_ROOT/guitarset/annotation/05_Rock1-90-C#_comp.jams"
+annotation_format = "guitarset_jams"
+
+[[clips]]
+id = "guitarset/05_Rock1-90-C#_solo"
+tier = "clean_acoustic_single_line"
+source = "GuitarSet"
+split = "validation"
+media_path = "$TABVISION_DATA_ROOT/guitarset/audio_mono-mic/05_Rock1-90-C#_solo_mic.wav"
+annotation_path = "$TABVISION_DATA_ROOT/guitarset/annotation/05_Rock1-90-C#_solo.jams"
+annotation_format = "guitarset_jams"
+
+[[clips]]
+id = "guitarset/05_Rock2-142-D_comp"
+tier = "clean_acoustic_strummed"
+source = "GuitarSet"
+split = "validation"
+media_path = "$TABVISION_DATA_ROOT/guitarset/audio_mono-mic/05_Rock2-142-D_comp_mic.wav"
+annotation_path = "$TABVISION_DATA_ROOT/guitarset/annotation/05_Rock2-142-D_comp.jams"
+annotation_format = "guitarset_jams"
+
+[[clips]]
+id = "guitarset/05_Rock2-142-D_solo"
+tier = "clean_acoustic_single_line"
+source = "GuitarSet"
+split = "validation"
+media_path = "$TABVISION_DATA_ROOT/guitarset/audio_mono-mic/05_Rock2-142-D_solo_mic.wav"
+annotation_path = "$TABVISION_DATA_ROOT/guitarset/annotation/05_Rock2-142-D_solo.jams"
+annotation_format = "guitarset_jams"
+
+[[clips]]
+id = "guitarset/05_Rock2-85-F_comp"
+tier = "clean_acoustic_strummed"
+source = "GuitarSet"
+split = "validation"
+media_path = "$TABVISION_DATA_ROOT/guitarset/audio_mono-mic/05_Rock2-85-F_comp_mic.wav"
+annotation_path = "$TABVISION_DATA_ROOT/guitarset/annotation/05_Rock2-85-F_comp.jams"
+annotation_format = "guitarset_jams"
+
+[[clips]]
+id = "guitarset/05_Rock2-85-F_solo"
+tier = "clean_acoustic_single_line"
+source = "GuitarSet"
+split = "validation"
+media_path = "$TABVISION_DATA_ROOT/guitarset/audio_mono-mic/05_Rock2-85-F_solo_mic.wav"
+annotation_path = "$TABVISION_DATA_ROOT/guitarset/annotation/05_Rock2-85-F_solo.jams"
+annotation_format = "guitarset_jams"
+
+[[clips]]
+id = "guitarset/05_Rock3-117-Bb_comp"
+tier = "clean_acoustic_strummed"
+source = "GuitarSet"
+split = "validation"
+media_path = "$TABVISION_DATA_ROOT/guitarset/audio_mono-mic/05_Rock3-117-Bb_comp_mic.wav"
+annotation_path = "$TABVISION_DATA_ROOT/guitarset/annotation/05_Rock3-117-Bb_comp.jams"
+annotation_format = "guitarset_jams"
+
+[[clips]]
+id = "guitarset/05_Rock3-117-Bb_solo"
+tier = "clean_acoustic_single_line"
+source = "GuitarSet"
+split = "validation"
+media_path = "$TABVISION_DATA_ROOT/guitarset/audio_mono-mic/05_Rock3-117-Bb_solo_mic.wav"
+annotation_path = "$TABVISION_DATA_ROOT/guitarset/annotation/05_Rock3-117-Bb_solo.jams"
+annotation_format = "guitarset_jams"
+
+[[clips]]
+id = "guitarset/05_Rock3-148-C_comp"
+tier = "clean_acoustic_strummed"
+source = "GuitarSet"
+split = "validation"
+media_path = "$TABVISION_DATA_ROOT/guitarset/audio_mono-mic/05_Rock3-148-C_comp_mic.wav"
+annotation_path = "$TABVISION_DATA_ROOT/guitarset/annotation/05_Rock3-148-C_comp.jams"
+annotation_format = "guitarset_jams"
+
+[[clips]]
+id = "guitarset/05_Rock3-148-C_solo"
+tier = "clean_acoustic_single_line"
+source = "GuitarSet"
+split = "validation"
+media_path = "$TABVISION_DATA_ROOT/guitarset/audio_mono-mic/05_Rock3-148-C_solo_mic.wav"
+annotation_path = "$TABVISION_DATA_ROOT/guitarset/annotation/05_Rock3-148-C_solo.jams"
+annotation_format = "guitarset_jams"
+
+[[clips]]
+id = "guitarset/05_SS1-100-C#_comp"
+tier = "clean_acoustic_strummed"
+source = "GuitarSet"
+split = "validation"
+media_path = "$TABVISION_DATA_ROOT/guitarset/audio_mono-mic/05_SS1-100-C#_comp_mic.wav"
+annotation_path = "$TABVISION_DATA_ROOT/guitarset/annotation/05_SS1-100-C#_comp.jams"
+annotation_format = "guitarset_jams"
+
+[[clips]]
+id = "guitarset/05_SS1-100-C#_solo"
+tier = "clean_acoustic_single_line"
+source = "GuitarSet"
+split = "validation"
+media_path = "$TABVISION_DATA_ROOT/guitarset/audio_mono-mic/05_SS1-100-C#_solo_mic.wav"
+annotation_path = "$TABVISION_DATA_ROOT/guitarset/annotation/05_SS1-100-C#_solo.jams"
+annotation_format = "guitarset_jams"
+
+[[clips]]
+id = "guitarset/05_SS1-68-E_comp"
+tier = "clean_acoustic_strummed"
+source = "GuitarSet"
+split = "validation"
+media_path = "$TABVISION_DATA_ROOT/guitarset/audio_mono-mic/05_SS1-68-E_comp_mic.wav"
+annotation_path = "$TABVISION_DATA_ROOT/guitarset/annotation/05_SS1-68-E_comp.jams"
+annotation_format = "guitarset_jams"
+
+[[clips]]
+id = "guitarset/05_SS1-68-E_solo"
+tier = "clean_acoustic_single_line"
+source = "GuitarSet"
+split = "validation"
+media_path = "$TABVISION_DATA_ROOT/guitarset/audio_mono-mic/05_SS1-68-E_solo_mic.wav"
+annotation_path = "$TABVISION_DATA_ROOT/guitarset/annotation/05_SS1-68-E_solo.jams"
+annotation_format = "guitarset_jams"
+
+[[clips]]
+id = "guitarset/05_SS2-107-Ab_comp"
+tier = "clean_acoustic_strummed"
+source = "GuitarSet"
+split = "validation"
+media_path = "$TABVISION_DATA_ROOT/guitarset/audio_mono-mic/05_SS2-107-Ab_comp_mic.wav"
+annotation_path = "$TABVISION_DATA_ROOT/guitarset/annotation/05_SS2-107-Ab_comp.jams"
+annotation_format = "guitarset_jams"
+
+[[clips]]
+id = "guitarset/05_SS2-107-Ab_solo"
+tier = "clean_acoustic_single_line"
+source = "GuitarSet"
+split = "validation"
+media_path = "$TABVISION_DATA_ROOT/guitarset/audio_mono-mic/05_SS2-107-Ab_solo_mic.wav"
+annotation_path = "$TABVISION_DATA_ROOT/guitarset/annotation/05_SS2-107-Ab_solo.jams"
+annotation_format = "guitarset_jams"
+
+[[clips]]
+id = "guitarset/05_SS2-88-F_comp"
+tier = "clean_acoustic_strummed"
+source = "GuitarSet"
+split = "validation"
+media_path = "$TABVISION_DATA_ROOT/guitarset/audio_mono-mic/05_SS2-88-F_comp_mic.wav"
+annotation_path = "$TABVISION_DATA_ROOT/guitarset/annotation/05_SS2-88-F_comp.jams"
+annotation_format = "guitarset_jams"
+
+[[clips]]
+id = "guitarset/05_SS2-88-F_solo"
+tier = "clean_acoustic_single_line"
+source = "GuitarSet"
+split = "validation"
+media_path = "$TABVISION_DATA_ROOT/guitarset/audio_mono-mic/05_SS2-88-F_solo_mic.wav"
+annotation_path = "$TABVISION_DATA_ROOT/guitarset/annotation/05_SS2-88-F_solo.jams"
+annotation_format = "guitarset_jams"
+
+[[clips]]
+id = "guitarset/05_SS3-84-Bb_comp"
+tier = "clean_acoustic_strummed"
+source = "GuitarSet"
+split = "validation"
+media_path = "$TABVISION_DATA_ROOT/guitarset/audio_mono-mic/05_SS3-84-Bb_comp_mic.wav"
+annotation_path = "$TABVISION_DATA_ROOT/guitarset/annotation/05_SS3-84-Bb_comp.jams"
+annotation_format = "guitarset_jams"
+
+[[clips]]
+id = "guitarset/05_SS3-84-Bb_solo"
+tier = "clean_acoustic_single_line"
+source = "GuitarSet"
+split = "validation"
+media_path = "$TABVISION_DATA_ROOT/guitarset/audio_mono-mic/05_SS3-84-Bb_solo_mic.wav"
+annotation_path = "$TABVISION_DATA_ROOT/guitarset/annotation/05_SS3-84-Bb_solo.jams"
+annotation_format = "guitarset_jams"
+
+[[clips]]
+id = "guitarset/05_SS3-98-C_comp"
+tier = "clean_acoustic_strummed"
+source = "GuitarSet"
+split = "validation"
+media_path = "$TABVISION_DATA_ROOT/guitarset/audio_mono-mic/05_SS3-98-C_comp_mic.wav"
+annotation_path = "$TABVISION_DATA_ROOT/guitarset/annotation/05_SS3-98-C_comp.jams"
+annotation_format = "guitarset_jams"
+
+[[clips]]
+id = "guitarset/05_SS3-98-C_solo"
+tier = "clean_acoustic_single_line"
+source = "GuitarSet"
+split = "validation"
+media_path = "$TABVISION_DATA_ROOT/guitarset/audio_mono-mic/05_SS3-98-C_solo_mic.wav"
+annotation_path = "$TABVISION_DATA_ROOT/guitarset/annotation/05_SS3-98-C_solo.jams"
+annotation_format = "guitarset_jams"
diff --git a/tabvision/data/eval/manifest.toml b/tabvision/data/eval/manifest.toml
index fc5b65c..3654685 100644
--- a/tabvision/data/eval/manifest.toml
+++ b/tabvision/data/eval/manifest.toml
@@ -17,3 +17,12 @@
 # split = "validation"
 # media_path = "$TABVISION_DATA_ROOT/guitarset/audio_mono-mic/05_example_mic.wav"
 # annotation_path = "$TABVISION_DATA_ROOT/guitarset/annotation/05_example.jams"
+# annotation_format = "guitarset_jams"
+#
+# `annotation_format` selects the parser registered in
+# tabvision.eval.parsers (Phase 0). Known formats: guitarset_jams,
+# guitar_techs_midi. Forthcoming: egdb_gp (license-pending).
+#
+# Synthetic-source clips (source = "synthtab/...", "dadagp/...",
+# "synthetic/...") are restricted to split = "train". The validator
+# rejects them in validation/test splits — see design plan §5 / R8.
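Reviewer sketch (not part of the patch): the synthetic-source rule described in the comment above could be checked roughly as follows. The real check lives in `tabvision.eval.manifest`, which this diff does not show, so the function name `synthetic_split_violations` and the prefix-matching behaviour are assumptions made for illustration only.

```python
# Hypothetical helper; assumes synthetic sources are identified by prefix.
SYNTHETIC_PREFIXES = ("synthtab/", "dadagp/", "synthetic/")


def synthetic_split_violations(clips):
    """Return ids of synthetic-source clips placed outside the train split."""
    return [
        clip["id"]
        for clip in clips
        if clip["source"].startswith(SYNTHETIC_PREFIXES)
        and clip["split"] != "train"
    ]


# Illustrative manifest entries (made up, not taken from the real manifest).
example_clips = [
    {"id": "guitarset/a", "source": "GuitarSet", "split": "validation"},
    {"id": "synthtab/a", "source": "synthtab/v1", "split": "validation"},
    {"id": "dadagp/b", "source": "dadagp/metal", "split": "train"},
]
violations = synthetic_split_violations(example_clips)
```

Only the second entry violates the rule, because it is synthetic and sits in a validation split.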
diff --git a/tabvision/scripts/eval/build_composite_manifest.py b/tabvision/scripts/eval/build_composite_manifest.py
new file mode 100644
index 0000000..9b47f44
--- /dev/null
+++ b/tabvision/scripts/eval/build_composite_manifest.py
@@ -0,0 +1,10 @@
+"""CLI wrapper for the composite-eval manifest builder.
+
+See ``docs/plans/2026-05-13-tab-f1-phase-0-implementation.md`` §3.3 for
+the canonical invocation.
+"""
+
+from tabvision.eval.manifest_builder import main
+
+if __name__ == "__main__":
+    raise SystemExit(main())
diff --git a/tabvision/scripts/eval/composite_eval.py b/tabvision/scripts/eval/composite_eval.py
new file mode 100644
index 0000000..90d2fd9
--- /dev/null
+++ b/tabvision/scripts/eval/composite_eval.py
@@ -0,0 +1,10 @@
+"""CLI wrapper for the v1 composite per-tier eval.
+
+See ``docs/plans/2026-05-13-tab-f1-phase-0-implementation.md`` §3.4 for
+the canonical invocation.
+"""
+
+from tabvision.eval.composite import main
+
+if __name__ == "__main__":
+    raise SystemExit(main())
diff --git a/tabvision/tabvision/eval/bootstrap.py b/tabvision/tabvision/eval/bootstrap.py
new file mode 100644
index 0000000..e3379e9
--- /dev/null
+++ b/tabvision/tabvision/eval/bootstrap.py
@@ -0,0 +1,112 @@
+"""Bootstrap confidence intervals for per-tier acceptance gates.
+
+The 2026-05-12 design plan (§5) requires every per-tier Tab F1 number
+to be reported with a 95% bootstrap CI, and the acceptance gate is
+``lower_95_CI >= target`` — not just ``mean >= target``. This module
+provides that primitive.
+
+Resamples observations (typically per-clip Tab F1 values) with
+replacement, applies a user-supplied statistic to each resample, and
+returns the original-sample statistic plus the equal-tailed percentile
+interval over the bootstrap distribution.
+"""
+
+from __future__ import annotations
+
+from collections.abc import Callable, Sequence
+from dataclasses import dataclass
+
+import numpy as np
+
+
+@dataclass(frozen=True)
+class BootstrapResult:
+    """Bootstrap statistic + equal-tailed percentile confidence interval.
+
+    ``lower`` and ``upper`` are the ``(1-confidence)/2`` and
+    ``(1+confidence)/2`` quantiles of the bootstrap distribution.
+    For a single observation, ``statistic == lower == upper`` and
+    ``n_bootstrap`` is ``0`` (no resampling performed).
+    """
+
+    statistic: float
+    lower: float
+    upper: float
+    n_observations: int
+    n_bootstrap: int
+    confidence: float
+
+
+def bootstrap_ci(
+    values: Sequence[float] | np.ndarray,
+    *,
+    statistic: Callable[[np.ndarray], float] | None = None,
+    n_bootstrap: int = 10_000,
+    confidence: float = 0.95,
+    seed: int = 42,
+) -> BootstrapResult:
+    """Bootstrap a confidence interval over ``values``.
+
+    ``statistic`` defaults to ``numpy.mean``. Pass a different callable
+    (e.g. ``numpy.median``) for other functionals. The callable receives
+    a 1-D ``numpy.ndarray`` of float64 values.
+
+    ``seed`` is the integer seed for ``numpy.random.default_rng``;
+    calling with the same seed + values produces identical output.
+    """
+    if len(values) == 0:
+        raise ValueError("bootstrap_ci requires at least one observation")
+    if not 0.0 < confidence < 1.0:
+        raise ValueError(
+            f"confidence must be in (0, 1); got {confidence}"
+        )
+    if n_bootstrap < 1:
+        raise ValueError(f"n_bootstrap must be >= 1; got {n_bootstrap}")
+
+    stat_fn: Callable[[np.ndarray], float] = (
+        statistic if statistic is not None else np.mean
+    )
+    arr = np.asarray(values, dtype=np.float64).ravel()
+    n_obs = arr.shape[0]
+    point = float(stat_fn(arr))
+
+    if n_obs == 1:
+        return BootstrapResult(
+            statistic=point,
+            lower=point,
+            upper=point,
+            n_observations=1,
+            n_bootstrap=0,
+            confidence=confidence,
+        )
+
+    rng = np.random.default_rng(seed)
+    indices = rng.integers(0, n_obs, size=(n_bootstrap, n_obs))
+    resamples = arr[indices]  # shape (n_bootstrap, n_obs)
+
+    if statistic is None or statistic is np.mean:
+        # Fast path: vectorized mean over rows.
+        dist = resamples.mean(axis=1)
+    else:
+        # General path: apply user statistic per resample.
+        dist = np.fromiter(
+            (float(stat_fn(resamples[i])) for i in range(n_bootstrap)),
+            dtype=np.float64,
+            count=n_bootstrap,
+        )
+
+    alpha = (1.0 - confidence) / 2.0
+    lower = float(np.quantile(dist, alpha))
+    upper = float(np.quantile(dist, 1.0 - alpha))
+
+    return BootstrapResult(
+        statistic=point,
+        lower=lower,
+        upper=upper,
+        n_observations=n_obs,
+        n_bootstrap=n_bootstrap,
+        confidence=confidence,
+    )
+
+
+__all__ = ["BootstrapResult", "bootstrap_ci"]
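Reviewer sketch (not part of the patch): a stdlib-only illustration of the percentile bootstrap and the `lower_95_CI >= target` gate that `bootstrap.py` implements with NumPy. The function name `percentile_bootstrap`, the index-based quantile, and the sample F1 values are assumptions for illustration. Numerical results will not match `bootstrap_ci`, which uses `numpy.random.default_rng` and interpolating quantiles.

```python
import random
import statistics


def percentile_bootstrap(values, n_bootstrap=2000, confidence=0.95, seed=42):
    """Equal-tailed percentile interval for the mean (stdlib-only sketch)."""
    if not values:
        raise ValueError("need at least one observation")
    rng = random.Random(seed)
    n = len(values)
    # Resample with replacement and record the mean of each resample.
    means = sorted(
        statistics.fmean(rng.choices(values, k=n)) for _ in range(n_bootstrap)
    )
    alpha = (1.0 - confidence) / 2.0
    # Simple index-based quantiles; numpy.quantile interpolates instead.
    lo_idx = int(alpha * (n_bootstrap - 1))
    hi_idx = int((1.0 - alpha) * (n_bootstrap - 1))
    return statistics.fmean(values), means[lo_idx], means[hi_idx]


# Made-up per-clip Tab F1 values, not measured results.
per_clip_f1 = [0.91, 0.84, 0.88, 0.79, 0.93, 0.86]
mean, lower, upper = percentile_bootstrap(per_clip_f1)

target = 0.85
passes_gate = lower >= target  # gate uses the lower bound, not the mean
```

Gating on the lower bound rather than the mean means a tier with few clips and high variance has to clear the target by a margin before it counts as passing.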
diff --git a/tabvision/tabvision/eval/composite.py b/tabvision/tabvision/eval/composite.py
new file mode 100644
index 0000000..578f195
--- /dev/null
+++ b/tabvision/tabvision/eval/composite.py
@@ -0,0 +1,548 @@
+"""Composite multi-source eval — Phase 0 per-tier baseline harness.
+
+Reads a manifest (validated by :mod:`tabvision.eval.manifest`),
+dispatches each clip's annotation through the registered parser,
+runs a user-supplied predictor over the media, and aggregates per-tier
+onset / pitch / tab F1 with bootstrap CIs plus the error-decomposition
+buckets.
+
+The predictor is **injected** so the harness is testable without the
+heavy audio backend. Production usage wires up
+:func:`tabvision.pipeline.run_pipeline` from the CLI; tests pass a
+fake predictor for fast iteration.
+"""
+
+from __future__ import annotations
+
+import os
+import tomllib
+from collections.abc import Callable, Mapping
+from dataclasses import dataclass
+from pathlib import Path
+
+from tabvision.eval.bootstrap import BootstrapResult, bootstrap_ci
+from tabvision.eval.error_decomposition import (
+    ErrorDecomposition,
+    aggregate_decompositions,
+    decompose_errors,
+)
+from tabvision.eval.manifest import ManifestValidation, validate_manifest
+from tabvision.eval.metrics import (
+    EventF1Result,
+    TabF1Result,
+    event_f1,
+    tab_f1,
+)
+from tabvision.eval.parsers import get_parser
+from tabvision.types import GuitarConfig, SessionConfig, TabEvent
+
+Predictor = Callable[[Path, SessionConfig], list[TabEvent]]
+"""``(media_path, session) -> list[TabEvent]``. The composite-eval harness
+calls this once per evaluated clip (``train`` is skipped by default)."""
+
+
+@dataclass(frozen=True)
+class ClipEvalResult:
+    """Per-clip metrics + error decomposition."""
+
+    clip_id: str
+    tier: str
+    source: str
+    n_gold: int
+    n_predicted: int
+    onset: EventF1Result
+    pitch: EventF1Result
+    tab: TabF1Result
+    errors: ErrorDecomposition
+
+
+@dataclass(frozen=True)
+class TierReport:
+    """Aggregate metrics for one tier — bootstrap CI on each F1."""
+
+    tier: str
+    n_clips: int
+    n_gold_total: int
+    onset_f1: BootstrapResult
+    pitch_f1: BootstrapResult
+    tab_f1: BootstrapResult
+    errors: ErrorDecomposition  # summed across clips in this tier
+
+
+@dataclass(frozen=True)
+class CompositeReport:
+    """Top-level composite-eval result."""
+
+    manifest_path: str
+    manifest_validation: ManifestValidation
+    per_clip: list[ClipEvalResult]
+    tiers: Mapping[str, TierReport]
+    bootstrap_n: int
+    bootstrap_seed: int
+    onset_tolerance_s: float
+
+    def tab_f1_acceptance(self, targets: Mapping[str, float]) -> dict[str, str]:
+        """Compute the pass/gap/fail/missing status per tier vs ``targets``.
+
+        Status semantics per design plan §5:
+        - ``"pass"``: ``lower_95_CI >= target`` (the official acceptance bar)
+        - ``"gap"``: ``mean >= target > lower_95_CI``
+        - ``"fail"``: ``mean < target``
+        - ``"missing"``: tier has no clips in this report
+        """
+        statuses: dict[str, str] = {}
+        for tier, target in targets.items():
+            report = self.tiers.get(tier)
+            if report is None:
+                statuses[tier] = "missing"
+                continue
+            mean = report.tab_f1.statistic
+            lower = report.tab_f1.lower
+            if lower >= target:
+                statuses[tier] = "pass"
+            elif mean >= target:
+                statuses[tier] = "gap"
+            else:
+                statuses[tier] = "fail"
+        return statuses
+
+
+DEFAULT_EVAL_SPLITS: tuple[str, ...] = ("validation", "test")
+"""Splits included in composite eval by default. ``train`` is excluded."""
+
+
+def run_composite_eval(
+    manifest_path: str | Path,
+    *,
+    predictor: Predictor,
+    media_root: str | Path | None = None,
+    annotation_root: str | Path | None = None,
+    splits: tuple[str, ...] = DEFAULT_EVAL_SPLITS,
+    cfg: GuitarConfig | None = None,
+    onset_tolerance_s: float = 0.05,
+    bootstrap_n: int = 10_000,
+    bootstrap_seed: int = 42,
+) -> CompositeReport:
+    """Per-clip eval, then per-tier aggregation with bootstrap CIs.
+
+    Raises ``ValueError`` if the manifest fails validation (fail-severity
+    issues from :func:`validate_manifest`). Train-split clips are
+    skipped by default; pass ``splits=("train",)`` to evaluate on them
+    (useful for diagnosing training-set fit).
+    """
+    manifest_path = Path(manifest_path)
+    validation = validate_manifest(manifest_path)
+    if not validation.passed:
+        fail_messages = [
+            i.message for i in validation.items if i.severity == "fail"
+        ]
+        raise ValueError(
+            f"Manifest {manifest_path} has fail-severity issues: {fail_messages}"
+        )
+
+    if cfg is None:
+        cfg = GuitarConfig()
+
+    payload = tomllib.loads(manifest_path.read_text(encoding="utf-8"))
+    clips = payload.get("clips") or []
+
+    per_clip: list[ClipEvalResult] = []
+    for clip in clips:
+        if clip["split"] not in splits:
+            continue
+
+        media_path = _resolve_path(clip["media_path"], media_root)
+        annotation_path = _resolve_path(clip["annotation_path"], annotation_root)
+
+        parser = get_parser(clip["annotation_format"])
+        gold = parser(annotation_path, cfg)
+
+        session = _session_from_clip(clip)
+        predicted = predictor(media_path, session)
+
+        per_clip.append(
+            ClipEvalResult(
+                clip_id=clip["id"],
+                tier=clip["tier"],
+                source=clip["source"],
+                n_gold=len(gold),
+                n_predicted=len(predicted),
+                onset=event_f1(
+                    predicted, gold, match_pitch=False, onset_tolerance_s=onset_tolerance_s
+                ),
+                pitch=event_f1(
+                    predicted, gold, match_pitch=True, onset_tolerance_s=onset_tolerance_s
+                ),
+                tab=tab_f1(predicted, gold, onset_tolerance_s=onset_tolerance_s),
+                errors=decompose_errors(
+                    predicted, gold, onset_tolerance_s=onset_tolerance_s
+                ),
+            )
+        )
+
+    tiers = _aggregate_per_tier(
+        per_clip,
+        bootstrap_n=bootstrap_n,
+        bootstrap_seed=bootstrap_seed,
+    )
+
+    return CompositeReport(
+        manifest_path=str(manifest_path),
+        manifest_validation=validation,
+        per_clip=per_clip,
+        tiers=tiers,
+        bootstrap_n=bootstrap_n,
+        bootstrap_seed=bootstrap_seed,
+        onset_tolerance_s=onset_tolerance_s,
+    )
+
+
+def _aggregate_per_tier(
+    per_clip: list[ClipEvalResult],
+    *,
+    bootstrap_n: int,
+    bootstrap_seed: int,
+) -> dict[str, TierReport]:
+    by_tier: dict[str, list[ClipEvalResult]] = {}
+    for result in per_clip:
+        by_tier.setdefault(result.tier, []).append(result)
+
+    reports: dict[str, TierReport] = {}
+    for tier, results in by_tier.items():
+        onset_f1s = [r.onset.f1 for r in results]
+        pitch_f1s = [r.pitch.f1 for r in results]
+        tab_f1s = [r.tab.f1 for r in results]
+        reports[tier] = TierReport(
+            tier=tier,
+            n_clips=len(results),
+            n_gold_total=sum(r.n_gold for r in results),
+            onset_f1=bootstrap_ci(
+                onset_f1s, n_bootstrap=bootstrap_n, seed=bootstrap_seed
+            ),
+            pitch_f1=bootstrap_ci(
+                pitch_f1s, n_bootstrap=bootstrap_n, seed=bootstrap_seed
+            ),
+            tab_f1=bootstrap_ci(
+                tab_f1s, n_bootstrap=bootstrap_n, seed=bootstrap_seed
+            ),
+            errors=aggregate_decompositions(r.errors for r in results),
+        )
+    return reports
+
+
+def _resolve_path(path_str: str, root: str | Path | None) -> Path:
+    """Expand ``$TABVISION_DATA_ROOT`` and apply optional override.
+
+    ``root`` (function arg) takes precedence over the env var.
+    """
+    expanded = path_str
+    if "$TABVISION_DATA_ROOT" in path_str:
+        resolved_root: str | None
+        if root is not None:
+            resolved_root = str(root)
+        else:
+            resolved_root = os.environ.get("TABVISION_DATA_ROOT")
+        if not resolved_root:
+            raise ValueError(
+                f"Path {path_str!r} contains $TABVISION_DATA_ROOT but neither "
+                f"the env var nor the function arg is set"
+            )
+        expanded = path_str.replace("$TABVISION_DATA_ROOT", resolved_root)
+    return Path(expanded).expanduser()
+
+
+def _session_from_clip(clip: dict[str, object]) -> SessionConfig:
+    """Map manifest clip metadata to a :class:`SessionConfig`.
+
+    Phase 0 defaults all clips to acoustic / clean / mixed. Per-clip
+    instrument / tone / style fields can be added to the manifest
+    schema in a later phase.
+    """
+    del clip  # unused in Phase 0
+    return SessionConfig()
+
+
+DEFAULT_TIER_TARGETS: Mapping[str, float] = {
+    "clean_acoustic_single_line": 0.85,
+    "clean_acoustic_strummed": 0.90,
+    "clean_electric": 0.87,
+    "distorted_electric": 0.80,
+}
+"""Per-tier Tab F1 acceptance targets from SPEC §1.4.1.
+
+These are the v1 acceptance bar locked in by the 2026-05-13 design plan
+§0 D2. The original SPEC §1.4 numbers (0.94 / 0.86 / 0.90 / 0.82) are
+the v1.1 / portfolio stretch reference, not used here.
+"""
+
+
+def format_baseline_markdown(
+    report: CompositeReport,
+    *,
+    targets: Mapping[str, float] = DEFAULT_TIER_TARGETS,
+    backend_label: str = "<unset>",
+    position_prior_label: str = "<unset>",
+    eval_harness_sha: str = "<unset>",
+    title: str = "Composite per-tier baseline",
+) -> str:
+    """Render a Phase 0 per-tier baseline report as Markdown.
+
+    Output format follows
+    ``docs/plans/2026-05-13-tab-f1-phase-0-implementation.md`` §4.1.
+    """
+    statuses = report.tab_f1_acceptance(targets)
+    lines: list[str] = [f"# {title}", ""]
+
+    lines.append("## Per-tier results")
+    lines.append("")
+    header_cells = [
+        "Tier",
+        "Clips",
+        "Gold notes",
+        "Tab F1 mean",
+        "Tab F1 lower-95",
+        "Target",
+        "Status",
+        "Onset F1",
+        "Pitch F1",
+    ]
+    lines.append("| " + " | ".join(header_cells) + " |")
+    lines.append("|---|---:|---:|---:|---:|---:|---|---:|---:|")
+    for tier, target in targets.items():
+        tier_report = report.tiers.get(tier)
+        if tier_report is None:
+            lines.append(
+                f"| {tier} | 0 | 0 | — | — | {target:.2f} | missing | — | — |"
+            )
+            continue
+        tab_mean = tier_report.tab_f1.statistic
+        tab_lo = tier_report.tab_f1.lower
+        onset_mean = tier_report.onset_f1.statistic
+        pitch_mean = tier_report.pitch_f1.statistic
+        lines.append(
+            f"| {tier} | {tier_report.n_clips} | {tier_report.n_gold_total} | "
+            f"{tab_mean:.4f} | {tab_lo:.4f} | {target:.2f} | {statuses[tier]} | "
+            f"{onset_mean:.4f} | {pitch_mean:.4f} |"
+        )
+    lines.append("")
+
+    lines.append("## Per-source breakdown")
+    lines.append("")
+    lines.append("| Tier | Source | Clips | Tab F1 mean | Onset F1 mean | Pitch F1 mean |")
+    lines.append("|---|---|---:|---:|---:|---:|")
+    grouped: dict[tuple[str, str], list[ClipEvalResult]] = {}
+    for clip in report.per_clip:
+        grouped.setdefault((clip.tier, clip.source), []).append(clip)
+    for (tier, source), clips in sorted(grouped.items()):
+        tab_mean = sum(c.tab.f1 for c in clips) / len(clips)
+        onset_mean = sum(c.onset.f1 for c in clips) / len(clips)
+        pitch_mean = sum(c.pitch.f1 for c in clips) / len(clips)
+        lines.append(
+            f"| {tier} | {source} | {len(clips)} | "
+            f"{tab_mean:.4f} | {onset_mean:.4f} | {pitch_mean:.4f} |"
+        )
+    lines.append("")
+
+    lines.append("## Methodology")
+    lines.append("")
+    lines.append(f"- Manifest: `{report.manifest_path}`")
+    lines.append(f"- Audio backend: `{backend_label}`")
+    lines.append(f"- Position prior: `{position_prior_label}`")
+    lines.append(f"- Eval-harness SHA: `{eval_harness_sha}`")
+    lines.append(f"- Onset tolerance: {report.onset_tolerance_s * 1000:.0f} ms")
+    lines.append(
+        f"- Bootstrap: N={report.bootstrap_n:,}, seed={report.bootstrap_seed}, "
+        f"95% percentile interval"
+    )
+    lines.append(
+        "- Acceptance gate: `lower_95_CI >= target` per design plan §5"
+    )
+    lines.append("")
+
+    return "\n".join(lines) + "\n"
+
+
+def format_decomposition_markdown(
+    report: CompositeReport,
+    *,
+    title: str = "Tab F1 error decomposition",
+) -> str:
+    """Render the per-tier six-bucket error decomposition.
+
+    Six buckets are populated; the apr-28 ``muted_undetectable`` seventh
+    bucket is deferred until the v1 contract carries a muted/X flag.
+    """
+    bucket_columns = (
+        "correct",
+        "wrong_position_same_pitch",
+        "pitch_off",
+        "timing_only",
+        "missed_onset",
+        "extra_detection",
+    )
+    lines: list[str] = [f"# {title}", ""]
+
+    lines.append("## Aggregate (all tiers)")
+    lines.append("")
+    from tabvision.eval.error_decomposition import aggregate_decompositions
+
+    overall = aggregate_decompositions(c.errors for c in report.per_clip)
+    lines.append("| Bucket | Count | Share of loss |")
+    lines.append("|---|---:|---:|")
+    shares = overall.share_of_loss()
+    for col in bucket_columns:
+        count = getattr(overall, col)
+        if col == "correct":
+            lines.append(f"| {col} | {count} | — |")
+        else:
+            lines.append(f"| {col} | {count} | {shares[col] * 100:.1f}% |")
+    lines.append("")
+
+    lines.append("## Per-tier breakdown")
+    lines.append("")
+    header_cells = ["Tier"] + list(bucket_columns)
+    lines.append("| " + " | ".join(header_cells) + " |")
+    lines.append("|" + "|".join(["---"] * len(header_cells)) + "|")
+    for tier_name in sorted(report.tiers):
+        tier_report = report.tiers[tier_name]
+        row = [tier_name]
+        for col in bucket_columns:
+            row.append(str(getattr(tier_report.errors, col)))
+        lines.append("| " + " | ".join(row) + " |")
+    lines.append("")
+
+    return "\n".join(lines) + "\n"
+
+
+def make_run_pipeline_predictor(
+    *,
+    audio_backend_name: str,
+    position_prior: str | None,
+    melodic_prior_enabled: bool = False,
+    video_enabled: bool = False,
+) -> Predictor:
+    """Wrap :func:`tabvision.pipeline.run_pipeline` for composite-eval use.
+
+    Imports ``run_pipeline`` lazily so the composite-eval CLI's --help
+    works without the audio-highres extras installed.
+    """
+    from tabvision.pipeline import run_pipeline  # noqa: PLC0415
+
+    def predictor(media_path: Path, session: SessionConfig) -> list[TabEvent]:
+        return run_pipeline(
+            str(media_path),
+            audio_backend_name=audio_backend_name,
+            position_prior=position_prior,
+            melodic_prior_enabled=melodic_prior_enabled,
+            video_enabled=video_enabled,
+            session=session,
+        )
+
+    return predictor
+
+
+def main(argv: list[str] | None = None) -> int:
+    """CLI entry point: ``tabvision-composite-eval``."""
+    import argparse
+
+    parser = argparse.ArgumentParser(
+        prog="tabvision-composite-eval",
+        description=(
+            "Run the v1 per-tier composite eval and write a Markdown report."
+        ),
+    )
+    parser.add_argument("--manifest", type=Path, required=True)
+    parser.add_argument("--backend", default="highres", help="audio backend name")
+    parser.add_argument(
+        "--position-prior",
+        default="guitarset-v1",
+        help='position prior name; pass "none" to disable',
+    )
+    parser.add_argument("--melodic-prior", action="store_true")
+    parser.add_argument(
+        "--enable-video",
+        action="store_true",
+        help="enable video stack (default: off — Phase 0 ships audio-only)",
+    )
+    parser.add_argument("--output", type=Path, required=True)
+    parser.add_argument(
+        "--decomposition-output",
+        type=Path,
+        help=(
+            "optional: write the six-bucket error decomposition "
+            "(port of the apr-28 7-bucket harness; muted_undetectable deferred) "
+            "to this file too"
+        ),
+    )
+    parser.add_argument("--bootstrap-n", type=int, default=10_000)
+    parser.add_argument("--bootstrap-seed", type=int, default=42)
+    parser.add_argument("--onset-tolerance-s", type=float, default=0.05)
+    parser.add_argument(
+        "--splits",
+        default="validation,test",
+        help="comma-separated splits to include",
+    )
+    parser.add_argument("--media-root", type=Path, default=None)
+    parser.add_argument("--annotation-root", type=Path, default=None)
+    parser.add_argument("--eval-harness-sha", default="<unset>")
+
+    args = parser.parse_args(argv)
+
+    position_prior: str | None = args.position_prior
+    if position_prior and position_prior.lower() == "none":
+        position_prior = None
+
+    predictor = make_run_pipeline_predictor(
+        audio_backend_name=args.backend,
+        position_prior=position_prior,
+        melodic_prior_enabled=args.melodic_prior,
+        video_enabled=args.enable_video,
+    )
+
+    splits = tuple(s.strip() for s in args.splits.split(",") if s.strip())
+
+    report = run_composite_eval(
+        args.manifest,
+        predictor=predictor,
+        media_root=args.media_root,
+        annotation_root=args.annotation_root,
+        splits=splits,
+        onset_tolerance_s=args.onset_tolerance_s,
+        bootstrap_n=args.bootstrap_n,
+        bootstrap_seed=args.bootstrap_seed,
+    )
+
+    baseline_md = format_baseline_markdown(
+        report,
+        backend_label=args.backend,
+        position_prior_label=position_prior or "none",
+        eval_harness_sha=args.eval_harness_sha,
+    )
+    args.output.parent.mkdir(parents=True, exist_ok=True)
+    args.output.write_text(baseline_md, encoding="utf-8")
+
+    if args.decomposition_output:
+        decomp_md = format_decomposition_markdown(report)
+        args.decomposition_output.parent.mkdir(parents=True, exist_ok=True)
+        args.decomposition_output.write_text(decomp_md, encoding="utf-8")
+
+    return 0
+
+
+__all__ = [
+    "ClipEvalResult",
+    "CompositeReport",
+    "DEFAULT_EVAL_SPLITS",
+    "DEFAULT_TIER_TARGETS",
+    "Predictor",
+    "TierReport",
+    "format_baseline_markdown",
+    "format_decomposition_markdown",
+    "main",
+    "make_run_pipeline_predictor",
+    "run_composite_eval",
+]
+
+
+if __name__ == "__main__":
+    raise SystemExit(main())
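Reviewer sketch (not part of the patch): the status semantics of `CompositeReport.tab_f1_acceptance` reduced to plain tuples, to make the pass/gap/fail/missing ordering easy to check by eye. The function name `acceptance` and all numbers below are illustrative assumptions. The real method reads `statistic` and `lower` from each tier's `BootstrapResult`.

```python
def acceptance(tiers, targets):
    """Map each target tier to pass / gap / fail / missing.

    ``tiers`` maps tier name -> (mean, lower_95_ci); ``targets`` maps
    tier name -> required Tab F1.
    """
    statuses = {}
    for tier, target in targets.items():
        if tier not in tiers:
            statuses[tier] = "missing"  # no clips evaluated for this tier
            continue
        mean, lower = tiers[tier]
        if lower >= target:
            statuses[tier] = "pass"  # lower bound clears the bar
        elif mean >= target:
            statuses[tier] = "gap"  # mean clears the bar, lower bound does not
        else:
            statuses[tier] = "fail"
    return statuses


# Made-up numbers for illustration only.
example_tiers = {
    "tier_a": (0.90, 0.86),
    "tier_b": (0.87, 0.83),
    "tier_c": (0.80, 0.75),
}
example_targets = {"tier_a": 0.85, "tier_b": 0.85, "tier_c": 0.85, "tier_d": 0.85}
statuses = acceptance(example_tiers, example_targets)
```

The `gap` status separates tiers whose mean already meets the target but whose interval is still too wide. Adding clips to such a tier may narrow the interval without any model change.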
diff --git a/tabvision/tabvision/eval/error_decomposition.py b/tabvision/tabvision/eval/error_decomposition.py
new file mode 100644
index 0000000..59c45d1
--- /dev/null
+++ b/tabvision/tabvision/eval/error_decomposition.py
@@ -0,0 +1,269 @@
+"""Tab F1 error decomposition — six-bucket port of the apr-28 7-bucket harness.
+
+Ports the methodology from
+``tabvision-server/tools/outputs/errors-2026-04-28_185743.md`` to operate
+on §8 ``TabEvent`` lists (the v1 contract) instead of the v0 internal
+``Note`` representation.
+
+Six failure buckets (the apr-28 ``muted_undetectable`` bucket needs a
+muted/X flag the v1 contract does not yet carry; deferred to a later
+phase):
+
+- ``correct``: predicted event matches a gold event on string + fret
+  + onset within ``onset_tolerance_s``.
+- ``wrong_position_same_pitch``: predicted event matches on
+  ``pitch_midi`` + onset within tolerance, but a different
+  ``(string_idx, fret)``. This is the bucket that dominated the
+  2026-05-08 GuitarSet validation (~35% of loss on personal clips per
+  the apr-28 report).
+- ``pitch_off``: predicted event aligns in onset but pitch_midi
+  differs from the matched gold. Audio-side loss.
+- ``timing_only``: predicted event matches on position or pitch but
+  the onset is outside ``onset_tolerance_s`` and within
+  ``timing_extended_tolerance_s``.
+- ``missed_onset``: gold event has no predicted event near it within
+  the extended tolerance.
+- ``extra_detection``: predicted event that did not match any gold
+  event under any of the rules above.
+
+Per the strategy doc §2 the dominant failure axis is
+``wrong_position_same_pitch`` on solos. This module lets us measure
+that explicitly per tier.
+"""
+
+from __future__ import annotations
+
+from collections.abc import Iterable, Sequence
+from dataclasses import dataclass, fields
+
+from tabvision.types import TabEvent
+
+DEFAULT_ONSET_TOLERANCE_S = 0.05
+DEFAULT_TIMING_EXTENDED_TOLERANCE_S = 0.15
+
+
+@dataclass(frozen=True)
+class ErrorDecomposition:
+    """Six-bucket failure breakdown for one (predicted, gold) pair.
+
+    Construct via :func:`decompose_errors`; sum across tracks via
+    :func:`aggregate_decompositions`. Bucket counts are non-negative
+    integers.
+    """
+
+    correct: int = 0
+    wrong_position_same_pitch: int = 0
+    pitch_off: int = 0
+    timing_only: int = 0
+    missed_onset: int = 0
+    extra_detection: int = 0
+
+    @property
+    def total_gold(self) -> int:
+        """Number of gold events accounted for. Excludes ``extra_detection``."""
+        return (
+            self.correct
+            + self.wrong_position_same_pitch
+            + self.pitch_off
+            + self.timing_only
+            + self.missed_onset
+        )
+
+    @property
+    def total_predicted(self) -> int:
+        """Number of predicted events accounted for. Excludes ``missed_onset``."""
+        return (
+            self.correct
+            + self.wrong_position_same_pitch
+            + self.pitch_off
+            + self.timing_only
+            + self.extra_detection
+        )
+
+    @property
+    def total_loss(self) -> int:
+        """Events contributing to Tab F1 loss (everything except ``correct``)."""
+        return (
+            self.wrong_position_same_pitch
+            + self.pitch_off
+            + self.timing_only
+            + self.missed_onset
+            + self.extra_detection
+        )
+
+    def share_of_loss(self) -> dict[str, float]:
+        """Per-bucket share of recoverable Tab F1 loss.
+
+        ``correct`` events are not counted as loss; the remaining five
+        buckets sum to 1.0 (or all zeros if ``total_loss`` is 0).
+        """
+        total = self.total_loss
+        if total == 0:
+            return {
+                "wrong_position_same_pitch": 0.0,
+                "pitch_off": 0.0,
+                "timing_only": 0.0,
+                "missed_onset": 0.0,
+                "extra_detection": 0.0,
+            }
+        return {
+            "wrong_position_same_pitch": self.wrong_position_same_pitch / total,
+            "pitch_off": self.pitch_off / total,
+            "timing_only": self.timing_only / total,
+            "missed_onset": self.missed_onset / total,
+            "extra_detection": self.extra_detection / total,
+        }
+
+    def to_dict(self) -> dict[str, int]:
+        return {f.name: getattr(self, f.name) for f in fields(self)}
+
+
+def decompose_errors(
+    predicted: Sequence[TabEvent],
+    gold: Sequence[TabEvent],
+    *,
+    onset_tolerance_s: float = DEFAULT_ONSET_TOLERANCE_S,
+    timing_extended_tolerance_s: float = DEFAULT_TIMING_EXTENDED_TOLERANCE_S,
+) -> ErrorDecomposition:
+    """Bucket the events into the six-bucket Phase 0 schema.
+
+    The matcher is **priority-based** within each tolerance window so
+    chord clusters (multiple gold events at the same onset) don't get
+    mis-paired by raw onset proximity:
+
+    1. **Strict-tolerance pass.** For each gold event, search unclaimed
+       predicted events within ``onset_tolerance_s``. Pick the best in
+       priority order:
+       - same ``(string_idx, fret)`` → ``correct``
+       - same ``pitch_midi`` → ``wrong_position_same_pitch``
+       - neither → ``pitch_off``
+       Within each priority bucket, ties are broken by closest onset.
+    2. **Extended-tolerance pass.** For each gold event still unmatched,
+       search within ``timing_extended_tolerance_s`` for a predicted
+       event that agrees on position or pitch → ``timing_only``.
+       Else → ``missed_onset``.
+
+    Unclaimed predicted events after both passes → ``extra_detection``.
+
+    Priority matters: in a chord cluster with three gold events at the
+    same onset and three predicted events with matching pitches but
+    different on-the-wire ordering, onset-only greediness would shuffle
+    pairings and inflate ``pitch_off``. Priority-based matching tracks
+    ``event_f1(match_pitch=True)`` exactly when ``Pitch F1 = 1.0``.
+    """
+    if onset_tolerance_s <= 0:
+        raise ValueError(f"onset_tolerance_s must be positive; got {onset_tolerance_s}")
+    if timing_extended_tolerance_s < onset_tolerance_s:
+        raise ValueError(
+            f"timing_extended_tolerance_s ({timing_extended_tolerance_s}) must be "
+            f">= onset_tolerance_s ({onset_tolerance_s})"
+        )
+
+    pred_used = [False] * len(predicted)
+
+    correct = 0
+    wrong_position = 0
+    pitch_off = 0
+    timing_only = 0
+    missed = 0
+
+    gold_sorted = sorted(gold, key=lambda g: g.onset_s)
+
+    for g in gold_sorted:
+        # Pass 1: strict-tolerance, priority-ordered match.
+        best_pos_idx = -1
+        best_pitch_idx = -1
+        best_any_idx = -1
+        best_pos_dt = onset_tolerance_s + 1e-9
+        best_pitch_dt = onset_tolerance_s + 1e-9
+        best_any_dt = onset_tolerance_s + 1e-9
+
+        for pi, p in enumerate(predicted):
+            if pred_used[pi]:
+                continue
+            dt = abs(p.onset_s - g.onset_s)
+            if dt > onset_tolerance_s:
+                continue
+            same_pos = p.string_idx == g.string_idx and p.fret == g.fret
+            same_pitch = p.pitch_midi == g.pitch_midi
+            if same_pos:
+                if dt < best_pos_dt:
+                    best_pos_idx = pi
+                    best_pos_dt = dt
+            elif same_pitch:
+                if dt < best_pitch_dt:
+                    best_pitch_idx = pi
+                    best_pitch_dt = dt
+            elif dt < best_any_dt:
+                best_any_idx = pi
+                best_any_dt = dt
+
+        if best_pos_idx >= 0:
+            pred_used[best_pos_idx] = True
+            correct += 1
+            continue
+        if best_pitch_idx >= 0:
+            pred_used[best_pitch_idx] = True
+            wrong_position += 1
+            continue
+        if best_any_idx >= 0:
+            pred_used[best_any_idx] = True
+            pitch_off += 1
+            continue
+
+        # Pass 2: extended-tolerance match on position OR pitch.
+        timing_idx = -1
+        timing_dt = timing_extended_tolerance_s + 1e-9
+        for pi, p in enumerate(predicted):
+            if pred_used[pi]:
+                continue
+            dt = abs(p.onset_s - g.onset_s)
+            if dt > timing_extended_tolerance_s:
+                continue
+            same_pos = p.string_idx == g.string_idx and p.fret == g.fret
+            same_pitch = p.pitch_midi == g.pitch_midi
+            if (same_pos or same_pitch) and dt < timing_dt:
+                timing_idx = pi
+                timing_dt = dt
+
+        if timing_idx >= 0:
+            pred_used[timing_idx] = True
+            timing_only += 1
+            continue
+
+        missed += 1
+
+    extra = sum(1 for used in pred_used if not used)
+
+    return ErrorDecomposition(
+        correct=correct,
+        wrong_position_same_pitch=wrong_position,
+        pitch_off=pitch_off,
+        timing_only=timing_only,
+        missed_onset=missed,
+        extra_detection=extra,
+    )
+
+
+def aggregate_decompositions(
+    decompositions: Iterable[ErrorDecomposition],
+) -> ErrorDecomposition:
+    """Sum a sequence of per-track decompositions into an aggregate."""
+    items = list(decompositions)
+    return ErrorDecomposition(
+        correct=sum(d.correct for d in items),
+        wrong_position_same_pitch=sum(d.wrong_position_same_pitch for d in items),
+        pitch_off=sum(d.pitch_off for d in items),
+        timing_only=sum(d.timing_only for d in items),
+        missed_onset=sum(d.missed_onset for d in items),
+        extra_detection=sum(d.extra_detection for d in items),
+    )
+
+
+__all__ = [
+    "DEFAULT_ONSET_TOLERANCE_S",
+    "DEFAULT_TIMING_EXTENDED_TOLERANCE_S",
+    "ErrorDecomposition",
+    "aggregate_decompositions",
+    "decompose_errors",
+]
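Reviewer sketch for the bucket arithmetic in `ErrorDecomposition` above. The counts are invented for illustration and a plain dict stands in for the dataclass. Only the three totals and the share-of-loss division mirror the diff.

```python
# Hypothetical bucket counts for one track; the numbers are invented.
counts = {
    "correct": 70,
    "wrong_position_same_pitch": 15,
    "pitch_off": 5,
    "timing_only": 4,
    "missed_onset": 6,
    "extra_detection": 10,
}

# Gold-side total: every bucket except extra_detection.
total_gold = sum(v for k, v in counts.items() if k != "extra_detection")
# Prediction-side total: every bucket except missed_onset.
total_predicted = sum(v for k, v in counts.items() if k != "missed_onset")
# Loss: every bucket except correct.
total_loss = sum(v for k, v in counts.items() if k != "correct")

# Per-bucket share of loss; the five non-correct buckets sum to 1.0.
share = {k: v / total_loss for k, v in counts.items() if k != "correct"}

print(total_gold, total_predicted, total_loss)
print(share)
```

With these numbers, `wrong_position_same_pitch` accounts for 15 of the 40 loss events, which is the kind of per-tier reading the module docstring describes.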
diff --git a/tabvision/tabvision/eval/manifest.py b/tabvision/tabvision/eval/manifest.py
index 1d43d0d..9b37caa 100644
--- a/tabvision/tabvision/eval/manifest.py
+++ b/tabvision/tabvision/eval/manifest.py
@@ -24,10 +24,24 @@
     "split",
     "media_path",
     "annotation_path",
+    "annotation_format",
 )
 ALLOWED_SPLITS: tuple[str, ...] = ("train", "validation", "test")
 MIN_PHASE15_CLIPS = 15
 
+SYNTHETIC_SOURCE_PREFIXES: tuple[str, ...] = (
+    "synthtab/",
+    "dadagp/",
+    "synthetic/",
+)
+"""Source-name prefixes flagged as synthetic.
+
+Per the 2026-05-12 design plan §5 (R8 in §7), synthetic-source clips
+must not appear in non-train splits. ``validate_manifest`` emits a
+``SYNTHETIC_IN_EVAL_SPLIT`` fail issue when a clip whose ``source``
+starts with any of these prefixes is listed with ``split`` of
+``"validation"`` or ``"test"``."""
+
 Severity = Literal["info", "warn", "fail"]
 
 
@@ -198,6 +212,25 @@ def validate_manifest(path: str | Path) -> ManifestValidation:
                 )
             )
 
+        # Cross-contamination guard: synthetic-source clips must not appear
+        # in non-train splits. See design plan §5 / risk R8.
+        source = _string_field(clip, "source") or ""
+        if split in {"validation", "test"} and any(
+            source.lower().startswith(prefix) for prefix in SYNTHETIC_SOURCE_PREFIXES
+        ):
+            items.append(
+                ManifestIssue(
+                    severity="fail",
+                    code="SYNTHETIC_IN_EVAL_SPLIT",
+                    message=(
+                        f"Clip {clip_id!r} has synthetic source {source!r} but "
+                        f"split={split!r}; synthetic-source clips are restricted to "
+                        f"split='train' (design plan §5 / R8)."
+                    ),
+                    clip_id=clip_id,
+                )
+            )
+
     if len(clips) < MIN_PHASE15_CLIPS:
         items.append(
             ManifestIssue(
@@ -251,5 +284,6 @@ def _missing_tier_issues(missing_tiers: tuple[str, ...] | list[str]) -> list[Man
     "OPTIONAL_TIERS",
     "REQUIRED_CLIP_FIELDS",
     "REQUIRED_TIERS",
+    "SYNTHETIC_SOURCE_PREFIXES",
     "validate_manifest",
 ]
diff --git a/tabvision/tabvision/eval/manifest_builder.py b/tabvision/tabvision/eval/manifest_builder.py
new file mode 100644
index 0000000..a919a55
--- /dev/null
+++ b/tabvision/tabvision/eval/manifest_builder.py
@@ -0,0 +1,427 @@
+"""Composite-eval manifest builder.
+
+Scans known dataset roots on disk and emits a TOML manifest suitable
+for ``tabvision-composite-eval``. Designed to be deterministic so
+re-runs on the same data produce byte-identical output: clips are
+emitted in sorted-id order, and per-tier caps + total limits are
+applied after that sort.
+
+Currently supports:
+
+- **GuitarSet** (CC-BY-4.0) — clean acoustic single-line + strummed
+  tiers. Default split = player 05 → validation, others → train.
+- **Guitar-TECHS** (CC-BY-4.0) — stubbed; Phase 0 returns ``[]`` until
+  the dataset is acquired locally and the on-disk layout is verified.
+
+EGDB is intentionally not yet wired up (license-pending per the
+2026-05-13 design plan).
+"""
+
+from __future__ import annotations
+
+import argparse
+from collections.abc import Iterable
+from dataclasses import dataclass
+from pathlib import Path
+
+from tabvision.eval.manifest import (
+    SYNTHETIC_SOURCE_PREFIXES,
+    ManifestValidation,
+    validate_manifest,
+)
+
+GUITARSET_VALIDATION_PLAYER = "05"
+
+
+@dataclass(frozen=True)
+class ClipEntry:
+    """Minimal clip-row representation, one per manifest ``[[clips]]``."""
+
+    id: str
+    tier: str
+    source: str
+    split: str
+    media_path: str
+    annotation_path: str
+    annotation_format: str
+
+
+def _guitarset_tier(track_id: str) -> str | None:
+    """Map a GuitarSet track id suffix to a SPEC §1.4 tier name.
+
+    Returns ``None`` for unrecognised suffixes (track is skipped).
+    """
+    if track_id.endswith("_comp"):
+        return "clean_acoustic_strummed"
+    if track_id.endswith("_solo"):
+        return "clean_acoustic_single_line"
+    return None
+
+
+def _guitarset_split(track_id: str, validation_player: str) -> str:
+    """``validation`` for the held-out player, ``train`` otherwise."""
+    if track_id.split("_", 1)[0] == validation_player:
+        return "validation"
+    return "train"
+
+
+def scan_guitarset(
+    root: Path,
+    *,
+    validation_player: str = GUITARSET_VALIDATION_PLAYER,
+) -> list[ClipEntry]:
+    """Scan a GuitarSet directory tree and return discovered clips.
+
+    Expected layout::
+
+        <root>/annotation/<track>.jams
+        <root>/audio_mono-mic/<track>_mic.wav
+
+    Tracks missing either file are skipped. Tracks whose suffix is
+    neither ``_comp`` nor ``_solo`` are skipped.
+    """
+    annotation_dir = root / "annotation"
+    audio_dir = root / "audio_mono-mic"
+    if not annotation_dir.is_dir() or not audio_dir.is_dir():
+        return []
+
+    entries: list[ClipEntry] = []
+    for jams_path in sorted(annotation_dir.glob("*.jams")):
+        track_id = jams_path.stem
+        media_path = audio_dir / f"{track_id}_mic.wav"
+        if not media_path.is_file():
+            continue
+        tier = _guitarset_tier(track_id)
+        if tier is None:
+            continue
+        entries.append(
+            ClipEntry(
+                id=f"guitarset/{track_id}",
+                tier=tier,
+                source="GuitarSet",
+                split=_guitarset_split(track_id, validation_player),
+                media_path=str(media_path.resolve()),
+                annotation_path=str(jams_path.resolve()),
+                annotation_format="guitarset_jams",
+            )
+        )
+    return entries
+
+
+def scan_guitar_techs(root: Path) -> list[ClipEntry]:
+    """Scan a Guitar-TECHS directory tree.
+
+    Returns ``[]`` until the dataset is acquired locally and the
+    on-disk layout (per arXiv:2501.03720) is verified. The strategy
+    doc §3.1 marks Guitar-TECHS as an acquisition item; once the
+    bytes are on disk we can populate this scanner in a follow-up
+    commit.
+    """
+    del root
+    return []
+
+
+def apply_limits(
+    entries: Iterable[ClipEntry],
+    *,
+    max_clips_per_tier: int | None = None,
+    total_limit: int | None = None,
+) -> list[ClipEntry]:
+    """Apply per-tier and total limits deterministically.
+
+    Entries are first sorted by ``id`` (so the same data produces the
+    same output regardless of input scan order), then per-tier capped,
+    then total-limited.
+    """
+    sorted_entries = sorted(entries, key=lambda entry: entry.id)
+
+    if max_clips_per_tier is not None and max_clips_per_tier >= 0:
+        by_tier: dict[str, int] = {}
+        capped: list[ClipEntry] = []
+        for entry in sorted_entries:
+            count = by_tier.get(entry.tier, 0)
+            if count >= max_clips_per_tier:
+                continue
+            capped.append(entry)
+            by_tier[entry.tier] = count + 1
+        sorted_entries = capped
+
+    if total_limit is not None and 0 <= total_limit < len(sorted_entries):
+        sorted_entries = sorted_entries[:total_limit]
+
+    return sorted_entries
+
+
+def _toml_escape(value: str) -> str:
+    """Escape a TOML basic-string value (backslashes + double quotes)."""
+    return value.replace("\\", "\\\\").replace('"', '\\"')
+
+
+def _relativize_to_data_root(path_str: str, data_root: Path | None) -> str:
+    """Rewrite ``path_str`` as ``$TABVISION_DATA_ROOT/<rest>`` when it lives
+    under ``data_root``. Returns the original string when ``data_root`` is
+    ``None`` or the path isn't under it.
+
+    The composite-eval CLI expands ``$TABVISION_DATA_ROOT`` at eval time
+    via the env var or its ``--media-root`` / ``--annotation-root`` args
+    (see :func:`tabvision.eval.composite._resolve_path`), so this keeps
+    checked-in manifests portable across developer machines.
+    """
+    if data_root is None:
+        return path_str
+    abs_root = str(data_root.expanduser().resolve())
+    if path_str == abs_root:
+        return "$TABVISION_DATA_ROOT"
+    if path_str.startswith(abs_root + "/"):
+        rest = path_str[len(abs_root) + 1 :]
+        return f"$TABVISION_DATA_ROOT/{rest}"
+    return path_str
+
+
+def render_toml(
+    entries: Iterable[ClipEntry],
+    *,
+    header_comment: str = "",
+    data_root: Path | None = None,
+) -> str:
+    """Render entries as a TOML composite manifest.
+
+    Output is sorted by clip id for byte-stable re-generation. When
+    ``data_root`` is provided, ``media_path`` and ``annotation_path``
+    values that fall under that root are rewritten as
+    ``$TABVISION_DATA_ROOT/<rest>`` — the composite-eval CLI expands
+    that token at eval time. Use this for checked-in manifests.
+    """
+    sorted_entries = sorted(entries, key=lambda entry: entry.id)
+    lines: list[str] = []
+    if header_comment:
+        for raw_line in header_comment.splitlines():
+            lines.append(f"# {raw_line}" if raw_line else "#")
+        lines.append("")
+    fields = (
+        "id",
+        "tier",
+        "source",
+        "split",
+        "media_path",
+        "annotation_path",
+        "annotation_format",
+    )
+    for entry in sorted_entries:
+        lines.append("[[clips]]")
+        for field in fields:
+            raw = getattr(entry, field)
+            if field in ("media_path", "annotation_path"):
+                raw = _relativize_to_data_root(raw, data_root)
+            value = _toml_escape(raw)
+            lines.append(f'{field} = "{value}"')
+        lines.append("")
+    return "\n".join(lines).rstrip() + "\n"
+
+
+def summarise_coverage(entries: Iterable[ClipEntry]) -> str:
+    """Human-readable coverage summary."""
+    entries_list = list(entries)
+    by_tier: dict[str, dict[str, int]] = {}
+    by_split: dict[str, int] = {}
+    for entry in entries_list:
+        by_tier.setdefault(entry.tier, {}).setdefault(entry.source, 0)
+        by_tier[entry.tier][entry.source] += 1
+        by_split[entry.split] = by_split.get(entry.split, 0) + 1
+
+    lines: list[str] = []
+    lines.append(f"Total clips: {len(entries_list)}")
+    lines.append("Per-tier × source:")
+    for tier in sorted(by_tier):
+        per_source = ", ".join(
+            f"{source}={count}" for source, count in sorted(by_tier[tier].items())
+        )
+        total = sum(by_tier[tier].values())
+        lines.append(f"  {tier}: {total} clips ({per_source})")
+    if by_split:
+        split_summary = ", ".join(
+            f"{split}={count}" for split, count in sorted(by_split.items())
+        )
+        lines.append(f"Splits: {split_summary}")
+    return "\n".join(lines)
+
+
+def _refuse_synthetic_in_eval_splits(entries: Iterable[ClipEntry]) -> None:
+    """Pre-write guard: bail loudly on bad synthetic-source manifests."""
+    for entry in entries:
+        if entry.split == "train":
+            continue
+        source = entry.source.lower()
+        if any(source.startswith(prefix) for prefix in SYNTHETIC_SOURCE_PREFIXES):
+            raise ValueError(
+                f"Clip {entry.id!r} has synthetic source {entry.source!r} but "
+                f"split={entry.split!r}; the manifest validator (and design "
+                f"plan §5 / R8) forbid synthetic-source clips in eval splits. "
+                f"Either move to split='train' or remove."
+            )
+
+
+def build_manifest(
+    *,
+    guitarset_root: Path | None = None,
+    guitar_techs_root: Path | None = None,
+    splits: tuple[str, ...] | None = None,
+    max_clips_per_tier: int | None = None,
+    total_limit: int | None = None,
+    validation_player: str = GUITARSET_VALIDATION_PLAYER,
+) -> list[ClipEntry]:
+    """Scan all configured roots and apply filters + limits.
+
+    Sources whose root is ``None`` or doesn't exist are silently skipped.
+    Optional ``splits`` restricts to the named splits (e.g.
+    ``("validation",)`` for a smoke pre-flight). Limits are applied
+    after the split filter, on entries sorted by clip id for determinism.
+    """
+    entries: list[ClipEntry] = []
+    if guitarset_root is not None:
+        entries.extend(
+            scan_guitarset(guitarset_root, validation_player=validation_player)
+        )
+    if guitar_techs_root is not None:
+        entries.extend(scan_guitar_techs(guitar_techs_root))
+
+    _refuse_synthetic_in_eval_splits(entries)
+
+    if splits is not None:
+        allowed = set(splits)
+        entries = [entry for entry in entries if entry.split in allowed]
+
+    return apply_limits(
+        entries,
+        max_clips_per_tier=max_clips_per_tier,
+        total_limit=total_limit,
+    )
+
+
+def main(argv: list[str] | None = None) -> int:
+    """CLI entry point: ``tabvision-build-composite-manifest``."""
+    parser = argparse.ArgumentParser(
+        prog="build_composite_manifest",
+        description=(
+            "Scan dataset roots on disk and emit a composite-eval TOML manifest."
+        ),
+    )
+    parser.add_argument(
+        "--guitarset",
+        type=Path,
+        default=None,
+        help="GuitarSet root directory (with annotation/ and audio_mono-mic/)",
+    )
+    parser.add_argument(
+        "--guitar-techs",
+        type=Path,
+        default=None,
+        help="Guitar-TECHS root directory (scanner is currently a stub)",
+    )
+    parser.add_argument("--output", type=Path, required=True)
+    parser.add_argument(
+        "--max-clips-per-tier",
+        type=int,
+        default=None,
+        help="cap clips per tier; useful for smoke runs",
+    )
+    parser.add_argument(
+        "--limit",
+        type=int,
+        default=None,
+        help="cap total clips after per-tier cap; useful for smoke runs",
+    )
+    parser.add_argument(
+        "--guitarset-validation-player",
+        default=GUITARSET_VALIDATION_PLAYER,
+        help="GuitarSet player id whose tracks go into the validation split",
+    )
+    parser.add_argument(
+        "--splits",
+        default=None,
+        help=(
+            "comma-separated splits to include (e.g. 'validation' for a "
+            "smoke pre-flight). Default: include all splits."
+        ),
+    )
+    parser.add_argument(
+        "--data-root",
+        type=Path,
+        default=None,
+        help=(
+            "rewrite media/annotation paths that fall under this root as "
+            "$TABVISION_DATA_ROOT/<rest> for portable checked-in manifests"
+        ),
+    )
+
+    args = parser.parse_args(argv)
+
+    if args.guitarset is None and args.guitar_techs is None:
+        parser.error("specify at least one of --guitarset or --guitar-techs")
+
+    splits_filter: tuple[str, ...] | None = None
+    if args.splits:
+        splits_filter = tuple(s.strip() for s in args.splits.split(",") if s.strip())
+
+    try:
+        entries = build_manifest(
+            guitarset_root=args.guitarset,
+            guitar_techs_root=args.guitar_techs,
+            splits=splits_filter,
+            max_clips_per_tier=args.max_clips_per_tier,
+            total_limit=args.limit,
+            validation_player=args.guitarset_validation_player,
+        )
+    except ValueError as exc:
+        print(f"error: {exc}", flush=True)
+        return 2
+
+    if not entries:
+        print(
+            "No clips discovered. Check --guitarset / --guitar-techs paths.",
+            flush=True,
+        )
+        return 1
+
+    header = (
+        "Composite-eval manifest generated by "
+        "tabvision/scripts/eval/build_composite_manifest.py."
+        "\nRe-generate with the same args to refresh; this file is "
+        "intended to be auto-managed."
+    )
+    args.output.parent.mkdir(parents=True, exist_ok=True)
+    args.output.write_text(
+        render_toml(entries, header_comment=header, data_root=args.data_root),
+        encoding="utf-8",
+    )
+
+    print(f"Wrote {len(entries)} clips to {args.output}", flush=True)
+    print(summarise_coverage(entries), flush=True)
+
+    validation: ManifestValidation = validate_manifest(args.output)
+    fail_items = [item for item in validation.items if item.severity == "fail"]
+    if fail_items:
+        print(f"\nValidation FAILED with {len(fail_items)} issue(s):", flush=True)
+        for item in fail_items:
+            print(f"  [{item.code}] {item.message}", flush=True)
+        return 2
+
+    print("\nManifest validation passed.", flush=True)
+    return 0
+
+
+__all__ = [
+    "ClipEntry",
+    "GUITARSET_VALIDATION_PLAYER",
+    "apply_limits",
+    "build_manifest",
+    "main",
+    "render_toml",
+    "scan_guitar_techs",
+    "scan_guitarset",
+    "summarise_coverage",
+]
+
+
+if __name__ == "__main__":
+    raise SystemExit(main())
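Reviewer sketch of the determinism claim behind `apply_limits`: sort by id first, cap each tier, then truncate. This is a simplified stand-in that uses `(id, tier)` tuples instead of `ClipEntry`. The helper name `cap_per_tier`, the clip ids, and the tier labels are hypothetical.

```python
def cap_per_tier(entries, max_per_tier, total_limit=None):
    """Stand-in for apply_limits: sort by id, cap each tier, then truncate."""
    kept = []
    seen = {}
    # Sorting first makes the result independent of scan order.
    for clip_id, tier in sorted(entries):
        if seen.get(tier, 0) >= max_per_tier:
            continue
        kept.append((clip_id, tier))
        seen[tier] = seen.get(tier, 0) + 1
    return kept if total_limit is None else kept[:total_limit]


# Hypothetical clips, listed in an arbitrary scan order.
clips = [
    ("guitarset/05_b_solo", "single_line"),
    ("guitarset/00_a_comp", "strummed"),
    ("guitarset/03_c_comp", "strummed"),
    ("guitarset/01_d_solo", "single_line"),
]

forward = cap_per_tier(clips, max_per_tier=1)
backward = cap_per_tier(list(reversed(clips)), max_per_tier=1)
print(forward)
```

Both input orders keep the same two clips, the lowest id in each tier. That is the property that makes regenerated manifests byte-stable.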
diff --git a/tabvision/tabvision/eval/metrics.py b/tabvision/tabvision/eval/metrics.py
index 92fd24f..d30042a 100644
--- a/tabvision/tabvision/eval/metrics.py
+++ b/tabvision/tabvision/eval/metrics.py
@@ -164,9 +164,81 @@ def _cluster_by_gap(events: Sequence[TabEvent], gap_s: float) -> list[list[TabEv
     return clusters
 
 
+@dataclass(frozen=True)
+class EventF1Result:
+    """Onset-only or onset+pitch F1 over two ``TabEvent`` sequences.
+
+    Mirrors the structure of :class:`TabF1Result` but represents the
+    looser matchers used to track audio-side performance independent
+    of string/fret assignment.
+    """
+
+    precision: float
+    recall: float
+    f1: float
+    true_positives: int
+    false_positives: int
+    false_negatives: int
+
+
+def event_f1(
+    predicted: Sequence[TabEvent],
+    gold: Sequence[TabEvent],
+    *,
+    match_pitch: bool = True,
+    onset_tolerance_s: float = 0.05,
+) -> EventF1Result:
+    """F1 over predicted-vs-gold events on onset (optionally + pitch).
+
+    With ``match_pitch=False`` this is onset F1 (SPEC §1.4 line 1).
+    With ``match_pitch=True`` (default) it is pitch F1 (SPEC §1.4 line 2).
+    String / fret agreement is ignored — that is what :func:`tab_f1` is for.
+    """
+    pred_sorted = sorted(predicted, key=lambda t: t.onset_s)
+    gold_sorted = sorted(gold, key=lambda t: t.onset_s)
+    gold_used = [False] * len(gold_sorted)
+    tp = 0
+    fp = 0
+    for p in pred_sorted:
+        best_j = -1
+        best_dt = onset_tolerance_s + 1e-9
+        for j, g in enumerate(gold_sorted):
+            if gold_used[j]:
+                continue
+            if match_pitch and g.pitch_midi != p.pitch_midi:
+                continue
+            dt = abs(g.onset_s - p.onset_s)
+            if dt <= onset_tolerance_s and dt < best_dt:
+                best_j = j
+                best_dt = dt
+        if best_j >= 0:
+            gold_used[best_j] = True
+            tp += 1
+        else:
+            fp += 1
+    fn = sum(1 for used in gold_used if not used)
+    precision = tp / (tp + fp) if (tp + fp) > 0 else 0.0
+    recall = tp / (tp + fn) if (tp + fn) > 0 else 0.0
+    f1 = (
+        2 * precision * recall / (precision + recall)
+        if (precision + recall) > 0
+        else 0.0
+    )
+    return EventF1Result(
+        precision=precision,
+        recall=recall,
+        f1=f1,
+        true_positives=tp,
+        false_positives=fp,
+        false_negatives=fn,
+    )
+
+
 __all__ = [
-    "TabF1Result",
     "ChordAccuracyResult",
-    "tab_f1",
+    "EventF1Result",
+    "TabF1Result",
     "chord_instance_accuracy",
+    "event_f1",
+    "tab_f1",
 ]
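Reviewer sketch of the precision / recall / F1 arithmetic at the end of `event_f1`, including the zero-denominator guards. The helper name `prf` and the counts are hypothetical. The matching loop itself is not reproduced here.

```python
def prf(tp, fp, fn):
    """Precision, recall and F1 from match counts, guarding empty denominators."""
    precision = tp / (tp + fp) if (tp + fp) > 0 else 0.0
    recall = tp / (tp + fn) if (tp + fn) > 0 else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if (precision + recall) > 0
        else 0.0
    )
    return precision, recall, f1


# Invented counts: 8 matched predictions, 2 unmatched predictions,
# 4 unmatched gold events.
p, r, f = prf(tp=8, fp=2, fn=4)
print(p, r, f)

# Empty inputs fall through to 0.0 instead of dividing by zero.
print(prf(0, 0, 0))
```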
diff --git a/tabvision/tabvision/eval/parsers/__init__.py b/tabvision/tabvision/eval/parsers/__init__.py
new file mode 100644
index 0000000..656e8a8
--- /dev/null
+++ b/tabvision/tabvision/eval/parsers/__init__.py
@@ -0,0 +1,31 @@
+"""Annotation parsers — uniform interface for source-specific tab labels.
+
+Each parser module exposes:
+
+- ``FORMAT_NAME``: the string key that appears in
+  ``Manifest.clip.annotation_format`` (added in Phase 0 to support
+  multi-source composite eval).
+- ``parse(annotation_path, cfg) -> list[TabEvent]``: pure function;
+  no I/O outside the file at ``annotation_path``.
+
+Submodule imports below trigger registration in
+:mod:`tabvision.eval.parsers.registry`.
+"""
+
+# Built-in parsers — importing them registers their FORMAT_NAME.
+from tabvision.eval.parsers import guitar_techs_midi, guitarset_jams  # noqa: F401
+from tabvision.eval.parsers.registry import (
+    ParserFn,
+    clear_parsers,
+    get_parser,
+    list_parsers,
+    register_parser,
+)
+
+__all__ = [
+    "ParserFn",
+    "clear_parsers",
+    "get_parser",
+    "list_parsers",
+    "register_parser",
+]
diff --git a/tabvision/tabvision/eval/parsers/guitar_techs_midi.py b/tabvision/tabvision/eval/parsers/guitar_techs_midi.py
new file mode 100644
index 0000000..69b0cbd
--- /dev/null
+++ b/tabvision/tabvision/eval/parsers/guitar_techs_midi.py
@@ -0,0 +1,84 @@
+"""Guitar-TECHS 6-track MIDI annotation parser.
+
+Per arXiv:2501.03720 §3, Guitar-TECHS distributes one MIDI file per
+clip with six instrument tracks, each carrying the notes for one
+guitar string. The default ordering is low E → high E, matching the
+:class:`tabvision.types.GuitarConfig` ``tuning_midi`` convention
+(low E = ``string_idx`` 0).
+
+If a particular Guitar-TECHS release uses a different track ordering,
+pass ``track_to_string`` to ``parse`` directly; manifest-level support
+for parser arguments is deferred to a later phase.
+"""
+
+from __future__ import annotations
+
+from pathlib import Path
+
+from tabvision.eval.parsers.registry import register_parser
+from tabvision.types import GuitarConfig, TabEvent
+
+FORMAT_NAME = "guitar_techs_midi"
+
+DEFAULT_TRACK_TO_STRING: tuple[int, ...] = (0, 1, 2, 3, 4, 5)
+"""Track-index → ``string_idx`` mapping; default = identity (low E first)."""
+
+
+def parse(
+    midi_path: str | Path,
+    cfg: GuitarConfig | None = None,
+    *,
+    track_to_string: tuple[int, ...] = DEFAULT_TRACK_TO_STRING,
+) -> list[TabEvent]:
+    """Parse Guitar-TECHS MIDI into v1 :class:`TabEvent` gold notes.
+
+    Pitch ``p`` on the track mapped to string ``s`` is assigned
+    ``fret = p - cfg.tuning_midi[s]``. Notes that would imply a fret
+    below ``cfg.capo`` or above ``cfg.max_fret`` are dropped.
+    """
+    try:
+        import pretty_midi  # noqa: PLC0415
+    except ImportError as exc:  # pragma: no cover - skip path
+        raise ImportError(
+            "guitar_techs_midi parser requires pretty_midi. Install with: "
+            "pip install -e 'tabvision[audio-highres]'"
+        ) from exc
+
+    if cfg is None:
+        cfg = GuitarConfig()
+
+    midi = pretty_midi.PrettyMIDI(str(midi_path))
+
+    out: list[TabEvent] = []
+    for track_index, instrument in enumerate(midi.instruments):
+        if track_index >= len(track_to_string):
+            break
+        string_idx = track_to_string[track_index]
+        if not 0 <= string_idx < cfg.n_strings:
+            continue
+
+        open_pitch = cfg.tuning_midi[string_idx]
+        for note in instrument.notes:
+            pitch_midi = int(note.pitch)
+            fret = pitch_midi - open_pitch
+            if fret < cfg.capo or fret > cfg.max_fret:
+                continue
+            out.append(
+                TabEvent(
+                    onset_s=float(note.start),
+                    duration_s=float(max(0.0, note.end - note.start)),
+                    string_idx=string_idx,
+                    fret=fret,
+                    pitch_midi=pitch_midi,
+                    confidence=1.0,
+                )
+            )
+
+    out.sort(key=lambda ev: (ev.onset_s, ev.string_idx, ev.fret))
+    return out
+
+
+register_parser(FORMAT_NAME, parse)
+
+
+__all__ = ["DEFAULT_TRACK_TO_STRING", "FORMAT_NAME", "parse"]
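Reviewer sketch of the fret derivation the parser docstring describes: `fret = pitch - open_pitch`, dropping notes outside the capo / max-fret range. The open pitches are the standard-tuning values used by the smoke tests in this PR. The helper name `fret_for` and the `max_fret=24` default are assumptions for illustration, not values read from `GuitarConfig`.

```python
# Standard-tuning open pitches, low E first (string_idx 0).
STANDARD_TUNING = (40, 45, 50, 55, 59, 64)


def fret_for(pitch_midi, string_idx, capo=0, max_fret=24):
    """Return the implied fret, or None when the note would be dropped."""
    fret = pitch_midi - STANDARD_TUNING[string_idx]
    if fret < capo or fret > max_fret:
        return None
    return fret


print(fret_for(52, 0))  # E3 on the low E string
print(fret_for(64, 5))  # open high E
print(fret_for(39, 0))  # below the open string, so dropped
```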
diff --git a/tabvision/tabvision/eval/parsers/guitarset_jams.py b/tabvision/tabvision/eval/parsers/guitarset_jams.py
new file mode 100644
index 0000000..566d2cb
--- /dev/null
+++ b/tabvision/tabvision/eval/parsers/guitarset_jams.py
@@ -0,0 +1,18 @@
+"""GuitarSet JAMS annotation parser.
+
+Wraps the existing :func:`tabvision.eval.guitarset_audio.parse_guitarset_jams`
+under the uniform parser interface so composite-eval dispatch can route
+``annotation_format = "guitarset_jams"`` clips here.
+"""
+
+from __future__ import annotations
+
+from tabvision.eval.guitarset_audio import parse_guitarset_jams as parse
+from tabvision.eval.parsers.registry import register_parser
+
+FORMAT_NAME = "guitarset_jams"
+
+register_parser(FORMAT_NAME, parse)
+
+
+__all__ = ["FORMAT_NAME", "parse"]
diff --git a/tabvision/tabvision/eval/parsers/registry.py b/tabvision/tabvision/eval/parsers/registry.py
new file mode 100644
index 0000000..99a29de
--- /dev/null
+++ b/tabvision/tabvision/eval/parsers/registry.py
@@ -0,0 +1,69 @@
+"""Annotation-parser registry.
+
+Each annotation source (GuitarSet JAMS, Guitar-TECHS 6-track MIDI, EGDB
+GuitarPro, etc.) gets a parser module that registers itself here on
+import. Composite-eval dispatch then routes by
+``Manifest.clip.annotation_format`` to the registered parser.
+
+This file is import-side-effect free: the registry is empty at first
+import. Built-in parsers are registered by ``parsers/__init__.py``
+importing their submodules.
+"""
+
+from __future__ import annotations
+
+from collections.abc import Callable
+from pathlib import Path
+
+from tabvision.types import GuitarConfig, TabEvent
+
+ParserFn = Callable[[str | Path, GuitarConfig | None], list[TabEvent]]
+"""``(annotation_path, cfg) -> list[TabEvent]``. ``cfg`` may be ``None``."""
+
+
+_PARSERS: dict[str, ParserFn] = {}
+
+
+def register_parser(format_name: str, fn: ParserFn) -> None:
+    """Register ``fn`` as the parser for ``format_name``.
+
+    Raises ``ValueError`` if ``format_name`` is already registered.
+    """
+    if format_name in _PARSERS:
+        raise ValueError(
+            f"Parser already registered for format {format_name!r}; "
+            f"call clear_parsers() first if this is intentional."
+        )
+    _PARSERS[format_name] = fn
+
+
+def get_parser(format_name: str) -> ParserFn:
+    """Look up the parser for ``format_name``.
+
+    Raises ``KeyError`` with the list of known formats if not registered.
+    """
+    if format_name not in _PARSERS:
+        known = ", ".join(sorted(_PARSERS)) or "(none registered)"
+        raise KeyError(
+            f"Unknown annotation format: {format_name!r}. Known: {known}."
+        )
+    return _PARSERS[format_name]
+
+
+def list_parsers() -> list[str]:
+    """Return the sorted list of registered format names."""
+    return sorted(_PARSERS)
+
+
+def clear_parsers() -> None:
+    """Remove all registered parsers. For tests only."""
+    _PARSERS.clear()
+
+
+__all__ = [
+    "ParserFn",
+    "clear_parsers",
+    "get_parser",
+    "list_parsers",
+    "register_parser",
+]
diff --git a/tabvision/tests/integration/test_composite_eval_smoke.py b/tabvision/tests/integration/test_composite_eval_smoke.py
new file mode 100644
index 0000000..63faa13
--- /dev/null
+++ b/tabvision/tests/integration/test_composite_eval_smoke.py
@@ -0,0 +1,486 @@
+"""Integration smoke tests for the composite-eval harness (Phase 0)."""
+
+from __future__ import annotations
+
+import json
+from pathlib import Path
+
+import pytest
+
+from tabvision.eval.composite import (
+    Predictor,
+    run_composite_eval,
+)
+from tabvision.types import SessionConfig, TabEvent
+
+# Standard tuning open pitches for derived MIDI.
+_OPEN_PITCH = (40, 45, 50, 55, 59, 64)
+
+
+def _write_jams(
+    path: Path,
+    notes: list[tuple[float, float, int, int]],
+) -> None:
+    """Write a minimal GuitarSet-style JAMS at ``path``.
+
+    Each ``notes`` tuple is ``(onset_s, duration_s, string_idx, fret)``.
+    """
+    by_string: dict[int, list[dict[str, float]]] = {}
+    for onset, duration, string_idx, fret in notes:
+        midi = _OPEN_PITCH[string_idx] + fret
+        by_string.setdefault(string_idx, []).append(
+            {"time": float(onset), "duration": float(duration), "value": float(midi)}
+        )
+    payload = {
+        "annotations": [
+            {
+                "namespace": "note_midi",
+                "annotation_metadata": {"data_source": str(string_idx)},
+                "data": data,
+            }
+            for string_idx, data in sorted(by_string.items())
+        ]
+    }
+    path.write_text(json.dumps(payload), encoding="utf-8")
+
+
+def _tab_event(onset: float, duration: float, string_idx: int, fret: int) -> TabEvent:
+    return TabEvent(
+        onset_s=onset,
+        duration_s=duration,
+        string_idx=string_idx,
+        fret=fret,
+        pitch_midi=_OPEN_PITCH[string_idx] + fret,
+        confidence=1.0,
+    )
+
+
+def _write_manifest(
+    manifest_path: Path,
+    entries: list[dict[str, str]],
+) -> None:
+    """Build a TOML manifest from a list of clip-dict entries."""
+    lines: list[str] = []
+    for entry in entries:
+        lines.append("[[clips]]")
+        for key, value in entry.items():
+            lines.append(f"{key} = {json.dumps(value, ensure_ascii=False)}")
+        lines.append("")
+    manifest_path.write_text("\n".join(lines), encoding="utf-8")
+
+
+def _make_predictor(gold_by_path: dict[str, list[TabEvent]]) -> Predictor:
+    """Return a predictor that echoes gold for each known path."""
+
+    def predict(media_path: Path, session: SessionConfig) -> list[TabEvent]:
+        del session
+        key = str(media_path)
+        if key not in gold_by_path:
+            raise KeyError(f"unknown media path in test: {key}")
+        return list(gold_by_path[key])
+
+    return predict
+
+
+def _shifted_predictor(gold_by_path: dict[str, list[TabEvent]]) -> Predictor:
+    """Return a predictor that shifts every event to a different string with the same pitch."""
+
+    def predict(media_path: Path, session: SessionConfig) -> list[TabEvent]:
+        del session
+        gold = gold_by_path[str(media_path)]
+        out: list[TabEvent] = []
+        for event in gold:
+            for candidate_string in range(6):
+                if candidate_string == event.string_idx:
+                    continue
+                fret = event.pitch_midi - _OPEN_PITCH[candidate_string]
+                if 0 <= fret <= 24:
+                    out.append(
+                        TabEvent(
+                            onset_s=event.onset_s,
+                            duration_s=event.duration_s,
+                            string_idx=candidate_string,
+                            fret=fret,
+                            pitch_midi=event.pitch_midi,
+                            confidence=event.confidence,
+                        )
+                    )
+                    break
+        return out
+
+    return predict
+
+
+def _build_two_tier_manifest(tmp_path: Path) -> tuple[Path, dict[str, list[TabEvent]]]:
+    """Two clips in clean_acoustic_strummed + one in clean_acoustic_single_line.
+
+    Returns (manifest_path, gold_by_media_path).
+    """
+    # Mid-range pitches so that ``_shifted_predictor`` in the tests below can
+    # find a legal alternate string (low pitches like low-E fret 3 can only
+    # live on string 0; shifting them yields no prediction).
+    clips = [
+        (
+            "guitarset-strum-01",
+            "clean_acoustic_strummed",
+            [(0.0, 0.5, 0, 7), (0.0, 0.5, 1, 7), (0.0, 0.5, 2, 7)],
+        ),
+        (
+            "guitarset-strum-02",
+            "clean_acoustic_strummed",
+            [(1.0, 0.4, 3, 5), (1.5, 0.4, 4, 5)],
+        ),
+        (
+            "guitarset-single-01",
+            "clean_acoustic_single_line",
+            [(0.0, 0.2, 2, 5), (0.5, 0.2, 2, 7), (1.0, 0.2, 2, 9)],
+        ),
+    ]
+
+    gold_by_path: dict[str, list[TabEvent]] = {}
+    entries: list[dict[str, str]] = []
+    for clip_id, tier, notes in clips:
+        jams_path = tmp_path / f"{clip_id}.jams"
+        media_path = tmp_path / f"{clip_id}.wav"
+        media_path.write_bytes(b"")  # zero-byte placeholder; predictor doesn't read it
+        _write_jams(jams_path, notes)
+        gold_by_path[str(media_path)] = [
+            _tab_event(o, d, s, f) for (o, d, s, f) in notes
+        ]
+        entries.append(
+            {
+                "id": clip_id,
+                "tier": tier,
+                "source": "GuitarSet",
+                "split": "validation",
+                "media_path": str(media_path),
+                "annotation_path": str(jams_path),
+                "annotation_format": "guitarset_jams",
+            }
+        )
+
+    manifest_path = tmp_path / "composite.toml"
+    _write_manifest(manifest_path, entries)
+    return manifest_path, gold_by_path
+
+
+def test_perfect_predictor_yields_pass_on_both_tiers(tmp_path: Path) -> None:
+    manifest_path, gold_by_path = _build_two_tier_manifest(tmp_path)
+    predictor = _make_predictor(gold_by_path)
+
+    report = run_composite_eval(
+        manifest_path,
+        predictor=predictor,
+        bootstrap_n=500,
+        bootstrap_seed=42,
+    )
+
+    assert set(report.tiers) == {
+        "clean_acoustic_strummed",
+        "clean_acoustic_single_line",
+    }
+    for tier, tier_report in report.tiers.items():
+        assert tier_report.tab_f1.statistic == pytest.approx(1.0), (
+            f"tier {tier} should be perfect with echo predictor"
+        )
+        assert tier_report.onset_f1.statistic == pytest.approx(1.0)
+        assert tier_report.pitch_f1.statistic == pytest.approx(1.0)
+
+
+def test_acceptance_helper_classifies_pass_gap_fail(tmp_path: Path) -> None:
+    manifest_path, gold_by_path = _build_two_tier_manifest(tmp_path)
+    report = run_composite_eval(
+        manifest_path,
+        predictor=_make_predictor(gold_by_path),
+        bootstrap_n=500,
+    )
+
+    targets = {
+        "clean_acoustic_strummed": 0.90,
+        "clean_acoustic_single_line": 0.85,
+        "clean_electric": 0.87,  # not in manifest
+    }
+    statuses = report.tab_f1_acceptance(targets)
+    assert statuses["clean_acoustic_strummed"] == "pass"
+    assert statuses["clean_acoustic_single_line"] == "pass"
+    assert statuses["clean_electric"] == "missing"
+
+
+def test_shifted_predictor_populates_wrong_position_bucket(tmp_path: Path) -> None:
+    """Every prediction same-pitch different-string → fills wrong_position_same_pitch."""
+    manifest_path, gold_by_path = _build_two_tier_manifest(tmp_path)
+    predictor = _shifted_predictor(gold_by_path)
+
+    report = run_composite_eval(
+        manifest_path,
+        predictor=predictor,
+        bootstrap_n=500,
+    )
+
+    strum = report.tiers["clean_acoustic_strummed"].errors
+    # All predictions are pitch-correct but position-wrong: zero correct,
+    # all in the wrong_position bucket.
+    assert strum.correct == 0
+    assert strum.wrong_position_same_pitch > 0
+    assert strum.pitch_off == 0
+    assert strum.missed_onset == 0
+
+
+def test_train_clips_skipped_by_default(tmp_path: Path) -> None:
+    """A train-split clip should not appear in per_clip results."""
+    jams_path = tmp_path / "train.jams"
+    media_path = tmp_path / "train.wav"
+    media_path.write_bytes(b"")
+    _write_jams(jams_path, [(0.0, 0.2, 0, 0)])
+
+    manifest_path = tmp_path / "composite.toml"
+    _write_manifest(
+        manifest_path,
+        [
+            {
+                "id": "train-01",
+                "tier": "clean_acoustic_single_line",
+                "source": "GuitarSet",
+                "split": "train",
+                "media_path": str(media_path),
+                "annotation_path": str(jams_path),
+                "annotation_format": "guitarset_jams",
+            }
+        ],
+    )
+
+    report = run_composite_eval(
+        manifest_path,
+        predictor=_make_predictor({}),
+        bootstrap_n=100,
+    )
+
+    assert report.per_clip == []
+    assert report.tiers == {}
+
+
+def test_explicit_train_split_includes_train_clips(tmp_path: Path) -> None:
+    jams_path = tmp_path / "train.jams"
+    media_path = tmp_path / "train.wav"
+    media_path.write_bytes(b"")
+    notes = [(0.0, 0.2, 0, 0)]
+    _write_jams(jams_path, notes)
+
+    manifest_path = tmp_path / "composite.toml"
+    _write_manifest(
+        manifest_path,
+        [
+            {
+                "id": "train-01",
+                "tier": "clean_acoustic_single_line",
+                "source": "GuitarSet",
+                "split": "train",
+                "media_path": str(media_path),
+                "annotation_path": str(jams_path),
+                "annotation_format": "guitarset_jams",
+            }
+        ],
+    )
+
+    gold = {str(media_path): [_tab_event(o, d, s, f) for (o, d, s, f) in notes]}
+    report = run_composite_eval(
+        manifest_path,
+        predictor=_make_predictor(gold),
+        splits=("train",),
+        bootstrap_n=100,
+    )
+
+    assert len(report.per_clip) == 1
+    assert report.per_clip[0].clip_id == "train-01"
+
+
+def test_rejects_manifest_with_fail_issues(tmp_path: Path) -> None:
+    """Missing required field (annotation_format) should block the eval."""
+    jams_path = tmp_path / "clip.jams"
+    media_path = tmp_path / "clip.wav"
+    media_path.write_bytes(b"")
+    _write_jams(jams_path, [(0.0, 0.2, 0, 0)])
+
+    manifest_path = tmp_path / "composite.toml"
+    _write_manifest(
+        manifest_path,
+        [
+            {
+                "id": "clip-no-format",
+                "tier": "clean_acoustic_single_line",
+                "source": "GuitarSet",
+                "split": "validation",
+                "media_path": str(media_path),
+                "annotation_path": str(jams_path),
+                # annotation_format intentionally omitted
+            }
+        ],
+    )
+
+    with pytest.raises(ValueError, match="fail-severity"):
+        run_composite_eval(
+            manifest_path,
+            predictor=_make_predictor({}),
+            bootstrap_n=100,
+        )
+
+
+def test_unknown_parser_format_raises(tmp_path: Path) -> None:
+    """A manifest referencing an unregistered parser should raise KeyError at dispatch."""
+    jams_path = tmp_path / "clip.jams"
+    media_path = tmp_path / "clip.wav"
+    media_path.write_bytes(b"")
+    _write_jams(jams_path, [(0.0, 0.2, 0, 0)])
+
+    manifest_path = tmp_path / "composite.toml"
+    _write_manifest(
+        manifest_path,
+        [
+            {
+                "id": "weird",
+                "tier": "clean_acoustic_single_line",
+                "source": "Unknown",
+                "split": "validation",
+                "media_path": str(media_path),
+                "annotation_path": str(jams_path),
+                "annotation_format": "non_existent_format",
+            }
+        ],
+    )
+
+    with pytest.raises(KeyError, match="non_existent_format"):
+        run_composite_eval(
+            manifest_path,
+            predictor=_make_predictor({}),
+            bootstrap_n=100,
+        )
+
+
+def test_data_root_substitution_uses_env_var(
+    tmp_path: Path,
+    monkeypatch: pytest.MonkeyPatch,
+) -> None:
+    """$TABVISION_DATA_ROOT in paths is expanded from the env var when no root is passed."""
+    data_root = tmp_path / "data"
+    data_root.mkdir()
+    jams_path = data_root / "clip.jams"
+    media_path = data_root / "clip.wav"
+    media_path.write_bytes(b"")
+    _write_jams(jams_path, [(0.0, 0.2, 0, 0)])
+
+    manifest_path = tmp_path / "composite.toml"
+    _write_manifest(
+        manifest_path,
+        [
+            {
+                "id": "with-root",
+                "tier": "clean_acoustic_single_line",
+                "source": "GuitarSet",
+                "split": "validation",
+                "media_path": "$TABVISION_DATA_ROOT/clip.wav",
+                "annotation_path": "$TABVISION_DATA_ROOT/clip.jams",
+                "annotation_format": "guitarset_jams",
+            }
+        ],
+    )
+
+    monkeypatch.setenv("TABVISION_DATA_ROOT", str(data_root))
+    gold = {str(media_path): [_tab_event(0.0, 0.2, 0, 0)]}
+
+    report = run_composite_eval(
+        manifest_path,
+        predictor=_make_predictor(gold),
+        bootstrap_n=100,
+    )
+
+    assert len(report.per_clip) == 1
+
+
+def test_data_root_substitution_uses_function_arg(
+    tmp_path: Path,
+    monkeypatch: pytest.MonkeyPatch,
+) -> None:
+    """``media_root`` / ``annotation_root`` args override the env var."""
+    real_root = tmp_path / "real"
+    real_root.mkdir()
+    jams_path = real_root / "clip.jams"
+    media_path = real_root / "clip.wav"
+    media_path.write_bytes(b"")
+    _write_jams(jams_path, [(0.0, 0.2, 0, 0)])
+
+    manifest_path = tmp_path / "composite.toml"
+    _write_manifest(
+        manifest_path,
+        [
+            {
+                "id": "rooted",
+                "tier": "clean_acoustic_single_line",
+                "source": "GuitarSet",
+                "split": "validation",
+                "media_path": "$TABVISION_DATA_ROOT/clip.wav",
+                "annotation_path": "$TABVISION_DATA_ROOT/clip.jams",
+                "annotation_format": "guitarset_jams",
+            }
+        ],
+    )
+
+    monkeypatch.setenv("TABVISION_DATA_ROOT", "/nonexistent")
+    gold = {str(media_path): [_tab_event(0.0, 0.2, 0, 0)]}
+
+    report = run_composite_eval(
+        manifest_path,
+        predictor=_make_predictor(gold),
+        media_root=str(real_root),
+        annotation_root=str(real_root),
+        bootstrap_n=100,
+    )
+
+    assert len(report.per_clip) == 1
+
+
+def test_per_clip_metrics_include_error_decomposition(tmp_path: Path) -> None:
+    """Each ClipEvalResult should carry the six-bucket decomposition."""
+    manifest_path, gold_by_path = _build_two_tier_manifest(tmp_path)
+    report = run_composite_eval(
+        manifest_path,
+        predictor=_make_predictor(gold_by_path),
+        bootstrap_n=100,
+    )
+
+    for clip_result in report.per_clip:
+        # Echo predictor → all gold notes should be correct
+        assert clip_result.errors.correct == clip_result.n_gold
+        assert clip_result.errors.total_loss == 0
+
+
+def test_clip_with_no_gold_or_predictions(tmp_path: Path) -> None:
+    """Empty-gold clip should not break aggregation; F1 is 0 by convention."""
+    jams_path = tmp_path / "empty.jams"
+    jams_path.write_text(json.dumps({"annotations": []}), encoding="utf-8")
+    media_path = tmp_path / "empty.wav"
+    media_path.write_bytes(b"")
+
+    manifest_path = tmp_path / "composite.toml"
+    _write_manifest(
+        manifest_path,
+        [
+            {
+                "id": "empty-clip",
+                "tier": "clean_acoustic_single_line",
+                "source": "GuitarSet",
+                "split": "validation",
+                "media_path": str(media_path),
+                "annotation_path": str(jams_path),
+                "annotation_format": "guitarset_jams",
+            }
+        ],
+    )
+
+    report = run_composite_eval(
+        manifest_path,
+        predictor=_make_predictor({str(media_path): []}),
+        bootstrap_n=100,
+    )
+
+    assert len(report.per_clip) == 1
+    assert report.per_clip[0].tab.f1 == 0.0
diff --git a/tabvision/tests/unit/test_bootstrap_ci.py b/tabvision/tests/unit/test_bootstrap_ci.py
new file mode 100644
index 0000000..0b71ca7
--- /dev/null
+++ b/tabvision/tests/unit/test_bootstrap_ci.py
@@ -0,0 +1,111 @@
+"""Tests for the bootstrap-CI helper (Phase 0)."""
+
+from __future__ import annotations
+
+import numpy as np
+import pytest
+
+from tabvision.eval.bootstrap import BootstrapResult, bootstrap_ci
+
+
+def test_returns_bootstrap_result_type():
+    r = bootstrap_ci([0.5, 0.6, 0.7])
+    assert isinstance(r, BootstrapResult)
+    assert r.n_observations == 3
+    assert r.n_bootstrap == 10_000
+    assert r.confidence == 0.95
+
+
+def test_deterministic_with_seed():
+    values = [0.10, 0.50, 0.90, 0.60, 0.30, 0.80]
+    r1 = bootstrap_ci(values, seed=42)
+    r2 = bootstrap_ci(values, seed=42)
+    assert r1.statistic == r2.statistic
+    assert r1.lower == r2.lower
+    assert r1.upper == r2.upper
+
+
+def test_different_seeds_produce_different_intervals():
+    values = [0.10, 0.50, 0.90, 0.60, 0.30, 0.80]
+    r1 = bootstrap_ci(values, seed=42)
+    r2 = bootstrap_ci(values, seed=43)
+    # CI endpoints may coincide on small data; require at least one to differ.
+    assert (r1.lower != r2.lower) or (r1.upper != r2.upper)
+
+
+def test_single_observation_has_zero_width_ci():
+    r = bootstrap_ci([0.85])
+    assert r.statistic == pytest.approx(0.85)
+    assert r.lower == r.statistic == r.upper
+    assert r.n_observations == 1
+    assert r.n_bootstrap == 0
+
+
+def test_rejects_empty_values():
+    with pytest.raises(ValueError, match="at least one observation"):
+        bootstrap_ci([])
+
+
+@pytest.mark.parametrize("bad_conf", [0.0, 1.0, -0.1, 1.5])
+def test_rejects_bad_confidence(bad_conf):
+    with pytest.raises(ValueError, match="confidence"):
+        bootstrap_ci([0.5, 0.6], confidence=bad_conf)
+
+
+def test_rejects_zero_bootstrap():
+    with pytest.raises(ValueError, match="n_bootstrap"):
+        bootstrap_ci([0.5, 0.6], n_bootstrap=0)
+
+
+def test_accepts_numpy_array():
+    arr = np.array([0.1, 0.5, 0.9])
+    r = bootstrap_ci(arr)
+    assert r.statistic == pytest.approx(0.5)
+    assert r.n_observations == 3
+
+
+def test_custom_statistic():
+    """Verify a non-mean statistic is honored."""
+    values = [1.0, 2.0, 3.0, 4.0, 10.0]
+    r_median = bootstrap_ci(values, statistic=np.median, seed=0)
+    r_mean = bootstrap_ci(values, statistic=np.mean, seed=0)
+    # Skewed sample: median (3.0) and mean (4.0) differ, so these assertions
+    # fail if the ``statistic`` argument is ignored.
+    assert r_median.statistic == pytest.approx(3.0)
+    assert r_mean.statistic == pytest.approx(4.0)
+
+
+def test_lower_le_statistic_le_upper():
+    values = [0.1, 0.3, 0.5, 0.7, 0.9, 0.2, 0.4, 0.6, 0.8]
+    r = bootstrap_ci(values, seed=7)
+    assert r.lower <= r.statistic <= r.upper
+
+
+def test_ci_brackets_known_normal_mean():
+    """Coverage check: 95% CI should contain the true mean in roughly 95% of trials.
+
+    Bootstrap percentile intervals are asymptotic — allow generous slack
+    so this isn't flaky. We require >= 88% coverage on a low-trial run
+    (200 trials, n_obs=80, n_bootstrap=500) for speed.
+    """
+    rng = np.random.default_rng(0)
+    n_trials = 200
+    n_obs = 80
+    true_mean = 0.85
+    sigma = 0.05
+    hits = 0
+    for trial in range(n_trials):
+        sample = rng.normal(true_mean, sigma, n_obs)
+        r = bootstrap_ci(sample, seed=trial, n_bootstrap=500)
+        if r.lower <= true_mean <= r.upper:
+            hits += 1
+    coverage = hits / n_trials
+    assert coverage >= 0.88, f"bootstrap coverage {coverage:.3f} below 0.88"
+
+
+def test_zero_variance_input_collapses_ci():
+    """If every observation is identical, the CI is a point."""
+    r = bootstrap_ci([0.5] * 10, seed=42)
+    assert r.statistic == pytest.approx(0.5)
+    assert r.lower == pytest.approx(0.5)
+    assert r.upper == pytest.approx(0.5)
diff --git a/tabvision/tests/unit/test_composite_report_formatting.py b/tabvision/tests/unit/test_composite_report_formatting.py
new file mode 100644
index 0000000..3a74b97
--- /dev/null
+++ b/tabvision/tests/unit/test_composite_report_formatting.py
@@ -0,0 +1,197 @@
+"""Smoke tests for the composite-eval markdown formatters (Phase 0)."""
+
+from __future__ import annotations
+
+from pathlib import Path
+
+import pytest
+
+from tabvision.eval.bootstrap import BootstrapResult
+from tabvision.eval.composite import (
+    DEFAULT_TIER_TARGETS,
+    ClipEvalResult,
+    CompositeReport,
+    TierReport,
+    format_baseline_markdown,
+    format_decomposition_markdown,
+)
+from tabvision.eval.error_decomposition import ErrorDecomposition
+from tabvision.eval.manifest import ManifestValidation
+from tabvision.eval.metrics import EventF1Result, TabF1Result
+
+
+def _bootstrap(value: float, lower: float, upper: float) -> BootstrapResult:
+    return BootstrapResult(
+        statistic=value,
+        lower=lower,
+        upper=upper,
+        n_observations=20,
+        n_bootstrap=10_000,
+        confidence=0.95,
+    )
+
+
+def _event_f1(value: float) -> EventF1Result:
+    return EventF1Result(
+        precision=value,
+        recall=value,
+        f1=value,
+        true_positives=10,
+        false_positives=1,
+        false_negatives=1,
+    )
+
+
+def _tab_f1(value: float) -> TabF1Result:
+    return TabF1Result(
+        precision=value,
+        recall=value,
+        f1=value,
+        true_positives=10,
+        false_positives=1,
+        false_negatives=1,
+    )
+
+
+def _clip(tier: str, source: str, tab_value: float) -> ClipEvalResult:
+    return ClipEvalResult(
+        clip_id=f"{source}-{tier}-x",
+        tier=tier,
+        source=source,
+        n_gold=12,
+        n_predicted=11,
+        onset=_event_f1(0.95),
+        pitch=_event_f1(0.92),
+        tab=_tab_f1(tab_value),
+        errors=ErrorDecomposition(
+            correct=10, wrong_position_same_pitch=1, missed_onset=1
+        ),
+    )
+
+
+def _report(tmp_path: Path) -> CompositeReport:
+    per_clip = [
+        _clip("clean_acoustic_strummed", "GuitarSet", 0.92),
+        _clip("clean_acoustic_strummed", "GuitarSet", 0.94),
+        _clip("clean_acoustic_single_line", "GuitarSet", 0.62),
+        _clip("clean_acoustic_single_line", "Guitar-TECHS", 0.71),
+    ]
+    tiers = {
+        "clean_acoustic_strummed": TierReport(
+            tier="clean_acoustic_strummed",
+            n_clips=2,
+            n_gold_total=24,
+            onset_f1=_bootstrap(0.95, 0.93, 0.97),
+            pitch_f1=_bootstrap(0.92, 0.90, 0.94),
+            tab_f1=_bootstrap(0.93, 0.91, 0.95),
+            errors=ErrorDecomposition(correct=20, wrong_position_same_pitch=2),
+        ),
+        "clean_acoustic_single_line": TierReport(
+            tier="clean_acoustic_single_line",
+            n_clips=2,
+            n_gold_total=24,
+            onset_f1=_bootstrap(0.95, 0.92, 0.98),
+            pitch_f1=_bootstrap(0.92, 0.90, 0.95),
+            tab_f1=_bootstrap(0.665, 0.55, 0.78),  # mean 0.665 < 0.85 target → fail
+            errors=ErrorDecomposition(
+                correct=10, wrong_position_same_pitch=10, missed_onset=4
+            ),
+        ),
+    }
+    validation = ManifestValidation(
+        manifest_path=str(tmp_path / "manifest.toml"),
+        passed=True,
+        clip_count=4,
+        clip_ids=["a", "b", "c", "d"],
+        present_tiers=["clean_acoustic_single_line", "clean_acoustic_strummed"],
+        missing_tiers=["clean_electric", "distorted_electric"],
+        items=[],
+    )
+    return CompositeReport(
+        manifest_path=str(tmp_path / "manifest.toml"),
+        manifest_validation=validation,
+        per_clip=per_clip,
+        tiers=tiers,
+        bootstrap_n=10_000,
+        bootstrap_seed=42,
+        onset_tolerance_s=0.05,
+    )
+
+
+def test_baseline_markdown_has_required_sections(tmp_path: Path) -> None:
+    md = format_baseline_markdown(_report(tmp_path))
+
+    assert "## Per-tier results" in md
+    assert "## Per-source breakdown" in md
+    assert "## Methodology" in md
+    for tier in DEFAULT_TIER_TARGETS:
+        assert tier in md
+
+
+def test_baseline_markdown_status_column(tmp_path: Path) -> None:
+    """The status column must categorise as pass / gap / fail / missing."""
+    md = format_baseline_markdown(_report(tmp_path))
+
+    # clean_acoustic_strummed: lower_95 = 0.91 >= 0.90 target → pass
+    strum_row = next(
+        line for line in md.split("\n") if line.startswith("| clean_acoustic_strummed")
+    )
+    assert "| pass |" in strum_row
+
+    # clean_acoustic_single_line: mean=0.665 < 0.85 → fail
+    single_row = next(
+        line for line in md.split("\n") if line.startswith("| clean_acoustic_single_line")
+    )
+    assert "| fail |" in single_row
+
+    # clean_electric: tier not in report → missing
+    electric_row = next(line for line in md.split("\n") if line.startswith("| clean_electric"))
+    assert "| missing |" in electric_row
+
+
+def test_baseline_markdown_methodology_includes_settings(tmp_path: Path) -> None:
+    md = format_baseline_markdown(
+        _report(tmp_path),
+        backend_label="highres",
+        position_prior_label="guitarset-v1",
+        eval_harness_sha="deadbeef",
+    )
+    assert "`highres`" in md
+    assert "`guitarset-v1`" in md
+    assert "`deadbeef`" in md
+    assert "Bootstrap: N=10,000" in md
+    assert "Onset tolerance: 50 ms" in md
+
+
+def test_decomposition_markdown_has_aggregate_and_per_tier(tmp_path: Path) -> None:
+    md = format_decomposition_markdown(_report(tmp_path))
+
+    assert "## Aggregate (all tiers)" in md
+    assert "## Per-tier breakdown" in md
+    # Bucket names should appear in the aggregate table
+    for bucket in (
+        "correct",
+        "wrong_position_same_pitch",
+        "pitch_off",
+        "timing_only",
+        "missed_onset",
+        "extra_detection",
+    ):
+        assert bucket in md
+
+
+def test_decomposition_markdown_aggregates_per_clip(tmp_path: Path) -> None:
+    """Aggregate row should sum per-clip decompositions, not duplicate per-tier."""
+    md = format_decomposition_markdown(_report(tmp_path))
+    # 4 clips × 10 correct each = 40
+    aggregate_section = md.split("## Per-tier breakdown")[0]
+    assert "| correct | 40 |" in aggregate_section
+
+
+@pytest.mark.parametrize(
+    "tier",
+    list(DEFAULT_TIER_TARGETS),
+)
+def test_default_targets_cover_all_required_tiers(tier: str) -> None:
+    assert tier in DEFAULT_TIER_TARGETS
+    assert 0.0 < DEFAULT_TIER_TARGETS[tier] <= 1.0
diff --git a/tabvision/tests/unit/test_error_decomposition.py b/tabvision/tests/unit/test_error_decomposition.py
new file mode 100644
index 0000000..3db377e
--- /dev/null
+++ b/tabvision/tests/unit/test_error_decomposition.py
@@ -0,0 +1,257 @@
+"""Tests for the Tab F1 error-decomposition module (Phase 0)."""
+
+from __future__ import annotations
+
+import pytest
+
+from tabvision.eval.error_decomposition import (
+    ErrorDecomposition,
+    aggregate_decompositions,
+    decompose_errors,
+)
+from tabvision.types import TabEvent
+
+
+def _ev(onset: float, string_idx: int, fret: int, *, pitch: int | None = None) -> TabEvent:
+    """Convenience: TabEvent with default duration, confidence, and derived pitch."""
+    # Standard tuning open pitches: low E to high E.
+    open_pitches = (40, 45, 50, 55, 59, 64)
+    pitch_midi = pitch if pitch is not None else open_pitches[string_idx] + fret
+    return TabEvent(
+        onset_s=onset,
+        duration_s=0.1,
+        string_idx=string_idx,
+        fret=fret,
+        pitch_midi=pitch_midi,
+        confidence=1.0,
+    )
+
+
+def test_perfect_match_all_correct() -> None:
+    gold = [_ev(0.0, 0, 0), _ev(0.5, 2, 5), _ev(1.0, 4, 3)]
+    pred = list(gold)
+
+    r = decompose_errors(pred, gold)
+
+    assert r.correct == 3
+    assert r.total_loss == 0
+    assert r.wrong_position_same_pitch == 0
+    assert r.missed_onset == 0
+    assert r.extra_detection == 0
+
+
+def test_wrong_position_same_pitch_bucket() -> None:
+    """E4 (MIDI 64) on high-E open vs MIDI 64 on G string fret 9: same pitch, different position."""
+    gold = [_ev(0.0, 5, 0, pitch=64)]  # high E open, MIDI 64
+    pred = [_ev(0.0, 3, 9, pitch=64)]  # MIDI 64 on the G string (index 3), fret 9: same pitch
+
+    r = decompose_errors(pred, gold)
+
+    assert r.correct == 0
+    assert r.wrong_position_same_pitch == 1
+    assert r.pitch_off == 0
+
+
+def test_pitch_off_bucket() -> None:
+    """Onset matches strictly but the predicted pitch is wrong."""
+    gold = [_ev(0.0, 0, 0, pitch=40)]
+    pred = [_ev(0.01, 0, 1, pitch=41)]  # onset within tolerance, but wrong pitch
+
+    r = decompose_errors(pred, gold)
+
+    assert r.pitch_off == 1
+    assert r.correct == 0
+    assert r.wrong_position_same_pitch == 0
+
+
+def test_timing_only_bucket() -> None:
+    """Correct position + pitch, but onset just outside strict tolerance, within extended."""
+    gold = [_ev(0.0, 0, 0)]
+    pred = [_ev(0.10, 0, 0)]  # 100 ms off — outside strict (50 ms), within extended (150 ms)
+
+    r = decompose_errors(pred, gold)
+
+    assert r.timing_only == 1
+    assert r.correct == 0
+    assert r.missed_onset == 0
+
+
+def test_missed_onset_bucket() -> None:
+    """Gold event with no predicted event nearby at all."""
+    gold = [_ev(0.0, 0, 0)]
+    pred: list[TabEvent] = []
+
+    r = decompose_errors(pred, gold)
+
+    assert r.missed_onset == 1
+    assert r.extra_detection == 0
+
+
+def test_extra_detection_bucket() -> None:
+    """Predicted event with no gold event nearby at all."""
+    gold: list[TabEvent] = []
+    pred = [_ev(0.0, 0, 0)]
+
+    r = decompose_errors(pred, gold)
+
+    assert r.extra_detection == 1
+    assert r.missed_onset == 0
+
+
+def test_predicted_far_from_gold_yields_missed_and_extra() -> None:
+    """Far-apart events should bucket as missed + extra, not pair up."""
+    gold = [_ev(0.0, 0, 0)]
+    pred = [_ev(10.0, 0, 0)]
+
+    r = decompose_errors(pred, gold)
+
+    assert r.missed_onset == 1
+    assert r.extra_detection == 1
+    assert r.correct == 0
+
+
+def test_mixed_buckets() -> None:
+    """A mixed scenario across all buckets at once."""
+    gold = [
+        _ev(0.0, 0, 0),             # correct match
+        _ev(0.5, 5, 0, pitch=64),   # wrong-position match (MIDI 64 placed elsewhere)
+        _ev(1.0, 2, 5, pitch=55),   # pitch_off (pred at wrong position with wrong pitch)
+        _ev(1.5, 3, 7),             # timing_only (pred is 100 ms late)
+        _ev(2.0, 4, 3),             # missed_onset
+    ]
+    pred = [
+        _ev(0.01, 0, 0),                  # → correct
+        _ev(0.51, 2, 9, pitch=64),        # → wrong_position_same_pitch
+        _ev(1.01, 0, 3),                  # → pitch_off (low E fret 3 → MIDI 43, ≠ gold's 55)
+        _ev(1.60, 3, 7),                  # → timing_only (100 ms late)
+        # Nothing near gold[4] at 2.0 → missed_onset
+        _ev(5.0, 0, 0),                   # → extra_detection (far from any gold)
+    ]
+
+    r = decompose_errors(pred, gold)
+
+    assert r.correct == 1
+    assert r.wrong_position_same_pitch == 1
+    assert r.pitch_off == 1
+    assert r.timing_only == 1
+    assert r.missed_onset == 1
+    assert r.extra_detection == 1
+
+
+def test_share_of_loss_sums_to_one() -> None:
+    r = ErrorDecomposition(
+        correct=10,
+        wrong_position_same_pitch=3,
+        pitch_off=2,
+        timing_only=1,
+        missed_onset=2,
+        extra_detection=2,
+    )
+    shares = r.share_of_loss()
+    assert sum(shares.values()) == pytest.approx(1.0)
+    assert shares["wrong_position_same_pitch"] == pytest.approx(3 / 10)
+
+
+def test_share_of_loss_zero_when_no_loss() -> None:
+    r = ErrorDecomposition(correct=5)
+    shares = r.share_of_loss()
+    assert all(v == 0.0 for v in shares.values())
+
+
+def test_total_gold_excludes_extra_detection() -> None:
+    r = ErrorDecomposition(
+        correct=10, wrong_position_same_pitch=2, pitch_off=1, missed_onset=3, extra_detection=5
+    )
+    # total_gold = correct + wrong_pos + pitch_off + timing_only + missed_onset
+    assert r.total_gold == 16
+    # total_predicted = correct + wrong_pos + pitch_off + timing_only + extra_detection
+    assert r.total_predicted == 18
+
+
+def test_aggregate_decompositions_sums_bucketwise() -> None:
+    a = ErrorDecomposition(correct=5, wrong_position_same_pitch=2)
+    b = ErrorDecomposition(correct=10, missed_onset=3, extra_detection=1)
+    agg = aggregate_decompositions([a, b])
+    assert agg.correct == 15
+    assert agg.wrong_position_same_pitch == 2
+    assert agg.missed_onset == 3
+    assert agg.extra_detection == 1
+    assert agg.pitch_off == 0
+
+
+def test_aggregate_empty_returns_zeros() -> None:
+    agg = aggregate_decompositions([])
+    assert agg == ErrorDecomposition()
+    assert agg.total_loss == 0
+
+
+def test_rejects_invalid_tolerances() -> None:
+    with pytest.raises(ValueError, match="onset_tolerance_s"):
+        decompose_errors([], [], onset_tolerance_s=0.0)
+    with pytest.raises(ValueError, match=">="):
+        decompose_errors([], [], onset_tolerance_s=0.1, timing_extended_tolerance_s=0.05)
+
+
+def test_each_pred_matches_at_most_one_gold() -> None:
+    """Two gold events at the same time should not both claim one pred."""
+    gold = [_ev(0.0, 0, 0), _ev(0.0, 0, 0)]
+    pred = [_ev(0.0, 0, 0)]
+
+    r = decompose_errors(pred, gold)
+
+    assert r.correct == 1
+    assert r.missed_onset == 1
+    assert r.extra_detection == 0
+
+
+def test_greedy_picks_closest_onset() -> None:
+    """When multiple same-position preds are within tolerance, the closest-by-onset wins."""
+    gold = [_ev(0.0, 0, 0)]
+    pred = [_ev(0.04, 0, 0), _ev(0.01, 0, 0)]  # both within 50 ms; 0.01 is closer
+
+    r = decompose_errors(pred, gold)
+
+    assert r.correct == 1
+    assert r.extra_detection == 1
+
+
+def test_chord_cluster_priority_pitch_over_onset() -> None:
+    """Multi-gold same-onset chord: matcher should pair by pitch, not by onset proximity.
+
+    Two gold events at the same onset with different pitches, paired
+    with two preds whose pitches match the gold (but whose on-the-wire
+    ordering doesn't). Onset-only greediness would mis-pair them and
+    inflate ``pitch_off``. The priority-based matcher must pair on
+    pitch.
+    """
+    gold = [
+        _ev(0.0, 0, 0, pitch=40),  # low E
+        _ev(0.0, 1, 2, pitch=47),  # A string fret 2
+    ]
+    pred = [
+        # Different on-the-wire order: pitch=47 first.
+        _ev(0.01, 1, 2, pitch=47),  # → matches gold[1] (correct)
+        _ev(0.01, 0, 0, pitch=40),  # → matches gold[0] (correct)
+    ]
+
+    r = decompose_errors(pred, gold)
+
+    assert r.correct == 2
+    assert r.pitch_off == 0
+    assert r.wrong_position_same_pitch == 0
+
+
+def test_chord_cluster_priority_falls_back_to_position_match_then_pitch() -> None:
+    """When a nearer-in-onset pred matches neither position nor pitch,
+    the same-position match still wins for ``correct`` accounting.
+    """
+    gold = [_ev(0.0, 0, 0, pitch=40)]
+    pred = [
+        # Closer in onset, but matches neither the gold position nor the gold pitch.
+        _ev(0.005, 5, 0, pitch=64),  # noise; nothing in common
+        _ev(0.020, 0, 0, pitch=40),  # exact match; further in onset
+    ]
+
+    r = decompose_errors(pred, gold)
+
+    assert r.correct == 1  # picked the same-position match even though it's further
diff --git a/tabvision/tests/unit/test_eval_manifest.py b/tabvision/tests/unit/test_eval_manifest.py
index 7810ce1..bad81d4 100644
--- a/tabvision/tests/unit/test_eval_manifest.py
+++ b/tabvision/tests/unit/test_eval_manifest.py
@@ -55,7 +55,8 @@ def test_manifest_validation_is_json_serializable_and_sorted(tmp_path: Path) ->
 source = "EGDB"
 split = "test"
 media_path = "$TABVISION_DATA_ROOT/egdb/b.wav"
-annotation_path = "$TABVISION_DATA_ROOT/egdb/b.jams"
+annotation_path = "$TABVISION_DATA_ROOT/egdb/b.gp5"
+annotation_format = "egdb_gp"
 
 [[clips]]
 id = "a"
@@ -64,6 +65,7 @@ def test_manifest_validation_is_json_serializable_and_sorted(tmp_path: Path) ->
 split = "validation"
 media_path = "$TABVISION_DATA_ROOT/guitarset/a.wav"
 annotation_path = "$TABVISION_DATA_ROOT/guitarset/a.jams"
+annotation_format = "guitarset_jams"
 """.strip()
         + "\n",
         encoding="utf-8",
@@ -78,3 +80,112 @@ def test_manifest_validation_is_json_serializable_and_sorted(tmp_path: Path) ->
     assert payload["present_tiers"] == ["clean_acoustic_strummed", "distorted_electric"]
     assert payload["passed"] is True
     assert tomllib.loads(manifest.read_text(encoding="utf-8"))["clips"][0]["id"] == "b"
+
+
+def test_annotation_format_is_required(tmp_path: Path) -> None:
+    """Phase 0: every clip must declare its parser dispatch key."""
+    manifest = tmp_path / "manifest.toml"
+    manifest.write_text(
+        """
+[[clips]]
+id = "missing-format"
+tier = "clean_acoustic_strummed"
+source = "GuitarSet"
+split = "validation"
+media_path = "$TABVISION_DATA_ROOT/guitarset/a.wav"
+annotation_path = "$TABVISION_DATA_ROOT/guitarset/a.jams"
+""".strip()
+        + "\n",
+        encoding="utf-8",
+    )
+
+    result = validate_manifest(manifest)
+
+    assert not result.passed
+    assert any(
+        item.code == "MISSING_ANNOTATION_FORMAT" and item.severity == "fail"
+        for item in result.items
+    )
+
+
+def test_synthetic_source_blocked_in_test_split(tmp_path: Path) -> None:
+    """Cross-contamination guard: synthetic-source clip in test split is rejected."""
+    manifest = tmp_path / "manifest.toml"
+    manifest.write_text(
+        """
+[[clips]]
+id = "synth-in-test"
+tier = "clean_electric"
+source = "synthtab/electric"
+split = "test"
+media_path = "$TABVISION_DATA_ROOT/synthtab/x.wav"
+annotation_path = "$TABVISION_DATA_ROOT/synthtab/x.json"
+annotation_format = "synthtab_json"
+""".strip()
+        + "\n",
+        encoding="utf-8",
+    )
+
+    result = validate_manifest(manifest)
+
+    assert not result.passed
+    failures = [
+        item
+        for item in result.items
+        if item.code == "SYNTHETIC_IN_EVAL_SPLIT" and item.severity == "fail"
+    ]
+    assert len(failures) == 1
+    assert failures[0].clip_id == "synth-in-test"
+
+
+def test_synthetic_source_blocked_in_validation_split(tmp_path: Path) -> None:
+    manifest = tmp_path / "manifest.toml"
+    manifest.write_text(
+        """
+[[clips]]
+id = "synth-in-validation"
+tier = "clean_electric"
+source = "DadaGP/render-001"
+split = "validation"
+media_path = "$TABVISION_DATA_ROOT/dadagp/x.wav"
+annotation_path = "$TABVISION_DATA_ROOT/dadagp/x.json"
+annotation_format = "dadagp_json"
+""".strip()
+        + "\n",
+        encoding="utf-8",
+    )
+
+    result = validate_manifest(manifest)
+
+    failures = [
+        item
+        for item in result.items
+        if item.code == "SYNTHETIC_IN_EVAL_SPLIT" and item.severity == "fail"
+    ]
+    assert len(failures) == 1
+    assert failures[0].clip_id == "synth-in-validation"
+
+
+def test_synthetic_source_allowed_in_train_split(tmp_path: Path) -> None:
+    """Synthetic data is permitted as training material (per design plan §4.2)."""
+    manifest = tmp_path / "manifest.toml"
+    manifest.write_text(
+        """
+[[clips]]
+id = "synth-in-train"
+tier = "clean_electric"
+source = "synthtab/electric"
+split = "train"
+media_path = "$TABVISION_DATA_ROOT/synthtab/x.wav"
+annotation_path = "$TABVISION_DATA_ROOT/synthtab/x.json"
+annotation_format = "synthtab_json"
+""".strip()
+        + "\n",
+        encoding="utf-8",
+    )
+
+    result = validate_manifest(manifest)
+
+    assert not any(
+        item.code == "SYNTHETIC_IN_EVAL_SPLIT" for item in result.items
+    )
diff --git a/tabvision/tests/unit/test_manifest_builder.py b/tabvision/tests/unit/test_manifest_builder.py
new file mode 100644
index 0000000..5f011f7
--- /dev/null
+++ b/tabvision/tests/unit/test_manifest_builder.py
@@ -0,0 +1,397 @@
+"""Tests for the composite-eval manifest builder (Phase 0)."""
+
+from __future__ import annotations
+
+import json
+import tomllib
+from pathlib import Path
+
+import pytest
+
+from tabvision.eval.manifest import validate_manifest
+from tabvision.eval.manifest_builder import (
+    ClipEntry,
+    apply_limits,
+    build_manifest,
+    render_toml,
+    scan_guitar_techs,
+    scan_guitarset,
+    summarise_coverage,
+)
+
+
+def _make_guitarset_layout(
+    root: Path,
+    tracks: list[tuple[str, dict | None]],
+) -> None:
+    """Build a fake GuitarSet directory at ``root``.
+
+    Each ``tracks`` tuple is ``(track_id, jams_payload)``. Pass payload
+    ``None`` to write the JAMS but omit the audio file (simulates a
+    half-present clip that the scanner should skip). The audio file is
+    a zero-byte placeholder when payload is not ``None``.
+    """
+    annotation_dir = root / "annotation"
+    audio_dir = root / "audio_mono-mic"
+    annotation_dir.mkdir(parents=True, exist_ok=True)
+    audio_dir.mkdir(parents=True, exist_ok=True)
+    for track_id, payload in tracks:
+        jams_path = annotation_dir / f"{track_id}.jams"
+        jams_path.write_text(json.dumps(payload or {"annotations": []}), encoding="utf-8")
+        if payload is not None:
+            (audio_dir / f"{track_id}_mic.wav").write_bytes(b"")
+
+
+def test_scan_guitarset_classifies_comp_and_solo(tmp_path: Path) -> None:
+    _make_guitarset_layout(
+        tmp_path,
+        [
+            ("05_Rock1-90-C#_comp", {"annotations": []}),
+            ("05_Funk1-114-Ab_solo", {"annotations": []}),
+        ],
+    )
+
+    entries = scan_guitarset(tmp_path)
+
+    by_id = {entry.id: entry for entry in entries}
+    assert by_id["guitarset/05_Rock1-90-C#_comp"].tier == "clean_acoustic_strummed"
+    assert by_id["guitarset/05_Funk1-114-Ab_solo"].tier == "clean_acoustic_single_line"
+    for entry in entries:
+        assert entry.source == "GuitarSet"
+        assert entry.annotation_format == "guitarset_jams"
+
+
+def test_scan_guitarset_assigns_validation_split_for_player_05(tmp_path: Path) -> None:
+    _make_guitarset_layout(
+        tmp_path,
+        [
+            ("00_Rock1-90-C#_comp", {"annotations": []}),
+            ("05_Rock1-90-C#_comp", {"annotations": []}),
+        ],
+    )
+
+    entries = scan_guitarset(tmp_path)
+
+    by_id = {entry.id: entry for entry in entries}
+    assert by_id["guitarset/00_Rock1-90-C#_comp"].split == "train"
+    assert by_id["guitarset/05_Rock1-90-C#_comp"].split == "validation"
+
+
+def test_scan_guitarset_skips_when_audio_missing(tmp_path: Path) -> None:
+    """A JAMS without matching audio is skipped silently."""
+    _make_guitarset_layout(
+        tmp_path,
+        [
+            ("05_OnlyAnnot-90-A_comp", None),  # JAMS present, no audio
+        ],
+    )
+    assert scan_guitarset(tmp_path) == []
+
+
+def test_scan_guitarset_skips_unrecognised_suffix(tmp_path: Path) -> None:
+    """Tracks without _comp or _solo suffix are skipped."""
+    _make_guitarset_layout(
+        tmp_path,
+        [
+            ("05_OddTrackId-90-A_other", {"annotations": []}),
+        ],
+    )
+    assert scan_guitarset(tmp_path) == []
+
+
+def test_scan_guitarset_returns_empty_for_missing_root(tmp_path: Path) -> None:
+    assert scan_guitarset(tmp_path / "nonexistent") == []
+
+
+def test_scan_guitarset_returns_empty_for_partial_layout(tmp_path: Path) -> None:
+    """Root with annotation/ but no audio_mono-mic/ returns empty."""
+    (tmp_path / "annotation").mkdir()
+    assert scan_guitarset(tmp_path) == []
+
+
+def test_scan_guitar_techs_returns_empty_stub(tmp_path: Path) -> None:
+    """Guitar-TECHS scanner is a stub until the dataset is acquired."""
+    assert scan_guitar_techs(tmp_path) == []
+
+
+def _entry(clip_id: str, tier: str = "clean_acoustic_strummed") -> ClipEntry:
+    return ClipEntry(
+        id=clip_id,
+        tier=tier,
+        source="GuitarSet",
+        split="validation",
+        media_path=f"/data/{clip_id}.wav",
+        annotation_path=f"/data/{clip_id}.jams",
+        annotation_format="guitarset_jams",
+    )
+
+
+def test_apply_limits_caps_per_tier_deterministically() -> None:
+    entries = [
+        _entry("a", "clean_acoustic_strummed"),
+        _entry("b", "clean_acoustic_strummed"),
+        _entry("c", "clean_acoustic_strummed"),
+        _entry("d", "clean_acoustic_single_line"),
+        _entry("e", "clean_acoustic_single_line"),
+    ]
+
+    capped = apply_limits(entries, max_clips_per_tier=2)
+
+    # 2 per tier, sorted by id within each tier
+    ids = [entry.id for entry in capped]
+    assert ids == ["a", "b", "d", "e"]
+
+
+def test_apply_limits_applies_total_after_per_tier() -> None:
+    entries = [
+        _entry("a", "clean_acoustic_strummed"),
+        _entry("b", "clean_acoustic_strummed"),
+        _entry("c", "clean_acoustic_single_line"),
+    ]
+
+    capped = apply_limits(entries, max_clips_per_tier=2, total_limit=2)
+
+    assert [entry.id for entry in capped] == ["a", "b"]
+
+
+def test_apply_limits_with_no_caps_preserves_all_sorted() -> None:
+    entries = [_entry("b"), _entry("a"), _entry("c")]
+    out = apply_limits(entries)
+    assert [entry.id for entry in out] == ["a", "b", "c"]
+
+
+def test_render_toml_round_trips_via_tomllib() -> None:
+    entries = [
+        _entry("a", "clean_acoustic_strummed"),
+        _entry("b", "clean_acoustic_single_line"),
+    ]
+    text = render_toml(entries)
+    parsed = tomllib.loads(text)
+    assert len(parsed["clips"]) == 2
+    by_id = {clip["id"]: clip for clip in parsed["clips"]}
+    assert by_id["a"]["tier"] == "clean_acoustic_strummed"
+    assert by_id["a"]["annotation_format"] == "guitarset_jams"
+
+
+def test_render_toml_is_byte_stable() -> None:
+    """Same entries → same bytes, regardless of input order."""
+    entries_in_order_a = [_entry("z"), _entry("a"), _entry("m")]
+    entries_in_order_b = [_entry("a"), _entry("m"), _entry("z")]
+    assert render_toml(entries_in_order_a) == render_toml(entries_in_order_b)
+
+
+def test_render_toml_emits_header_when_provided() -> None:
+    text = render_toml([_entry("a")], header_comment="hello world")
+    assert text.startswith("# hello world\n")
+
+
+def test_render_toml_rewrites_paths_under_data_root(tmp_path: Path) -> None:
+    """media/annotation paths under data_root become $TABVISION_DATA_ROOT/<rest>."""
+    data_root = tmp_path / "datasets"
+    data_root.mkdir()
+    entry = ClipEntry(
+        id="clip-x",
+        tier="clean_acoustic_strummed",
+        source="GuitarSet",
+        split="validation",
+        media_path=str((data_root / "guitarset" / "audio.wav").resolve()),
+        annotation_path=str((data_root / "guitarset" / "ann.jams").resolve()),
+        annotation_format="guitarset_jams",
+    )
+    text = render_toml([entry], data_root=data_root)
+    assert '"$TABVISION_DATA_ROOT/guitarset/audio.wav"' in text
+    assert '"$TABVISION_DATA_ROOT/guitarset/ann.jams"' in text
+    # No absolute path under data_root should remain in the output.
+    assert "/datasets/" not in text  # absolute prefix is gone
+
+
+def test_render_toml_leaves_paths_outside_data_root_alone(tmp_path: Path) -> None:
+    data_root = tmp_path / "datasets"
+    data_root.mkdir()
+    other = tmp_path / "elsewhere" / "x.wav"
+    other.parent.mkdir(parents=True)
+    other.write_bytes(b"")
+    entry = ClipEntry(
+        id="clip-x",
+        tier="clean_acoustic_strummed",
+        source="GuitarSet",
+        split="validation",
+        media_path=str(other.resolve()),
+        annotation_path=str(other.resolve()),
+        annotation_format="guitarset_jams",
+    )
+    text = render_toml([entry], data_root=data_root)
+    assert "$TABVISION_DATA_ROOT" not in text
+    assert str(other.resolve()) in text
+
+
+def test_render_toml_with_no_data_root_is_unchanged(tmp_path: Path) -> None:
+    """Backward-compat: omitting data_root keeps current absolute-path output."""
+    entry = ClipEntry(
+        id="clip-x",
+        tier="clean_acoustic_strummed",
+        source="GuitarSet",
+        split="validation",
+        media_path="/some/abs/path.wav",
+        annotation_path="/some/abs/path.jams",
+        annotation_format="guitarset_jams",
+    )
+    text = render_toml([entry], data_root=None)
+    assert "/some/abs/path.wav" in text
+    assert "$TABVISION_DATA_ROOT" not in text
+
+
+def test_summarise_coverage_reports_per_tier_and_per_split() -> None:
+    entries = [
+        _entry("a", "clean_acoustic_strummed"),
+        _entry("b", "clean_acoustic_strummed"),
+        _entry("c", "clean_acoustic_single_line"),
+    ]
+    summary = summarise_coverage(entries)
+    assert "Total clips: 3" in summary
+    assert "clean_acoustic_strummed: 2 clips" in summary
+    assert "clean_acoustic_single_line: 1 clips" in summary
+
+
+def test_build_manifest_skips_missing_roots(tmp_path: Path) -> None:
+    """Missing GuitarSet root → empty result, no exception."""
+    entries = build_manifest(guitarset_root=tmp_path / "nope")
+    assert entries == []
+
+
+def test_build_manifest_splits_filter(tmp_path: Path) -> None:
+    """``splits=('validation',)`` should keep only player-05 clips."""
+    _make_guitarset_layout(
+        tmp_path / "guitarset",
+        [
+            ("00_Rock1-90-C#_comp", {"annotations": []}),  # train
+            ("05_Funk1-114-Ab_solo", {"annotations": []}),  # validation
+        ],
+    )
+
+    train_only = build_manifest(
+        guitarset_root=tmp_path / "guitarset",
+        splits=("train",),
+    )
+    validation_only = build_manifest(
+        guitarset_root=tmp_path / "guitarset",
+        splits=("validation",),
+    )
+    both = build_manifest(guitarset_root=tmp_path / "guitarset")
+
+    assert {entry.id for entry in train_only} == {"guitarset/00_Rock1-90-C#_comp"}
+    assert {entry.id for entry in validation_only} == {
+        "guitarset/05_Funk1-114-Ab_solo"
+    }
+    assert len(both) == 2
+
+
+def test_build_manifest_emits_synthetic_train_clip_ok(tmp_path: Path) -> None:
+    """Training-split synthetic clips should pass the in-builder guard."""
+    # Construct the synthetic ClipEntry directly rather than going through a scanner.
+    entries = [
+        ClipEntry(
+            id="synthetic-train-01",
+            tier="distorted_electric",
+            source="synthtab/electric",
+            split="train",
+            media_path="/data/x.wav",
+            annotation_path="/data/x.json",
+            annotation_format="synthtab_json",
+        ),
+    ]
+    # The guard should be a no-op for train split; verify via apply_limits roundtrip.
+    out = apply_limits(entries, max_clips_per_tier=1)
+    assert len(out) == 1
+
+
+def test_main_writes_manifest_and_passes_validation(
+    tmp_path: Path, capsys: pytest.CaptureFixture[str]
+) -> None:
+    """End-to-end: build_composite_manifest builds → manifest validates."""
+    _make_guitarset_layout(
+        tmp_path / "guitarset",
+        [
+            (
+                "05_Rock1-90-C#_comp",
+                {
+                    "annotations": [
+                        {
+                            "namespace": "note_midi",
+                            "annotation_metadata": {"data_source": "0"},
+                            "data": [
+                                {"time": 0.0, "duration": 0.5, "value": 40},
+                            ],
+                        }
+                    ]
+                },
+            ),
+            (
+                "05_Funk1-114-Ab_solo",
+                {
+                    "annotations": [
+                        {
+                            "namespace": "note_midi",
+                            "annotation_metadata": {"data_source": "0"},
+                            "data": [
+                                {"time": 1.0, "duration": 0.5, "value": 45},
+                            ],
+                        }
+                    ]
+                },
+            ),
+        ],
+    )
+    output = tmp_path / "composite.toml"
+
+    from tabvision.eval.manifest_builder import main
+
+    rc = main(
+        [
+            "--guitarset",
+            str(tmp_path / "guitarset"),
+            "--output",
+            str(output),
+        ]
+    )
+
+    assert rc == 0
+    assert output.is_file()
+    captured = capsys.readouterr()
+    assert "Wrote 2 clips" in captured.out
+    assert "Manifest validation passed." in captured.out
+
+    # The emitted manifest should itself validate cleanly.
+    validation = validate_manifest(output)
+    assert validation.passed
+
+
+def test_main_requires_at_least_one_root(tmp_path: Path) -> None:
+    """Without --guitarset / --guitar-techs, the CLI exits with a usage error."""
+    from tabvision.eval.manifest_builder import main
+
+    with pytest.raises(SystemExit) as excinfo:
+        main(["--output", str(tmp_path / "x.toml")])
+    assert excinfo.value.code == 2
+
+
+def test_main_returns_1_when_no_clips_discovered(
+    tmp_path: Path, capsys: pytest.CaptureFixture[str]
+) -> None:
+    """Specifying a path with no matching data → rc=1, no output file."""
+    output = tmp_path / "composite.toml"
+    from tabvision.eval.manifest_builder import main
+
+    rc = main(
+        [
+            "--guitarset",
+            str(tmp_path / "empty"),
+            "--output",
+            str(output),
+        ]
+    )
+
+    assert rc == 1
+    assert not output.exists()
+    captured = capsys.readouterr()
+    assert "No clips discovered" in captured.out
diff --git a/tabvision/tests/unit/test_parser_guitar_techs_midi.py b/tabvision/tests/unit/test_parser_guitar_techs_midi.py
new file mode 100644
index 0000000..34f109c
--- /dev/null
+++ b/tabvision/tests/unit/test_parser_guitar_techs_midi.py
@@ -0,0 +1,161 @@
+"""Tests for the Guitar-TECHS MIDI parser (Phase 0)."""
+
+from __future__ import annotations
+
+from pathlib import Path
+
+import pytest
+
+pretty_midi = pytest.importorskip("pretty_midi")
+
+from tabvision.eval.parsers import get_parser  # noqa: E402
+from tabvision.eval.parsers.guitar_techs_midi import (  # noqa: E402
+    DEFAULT_TRACK_TO_STRING,
+    parse,
+)
+from tabvision.types import GuitarConfig  # noqa: E402
+
+
+def _make_midi(tmp_path: Path, *tracks_of_notes: list[tuple[int, float, float]]) -> Path:
+    """Build a multi-track MIDI fixture.
+
+    Each positional arg is a list of ``(pitch, start, end)`` tuples for
+    one track. Pass an empty list to create an empty track.
+    """
+    midi = pretty_midi.PrettyMIDI()
+    for notes in tracks_of_notes:
+        instrument = pretty_midi.Instrument(program=24)  # acoustic guitar
+        for pitch, start, end in notes:
+            instrument.notes.append(
+                pretty_midi.Note(velocity=80, pitch=pitch, start=start, end=end)
+            )
+        midi.instruments.append(instrument)
+    midi_path = tmp_path / "clip.mid"
+    midi.write(str(midi_path))
+    return midi_path
+
+
+def test_track_zero_maps_to_low_e_string(tmp_path: Path) -> None:
+    """Track 0 should carry low-E notes (string_idx 0, MIDI 40 → fret 0)."""
+    midi_path = _make_midi(
+        tmp_path,
+        [(40, 0.0, 0.5)],
+        [],
+        [],
+        [],
+        [],
+        [],
+    )
+
+    events = parse(midi_path)
+
+    assert len(events) == 1
+    assert events[0].string_idx == 0
+    assert events[0].fret == 0
+    assert events[0].pitch_midi == 40
+
+
+def test_per_string_pitch_to_fret_derivation(tmp_path: Path) -> None:
+    """Pitch minus open-string MIDI gives the fret for each string."""
+    # Standard tuning MIDI: (40, 45, 50, 55, 59, 64) — low E .. high E.
+    midi_path = _make_midi(
+        tmp_path,
+        [(40, 0.00, 0.10)],  # track 0 (E2)  → fret 0
+        [(50, 0.10, 0.20)],  # track 1 (A2 + 5 semitones) → fret 5
+        [(55, 0.20, 0.30)],  # track 2 (D3 + 5 semitones) → fret 5
+        [(62, 0.30, 0.40)],  # track 3 (G3 + 7 semitones) → fret 7
+        [(64, 0.40, 0.50)],  # track 4 (B3 + 5 semitones) → fret 5
+        [(76, 0.50, 0.60)],  # track 5 (high E + 12) → fret 12
+    )
+
+    events = parse(midi_path)
+
+    by_string = {ev.string_idx: ev.fret for ev in events}
+    assert by_string == {0: 0, 1: 5, 2: 5, 3: 7, 4: 5, 5: 12}
+
+
+def test_drops_notes_outside_fret_range(tmp_path: Path) -> None:
+    """Notes that imply fret < 0 or > max_fret are skipped silently."""
+    # MIDI 35 < open low-E (40) → fret -5, drop.
+    # MIDI 90 > 40+24 → fret 50, drop.
+    midi_path = _make_midi(
+        tmp_path,
+        [(35, 0.0, 0.1), (90, 0.5, 0.6)],
+        [], [], [], [], [],
+    )
+
+    assert parse(midi_path) == []
+
+
+def test_events_sorted_by_onset(tmp_path: Path) -> None:
+    """Output is sorted by ``(onset_s, string_idx, fret)`` regardless of input order."""
+    midi_path = _make_midi(
+        tmp_path,
+        [(40, 2.00, 2.10), (40, 0.00, 0.10)],
+        [], [], [], [], [],
+    )
+
+    events = parse(midi_path)
+    assert [ev.onset_s for ev in events] == [0.0, 2.0]
+
+
+def test_capo_filters_below_capo_fret(tmp_path: Path) -> None:
+    """``cfg.capo`` raises the lower bound for accepted frets."""
+    midi_path = _make_midi(
+        tmp_path,
+        [(40, 0.0, 0.1), (42, 0.1, 0.2)],
+        [], [], [], [], [],
+    )
+
+    cfg = GuitarConfig(capo=3)
+    events = parse(midi_path, cfg)
+    # MIDI 40 → fret 0 < capo 3, dropped. MIDI 42 → fret 2 < 3, dropped.
+    assert events == []
+
+
+def test_extra_tracks_beyond_six_are_ignored(tmp_path: Path) -> None:
+    """If a MIDI has > 6 tracks, only the first 6 are read."""
+    midi_path = _make_midi(
+        tmp_path,
+        [(40, 0.0, 0.1)],
+        [], [], [], [], [],
+        [(40, 0.0, 0.1)],  # 7th track — outside the mapping
+    )
+
+    events = parse(midi_path)
+    assert len(events) == 1
+    assert events[0].string_idx == 0
+
+
+def test_custom_track_to_string_mapping(tmp_path: Path) -> None:
+    """A reversed mapping should put track 0's notes on high E."""
+    midi_path = _make_midi(
+        tmp_path,
+        [(64, 0.0, 0.1)],
+        [], [], [], [], [],
+    )
+
+    reversed_map: tuple[int, ...] = (5, 4, 3, 2, 1, 0)
+    events = parse(midi_path, track_to_string=reversed_map)
+
+    assert len(events) == 1
+    assert events[0].string_idx == 5
+    assert events[0].fret == 0
+
+
+def test_default_mapping_is_identity() -> None:
+    assert DEFAULT_TRACK_TO_STRING == (0, 1, 2, 3, 4, 5)
+
+
+def test_dispatch_via_registry(tmp_path: Path) -> None:
+    """End-to-end: parser is reachable via the composite-eval dispatch path."""
+    midi_path = _make_midi(
+        tmp_path,
+        [(40, 0.0, 0.1)],
+        [], [], [], [], [],
+    )
+    parser = get_parser("guitar_techs_midi")
+    assert parser is parse
+
+    events = parser(midi_path, None)
+    assert len(events) == 1
diff --git a/tabvision/tests/unit/test_parsers_registry.py b/tabvision/tests/unit/test_parsers_registry.py
new file mode 100644
index 0000000..a661f91
--- /dev/null
+++ b/tabvision/tests/unit/test_parsers_registry.py
@@ -0,0 +1,85 @@
+"""Tests for the annotation-parser registry (Phase 0)."""
+
+from __future__ import annotations
+
+import json
+from pathlib import Path
+
+import pytest
+
+from tabvision.eval.parsers import (
+    clear_parsers,
+    get_parser,
+    list_parsers,
+    register_parser,
+)
+from tabvision.eval.parsers.registry import _PARSERS as _GLOBAL_PARSERS
+
+
+@pytest.fixture
+def isolated_registry():
+    """Save + restore the registry around tests that mutate it."""
+    saved = dict(_GLOBAL_PARSERS)
+    yield
+    clear_parsers()
+    _GLOBAL_PARSERS.update(saved)
+
+
+def test_builtin_parsers_registered_on_import():
+    """The package import should auto-register at least GuitarSet JAMS."""
+    parsers = list_parsers()
+    assert "guitarset_jams" in parsers
+
+
+def test_get_parser_returns_callable():
+    parser = get_parser("guitarset_jams")
+    assert callable(parser)
+
+
+def test_get_parser_raises_keyerror_with_known_formats_listed():
+    with pytest.raises(KeyError) as excinfo:
+        get_parser("nonexistent_format")
+    assert "guitarset_jams" in str(excinfo.value)
+
+
+def test_register_parser_rejects_duplicate(isolated_registry):
+    def fake_parser(path, cfg=None):
+        return []
+
+    with pytest.raises(ValueError, match="already registered"):
+        register_parser("guitarset_jams", fake_parser)
+
+
+def test_register_then_get_roundtrip(isolated_registry):
+    def fake_parser(path, cfg=None):
+        return []
+
+    register_parser("fake_format", fake_parser)
+    assert get_parser("fake_format") is fake_parser
+    assert "fake_format" in list_parsers()
+
+
+def test_dispatch_via_registry_parses_jams(tmp_path: Path):
+    """End-to-end: composite-eval dispatch path runs through the registry."""
+    payload = {
+        "annotations": [
+            {
+                "namespace": "note_midi",
+                "annotation_metadata": {"data_source": "0"},
+                "data": [
+                    {"time": 0.10, "duration": 0.25, "value": 42},
+                ],
+            }
+        ]
+    }
+    jams_path = tmp_path / "clip.jams"
+    jams_path.write_text(json.dumps(payload), encoding="utf-8")
+
+    parser = get_parser("guitarset_jams")
+    events = parser(jams_path, None)
+
+    assert len(events) == 1
+    assert events[0].string_idx == 0
+    assert events[0].pitch_midi == 42
+    # Low E = MIDI 40, so MIDI 42 on string 0 → fret 2.
+    assert events[0].fret == 2