13 changes: 8 additions & 5 deletions LICENSES.md
@@ -57,11 +57,14 @@ Phase 0 (this document) produces the initial map; Phase 9 verifies.

| Dataset | Phase | License | Status | Notes |
|---|---|---|---|---|
| GuitarSet | 1.5 / 7 | CC-BY-4.0 | ✅ | https://guitarset.weebly.com — JAMS annotations, hexaphonic. Already used in v0 finetune work. Re-distribution requires attribution; not committed to repo. |
| IDMT-SMT-Guitar | 1.5 / 7 | research-use, registration | ⚠️ | Training-only; not redistributed in our repo. Verify scope of "research use" for portfolio context. |
| EGDB | 1.5 / 7 | TBD | ⚠️ | https://github.com/ss12f32v/GuitarTranscription — multi-amp distorted electric. Verify before relying on it for distorted-electric tier eval. |
| DadaGP | 7 | TBD | ⚠️ | https://github.com/dada-bots/dadaGP — GuitarPro tabs as synthetic-data substrate. |
| User clips (existing 11/20 self-recorded) | 1.5 (bonus) | self-owned | ✅ | iPhone OOD bonus tier per design doc §6. Owned by Patrick. |
| GuitarSet | 1.5 / 7 / **Phase 0 (this PR)** | CC-BY-4.0 | ✅ | https://guitarset.weebly.com — JAMS annotations, hexaphonic. Already used in v0 finetune work. Re-distribution requires attribution; not committed to repo. **Used as the only data source for the 2026-05-13 composite baseline** (player 05 held-out validation; 60 tracks; 8 715 gold notes). |
| Guitar-TECHS | Phase 0 (planned) / 1.5 / 7 | CC-BY-4.0 (paper §4 + Zenodo) | ⚠️ | arXiv:2501.03720 — 5h12m multi-mic + DI; per-string MIDI annotations. Acquisition planned per Phase 0 impl plan §3.2; on-disk scanner stub in `tabvision/tabvision/eval/manifest_builder.py:scan_guitar_techs`. Required attribution must appear in the public README. |
| IDMT-SMT-Guitar | 1.5 / 7 | research-use, registration | ⚠️ | Training-only; not redistributed in our repo. Verified 2026-05-13 research pass; superseded by Guitar-TECHS for v1 acceptance — kept for potential future training augmentation. |
| EGDB | 1.5 / 7 | **none on repo — author email pending** | ⚠️ | https://ss12f32v.github.io/Guitar-Transcription/ — 240 tracks, ~12h with multi-amp electric variants, GuitarPro tabs + aligned MIDI. **Portfolio-use written permission required** before any acquisition (LICENSE file is null per 2026-05-13 verification). Email `f08946011@ntu.edu.tw`; template in `docs/plans/2026-05-12-tab-f1-to-spec-design.md` §8.2. |
| ~~GOAT~~ | DROPPED | request-only, research-only | ❌ | arXiv:2509.22655. Verified 2026-05-13: distribution gated per-use ("for research purposes only, upon request") due to copyrighted cover-song content. Not portfolio-compatible per SPEC §1.5; removed from the eval composite. |
| ~~SynthTab~~ | DROPPED from default pipeline | dataset CC-BY-NC-4.0 (code CC-BY-4.0) | ❌ | github.com/yongyizang/SynthTab. Dataset NC clause taints derived weights (SynthTab paper treats trained models as derivative work). Not portfolio-compatible per SPEC §1.5; removed from the planned pretrain pipeline 2026-05-13. The repo code, which is permissively licensed (Apache/CC-BY), remains usable for our own renderers if needed. |
| DadaGP | research/dev only — **not in default pipeline** | access-by-email; underlying GP tabs derive from copyrighted songs | ⚠️ | https://github.com/dada-bots/dadaGP. Per 2026-05-13 design plan §4.2, acceptable as internal training augmentation only. Synthetic-source clips are blocked from non-train manifest splits by `tabvision.eval.manifest.validate_manifest` (the `SYNTHETIC_IN_EVAL_SPLIT` guard). |
| ~~User clips (the 20 self-recorded set)~~ | BANNED | self-owned | ⛔ | Banned from all roles per 2026-05-13 design plan D10 — not as accuracy gate, dev set, or label source. Replaced by the public-corpus composite. |
| Roboflow `b101/guitar-3` | 3 (training) | **CC BY 4.0** | ✅ | **Verified 2026-05-05.** Source: https://universe.roboflow.com/b101/guitar-3. Forked into Patrick's workspace as `patricks-workspace-vozcg/guitar-3-4efcd` v2; YOLOv8-OBB export downloaded (926 images, 710/144/72 split, classes: fret / neck / nut). License declared in the dataset's README.dataset.txt: "License: CC BY 4.0". Attribution: "guitar 3" by b101 on Roboflow Universe (https://universe.roboflow.com/b101/guitar-3), CC BY 4.0; export downloaded May 5, 2026 via the Roboflow SDK. **Required attribution must appear in the public README and any blog post.** |

## Library dependencies (default pipeline)
56 changes: 56 additions & 0 deletions docs/DECISIONS.md
@@ -16,6 +16,62 @@ Format:

---

## 2026-05-13 — Tab F1 v1 acceptance: per-tier targets + public-corpus composite

**Phase:** Accuracy work (cross-cuts Phases 1, 2, 3, 5, 7, 8 of the SPEC)
**Decision tree:** Design plan adoption + SPEC §1.4 amendment proposal
**Branch taken:** Replace the aggregate 0.88 Tab F1 acceptance gate with
a per-tier table; drop SynthTab (CC-BY-NC) and GOAT (request-only) from
the default pipeline; rely on GuitarSet + Guitar-TECHS + EGDB
(license-pending) for the public-corpus composite eval.

**Evidence:**
- Strategy / decision record: `docs/plans/2026-05-12-tab-f1-to-spec-design.md`
- Phase 0 implementation plan: `docs/plans/2026-05-13-tab-f1-phase-0-implementation.md`
- SPEC amendment block: `SPEC.md` §1.4.1 (per-tier table + composite test set)
- First baseline artifact (2 of 4 tiers covered): `docs/EVAL_REPORTS/composite_baseline_2026-05-13.md`
- Companion error decomposition: `docs/EVAL_REPORTS/tab_f1_error_decomposition_2026-05-13.md`
- Implementation branch with the eval harness: `impl/tab-f1-phase-0`

**Reasoning:** The 2026-05-08 GuitarSet validation showed aggregate Tab
F1 = 0.6104 with comp tracks at 0.670 and solo tracks at 0.508. The
aggregate target hid the dominant failure axis (string/fret assignment
on single-line passages), and the SPEC §1.4 numbers (0.94 / 0.86 / 0.90
/ 0.82) baked in implicit per-tier expectations that the project hadn't
explicitly negotiated. The 2026-05-13 user conversation locked in
relaxed v1 targets (0.85 / 0.90 / 0.87 / 0.80), kept the original SPEC
numbers as the v1.1 / portfolio stretch reference, and committed to
audio-only fusion priors + cheap pitch post-processing as the leverage
path (no SynthTab pretrain → no NC license taint on shipped weights).

**Per-tier acceptance gate (v1):**

| Tier | v1 target | 2026-05-13 baseline (mean / lower 95% CI) |
|---|---:|---:|
| Clean acoustic single-line | 0.85 | 0.5076 / 0.4448 (fail) |
| Clean acoustic strummed | 0.90 | 0.6708 / 0.6015 (fail) |
| Clean electric | 0.87 | missing — pending Guitar-TECHS |
| Distorted electric | 0.80 | missing — pending EGDB |

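Both covered tiers fail. Mean Tab F1 sits ~34 pp (single-line) and
~23 pp (strummed) below target; the gated lower-95 bound sits ~41 pp and
~30 pp below. Per the error decomposition, `wrong_position_same_pitch`
accounts for 77% of single-line loss and 50% of strummed loss, so
Phases 1-7 of the design plan target this bucket.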

**Decisions inventoried in the design plan (D1–D11):**

- D1 Per-tier replaces aggregate. D2 Targets table. D3 Composite eval.
D4 No SynthTab. D5 Video qualitative-only. D6 Free-tier compute first
(Local > Colab > Kaggle > Lightning > Modal). D7 1-2 month cadence.
D8 No stretch (bends/slides) in v1. D9 D2 numbers on top-1 only.
D10 Personal clips fully banned. D11 This is a SPEC §1.4 amendment,
not a SPEC-achievement plan.

**Open Phase 0 user actions:** Lightning Studios / Kaggle / Colab / W&B
account verification; EGDB author email; Guitar-TECHS Zenodo download.

---

## 2026-05-05 — Project name kept as `tabvision` (not `tabify`)

**Phase:** 0
41 changes: 41 additions & 0 deletions docs/EVAL_REPORTS/composite_baseline_2026-05-13.md
@@ -0,0 +1,41 @@
# Composite per-tier baseline

## Coverage

**2 of 4 tiers measured.** Clean acoustic single-line + strummed covered
via the GuitarSet validation split (held-out player 05, 60 tracks,
8 715 gold notes). **Clean electric and distorted electric tiers
pending Guitar-TECHS / EGDB acquisition** per the strategy doc §3.1 and
Phase 0 implementation plan §3.2 — see the "missing" rows below.

This is the first artifact of `impl/tab-f1-phase-0`. Companion
6-bucket error decomposition: [`tab_f1_error_decomposition_2026-05-13.md`](tab_f1_error_decomposition_2026-05-13.md).

## Per-tier results

| Tier | Clips | Gold notes | Tab F1 mean | Tab F1 lower-95 | Target | Status | Onset F1 | Pitch F1 |
|---|---:|---:|---:|---:|---:|---|---:|---:|
| clean_acoustic_single_line | 30 | 2179 | 0.5076 | 0.4448 | 0.85 | fail | 0.9375 | 0.9304 |
| clean_acoustic_strummed | 30 | 6536 | 0.6708 | 0.6015 | 0.90 | fail | 0.9229 | 0.9005 |
| clean_electric | 0 | 0 | — | — | 0.87 | missing | — | — |
| distorted_electric | 0 | 0 | — | — | 0.80 | missing | — | — |

## Per-source breakdown

| Tier | Source | Clips | Tab F1 mean | Onset F1 mean | Pitch F1 mean |
|---|---|---:|---:|---:|---:|
| clean_acoustic_single_line | GuitarSet | 30 | 0.5076 | 0.9375 | 0.9304 |
| clean_acoustic_strummed | GuitarSet | 30 | 0.6708 | 0.9229 | 0.9005 |

## Methodology

- Manifest: `data/eval/composite.toml`
- Audio backend: `highres`
- Position prior: `guitarset-v1`
- Eval-harness SHA: `9a7e957` (the commit that landed both this baseline
artifact and the chord-cluster matcher fix in
`tabvision.eval.error_decomposition.decompose_errors`)
- Onset tolerance: 50 ms
- Bootstrap: N=10,000, seed=42, 95% percentile interval
- Acceptance gate: `lower_95_CI >= target` per design plan §5
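
The bootstrap and gate settings above can be illustrated with a minimal sketch. It assumes the interval is a percentile bootstrap over per-clip Tab F1 scores (the resampling unit is an assumption, as are the helper names `bootstrap_mean_ci` and `passes_gate` and the toy scores); it is not the harness's actual implementation.

```python
import random
import statistics


def bootstrap_mean_ci(scores, n_resamples=10_000, seed=42, level=0.95):
    """Percentile bootstrap interval for the mean of per-clip scores.

    Hypothetical helper: resamples clips with replacement, which is an
    assumption about how the real harness builds its interval.
    """
    rng = random.Random(seed)  # fixed seed keeps the interval reproducible
    n = len(scores)
    means = sorted(
        statistics.fmean(rng.choices(scores, k=n)) for _ in range(n_resamples)
    )
    alpha = (1.0 - level) / 2.0
    lo_idx = int(alpha * n_resamples)
    hi_idx = min(n_resamples - 1, int((1.0 - alpha) * n_resamples))
    return statistics.fmean(scores), means[lo_idx], means[hi_idx]


def passes_gate(lower_95, target):
    """Gate from the methodology list: lower_95_CI >= target."""
    return lower_95 >= target


# Illustrative per-clip scores only; these are not project data.
clip_f1 = [0.42, 0.55, 0.61, 0.48, 0.50, 0.39, 0.58, 0.53]
mean, lower, upper = bootstrap_mean_ci(clip_f1)
print(f"mean={mean:.4f} lower95={lower:.4f} upper95={upper:.4f} "
      f"pass={passes_gate(lower, 0.85)}")
```

Gating on the lower bound rather than the mean means a tier only passes when the target is cleared with margin for clip-to-clip variance, which matters at 30 clips per tier.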

45 changes: 45 additions & 0 deletions docs/EVAL_REPORTS/tab_f1_error_decomposition_2026-05-13.md
@@ -0,0 +1,45 @@
# Tab F1 error decomposition

## Diagnostic summary

**Dominant failure bucket on every covered tier is
`wrong_position_same_pitch`** — the audio detected the right pitch
within onset tolerance but the system placed it on the wrong
(string, fret).
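
As a rough illustration of what this bucket means, the sketch below flags a predicted note that sounds the same pitch as the gold note, within the onset tolerance, but sits on a different (string, fret). It assumes standard tuning, strings numbered 6 (low E) to 1 (high E), and a 50 ms tolerance; the function names and tuple layout are hypothetical and are not taken from `decompose_errors`.

```python
# Open-string MIDI pitches in standard tuning (E2 A2 D3 G3 B3 E4).
# String numbering 6 (low E) .. 1 (high E) is an assumption for this sketch.
OPEN_STRING_MIDI = {6: 40, 5: 45, 4: 50, 3: 55, 2: 59, 1: 64}


def midi_pitch(string, fret):
    """MIDI pitch sounded by a fretted position in standard tuning."""
    return OPEN_STRING_MIDI[string] + fret


def is_wrong_position_same_pitch(gold, pred, onset_tol_s=0.050):
    """gold and pred are hypothetical (onset_seconds, string, fret) tuples."""
    g_onset, g_string, g_fret = gold
    p_onset, p_string, p_fret = pred
    within_tol = abs(g_onset - p_onset) <= onset_tol_s
    same_pitch = midi_pitch(g_string, g_fret) == midi_pitch(p_string, p_fret)
    same_position = (g_string, g_fret) == (p_string, p_fret)
    return within_tol and same_pitch and not same_position


# E4 played at string 2, fret 5 but tabbed as the open string 1.
gold_note = (1.00, 2, 5)
pred_note = (1.02, 1, 0)
print(is_wrong_position_same_pitch(gold_note, pred_note))
```

Because most pitches on a guitar can be played at several positions, a pitch-correct transcription can still lose Tab F1 this way.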

| Tier | Loss share — wrong_position_same_pitch |
|---|---:|
| clean_acoustic_single_line | **77.5%** (910 / 1174 loss events) |
| clean_acoustic_strummed | **49.7%** (1548 / 3112 loss events) |
| Aggregate | **57.3%** (2458 / 4286 loss events) |

This matches the strategy doc §2 diagnostic exactly. The audio side
is at SPEC (Pitch F1 ≥ 0.90 on both covered tiers); the gap to D2
per-tier targets is almost entirely string/fret assignment, and it
gets worse on single-line passages where chord-cluster constraints
can't help the fusion.
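
The loss shares quoted in this report can be recomputed from the per-tier bucket counts it lists. The sketch below is a small check rather than harness code; `loss_shares` is a hypothetical helper, and it excludes `correct` from the denominator, matching the "loss events" totals in the summary table.

```python
LOSS_BUCKETS = (
    "wrong_position_same_pitch",
    "pitch_off",
    "timing_only",
    "missed_onset",
    "extra_detection",
)

# Bucket counts copied from the per-tier breakdown in this report.
counts = {
    "clean_acoustic_single_line": {
        "correct": 1125, "wrong_position_same_pitch": 910, "pitch_off": 19,
        "timing_only": 17, "missed_onset": 108, "extra_detection": 120,
    },
    "clean_acoustic_strummed": {
        "correct": 3861, "wrong_position_same_pitch": 1548, "pitch_off": 486,
        "timing_only": 77, "missed_onset": 564, "extra_detection": 437,
    },
}


def loss_shares(bucket_counts):
    """Share of each loss bucket among all loss events ('correct' excluded)."""
    total_loss = sum(bucket_counts[b] for b in LOSS_BUCKETS)
    return {b: bucket_counts[b] / total_loss for b in LOSS_BUCKETS}


# Aggregate row: sum each bucket across tiers before taking shares.
aggregate = {b: sum(c[b] for c in counts.values()) for b in ("correct",) + LOSS_BUCKETS}

for name, c in {**counts, "aggregate": aggregate}.items():
    shares = loss_shares(c)
    top = max(shares, key=shares.get)
    print(f"{name}: {top} = {shares[top]:.1%}")
# clean_acoustic_single_line: wrong_position_same_pitch = 77.5%
# clean_acoustic_strummed: wrong_position_same_pitch = 49.7%
# aggregate: wrong_position_same_pitch = 57.3%
```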

Companion baseline report: [`composite_baseline_2026-05-13.md`](composite_baseline_2026-05-13.md).

This report is a six-bucket port of the Apr-28 7-bucket harness; the
seventh bucket (`muted_undetectable`) is deferred until the §8
`TabEvent` contract carries a muted/X flag.

## Aggregate (all tiers)

| Bucket | Count | Share of loss |
|---|---:|---:|
| correct | 4986 | — |
| wrong_position_same_pitch | 2458 | 57.3% |
| pitch_off | 505 | 11.8% |
| timing_only | 94 | 2.2% |
| missed_onset | 672 | 15.7% |
| extra_detection | 557 | 13.0% |

## Per-tier breakdown

| Tier | correct | wrong_position_same_pitch | pitch_off | timing_only | missed_onset | extra_detection |
|---|---:|---:|---:|---:|---:|---:|
| clean_acoustic_single_line | 1125 | 910 | 19 | 17 | 108 | 120 |
| clean_acoustic_strummed | 3861 | 1548 | 486 | 77 | 564 | 437 |

2 changes: 1 addition & 1 deletion docs/plans/2026-05-12-tab-f1-to-spec-design.md
@@ -213,7 +213,7 @@ phase's evidence justifies starting it.
the composite eval. Acquire Guitar-TECHS; send EGDB email; verify free
compute accounts. **No production code changes.** Acceptance: per-tier
baseline numbers exist for ≥ 3 of 4 tiers with bootstrap CIs;
per-tier 7-bucket error breakdown exists. [Companion:
per-tier six-bucket error breakdown exists. [Companion:
`2026-05-13-tab-f1-phase-0-implementation.md`.]
- **Phase 1 — Pitch ceiling lift (cheap moves).** Voicing/silence gate
+ peak-picking + Basic Pitch pitch-only ensemble. Acceptance: Pitch
12 changes: 7 additions & 5 deletions docs/plans/2026-05-13-tab-f1-phase-0-implementation.md
@@ -17,7 +17,9 @@ Acceptance, copied from the strategy doc §6:

- Per-tier baseline numbers for ≥ 3 of 4 D2 tiers with **bootstrap
95% CIs**, on the composite eval set.
- Per-tier 7-bucket error decomposition on the same set.
- Per-tier six-bucket error decomposition on the same set
(port of the apr-28 7-bucket harness; `muted_undetectable` deferred
until the §8 `TabEvent` contract carries a muted/X flag).
- Free-tier compute accounts (Local / Colab / Kaggle / Lightning / W&B)
verified.
- EGDB author email sent; reply tracked in `docs/DECISIONS.md`.
@@ -43,10 +45,10 @@ Acceptance, copied from the strategy doc §6:
| `tabvision/tests/unit/test_parser_guitarset_jams.py` | JAMS parser round-trip test |
| `tabvision/tests/unit/test_parser_guitar_techs_midi.py` | MIDI parser round-trip test |
| `tabvision/tests/unit/test_bootstrap_ci.py` | CI helper correctness on known distributions |
| `tabvision/tests/unit/test_error_decomposition.py` | 7-bucket assignment correctness on synthetic predicted/gold pairs |
| `tabvision/tests/unit/test_error_decomposition.py` | Per-bucket assignment correctness on synthetic predicted/gold pairs (six buckets populated) |
| `tabvision/tests/integration/test_composite_eval_smoke.py` | End-to-end smoke: 5-clip manifest → tier numbers exist + CIs computed |
| `docs/EVAL_REPORTS/composite_baseline_2026-05-13.md` | First baseline report (output of Phase 0E) |
| `docs/EVAL_REPORTS/tab_f1_error_decomposition_2026-05-13.md` | First 7-bucket decomposition (output of Phase 0D) |
| `docs/EVAL_REPORTS/tab_f1_error_decomposition_2026-05-13.md` | First six-bucket decomposition (output of Phase 0D) |

### 1.2 Modified files

@@ -215,8 +217,8 @@ Must contain:

Must contain:

- Aggregate 7-bucket table (counts + share-of-loss).
- Per-tier 7-bucket table.
- Aggregate six-bucket table (counts + share-of-loss).
- Per-tier six-bucket table.
- A "biggest lever per tier" callout: which bucket dominates each
tier's loss. Phase 1+ priorities derive from this.
