
gen: reverse engineer code generator for mx/core #152

Merged
webern merged 4 commits into master from codegen
May 22, 2026

Conversation

@webern
Owner

@webern webern commented May 22, 2026

Relates to #58.

This branch introduces revgen, a new code generator under gen/ aimed at reproducing the hand-tuned mx/core element classes from the MusicXML XSD. It's the first concrete step toward unsticking us from MusicXML 3.0 and eventually generating 4.0 code.

What's in here

  • gen/ — revgen generator: a new generator that reverse-engineers the existing mx/core output from the MusicXML XSD, preserving the bespoke human choices baked into the current generated code.
  • src/private/mx/core/elements/ regeneration: mx/core/elements/ regenerated via revgen, so the tree now reflects what the generator can produce.
  • Revert of partial MusicXML 4.0 backports: removes incidental 4.0 additions and fixes typos that revgen exposed, so the baseline matches what the generator targets.
  • Small impl/test fixups: minor adjustments in mx/impl/ and mxtest/core/ to match the regenerated core.
  • docs/ai/projects/gen/: project planning, strategy notes, triage docs, and test-run logs from the M2 iteration cycle.

Scope / non-goals

  • No MusicXML 4.0 features are added in this PR — the goal is faithful reproduction of the current mx/core from XSD as a foundation.
  • The mx/api/ public surface is unchanged.

Validation

  • Round-trip and core tests were run iteratively during M2; logs are checked in under docs/ai/projects/gen/m2/ for traceability.
  • CI (make test-all, make check, Xcode targets) will exercise the regenerated core.

webern added 4 commits May 22, 2026 12:17
Adds the docs/ai/projects/gen project directory used to drive the
reverse-engineering of the mx/core codegen pipeline and the subsequent
fix-up work tracked as Milestone 2 (fix-gen).

Project structure:
- index.md, plan.md, state.md - milestone roadmap (M1 revgen → M2
  fix-gen → M3 coverage → M4 better-gen → M5 mxml4-types) and current
  session pointers.
- log.md plus log-archive/ - append-only session log for the current
  phase, with prior phases rotated into log-archive/.
- m2/ - Milestone 2 working materials:
  - triage.md classifies every non-generated change in the "src:
    issues caused by revgen" commit as BUG / BENIGN / WEIRD with
    severity.
  - triage-tests.md clusters the 62 make test-all failures left after
    Issues A-F into root-cause groups R1-R7 and assigns each test.
  - revgen-issues.diff freezes the raw diff under triage.
  - test-all-*.log files freeze make test-all output at each
    iteration baseline (R2, R4 i1-i6, R5, D3+D1 candidate, D4
    candidate - the last being the 0-failure M2 exit).
- strategy-subagent.md - subagent-per-iteration loop and prompt
  template used to chew through the M2 failing-test backlog.
- design/ - current-state design docs:
  - forensics.md analyzes the original Ruby codegen and the resulting
    mx/core patterns.
  - overrides.md catalogs overrides by taxonomy
    (RULE/EXC/FIX/SUBSTRATE/ANOMALY).

Also drops docs/sounds.xml (MusicXML 4.0 sounds reference data) and
archives the older revgen scoring doc under design/.archived/, since
make test-all pass/fail - not the diff penalty - is the fitness
function from M2 onward.

Lands the reverse-engineered code generator that reproduces the
existing src/private/mx/core/elements/ classes from docs/musicxml.xsd.
This is the Milestone 1 (revgen) deliverable from
docs/ai/projects/gen/plan.md: a generator that emits every C++ element
class in mx/core with no skipped elements (SKIP_ELEMENTS and
CHOICE_SKIP are both empty as of revgen iteration 40).

Contents:
- gen/generate.py - the generator (~13.8k lines of Python). Parses the
  XSD, walks the model, and emits .h/.cpp pairs into
  src/private/mx/core/elements/. Combines a shared rule-based path
  (TREE_ELEMENTS / TREE_ELEMENT_CONFIG) with six bespoke handlers
  (credit, lyric, part-list, harmony, score-wrapper-family, note)
  registered in BESPOKE_ELEMENTS for elements whose shape cannot be
  expressed by the shared path. Bespoke handlers still drive off the
  parsed XSD model so spec changes propagate automatically.
- gen/eval.py - diff scoring tool that compares generated output to
  the checked-in mx/core. Used as a secondary diagnostic signal during
  revgen; make test-all is the primary fitness function from M2 on.
- gen/eval_config.yaml - scoring category rules consumed by eval.py.
  Edits to this file require user approval (see project index.md).
- gen/.gitignore - ignores local generator scratch output.
- gen/gen_attrs.py, gen/gen_enum_members.py, gen/gen_enums.py -
  promoted out of gen/experiment/ now that they are part of the
  supported toolchain.
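
The split between the shared rule-based path and the bespoke handlers can be pictured as a registry lookup with a fallback. This is a minimal sketch, not the code in gen/generate.py: only the name `BESPOKE_ELEMENTS` comes from the commit message, while `XsdModel`, `emit_shared`, `emit_note`, `emit_element`, and the tiny model are hypothetical stand-ins chosen for illustration.

```python
from typing import Callable, Dict, List

# Hypothetical stand-in for the parsed XSD model: element name -> child names.
XsdModel = Dict[str, List[str]]


def emit_shared(name: str, model: XsdModel) -> str:
    """Shared rule-based path: one generic template for every element."""
    children = ", ".join(model.get(name, []))
    return f"// shared: {name} [{children}]"


def emit_note(name: str, model: XsdModel) -> str:
    """Bespoke handler: custom shape, but it still reads children from the model."""
    children = ", ".join(model.get(name, []))
    return f"// bespoke note: {name} [{children}]"


# Registry of elements the shared path cannot express (illustrative contents only).
BESPOKE_ELEMENTS: Dict[str, Callable[[str, XsdModel], str]] = {
    "note": emit_note,
}


def emit_element(name: str, model: XsdModel) -> str:
    """Use the bespoke handler when one is registered, else the shared path."""
    handler = BESPOKE_ELEMENTS.get(name, emit_shared)
    return handler(name, model)


# Toy model; a real one would be built by parsing docs/musicxml.xsd.
model: XsdModel = {"note": ["pitch", "duration"], "backup": ["duration"]}
```

Because both paths take the same model argument, a change to the schema reaches the bespoke output as well, which is the property the commit message calls out.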

Generated mx/core code is not checked in; the workflow is
`python3 gen/generate.py && make fmt && make test-all`, then reset
with `git checkout -- src/private/mx/core/ && git clean -fd
src/private/mx/core/`.
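
For anyone scripting that loop, here is a rough sketch of one way to wrap it. The commands are the ones quoted above; the `run_loop` and `run_cmd` names, the injectable runner, and the choice to always reset (even after a failure) are assumptions for illustration and are not part of this PR.

```python
import subprocess
from typing import Callable, List

# Commands quoted in the commit message, in order.
GENERATE_AND_TEST: List[List[str]] = [
    ["python3", "gen/generate.py"],
    ["make", "fmt"],
    ["make", "test-all"],
]
RESET: List[List[str]] = [
    ["git", "checkout", "--", "src/private/mx/core/"],
    ["git", "clean", "-fd", "src/private/mx/core/"],
]


def run_cmd(cmd: List[str]) -> int:
    """Run one command and return its exit code."""
    return subprocess.run(cmd).returncode


def run_loop(run: Callable[[List[str]], int] = run_cmd) -> bool:
    """Generate, format, test; stop at the first failure (like `&&`).

    Assumption: the tree is reset afterwards whether or not the steps passed.
    """
    ok = True
    try:
        for cmd in GENERATE_AND_TEST:
            if run(cmd) != 0:
                ok = False
                break
    finally:
        for cmd in RESET:
            run(cmd)
    return ok
```

Passing a fake `run` makes it possible to check the ordering without touching the working tree.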

Updates the non-generated consuming code so that it agrees with what the
revgen generator emits from the 3.x XSD in docs/musicxml.xsd. Changes are
classified per docs/ai/projects/gen/m2/triage.md.

ArpeggiateFunctions.cpp / NotationsWriter.cpp:
Revert a MusicXML 4.0 backport that had bolted a `none` value onto
core::UpDown by way of a hand-rolled core::UpDownNone enum used for
the arpeggiate `direction` attribute. The 3.x XSD only defines
up-down (up | down), so the generator emits ArpeggiateAttributes with
a core::UpDown direction; the impl layer must match. The
api::MarkType::arpeggiate (undirected) case now sets
hasDirection = false without assigning a value, and TODOs are left
in place noting that the 'none' value returns as first-class in
MusicXML 4.0 and should be restored under Milestone 5 (mxml4-types).
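
To illustrate why an XSD-driven generator cannot produce a `none` member here, the sketch below reads an enumeration out of a schema fragment and emits a C++ enum from it. The fragment is an abbreviated rendering of the 3.x `up-down` type, and `to_pascal`, `enum_values`, and `emit_enum` are hypothetical helpers; the real generator's output for core::UpDown may be shaped differently.

```python
import xml.etree.ElementTree as ET

XS = "{http://www.w3.org/2001/XMLSchema}"

# Abbreviated stand-in for the up-down simple type in the 3.x schema.
XSD_SNIPPET = """
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:simpleType name="up-down">
    <xs:restriction base="xs:token">
      <xs:enumeration value="up"/>
      <xs:enumeration value="down"/>
    </xs:restriction>
  </xs:simpleType>
</xs:schema>
"""


def to_pascal(xsd_name: str) -> str:
    """Turn an XSD name like 'up-down' into 'UpDown'."""
    return "".join(part.capitalize() for part in xsd_name.split("-"))


def enum_values(schema: ET.Element, type_name: str) -> list:
    """Return the enumeration values declared for the named simple type."""
    for simple_type in schema.iter(f"{XS}simpleType"):
        if simple_type.get("name") == type_name:
            return [e.get("value") for e in simple_type.iter(f"{XS}enumeration")]
    raise KeyError(type_name)


def emit_enum(schema: ET.Element, type_name: str) -> str:
    """Emit a one-line C++ enum whose members are exactly the XSD values."""
    members = ", ".join(enum_values(schema, type_name))
    return f"enum class {to_pascal(type_name)} {{ {members} }};"


schema = ET.fromstring(XSD_SNIPPET.strip())
```

Since the members come only from the schema, a `none` value would appear only once the schema itself declares it.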

ArpeggiateTest.cpp:
Matching UpDownNone::up -> UpDown::up change in the core-level test
fixture. Test semantics are unchanged; tests remain an invariant per
the project's cardinal rules.

MetronomeTest.cpp:
Fix three `setPerMinuteOtBeatUnitChoice` typos to
`setPerMinuteOrBeatUnitChoice` (Ot -> Or). The generator produces
the correctly-spelled setter name on BeatUnitPer, so the existing
test was relying on a typo that the regenerated header no longer
carries.

Part of Milestone 2 (fix-gen) in docs/ai/projects/gen/plan.md.

Output of `python3 gen/generate.py && make fmt` against the
hand-written src/private/mx/core/elements/ tree, run on top of the
revgen + M2 fix-gen work tracked in docs/ai/projects/gen/.

These files are *generated*, not hand-edited. The generator
(gen/generate.py) was reverse-engineered from the existing mx/core
sources and docs/musicxml.xsd as Milestone 1 of the gen project, then
iterated on through Milestone 2 (fix-gen) until make test-all reports
0 failures with the generated tree in place (final baseline:
docs/ai/projects/gen/m2/test-all-d4-candidate.log, 2678 test cases,
9914 assertions).

The project policy (plan.md, state.md) is normally to *not* check in
regenerated code - the loop is `generate → fmt → test-all → reset` and
the canonical mx/core stays hand-written. This commit deliberately
breaks that policy to snapshot the generator output on the tip of the
M2 work so the residual hand-written vs. generated diff is reviewable
in git history.

Scope: src/private/mx/core/elements/ only. 454 files changed
(+2890 / -2500). No new files; no removed files; SKIP_ELEMENTS and
CHOICE_SKIP in the generator are both empty. The remaining diff is
the "unexplained residual" the gen project's eval.py measures, and is
the surface area Milestone 3+ will continue to drive down.
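
As a rough picture of what a residual measurement involves, the sketch below counts added and removed lines between a hand-written file and its generated counterpart. It is not how gen/eval.py scores output (that tool applies category rules from gen/eval_config.yaml); `residual_lines` is a hypothetical helper built on the standard-library `difflib`.

```python
import difflib
from typing import Tuple


def residual_lines(hand_written: str, generated: str) -> Tuple[int, int]:
    """Return (added, removed) line counts going from hand-written to generated."""
    old = hand_written.splitlines()
    new = generated.splitlines()
    matcher = difflib.SequenceMatcher(a=old, b=new, autojunk=False)
    added = removed = 0
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            removed += i2 - i1  # lines present only in the hand-written text
            added += j2 - j1    # lines present only in the generated text
    return added, removed
```

Summing these pairs over a set of files would give a figure in the same shape as the "+2890 / -2500" quoted above, though the totals reported by git may be computed differently.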

Per docs/ai/projects/gen/state.md, the next session should NOT treat
this commit as canonical mx/core. To return to the hand-written tree,
revert this commit (or `git checkout <parent> -- src/private/mx/core/
&& git clean -fd src/private/mx/core/`).
@webern webern changed the title from "Add revgen: reverse-engineering code generator for mx/core" to "gen: reverse engineer code generator for mx/core" May 22, 2026
@webern webern merged commit 2eaac80 into master May 22, 2026
5 checks passed
@webern webern deleted the codegen branch May 22, 2026 10:44