Skip to content

gen: fix failing core roundtrip tests#155

Merged
webern merged 14 commits into
masterfrom
fix-core-dev
May 22, 2026
Merged

gen: fix failing core roundtrip tests#155
webern merged 14 commits into
masterfrom
fix-core-dev

Conversation

@webern
Copy link
Copy Markdown
Owner

@webern webern commented May 22, 2026

Milestone 3 of the gen project (#58). Closes the loop on the Milestone 2 regenerator (codenamed revgen for reverse engineering codegen): gets the full test suite green and improves how we handle MusicXML inputs that don't conform to the XSD.

What

  • make test-all green: 3028/3028, 9914 assertions. make test-core-dev 350/350.
  • Three bugs in hand-written core fixed:
    • XsString writer didn't escape \r, so carriage returns were normalized to \n on the next read (XML 1.0 §2.11).
    • CommaSeparatedListOfPositiveIntegers::parse discarded ", " vs "," spacing on import.
    • slash-type was treated as always-present, but the XSD allows empty <slash/> and <beat-repeat/>. Removed the CHILD_MIN_OCCURS_OVERRIDE and regenerated.
  • One harness fix: width is tenths-typed, so the import harness now strips its trailing zeros like other decimal fields.

Why

Many failing core-roundtrip cases were inputs that violate the MusicXML XSD. Diffing a round-trip against a schema-invalid input is not a useful signal. Introduced a {file}.invalid sibling-file convention (body documents the violation), honored by corert and ignored by api import. Documented in data/README.md. Eleven inputs marked .invalid.

The suite is now a meaningful fitness function for the next milestone: pointing the generator at MusicXML 4.0.

Deferred

MusicXML 4.0 UpDownNone backport (M2 issue A). TODOs in mx/impl/NotationsWriter.cpp:398 and mx/impl/ArpeggiateFunctions.cpp:35.

webern added 14 commits May 22, 2026 16:28
The decimal-field normalizer used during test prep already strips trailing
zeros from 9 tenths-typed fields so the expected XML matches what
PreciseDecimal::toStream emits. 'width' (a tenths attribute on measure /
note) was missing from the set, causing core roundtrip failures on inputs
like '<measure width="564.40">' that mx::core writes back as 'width="564.4"'.

Drops make test-core-dev failures 31 -> 14 (e.g. Invention 2, Invention 5,
Invention 4, Invention 6, ...). make test stays at 0.
Introduce a repo-wide marker convention: a sibling file {file}.invalid next
to any invalid MusicXML input. The body explains in prose why the file is
invalid. The core roundtrip suite (schema-strict, via mx::core::Document)
skips any input that has such a marker.

Other suites (e.g. api import) ignore the marker and continue exercising
reader leniency on the same files.

Marks data/musuite/testInvalid.xml (intentionally contains <invalid_element/>
and <another_invalid_element/> not present in the MusicXML XSD).

test-core-dev: 14 -> 13 failed. test: 0 -> 0 failed.
Static analysis sweep of all corert failures against the MusicXML XSD via
xmllint --schema. For each failing input cross-checked the schema
violation against the corert diff symptom; marked a file only when the
violation explains the round-trip diff.

Marked (schema violation -> corert diff explained):
- foundsuite/Deutscher Tanz D.820.1.xml (midi-program=0)
- foundsuite/O_Holy_Night-Adam-1871.xml (midi-channel=0)
- foundsuite/O_Holy_Night.xml (midi-channel=0)
- foundsuite/Rimsky-Korsakov Op11 No4.xml (sound/@dynamics=-1.11)
- lysuite/ly01e_Pitches_ParenthesizedAccidentals.xml ('double-flat'
  not in accidental enum)
- lysuite/ly32a_Notations.xml (empty <fret/>)
- lysuite/ly41g_PartNoId.xml (part missing required id)
- lysuite/ly74a_FiguredBass.xml (figured-bass missing required figure)
- lysuite/ly75a_AccordionRegistrations.xml (accordion-middle out of range)
- musuite/test_harmony.xml (kind enum violations, degree-type misorder)

Not marked (schema-valid; real library bug):
- lysuite/ly22b_Staff_Notestyles.xml
- lysuite/ly41e_StaffGroups_InstrumentNames_Linebroken.xml
- lysuite/ly45f_Repeats_InvalidEndings.xml (ending-number pattern
  allows both '1, 2, 3' and '1,2,3'; library is dropping spaces)

test-core-dev: 13 -> 3 failed. test: 0 -> 0 failed.
…rs parse

The MusicXML ending-number XSD pattern [1-9][0-9]*(, ?[1-9][0-9]*)* allows
both '1, 2, 3' and '1,2,3' as valid lexical forms of the same value. The
existing parser stripped whitespace before tokenising and never set
myIsSpacingDesired, so every output was the compact form regardless of
input. toString already handled both forms via the flag; only import
needed to record which form the source used.

Detection looks for digit-comma-space-digit so junk input that happens
to contain ', ' (covered by the existing StringsTest EndingNumber02
case) is unaffected.

Fixes core-roundtrip lysuite/ly45f_Repeats_InvalidEndings.xml.
test-core-dev: 3 -> 2 failed. test-all: 3028 cases, 2 failed (the two
remaining known corert bugs ly22b and ly41e). make check clean.
A literal 0x0D byte in element text content is normalized to 0x0A by
any XML parser per XML 1.0 §2.11 (End-of-line handling). To preserve a
carriage return across an XML round-trip, it must be emitted as the
numeric character reference &#xd;, which is not subject to EOL
normalization.

XsString::toStream previously escaped only <, >, and &. The reader path
already handled &#xd; correctly (pugixml expands the escape into a raw
\r after EOL normalization runs), so values like part-name text with
embedded carriage returns survived the load but were dropped on the
next write. Round-trip of lysuite/ly41e_StaffGroups_InstrumentNames_Linebroken.xml
turned 'Long\rStaff\rName' into 'Long\nStaff\nName'.

Fix is a one-case extension to the escape switch in XsString.cpp; the
type is hand-written so no codegen change is involved. The remaining
core-roundtrip failure (lysuite/ly22b_Staff_Notestyles.xml, child count
mismatch inside measure-style) is unrelated.

make test-core-dev: 2 -> 1 failed
make test:           0 -> 0 failed
make test-all:    3026 -> 3027 passed of 3028
…fixtures

After the gen fix that made slash-type properly optional (minOccurs=0
per the XSD slash group), default-constructed Slash and BeatRepeat
objects no longer emit <slash-type>eighth</slash-type> automatically.
Update the per-element tests and the shared Properties fixture builder
to call setHasSlashType(true) explicitly so they exercise the same
populated-slash case they were already asserting in their expected
streams.

No new assertions; only added the missing setter calls. Fixes the 23
mxtest/core/* test cases that broke under make test-all after the
gen fix in d43a222.
After M3 completion, condensed log.md, plan.md, state.md, and index.md.
Iteration-by-iteration narrative replaced with milestone summaries that
keep commit hashes, the audit trail, the operational invariants (lessons
captured during M2), and the project conventions. Drops stale figures
(generator line count, output file count) and stale gotchas that no
longer apply after M3.
@webern webern changed the title gen M3: clean green on core roundtrip + test-all (refs #58) gen: fix failing core roundtrip tests May 22, 2026
@webern webern merged commit 60b7122 into master May 22, 2026
5 checks passed
@webern webern deleted the fix-core-dev branch May 22, 2026 17:42
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant