perf: patch midi-file writeBytes to in-place push (13.4× writer speedup)#110
Closed
elicwhite wants to merge 14 commits into
Closed
perf: patch midi-file writeBytes to in-place push (13.4× writer speedup)#110elicwhite wants to merge 14 commits into
elicwhite wants to merge 14 commits into
Conversation
Widens ParsedChart.metadata to Partial<typeof defaultMetadata> & {
extraIniFields?: Record<string, string>}, and has parseChartAndIni merge
[Song]-section values + song.ini into it (ini wins, matching the
"song.ini is authoritative, [Song] is a legacy overlap" stance). Unknown
ini key/value pairs move from ParseChartAndIniResult.iniUnknownValues
onto parsedChart.metadata.extraIniFields for round-trip writing.
ParseChartAndIniResult.iniMetadata and iniUnknownValues are removed --
the same data is now reachable via result.parsedChart.metadata.
iniFolderIssues and iniMetadataIssues stay (they're diagnostics, not
duplicate data).
scanChart's diff-against-default checks now read the merged metadata via
parseResult.parsedChart.metadata; "set" is defined as "present AND not
equal to the default". This also fixes a pre-existing inconsistency
where chart.delay was taken from ini but chart.chart_offset was taken
from [Song] -- both now come from the merged metadata with ini winning.
Also derives defaultIniChartModifiers as a projection of defaultMetadata
so the 8 parse-behavior defaults can't drift from the 40-field source
of truth.
…art (state-derived) The three derived boolean flags on ParsedChart (hasLyrics, hasVocals, hasForcedNotes) were parse-time snapshots that went stale whenever consumers mutated chart data post-parse. Remove them from ParsedChart entirely; scanChart derives all three at scan time from the current chart state, piggybacking on its existing single note-walk (no extra iterations). - hasLyrics / hasVocals: computed from parsedChart.vocalTracks.parts — constant-time existence check. - hasForcedNotes: inverts resolveFretModifiers. A note is 'forced' iff its resolved flag disagrees with the natural HOPO state the parser would pick without any force events, or it carries the tap flag (tap can only come from explicit forceTap). Inlined helpers for isNaturalHopo, isFretChord, isSameFretNote, isInFretNote, computeHopoThresholdTicks in chart-scanner.ts (duplicated from notes-parser.ts; a follow-up can extract to a shared helper). Semantic change to note: under the new definition, redundantly-applied force events (e.g. explicit forceHopo on a naturally-HOPO note) no longer contribute to hasForcedNotes — the chart plays identically with or without them, so state-derived detection correctly says false. This eliminates the need for any 'hasForcedNotes backstop' in writers since the flag round-trips naturally: the writer emits force events exactly when a flag disagrees with natural state, and the scanner detects those same disagreements on re-parse. ScannedChart.notesData's shape is unchanged (hasLyrics / hasVocals / hasForcedNotes still present); only the population source changes. Consumers that previously read parsedChart.hasLyrics / parsedChart.hasVocals / parsedChart.hasForcedNotes must switch to scanChart output.
The chart scanner, chart writer, and MIDI writer each had their own
near-identical copies of isFretChord / isSameFretNote / isInFretNote /
isNaturalHopo over NoteEvent[], and the parser had the same three
helpers again over TrackEvent[]. This consolidates all five callsites
into src/chart/natural-hopo.ts.
The helpers are generic (<T, E extends { type: T }> + an isFret
predicate); thin wrappers are exported for each concrete group type:
- NoteEvent-based (scanner + writers): isFretChord / isSameFretNote /
isInFretNote, plus the combined isNaturalHopo.
- TrackEvent-based (parser's resolveFretModifiers):
isFretChordRawEvents / isSameFretNoteRawEvents /
isInFretNoteRawEvents. The parser still builds its natural-HOPO
check inline — it compares effectiveNotes to lastNotes while
passing the raw pre-coalesced events to isSameFretNoteRawEvents,
a subtlety no NoteEvent callsite has.
The isFretNoteType / isFretEventType predicates are also exported,
so the parser's two standalone isFretNote(...) callsites (filtering
event lists outside the group helpers) share the same definition too.
The parser's four local helpers (isFretNote, isSameFretNote, isFretChord,
isInFretNote) are deleted.
Net: ~70 LOC of duplication removed across scanner, parser, and (in
the writer branches further up the stack) the two writers. One file
owns every natural-HOPO rule.
This commit updates the scanner + parser; the writers move to the
shared helpers in the two branches further up the stack that own them.
Builds a minimal valid ParsedChart from scratch: default 480 resolution, 120 BPM at tick 0, 4/4 time signature at tick 0, empty tracks/sections/ metadata/vocal parts/unrecognized events. chartBytes defaults to an empty Uint8Array (no source bytes), format defaults to 'chart', iniChartModifiers to the library defaults. Options let callers override resolution, bpm, timeSignature, and format. Useful for programmatic chart generation (e.g. downstream code that builds charts up from scratch rather than parsing source bytes).
Serializes IniMetadata (Partial<typeof defaultMetadata> + extraIniFields) back to song.ini text with: - [song] header - Known fields emitted in the canonical defaultMetadata order - Undefined values skipped - Booleans as True/False (Clone Hero convention) - extraIniFields appended in insertion order - CRLF line endings Derives its field order from Object.keys(defaultMetadata) rather than hardcoding a separate FIELD_ORDER list, so new fields added to defaultMetadata automatically participate in writing.
…emission Port of the non-instrument-track half of the chart writer into scan-chart: - [Song] emits the subset of metadata the chart body supports (Name, Artist, Charter, Album, Genre, Year, Resolution, Offset, PreviewStart, Difficulty). Other fields live exclusively in song.ini. - [SyncTrack] emits tempos as `B millibeats` and time signatures as `TS numerator [denominator-exponent]`, sorted by tick with TS before B at the same tick. - [Events] emits section markers (wrapped as [section name] to survive the parser's greedy trailing-\] regex), end events, unrecognized global events (with bracket-stripping when source is .mid so round-trip to .chart emits naked text), and vocal phrase_start/phrase_end/lyric events from the normalized vocalTracks.parts.vocals. - Unrecognized chart sections are re-emitted verbatim. Instrument tracks ([ExpertSingle] etc.) land in a follow-up PR.
Completes writeChartFile with per-track section emission: - [ExpertSingle], [HardDrums], etc. named from instrument + difficulty. - Drum tracks: base notes (N 0..4 or 0..5 for 5-lane), cymbal markers (N 66/67/68) only in fourLanePro, accent (N 34..38) and ghost (N 40..44) markers matching the emitted note's eventType, one N 109 flam per group, double kick as N 32 only. - 5-lane round-trip: greenDrum+cymbal emits as N 4 (orange) while plain greenDrum stays at N 5; blueDrums coinciding with greenDrum+cymbal emit as N 5 to keep the parser's drumType detection on fiveLane. - Fret/GHL tracks: notes via the 5-fret and 6-fret maps, tap as N 6, forceUnnatural (N 5) when the note's hopo/strum flag disagrees with the natural-HOPO state. - Star power S 2, activation lanes S 64, flex lanes S 65/66, versus phrases S 0/1, solo sections as E 'solo'/'soloend' (tick + length - 1 for soloend to round-trip). - Per-track text events (including disco flip mix markers). - Disco-flip state transitions per difficulty from red/yellow drum disco/discoNoflip flags → 'mix <diff> drums0[d|dnoflip]' text events. - Coda events: if no 'coda' in unrecognizedEvents, synthesize from drumFreestyleSections where isCoda=true. - hasForcedNotes backstop: if hasForcedNotes is set but no forceUnnatural emitted, append N 5 0 at a vacant tick in the first fret track to preserve the flag on round-trip.
Port of the core MIDI writer infrastructure: - writeMidiFile(chart) entry point: builds a Format-1 MIDI and returns Uint8Array via midi-file's writeMidi - TEMPO TRACK: trackName + setTempo (from chart.tempos) + timeSignature (from chart.timeSignatures), all at their absolute ticks - EVENTS track: trackName + section text events (unwrapped so YARG's NormalizeTextEvent doesn't strip names containing ]) + [end] events + global events (bracket-wrap when source is .chart so MIDI output follows convention) + [coda] derived from drumFreestyleSections when not already present in unrecognizedEvents - Unrecognized MIDI tracks: verbatim pass-through with abs-tick → delta-tick conversion; duplicate track names suffixed to keep the internal trackMap unique - finalizeMidiTrack helper: sorts by tick with a type-priority tiebreaker, converts absolute ticks to delta times, appends endOfTrack Instrument tracks (PART DRUMS, PART GUITAR, …) and vocal tracks (PART VOCALS, HARM1/2/3) land in follow-up PRs.
Adds drum emission to writeMidiFile: - buildDrumTrack: groups one or more ParsedTrackData entries (one per difficulty) into a single PART DRUMS MIDI track. Emits trackName, delegates per-difficulty notes to emitDrumNotes, collects and dedupes instrument-wide star power (MIDI 116) / solo (MIDI 103) / activation lanes (MIDI 120), emits flex lanes (MIDI 126/127 with per-difficulty LDS velocity), passes through per-track text events and unrecognizedMidiEvents from the first difficulty. - emitDrumNotes: per-difficulty note emission with full modifier support: base drum notes (MIDI 96..100 expert / 84..88 hard / 72..76 medium / 60..64 easy); double-kick as MIDI 95 only, emitted AFTER regular kick at the same tick (YARG insertion order quirk); tom markers (MIDI 110/111/112) only in fourLanePro; accent (velocity 127) / ghost (velocity 1) encoding; one flam marker (MIDI 109) per group; 5-lane green+cymbal remapped to offset 4 (orange pad); blue-at-same-tick-as- green+cymbal remapped to offset 5 so fiveLane detection succeeds; fourLanePro green+tom fallback to offset 5 when a conflicting green+cymbal exists at the same tick across difficulties; disco-flip state-transition text events (`[mix <diff> drums0[d|dnoflip]]`). - fourLanePro sentinel greenTomMarker: when drumType=1 but no tom markers were emitted (all yellow/blue defaulted to cymbal, green-toms used offset-5 encoding), emit a MIDI 112 at a safe tick so scan-chart's drumType detection still picks fourLanePro. - [ENABLE_CHART_DYNAMICS] text event at tick 0 when any accent or ghost was emitted. - computeLengthOverrides: prevents scan-chart's trimSustains from collapsing adjacent short-sustain drum-note chains by attributing the combined chain length to the first note. - Shared helpers: addNoteOnOff, addNoteOnOffWithChannel (with zero-length seq-pairing to survive finalizeMidiTrack's sort), deduplicateSections, instrumentTrackNames table. Fret tracks (PART GUITAR, GHL) and vocal tracks (PART VOCALS, HARM1/2/3) remain no-ops until the follow-up PRs land — the writeMidiFile entry iterates trackData but skips any non-drum instrument group.
Adds fret-instrument emission to writeMidiFile. - buildFretTrack: one MIDI track per instrument group (guitar / bass / guitarghl / bassghl / keys / guitarcoop / rhythm / GHL variants); multiple difficulties merge into the same track. - emitFretNotes: 5-fret (95/83/71/59 diffStarts) and 6-fret/GHL (94/82/70/58) note-number mapping; open notes routed two ways: ENHANCED_OPENS mode (diffStart+0, preserves chord-with-open) when any group contains both open and non-open, else forceOpen-SysEx fallback (diffStart+1 + SysEx, collapses chords with open but safer next to animations at easy diffStart). Coalesces fret-note length with overlapping animations at the same tick:noteNumber via pre-computed seq slots so animation ordering survives re-parse. - Force modifier ranges (forceHopo / forceStrum / forceTap) via emitFretModifierRanges — inverts resolveFretModifiers: natural HOPO state is re-derived with the MIDI threshold (1 + res/3), and a modifier emits only when the resolved flag disagrees. forceHopo = diffStart+6 (5-fret) / +7 (GHL); forceStrum = +7 / +8; forceTap as SysEx. - reconstructModifierRanges: collapses per-tick hit sets into minimal contiguous ranges over noteTicksInOrder. - Instrument-wide emission: star power (MIDI 116), solo (MIDI 103), flex lanes (126/127 with per-difficulty LDS velocity), per-track text events, versus phrases (105/106), animations (with skip when the fret note above already emitted at the same tick:noteNumber), per-track unrecognizedMidiEvents. - [ENHANCED_OPENS] text event when that mode is active. - Helpers added: addSysExOnOff, reconstructModifierRanges, midiIsFretChord / midiIsSameFretNote / midiIsInFretNote. Vocal tracks (PART VOCALS / HARM1-3) land in the follow-up PR. Tests: 15 new test cases covering track naming, 5-fret + GHL note-number mapping, open-note encoding (both modes), force modifier emission (hopo/strum/tap + non-emission when natural matches), instrument-wide sections, and round-trip through parseChartAndIni for both 5-fret modifiers and GHL chord-with-open. 53 midi-writer tests, 397 total scan-chart tests passing at the tip.
Adds vocal-track emission to writeMidiFile. One MIDI track per vocal
part (vocals → PART VOCALS, harmony{1,2,3} → HARM{1,2,3}; HARM* not
PART HARM*, matching the convention used in the wild).
Phrase-marker emission is careful about YARG's CopyDownPhrases, which
at parse time copies HARM1's notePhrases onto HARM2/HARM3:
- PART VOCALS: notePhrases → 105 (player:2 → 106)
- HARM1: notePhrases → 105, staticLyricPhrases → 106
- HARM2: only staticLyricPhrases → 106 (note 105 comes from HARM1
via CopyDown on re-parse — would double-count otherwise)
- HARM3: no phrase markers at all
Lyrics and notes are union'd across both phrase sets (note 105 and 106
can have different boundaries; emitting the union keeps all lyrics
that belong on the track). Zero-length vocal notes are preserved via
per-event `seq` tags so finalizeMidiTrack keeps noteOn immediately
before its matching noteOff.
Star power (MIDI 116) is suppressed on HARM2/HARM3 — CopyDown recreates
it from HARM1 on re-parse. Range shifts (MIDI 0) and lyric shifts (MIDI
1) are per-part for lossless round-trip, with fallback to the
track-level arrays for the part that owns them (PART VOCALS, or HARM1
when PART VOCALS is absent). Raw vocal-track text events (stance
markers, facial anim triggers) are re-emitted verbatim so that
stance-only tracks survive round-trip — YARG marks a VocalsPart
non-empty iff it has phrases or text events.
Tests: 18 new cases covering track naming (PART VOCALS, HARM1/2/3 with
no PART HARM* prefix), phrase-marker routing per part, pitched/
percussion/out-of-range note emission, lyric union across phrase sets,
star power suppression on HARM2/3, range+lyric shifts, text events,
and round-trip through parseChartAndIni for both PART VOCALS and the
full harmony stack with CopyDown semantics preserved.
71 midi-writer tests, 415 total scan-chart tests passing.
End-to-end coverage for the scan-chart writers: build a ParsedChart,
serialize via writeChartFile/writeMidiFile, re-parse via parseChartAndIni,
and assert that structured fields survive.
The per-feature tests (chart-writer.test.ts, midi-writer.test.ts) already
cover granular correctness. This file's job is interactions across tracks
— e.g. a chart with drums + guitar + vocals populated together — and the
"everything enabled at once" shape the earlier tests skip.
Coverage:
- tempo/TS/sections/endEvents (both formats)
- drums: kick, tom/cymbal, accent, ghost, 2x-kick (both formats)
flam: .mid only (the .chart parser doesn't recognize N 109 — writer
emits it, parser gap is pre-existing)
- 5-fret: base colors + forceHopo + forceTap + sustains (both formats)
star power + solo sections (both formats)
- GHL: open + chord-with-open ENHANCED_OPENS path (both formats)
- vocals: PART VOCALS full shape; HARM1/2/3 with CopyDown semantics (.mid)
- multi-track: drums + guitar + bass (both formats); drums + guitar +
vocals (.mid)
- metadata: [Song] fields and chart_offset via .chart
18 new tests; 433 total passing.
Bundles a ParsedChart with non-chart files (audio, album art, extras)
into a single value, and provides the write-side companion to
parseChartAndIni's input shape:
files (in) ChartDocument (out)
---------------- -----------------------
notes.chart or notes.mid → parsedChart
song.ini → parsedChart.metadata
song.ogg, album.png, ... → assets (passthrough)
writeChartFolder produces the reverse transformation:
- notes.chart or notes.mid from writeChartFile / writeMidiFile based on
parsedChart.format
- song.ini from writeIniFile(parsedChart.metadata) — chart_offset is
naturally skipped since writeIniFile only emits keys in defaultMetadata
- every entry in doc.assets, in order
No separate `metadata` field on ChartDocument — it lives on
parsedChart.metadata (the consolidated shape produced by parseChartAndIni).
Tests: 9 cases covering format selection, ini content (chart_offset not
leaked, extraIniFields preserved), asset passthrough + ordering, and
full round-trip through parseChartAndIni for both .chart and .mid with
assets.
442 total scan-chart tests passing.
Writer.prototype.writeBytes was O(n²): this.buffer = this.buffer.concat(Array.prototype.slice.call(arr, 0)) Every call allocated a new array and replaced `this.buffer`, so writing N bytes via M calls cost O(N·M) time and churned the GC. Replace with an in-place push loop — each call is O(arr.length), total write is O(N). Measured on the writer autoresearch bench (2000 charts, 500 .chart + 1500 .mid, 8 workers): - mean: 90.115 ms → 6.721 ms (13.4× faster) - p50: 59.944 ms → 5.764 ms (10.4× faster) - p95: 274.443 ms → 17.255 ms (15.9× faster) - p99: 560.784 ms → 25.263 ms (22.2× faster) - wall: 24.972s → 5.496s (4.5× faster) - summed writer time: 162.206s → 12.098s (93% reduction) 0 hash mismatches, 442/442 tests still green — byte-identical output. CPU profile at baseline showed: 35.76% Writer.writeUInt8 21.06% Writer.writeBytes 13.81% GC of total writer time. Fixing writeBytes removes both the direct cost (it's a handful of the 19 writeBytes callsites downstream of every track write) and the GC pressure from the allocate-and-replace pattern.
This was referenced Apr 20, 2026
Contributor
Author
|
Closing — these patches were for the upstream |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Writer.prototype.writeBytesinmidi-file1.2.4 is O(n²):Every call allocates two new arrays and replaces
this.buffer, so writing N bytes via M calls costs O(N·M) and churns the GC. Replace with an in-place push loop — each call is O(arr.length), total write is O(N).Measurement
I set up an autoresearch bench that parses + writes + re-parses 2000 charts (500
.chart+ 1500.mid) across 8 workers, timing only thewriteChartFile/writeMidiFilecall and validating every re-parse against a baseline ScannedChart-equivalent hash. CPU profile at baseline showed 35.76% inWriter.writeUInt8+ 21.06% inWriter.writeBytes+ 13.81% GC — all inmidi-fileitself.13.4× speedup on mean writer time. 0 hash mismatches, 442/442 tests pass — byte-identical output.
Test plan
yarn testgreenStack (bottom → top)