
Match single quotes in one O(n) pass instead of an O(n^2) scan #249

Merged
dereuromark merged 1 commit into master from perf/single-quote-on-pass-matching
Jun 16, 2026

@dereuromark

Contributor

What

InlineParser::buildSingleQuoteMatchCache() decides which single quotes are curly opener/closer pairs and which are apostrophes. It collected potential openers and closers into two lists, then matched each closer against the opener list with a nested scan, which costs O(closers × openers). On quote-heavy text both lists grow with the input, so the cost is quadratic and noticeably slow.
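For contrast, the two-list shape described above might have looked roughly like the sketch below. It is reconstructed from the prose only and is not the removed code: the function name `matchSingleQuotesNested()` is hypothetical, and treating string boundaries as whitespace and classifying bytes with `ctype_space()`/`ctype_punct()` are assumptions.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical reconstruction of the two-list approach (not the library's code).
 * Returns [openerIndex => closerIndex] for paired single quotes.
 */
function matchSingleQuotesNested(string $text): array
{
    $openers = [];
    $closers = [];
    $length = strlen($text);

    // Pass 1: classify every single quote as potential opener or closer.
    for ($i = 0; $i < $length; $i++) {
        if ($text[$i] !== "'") {
            continue;
        }
        // Assumption: start and end of the string behave like whitespace.
        $prev = $i > 0 ? $text[$i - 1] : ' ';
        $next = $i + 1 < $length ? $text[$i + 1] : ' ';
        $prevIsSpace = ctype_space($prev);
        $nextIsSpace = ctype_space($next);

        if ($prevIsSpace && !$nextIsSpace) {
            $openers[] = $i;
        } elseif (!$prevIsSpace && ($nextIsSpace || ctype_punct($next))) {
            $closers[] = $i;
        }
    }

    // Pass 2: for each closer, scan the opener list backwards for the
    // nearest preceding unmatched opener. This nested loop is the
    // O(closers x openers) part.
    $matched = [];
    $used = [];
    foreach ($closers as $closer) {
        for ($k = count($openers) - 1; $k >= 0; $k--) {
            $opener = $openers[$k];
            if ($opener < $closer && !isset($used[$opener])) {
                $matched[$opener] = $closer;
                $used[$opener] = true;
                break;
            }
        }
    }

    return $matched;
}
```

Under these assumptions, `matchSingleQuotesNested("say 'hi' now")` pairs the quote at index 4 with the one at index 7, while the apostrophe in `don't` is neither opener nor closer and stays unmatched.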

This PR folds classification and matching into a single forward pass using an opener stack:

if ($prevIsSpace && !$nextIsSpace) {
    $openerStack[] = $i;                      // potential opener
} elseif (!$prevIsSpace && $nextIsSpaceOrPunct && $openerStack) {
    $matched[array_pop($openerStack)] = $i;   // pair with innermost opener
}

The stack top is always the largest-index still-open opener, so popping it reproduces the former "nearest preceding unmatched opener" pairing exactly. Each quote is pushed and popped at most once, so the whole pass is O(n).
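To see the stack pairing end to end, here is a minimal self-contained sketch built around the snippet above. The function name `matchSingleQuotes()` is hypothetical, and the way `$prevIsSpace`, `$nextIsSpace`, and `$nextIsSpaceOrPunct` are computed here (byte-wise `ctype_space()`/`ctype_punct()`, string boundaries treated as whitespace) is an assumption; the PR does not show how the library derives them.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical stand-in for the single-pass matcher (not the library's code).
 * Returns [openerIndex => closerIndex] for paired single quotes.
 */
function matchSingleQuotes(string $text): array
{
    $matched = [];
    $openerStack = [];
    $length = strlen($text);

    for ($i = 0; $i < $length; $i++) {
        if ($text[$i] !== "'") {
            continue;
        }
        // Assumption: start and end of the string behave like whitespace.
        $prev = $i > 0 ? $text[$i - 1] : ' ';
        $next = $i + 1 < $length ? $text[$i + 1] : ' ';
        $prevIsSpace = ctype_space($prev);
        $nextIsSpace = ctype_space($next);
        $nextIsSpaceOrPunct = $nextIsSpace || ctype_punct($next);

        if ($prevIsSpace && !$nextIsSpace) {
            $openerStack[] = $i;                      // potential opener
        } elseif (!$prevIsSpace && $nextIsSpaceOrPunct && $openerStack) {
            $matched[array_pop($openerStack)] = $i;   // pair with innermost opener
        }
        // Anything else (e.g. the quote in "don't") is left unmatched.
    }

    return $matched;
}
```

With these assumptions, the nested input `'a 'b' c'` pairs the inner quotes first (index 3 with 5) and then the outer ones (index 0 with 8), because the inner opener sits on top of the stack when the first closer arrives.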

Behavior

Unchanged. The smart-quote semantics, including the official conformance corpus (tests/official/smart.test), render identically; only the cost changes. A new SingleQuoteMatchingTest locks the pairing for matched, lone-opener, nested, and apostrophe-then-pair cases.

This is a perf-only, behavior-preserving change. It deliberately does NOT adopt carve-php's flanking-rule rewrite (which drops opener/closer matching entirely); that diverges from djot.js semantics and would break the conformance corpus. Only the O(n) stack-merge idea is carried over.

Also removes the unused findMatchingSingleQuoteCloser() method, a leftover of the old approach.

buildSingleQuoteMatchCache() collected potential openers and closers into
two lists, then matched each closer against the opener list with a nested
scan - O(closers x openers), which degraded badly on quote-heavy text.

Fold the classification and matching into a single forward pass: push each
opener onto a stack and, at a closer, pop the top opener. The stack top is
always the largest-index still-open opener, so popping it reproduces the
former nearest-preceding-unmatched-opener pairing exactly, now in O(n).

Behavior is unchanged - the smart-quote semantics (including the official
conformance corpus) are identical; only the cost changes. Also drops the
unused findMatchingSingleQuoteCloser() relic of the old approach.
dereuromark added the enhancement label on Jun 16, 2026
@codecov

codecov Bot commented Jun 16, 2026


Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 92.37%. Comparing base (5cf4c4d) to head (695195f).

Additional details and impacted files
@@             Coverage Diff              @@
##             master     #249      +/-   ##
============================================
+ Coverage     92.10%   92.37%   +0.27%     
+ Complexity     3596     3576      -20     
============================================
  Files           107      107              
  Lines         10167    10129      -38     
============================================
- Hits           9364     9357       -7     
+ Misses          803      772      -31     


dereuromark merged commit 58a181a into master on Jun 16, 2026
6 checks passed
dereuromark deleted the perf/single-quote-on-pass-matching branch on June 16, 2026 at 15:10