Fix block content leaking out of marker-line nested-list items #251
Merged
A nested list opened on its parent item's marker line (`- - A`) was treated as line-scoped: the inner list was materialized from an isolated single-line slice, so its item closed immediately. A block placed in that item then leaked out to the outer item, and a following same-indent marker fragmented the list into two.
The reference djot.js 0.3.2 keeps the inner item open: it absorbs blocks indented to its content column and continues the list across following same-indent markers. djot-php now matches that.
When an item's content itself begins with a list marker, collect the whole nested region (lines indented past the inner marker column, plus markers at it, across blank lines) into the item's lines and parse them as blocks. The existing recursive list parser then builds a persistent inner list - reusing the nested-list handling that already works for a sublist appearing on a following line, rather than adding a parallel path. A non-marker line at the inner marker column, or anything less indented, stays outer-item content.
Cases now matching the reference:
- `- - A\n\n    block for A\n  - B` -> inner item A keeps the block; B stays in the same inner list.
- `- - A\n\n    block under A` -> block stays inside inner item A (no leak to the outer item).
- `- - A\n  - B\n  - C` -> single tight inner list [A, B, C] (unchanged).
The paragraph-interrupt rule and bare-marker rejection are untouched.
carve-php inherits this parser and was affected by the same bug.
Codecov Report
❌ Patch coverage is
Additional details and impacted files
@@ Coverage Diff @@
## master #251 +/- ##
============================================
- Coverage 92.37% 92.34% -0.04%
- Complexity 3576 3593 +17
============================================
Files 107 107
Lines 10129 10165 +36
============================================
+ Hits 9357 9387 +30
- Misses 772 778 +6
The bug
A nested list opened on its parent item's marker line (`- - A`) was treated as line-scoped. The inner list was materialized from an isolated single-line slice, so its item closed immediately at the line break. As a result:
- a block placed in that item leaked out to the outer item;
- a following same-indent marker fragmented the list into two.
The reference djot.js 0.3.2 keeps the inner item open: it absorbs blocks indented to its content column and continues the list across following same-indent markers. djot-php now matches that.
(carve-php inherits this parser and was affected by the same bug.)
The fix
When an item's content itself begins with a list marker (the marker-line sublist case), collect the whole nested region (lines indented past the inner marker column, plus markers sitting at it, across blank lines) into the item's lines and parse them as blocks. The existing recursive list parser then builds a persistent inner list - reusing the nested-list handling that already works for a sublist appearing on a following line, rather than adding a parallel path.
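Boundary rules (verified against reference djot.js 0.3.2):
- Lines indented past the inner marker column belong to the nested region, including across blank lines.
- A marker sitting at the inner marker column continues the inner list.
- A non-marker line at the inner marker column, or anything less indented, stays outer-item content.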
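
The collection step and its boundary rules can be sketched roughly as follows. This is an illustrative Python model, not the djot-php code: `collect_nested_region`, `indent_of`, and the bullet-only `BULLET` pattern are hypothetical names and simplifications (real djot recognizes more marker types), and the indentation in the sample input is an assumption chosen to match the rules described above.

```python
import re

# Assumption: only bullet markers followed by a space count as list markers.
BULLET = re.compile(r"[-+*] ")


def indent_of(line):
    """Number of leading spaces on a line."""
    return len(line) - len(line.lstrip(" "))


def collect_nested_region(lines, start, inner_col):
    """Gather the lines of a nested list opened on the marker line lines[start].

    inner_col is the column of the inner list marker (the outer item's
    content column). Returns (region, next_index): region lines have
    inner_col leading columns removed, and next_index is the first line
    left for the outer item.
    """
    region = [lines[start][inner_col:]]
    pending_blanks = []  # blank lines are kept only if the region continues
    i = start + 1
    while i < len(lines):
        line = lines[i]
        if not line.strip():
            pending_blanks.append("")
            i += 1
            continue
        indent = indent_of(line)
        deeper = indent > inner_col
        sibling_marker = indent == inner_col and BULLET.match(line, inner_col)
        if not (deeper or sibling_marker):
            break  # non-marker at the inner column, or less indented
        region.extend(pending_blanks)
        pending_blanks = []
        region.append(line[inner_col:])
        i += 1
    # Trailing blank lines are handed back to the outer item.
    return region, i - len(pending_blanks)


doc = ["- - A", "", "    block for A", "  - B", "", "  outer text"]
region, next_index = collect_nested_region(doc, 0, 2)
print(region)      # ['- A', '', '  block for A', '- B']
print(next_index)  # 4
```

In this model the returned region is what would be handed to the block parser, so the inner list is built from all of its lines instead of a single-line slice, while `outer text` remains with the outer item.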
Cases now matching the reference
Case 1 - input `- - A\n\n    block for A\n  - B`: inner item A keeps the block; B stays in the same inner list.
Case 2 - input `- - A\n\n    block under A`: the block stays inside inner item A, with no leak to the outer item.
Case 3 - input `- - A\n  - B\n  - C` (already worked, kept as a regression guard): single tight inner list `[A, B, C]`.
Scope
The change is confined to `BlockParser::tryParseList()`; a corpus diff of 33 non-marker-line list inputs shows zero output changes versus master. New tests pin Cases 1, 2, and 3 to exact HTML.
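
As a rough illustration of the tree shapes the three cases describe, here is a toy recursive parser in Python. It is a sketch under stated assumptions, not the djot-php parser or its test suite: `parse_blocks`, `parse_list`, and `render` are hypothetical names, only bullet markers and paragraphs are modeled, tight versus loose lists are not distinguished, the `p(...)`/`ul(...)`/`li(...)` notation stands in for HTML, and the indentation of the inputs is assumed.

```python
import re

BULLET = re.compile(r"[-+*] ")  # assumption: bullet markers only
WIDTH = 2                       # marker character plus one space


def indent_of(line):
    return len(line) - len(line.lstrip(" "))


def parse_list(lines, i):
    """Parse consecutive items whose markers sit at column 0 of `lines`."""
    items = []
    while i < len(lines) and BULLET.match(lines[i]):
        item_lines = [lines[i][WIDTH:]]  # content after the marker
        pending = []                     # blanks kept only if the item continues
        i += 1
        while i < len(lines):
            line = lines[i]
            if not line.strip():
                pending.append("")
            elif indent_of(line) > 0:    # indented past the marker column
                item_lines += pending + [line[min(indent_of(line), WIDTH):]]
                pending = []
            else:
                break
            i += 1
        # The whole item, marker-line content included, is parsed as blocks,
        # so a sublist that starts on the marker line stays open.
        items.append(parse_blocks(item_lines))
    return items, i


def parse_blocks(lines):
    """Parse lines into ('para', text) and ('list', items) blocks."""
    blocks, i = [], 0
    while i < len(lines):
        if not lines[i].strip():
            i += 1
        elif BULLET.match(lines[i]):
            items, i = parse_list(lines, i)
            blocks.append(("list", items))
        else:
            start = i
            while i < len(lines) and lines[i].strip():
                i += 1
            blocks.append(("para", " ".join(s.strip() for s in lines[start:i])))
    return blocks


def render(blocks):
    """Compact tree notation: p(text), ul(li(...) li(...))."""
    parts = []
    for kind, value in blocks:
        if kind == "para":
            parts.append(f"p({value})")
        else:
            parts.append("ul(" + " ".join(f"li({render(item)})" for item in value) + ")")
    return " ".join(parts)


case1 = "- - A\n\n    block for A\n  - B".split("\n")
print(render(parse_blocks(case1)))
# ul(li(ul(li(p(A) p(block for A)) li(p(B)))))
```

Because the item's lines are parsed recursively as blocks, the same code path handles a sublist that starts on the marker line and one that starts on a following line, which mirrors the design choice described under "The fix".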