parser.c: Use SWAR to skip consecutive spaces #886

Merged
byroot merged 1 commit into ruby:master from byroot:parser-whitespace-switch on Nov 1, 2025
byroot commented Nov 1, 2025

Member

Closes: #881

If we encounter a newline, the document is likely pretty-printed, so the newline is probably followed by several spaces of indentation.

In that case we can use SWAR (SIMD within a register) to check up to eight consecutive spaces at once.
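To illustrate the idea, here is a minimal sketch of SWAR space skipping. It is not the code from this PR: the helper names `count_leading_spaces8` and `skip_spaces` are hypothetical, and the sketch assumes a GCC- or Clang-compatible compiler (for `__builtin_ctzll` / `__builtin_clzll` and the `__BYTE_ORDER__` macro). Eight bytes are loaded into a 64-bit word and XORed with a word made of eight `0x20` bytes, so every space becomes a zero byte; the position of the first non-zero byte is then the number of leading spaces.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper (not from the PR): counts how many of the 8 bytes
 * starting at p are leading ' ' characters. Assumes 8 readable bytes at p
 * and a GCC/Clang-compatible compiler. */
static inline size_t count_leading_spaces8(const char *p)
{
    uint64_t chunk;
    memcpy(&chunk, p, sizeof(chunk));      /* safe unaligned 8-byte load */

    /* XOR with eight 0x20 bytes: each space byte becomes 0x00. */
    uint64_t diff = chunk ^ UINT64_C(0x2020202020202020);
    if (diff == 0) {
        return 8;                          /* all eight bytes are spaces */
    }

#if defined(__BYTE_ORDER__) && __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__
    /* Big-endian: p[0] is the most significant byte. */
    return (size_t)__builtin_clzll(diff) / 8;
#else
    /* Little-endian (assumed default): p[0] is the least significant byte. */
    return (size_t)__builtin_ctzll(diff) / 8;
#endif
}

/* Hypothetical helper: advances cursor past consecutive spaces,
 * eight bytes at a time while at least eight bytes remain. */
static const char *skip_spaces(const char *cursor, const char *end)
{
    while ((size_t)(end - cursor) >= 8) {
        size_t n = count_leading_spaces8(cursor);
        cursor += n;
        if (n < 8) {
            return cursor;                 /* hit a non-space byte */
        }
    }
    /* Fewer than eight bytes left: fall back to a byte-by-byte scan. */
    while (cursor < end && *cursor == ' ') {
        cursor++;
    }
    return cursor;
}
```

A parser could call something like `skip_spaces(cursor, end)` right after consuming a `\n`, and keep its ordinary per-byte whitespace handling for tabs, carriage returns, and isolated spaces.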

== Parsing activitypub.json (58160 bytes)
ruby 3.4.6 (2025-09-16 revision dbd83256b1) +YJIT +PRISM [arm64-darwin24]
Warming up --------------------------------------
               after     1.118k i/100ms
Calculating -------------------------------------
               after     11.223k (± 0.7%) i/s   (89.10 μs/i) -     57.018k in   5.080522s

Comparison:
              before:    10834.4 i/s
               after:    11223.4 i/s - 1.04x  faster

== Parsing twitter.json (567916 bytes)
ruby 3.4.6 (2025-09-16 revision dbd83256b1) +YJIT +PRISM [arm64-darwin24]
Warming up --------------------------------------
               after   118.000 i/100ms
Calculating -------------------------------------
               after      1.188k (± 1.0%) i/s  (841.62 μs/i) -      6.018k in   5.065355s

Comparison:
              before:     1094.8 i/s
               after:     1188.2 i/s - 1.09x  faster

== Parsing citm_catalog.json (1727030 bytes)
ruby 3.4.6 (2025-09-16 revision dbd83256b1) +YJIT +PRISM [arm64-darwin24]
Warming up --------------------------------------
               after    58.000 i/100ms
Calculating -------------------------------------
               after    570.506 (± 3.7%) i/s    (1.75 ms/i) -      2.900k in   5.091529s

Comparison:
              before:      419.6 i/s
               after:      570.5 i/s - 1.36x  faster

== Parsing float parsing (2251051 bytes)
ruby 3.4.6 (2025-09-16 revision dbd83256b1) +YJIT +PRISM [arm64-darwin24]
Warming up --------------------------------------
               after    22.000 i/100ms
Calculating -------------------------------------
               after    212.010 (± 1.9%) i/s    (4.72 ms/i) -      1.078k in   5.086885s

Comparison:
              before:      189.4 i/s
               after:      212.0 i/s - 1.12x  faster

FYI: @samyron

Co-Authored-By: Scott Myron <samyron@gmail.com>
byroot force-pushed the parser-whitespace-switch branch from 9cd6375 to b3fd7b2 on November 1, 2025 11:55
byroot merged commit acbf40b into ruby:master on Nov 1, 2025
37 checks passed
samyron commented Nov 1, 2025

Contributor

Thank you for the improvements!

byroot deleted the parser-whitespace-switch branch on November 1, 2025 16:04