CORE-9082#14
Merged
bfcdfd3 to 8917632
MariusVolkhart approved these changes May 22, 2026
RyanLuMaye approved these changes May 28, 2026
Comment on lines +315 to +322
val generex = Generex("[\\W]")
repeat(10_000) {
    val result = generex.random()
    assertThat(result).hasLength(1)
    val c = result[0]
    val isWord = c == '_' || c in 'a'..'z' || c in 'A'..'Z' || c in '0'..'9'
    assertThat(isWord).isFalse()
}
Suggested change
- val generex = Generex("[\\W]")
- repeat(10_000) {
-     val result = generex.random()
-     assertThat(result).hasLength(1)
-     val c = result[0]
-     val isWord = c == '_' || c in 'a'..'z' || c in 'A'..'Z' || c in '0'..'9'
-     assertThat(isWord).isFalse()
- }
+ val generex = Generex("[\\W]")
+ val expected = Pattern.compile("\\W")
+ repeat(10_000) {
+     assertThat(generex.random()).matches(expected)
+ }
Same with other tests
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
8917632 to a115630
Add documentation for known limitations, and fix a long-standing bug (originally introduced in 2014) where shorthand classes inside a character class produced invalid generations.
Previously, [a-z\d] was rewritten by a post-pass replaceAll to [a-z[0-9]]. Brics, the underlying regex engine, does not support nested character classes: it parsed the outer [...] as the class [a-z[0-9] followed by a literal ], so every generated string ended with one or more stray ] characters.
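To make the failure mode concrete, here is a minimal Kotlin sketch. Both `naiveExpand` and `splitAtFirstClose` are hypothetical names invented for illustration; they are not the library's code and not Brics' parser. The first mimics a context-blind post-pass replacement. The second models, in a deliberately simplified way, how a parser without nested-class support ends a class at the first `]`.

```kotlin
// Hypothetical stand-in for the old post-pass: replaces every literal \d
// with [0-9], with no knowledge of whether it sits inside a class already.
fun naiveExpand(pattern: String): String =
    Regex("""\\d""").replace(pattern, "[0-9]")

// Simplified model (an assumption, not Brics' real parser) of a class parser
// without nesting: the class ends at the first ']' and the rest is literal.
fun splitAtFirstClose(pattern: String): Pair<String, String> {
    val close = pattern.indexOf(']')
    require(close >= 0) { "pattern has no closing bracket" }
    return pattern.substring(0, close + 1) to pattern.substring(close + 1)
}
```

Under this model, `naiveExpand("[a-z\\d]")` yields `[a-z[0-9]]`, and `splitAtFirstClose` on that result returns the class `[a-z[0-9]` plus a leftover `]`, which is the stray character described above.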
Shorthand expansion now happens inline while normalizing the pattern, with awareness of whether the cursor is inside a character class.
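The idea of context-aware expansion can be sketched as a single left-to-right pass. This is an illustration under stated assumptions, not the code from this PR: the function name `expandShorthands`, the `SHORTHAND_BODIES` table, and the exact expansions are hypothetical, only the positive shorthands are handled (negated ones such as \W need a complement inside a class and are left untouched here), and a `]` placed first in a class is not treated specially.

```kotlin
// Illustrative expansion table (assumption); the real table may differ.
val SHORTHAND_BODIES: Map<Char, String> = mapOf(
    'd' to "0-9",
    'w' to "a-zA-Z_0-9",
    's' to " \t\n\r",
)

fun expandShorthands(pattern: String): String {
    val out = StringBuilder()
    var inClass = false // true while the cursor is between '[' and ']'
    var i = 0
    while (i < pattern.length) {
        val c = pattern[i]
        if (c == '\\' && i + 1 < pattern.length) {
            val next = pattern[i + 1]
            val body = SHORTHAND_BODIES[next]
            when {
                // Unknown escape (including \[ and \]): copy verbatim.
                body == null -> out.append(c).append(next)
                // Inside a class: splice the bare ranges, no extra brackets.
                inClass -> out.append(body)
                // Outside a class: wrap the ranges in their own class.
                else -> out.append('[').append(body).append(']')
            }
            i += 2
            continue
        }
        if (c == '[' && !inClass) {
            inClass = true
        } else if (c == ']' && inClass) {
            inClass = false
        }
        out.append(c)
        i++
    }
    return out.toString()
}
```

With this sketch, `[a-z\d]` becomes `[a-z0-9]` rather than a nested class, while a bare `\d+` still becomes `[0-9]+`.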
Also fixes a related off-by-one in [^X...] where the leading ^ was being emitted twice.
Added parameterized tests covering every shorthand in every position inside [...] (alone, with literal neighbors, with explicit ranges, in negated outer classes, under quantifiers), plus regression tests for every entry in the new LIMITATIONS.md.
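The test source is not shown in this conversation, so the following is only a sketch of how such a case matrix could be enumerated. The names `SHORTHAND_TOKENS`, `POSITION_TEMPLATES`, and `shorthandCases` are hypothetical, and the templates are one guess at what "every position" might look like.

```kotlin
// The six shorthand tokens, written as they appear in a regex source string.
val SHORTHAND_TOKENS = listOf("\\d", "\\D", "\\w", "\\W", "\\s", "\\S")

// One template per position named in the description (assumed shapes).
val POSITION_TEMPLATES = listOf(
    "[%s]",     // alone
    "[a%sb]",   // with literal neighbors
    "[a-f%s]",  // with an explicit range
    "[^%s]",    // in a negated outer class
    "[%s]{3}",  // under a quantifier
)

// Cross product of shorthands and positions, one regex per combination.
fun shorthandCases(): List<String> =
    SHORTHAND_TOKENS.flatMap { token ->
        POSITION_TEMPLATES.map { template -> template.replace("%s", token) }
    }
```

Each resulting pattern could then be fed to a parameterized test that generates strings and checks them against the same pattern compiled with `java.util.regex.Pattern`, in the style of the review suggestion above.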