Add property tests for WP_Token_Map #48
1. Array Export Key Length
Bug:
What would break: Any caller round-tripping through
Fix: Use
Why important:

2.
Bug:
Bug: If the would-be long-token group key contained NUL, lookup returned
Bug:
What would break: Parsers calling
Fix: Gate long-token lookup on
Why important:

3. ASCII Matching
Bug: Small-token
Bug: Short-token ASCII-insensitive matching had a parenthesization error: it effectively uppercased a boolean comparison instead of uppercasing the stored byte. A token like
Bug: Case folding needed to be explicitly ASCII-only and byte-oriented. Non-ASCII bytes must stay literal.
What would break: ASCII-insensitive lookup could produce false negatives for short tokens, possible false positives from packed storage boundary matches, and incorrect behavior around high bytes if comparison semantics were not explicitly byte-preserving.
Fix: Scan small-token storage by fixed record boundaries; add
Why important: The API mode is specifically

4. Folded Group-Key Collisions
Bug: Long tokens are grouped by a fixed-length prefix. In ASCII-insensitive mode, multiple stored group keys can fold to the same lookup key, e.g.
What would break: Tokens in later folded-equivalent groups were invisible to
Fix: Add
Why important: Longest-match behavior is a core parser invariant. Missing one folded-equivalent group can produce wrong tokenization, not just a missed optimization.

5. Precomputed Source Escaping
Bug:
What would break: Generated PHP source could become syntactically invalid or semantically different. Examples:
Fix: Build long-group binary data with
Why important: Precomputed tables are meant to be pasted/evaluated as PHP source for fast static loading. If source generation is not byte-safe, the optimization path can corrupt maps or generate invalid PHP for perfectly valid token data.
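To make the ASCII-only, byte-oriented folding from item 3 concrete, here is a minimal sketch. The helper names `ascii_upper_byte()` and `ascii_fold_equals()` are hypothetical and are not taken from `WP_Token_Map`; the sketch only illustrates folding each byte before comparing, and leaving every non-letter byte untouched.

```php
<?php
// Hypothetical helper (not from WP_Token_Map): uppercase one byte, ASCII letters only.
function ascii_upper_byte( string $byte ): string {
	$code = ord( $byte );
	// Only 0x61-0x7A ('a'-'z') are folded; NUL, '$', and bytes >= 0x80 stay literal.
	return ( $code >= 0x61 && $code <= 0x7A ) ? chr( $code - 0x20 ) : $byte;
}

// Hypothetical helper: byte-wise equality under ASCII-only case folding.
function ascii_fold_equals( string $a, string $b ): bool {
	$length = strlen( $a );
	if ( strlen( $b ) !== $length ) {
		return false;
	}
	for ( $i = 0; $i < $length; $i++ ) {
		// Fold each byte, then compare. Folding the result of the comparison
		// instead is the kind of parenthesization slip described in item 3.
		if ( ascii_upper_byte( $a[ $i ] ) !== ascii_upper_byte( $b[ $i ] ) ) {
			return false;
		}
	}
	return true;
}

var_dump( ascii_fold_equals( 'Ab', 'aB' ) );             // bool(true)
var_dump( ascii_fold_equals( "\xC3\xA9", "\xC3\x89" ) ); // bool(false)
```

The second call compares the UTF-8 bytes of "é" and "É". Their trailing bytes differ and neither is an ASCII letter, so under these assumptions they do not match.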
What
Adds deterministic property tests for WP_Token_Map and fixes divergences they exposed.

Why
Existing tests covered fixture behavior, but not adversarial/generated token sets, non-default key_length, ASCII-insensitive byte semantics, or generated source escaping.

Details
- contains() vs a linear reference
- read_token() vs a longest-match reference
- to_array() / from_array() behavioral round-trips
- key_length=1 export in to_array().
- $, control bytes, and high bytes.

Verification
vendor/bin/phpcs --standard=phpcs.xml.dist src/wp-includes/class-wp-token-map.php tests/phpunit/tests/wp-token-map/wpTokenMapProperties.php
WP_TESTS_SKIP_INSTALL=1 ./vendor/bin/phpunit --group token-map
WP_TESTS_SKIP_INSTALL=1 ./vendor/bin/phpunit --group html-api-token-map

Trac ticket:
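As a rough illustration of the reference models listed under Details, the sketch below defines a linear-scan membership check and a longest-match reader, then checks one property over a seeded, generated token set. The names `reference_contains()` and `reference_read_token()`, the alphabet, and the seed are all assumptions made for this example and are not the code in this PR. A real property test would compare these references against `WP_Token_Map`, which is left out here so the snippet runs without WordPress loaded.

```php
<?php
// Reference model (hypothetical): membership by exact, byte-wise linear scan.
function reference_contains( array $tokens, string $word ): bool {
	return in_array( $word, $tokens, true );
}

// Reference model (hypothetical): longest token that prefixes $text at $offset, or null.
function reference_read_token( array $tokens, string $text, int $offset = 0 ): ?string {
	$best = null;
	foreach ( $tokens as $token ) {
		$is_prefix = '' !== $token && substr( $text, $offset, strlen( $token ) ) === $token;
		if ( $is_prefix && ( null === $best || strlen( $token ) > strlen( $best ) ) ) {
			$best = $token;
		}
	}
	return $best;
}

mt_srand( 48 ); // A fixed seed keeps the generated cases deterministic.
$alphabet = array( 'a', 'A', 'b', "\x00", '$', "\xC3" );
$tokens   = array();
for ( $i = 0; $i < 25; $i++ ) {
	$token  = '';
	$length = mt_rand( 1, 4 );
	for ( $j = 0; $j < $length; $j++ ) {
		$token .= $alphabet[ mt_rand( 0, count( $alphabet ) - 1 ) ];
	}
	$tokens[] = $token;
}

foreach ( $tokens as $token ) {
	// 'z' is outside the alphabet, so no token can extend past $token in this text.
	if ( reference_read_token( $tokens, $token . 'zz' ) !== $token ) {
		throw new RuntimeException( 'Longest-match property failed for ' . bin2hex( $token ) );
	}
	if ( ! reference_contains( $tokens, $token ) ) {
		throw new RuntimeException( 'Membership property failed for ' . bin2hex( $token ) );
	}
}
echo "all properties held\n";
```

The alphabet deliberately mixes case variants, NUL, `$`, and a high byte, since those are the inputs the description calls out as previously untested.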
Use of AI Tools
Yes! Codex GPT 5.5, Claude Fable 5, others.
This Pull Request is for code review only. Please keep all other discussion in the Trac ticket. Do not merge this Pull Request. See GitHub Pull Requests for Code Review in the Core Handbook for more details.