Elements: Guard against non-string className in render filter#12028
Elements: Guard against non-string className in render filter#12028aaronrobertshaw wants to merge 8 commits into
Conversation
|
The following accounts have interacted with this PR and/or linked issues. I will continue to update these lists as activity occurs. You can also manually ask me to refresh this list by adding the Core Committers: Use this line as a base for the props when committing in SVN: To understand the WordPress project's expectations around crediting contributors, please review the Contributor Attribution page in the Core Handbook. |
Test using WordPress PlaygroundThe changes in this pull request can previewed and tested using a WordPress Playground instance. WordPress Playground is an experimental project that creates a full WordPress instance entirely within the browser. Some things to be aware of
For more details about these limitations and more, check out the Limitations page in the WordPress Playground documentation. |
andrewserong
left a comment
There was a problem hiding this comment.
This defensive check makes good sense to me, the added test looks good and is passing for me locally.
LGTM 🚀
Co-authored-by: Mukesh Panchal <mukeshpanchal27@users.noreply.github.com>
|
This PR also addresses the following PHPStan rule level 10 errors: |
…ass_name() implementations
|
I just noticed that the changes here are extremely similar to something else we did recently in r62359 (20b5d10, via #11686). Therefore, I've aligned Now that we have two instances of the same code, it probably makes sense to factor out into a helper for parsing CSS class names. cc @sirreal @dmsnell |
…lass_name()` block render filter callback. The `wp_render_elements_class_name()` function reads the `className` block attribute and passes it straight to `preg_match()`. While `className` is expected to be a string, malformed or corrupted stored block data can hold another type, such as an array, which triggers a fatal `TypeError` on PHP 8+. Guard against this by returning the block content unchanged when `className` is not a string. While here, align the implementation with `wp_render_custom_css_class_name()` from r62359: replace the regular expression with a `str_contains()` short-circuit followed by an HTML-spec-compliant `strtok()` walk over the class tokens. This also corrects a latent matching bug: the previous `\bwp-elements-\S+\b` pattern treated the hyphen as a word boundary, so a class such as `my-wp-elements-foo` was erroneously matched. Tokenizing the attribute first ensures only a standalone `wp-elements-*` class is applied. Add regression tests for the non-string and substring-prefix cases, and resolve PHPStan errors at rule level 10 (`missingType.iterableValue`, `offsetAccess.nonOffsetAccessible`, `argument.type`) by adding an array shape to the `@phpstan-param` tag. Developed in #12028 and WordPress/gutenberg#78841. Follow-up to r58074, r62359. Props aaronrobertshaw, andrewserong, mukesh27, westonruter. Fixes #65379. git-svn-id: https://develop.svn.wordpress.org/trunk@62475 602fd350-edb4-49c9-b593-d223f7449a82
…lass_name()` block render filter callback. The `wp_render_elements_class_name()` function reads the `className` block attribute and passes it straight to `preg_match()`. While `className` is expected to be a string, malformed or corrupted stored block data can hold another type, such as an array, which triggers a fatal `TypeError` on PHP 8+. Guard against this by returning the block content unchanged when `className` is not a string. While here, align the implementation with `wp_render_custom_css_class_name()` from r62359: replace the regular expression with a `str_contains()` short-circuit followed by an HTML-spec-compliant `strtok()` walk over the class tokens. This also corrects a latent matching bug: the previous `\bwp-elements-\S+\b` pattern treated the hyphen as a word boundary, so a class such as `my-wp-elements-foo` was erroneously matched. Tokenizing the attribute first ensures only a standalone `wp-elements-*` class is applied. Add regression tests for the non-string and substring-prefix cases, and resolve PHPStan errors at rule level 10 (`missingType.iterableValue`, `offsetAccess.nonOffsetAccessible`, `argument.type`) by adding an array shape to the `@phpstan-param` tag. Developed in WordPress/wordpress-develop#12028 and WordPress/gutenberg#78841. Follow-up to r58074, r62359. Props aaronrobertshaw, andrewserong, mukesh27, westonruter. Fixes #65379. Built from https://develop.svn.wordpress.org/trunk@62475 git-svn-id: http://core.svn.wordpress.org/trunk@61756 1a063a9b-81f0-0310-95a4-ce76da25c4cd
|
thanks @westonruter
I was thinking that the last place we left this was that it was unclear if the abstraction warranted a new method, or a basic wrapper with the Tag Processor. $class_processor = new WP_HTML_Tag_Processor( '<div>' );
$class_processor->next_tag();
$class_processor->set_attribute( 'class', $processor->get_attribute( 'class' ) );
foreach ( $class_processor->class_list() as $class_name ) {
…
}one of the issues I remember is that this is actually a quite odd task to perform because it requires extracting the probably okay IMO to let some of these sit duplicated for a while, but always happy to continue to explore what WordPress could provide to make things better. |
|
@dmsnell right, in the case here, there would not be any HTML entities, so directly we wouldn't need to do that encoding work. |
I wonder how common this is and how useful it might be broadly for extenders. Have you thought about further about the utility? Here's some sketching, if it seems valuable we can create a ticket: /**
* Splits a string containing multiple class names yielding individual values.
*
* This operates on _plaintext_. DO NOT USE WITH HTML ENCODED VALUES. It will
* not perform any decoding. Only use on plaintext inputs!
*
* Named for classnames for discoverability as most common use case, but suitable
* for any HTML whitespace delimited list (DOMTokenList).
*
* For HTML encoded values, use {@see WP_HTML_Tag_Processor}.
*
* Splits on HTML ASCII whitespace. This matches the behavior of DOMTokenList
* and is suitable for class lists (…add more things that use DOMTokenList).
*
* @todo add examples to documentation…
*
* @param string $classnames Plaintext space-delimited list.
* @return Generator<string> Individual items.
*/
function classnames_to_class_list( $classnames ): Generator {
if ( ! is_string( $classnames ) ) {
// _doing_it_wrong?
return;
}
$delimiters = " \t\n\f\r";
$length = strlen( $classnames );
$offset = 0;
while ( $offset < $length ) {
$offset += strspn( $classnames, $delimiters, $offset );
if ( $offset >= $length ) {
break;
}
$token_length = strcspn( $classnames, $delimiters, $offset );
yield substr( $classnames, $offset, $token_length );
$offset += $token_length;
}
}
$classnames = "\t important\t\f\n\r\nsuper wp-block-find-me \x0B-vertical-tab-preceded ";
// Find a single class name
$class_name = null;
foreach ( classnames_to_class_list( $classnames ) as $class_name_candidate ) {
if ( str_starts_with( $class_name_candidate, 'wp-block' ) ) {
$class_name = $class_name_candidate;
break;
}
}
var_dump( $class_name ); // "wp-block-find-me"
// Get an array. Much better than a simple `explode()`!
$naive = explode( $classnames, ' ' );
var_export( $naive ); // [ ' ' ]
// \S is close, but still wrong. U+000B LINE TABULATION was ignored!
preg_match_all( '/\S+/', $classnames, $naive_regex );
var_export( $naive_regex ); // [ 'important', 'super', 'wp-block-find-me', '-vertical-tab-preceded' ]
$better = iterator_to_array( classnames_to_class_list( $classnames ), false );
var_export( $better ); // [ 'button', 'is-primary', 'js-open', "\x0B-vertical-tab-preceded" ] |
Trac ticket: https://core.trac.wordpress.org/ticket/65379
This PR brings the changes from the following Gutenberg PR to core:
WordPress/gutenberg#78841
Description
Hardens the elements block support render filter against invalid block attribute data.
wp_render_elements_class_name()currently readsattrs.classNameand passes it intopreg_match(). WhileclassNameis expected to be astring, malformed or corrupted stored block data can contain other types (for example, anarray), which can trigger a fatalTypeErrorin PHP 8+.This backport adds a defensive type check so that when
classNameis not a string, the function returns the original block content unchanged instead of attempting regex matching.A regression test is also added to ensure non-string
classNamevalues do not cause fatals and continue to fail gracefully.Testing
• Run the PHPUnit tests for elements block support:
wpRenderElementsSupport.php• Verify the new regression test passes:
test_elements_block_support_class_with_non_string_class_name• Verify existing tests in the same file continue to pass.