Skip to content

Performance: avoid over-allocation in wp_is_numeric_array()#12100

Open
dmsnell wants to merge 4 commits into
WordPress:trunkfrom
dmsnell:perf/avoid-over-allocating-numeric-array-keys
Open

Performance: avoid over-allocation in wp_is_numeric_array()#12100
dmsnell wants to merge 4 commits into
WordPress:trunkfrom
dmsnell:perf/avoid-over-allocating-numeric-array-keys

Conversation

@dmsnell

@dmsnell dmsnell commented Jun 5, 2026

Copy link
Copy Markdown
Member

Trac ticket: Core-65467

When a trace of allocations revealed that wp_is_numeric_array() accounted for a significant fraction of the allocations in a page render, it was observed that the function eagerly allocates and copies array keys and then filters them when all it wants to know is whether a single key in the array meets a condition.

In this patch the array_filter( array_keys() ) invocation is replaced with early-aborting iteration to avoid the memory allocation and copying.

This patch prepared during WCEU 2026 Contributor Day.

Follow-up to [34927].
Follow-up to 374b39d.

@github-actions

github-actions Bot commented Jun 5, 2026

Copy link
Copy Markdown

Test using WordPress Playground

The changes in this pull request can previewed and tested using a WordPress Playground instance.

WordPress Playground is an experimental project that creates a full WordPress instance entirely within the browser.

Some things to be aware of

  • All changes will be lost when closing a tab with a Playground instance.
  • All changes will be lost when refreshing the page.
  • A fresh instance is created each time the link below is clicked.
  • Every time this pull request is updated, a new ZIP file containing all changes is created. If changes are not reflected in the Playground instance,
    it's possible that the most recent build failed, or has not completed. Check the list of workflow runs to be sure.

For more details about these limitations and more, check out the Limitations page in the WordPress Playground documentation.

Test this pull request with WordPress Playground.

@yusufhay

yusufhay commented Jun 8, 2026

Copy link
Copy Markdown

Tested the patch locally against current origin/trunk in a disposable worktree.

Results:

  • php -l src/wp-includes/functions.php passed.
  • git diff --check passed.
  • vendor/bin/phpunit --filter Tests_Functions_WpIsNumericArray passed: 8 tests, 8 assertions.

The existing focused coverage for wp_is_numeric_array() passes with the allocation-friendly implementation.

When a trace of allocations revealed that `wp_is_numeric_array()`
accounted for a significant fraction of the allocations in a page
render, it was observed that the function eagerly allocates and copies
array keys and then filters them when all it wants to know is whether a
single key in the array meets a condition.

In this patch the `array_filter( array_keys() )` invocation is replaced
with early-aborting iteration to avoid the memory allocation and
copying.

This patch prepared during WCEU 2026 Contributor Day.

Follow-up to [34927].
Follow-up to 374b39d.
@dmsnell dmsnell force-pushed the perf/avoid-over-allocating-numeric-array-keys branch from d0a5456 to b8d0fdc Compare June 15, 2026 23:42
@dmsnell dmsnell marked this pull request as ready for review June 15, 2026 23:43
@github-actions

github-actions Bot commented Jun 15, 2026

Copy link
Copy Markdown

The following accounts have interacted with this PR and/or linked issues. I will continue to update these lists as activity occurs. You can also manually ask me to refresh this list by adding the props-bot label.

Core Committers: Use this line as a base for the props when committing in SVN:

Props dmsnell, westonruter, yusufmudagal.

To understand the WordPress project's expectations around crediting contributors, please review the Contributor Attribution page in the Core Handbook.

@dmsnell

dmsnell commented Jun 15, 2026

Copy link
Copy Markdown
Member Author

thanks @yusufhay — just a side note, if may not be worth your time checking those things since GitHub Actions already does in the CI jobs.


@westonruter for your perusal, I don’t expect this to amount to much, but it seemed like an obvious refactor to avoid the extra allocation and computation which is immediately disregarded.

on the other hand, with hundreds of recursive calls handling theme.json it might measure in the sub-millisecond range of impact.

one final question: if we assume that wp_is_numeric_array() was supposed to be an implementation of array_is_list() in that it indicates if there’s an associative array vs. a purely numeric array then we could substitute in array_is_list() when it’s available and that should be a practically constant lookup internal to PHP.

my assumption is that this was the goal of the original function but the difference is in sparse arrays. if an array doesn’t start with key 0 and have monotonically-increasing integer keys, it’s not a list. happy to follow-up with that change and label it a potentially-breaking backwards-compatibility change.

$a = array();
$a[1] = true;
$a[5] = false;
false === array_is_list( $a );
true  === wp_is_numeric_array( $a );

@westonruter

Copy link
Copy Markdown
Member

@dmsnell This is great. A couple suggestions.

I too have run across this function specifically when reviewing this method:

private function is_associative_array( $value ): bool {
return is_array( $value ) && ! wp_is_numeric_array( $value );
}

I also wondered if it was originally intended to be an implementation of array_is_list(). However, it seems like maybe this is not a safe assumption to make given that it is easy in PHP to turn a list into a numerically-indexed array when using array_filter() in a WP filter, for example. Often I've had to resort to passing arrays through array_values() to turn them into true lists again.

Maybe we should update the docblock to encourage devs to use array_is_list() instead. We can then also create a ticket to audit the uses of wp_is_numeric_array() to consider replacing with array_is_list().

Also related is render_block_core_block() which contains:

$is_assoc_array = is_array( $content_data['values'] ) && ! wp_is_numeric_array( $content_data['values'] );

Also related:

function rest_is_array( $maybe_array ) {
if ( is_scalar( $maybe_array ) ) {
$maybe_array = wp_parse_list( $maybe_array );
}
return wp_is_numeric_array( $maybe_array );
}

Co-Authored-By: Weston Ruter <westonruter@git.wordpress.org>
@dmsnell

dmsnell commented Jun 16, 2026

Copy link
Copy Markdown
Member Author

it is easy in PHP to turn a list into a numerically-indexed array when using array_filter() in a WP filter, for example. Often I've had to resort to passing arrays through array_values() to turn them into true lists again.

this is a really good point @westonruter, one which I think I’ve run into before and forgot.

encourage devs to use array_is_list() instead

I left a note showing the differences. Now I’m a bit worried that if we try and push people towards array_is_list() we might cause them to overlook the nuance.

Perhaps in a separate PR we could examine returning this, but it’s more work than I want to devote to this right now…

function wp_is_numeric_array( $data ) {
	if ( ! is_array( $data ) ) {
		return false;
	}

	if ( array_is_list( $data ) ) {
		return true;
	}

	foreach ( $data as $key => $value ) {
		if ( is_string( $key ) ) {
			return false;
		}
	}

	return true;
}

the premise being that the internal speedup for most arrays would outweigh the additional calling cost.

thanks for the instant review!

Comment thread src/wp-includes/functions.php Outdated
* @since 4.4.0
*
* @param mixed $data Variable to check.
* @return bool Whether the variable is a list.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Adding @phpstan-assert-if-true would be a nice addition here:

Suggested change
* @return bool Whether the variable is a list.
*
* @phpstan-assert-if-true array<int, mixed> $data

Static analysis with PHPStan would then satisfied in cases like this when attempting to iterate over $data:

if ( wp_is_numeric_array( $data ) ) {
$new_data = array();
foreach ( $data as $item ) {

Otherwise, PHPStan here is typing $data as mixed:

phpstan: Argument of an invalid type mixed supplied for foreach, only iterables are supported.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Otherwise, if this were just an implementation of array_is_list() then it could do:

@phpstan-assert-if-true list<mixed> $data

But it seems perhaps unsafe to assume that every case of wp_is_numeric_array() being used to today intends array_is_list().

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@phpstan-assert-if-true array<int, mixed> $data

I have not seen this, but I guess it’s tantamount to guard functions in TypeScript? seeing as how it’s already in place elsewhere I’ll add it, though I am curious about whether you have concern about basing the types on the trust of docblock comments vs. inferences.

In TypeScript projects I’ve been skeptical of this due to the loss of constraints and soundness. It does provide good autocomplete, and a function like this should be relatively easy to ensure the constraints…

it seems perhaps unsafe to assume that every case of wp_is_numeric_array() being used to today intends array_is_list().

yeah I think so. I didn’t have confidence in my assumption about it, but I knew that changing the behavior would invite breakage. 🙃

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have not seen this, but I guess it’s tantamount to guard functions in TypeScript? seeing as how it’s already in place elsewhere I’ll add it, though I am curious about whether you have concern about basing the types on the trust of docblock comments vs. inferences.

It's a good point. I'm not sure why PHPStan doesn't just infer what will be returned by looking at the executed code. I inquired of Claude:

Good question — and the short answer is that PHPStan deliberately doesn't do this for user-defined functions, regardless of how analyzable the body is.

PHPStan treats function calls as black boxes

When you call wp_is_numeric_array( $x ), PHPStan only knows the declared return type: bool. It does not re-analyze the function body at each call site to figure out what true vs. false implies about the arguments. So at a call site like:

if ( wp_is_numeric_array( $x ) ) {
    // Without the annotation, $x is still `mixed` here.
}

…there is no narrowing at all unless you tell PHPStan the relationship. The @phpstan-assert-if-true annotation is precisely the mechanism for declaring "if this returns true, the argument is array<int, mixed>."

This is by design. Back-propagating "the return was true, therefore argument has shape X" from arbitrary control flow is, in general, undecidable and would be prohibitively expensive to do for every function at every call site. PHPStan only does this kind of inference for:

  1. Hardcoded built-insis_array(), is_string(), array_is_list() (which narrows to list<...>), etc. These are special-cased in PHPStan's engine. array_is_list is exactly analogous to what we want here, but it gets narrowing only because PHPStan ships explicit knowledge of it. A WordPress function it's never heard of gets nothing.
  2. Functions with explicit @phpstan-assert* annotations — the manual opt-in.
  3. Custom TypeSpecifyingExtensions — a PHP class you register that programmatically tells PHPStan how to narrow. That's the heavier alternative to the annotation.

Why the body wouldn't be "easy" to invert anyway

Even setting aside the design decision, the narrowing here isn't a simple matter of reading the body. The true result comes from the fall-through path after a loop:

foreach ( $data as $key => $value ) {
    if ( is_string( $key ) ) {
        return false;
    }
}
return true;

To conclude array<int, mixed>, an analyzer would have to reason: "the loop visited every key, none satisfied is_string, array keys are only ever int|string, therefore every key is int." That's universal quantification over a loop combined with exhaustiveness reasoning — exactly the sort of thing static analyzers don't attempt when deriving implications from return values.

So the annotation isn't compensating for a PHPStan weakness so much as supplying knowledge PHPStan never tries to derive on its own for custom functions — the same way @param/@return exist because it can't always infer those either.

One thing worth noting: the annotation is a promise PHPStan trusts but lightly verifies. With @phpstan-assert*, PHPStan does check that the asserted type is compatible with what the body could return, but it won't catch a subtly-wrong assertion. So the correctness of array<int, mixed> here rests on the human reasoning above, not on PHPStan having proven it.

@westonruter

Copy link
Copy Markdown
Member

(Sorry, I somehow didn't submit those line review comments when I submitted my other comment.)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants