
Bootstrap: AGENTS_API_LOADED all-or-nothing guard breaks under version skew (two coexisting copies) #279

@chubes4

Description

Summary

The substrate's agents-api.php bootstrap uses a single coarse AGENTS_API_LOADED define to short-circuit its entire class-loading block:

// agents-api.php
if ( defined( 'AGENTS_API_LOADED' ) ) {
    return;
}
define( 'AGENTS_API_LOADED', true );
define( 'AGENTS_API_PATH', __DIR__ . '/' );
define( 'AGENTS_API_PLUGIN_FILE', __FILE__ );

require_once AGENTS_API_PATH . 'src/Registry/class-wp-agent-runtime-overrides.php';
require_once AGENTS_API_PATH . 'src/Packages/class-wp-agent-package-artifact.php';
// ... ~80 more require_once for the full class set
require_once AGENTS_API_PATH . 'src/Packages/class-wp-agent-package-artifact-hasher.php';
// ...

That guard correctly prevents a double-define when a single copy is included twice. But it leads to a fatal error when two different versions of the substrate coexist in one PHP process, which is a very common shape once agents-api is both:

  1. vendored into a consumer via Composer (automattic/agents-api: dev-main, installed at vendor/automattic/agents-api/), and
  2. separately network-activated as a standalone plugin at a different (older) version.

Whichever copy's agents-api.php is included first defines AGENTS_API_LOADED + AGENTS_API_PATH and loads its class set. The second copy hits if ( defined( 'AGENTS_API_LOADED' ) ) return; and bails before requiring any of its class files — even classes the first (older) copy never shipped. Class loading is all-or-nothing per-constant, not idempotent per-class, so the in-memory class set is frozen to whichever copy won the include race, regardless of which is newer.

Real-world failure this caused

On a WordPress multisite where data-machine vendors agents-api (dev-main, newer) and an older standalone agents-api plugin (v0.1.0) was network-activated:

  • wp_get_active_network_plugins() returned the standalone plugin first → it defined AGENTS_API_LOADED and AGENTS_API_PATH pointing at its own older tree.
  • The older copy's require list did not include class-wp-agent-package-artifact-hasher.php (the file didn't even ship in that version).
  • When the consumer's vendor/autoload.php later ran, Composer's autoload_files entry required the newer vendored agents-api.php, which immediately returned on the constant guard — never loading WP_Agent_Package_Artifact_Hasher (or ~30 other newer Packages/Tools/Approvals/Transcripts classes).
  • Net effect: a hard PHP fatal Class "WP_Agent_Package_Artifact_Hasher" not found on any code path that touched a class only present in the newer set (in this case, agent bundle export — every profile dead).

Runtime proof at the time:

AGENTS_API_PATH=.../plugins/agents-api/          <-- standalone (old) won the race
hasher_class_exists=NO
composer_autoload_files_has_agentsapi_hash=YES   <-- Composer "included" the vendored copy, but it early-returned
constant_defined=YES
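
A report of this shape can be collected with a small diagnostic, run inside the affected site (for example via wp eval-file). This is a minimal sketch, not the script used at the time: the function name agents_api_skew_report() is hypothetical, and it detects the vendored bootstrap through get_included_files() rather than through Composer's internal autoload_files hash.

```php
<?php
// Hypothetical diagnostic helper (not part of agents-api). It reports which
// copy won the include race and whether a newer-only class is missing.
function agents_api_skew_report( string $probe_class, string $vendored_suffix ): array {
    $vendored_included = false;
    foreach ( get_included_files() as $file ) {
        // Normalize Windows separators before comparing path suffixes.
        $file = str_replace( '\\', '/', $file );
        if ( substr( $file, -strlen( $vendored_suffix ) ) === $vendored_suffix ) {
            $vendored_included = true;
            break;
        }
    }

    return array(
        'AGENTS_API_PATH'             => defined( 'AGENTS_API_PATH' ) ? constant( 'AGENTS_API_PATH' ) : '(undefined)',
        // false = do not trigger autoloaders; report only what is declared now.
        'probe_class_exists'          => class_exists( $probe_class, false ) ? 'YES' : 'NO',
        'vendored_bootstrap_included' => $vendored_included ? 'YES' : 'NO',
        'constant_defined'            => defined( 'AGENTS_API_LOADED' ) ? 'YES' : 'NO',
    );
}

// The vendored path is assumed from the Composer package name.
$report = agents_api_skew_report(
    'WP_Agent_Package_Artifact_Hasher',
    'vendor/automattic/agents-api/agents-api.php'
);
foreach ( $report as $key => $value ) {
    echo $key . '=' . $value . PHP_EOL;
}
```

The skew signature is the combination constant_defined=YES, vendored_bootstrap_included=YES and probe_class_exists=NO: the newer bootstrap was included but contributed nothing.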

Downstream report with full trace: Extra-Chill/data-machine#2477.

We worked around it operationally by deleting the older standalone copy so only the vendored one loads. But the substrate shouldn't be silently corruptible by version skew — the next vendoring + activation drift reintroduces it.

Why this is a substrate-layer bug (not just an ops mistake)

Vendoring a WP-shaped library via Composer while the same library is also a standalone plugin is a normal, expected deployment shape. The substrate is the only layer that can make class loading robust against it. Consumers can't reliably "win" the race (network plugin order is not under their control), and asking every operator to guarantee single-copy + newest-wins is fragile.

Proposed fix direction

Make class loading idempotent per-class and decouple it from the one-time-side-effects guard:

  1. Separate concerns. Keep AGENTS_API_LOADED for one-time side-effect registration (hooks, ability registration, etc.), but do not gate the require_once class-file list behind it. Class files are already idempotent via require_once per path — the problem is only the early return.
  2. Top-up loading. On include, require the full class-file manifest regardless of whether another copy already defined the constant, so a newer copy can load classes an older copy is missing. require_once is path-keyed, so if the same copy is included twice nothing is re-run; if a different copy is included, its (possibly newer) files load and fill the gaps.
    • Guard define() calls individually (defined() || define()). Class declarations need their own guard: require_once deduplicates by file path only, so two copies declaring the same class name from different paths would trigger a "Cannot declare class" fatal. Wrap each require_once target in a class_exists( 'WP_Agent_...', false ) check before requiring.
  3. Optional: version awareness. Expose the loaded substrate version (e.g. AGENTS_API_VERSION) and, when a second copy is included, prefer the newer file set or at least emit a clear warning on skew instead of silently freezing the older set.
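
Points 1 and 2 could look roughly like the following. This is a sketch under assumptions, not the substrate's actual bootstrap: agents_api_top_up_classes() is a hypothetical helper, the manifest is abbreviated to two entries, and the class name for class-wp-agent-package-artifact.php is inferred from the file naming convention.

```php
<?php
/**
 * Hypothetical helper: require each manifest file only if its class is
 * still undeclared, so a second copy fills gaps instead of bailing out.
 *
 * @param string $base_path Directory of the copy being included, with trailing slash.
 * @param array  $manifest  Map of class name => path relative to $base_path.
 * @return string[] Class names this copy contributed.
 */
function agents_api_top_up_classes( string $base_path, array $manifest ): array {
    $contributed = array();
    foreach ( $manifest as $class => $relative_path ) {
        // false = never trigger autoloaders; only ask what is declared already.
        if ( class_exists( $class, false ) || interface_exists( $class, false ) || trait_exists( $class, false ) ) {
            continue;
        }
        $file = $base_path . $relative_path;
        if ( is_readable( $file ) ) {
            require_once $file;
            $contributed[] = $class;
        }
    }
    return $contributed;
}

// Class loading runs on every include, before any early exit.
agents_api_top_up_classes(
    __DIR__ . '/',
    array(
        // Class names assumed from file names; the real manifest is much longer.
        'WP_Agent_Package_Artifact'        => 'src/Packages/class-wp-agent-package-artifact.php',
        'WP_Agent_Package_Artifact_Hasher' => 'src/Packages/class-wp-agent-package-artifact-hasher.php',
    )
);

// One-time side effects stay behind the constant.
if ( ! defined( 'AGENTS_API_LOADED' ) ) {
    define( 'AGENTS_API_LOADED', true );
    defined( 'AGENTS_API_PATH' ) || define( 'AGENTS_API_PATH', __DIR__ . '/' );
    // Hook and ability registration would go here, exactly once.
}
```

One caveat with this shape: the resulting class set can mix files from two versions (an older copy of one class alongside a newer copy of another), which is part of why version awareness is still worth having.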

The key invariant: if two copies of the substrate are present, the process should end up with the union (newest) class set, or fail loudly with an actionable message — never silently load an incomplete older set and fatal later on a missing class.
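
For the "fail loudly" half of that invariant, a skew check might look like this. It is a sketch only: agents_api_detect_skew() and the version string 0.2.0 are hypothetical, and AGENTS_API_VERSION is the constant suggested above, not one the substrate is known to define today.

```php
<?php
/**
 * Hypothetical skew detector. The first copy records its version; a later,
 * newer copy gets back an actionable message instead of silently losing.
 *
 * @return string|null Warning text when the loaded copy is older, else null.
 */
function agents_api_detect_skew( string $incoming_version, string $incoming_path ): ?string {
    if ( ! defined( 'AGENTS_API_VERSION' ) ) {
        define( 'AGENTS_API_VERSION', $incoming_version );
        defined( 'AGENTS_API_PATH' ) || define( 'AGENTS_API_PATH', $incoming_path );
        return null;
    }

    $loaded_version = constant( 'AGENTS_API_VERSION' );
    if ( version_compare( $incoming_version, $loaded_version, '>' ) ) {
        return sprintf(
            'agents-api version skew: %s loaded from %s, but newer %s found at %s. Remove or update the older copy.',
            $loaded_version,
            constant( 'AGENTS_API_PATH' ),
            $incoming_version,
            $incoming_path
        );
    }
    return null;
}

// In each copy's bootstrap (the version string here is illustrative):
$skew = agents_api_detect_skew( '0.2.0', __DIR__ . '/' );
if ( null !== $skew ) {
    trigger_error( $skew, E_USER_WARNING );
}
```

This only works if every copy carries a comparable version string. A Composer branch alias such as dev-main does not order meaningfully under version_compare(), so the bootstrap would need a real version number baked in.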

Repro shape (minimal)

  1. Place substrate copy A (older, missing class-wp-agent-package-artifact-hasher.php from its require list) so it loads first and defines AGENTS_API_LOADED.
  2. Place substrate copy B (newer, requires the hasher) so its agents-api.php is included second (e.g. via a consumer's Composer autoload_files).
  3. Call any code that references WP_Agent_Package_Artifact_Hasher → fatal Class not found, despite copy B shipping it.
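
The steps above can be reproduced without WordPress by generating two throwaway copies in a temp directory. This is an illustrative script: the directory names a and b are hypothetical, and each generated agents-api.php is reduced to the guard quoted in the Summary plus, for copy B, a single class file.

```php
<?php
// Standalone repro: copy A (older) wins the include race, copy B is ignored.
$root = realpath( sys_get_temp_dir() ) . '/agents-api-skew-' . uniqid();
mkdir( $root . '/a', 0777, true );
mkdir( $root . '/b', 0777, true );

// The all-or-nothing guard, as quoted from agents-api.php.
$guard = <<<'PHP'
<?php
if ( defined( 'AGENTS_API_LOADED' ) ) {
    return;
}
define( 'AGENTS_API_LOADED', true );
define( 'AGENTS_API_PATH', __DIR__ . '/' );

PHP;

// Copy A: older, its require list has no hasher.
file_put_contents( $root . '/a/agents-api.php', $guard );

// Copy B: newer, ships and requires the hasher.
file_put_contents(
    $root . '/b/class-wp-agent-package-artifact-hasher.php',
    "<?php\nclass WP_Agent_Package_Artifact_Hasher {}\n"
);
file_put_contents(
    $root . '/b/agents-api.php',
    $guard . "require_once __DIR__ . '/class-wp-agent-package-artifact-hasher.php';\n"
);

require_once $root . '/a/agents-api.php'; // Step 1: A loads first.
require_once $root . '/b/agents-api.php'; // Step 2: B returns on the guard.

// Step 3: the class B ships is not declared.
echo 'winner=' . basename( AGENTS_API_PATH ) . PHP_EOL;   // winner=a
var_dump( class_exists( 'WP_Agent_Package_Artifact_Hasher', false ) ); // bool(false)
```

Instantiating WP_Agent_Package_Artifact_Hasher at this point would produce the "Class not found" fatal from the report.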
