feat: GameVariant encapsulation (rebased on main) #19
Merged
pyproject sets addopts='-n auto -m "not slow"'; -p no:xdist disables the xdist plugin but leaves '-n auto' as an unrecognized arg, so the slow-parity stage errored out before running. -o 'addopts=' clears the inherited args; the explicit -m slow + paths run as intended.
Previous '-o addopts=' cleared *all* addopts including '-n auto', so slow parity tests ran on a single core and projected to 10+ hours versus IMPL's ~40 min target. Set addopts='-n auto' explicitly: drops the '-m not slow' filter (so explicit '-m slow' applies) while keeping xdist parallelism.
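The resulting invocation for the slow-parity stage might look roughly like the sketch below. The helper name and the test path are hypothetical; only the pytest flags (`-o` to override an ini value, `-m` for marker selection, and pytest-xdist's `-n auto`) come from the commit messages above.

```python
import shlex


def slow_parity_cmd(paths):
    """Build the pytest argv for the slow-parity stage (hypothetical helper)."""
    return [
        "pytest",
        # Replace the inherited addopts ('-n auto -m "not slow"') with only the
        # xdist flag: parallelism stays, the 'not slow' filter is dropped.
        "-o", "addopts=-n auto",
        # With the inherited filter gone, the explicit marker selects slow tests.
        "-m", "slow",
        *paths,
    ]


cmd = slow_parity_cmd(["tests/parity"])  # placeholder path, not from the repo
print(shlex.join(cmd))
```

Passing `-o addopts=` with an empty value would also drop `-n auto`, which is the single-core regression the second commit describes.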
Bundles CC4 game-rule axes (red_agent, target_weight, op_zone_servers, resilience_roles, num_steps) into a frozen GameVariant dataclass with named presets (CC4_STOCK, CIA_RESILIENCE, CIA_C/I/A). Both backends read from the variant; recipe YAMLs reference variants by name.

- New game_variant + game_variants modules; VARIANTS registry.
- New JaxborgScenarioGenerator: subclass with op-zone server count fixed when op_zone_servers is set.
- New cyborg_env_factory: make_cyborg_env (pure construction) + reset_cyborg_env (explicit reset + role-map inject) + CyborgReset dataclass. No proxy wrappers.
- New jax_env_factory: make_jax_env(variant, ...).
- topology.build_topology / ScenarioEnv / FsmRedCC4Env now thread op_zone_min_servers through to constrain op-zone subnet server count.
- recipe.py: train_variant(recipe) / eval_variant(recipe) resolvers; projections expose TRAIN_VARIANT / EVAL_VARIANT.
- Drop RED_AGENT / RESILIENCE_MODE / RESILIENCE_TARGET_WEIGHT keys and string-matching "needs roles" predicates.
- Migrate all 5 CybORG-side rollout sites and 3 JAX-side files (test_red_selectors, check_red_bias, ippo_jax) to variant API.
- assign_resilience_roles raises ValueError on <3 candidates instead of silently truncating (op_zone_servers=3 makes <3 unreachable in CIA mode anyway).
- Recipe YAMLs migrated; no back-compat.
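As a rough illustration of the shape described above, a frozen dataclass plus a name-keyed registry could be sketched as follows. The five axis names, the preset names, and the 5.0 `target_weight` default come from the commit messages; the `name` field, the field types, every other default, the preset values, and the `get_variant` helper are assumptions for illustration, not the repository's definitions.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class GameVariant:
    """Bundle of game-rule axes (types and most defaults are assumptions)."""
    name: str                               # assumed field, used as registry key
    red_agent: str = "fsm"                  # placeholder identifier
    target_weight: float = 5.0              # default mentioned in the commits
    op_zone_servers: Optional[int] = None   # None: leave the count unconstrained
    resilience_roles: bool = False
    num_steps: int = 500                    # placeholder value


# Preset names are from the PR; the field values here are placeholders.
CC4_STOCK = GameVariant(name="CC4_STOCK")
CIA_RESILIENCE = GameVariant(
    name="CIA_RESILIENCE", red_agent="cia", op_zone_servers=3, resilience_roles=True
)

VARIANTS = {v.name: v for v in (CC4_STOCK, CIA_RESILIENCE)}


def get_variant(name):
    """Look up a preset by name (hypothetical helper), failing loudly on typos."""
    try:
        return VARIANTS[name]
    except KeyError:
        raise ValueError(f"unknown variant {name!r}; known: {sorted(VARIANTS)}") from None


# Frozen instances cannot be mutated; derive a modified copy instead.
heavier = replace(CIA_RESILIENCE, target_weight=10.0)
```

Freezing the dataclass means a recipe cannot change a rule axis on a shared preset by accident; any change goes through `replace` and produces a distinct object.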
assign_resilience_roles_from_const argsorts noise scores with non-candidate hosts pushed to +inf; with <3 op-zone server candidates the inf-tail leaks non-candidate indices into ranks[0..2], silently tagging non-op-zone hosts. The CIA metric reads those tags and would produce a meaningless score.

Validate eagerly at make_jax_env construction instead of masking the symptom inside the JAX function:

- generative: reject variants whose worst-case candidate count (2 * alpha-floor + 2 beta-min) is <3; this catches op_zone_servers=0.
- snapshot: load each topology_path and count active op-zone servers, rejecting any below 3.

Adds count_resilience_candidates(const) for the eager check.
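The inf-tail leak is easy to reproduce in isolation. The sketch below uses NumPy as a stand-in for the JAX code, with made-up noise values and hypothetical function names; it only illustrates the failure mode and the eager check, not the repository's implementation.

```python
import numpy as np


def assign_roles_unchecked(noise, is_candidate):
    """Pick the 3 lowest-scoring hosts, with non-candidates pushed to +inf."""
    scores = np.where(is_candidate, noise, np.inf)
    # A stable sort keeps tied +inf entries in index order, so the example is
    # deterministic; an unstable sort would leak some other non-candidate.
    ranks = np.argsort(scores, kind="stable")
    return ranks[:3]


def count_candidates_from_mask(is_candidate):
    """Hypothetical analogue of an eager candidate count."""
    return int(np.sum(is_candidate))


def assign_roles_checked(noise, is_candidate):
    """Reject the input up front instead of returning leaked indices."""
    n = count_candidates_from_mask(is_candidate)
    if n < 3:
        raise ValueError(f"need at least 3 resilience candidates, got {n}")
    return assign_roles_unchecked(noise, is_candidate)


# Five hosts, only two of which (indices 1 and 3) are op-zone server candidates.
noise = np.array([0.4, 0.1, 0.9, 0.3, 0.7])
is_candidate = np.array([False, True, False, True, False])

leaked = assign_roles_unchecked(noise, is_candidate)
print(leaked)                # third slot is a non-candidate host
print(is_candidate[leaked])  # not all True: a role landed outside the op zone
```

Calling `assign_roles_checked` on the same inputs raises `ValueError`, which is the behavior the eager validation aims for at construction time.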
Two latent issues from PR #11 that GameVariant encapsulation made detectable:

1. CIA selectors hardcoded `_FIXED_CIA_TARGET_WEIGHT=10.0` and ignored `variant.target_weight`, while CybORG-side `CRedAgent.with_weight` honored it (defaulted to 5.0 from the GameVariant default). Drop the constant, plumb the variant's `target_weight` through `_cia_c/i/a`, and set `CIA_C/I/A.target_weight=10.0` so both sides read from one source of truth.
2. CybORG `_CIARedAgent` mirrored only host-bias from JAX, not the action-prob override at FSM_R (root, undiscovered) that shifts mass toward Impact + Degrade. Override `state_transitions_probability` on `_CIARedAgent` to match `_CIA_PROB_MATRIX[FSM_R]` element-wise.

Adds a parity test asserting the JAX `_CIA_PROB_MATRIX[FSM_R]` row and the CybORG `_CIARedAgent.state_transitions_probability['R']` row agree.
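A row-level parity check of this kind could be sketched as below. Everything here is a stand-in: the matrix shape, the `FSM_R` index, and the probability values are placeholders, and `check_row_parity` is a hypothetical helper rather than the test added in this commit.

```python
import numpy as np

FSM_R = 6  # placeholder state index
# Placeholder action probabilities for the 'R' state (sum to 1).
row = [0.0, 0.0, 0.0, 0.0, 0.25, 0.25, 0.0, 0.25, 0.25]

# Stand-in for the JAX-side matrix: one row per FSM state.
jax_matrix = np.zeros((8, len(row)))
jax_matrix[FSM_R] = row

# Stand-in for the CybORG-side mapping: state letter -> probability list.
cyborg_probs = {"R": list(row)}


def check_row_parity(jax_row, cyborg_row, atol=1e-6):
    """Assert two probability rows agree element-wise and are normalized."""
    jax_row = np.asarray(jax_row, dtype=float)
    cyborg_row = np.asarray(cyborg_row, dtype=float)
    if jax_row.shape != cyborg_row.shape:
        raise AssertionError(f"shape mismatch: {jax_row.shape} vs {cyborg_row.shape}")
    np.testing.assert_allclose(jax_row, cyborg_row, rtol=0, atol=atol)
    np.testing.assert_allclose(jax_row.sum(), 1.0, rtol=0, atol=atol)


check_row_parity(jax_matrix[FSM_R], cyborg_probs["R"])
```

Comparing element-wise rather than only comparing sampled behavior makes a one-sided edit to either backend fail immediately.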
The L4 cross-backend equivalence stage was previously running vanilla CC4 on both sides regardless of `--recipe`, so any resilience-mode "drift" was actually policy-OOD (resilience-trained policy on stock-CC4 eval).

Add a single `resolve_eval_variant(recipe_name=..., checkpoint=..., default=...)` helper in `jaxborg.recipe` that resolves the variant via explicit recipe → checkpoint sidecar → default. Thread it through:

- Parity harness: `transfer_cli`, `jax_rollout`, `cyborg_rollout`, `cyborg_bridge`, `diagnostics`. JAX side switches from direct `FsmRedCC4Env(...)` to `make_jax_env(variant)`. CybORG bridge becomes a thin delegator to `evaluation.cyborg_env_factory.make_cyborg_env`.
- Eval entry points: `baselines_cyborg`, `baselines_jax`, `benchmark_jax`, `export_trajectory`, `generate_cynex_trajectories`. Each now accepts `--recipe` (and `--checkpoint` where relevant); benchmark stays variant-agnostic since steady-state throughput doesn't depend on the selector.
- Resilience role injection: harness rollouts call `inject_role_map` after `env.reset()` when `variant.resilience_roles` is set, so the CybORG-side `_CIARedAgent`/`ResilienceRedAgent` instances actually see the per-episode role map. Eval scripts go through `reset_cyborg_env`, which handles this automatically.
- `make_cyborg_env(wrapper_class=None)` returns the raw `CybORG` instance (used by `export_trajectory.make_env` which drives `parallel_step` directly).
- `test_make_cyborg_env_accepts_distinct_seeds` now greps both the bridge and the underlying factory, since the bridge delegates.
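The precedence chain (explicit recipe, then checkpoint sidecar, then default) can be sketched as follows. This is not the `jaxborg.recipe` implementation: the sidecar file name and JSON layout, the `recipes` mapping, the recipe name, and the choice to return variant names as strings are all assumptions made for illustration.

```python
import json
import tempfile
from pathlib import Path


def sidecar_path(checkpoint):
    """Hypothetical convention: '<checkpoint name>.variant.json' next to it."""
    ckpt = Path(checkpoint)
    return ckpt.with_name(ckpt.name + ".variant.json")


def resolve_eval_variant_sketch(*, recipe_name=None, checkpoint=None,
                                default="CC4_STOCK", recipes=None):
    """Return a variant name via: explicit recipe -> sidecar -> default."""
    # 1. An explicitly named recipe always wins.
    if recipe_name is not None:
        return (recipes or {})[recipe_name]["eval_variant"]
    # 2. Otherwise use the variant recorded beside the checkpoint, if any.
    if checkpoint is not None:
        sidecar = sidecar_path(checkpoint)
        if sidecar.is_file():
            return json.loads(sidecar.read_text())["eval_variant"]
    # 3. Fall back to the caller-supplied default.
    return default


RECIPES = {"cia_c": {"eval_variant": "CIA_C"}}  # placeholder recipe table

with tempfile.TemporaryDirectory() as tmp:
    ckpt = Path(tmp) / "policy.ckpt"
    ckpt.touch()
    sidecar_path(ckpt).write_text(json.dumps({"eval_variant": "CIA_RESILIENCE"}))

    from_recipe = resolve_eval_variant_sketch(
        recipe_name="cia_c", checkpoint=ckpt, recipes=RECIPES
    )
    from_sidecar = resolve_eval_variant_sketch(checkpoint=ckpt)

from_default = resolve_eval_variant_sketch()
print(from_recipe, from_sidecar, from_default)
```

Routing every entry point through one resolver is what keeps the parity harness and the eval scripts from silently evaluating on different variants.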
Summary
Re-lands the GameVariant encapsulation work onto `main`. The original PR #17 was stacked on `pr-11-refactor` and got merged into that branch instead of `main`, so the work isn't reachable from `main`. This PR cherry-picks the 6 unique commits onto a fresh branch off `origin/main`.
Commits (cherry-picked from `origin/pr-11-refactor`)
What's in here
Test plan
Cross-backend simulator drift (independent of this PR)
The L4 cross-backend equivalence stage now actually exercises the variant (previously the harness ran vanilla CC4 regardless of `--recipe`, so apparent drift was policy-OOD). Separately, there is a pre-existing JAX↔CybORG simulator gap on stock CC4 (~−275 vs the matched-v2 +5.4 ± 58 baseline). It was introduced somewhere in the ~10 FSM/red/selector commits between the `2026-04-25` matched-v2 baseline and the start of this stack, and needs an independent bisect.