TheWizardsCode · SorraTheOrc · Dec 2, 2025
diff --git a/.pm/tracker.md b/.pm/tracker.md
@@ -1,11 +1,17 @@
 # Project Task Tracker
 
-**Last Updated:** 2025-12-02T06:23:51Z
+**Last Updated:** 2025-12-02T18:55:00Z
 
 ## Status Summary
 
 **Recent Progress (since last update):**
 
+- 🎉 **Task 8.4.1 (Content Pipeline Tooling & CI) COMPLETED** - GitHub Issue [#23](https://github.com/TheWizardsCode/GEngine/issues/23)
+  - Content build script (`scripts/build_content.py`) validates worlds, configs, and sweeps
+  - CI workflow (`.github/workflows/content-validation.yml`) runs on content file changes
+  - Designer workflow documented in `docs/gengine/content_designer_workflow.md`
+  - 17 tests covering all validation paths, all passing
+  - Clear error messages with entity reference validation
 - 🎉 **Task 10.1.2 (Strengthen AgentSystem Tests) COMPLETED**
   - Refactored `AgentSystem` to extract scoring logic for testability.
   - Added unit tests verifying trait influence (empathy, cunning, resolve) on decision scoring.
@@ -92,9 +98,9 @@
 
 **Current Priorities:**
 
-1. 🚀 **Phase 8 Deployment** - Core complete (8.1.1, 8.2.1, 8.3.1), need CI automation (8.3.2) and content pipeline (8.4.1)
+1. 🚀 **Phase 8 Deployment** - Core complete (8.1.1, 8.2.1, 8.3.1, 8.4.1), need CI automation (8.3.2) to finish
 2. 🤖 **Phase 9 AI Testing** - Observer (9.1.1) and action layer (9.2.1) complete, LLM-enhanced (9.3.1) ready to start
-3. 🔧 **CI/CD Gap** - No automated workflows exist; high risk of regressions
+3. 🔧 **CI/CD Gap** - K8s validation workflow (8.3.2) still needed for deployment protection
 
 **Recommended Next 3 Parallel Tasks:**
 
@@ -105,37 +111,30 @@
    - Impact: Protects all environments from manifest errors
    - Estimated time: 1-2 days
 
-2. **8.3.3 - K8s Resource Tuning** (Priority: MEDIUM, Effort: Low)
-   - Why: Complete 8.3.1 resource sizing acceptance criteria
-   - Owner needed: DevOps/SRE-focused agent
-   - Parallelizable: Configuration work, independent of code
-   - Impact: Prevents resource exhaustion in production
-   - Estimated time: 4-6 hours
-   - Prerequisites: Smoke test data from 8.3.1
-
-3. **9.3.1 - LLM-Enhanced AI Decisions** (Priority: MEDIUM, Effort: High)
+2. **9.3.1 - LLM-Enhanced AI Decisions** (Priority: MEDIUM, Effort: High)
    - Why: Builds on completed AI foundation (9.1.1, 9.2.1)
    - Owner needed: AI/ML-focused agent with LLM experience
    - Parallelizable: AI/ML work, independent of infrastructure
    - Impact: Enables advanced AI testing capabilities
    - Estimated time: 3-5 days
 
-**Alternative (if no AI/ML owner available):**
-- **8.4.1 - Content Pipeline Tooling** instead of 9.3.1
-  - Priority: MEDIUM, Effort: Medium
-  - Unblocks content designers
-  - Estimated time: 2-3 days
+3. **10.1.3 - Expand SimEngine API Tests** (Priority: HIGH, Effort: Medium)
+   - Why: Improve core system test coverage
+   - Owner needed: Test-focused agent
+   - Parallelizable: Test work, independent of infrastructure
+   - Impact: Better regression detection for core engine
+   - Estimated time: 2-3 days
 
 **Key Risks:**
 
 - 🔴 **K8s CI validation missing** - Bad manifests can break deployment (8.3.2) - HIGH IMPACT
-- ⚠️ **Phase 8 content pipeline needs ownership** - Task 8.4.1 requires assignment
 - ⚠️ **Phase 9 LLM enhancement ready** - Rule-based AI complete, LLM-enhanced (9.3.1) unblocked but needs owner
+- ✅ **Phase 8 content pipeline complete** - Task 8.4.1 finished with build script, CI workflow, and documentation (2025-12-02)
 - ✅ **Phase 8 observability complete** - Task 8.3.1 Prometheus annotations and smoke tests added (2025-12-01)
 - ✅ **Phase 7 delivery risk eliminated** - All core player features complete and tested, per-agent modifiers enabled by default
 - ✅ **Containerization complete** - Docker/Compose and K8s manifests tested and documented
 - ✅ **AI player foundation complete** - Observer and action layer shipped with 112 tests
-- ✅ **Clean repository state** - Issues #21, #24, #25 closed (verified 2025-12-01)
+- ✅ **Clean repository state** - Issues #21, #23, #24, #25 closed (verified 2025-12-02)
 
 |    ID | Task                                            | Status      | Priority | Responsible        | Updated    |
 | ----: | ----------------------------------------------- | ----------- | -------- | ------------------ | ---------- |
@@ -168,7 +167,7 @@
 | 8.3.3 | K8s Resource Sizing & Tuning (M8.3.y)           | completed   | Medium   | devops-agent       | 2025-12-02 |
 | 8.3.3 | Gateway/LLM Prometheus Metrics (M8.3.x)         | not-started | Medium   | TBD (ask Ross)     | 2025-12-01 |
 | 8.3.4 | Integrate K8s Smoke Test into CI (M8.3.x)       | not-started | Medium   | TBD (ask Ross)     | 2025-12-01 |
-| 8.4.1 | Content pipeline tooling & CI (M8.4)            | not-started | Medium   | TBD (ask Ross)     | 2025-11-30 |
+| 8.4.1 | Content pipeline tooling & CI (M8.4)            | completed   | Medium   | devops-agent       | 2025-12-02 |
 | 9.1.1 | AI Observer foundation acceptance (M9.1)        | completed   | Medium   | gamedev-agent      | 2025-11-30 |
 | 9.2.1 | Rule-based AI action layer (M9.2)               | completed   | Medium   | gamedev-agent      | 2025-12-01 |
 | 9.3.1 | LLM-enhanced AI decisions (M9.3)                | not-started | Medium   | TBD (ask Ross)     | 2025-11-30 |
@@ -717,16 +716,44 @@
 ### 8.4.1 — Content Pipeline Tooling & CI (M8.4)
 - **GitHub Issue:** [#23](https://github.com/TheWizardsCode/GEngine/issues/23)
 - **Description:** Implement content build tooling (`scripts/build_content.py`), CI validation hooks, and documentation so designers can author/test YAML and story seeds efficiently.
-- **Acceptance Criteria:** Content build step produces artifacts consumed by simulation; CI validates content on change; designer workflow documented.
+- **Acceptance Criteria:**
+  - ✅ Content build step produces artifacts consumed by simulation
+  - ✅ CI validates content on change (schema, references, integrity)
+  - ✅ Designer workflow documented
+  - ✅ Clear error messages for content validation failures
 - **Priority:** Medium
-- **Responsible:** TBD (ask Ross)
-- **Dependencies:** Stable content schema and directory structure.
+- **Responsible:** devops-agent
+- **Status:** ✅ COMPLETED
+- **Dependencies:** Stable content schema and directory structure (✅ complete).
 - **Risks & Mitigations:**
   - Risk: Pipeline friction slows content iteration. Mitigation: Optimize for designer ergonomics, provide quick local commands.
-- **Next Steps:**
-  1. Implement build script.
-  2. Wire into CI.
-  3. Document designer workflow.
+- **Completion Notes:**
+  - **Build Script** (`scripts/build_content.py`):
+    - Validates world definitions (`world.yml` and `story_seeds.yml`) with entity reference checking
+    - Validates simulation configuration (`simulation.yml`) against Pydantic schema
+    - Validates difficulty sweep configurations (`content/config/sweeps/*/`)
+    - Outputs JSON manifest with validation results and file lists
+    - Clear error messages with icons (❌/✓) and bullet-point formatting
+    - Exit codes: 0 (success), 1 (validation errors), 2 (file/config errors)
+  - **CI Workflow** (`.github/workflows/content-validation.yml`):
+    - Triggers on push to main and PRs that modify content files
+    - Monitors: `content/**/*.yml`, `content/**/*.yaml`, `scripts/build_content.py`, `.github/workflows/content-*.yml`
+    - Runs validation via `uv run python scripts/build_content.py --verbose --output content-manifest.json`
+    - Uploads content manifest artifact for debugging
+    - Blocks PR merge on validation failures
+  - **Designer Documentation** (`docs/gengine/content_designer_workflow.md`):
+    - Content types and structure (worlds, configs, sweeps)
+    - YAML schema examples with annotations
+    - Local validation instructions with exit codes
+    - CI/CD validation details and artifact retrieval
+    - Troubleshooting section with common validation errors
+    - Best practices for content authors
+  - **Test Coverage** (`tests/scripts/test_build_content.py`):
+    - 17 tests covering all validation paths
+    - Tests for valid content, missing files, invalid schemas, bad entity references
+    - Integration tests validating real repository content
+    - All tests passing
+- **Last Updated:** 2025-12-02
 
 ### 9.1.1 — AI Observer Foundation Acceptance (M9.1)
 - **GitHub Issue:** [#19](https://github.com/TheWizardsCode/GEngine/issues/19)
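
Review note on the 8.4.1 completion notes above: they describe entity reference checking and an exit-code convention (0 success, 1 validation errors, 2 file/config errors). The sketch below shows one way such a convention could be wired up. It is not the contents of `scripts/build_content.py`; the function names (`check_references`, `exit_code_for`), the `districts` / `district` field names, and the in-memory data shape are all hypothetical.

```python
from __future__ import annotations

# Exit codes as listed in the completion notes.
EXIT_OK = 0           # all content valid
EXIT_VALIDATION = 1   # content loaded but failed validation
EXIT_FILE_ERROR = 2   # file missing or unreadable


def check_references(world: dict, seeds: list[dict]) -> list[str]:
    """Return one message per story seed that names an unknown district.

    The "districts" and "district" keys are assumptions for illustration.
    """
    known = {d["id"] for d in world.get("districts", [])}
    errors = []
    for seed in seeds:
        target = seed.get("district")
        if target not in known:
            errors.append(
                f"seed '{seed.get('id')}' references unknown district '{target}'"
            )
    return errors


def exit_code_for(world: dict | None, seeds: list[dict]) -> int:
    """Map a validation outcome onto the 0/1/2 convention."""
    if world is None:  # stands in for a missing or unparsable file
        return EXIT_FILE_ERROR
    return EXIT_VALIDATION if check_references(world, seeds) else EXIT_OK
```

Keeping "could not read the input" (2) separate from "read it and found problems" (1) lets a CI job tell a broken checkout apart from a content mistake.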

diff --git a/pyproject.toml b/pyproject.toml
@@ -47,6 +47,8 @@ build-backend = "setuptools.build_meta"
 
 [tool.ruff]
 line-length = 88
+
+[tool.ruff.lint]
 select = ["E", "F", "B", "I"]
 
 [tool.pytest.ini_options]

diff --git a/scripts/analyze_difficulty_profiles.py b/scripts/analyze_difficulty_profiles.py
@@ -51,7 +51,9 @@ def from_telemetry(cls, preset: str, data: dict[str, Any]) -> "DifficultyProfile
         # Calculate faction balance as delta between faction legitimacies
         faction_leg = data.get("faction_legitimacy", {})
         leg_values = list(faction_leg.values())
-        faction_balance = max(leg_values) - min(leg_values) if len(leg_values) >= 2 else 0.0
+        faction_balance = (
+            max(leg_values) - min(leg_values) if len(leg_values) >= 2 else 0.0
+        )
 
         # Economic pressure from price volatility
         economy = data.get("last_economy", {})
@@ -161,17 +163,16 @@ def compare_profiles(profiles: dict[str, DifficultyProfile]) -> dict[str, Any]:
                 "✓ Unrest correctly increases with difficulty (harder = more unrest)"
             )
         else:
-            findings.append(
-                "⚠ Unrest does not consistently increase with difficulty"
-            )
+            findings.append("⚠ Unrest does not consistently increase with difficulty")
 
     # Check for extreme values
     for preset, profile in profiles.items():
         if profile.stability_end <= 0.0:
             findings.append(f"⚠ {preset}: Stability collapsed to 0 (may be too harsh)")
         if profile.anomalies > 100:
             findings.append(
-                f"⚠ {preset}: High anomaly count ({profile.anomalies}) indicates system stress"
+                f"⚠ {preset}: High anomaly count ({profile.anomalies}) "
+                "indicates system stress"
             )
 
     # Check differentiation between adjacent difficulties
@@ -181,8 +182,8 @@ def compare_profiles(profiles: dict[str, DifficultyProfile]) -> dict[str, Any]:
         stability_diff = abs(prof1.stability_end - prof2.stability_end)
         if stability_diff < 0.05:
             findings.append(
-                f"⚠ {p1} vs {p2}: Stability difference is minimal ({stability_diff:.3f}), "
-                "consider widening gap"
+                f"⚠ {p1} vs {p2}: Stability difference is minimal "
+                f"({stability_diff:.3f}), consider widening gap"
             )
 
     comparison["findings"] = findings
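
Review note: the hunks above only re-wrap long lines, so behavior should be unchanged. For readers unfamiliar with the two checks being touched, here is a small standalone sketch of the same arithmetic. `faction_balance` mirrors the max-minus-min expression in `from_telemetry`; `minimal_gaps` is a hypothetical helper that assumes presets are supplied in difficulty order and reuses the 0.05 threshold visible in the diff.

```python
from __future__ import annotations


def faction_balance(faction_legitimacy: dict[str, float]) -> float:
    """Spread between the strongest and weakest faction, 0.0 with fewer than two."""
    values = list(faction_legitimacy.values())
    return max(values) - min(values) if len(values) >= 2 else 0.0


def minimal_gaps(
    stability_by_preset: dict[str, float], threshold: float = 0.05
) -> list[tuple[str, str]]:
    """Return adjacent preset pairs whose end stability differs by less than threshold.

    Assumes the dict is ordered from easiest to hardest preset.
    """
    names = list(stability_by_preset)
    pairs = []
    for p1, p2 in zip(names, names[1:]):
        if abs(stability_by_preset[p1] - stability_by_preset[p2]) < threshold:
            pairs.append((p1, p2))
    return pairs
```

The `len(values) >= 2` guard matters because `max()` and `min()` raise `ValueError` on an empty sequence.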

diff --git a/scripts/eoe_dump_state.py b/scripts/eoe_dump_state.py
@@ -4,7 +4,6 @@
 
 import argparse
 from pathlib import Path
-from typing import Optional
 
 from gengine.echoes.content import load_world_bundle
 from gengine.echoes.persistence import save_snapshot

diff --git a/scripts/plot_environment_trajectories.py b/scripts/plot_environment_trajectories.py
@@ -50,15 +50,17 @@ def main(argv: Sequence[str] | None = None) -> int:
     runs = _collect_runs(args.run)
     if not runs:
         raise SystemExit(
-            "No telemetry files found. Provide --run LABEL=PATH or rerun the sweeps to generate JSON."
+            "No telemetry files found. Provide --run LABEL=PATH "
+            "or rerun the sweeps to generate JSON."
         )
 
     fig, (ax_pollution, ax_unrest) = plt.subplots(2, 1, sharex=True, figsize=(10, 6))
     for label, path in runs.items():
         ticks, pollution, unrest = _extract_series(path)
         if len(ticks) < 2:
             print(
-                f"Warning: {label} only provided {len(ticks)} sample(s); increase focus.history_length before capturing telemetry."
+                f"Warning: {label} only provided {len(ticks)} sample(s); "
+                "increase focus.history_length before capturing telemetry."
             )
         ax_pollution.plot(ticks, pollution, label=label)
         ax_unrest.plot(ticks, unrest, label=label)

diff --git a/scripts/run_difficulty_sweeps.py b/scripts/run_difficulty_sweeps.py
@@ -69,10 +69,14 @@ def run_difficulty_sweeps(
                 sys.stderr.write(f"[SKIP] Config not found: {config_root}\n")
             continue
 
-        output_path = output_dir / f"difficulty-{preset}-sweep.json" if output_dir else None
+        output_path = (
+            output_dir / f"difficulty-{preset}-sweep.json" if output_dir else None
+        )
 
         if verbose:
-            sys.stderr.write(f"\n[START] {preset.upper()} difficulty ({ticks} ticks, seed={seed})\n")
+            sys.stderr.write(
+                f"\n[START] {preset.upper()} difficulty ({ticks} ticks, seed={seed})\n"
+            )
 
         start = perf_counter()
         summary = run_headless_sim(
@@ -106,7 +110,9 @@ def run_difficulty_sweeps(
 
     total_elapsed = perf_counter() - start_total
     if verbose:
-        sys.stderr.write(f"\n[COMPLETE] {len(results)} presets in {total_elapsed:.1f}s\n")
+        sys.stderr.write(
+            f"\n[COMPLETE] {len(results)} presets in {total_elapsed:.1f}s\n"
+        )
 
     return results
 

diff --git a/scripts/run_headless_sim.py b/scripts/run_headless_sim.py
@@ -58,14 +58,26 @@ def run_headless_sim(
         "faction_actions": sum(len(report.faction_actions) for report in reports),
         "faction_action_breakdown": _faction_breakdown(reports),
     }
-    summary["suppressed_events"] = sum(len(report.suppressed_events) for report in reports)
+    summary["suppressed_events"] = sum(
+        len(report.suppressed_events) for report in reports
+    )
     summary["director_feed"] = dict(engine.state.metadata.get("director_feed", {}))
-    summary["director_history"] = list(engine.state.metadata.get("director_history") or [])
-    summary["director_analysis"] = dict(engine.state.metadata.get("director_analysis") or {})
-    summary["director_events"] = list(engine.state.metadata.get("director_events") or [])
-    summary["director_pacing"] = dict(engine.state.metadata.get("director_pacing") or {})
+    summary["director_history"] = list(
+        engine.state.metadata.get("director_history") or []
+    )
+    summary["director_analysis"] = dict(
+        engine.state.metadata.get("director_analysis") or {}
+    )
+    summary["director_events"] = list(
+        engine.state.metadata.get("director_events") or []
+    )
+    summary["director_pacing"] = dict(
+        engine.state.metadata.get("director_pacing") or {}
+    )
     summary["story_seeds"] = list(engine.state.metadata.get("story_seeds_active") or [])
-    summary["story_seed_lifecycle"] = dict(engine.state.metadata.get("story_seed_lifecycle") or {})
+    summary["story_seed_lifecycle"] = dict(
+        engine.state.metadata.get("story_seed_lifecycle") or {}
+    )
     summary["story_seed_lifecycle_history"] = list(
         engine.state.metadata.get("story_seed_lifecycle_history") or []
     )
@@ -131,7 +143,9 @@ def _advance_in_batches(
             "ticks": len(step_reports),
             "ending_tick": last_report.tick if last_report else engine.state.tick,
             "agent_actions": sum(len(report.agent_actions) for report in step_reports),
-            "faction_actions": sum(len(report.faction_actions) for report in step_reports),
+            "faction_actions": sum(
+                len(report.faction_actions) for report in step_reports
+            ),
         }
         if last_report is not None:
             batch_payload["tick_ms"] = round(
@@ -213,8 +227,12 @@ def main(argv: Sequence[str] | None = None) -> int:
         default=None,
         help="Optional snapshot file to load instead of content",
     )
-    parser.add_argument("--ticks", "-t", type=int, default=200, help="Number of ticks to advance")
-    parser.add_argument("--seed", type=int, default=None, help="RNG seed override for determinism")
+    parser.add_argument(
+        "--ticks", "-t", type=int, default=200, help="Number of ticks to advance"
+    )
+    parser.add_argument(
+        "--seed", type=int, default=None, help="RNG seed override for determinism"
+    )
     parser.add_argument(
         "--lod",
         choices=["detailed", "balanced", "coarse"],
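
Review note: the re-wrapped summary lines in `run_headless_sim` mix two fallback idioms, `.get(key, {})` for `director_feed` and `.get(key) or []` / `or {}` for the rest. The sketch below shows the difference in isolation. `snapshot_metadata` is a hypothetical helper, and whether `None` is ever stored under these keys is not visible in this diff.

```python
from __future__ import annotations

from typing import Any


def snapshot_metadata(metadata: dict[str, Any]) -> dict[str, Any]:
    """Copy two metadata entries defensively, as the summary code does."""
    return {
        # The default only applies when the key is absent; a stored None
        # would reach dict() and raise TypeError.
        "director_feed": dict(metadata.get("director_feed", {})),
        # `or []` also covers a key that is present but set to None.
        "director_history": list(metadata.get("director_history") or []),
    }
```

Wrapping in `dict(...)` / `list(...)` also makes a shallow copy, so later mutation of the summary does not touch engine state.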

diff --git a/src/gengine/ai_player/actor.py b/src/gengine/ai_player/actor.py
@@ -419,9 +419,9 @@ def _create_observation_summary(
             start_value=1.0,  # Assumed start
             end_value=stability,
             delta=stability - 1.0,
-            trend="stable" if abs(stability - 1.0) < 0.01 else (
-                "increasing" if stability > 1.0 else "decreasing"
-            ),
+            trend="stable"
+            if abs(stability - 1.0) < 0.01
+            else ("increasing" if stability > 1.0 else "decreasing"),
         )
 
         # Extract faction swings
@@ -432,9 +432,9 @@ def _create_observation_summary(
                 start_value=0.5,  # Assumed start
                 end_value=leg,
                 delta=leg - 0.5,
-                trend="stable" if abs(leg - 0.5) < 0.05 else (
-                    "increasing" if leg > 0.5 else "decreasing"
-                ),
+                trend="stable"
+                if abs(leg - 0.5) < 0.05
+                else ("increasing" if leg > 0.5 else "decreasing"),
             )
 
         return ObservationReport(
@@ -472,9 +472,7 @@ def _build_telemetry(self, final_state: dict[str, Any]) -> dict[str, Any]:
 
         return {
             "action_counts": action_counts,
-            "priority_stats": {
-                k: round(v, 4) for k, v in priority_stats.items()
-            },
+            "priority_stats": {k: round(v, 4) for k, v in priority_stats.items()},
             "strategy_type": self._strategy.strategy_type.value,
             "final_state": {
                 "stability": final_state.get("stability", 1.0),

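Review note: the formatter leaves the nested conditional for `trend` in `_create_observation_summary` hard to scan, and the same shape appears twice with different baselines and tolerances (1.0 / 0.01 for stability, 0.5 / 0.05 for faction legitimacy). A possible follow-up, sketched here with a hypothetical helper name `classify_trend`, is to pull the logic into one function; this is a suggestion, not something the PR does.

```python
def classify_trend(value: float, baseline: float, tolerance: float) -> str:
    """Label a value relative to its assumed starting baseline."""
    if abs(value - baseline) < tolerance:
        return "stable"
    return "increasing" if value > baseline else "decreasing"
```

With such a helper the two call sites would read `trend=classify_trend(stability, 1.0, 0.01)` and `trend=classify_trend(leg, 0.5, 0.05)`.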
diff --git a/src/gengine/ai_player/llm_strategy.py b/src/gengine/ai_player/llm_strategy.py
@@ -83,9 +83,7 @@ def __post_init__(self) -> None:
         if self.llm_timeout_seconds <= 0:
             raise ValueError("llm_timeout_seconds must be positive")
         if not 0.0 <= self.rule_priority_scaling <= 1.0:
-            raise ValueError(
-                "rule_priority_scaling must be between 0.0 and 1.0"
-            )
+            raise ValueError("rule_priority_scaling must be between 0.0 and 1.0")
 
 
 @dataclass
@@ -270,6 +268,7 @@ def request_decision(
             if loop is not None and loop.is_running():
                 # Already in async context - use thread to avoid nested loops
                 import concurrent.futures
+
                 with concurrent.futures.ThreadPoolExecutor(max_workers=1) as executor:
                     # Create a new event loop in the thread
                     future = executor.submit(self._run_in_new_loop, request)
@@ -376,7 +375,8 @@ def _build_command_from_context(
         if "multiple_stressed_factions" in factors:
             factions = request.state.get("faction_legitimacy", {})
             low_factions = [
-                f for f, leg in factions.items()
+                f
+                for f, leg in factions.items()
                 if leg < self._config.complexity_threshold_legitimacy
             ]
             return (
@@ -496,7 +496,8 @@ def evaluate_complexity(
     # Check faction stress
     faction_legitimacy = state.get("faction_legitimacy", {})
     stressed_factions = sum(
-        1 for leg in faction_legitimacy.values()
+        1
+        for leg in faction_legitimacy.values()
         if leg < config.complexity_threshold_legitimacy
     )
     if stressed_factions >= config.complexity_threshold_factions:
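
Review note on the `request_decision` hunk above: the comment there says a thread is used to avoid nested event loops. The standalone sketch below shows that pattern under stated assumptions. `_decide` is a hypothetical stand-in for the async LLM call, the module-level `_run_in_new_loop` is modeled on the method name in the diff, and detecting the loop with `asyncio.get_running_loop()` is a guess, since that part of the source is not shown.

```python
import asyncio
import concurrent.futures


async def _decide(prompt: str) -> str:
    """Hypothetical async call standing in for the LLM request."""
    await asyncio.sleep(0)
    return f"decision for {prompt}"


def _run_in_new_loop(prompt: str) -> str:
    """Run the coroutine on a fresh event loop owned by the calling thread."""
    return asyncio.run(_decide(prompt))


def request_decision(prompt: str, timeout: float = 5.0) -> str:
    """Synchronous entry point that works with or without a running loop."""
    try:
        loop = asyncio.get_running_loop()
    except RuntimeError:
        loop = None

    if loop is not None and loop.is_running():
        # Already in async context: asyncio.run() would fail here, so hand
        # the work to a worker thread that creates its own loop.
        with concurrent.futures.ThreadPoolExecutor(max_workers=1) as executor:
            future = executor.submit(_run_in_new_loop, prompt)
            return future.result(timeout=timeout)
    return asyncio.run(_decide(prompt))
```

Note that `future.result()` blocks the calling thread, so when called from a coroutine the outer loop is stalled until the worker finishes; that is the trade-off for offering a synchronous API.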