pedrofuentes · Jun 26, 2026
@@ -1,6 +1,6 @@
 # AGENTS.md — Council
 
-<!-- agents-template v0.18.0 -->
+<!-- agents-template v0.19.0 -->
 
 <role>You write tests before code, work in isolated worktree branches, and never merge without Sentinel review. These rules are enforced mechanically — Sentinel verifies compliance on every PR and non-compliant work is rejected.</role>
 

@@ -161,10 +161,10 @@ A sub-agent is a **separately-invoked tool call** (e.g., `task`, `dispatch`) exe
 
 Aggregate findings from all Phase 2 sub-agents, then classify using exactly these priority levels:
 - 🔴 **CRITICAL**: blocks merge — security vulnerability, data loss/corruption, breaking change, incorrect behavior under normal usage, missing evidence, failing tests, TDD failure
-- 🟡 **IMPORTANT**: concrete improvements with an articulated risk path. Each 🟡 must state: (1) **trigger** — what action or input activates the path, (2) **mechanism** — the reachable code path from trigger to failure, (3) **consequence** — the observable damage (data loss, error, degraded UX, outage). Missing any element → 🟢, not 🟡. Requires follow-ups tracked as GitHub issues. **If a 🟡 could cause data loss, security exposure, cascading outage, or incorrect behavior under normal usage → reclassify as 🔴.** Concerns without an articulated risk path → 🟢, not 🟡. **🟡 exclusions (classify as 🟢):** missing CHANGELOG/docs with no release/API/user-impact requirement, "better abstraction" without a failure path, rename/restructure suggestions, stylistic preferences — these lack the required trigger→mechanism→consequence chain.
+- 🟡 **IMPORTANT**: concrete improvements with an articulated risk path. Each 🟡 must state: (1) **trigger** — what action or input activates the path, (2) **mechanism** — the reachable code path from trigger to failure, (3) **consequence** — the observable damage (data loss, error, degraded UX, outage). Missing any element → 🟢, not 🟡. Requires follow-ups tracked as GitHub issues. **If a 🟡 could cause data loss, security exposure, cascading outage, or incorrect behavior under normal usage → reclassify as 🔴.** Concerns without an articulated risk path → 🟢, not 🟡. **🟡 exclusions (classify as 🟢):** missing CHANGELOG (always 🟢 — never 🟡), missing docs with no release/API/user-impact requirement, "better abstraction" without a failure path, rename/restructure suggestions, stylistic preferences — these lack the required trigger→mechanism→consequence chain.
 - 🟢 **MINOR**: polish, theoretical improvements, or speculative edge cases where no reachable trigger, concrete failure mode, or material impact is identified; does not block. **Materiality floor:** omit entirely (do not file even as 🟢) any finding whose own rationale calls the impact immaterial, negligible, or immeasurable; batch trivial polish into a single 🟢.
 
-**Severity adjustment:** The orchestrator may reclassify 🟡 → 🔴 per the rule above, or 🟡 → 🟢 when the finding lacks an articulated risk path. **NEVER** 🔴 → 🟡/🟢. Sub-agent 🔴 severity is a floor; 🟡 is advisory and subject to orchestrator calibration.
+**Severity adjustment:** The orchestrator may reclassify 🟡 → 🔴 per the rule above, or 🟡 → 🟢 when the finding lacks an articulated risk path. **NEVER** 🔴 → 🟡/🟢. Sub-agent 🔴 severity is a floor; 🟡 is advisory and subject to orchestrator calibration. Apply [`sentinel/SEVERITY-RUBRIC.md`](sentinel/SEVERITY-RUBRIC.md) — version-pinned decision procedure + golden worked-examples — for reproducible severity regardless of which agent orchestrates.
 
 **Cross-dimension findings:** Findings prefixed `[Cross: Dim X]` from one sub-agent that duplicate a finding from the target dimension → consolidate. If the target dimension missed it → adopt the cross-referenced finding at the target dimension's severity default.
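
The severity-adjustment rule above can be checked mechanically. The following is a minimal sketch, not part of Sentinel: the `Severity` enum and `is_allowed_adjustment` are hypothetical names. It encodes only that a sub-agent 🔴 is a floor while 🟡 may move in either direction. The rule says nothing about adjusting a sub-agent 🟢, so the sketch assumes that case is unrestricted.

```python
from enum import Enum


class Severity(Enum):
    # Hypothetical names mirroring the three priority levels.
    CRITICAL = "🔴"
    IMPORTANT = "🟡"
    MINOR = "🟢"


def is_allowed_adjustment(sub_agent: Severity, orchestrator: Severity) -> bool:
    """Return True if the orchestrator's severity respects the sub-agent floor."""
    if sub_agent is Severity.CRITICAL:
        # A sub-agent CRITICAL is a floor: it may never become IMPORTANT or MINOR.
        return orchestrator is Severity.CRITICAL
    # IMPORTANT is advisory: it may stay, rise to CRITICAL, or drop to MINOR.
    # Assumption: MINOR is not restricted by the rule, so any target is accepted.
    return True
```

An orchestrator could run such a check over every (sub-agent, final) severity pair before emitting a verdict, so that a downgraded 🔴 is caught as a malformed report rather than silently accepted.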
 
@@ -234,7 +234,7 @@ Required action: MERGE | FILE_ISSUES_AND_MERGE | FIX_AND_REINVOKE
 **`Required action` mapping**: APPROVED→MERGE, CONDITIONAL→FILE_ISSUES_AND_MERGE, REJECTED→FIX_AND_REINVOKE. Mismatch = malformed report; re-run Sentinel.
 
 ## Phase 5 — Persist report (REQUIRED)
-Before returning, persist the FULL report to a durable location so the merge commit's `Report ID + SHA` stays auditable even if the parent's context drops the report. Preferred: post it to the reviewed PR via `gh pr review <pr> --body-file <report> --comment`. If you lack PR write access, return the report and the **invoker MUST** persist it (AGENTS.md §After Sentinel). Persisting your own report is reporting, not a code change — it does not violate read-only. Record the persisted URL/path in the Phase 2 Execution Log. Returning the report as agent text only is INSUFFICIENT.
+Before returning, persist the FULL report to a durable location so the merge commit's `Report ID + SHA` stays auditable even if the parent's context drops the report. Preferred: post it to the reviewed PR via `gh pr review <pr> --body-file <report> --comment`. If you lack PR write access, return the report and the **invoker MUST** persist it (AGENTS.md §After Sentinel). A committed `.sentinel/reports/<id>.md` fallback MUST land on a persisted branch — **never inside a throwaway/ephemeral verification worktree** (for isolated checks use a repo-relative scratch path like `.worktrees/sentinel-<id>`, treat it as scratch not storage, and clean it up — it is not a durable report location). Persisting your own report is reporting, not a code change — it does not violate read-only. Record the persisted URL/path in the Phase 2 Execution Log. Returning the report as agent text only is INSUFFICIENT.
 
 ## Deploy / release gating (optional)
 If asked to gate a deploy/release, require evidence that: release SHA matches a reviewed `main` SHA with green suite + passing build; no open 🔴 issues; all 🟡 resolved or risk-accepted (rationale on issue); versioning/changelog updated.
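
The deploy/release gate above reads as a checklist, which might be sketched as follows. This is an illustration under stated assumptions: `ReleaseEvidence`, its field names, and `gate_failures` are hypothetical, and how the evidence is collected (CI status, issue tracker queries) is out of scope.

```python
from dataclasses import dataclass
from typing import FrozenSet, List


@dataclass(frozen=True)
class ReleaseEvidence:
    # All field names are hypothetical; they mirror the evidence listed above.
    release_sha: str
    reviewed_main_shas: FrozenSet[str]
    suite_green: bool
    build_passing: bool
    open_critical_issues: int
    unresolved_important_issues: int  # neither resolved nor risk-accepted
    version_and_changelog_updated: bool


def gate_failures(e: ReleaseEvidence) -> List[str]:
    """Return the unmet requirements; an empty list means the gate passes."""
    failures: List[str] = []
    if e.release_sha not in e.reviewed_main_shas:
        failures.append("release SHA does not match a reviewed main SHA")
    if not e.suite_green:
        failures.append("test suite is not green")
    if not e.build_passing:
        failures.append("build is not passing")
    if e.open_critical_issues > 0:
        failures.append("open CRITICAL issues remain")
    if e.unresolved_important_issues > 0:
        failures.append("IMPORTANT issues are neither resolved nor risk-accepted")
    if not e.version_and_changelog_updated:
        failures.append("versioning/changelog not updated")
    return failures
```

Returning the list of failures rather than a single boolean keeps the gate's output usable as evidence in the report itself.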

@@ -0,0 +1,58 @@
+# Sentinel Severity Rubric (v1)
+
+**Orchestrator Phase 3 calibration reference.** Purpose: make severity verdicts
+**reproducible across reviewers** — the same finding class yields the same severity
+regardless of which agent orchestrates. Apply it AFTER sub-agent findings are aggregated
+(SENTINEL.md Phase 3). Sub-agent 🔴 is a floor; 🟡/🟢 are advisory and re-calibrated here.
+
+## Decision procedure (apply in order)
+1. Does the finding have a concrete **trigger → reachable mechanism → observable
+   consequence**? No → 🟢 (or omit entirely if its own rationale calls the impact
+   immaterial/negligible).
+2. Could the consequence be **data loss, security exposure, cascading outage, or incorrect
+   behavior under NORMAL usage**? Yes → 🔴.
+3. Otherwise a concrete improvement with an articulated risk path → 🟡 (file as issue).
+4. **NEVER** downgrade a sub-agent 🔴. **NEVER** 🔴 → 🟡/🟢.
+
+## Tiers
+- 🔴 **CRITICAL** — blocks merge (REJECTED). Security vuln, data loss/corruption, breaking
+  change, incorrect behavior under normal usage, missing evidence, failing tests, TDD failure.
+- 🟡 **IMPORTANT** — concrete fix with trigger+mechanism+consequence; does not block; filed
+  as a GitHub issue (CONDITIONAL).
+- 🟢 **MINOR** — polish, theoretical/unreachable, or no articulated risk path; does not block.
+
+## Golden examples (canonical — match each new finding to the nearest row)
+
+| Finding | Severity | Why (decision step) |
+|---------|----------|---------------------|
+| Jitter applied to a server-mandated `Retry-After`, shortening the cooldown | 🔴 | Incorrect under normal usage (2) — retrying before the cooldown ends can extend throttling |
+| Stale retry-overlay freezes health tiles at retry-time data (shows "passing" after failure) | 🔴 | Incorrect under normal usage (2) — user sees wrong state |
+| New data layer never executed by any test (wiring tests mock the hook), hiding a latent bug | 🔴 | Untested path concealing a real bug — Dim D gaming + (2) |
+| Retry storm: retries without jitter causing coordinated load spikes | 🔴 | Cascading outage (2) |
+| Non-idempotent mutation on a retried path (payment/provisioning) | 🔴 | Data corruption under normal retry (2) |
+| Missing timeout on a request-critical network call that can exhaust connections | 🔴 | Cascading outage (2) |
+| Untrusted input reaches an injection sink (SQL/shell/HTML/template) without escaping or parameterization | 🔴 | Exploitable security vuln (2) |
+| Non-CSPRNG (`Math.random()`) used to generate a token, session ID, password, or secret | 🔴 | Predictable secret → security exposure (2) |
+| New dependency with a `postinstall` script, a typosquatted name, or a swapped `resolved` URL / integrity hash in the lockfile | 🔴 | Supply-chain compromise (2) |
+| Missing timeout on a non-critical, bounded background call | 🟡 | Reachable risk, bounded blast radius (3) |
+| Test asserts an outcome but uses no concrete-value oracle (non-discriminating) | 🟡 | Reachable: a wrong value would still pass; harden it (3) |
+| Untested new error/branch path with a plausible trigger | 🟡 | Articulated risk path (3) |
+| Defensive guard whose trigger is unreachable given current callers | 🟢 | No reachable trigger (1) |
+| `Math.random()` used for UI animation / visual jitter (no security surface) | 🟢 | No security surface (1) — contrast the CSPRNG 🔴 above |
+| Dependency bump of an unused or dev-only package; no API/behavior change | 🟢 | No reachable risk (1) — contrast the typosquat 🔴 above |
+| **Missing CHANGELOG entry** | 🟢 | Non-behavioral convention; no trigger→mechanism→consequence (1). **NEVER 🟡.** |
+| Missing/incomplete docs with no release/API/user-impact requirement | 🟢 | No risk path (1) |
+| Rename / restructure / "better abstraction" without a failure path | 🟢 | No risk path (1) |
+| Stylistic preference (formatting, naming) | 🟢 | No risk path; batch into one 🟢 |
+
+## Borderline rules
+- **🟡 → 🔴** when the risk path reaches data loss, security exposure, cascading outage, or
+  normal-usage-incorrect behavior.
+- **🟡 → 🟢** when there is no trigger, no reachable mechanism, or an immaterial consequence.
+- **Pre-existing** issue the diff neither introduces nor newly reaches → 🟢 max (never 🔴/🟡).
+- Finding matches an open `sentinel:*` issue (same mechanism + fix) → **Known** (excluded
+  from verdict). 🔴 can **NEVER** be Known.
+
+## Version
+Rubric **v1** — bound to SENTINEL.md ruleset v1 (agents-template v0.19.0). Bump this version
+whenever severity semantics change, so verdicts stay reproducible against a pinned rubric.
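
The rubric's four-step decision procedure can be sketched as a small function. This is a minimal illustration, not the rubric's implementation: the `Finding` fields and `classify` are hypothetical names, and the reviewer is assumed to have already judged each boolean. The sub-agent 🔴 floor (step 4) is checked first, which gives the same result as applying it last because it can never be overridden. The borderline rules (pre-existing issues, Known findings) are deliberately left out.

```python
from dataclasses import dataclass
from typing import Optional

CRITICAL, IMPORTANT, MINOR = "CRITICAL", "IMPORTANT", "MINOR"


@dataclass(frozen=True)
class Finding:
    # Hypothetical fields; each one is a reviewer judgment, not computed here.
    has_trigger: bool
    has_mechanism: bool
    has_consequence: bool
    # Data loss, security exposure, cascading outage, or incorrect
    # behavior under normal usage.
    critical_consequence: bool = False
    # The finding's own rationale calls the impact immaterial/negligible.
    self_described_immaterial: bool = False
    sub_agent_severity: Optional[str] = None


def classify(f: Finding) -> Optional[str]:
    """Return the calibrated severity, or None when the finding is omitted."""
    # Step 4: a sub-agent CRITICAL is a floor and is never downgraded.
    if f.sub_agent_severity == CRITICAL:
        return CRITICAL
    # Step 1: without trigger -> mechanism -> consequence, MINOR or omit.
    if not (f.has_trigger and f.has_mechanism and f.has_consequence):
        return None if f.self_described_immaterial else MINOR
    # Step 2: a critical consequence escalates to CRITICAL.
    if f.critical_consequence:
        return CRITICAL
    # Step 3: a concrete improvement with an articulated risk path.
    return IMPORTANT
```

Applied to the golden examples, a non-CSPRNG token generator (full risk path, security exposure) yields `CRITICAL`, a missing timeout on a bounded background call yields `IMPORTANT`, and `Math.random()` for UI animation (no reachable trigger) yields `MINOR`.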
@@ -19,7 +19,7 @@ Findings must originate from changed lines or code whose reachability, inputs, o
 
 ### Accuracy & completeness
 - README reflects current behavior — if the diff changes user-facing behavior and no docs are touched, flag 🟡 "docs likely needed." Only claim "README updated correctly" when README sections are modified in the diff.
-- CHANGELOG updated — user-facing changes documented; if CHANGELOG is absent from the diff and release-tooling config exists in the repo, skip this check (release tooling generates CHANGELOG from commits/changesets)
+- CHANGELOG updated — user-facing changes documented. If CHANGELOG is absent from the diff and release-tooling config exists in the repo, skip this check (release tooling generates it from commits/changesets). Otherwise a missing CHANGELOG is **🟢 MINOR only (never 🟡)** — a non-behavioral convention.
 - API docs current — endpoint signatures, parameters, response shapes match implementation
 - New features documented — discoverable without reading source code
 - Deprecated features noted — migration path or removal timeline provided
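
The CHANGELOG bullet above reduces to a short decision, sketched here under stated assumptions. `changelog_finding` and its parameters are hypothetical names. Reading "absent" as "absent from the diff", and limiting the check to diffs with user-facing changes, are both interpretations of the bullet rather than something it states outright.

```python
from typing import Optional


def changelog_finding(
    changelog_in_diff: bool,
    release_tooling_config_exists: bool,
    user_facing_change: bool,
) -> Optional[str]:
    """Return "MINOR" when a missing CHANGELOG should be reported, else None."""
    if changelog_in_diff:
        return None  # CHANGELOG was touched; nothing is missing.
    if release_tooling_config_exists:
        return None  # Skip: release tooling generates the CHANGELOG.
    if not user_facing_change:
        return None  # Assumption: only user-facing changes need an entry.
    return "MINOR"  # Never IMPORTANT: a non-behavioral convention.
```

The function has no path that returns an IMPORTANT severity, which mirrors the "never 🟡" constraint in the bullet.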