15 changes: 15 additions & 0 deletions .claude/CLAUDE.md
@@ -38,3 +38,18 @@ This repo squash-merges PRs. Two consequences worth knowing before pushing follo
- **After a squash-merge, delete the feature branch.** It contains pre-squash commits with stale SHAs; reusing it for new work re-creates the ghost-conflict problem. `git checkout main && git pull && git branch -D <branch> && git push origin --delete <branch>`.

Diagnostic: if a PR shows `blocked` and `git diff origin/main HEAD` is empty, the PR's content is already on main via squash-merge — close the PR rather than trying to merge it.

---

## CI / Required Status Checks

Never put `on.*.paths` on a workflow that is a **required** status check. A path-filtered required workflow that doesn't trigger is reported as permanently "Expected", which leaves the PR `mergeable_state: blocked` even when everything else is green (this stranded #213/#215 until #216).

Pattern for every required gate:

- **No `on.*.paths`** — the workflow always triggers, so the required check is always created.
- A lightweight always-run **`changes`** job recomputes the gate's relevant path set via `git diff origin/<base>...HEAD`.
- Each heavy job carries `needs: changes` + `if: needs.changes.outputs.run == 'true'`. A job skipped via `if:` counts as a **passing** required check, so unrelated PRs aren't blocked and don't pay for the heavy work.
- **Fail safe:** the detector defaults to `run=true` and only skips on a successful diff showing nothing relevant changed. Don't rename jobs/checks (breaks the required-context list).

Full rationale: `docs/AI-CONVENTIONS.adoc` §"CI / Required Status Checks" and the `docs/wikis/CI-and-Required-Checks.adoc` wiki page.
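
The pattern above can be sketched as a small shell detector for the `changes` job. This is a minimal sketch under assumptions, not the script the workflows actually ship: the function names `decide_run` and `detect_changes` and the example regex are hypothetical, and each gate would supply its own path set.

```shell
#!/usr/bin/env bash
# Hypothetical detector for an always-run `changes` job (illustrative only).

# decide_run REGEX
# Reads changed paths on stdin. Prints run=false only when grep ran cleanly
# and found no match; a match or a grep error both print run=true.
decide_run() {
  local regex="$1" status=0
  grep -Eq -- "$regex" || status=$?
  if [ "$status" -eq 1 ]; then
    echo "run=false"
  else
    echo "run=true"
  fi
}

# detect_changes BASE REGEX
# Fail safe: a missing base, a failed fetch, or a failed diff all mean run=true.
detect_changes() {
  local base="$1" regex="$2" changed
  if [ -z "$base" ]; then
    echo "run=true"
    return 0
  fi
  if ! git fetch --quiet origin "$base" 2>/dev/null; then
    echo "run=true"
    return 0
  fi
  if ! changed="$(git diff --name-only "origin/${base}...HEAD" 2>/dev/null)"; then
    echo "run=true"
    return 0
  fi
  printf '%s\n' "$changed" | decide_run "$regex"
}
```

In a workflow step this might be invoked as `detect_changes "$GITHUB_BASE_REF" '^(src|proofs)/' >> "$GITHUB_OUTPUT"`, with the job exposing that step output as `outputs.run`. The three-dot diff needs a merge base, so on a shallow checkout the diff fails and the detector falls back to `run=true`.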
21 changes: 19 additions & 2 deletions .machine_readable/6a2/PLAYBOOK.a2ml
@@ -5,8 +5,8 @@
# Runbooks, incident response, deployment procedures.

[metadata]
-version = "0.2.0"
-last-updated = "2026-04-25"
+version = "0.2.1"
+last-updated = "2026-06-13"

[deployment]
method = "gitops + manual"
@@ -84,3 +84,20 @@ future-canonical-urls = [
# just maint-hard-pass
# Permission audit:
# just perms-audit

[ci-required-gates]
# Required status checks must ALWAYS report, or a PR touching none of a gate's
# paths sits mergeable_state=blocked forever (the "Expected" trap; see #216,
# which stranded #213/#215). Pattern for every required gate under
# .github/workflows/ (proofs, zig-test, e2e, abi-drift, backend-assurance,
# lsp-dap-bsp, truthfulness):
# 1. NO on.*.paths — the workflow always triggers (required check always created).
# 2. An always-run `changes` job recomputes the gate's path set via
# `git diff origin/<base>...HEAD` (fail-safe: default run=true; only
# run=false on a successful diff with no relevant match).
# 3. Each heavy job: needs: changes + if: needs.changes.outputs.run == 'true'.
# A job skipped via if: reports SUCCESS to required checks.
# Do NOT rename jobs/checks (breaks branch-protection's required-context list).
# Adding a new required gate: follow this shape; never reintroduce on.*.paths.
diagnostic = "PR green but blocked => a required gate never reported (the on.*.paths trap). Fix: drop on.*.paths, add the changes-gate + if:-skipped heavy jobs."
rationale-docs = ["docs/AI-CONVENTIONS.adoc#ci-required-status-checks", "docs/wikis/CI-and-Required-Checks.adoc"]
3 changes: 2 additions & 1 deletion .machine_readable/6a2/STATE.a2ml
@@ -6,7 +6,7 @@
[metadata]
project = "boj-server"
version = "1.1.0-wip"
-last-updated = "2026-06-05"
+last-updated = "2026-06-13"
status = "active"
grade = "C"

@@ -74,6 +74,7 @@ test-coverage = "CLOSED 2026-04-25 — 165 ExUnit tests; CRG C met"

[session-history]
entries = [
{ date = "2026-06-13", description = "CI required-gate skip-shim + estate-wide unblock. ROOT CAUSE: 7 required gates (abi-drift, backend-assurance, e2e, lsp-dap-bsp, proofs, truthfulness, zig-test) were workflow-level path-filtered (on.*.paths); on any PR touching none of a gate's paths the workflow never ran, so its required status check stayed 'Expected' forever and the PR sat mergeable_state=blocked even fully-green (stranded #213 dependabot flake.lock + #215 scripts-only). FIX (PR #216, MERGED): dropped on.*.paths from all 7; added an always-run `changes` job that recomputes each gate's original path set via `git diff origin/<base>...HEAD`; gated every heavy job with needs:changes + if:run=='true' (a job skipped via if: reports SUCCESS to required checks). Fail-safe toward running; job/check names unchanged so no branch-protection edit needed. Validated: actionlint clean; detection regex 77/77; live on #216 all 15 heavy jobs skipped->success. Then merged #213 + #215. MIRRORED to boj-server-cartridges (PR #45, MERGED): same shim on proofs/zig-test/foundry. Conventions captured this session: docs/AI-CONVENTIONS.adoc + .claude/CLAUDE.md (new 'CI / Required Status Checks' sections) + docs/wikis/CI-and-Required-Checks.adoc + PLAYBOOK [ci-required-gates]; AI-CONVENTIONS Banned-Languages 'TypeScript->ReScript' corrected to '->AffineScript' (ReScript retired 2026-04-30). Follow-ups filed: #218 (boj-server Hypatia/governance hygiene: stale GEMINI.md, unpinned governance.yml action, missing timeout-minutes, shellcheck infos); cartridges #46 (TS->AffineScript port of stack-orchestrator-mcp + proof-lsp adapters), #47 (cartridges Hypatia hygiene), #48 (language-cartridge IDE completeness)." },
{ date = "2026-06-05", description = "Truthfulness + Foundry + governance-checkpoint session. (1) Catalogue honesty (PR #196, MERGED) — router.ex defaulted every cartridge to available:true, so all 125 advertised as working when only feedback-mcp is built+real (97 build a .so that returns {\"status\":\"stub\"}; 10 have no FFI). Fixed: explicit `available` on all 125 cartridge.json (true only for feedback-mcp) + `status` where missing; router `available` default flipped true→false (a cartridge must opt in to being advertised, never by omission); new tests/truthfulness_check.sh + .github/workflows/truthfulness.yml gate builds every available cartridge and invokes its first tool, failing on a stub marker (verified to catch an injected stub→available lie); fixed router_test.exs + contract_test.exs /menu tests that asserted the pre-tiering cartridges/count shape (broken since the tiered /menu landed) to assert tier_*/summary + the availability invariant (summary.ready == count of available). believe-me-count unchanged at 5. (2) Foundry (boj-server-cartridges PR #38, MERGED) — the integrated mint→provision→configure→harness cartridge-making workflow, built ON the existing cartridge-minter + gossamer-mcp template (not from scratch). tools/foundry/proof/Foundry.idr (Idris2 0.8.0, typechecked) machine-checks two properties: NO-DROPPED-PROOFS (a `failing` block proves skipping the harness won't compile, pinned to `Can't find an implementation for Elem Truthful`) and LEAST-AUTHORITY (capability index = exact grant; no stage widens it). stages/harness.sh = the one general gate (idris2 --check + zig build + the #196 truthfulness probe + capability-subset check), verified PASS on gossamer-mcp. (3) Filed 8 tracking issues: proofs a/b/c (#197, hyperpolymath/boj-server-cartridges#36, #37), obleeny long-term #198, catch-all hygiene #199, maker #200, implement-stubs #201, gate-enforcement #202. 
(4) Branch reconciliation — retired superseded branches (e2e-rest-contract, feedback-otron; truthfulness-pass/feedback-otron-fix already auto-deleted on merge); reset zen-galileo to main; PRESERVED claude/cartridge-abi-proofs (unmerged proof-CI groundwork: 17 ABI .idr typecheck-fixes for 0.8.0 + typecheck-proofs.sh — captured in #197 for rebase-and-land); one clean main per repo. (5) Governance checkpoint (this commit) — STATE refreshed; ANCHOR realigned to current language policy. FLAGGED: boj-server-cartridges has NO .machine_readable/ at all (entire standard layout missing); no bot_directives/ dir in either repo; standards + rsr-template unreachable from this session (GitHub scope = boj-server + boj-server-cartridges only, no list_repos/add_repo) so the rsr-template-vs-standards divergence check is BLOCKED pending access." },
{ date = "2026-05-26", description = "Repo-tidy / rsr-template-repo taxonomy alignment (PR #149). Six commits; 136 files changed (+5096 -3322). (1) Root .adoc relocations — EXPLAINME.adoc -> docs/, BOJ_LOGIC.adoc + NeSy_SERVERS.adoc -> docs/architecture/, FUTURE_PLANS.adoc + ROADMAP.adoc -> docs/status/, QUICKSTART-{USER,DEV,MAINTAINER}.adoc -> docs/quickstarts/. Cross-refs updated in 0-AI-MANIFEST.a2ml, Justfile, elixir/boj-rest.service, methodology.a2ml (fallback-files list), docs/README.adoc, docs/accessibility/README.adoc, and two outreach drafts. Historical mentions in CHANGELOG.md and dated log entries in STATE.a2ml deliberately left intact. (2) README merge — substantive 518-line README.md converted to AsciiDoc and merged with the unique sections from the shorter 176-line README.adoc (Features bullets, Formal verification). README.md deleted. Refs in jsr.json publish list, mcp-bridge/lib/resources.js docs URL, .github/SECURITY.md, and Intentfile repointed at README.adoc. (3) Wiki + warmup conversions — five wiki pages converted .md->.adoc and moved docs/wiki/ -> docs/wikis/ (template's spelling); llm-warmup-{dev,user}.md -> docs/developer/; CARTRIDGE-PHASE-3B-COMPLETION.md -> docs/status/. Drift fix: STATE.a2ml cartridges-total 112 -> 125 (every dir under cartridges/* has cartridge.json), cartridges-with-zig-ffi 111 -> 115 (manifest-counted), cartridges-with-js-mod 111 -> 113, project-context.purpose '112 cartridges' -> '125 cartridges'. (4) Bulk docs/*.md -> .adoc — ABI-FFI-README, AI-CONVENTIONS, API-CONTRACT, CULTURAL-RESPECT, EXTENSIBILITY, FEDERATION, READINESS, THREAT-MODEL; plus relocations docs/ARCHITECTURE.md -> docs/architecture/README.adoc and docs/DEVELOPERS.md -> docs/developer/README.adoc. 
99 files cross-rewritten (55 cartridge READMEs + governance/wiki/dev/architecture refs + Justfile, .github/copilot-instructions.md, SECURITY.md, src/abi/Boj/Catalogue.idr docstring, k8s/service.yaml, mcp-bridge/lib/api-clients.js, plus outreach/practice docs). New subdir orientation READMEs in docs/quickstarts/, docs/status/, docs/wikis/. docs/READINESS.adoc deliberately stays at docs/ root (not under docs/status/) because 55+ cartridge READMEs link to that canonical path. (5) Quickstart consolidation — substantive docs/QUICKSTART.md (72 lines) -> docs/quickstarts/DEV.adoc (replaces 39-line stub); docs/GETTING-STARTED.md (198 lines) -> docs/quickstarts/BUILD-FROM-SOURCE.adoc (new sibling); docs/OPERATOR-QUICKSTART.md (296 lines) -> docs/quickstarts/MAINTAINER.adoc (replaces 40-line stub). docs/quickstarts/README.adoc updated to list all four with audience guidance. Refs in Mustfile + flake.nix + CRG-LIFT-PLAN + outreach drafts updated. (6) docs/README.adoc index rewritten in full (Reading-order-by-audience + Directory-taxonomy + Standalone-docs/-root + Related-root-level sections) + last lone .md in docs/architecture/ (TYPED-WASM-MCP-BRIDGE.md) converted to .adoc. Hypatia approved-exemption manifest patch /tmp/hypatia-approved-exemptions.patch delivered to user for upstream application to hyperpolymath/hypatia (separate PR). Known-deferred (high coupling, not in this PR): PROOF-NEEDS.md (16 cross-refs incl. CI + Idris proofs + 4 Elixir test files), TOPOLOGY.md (11 cross-refs incl. CI workflow), TEST-NEEDS.md (5 cross-refs); follow-up PR should bulk-rewrite. GEMINI.md left at root — load-bearing (gemini-extension.json contextFileName). CI failures on PR #149 (ABI Specification Check, FFI Build, abi-drift verify, Aspect, E2E) all confirmed pre-existing on main; this PR's diff is comment-line-only outside docs/, cannot have introduced them." },
{ date = "2026-05-20", description = "HCG tier-2 Phase E first-session (afternoon) — sub-issue standards#100 of channel standards#91. Phases A/B/C had already merged; Phase D was scaffold-only (http-capability-gateway#12 merged 08:24Z, bench/baseline.json _status: 'scaffold-placeholder', perf-regression gate non-blocking until D-2..D-4 land). Scope: drive Phase E artefacts that are SAFE w.r.t. Phase D being scaffold-only — runbook + audit + ingress isolation. Out of scope: E1/E2/E3/E4 wiring + Trustfile PENDING→DEPLOYED flip (gated on D-3 regression alert + D-4 real numbers, per runbook §1.1). FOUR PRs shipped + ONE issue filed. (1) PR #128 (MERGED 11:30Z) — docs/integration/hcg-tier2-rollout-runbook.md, the Phase E E5 rollout-and-rollback runbook (covers §E4 + §E5 of the integration plan because standards#100 acceptance #3 names both). 308 lines: prerequisites (Phase D landings + Phase A/B/C artefacts + operational !OWNER: block + BoJ-side + gateway-side), staging cut-over (deploy + telemetry verification + 24h soak + rollback rehearsal), 10/50/100% production rollout, observability signals (gateway + BoJ + dashboards), rollback (triggers + immediate bypass + permanent disable + post-decommission), post-rollout verification + Trustfile flip. !OWNER: markers throughout §1.3 + §4 for on-call rota, dashboard URLs, prod cert paths, traffic-shift mechanism choice, freeze windows. (2) PR #130 (MERGED 11:57Z) — fix(boj): bind Cowboy to 127.0.0.1 by default (audit #6). elixir/lib/boj_rest/application.ex passes ip: explicit binding to Plug.Cowboy options; BOJ_BIND_IP env var override; new parse_bind_ip/1 helper with fail-fast on invalid input (preferred to silently degrading to 0.0.0.0 and exposing the back-side bind). 7 new unit tests in elixir/test/application_test.exs. 
Previously Plug.Cowboy started with port: 7700 and NO ip: option, so it defaulted to 0.0.0.0 — the contract document's 'BoJ :7700 is not externally routable' claim and the runbook §1.4 prereq #6 were OPERATIONAL ASSERTIONS, not code-enforced. (3) PR #131 (MERGED 12:35Z) — fix(k8s): Service for BoJ to ClusterIP (audit #8). k8s/service.yaml type: LoadBalancer → ClusterIP. Added hyperpolymath.dev/exposure: 'internal-only' + hyperpolymath.dev/external-via: 'http-capability-gateway (tier-2)' annotations. Header comment with kustomize override recipe for legacy/standalone deployments. Estate cross-check: hypatia/*, rsr-certifier, opsm-service all use ClusterIP for backends; only svalinn-gateway (a gateway, not a backend) uses LoadBalancer. (4) PR #132 (MERGED 12:35Z) — fix(container): APP_HOST defaults to 127.0.0.1 (audit #7). Three sites that feed the Zig adapter binary's --host flag: stapeln.toml [targets.production], container/entrypoint.sh (lines 40 + 140), container/compose.prod.yaml. The audit named only stapeln.toml; the other two sites had the same '[::]' default that the audit missed but they all feed the same --host so they had to flip together. CI auto-trigger anomaly: pull_request workflows did not fire (likely sweep122 concurrency-pool saturation); manual Governance dispatch returned all 6 jobs green; owner re-ran from Actions tab. (5) Issue #135 filed — k8s NetworkPolicy follow-up (defence-in-depth beyond ClusterIP, Low priority, Phase E acceptance-non-critical). Three threat models named: compromised neighbour pod / operator misconfiguration / §4 defence in depth. Proposed manifest shape included; CNI-support caveat documented. (6) DEFENCE IN DEPTH ACHIEVED — three independent loopback layers now block any §3 invariant 3 violation: Elixir Cowboy (#130) + Zig adapter (#132) + k8s Service (#131). 
(7) Phase C §3 invariant 3 correction — the standards#91 channel-status comment claimed the BoJ-side fix was 'owner-gated, not opened as a PR'; verified by git log against elixir/lib/boj_rest/trust_policy.ex that the deny clause landed in boj-server#106 (commit 40e46f6f) as part of Phase C. Stale claim corrected in memory. (8) NEXT for Phase E (gated): E1 (Containerfile + k9-svc deployment spec finalisation) / E2 (staging deployment + traffic shift) / E3 (telemetry verification under load) / E4 (production flip) / Trustfile [CLOUDFLARE_EDGE_SECURITY].rate_limiting.tier_2_gateway.status PENDING→DEPLOYED — ALL gated on Phase D-3 (regression alert armed) + D-4 (real baseline numbers populated)." },
4 changes: 4 additions & 0 deletions CONTRIBUTING.md
@@ -45,6 +45,10 @@ We follow [Conventional Commits](https://www.conventionalcommits.org/):

Types: `feat`, `fix`, `docs`, `test`, `refactor`, `ci`, `chore`, `security`

### CI / Required Checks

Required status-check workflows must **always report**. Never add `on.*.paths` to a required workflow — a path-filtered required check that doesn't trigger is reported as permanently "Expected" and blocks the PR even when everything is green. Use the estate pattern: an always-run `changes` job plus heavy jobs gated by `if: needs.changes.outputs.run == 'true'` (a job skipped via `if:` passes the required check). Full rationale in `docs/AI-CONVENTIONS.adoc` §"CI / Required Status Checks" and the `docs/wikis/CI-and-Required-Checks.adoc` wiki page.

## Reporting Bugs

Before reporting:
31 changes: 30 additions & 1 deletion docs/AI-CONVENTIONS.adoc
@@ -80,7 +80,7 @@ MAINTENANCE-CHECKLIST.a2ml, or SOFTWARE-DEVELOPMENT-APPROACH.a2ml in the reposit
| Use Instead

| TypeScript
-| ReScript
+| AffineScript

| Node.js / npm / bun
| Deno
@@ -109,6 +109,35 @@ MAINTENANCE-CHECKLIST.a2ml, or SOFTWARE-DEVELOPMENT-APPROACH.a2ml in the reposit

Use `just` (Justfile) for all build, test, lint, and format tasks.

== CI / Required Status Checks

Required status checks MUST always report. A workflow that is workflow-level
path-filtered (`on.<event>.paths`) and is also a *required* check sits forever
as "Expected" on any PR that touches none of its paths -- leaving the PR
`mergeable_state: blocked` even when every other check is green and the PR is
approved. (This stranded #213 and #215 until #216.)

Rule for any required gate:

. Do *not* put `on.*.paths` on the workflow -- let it always trigger, so the
required check is always created.
. Add a lightweight always-run `changes` job that recomputes the gate's
relevant path set with `git diff origin/<base>...HEAD`.
. Gate every heavy job with `needs: changes` and
`if: needs.changes.outputs.run == 'true'`.

A job skipped via `if:` reports *success* to required checks, so the context is
always satisfied; the heavy work still runs in full when relevant files change.
The detector MUST fail safe -- default `run=true`, and set `run=false` only on a
successful diff that shows nothing relevant changed (any fetch/diff error =>
run). Job/check *names* must not change, or branch-protection's required-context
list breaks.

Reference implementations: every gate in `.github/workflows/`
(`proofs.yml`, `zig-test.yml`, `e2e.yml`, `abi-drift.yml`,
`backend-assurance.yml`, `lsp-dap-bsp.yml`, `truthfulness.yml`). Full rationale:
the link:wikis/CI-and-Required-Checks[CI & Required Checks] wiki page.

== References

* `0-AI-MANIFEST.a2ml` -- universal AI entry point