chore: bring regulatory-conditions-to-governance up to main #71
Closed
Amosk21 wants to merge 7 commits into
… to governance extension
Deduplicates universal regulatory content out of two per-fixture instance
files into ARCO_governance_extension.ttl. Closes regulatory_alignment FAIL
and traceability FAIL on the Adversarial Decoy and Blanknode Ghost fixtures
by making the regulatory condition declarations visible to every fixture
that imports the governance extension.
Why
Both audit queries (check_regulatory_alignment.sparql,
check_assessment_traceability.sparql) require the condition to be typed
:RegulatoryContent
in the merged graph. The conditions were declared only in
ARCO_instances_sentinel.ttl (1(a)) and ARCO_instances_creditscoring.ttl
(5(b)). Fixtures that don't import either file (Decoy, Ghost) referenced
the conditions via iao:0000136 but the type assertion wasn't present, so
both audit queries returned FAIL for fixture-distribution reasons, not
fixture-semantics reasons.
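The fixture-distribution failure described above can be sketched with a toy model in which each file is a plain set of triples. This is only an illustration under stated assumptions: the file contents, the `:Decoy_AssessmentDoc` name, and the `regulatory_alignment` function are hypothetical stand-ins, not the repository's TTL or SPARQL.

```python
# Toy model: each "file" is a set of (subject, predicate, object) triples.
# File contents here are illustrative, not copied from the repository.
RDF_TYPE = "rdf:type"
IS_ABOUT = "iao:0000136"

# Before the move: the shared extension carries no condition declarations.
governance_extension = set()
sentinel_instances = {
    (":AnnexIII_Condition_1a", RDF_TYPE, ":RegulatoryContent"),
}
decoy_fixture = {
    # Hypothetical document name: the reference exists, the type assertion does not.
    (":Decoy_AssessmentDoc", IS_ABOUT, ":AnnexIII_Condition_1a"),
}


def regulatory_alignment(merged):
    """PASS if some iao:0000136 target is typed :RegulatoryContent in the merged graph."""
    targets = {o for (_, p, o) in merged if p == IS_ABOUT}
    ok = any((t, RDF_TYPE, ":RegulatoryContent") in merged for t in targets)
    return "PASS" if ok else "FAIL"


# The decoy imports only the governance extension, never the Sentinel instance file.
before = regulatory_alignment(decoy_fixture | governance_extension)

# After the move, the declaration lives in the file every fixture imports.
governance_extension_after = governance_extension | sentinel_instances
after = regulatory_alignment(decoy_fixture | governance_extension_after)

print(before, "->", after)
```

The fixture itself is unchanged between the two runs; only the location of the type assertion differs, which is why the original FAIL says nothing about fixture semantics.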
What changes
- ARCO_governance_extension.ttl: new section "3a) REGULATORY CONTENT" adds
:AnnexIII_List, :AnnexIII_Condition_1a, :AnnexIII_Condition_5b with the
same triples previously in the two instance files. Triples preserved
verbatim (rdfs:label, rdfs:comment, cco:prescribes, iao:0000136 targets).
- ARCO_instances_sentinel.ttl: removes 10 lines (the declaration block);
section header replaced with a brief migration comment. The three
references via iao:0000136 :AnnexIII_Condition_1a are preserved.
- ARCO_instances_creditscoring.ttl: removes 13 lines (the declaration
block + the self-containedness comment); section header replaced with
a brief migration comment. The three references via iao:0000136
:AnnexIII_Condition_5b are preserved.
Tests (all 7 fixtures, pre vs post pipeline diff)
- Sentinel, CreditScorer, VerificationKiosk: identical except entailed-
triples count (+34 to +89 from the additional universal regulatory
content now visible to every fixture).
- DecoySystem_001: regulatory_alignment FAIL -> PASS; traceability
FAIL -> PASS (closes the documented goal).
- GhostSystem_001: regulatory_alignment FAIL -> PASS; traceability
FAIL -> PASS; all_checks_passed false -> true.
- FlagTest_BiometricSystem_WithDerogationClaim and
FlagTest_CreditSystem_WithFraudProcess: no audit-row flip. Their
:AssessmentDocumentation instances do not link to any regulatory
condition via iao:0000136 in the source TTL, so the audit query's
AssessmentDoc -> condition path is empty independent of where the
condition is declared. The plan predicted these fixtures would flip;
the actual cause is a separate fixture-authoring gap in
ARCO_instances_flag_tests.ttl lines 90-92, 155-157. Closing that
is a separate fixture edit outside this PR's scope.
- Regression: test_gate_removal.py PASS; test_scenarios.py PASS (all
7 scenarios); test_kiosk_html_no_false_concretization.py PASS;
test_output_provenance.py 1 failure (unchanged baseline).
- HermiT vs OWL-RL cross-check: agree on every (fixture, system,
query) tuple in the certificate-grade set.
- SHACL conforms PASS on every fixture (unchanged).
- No classification flip on any fixture; no SHACL change; no other
audit-row change.
Downstream consumer audit
Grep across 03_TECHNICAL_CORE/, docs/, mcp/, .github/ for every reader
of :AnnexIII_Condition_1a / _5b / _List / :RegulatoryContent. All
reference sites either load ARCO_governance_extension.ttl (via every
pipeline / test / cross-check loader) or are documentation mentions of
the IRI itself. No consumer depends on the conditions being declared
in a specific instance file.
Deferred
- :AnnexIII_Condition_1a_Exclusion in ARCO_instances_verification.ttl
is a different class (verification-kiosk exclusion documentation per
Recital 22 / Art 3(41)). Whether to also generalize the exclusion
pattern is a separate future decision.
- FlagTest fixtures' AssessmentDocs do not link to any regulatory
condition; closing their regulatory_alignment FAIL is a separate
fixture-authoring change.
Revert
git revert HEAD
…ry-content migration
Three stale-doc fixes tied to the 2026-05-14 governance-extension move:
- ARCO_instances_flag_tests.ttl header: replaces the pre-migration text
("classification PASS but audit FAIL ... minimal instances for flag testing
only, without full regulatory content linkage") with the actual post-migration
state. Classification and exception flag remain the test target; traceability
and regulatory_alignment still FAIL but for a different reason now (local
:AssessmentDocumentation -> :AnnexIII_Condition_* iao:0000136 link absent
from this fixture, not fixture-distribution).
- LIMITATIONS.md sec 9: file reference for the
:AnnexIII_Condition_1a cco:prescribes :RemoteBiometricIdentificationProcess
class-as-individual triple updated from ARCO_instances_sentinel.ttl to
ARCO_governance_extension.ttl per the migration. Adds the 5(b) companion
triple. Also notes that gate-removal coverage is now symmetric and adversarial-
mechanism tests exist (next commit).
- README.md "Gate independence is empirically verified" sentence: drops the
"(Symmetric coverage for 5(b) is queued.)" parenthetical; corresponding row in
the active-changes table moves from "Active work" to "Landed 2026-05-14".
No pipeline behavior change.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…ism assertions
Two coverage gaps closed against README claims:
- test_gate_removal.py: parameterized over both modeled Annex III
  categories. CATEGORY_1A (Sentinel) preserves the original 7 tests (5 gate
  removals + 2 content mutations) verbatim by triple; CATEGORY_5B
  (CreditScorer) adds the symmetric 7 against AnnexIII5bApplicableSystem.
  README "Gate independence is empirically verified" previously disclosed
  the 5(b) gap as queued; closes that.
- test_adversarial_mechanism.py (new): asserts that DecoySystem_001's Gate 1
  entailment routes through owl:equivalentClass propagation (the disposition
  is typed only as :WeirdScanner pre-reasoning;
  :BiometricIdentificationCapability is absent from the asserted triples and
  entailed post-reasoning), and that GhostSystem_001's disposition is a blank
  node (no named individual) that still satisfies owl:someValuesFrom.
  test_scenarios.py asserts the entailment fires; this test asserts HOW.
- .github/workflows/arco-smoke-test.yml and arco-demo.yml: both workflows
  run the new test alongside the existing three regression tests.
Pipeline behavior unchanged. test_output_provenance.py failure count
unchanged at 1 (baseline).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
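The gate-removal idea in the commit above (drop one required triple at a time and check that classification is lost, once per category) might look roughly like the following. This is a minimal sketch with a toy classifier: the gate names, the three-triple layout, and the `classify` function are assumptions for illustration and do not reproduce test_gate_removal.py or the OWL axioms.

```python
# Hypothetical gate triples; the real fixtures and axioms are richer than this.
GATES = {
    "capability": (":Sys", ":hasCapability", ":Cap"),
    "intended_use": (":Sys", ":hasIntendedUse", ":IUS"),
    "use_scenario": (":Sys", ":hasUseScenario", ":USS"),
}

# Category labels and target class names are taken from the commit message.
CATEGORIES = {
    "CATEGORY_1A": ":AnnexIII1aApplicableSystem",
    "CATEGORY_5B": ":AnnexIII5bApplicableSystem",
}


def classify(triples, target_class):
    """Toy stand-in for the reasoner: every gate triple must be present."""
    if all(t in triples for t in GATES.values()):
        return target_class
    return None


def gate_removal_results(triples, target_class):
    """Remove one gate at a time and record whether classification survives."""
    return {
        name: classify(triples - {t}, target_class) == target_class
        for name, t in GATES.items()
    }


full = set(GATES.values())
results = {
    category: gate_removal_results(full, target)
    for category, target in CATEGORIES.items()
}
print(results)
```

Each gate is independent in this toy model if removing it alone is enough to lose the classification, so every value in `results` should be false while the unmodified graph still classifies.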
…70)
Two regulatory citation defects across 6 tracked files (11 specific text
edits). No schema, axiom, class, or property changes. Pipeline behavior
byte-identical: ALL CHECKS PASSED, exit 0.
Defect 1 (Recital 22 miscitation, 7 locations): Recital 22 of Regulation
(EU) 2024/1689 governs extraterritorial scope (third-country operators), not
the biometric verification carve-out. The correct anchor chain is Recital 15
(definition + carve-out), Recital 17 (RBI-specific carve-out + rationale),
and Annex III item 1(a) operative clause. Fixed in:
- 03_TECHNICAL_CORE/ontology/ARCO_instances_verification.ttl (lines 18, 27):
  kiosk fixture rdfs:comment surfaced via select_system_comment.sparql into
  the negative-case certificate panel
- 03_TECHNICAL_CORE/scripts/run_pipeline.py (lines 445, 1243): docstring +
  inline comment describing the design pattern
- 03_TECHNICAL_CORE/reasoning/select_system_comment.sparql (line 6): SPARQL
  header comment
- docs/MODELING_ROADMAP.md (line 15): public-facing modeling narrative
Defect 2 (Article 3(36) "intended to be used" misattribution, 4 locations in
tracked files): Article 3(36) is the technical 1:1 verification definition;
it does not contain the phrase "intended to be used." That framing appears
verbatim in Recital 15, Recital 17, and Annex III item 1(a). Fixed in:
- docs/MODELING_ROADMAP.md (line 68)
- docs/kiosk_demo_v1/kiosk_demo.md (lines 65, 172)
- LIMITATIONS.md (line 114)
Defect 3 (Article 3(43) paraphrase fidelity, 1 location): Article 3(43)
verbatim uses "system" (BFO Bucket 1) where ARCO's
:PostRemoteBiometricIdentificationProcess class uses "process" (BFO Bucket
4). Light fix in LIMITATIONS.md:141 adds the verbatim regulatory text and
explicitly flags the "process" framing as ARCO's modeling translation. The
corresponding TTL skos:definition at ARCO_governance_extension.ttl:336
deliberately not modified: the existing rdfs:comment already discloses the
unused-stub status and regulatory paraphrase context, so adding a redundant
flag in the definition would violate Adequatism.
Verification:
- python 03_TECHNICAL_CORE/scripts/run_pipeline.py returns ALL CHECKS
  PASSED, exit 0
- grep "Recital 22" across the technical core returns zero hits
- grep 'Article 3(36).{0,40}intended to be used' returns zero hits
- All 14 fixes verified by per-fix backtest agents (deploy + verify pattern)
Deferred:
- Class-naming question (:PostRBI Process vs :PostRBI System for BFO Bucket
  4 vs Bucket 1) tracked separately
- /intake of the EU AI Act source into KB pending separate authorization
- Output structure work (executive summary restructure, row-level
  audit-table interpretations, RAG colors, audit-trail anchor) is the
  separate next milestone
Revert: each fix is a localized text edit in an annotation property, code
comment, or prose paragraph. No cascading effects expected.
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
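The two zero-hit checks listed under Verification could be approximated in Python as below. This is a sketch, not the command actually run: `citation_hits` and both sample strings are invented for illustration. In Python's `re` syntax the parentheses in "3(36)" have to be escaped, because an unescaped `(36)` is a capture group that would match "Article 336" rather than the literal citation.

```python
import re

RECITAL_22 = re.compile(r"Recital 22")
# Parentheses are escaped so "(36)" matches literally rather than as a group.
ART_3_36 = re.compile(r"Article 3\(36\).{0,40}intended to be used")


def citation_hits(text):
    """Return (line_number, line) pairs that still carry either defect pattern."""
    hits = []
    for number, line in enumerate(text.splitlines(), start=1):
        if RECITAL_22.search(line) or ART_3_36.search(line):
            hits.append((number, line.strip()))
    return hits


# Invented sample text, not quoted from the repository.
sample_before = (
    "Verification carve-out per Recital 22.\n"
    "Article 3(36) covers systems intended to be used for verification."
)
sample_after = (
    "Verification carve-out per Recital 15 and Recital 17.\n"
    "Article 3(36) is the technical 1:1 verification definition."
)

print(len(citation_hits(sample_before)), len(citation_hits(sample_after)))
```

A zero-hit result is only meaningful if the pattern can match the defect in the first place, so running the check once against a known-bad sample is a cheap sanity step.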
Article 6(3) derogation scope qualifier no longer concatenated into
`VERIFIED (ENTAILED)` literals (certificate text, summary print,
summary.json, HTML badge, cert_lines builder). Scope surfaces as a separate
field everywhere it appeared:
- `derogation_evaluation_scope` object in summary.json
- separate `ARTICLE 6(3) DEROGATION: NOT EVALUATED (run scope)` line in
  certificate.txt
- pre-existing `derogation_scope_badge` HTML node now the canonical HTML
  disclosure surface
Gate 3 display now requires the USS to designate `:NaturalPersonRole` (not
just USS existence). Shared `gate3_designates_expected_role` helper applied
to both HTML view and determination packet status, closing the
display-weaker-than-OWL-axiom gap.
`test_output_provenance.py` forbidden-pattern check passes 0/0. README and
LIMITATIONS updated to remove stale "failing-by-design" language and "LIVE"
markers on the now-closed Gate 3 and Article 6(3) defects.
Closes OPEN_PROBLEMS L3.4 (Gate 3 truth-surface) and the Article 6(3)
mixed-provenance forbidden-pattern.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
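The separation described in this commit (status literal kept clean, scope carried in its own field, Gate 3 checking the designated role rather than mere USS existence) can be sketched as follows. Only `derogation_evaluation_scope`, `gate3_designates_expected_role`, the `VERIFIED (ENTAILED)` literal, and `:NaturalPersonRole` come from the commit message; the function signatures, the other field names, and the `NOT ENTAILED` string are assumptions.

```python
import json

VERIFIED = "VERIFIED (ENTAILED)"


def build_summary(entailed, derogation_evaluated):
    """Keep the status literal free of scope text; scope gets its own object."""
    # Keys other than derogation_evaluation_scope are hypothetical.
    return {
        "status": VERIFIED if entailed else "NOT ENTAILED",
        "derogation_evaluation_scope": {
            "article": "6(3)",
            "evaluated": derogation_evaluated,
        },
    }


def gate3_designates_expected_role(designated_roles, expected=":NaturalPersonRole"):
    """Toy version of the shared helper: USS existence alone is not enough."""
    return expected in designated_roles


summary = build_summary(entailed=True, derogation_evaluated=False)
print(json.dumps(summary, indent=2))
print(gate3_designates_expected_role(set()))
print(gate3_designates_expected_role({":NaturalPersonRole"}))
```

Keeping the scope in a separate object means a consumer comparing `status` against the exact literal never has to parse a qualifier out of it.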
New fixture `ARCO_instances_adversarial_decoy_5b.ttl` declares
`:WeirdCalculator owl:equivalentClass :CreditworthinessEvaluationCapability`,
types `:WeirdCalc_Disposition` only as the alias class, and asserts the
three-gate participants (IUS prescribing a `:CreditworthinessEvaluationProcess`
token, USS designating `:NaturalPersonRole`). Adversarial-purity
discipline matches the 1(a) decoy: no provider organisation, no
assessment documentation, no obligation, no determination ICE — only
what the gate axiom needs to fire.
New `test_credit_decoy_classifies_via_equivalent_class` in
`test_adversarial_mechanism.py` verifies five assertions:
1. pre-reasoning disposition typed as alias only
2. pre-reasoning disposition NOT typed as :CreditworthinessEvaluationCapability
3. post-reasoning disposition IS typed as :CreditworthinessEvaluationCapability
4. post-reasoning system entails :AnnexIII5bApplicableSystem
5. post-reasoning system does NOT entail :AnnexIII1aApplicableSystem
(cross-category isolation preserved)
Alias path's IRIs and labels avoid Credit / Score / Assessment / Evaluation
/ 5b vocabulary so a grep/label-matching reader sees no regulated-class
hint in the disposition, module, or system names. The `rdfs:comment` on
the alias class necessarily names :CreditworthinessEvaluationCapability
(that is what `owl:equivalentClass` documents).
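The first three assertions above can be illustrated with a hand-rolled propagation step over plain tuples. This is a toy sketch, not the HermiT or OWL-RL reasoner the pipeline uses: `propagate_equivalence` is a hypothetical helper that implements only the `owl:equivalentClass` typing rule, and the gate axiom behind assertions 4 and 5 is not modeled.

```python
RDF_TYPE = "rdf:type"
EQUIV = "owl:equivalentClass"
REGULATED = ":CreditworthinessEvaluationCapability"

asserted = {
    # Class-level declaration: the regulated IRI appears here, and only here.
    (":WeirdCalculator", EQUIV, REGULATED),
    # Instance-level assertion: the disposition is typed as the alias only.
    (":WeirdCalc_Disposition", RDF_TYPE, ":WeirdCalculator"),
}


def propagate_equivalence(triples):
    """Add rdf:type triples implied by owl:equivalentClass, in both directions."""
    inferred = set(triples)
    changed = True
    while changed:
        changed = False
        pairs = [(s, o) for (s, p, o) in inferred if p == EQUIV]
        for a, b in pairs:
            for source, target in ((a, b), (b, a)):
                for (individual, p, cls) in list(inferred):
                    if p == RDF_TYPE and cls == source:
                        new = (individual, RDF_TYPE, target)
                        if new not in inferred:
                            inferred.add(new)
                            changed = True
    return inferred


entailed = propagate_equivalence(asserted)

typed_as_alias = (":WeirdCalc_Disposition", RDF_TYPE, ":WeirdCalculator") in asserted
typed_pre = (":WeirdCalc_Disposition", RDF_TYPE, REGULATED) in asserted
typed_post = (":WeirdCalc_Disposition", RDF_TYPE, REGULATED) in entailed
print(typed_as_alias, typed_pre, typed_post)
```

The regulated class reaches the disposition only through the class-level equivalence, never through a direct type assertion.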
README and LIMITATIONS adversarial-coverage descriptions updated to
reflect two equivalentClass decoys (1(a) + 5(b)) plus the blank-node ghost.
Closes OPEN_PROBLEMS L3.7 (5(b) adversarial equivalentClass parity).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
"the regulated class IRI never appears in the input data" is not literally true — the regulated class IRI does appear in the `owl:equivalentClass` declaration on the alias class. The safer claim is that the regulated class is not asserted as the disposition's type. The equivalentClass declaration is class-level, not instance-level, so the reasoner reaches the disposition's classification via class-equivalence propagation, not via direct type assertion. That distinction is what the test exercises. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Summary
Brings 4 commits from `chore/regulatory-conditions-to-governance` into `main`. Each was individually reviewed and tested as a sub-PR or supporting commit on the parent branch.
Test plan
Local backtest against all 7 fixtures and 4 pipeline test suites:
… `main` pre-merge)
Files affected
Deferred
None. All sub-PRs were independently merged into the parent branch.
Revert
Revert the merge commit on `main` to restore the pre-merge state. Sub-PRs remain on the parent branch for individual revert if needed.