Skip to content

chore(proof-surface): output integrity and adversarial parity#72

Merged
Amosk21 merged 5 commits into
mainfrom
chore/proof-surface-honesty
May 21, 2026
Merged

chore(proof-surface): output integrity and adversarial parity#72
Amosk21 merged 5 commits into
mainfrom
chore/proof-surface-honesty

Conversation

@Amosk21

@Amosk21 Amosk21 commented May 21, 2026

Copy link
Copy Markdown
Owner

Verdict

This PR did not drift into new production modeling commitments. It tightens the proof surface, separates graph-backed claims from run metadata, aligns display output with the OWL gate, and makes adversarial owl:equivalentClass coverage symmetric across the two modeled Annex III categories.

Commit 1: citation anchors for verification carve-out

2b52861 fix(citations): correct EU AI Act anchors for verification carve-out (#70)

Three citation / prose-fidelity fixes against Regulation (EU) 2024/1689. No code logic, schema, production axiom, class, or property changes. Classification behavior is unchanged. The kiosk rdfs:comment is surfaced by select_system_comment.sparql, so the certificate prose-scope text reflects the corrected wording.

Fixed:

  • Recital 22 miscitation: verification-vs-identification framing now anchors to Recital 15, Recital 17, and Annex III item 1(a).
  • Article 3(36) intended-use misattribution: Article 3(36) supplies the technical 1:1 verification definition; the intended-use framing comes from the recitals and Annex III clause.
  • Article 3(42)/(43) noun translation: the regulation defines these in terms of systems; ARCO models them as process classes because classification reasons over the deployment-and-operation occurrence.

Files: ARCO_instances_verification.ttl, select_system_comment.sparql, run_pipeline.py comment blocks, LIMITATIONS.md, docs/MODELING_ROADMAP.md, docs/kiosk_demo_v1/kiosk_demo.md.

Stats: 18 insertions, 15 deletions.

Commit 2: output provenance split + Gate 3 display match

295d003 fix(output): separate scope metadata from graph-backed entailment

Changes:

  • Article 6(3) provenance split: emitted Annex III status is now pure graph-backed VERIFIED (ENTAILED) or NOT APPLICABLE; Article 6(3) non-evaluation is surfaced separately as run-scope metadata. summary.json now emits derogation_evaluation_scope, matching the manifest field the pipeline previously declared but did not emit.
  • Gate 3 display match: gate3_designates_expected_role(gate_evidence) now requires the bound role to equal :NaturalPersonRole, replacing the weaker bool(uss_uri) display check. The helper is used for both HTML and determination_packet.json.

Files: run_pipeline.py, test_output_provenance.py, LIMITATIONS.md, README.md, .gitignore.

Stats: 88 insertions, 37 deletions.

Commit 3: 5(b) adversarial equivalentClass parity

aa1c8d1 test(adversarial): add 5(b) equivalentClass decoy parity

Adds a 5(b) adversarial fixture parallel to the existing 1(a) decoy.

New fixture: ARCO_instances_adversarial_decoy_5b.ttl.

  • Declares test-only alias class :WeirdCalculator owl:equivalentClass :CreditworthinessEvaluationCapability.
  • Uses a minimal three-gate fixture: system, module, disposition, one Gate 2 process token, one IUS, and one USS.
  • Adds no provider organization, assessment documentation, compliance obligation, or determination ICE.
  • Keeps alias-path IRIs and labels free of Credit/Score/Evaluation/Assessment/5b vocabulary; the alias-class comment necessarily names the regulated class because that is what the owl:equivalentClass axiom documents.

New test: test_credit_decoy_classifies_via_equivalent_class() in test_adversarial_mechanism.py.

The test verifies:

  • Pre-reasoning system is not already :AnnexIII5bApplicableSystem.
  • Pre-reasoning disposition is typed as :WeirdCalculator.
  • Pre-reasoning disposition is not directly typed as :CreditworthinessEvaluationCapability.
  • Post-reasoning disposition entails as :CreditworthinessEvaluationCapability.
  • Post-reasoning system entails as :AnnexIII5bApplicableSystem.
  • Post-reasoning system does not entail as :AnnexIII1aApplicableSystem.

Docs updated: README and LIMITATIONS now describe the three adversarial cases: 1(a) decoy, 5(b) decoy, and blank-node ghost.

Stats: 4 files changed, 1 new file, 168 insertions, 4 deletions.

Commit 4: README adversarial-coverage precision

db149eb docs(readme): tighten adversarial-coverage claim

One-line overclaim correction:

  • Old: the regulated class IRI never appears in the input data
  • New: the regulated class is not asserted as the disposition's type

Reason: the owl:equivalentClass declaration in each decoy fixture necessarily names the regulated class IRI. The narrower claim is the actual proof target: the disposition is not directly typed as the regulated class; the reasoner reaches that classification through class-equivalence propagation.

Stats: 1 insertion, 1 deletion.

Aggregate

  • Commits: 4
  • Files changed: 11 total, including 10 existing files modified and 1 new fixture file
  • Insertions / deletions: 274 / 56
  • Core/governance ontology axioms changed: 0
  • Test-only fixture axiom added: 1 owl:equivalentClass
  • New production/core domain classes or properties: 0
  • Fixture-scoped test alias class added: :WeirdCalculator
  • SHACL shapes changed: 0
  • SPARQL queries added: 0
  • Output payload change: summary.json now emits derogation_evaluation_scope
  • Previously failing-by-design test now passing: test_output_provenance.py

Validation

Run locally on chore/proof-surface-honesty:

  • python 03_TECHNICAL_CORE/scripts/test_output_provenance.py
  • python 03_TECHNICAL_CORE/scripts/test_gate_removal.py
  • python 03_TECHNICAL_CORE/scripts/test_adversarial_mechanism.py
  • python 03_TECHNICAL_CORE/scripts/test_scenarios.py

All passed.

Not included

Intentionally outside this PR:

  • Local backtest synthesis under runs/.
  • Local OPEN_PROBLEMS.md register edits.
  • L3.8 audit-hardening implementation work.
  • Kiosk proof table.
  • L2.6 / L2.7 / L2.8 / L2.9 foundation-modeling decisions.
  • Untracked docs/plans/ artifacts.

Amosk21 and others added 5 commits May 20, 2026 21:45
…70)

Two regulatory citation defects across 6 tracked files (11 specific text
edits). No schema, axiom, class, or property changes. Pipeline behavior
byte-identical: ALL CHECKS PASSED, exit 0.

Defect 1 (Recital 22 miscitation, 7 locations):
Recital 22 of Regulation (EU) 2024/1689 governs extraterritorial scope
(third-country operators), not the biometric verification carve-out. The
correct anchor chain is Recital 15 (definition + carve-out), Recital 17
(RBI-specific carve-out + rationale), and Annex III item 1(a) operative
clause. Fixed in:
  - 03_TECHNICAL_CORE/ontology/ARCO_instances_verification.ttl (lines
    18, 27): kiosk fixture rdfs:comment surfaced via
    select_system_comment.sparql into the negative-case certificate
    panel
  - 03_TECHNICAL_CORE/scripts/run_pipeline.py (lines 445, 1243):
    docstring + inline comment describing the design pattern
  - 03_TECHNICAL_CORE/reasoning/select_system_comment.sparql (line 6):
    SPARQL header comment
  - docs/MODELING_ROADMAP.md (line 15): public-facing modeling narrative

Defect 2 (Article 3(36) "intended to be used" misattribution, 4
locations in tracked files):
Article 3(36) is the technical 1:1 verification definition; it does not
contain the phrase "intended to be used." That framing appears verbatim
in Recital 15, Recital 17, and Annex III item 1(a). Fixed in:
  - docs/MODELING_ROADMAP.md (line 68)
  - docs/kiosk_demo_v1/kiosk_demo.md (lines 65, 172)
  - LIMITATIONS.md (line 114)

Defect 3 (Article 3(43) paraphrase fidelity, 1 location):
Article 3(43) verbatim uses "system" (BFO Bucket 1) where ARCO's
:PostRemoteBiometricIdentificationProcess class uses "process" (BFO
Bucket 4). Light fix in LIMITATIONS.md:141 adds the verbatim regulatory
text and explicitly flags the "process" framing as ARCO's modeling
translation. The corresponding TTL skos:definition at
ARCO_governance_extension.ttl:336 deliberately not modified: the
existing rdfs:comment already discloses the unused-stub status and
regulatory paraphrase context, so adding a redundant flag in the
definition would violate Adequatism.

Verification:
  - python 03_TECHNICAL_CORE/scripts/run_pipeline.py returns ALL CHECKS
    PASSED, exit 0
  - grep "Recital 22" across the technical core returns zero hits
  - grep 'Article 3(36).{0,40}intended to be used' returns zero hits
  - All 14 fixes verified by per-fix backtest agents (deploy + verify
    pattern)

Deferred:
  - Class-naming question (:PostRBI Process vs :PostRBI System for BFO
    Bucket 4 vs Bucket 1) tracked separately
  - /intake of the EU AI Act source into KB pending separate
    authorization
  - Output structure work (executive summary restructure, row-level
    audit-table interpretations, RAG colors, audit-trail anchor) is the
    separate next milestone

Revert: each fix is a localized text edit in an annotation property,
code comment, or prose paragraph. No cascading effects expected.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Article 6(3) derogation scope qualifier no longer concatenated into
`VERIFIED (ENTAILED)` literals (certificate text, summary print,
summary.json, HTML badge, cert_lines builder). Scope surfaces as a
separate field everywhere it appeared:
- `derogation_evaluation_scope` object in summary.json
- separate `ARTICLE 6(3) DEROGATION: NOT EVALUATED (run scope)` line
  in certificate.txt
- pre-existing `derogation_scope_badge` HTML node now the canonical
  HTML disclosure surface

Gate 3 display now requires the USS to designate `:NaturalPersonRole`
(not just USS existence). Shared `gate3_designates_expected_role`
helper applied to both HTML view and determination packet status,
closing the display-weaker-than-OWL-axiom gap.

`test_output_provenance.py` forbidden-pattern check passes 0/0. README
and LIMITATIONS updated to remove stale "failing-by-design" language
and "LIVE" markers on the now-closed Gate 3 and Article 6(3) defects.

Closes OPEN_PROBLEMS L3.4 (Gate 3 truth-surface) and the Article 6(3)
mixed-provenance forbidden-pattern.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
New fixture `ARCO_instances_adversarial_decoy_5b.ttl` declares
`:WeirdCalculator owl:equivalentClass :CreditworthinessEvaluationCapability`,
types `:WeirdCalc_Disposition` only as the alias class, and asserts the
three-gate participants (IUS prescribing a `:CreditworthinessEvaluationProcess`
token, USS designating `:NaturalPersonRole`). Adversarial-purity
discipline matches the 1(a) decoy: no provider organisation, no
assessment documentation, no obligation, no determination ICE — only
what the gate axiom needs to fire.

New `test_credit_decoy_classifies_via_equivalent_class` in
`test_adversarial_mechanism.py` verifies five assertions:
  1. pre-reasoning disposition typed as alias only
  2. pre-reasoning disposition NOT typed as :CreditworthinessEvaluationCapability
  3. post-reasoning disposition IS typed as :CreditworthinessEvaluationCapability
  4. post-reasoning system entails :AnnexIII5bApplicableSystem
  5. post-reasoning system does NOT entail :AnnexIII1aApplicableSystem
     (cross-category isolation preserved)

Alias path's IRIs and labels avoid Credit / Score / Assessment / Evaluation
/ 5b vocabulary so a grep/label-matching reader sees no regulated-class
hint in the disposition, module, or system names. The `rdfs:comment` on
the alias class necessarily names :CreditworthinessEvaluationCapability
(that is what `owl:equivalentClass` documents).

README and LIMITATIONS adversarial-coverage descriptions updated to
reflect two equivalentClass decoys (1(a) + 5(b)) plus the blank-node ghost.

Closes OPEN_PROBLEMS L3.7 (5(b) adversarial equivalentClass parity).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
"the regulated class IRI never appears in the input data" is not
literally true — the regulated class IRI does appear in the
`owl:equivalentClass` declaration on the alias class. The safer
claim is that the regulated class is not asserted as the
disposition's type. The equivalentClass declaration is class-level,
not instance-level, so the reasoner reaches the disposition's
classification via class-equivalence propagation, not via direct
type assertion. That distinction is what the test exercises.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Adds `ARCO_instances_adversarial_decoy_5b.ttl` (system
`:WeirdCalcSystem_001`) to the HermiT-vs-OWL-RL cross-check matrix
so the new 5(b) equivalentClass decoy is verified under both
reasoners before merge, matching the 1(a) decoy's existing coverage.

The fixture uses named individuals only (no blank-node disposition)
so it does not hit the GhostSystem exclusion path documented in
hermit_cross_check.py's module docstring.

Three loci touched:
- 03_TECHNICAL_CORE/scripts/hermit_cross_check.py: CERTIFICATE_GRADE_SCENARIOS
- .github/workflows/robot-validate.yml: matrix strategy fixture list
- LIMITATIONS.md §7.4: HermiT cross-check inventory prose

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@Amosk21 Amosk21 marked this pull request as ready for review May 21, 2026 12:01

@greptile-apps greptile-apps Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Your free trial has ended. If you'd like to continue receiving code reviews, you can add a payment method here.

@Amosk21 Amosk21 merged commit 4f0c3da into main May 21, 2026
10 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant