From db6eb5ec3f4b42ae6107dce1f4e23a96dfd43dea Mon Sep 17 00:00:00 2001
From: Anne Fouilloux
Date: Sun, 31 May 2026 15:20:25 +0200
Subject: [PATCH] Fix upstream-node handling in nanopub chain skills

The /np/constellation walk terminates at the Claim when the Claim->AIDA
link is a shared AIDA-statement IRI (asAidaStatement -> purl.org/aida/...)
rather than a nanopub-to-nanopub reference, so AIDA (step 2) and the
upstream Quote/PICO/PCC (step 1) can both be absent from steps[].
Document this in both skills and stop treating it as a chain-integrity
failure.

verify-chain: the old upstream-reachability check used the
.../sciencelive/np/... URI form, which redirects to the HTML viewer and
returns HTTP 200 on the SPA shell -- so a status-only check passed even
for non-existent nanopubs. Switch to the bare w3id.org/np/ resolver form
(which serves TriG) and assert the body is TriG, not HTML.

import-from-nanopub: same prefix bug in the optional archival loop was
silently saving HTML viewer pages as .trig files; swap the prefix there
too. Add an "upstream terminus" note explaining the constellation may
not reach AIDA/Quote and how to recover them via direct TriG fetch.

API-side fix (bridge the asAidaStatement IRI to its asserting nanopub in
discoverNeighbours) is tracked separately in science-live-platform.
---
 .claude/skills/import-from-nanopub/SKILL.md |  7 ++++++-
 .claude/skills/verify-chain/SKILL.md        | 17 +++++++++++------
 2 files changed, 17 insertions(+), 7 deletions(-)

diff --git a/.claude/skills/import-from-nanopub/SKILL.md b/.claude/skills/import-from-nanopub/SKILL.md
index 4d457a3..2f13fcb 100644
--- a/.claude/skills/import-from-nanopub/SKILL.md
+++ b/.claude/skills/import-from-nanopub/SKILL.md
@@ -81,6 +81,8 @@ The 200 JSON response is the constellation. Top-level keys:
 - `chains[]` — array of `{id, outcomeUri, outcomeVerdict, outcomeConfidence, citoRelations[], steps[]}`
 - `chains[].steps[]` — array of `{step, uri, …}` where `step` is `"AIDA"`, `"Claim"`, `"Study"`, `"Outcome"`, or `"CiTO"` and each step type carries its substantive prose fields inline (Study has `scope`, `method`, `deviations`; Outcome has `label`, `verdict`, `confidence`, `conclusion`, `evidence`, `limitations`, `repository`; CiTO has `relations[]`, `targets[]`)
 
+**Upstream terminus — the constellation may not reach AIDA or Quote.** The Claim→AIDA link is a shared AIDA-statement IRI (`asAidaStatement → http://purl.org/aida/`), not a nanopub-to-nanopub reference, and the `/np/constellation` walk follows only nanopub references. For some chain shapes it therefore terminates at the **Claim**: `steps[]` has no `"AIDA"` entry, and the upstream Quote-with-comment / PICO / PCC (step 1) is absent too. This is expected, not a missing-data failure — those upstream nanopubs exist and are valid. If you need the AIDA / Quote prose, recover their URIs from the source chain's `PUBLISHED.md` (or from the Claim's `asAidaStatement` IRI) and fetch them directly via the **bare resolver form** (see Step 3's archival loop). An API-side fix to bridge the AIDA-statement IRI is tracked separately.
+
 ### Step 3 — Cache the response
 
 Write the raw response to `nanopubs/imported/constellation.json` (this directory is gitignored by the template). This is the single source of truth for the rest of the skill.
@@ -96,7 +98,10 @@ Optionally also fetch each step URI's TriG for archival (useful if the user want
 mkdir -p nanopubs/imported/trig
 for uri in $(printf '%s' "$body" | jq -r '.chains[].steps[].uri'); do
   ra_id=$(printf '%s' "$uri" | sed 's|.*/||')
-  curl -sL -H "Accept: application/trig" -o "nanopubs/imported/trig/${ra_id}.trig" "$uri"
+  # The …/sciencelive/np/… form redirects to the HTML viewer; only the bare
+  # w3id.org/np/ resolver form serves TriG. Swap the prefix before fetching.
+  resolver_uri=$(printf '%s' "$uri" | sed 's#/sciencelive/np/#/np/#')
+  curl -sL -H "Accept: application/trig" -o "nanopubs/imported/trig/${ra_id}.trig" "$resolver_uri"
 done
 ```
 
diff --git a/.claude/skills/verify-chain/SKILL.md b/.claude/skills/verify-chain/SKILL.md
index 2f3fe76..8304d6f 100644
--- a/.claude/skills/verify-chain/SKILL.md
+++ b/.claude/skills/verify-chain/SKILL.md
@@ -109,16 +109,21 @@ Build a set of all URIs returned by the API (across `researchSynthesis.uri`, `ap
 
 If a URI in `PUBLISHED.md` is missing from the constellation, that's a chain-integrity failure: the URI exists but isn't reachable from the entry point via FORRT chain links. Record it.
 
-The constellation API does NOT enumerate Quote-with-comment / PICO / PCC URIs as a separate `step` (they sit upstream of AIDA). For step 1, verify reachability with a direct fetch:
+The constellation API does NOT enumerate Quote-with-comment / PICO / PCC URIs as a separate `step` (they sit upstream of AIDA). In some chain shapes it also omits the **AIDA** (step 2): the Claim→AIDA link is a shared AIDA-statement IRI (`asAidaStatement → http://purl.org/aida/`), not a nanopub reference, so the walk can terminate at the Claim. Treat a missing step 1 — and a missing step-2 AIDA — as **upstream-not-enumerated**, not a chain-integrity failure; still verify those URIs in `PUBLISHED.md` resolve.
+
+Verify each upstream URI with a direct TriG fetch. **Do not test the `…/sciencelive/np/…` form** — it redirects to the HTML viewer and returns HTTP 200 on the SPA shell, so a status-only check passes even when no nanopub is served. Use the **bare resolver form** `https://w3id.org/np/RA…` (swap the prefix) and assert the body is TriG, not HTML:
 
 ```bash
-quote_uri=""
-curl -sI --max-time 30 -H "Accept: application/trig" -L "$quote_uri" \
-  | grep -E '^HTTP/' | tail -1
+upstream_uri=""
+resolver_uri=$(printf '%s' "$upstream_uri" | sed 's#/sciencelive/np/#/np/#')
+body=$(curl -sL --max-time 30 -H "Accept: application/trig" "$resolver_uri")
+case "$(printf '%s' "$body" | head -c 16 | tr 'A-Z' 'a-z')" in
+  '@prefix'*) echo "PASS — TriG served" ;;
+  '