graph-guard

Ontology/graph-aware retrieval over a personal knowledge vault — built in two tiers plus a real-vault measurement of what each one buys you. Turns a folder of markdown (Obsidian [[wikilinks]] + YAML frontmatter) into a typed knowledge graph, retrieves with multi-hop Personalized PageRank fused with lexical TF-IDF behind rag-guard's retrieve() seam, and — as a Tier-B fidelity layer — exports the same graph as a formal RDF/OWL/SHACL/SPARQL ontology with owlrl reasoning.

The point isn't "graph vs. vectors" or "how much OWL can we bolt on." It's judgment: knowing when the heavyweight semantic stack earns its cost and when it doesn't — shown with numbers measured on a real, live 517-note vault, not a toy benchmark.

The measured result

Three retrieval arms — flat TF-IDF, the Tier-A typed graph, and the Tier-B owlrl-reasoned graph — run over graph-guard's actual knowledge graph (517 notes / 807 nodes / 1,814 edges), scored on 159 structure-derived multi-hop probes and 517 simple-lookup probes (k=10). Full method and honest limits in docs/EVAL-real-vault-lift.md; raw numbers in eval/results.json.

| Finding | Result |
| --- | --- |
| Graph beats flat on multi-hop | hit@10 0.3145 → 0.3585 (+14% relative); MRR 0.1303 → 0.1647 (+26% relative) |
| Graph doesn't hurt simple lookups | within ~1 point of flat on every metric; the hybrid fallback holds |
| owlrl reasoning adds ~zero retrieval lift | hit@10 identical to raw graph (0.3585 both); MRR +0.0008 |
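
For readers less familiar with the metrics in the table, here is a minimal sketch of how hit@k, MRR, and relative lift are commonly computed under the single-gold setup. The function names are illustrative and are not taken from eval/; truncating the reciprocal rank at k is an assumption about the eval, not something the README states.

```python
from typing import Iterable, Sequence, Tuple


def hit_at_k(ranked: Sequence[str], gold: str, k: int = 10) -> float:
    """1.0 if the gold id appears in the top-k results, else 0.0."""
    return 1.0 if gold in ranked[:k] else 0.0


def reciprocal_rank(ranked: Sequence[str], gold: str, k: int = 10) -> float:
    """1/rank of the gold id within the top-k, or 0.0 if it is absent.

    Assumption: ranks beyond k count as a miss.
    """
    for rank, doc_id in enumerate(ranked[:k], start=1):
        if doc_id == gold:
            return 1.0 / rank
    return 0.0


def evaluate(runs: Iterable[Tuple[Sequence[str], str]], k: int = 10) -> Tuple[float, float]:
    """Mean hit@k and MRR over (ranked_ids, gold_id) pairs."""
    runs = list(runs)
    hits = sum(hit_at_k(ranked, gold, k) for ranked, gold in runs)
    rr = sum(reciprocal_rank(ranked, gold, k) for ranked, gold in runs)
    return hits / len(runs), rr / len(runs)


def relative_lift(baseline: float, treatment: float) -> float:
    """Relative change of treatment over baseline (0.14 means +14%)."""
    return (treatment - baseline) / baseline


# The relative figures in the table follow from the absolute ones:
multi_hop_hit_lift = relative_lift(0.3145, 0.3585)
multi_hop_mrr_lift = relative_lift(0.1303, 0.1647)
```

Rounded to two decimals, the two lifts come out at 0.14 and 0.26, matching the +14% and +26% reported above.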

The takeaway: the ontology earns its cost on fidelity, SHACL validation, entailment, and standards interop (SPARQL, and by extension AWS Neptune), not on retrieval. That's not a weakness of Tier B; it's the honest shape of what formal semantics is for. See docs/TRADEOFFS.md (graph-vs-flat, how much ontology) and docs/SPARQL-vs-PPR.md (the exactness-vs-fuzziness mechanism behind that third finding) for the full architectural reasoning.

The three layers (all shipped)

  • Tier A — working core. A typed knowledge graph (SQLite triples, provenance + confidence) with multi-hop Personalized PageRank fused with lexical TF-IDF, hybrid-routed, behind rag-guard's retrieve() seam. Lean closed schema; graph-tightened guards. Near-stdlib, fast, fully tested.
  • Tier B — enterprise-fidelity layer. The same graph, exported as a formal ontology: RDF/Turtle with reified provenance, an OWL T-Box (schema.org/SKOS-mapped, with a DL-safe functional-vs-transitive supersedes split), SHACL shapes, SPARQL 1.1 (rdflib in-memory, fully tested; opt-in Apache Jena Fuseki for a live server), and owlrl OWL-2-RL reasoning/entailment. This mirrors the architecture AWS Neptune uses in production — an RDF/SPARQL store plus an external reasoner — documented in docs/SPARQL-vs-PPR.md.
  • Gate D — the measurement. A 3-arm eval (flat / graph / owlrl-reasoned) over the real vault, with structure-derived probes (no hand labeling) and an aggregate-only, PII-safe output — the numbers above.
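
The Tier A fusion step (graph ranking fused with the lexical ranking) can be illustrated with a generic reciprocal-rank fusion sketch. This is not the code in graph_retriever.py: the function name, the `rrf_k=60` constant (a conventional default in the RRF literature), and the alphabetical tie-break are all assumptions made for the example.

```python
from collections import defaultdict
from typing import Dict, List, Sequence


def reciprocal_rank_fuse(
    rankings: Sequence[Sequence[str]],
    rrf_k: int = 60,
    top_n: int = 10,
) -> List[str]:
    """Fuse several ranked id lists into one.

    Each list contributes 1 / (rrf_k + rank) for every id it contains,
    so ids ranked well by several retrievers rise to the top.
    """
    scores: Dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (rrf_k + rank)
    # Highest fused score first; ties broken alphabetically for determinism.
    return sorted(scores, key=lambda d: (-scores[d], d))[:top_n]


graph_ranking = ["a", "b", "c"]    # e.g. ids ordered by PPR score
lexical_ranking = ["b", "d", "a"]  # e.g. ids ordered by TF-IDF score
fused = reciprocal_rank_fuse([graph_ranking, lexical_ranking])
```

Because RRF only looks at rank positions, the two legs never need their scores put on a common scale, which is one reason it is a common choice for fusing heterogeneous retrievers.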

How it works

Obsidian vault (frontmatter + [[wikilinks]] + prose)
   │  extract.py   3-tier: frontmatter + wikilinks (deterministic, free) → LLM (optional, off by default)
   │               wikilinks resolve to real notes; edges typed by frontmatter, heading, and inline cues
   ▼
SQLite triple store (store.py)   nodes + edges, with provenance + confidence
   │  adjacency → Personalized PageRank (ppr.py, stdlib)
   ▼
GraphRetriever.retrieve(query, k)  (graph_retriever.py)
   entity-link query → PPR over typed edges → reciprocal-rank-fuse with TF-IDF → hybrid route
   ▼
rag-guard pipeline + tightened guards (guards.py)
   entity-overlap grounding · zero-node structural refuse · functional-predicate contradiction
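
The PPR stage in the pipeline above can be sketched with a stdlib-only power iteration. This is an illustrative version, not the contents of ppr.py: it ignores edge types and weights, and the damping factor, the iteration count, and the choice to return dangling-node mass to the seeds are assumptions.

```python
from typing import Dict, Iterable, List


def personalized_pagerank(
    adjacency: Dict[str, List[str]],
    seeds: Iterable[str],
    alpha: float = 0.85,
    iterations: int = 100,
) -> Dict[str, float]:
    """Power-iteration PPR: a random walk that restarts at the seed nodes."""
    seed_set = set(seeds)
    if not seed_set:
        raise ValueError("at least one seed node is required")
    nodes = set(adjacency) | {m for nbrs in adjacency.values() for m in nbrs} | seed_set
    # Restart distribution: uniform over the seeds, zero elsewhere.
    restart = {n: (1.0 / len(seed_set) if n in seed_set else 0.0) for n in nodes}
    rank = dict(restart)
    for _ in range(iterations):
        nxt = {n: (1.0 - alpha) * restart[n] for n in nodes}
        for n in nodes:
            neighbours = adjacency.get(n, [])
            if neighbours:
                # Spread this node's mass evenly over its out-edges.
                share = alpha * rank[n] / len(neighbours)
                for m in neighbours:
                    nxt[m] += share
            else:
                # Dangling node: hand its mass back to the seeds.
                for s in seed_set:
                    nxt[s] += alpha * rank[n] * restart[s]
        rank = nxt
    return rank


# Seeding at "a" scores the a -> b -> c chain and leaves the
# disconnected x -> y component at zero.
graph = {"a": ["b"], "b": ["c"], "c": [], "x": ["y"], "y": []}
scores = personalized_pagerank(graph, seeds=["a"])
```

In a retriever, the seeds would be the nodes that the query's entities link to, and the resulting scores give the graph-side ranking that is later fused with the lexical one.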

Lean closed schema (schema.py): entities Person/Project/Reference/Feedback/Decision/Claim/Source/Tool/Event/Concept (mapped to schema.org + SKOS in Tier B); a closed predicate set (mentions, about, is_part_of, authored_by, supersedes, blocks, depends_on, decides, supports, refutes, has_status, broader/narrower/related). Functional predicates (has_status, supersedes) power contradiction checks.
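
As a rough illustration of how a functional predicate can power a contradiction check: a functional predicate allows at most one object per subject, so two different objects for the same (subject, predicate) pair signal a conflict. The sketch below assumes triples are plain (subject, predicate, object) tuples; the function name and data layout are hypothetical and are not the guards.py API.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

Triple = Tuple[str, str, str]

# The two functional predicates named in the schema description.
FUNCTIONAL_PREDICATES = {"has_status", "supersedes"}


def find_contradictions(triples: Iterable[Triple]) -> Dict[Tuple[str, str], List[str]]:
    """Return {(subject, predicate): [objects]} where a functional
    predicate has more than one distinct object for the same subject."""
    objects = defaultdict(set)
    for subject, predicate, obj in triples:
        if predicate in FUNCTIONAL_PREDICATES:
            objects[(subject, predicate)].add(obj)
    return {key: sorted(vals) for key, vals in objects.items() if len(vals) > 1}


triples = [
    ("project-x", "has_status", "active"),
    ("project-x", "has_status", "archived"),  # conflicts with the line above
    ("project-x", "mentions", "alice"),
    ("project-x", "mentions", "bob"),         # fine: mentions is not functional
]
conflicts = find_contradictions(triples)
```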

Tier B path: rdf_export.py turns the live TripleStore into RDF/Turtle (ontology_data/ontology.ttl is the OWL T-Box, ontology_data/shapes.ttl the SHACL shapes) → shacl.py::validate checks conformance → reasoning.py::materialize runs owlrl OWL-2-RL entailment → fuseki.py runs SPARQL 1.1 (property paths included) against the in-memory graph, or an opt-in local Fuseki server.
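
To give a feel for what the entailment step materializes, here is a plain-Python stand-in for a single rule: closing a transitive property. It is not owlrl's API and not the code in reasoning.py, and it covers only transitivity, whereas OWL 2 RL has many more rules. The function name and the example predicate are hypothetical.

```python
from typing import Iterable, Set, Tuple

Triple = Tuple[str, str, str]


def materialize_transitive(triples: Iterable[Triple], predicate: str) -> Set[Triple]:
    """Return the input triples plus every triple implied by treating
    `predicate` as transitive (a naive fixed-point closure)."""
    triples = set(triples)
    closure = {(s, o) for s, p, o in triples if p == predicate}
    changed = True
    while changed:
        changed = False
        for a, b in list(closure):
            for c, d in list(closure):
                # a -> b and b -> d implies a -> d.
                if b == c and (a, d) not in closure:
                    closure.add((a, d))
                    changed = True
    return triples | {(s, predicate, o) for s, o in closure}


asserted = [
    ("v3", "supersedes_chain", "v2"),
    ("v2", "supersedes_chain", "v1"),
]
entailed = materialize_transitive(asserted, "supersedes_chain")
# `entailed` now also holds ("v3", "supersedes_chain", "v1").
```

Note that each inferred edge is a shortcut over a path that already exists in the asserted graph, so a multi-hop walk could reach the same node without it.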

Install / run

pip install "git+https://github.com/Jott2121/graph-guard.git"   # pulls guarded-rag from git automatically

For development:

pip install -e ".[dev,rdf]"    # [rdf] adds rdflib/pyshacl/owlrl for Tier B
python -m pytest -q            # 137 passing (+1 opt-in Fuseki skip), ~97% coverage, Python 3.11-3.13 (see .github/workflows/ci.yml)

from graph_guard import service

# retrieval only (hybrid-routed graph + lexical):
hits = service.query("what superseded the leo bus", k=5)

# graph-GUARDED answer (structural refuse + entity-overlap grounding actually run here):
from rag_guard.providers import FakeProvider   # swap for a real provider
out = service.answer("what superseded the leo bus", FakeProvider("..."), k=5)
# {'answer', 'refused', 'grounded', 'support', 'sources'}

The guards run in service.answer(), not inside retrieve(). retrieve() is retrieval-only so it drops cleanly behind rag-guard's seam; answer() composes the structural refuse gate + entity-overlap grounding around a provider.
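
The composition described above can be sketched as follows. This is a simplified, hypothetical stand-in for service.answer(): the provider is modeled as a plain callable, hits as dicts with "id" and "text" keys, and entity overlap is approximated by token overlap with an assumed 0.5 threshold. Only the returned keys are taken from the usage example.

```python
import re
from typing import Callable, Dict, List, Set


def _terms(text: str) -> Set[str]:
    """Lowercased alphanumeric tokens; a crude proxy for entities."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def guarded_answer(
    question: str,
    hits: List[Dict[str, str]],
    generate: Callable[[str, str], str],
    min_support: float = 0.5,
) -> dict:
    # Structural refuse gate: nothing retrieved means no provider call.
    if not hits:
        return {"answer": None, "refused": True, "grounded": False,
                "support": 0.0, "sources": []}
    context = "\n".join(hit["text"] for hit in hits)
    answer = generate(question, context)
    # Grounding: share of the answer's terms that also occur in the context.
    answer_terms = _terms(answer)
    support = (len(answer_terms & _terms(context)) / len(answer_terms)
               if answer_terms else 0.0)
    return {"answer": answer, "refused": False,
            "grounded": support >= min_support,
            "support": support,
            "sources": [hit["id"] for hit in hits]}


hits = [{"id": "n1", "text": "The leo bus was superseded by the atlas bus"}]
out = guarded_answer("what superseded the leo bus", hits,
                     lambda q, ctx: "The atlas bus superseded the leo bus")
```

Keeping the guards outside the retriever means the same retrieve() call can be reused unguarded, which is the separation the paragraph above describes.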

Reproduce the measured lift against your own vault: python -m eval.real_vault_lift (reads a local vault via rag_guard.config.default_roots(); writes an aggregate-only, PII-safe eval/results.json — no note ids, paths, or query/gold text). The mechanism demo behind the "owlrl adds ~zero lift" finding: python -m eval.sparql_vs_ppr.

Where this sits in the landscape (prior art)

None of the ideas are novel — the value is the integration and the honest right-sizing judgment. Retrieval lineage: GraphRAG (local/global/community summaries), LightRAG (lightweight dual-level), HippoRAG (Personalized PageRank multi-hop), OG-RAG (ontology-grounded), the Personal Knowledge Graph survey, and GraphRAG-Bench (graphs win on multi-hop, not simple lookups — matches this repo's own measurement). Right-sizing: "Knowledge Conceptualization Impacts RAG Efficacy" (arXiv:2507.09389). Standards: schema.org, W3C SKOS, OWL 2, SPARQL 1.1. Production analogue: AWS Neptune's RDF/SPARQL engine plus an external reasoner.

Honest limits

  • The eval's structure-derived probes measure multi-hop link-recovery and no-harm, not organic question relevance — a probe's query is a note's own label, not something a user actually typed.
  • Single-gold assumption; real queries can have several relevant notes.
  • A personal N-of-1 vault (the author's own ~517 notes) — these results may not generalize to a different corpus, domain, or scale.
  • The numbers are a snapshot (2026-07-02): the vault is live and evolves, so a re-run will drift even though the measurement itself is deterministic for a fixed snapshot.
  • owlrl is OWL 2 RL — a decidable, rule-based fragment of OWL 2, not full OWL-DL reasoning.
  • Retrieval's lexical leg is TF-IDF (swap embeddings behind the same retrieve() seam).
  • Extraction is deterministic (frontmatter + wikilinks + inline relation cues). Inline relation direction is not resolved ("X superseded by [[Y]]" connects X↔Y but doesn't encode who supersedes whom); precise relation extraction is the Tier-3 LLM job (injectable llm_fn, off by default). No community-summary global search yet.
  • The Fuseki live path is opt-in (GRAPH_GUARD_FUSEKI_LIVE=1), not part of the default test run.
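
To illustrate the extraction limit noted above, here is a minimal sketch of deterministic wikilink extraction. It is not the code in extract.py; the regex, the function name, and the choice of "mentions" as the emitted predicate are assumptions. A pattern like this finds which notes are linked, but the surrounding prose that carries the direction of a relation is never parsed.

```python
import re
from typing import List, Tuple

# Matches [[Target]], [[Target|alias]], and [[Target#Heading]];
# only the target note name is captured.
WIKILINK = re.compile(r"\[\[([^\]|#]+)(?:#[^\]|]*)?(?:\|[^\]]*)?\]\]")


def extract_links(note_id: str, text: str) -> List[Tuple[str, str, str]]:
    """Emit one (source, predicate, target) triple per wikilink, in
    document order. The predicate is a fixed placeholder because the
    regex alone cannot tell what relation the prose expresses."""
    return [(note_id, "mentions", m.group(1).strip()) for m in WIKILINK.finditer(text)]


text = "Leo Bus was superseded by [[Atlas Bus|the new bus]]; see [[Notes#History]]."
links = extract_links("Leo Bus", text)
# Both targets are found, but nothing here records that Atlas Bus is
# the note doing the superseding.
```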

Built by Jeff Otterson (Jott2121). Sibling to rag-guard. MIT.
