Owen-WIKI Template Kit

Owen-WIKI is a Markdown-based knowledge operations template that helps LLM agents read large raw source collections and turn them into curated wiki pages, ontology relations, and reusable outputs. Together with Owen Graphite and Owen Editor, it forms an Obsidian-centered workflow for collecting knowledge, maintaining a graph, and producing reports.

It is a self-growing personal knowledge system template built on the LLM Wiki pattern and a knowledge graph ontology. Use this kit to create a personal wiki with the same operating model as Owen's production WIKI repository.

Version: 1.18 (2026-06-08)

Origin: Owen's LLM Wiki operating experience, covering 702 Microsoft Security domain pages, 7,451 wikilinks, 740 ontology relations, 10,097 raw episodes, and 27/27 Microsoft Security product coverage.

Based on: Andrej Karpathy's LLM Wiki pattern, Nodus Labs knowledge graph extensions, LightRAG-style triplet extraction and reranking, and Graphiti-inspired temporal context graph design.

16 Core Features

🤖 LLM-native knowledge base — One AGENTS.md file defines autonomous ingest, query, lint, ontology, and output workflows. Humans provide raw inputs and review outputs.
📂 3-layer separation — raw/ immutable inputs → wiki/ LLM-curated knowledge → outputs/ shared deliverables.
🕸️ Ontology and gap analysis — Stores [[A]] [relation] [[B]] triplets in wiki/ontology/ alongside normal wikilinks to expose clusters, hubs, and missing areas.
🧲 Auto cluster hubs (v1.7) — Absorbs 4,000+ raw files through source registry hubs without requiring one manual ingest per file. Proven at 100% raw conversion coverage.
📋 Action Queue (v1.9) — Generates registry promotion candidates, synthesis candidates, tag normalization candidates, and raw knowledge maturity grades.
🧭 Ops Dashboard (v1.10) — Unifies quality gates, action queue, promotion lifecycle, ontology sidecar, and episode metrics into one operational entry point.
🎚️ Operations Precision (v1.11) — Adds registry scoring, dedupe rules, lifecycle CLI operations, relation quality checks, and a target state of zero tag drift.
🧪 Curation Automation (v1.12) — Supports registry source sampling, lifecycle recommendations, synthesis expansion routing, and safe relation rewrites.
🧬 Ontology Relation Refinement (v1.13) — Reduces weak related-to links into canonical relations and runs a repeatable sidecar quality loop.
🏗️ Architecture Hardening (v1.14) — Uses canonical metrics, query-adjusted PageRank, strict sidecar parsing, and strict CI to prevent metric drift.
⚙️ Operational Automation (v1.15) — Includes query routing, graph hygiene, metrics snippets, related-to budget checks, graph delta reports, registry workbench packets, and release automation.
⏱️ Temporal Provenance (v1.16) — Records Graphiti-style relation_id, episode_id, valid_at, invalid_at, and raw source lineage in sidecar files and the episode ledger.
🧭 Agent Behavioral Guardrails (v1.17) — Uses assumption exposure, simplicity first, minimal change, and verification loops to improve LLM work quality.
🧩 Context Compaction & Prose Metrics (v1.18) — Adds local compact-first sidecars for large ops outputs and local Korean prose linting without requiring an external proxy, wrapper, or rewrite model.
📊 Production-validated scale — 702 pages, 7,451 wikilinks, 740 ontology relations, 10,097 raw episodes, zero broken links, and zero orphan pages.
📦 Reusable template kit — Packaged as an external Git repository so anyone can bootstrap the same LLM Wiki operating model.

Why Owen-WIKI Extends The Early LLM Wiki Pattern

The early LLM Wiki pattern starts from a simple, powerful idea: an LLM reads raw sources, writes Markdown wiki pages, and maintains them over time. Owen-WIKI keeps that core idea, then adds the operational structure needed for large real-world repositories: schema, quality gates, ontology, bulk raw absorption, curation automation, and an output layer.

Area	Early LLM Wiki	Owen-WIKI Template Kit
Core philosophy	The LLM reads raw material and maintains a wiki	The same philosophy is encoded as executable operating rules in `AGENTS.md`
Structure	Raw Sources / Wiki / Schema	`raw/` → `wiki/` → `outputs/` plus a `wiki/ontology/` graph layer
Knowledge accumulation	Markdown pages and wikilinks	Validated at 702 pages, 7,451 wikilinks, 740 ontology relations, and 10,097 raw episodes
Query behavior	Index and wikilink navigation	5-route query strategy, relevance scoring, and query routing policy
Trust management	Source citation is possible, but lifecycle controls are light	`confidence`, `last_confirmed`, `stale_after`, `supersedes`, and `superseded_by` fields
Quality management	Periodic linting concept	Broken link, orphan, tag, stub, ontology, and dashboard quality gates
Bulk source handling	Mostly manual ingest	Binary extraction, auto cluster hubs, remaining raw registries, and promotion lifecycle
Ontology	Mostly wikilink-based	`[[A]] [relation] [[B]]` graph plus temporal/provenance JSONL sidecars
Outputs	The wiki itself is the main artifact	Reports, presentations, workshops, and other audience-specific deliverables
Reuse	Personal knowledge-base pattern	Copyable template kit with starter files, scripts, templates, and ontology templates

The original flow is compact:

raw source -> LLM summary -> wiki page -> query/lint refinement

Owen-WIKI turns that into a knowledge operations pipeline:

raw/
    -> PII check
    -> extraction and clustering
    -> summary/entity/concept/synthesis pages
    -> ontology sidecar
    -> episode ledger
    -> action queue
    -> lifecycle and sampling
    -> output generation
    -> quality gates and dashboard

In short, the early LLM Wiki is the prototype of an LLM-native Markdown knowledge base. Owen-WIKI is an LLM-native knowledge operations platform designed to survive large source collections, domain knowledge, repeated deliverables, and ongoing maintenance.

Benefits At A Glance

Area	Benefit	Mechanism
Trust	Track source richness per page	`confidence` from 0.0 to 1.0 with a five-level guide
Lifecycle	Automatically classify aging information	`last_confirmed` / `stale_after` plus 90-day aging and 180-day stale checks
Versioning	Explicitly replace old pages without deleting history	`supersedes` / `superseded_by` plus output-layer upgrade hints
Privacy	Block risky content before ingest	sanitize-ingest.py checks nine PII patterns
Search efficiency	Reduce tokens during answers	5-route strategy plus nine-factor relevance scoring
Ingest precision	Extract structured knowledge before writing pages	LightRAG-inspired ENTITIES / RELATIONS YAML
Indexing cost	Keep single-file updates cheap	2-tier index plus Smart Diff 3-tier strategy
Bulk absorption	Handle 4,000+ files without one-by-one ingest	auto-cluster-hubs.py + absorb-remaining-uningested.py
Next-action automation	Generate promotion, synthesis, and tag cleanup candidates	wiki-action-queue.py
Candidate precision	Penalize generic registries and dedupe part-based candidates	wiki-action-queue.py
Ops dashboard	Provide one entry point for quality, queue, lifecycle, ontology, and episodes	wiki-ops-dashboard.py
Promotion lifecycle	Track source registry candidates from candidate to promoted	registry-promotion-lifecycle.py
Representative sampling	Select 3-5 source samples for registry review	sample-registry-candidate.py
Machine-readable ontology	Export relation weights, evidence, paths, temporal fields, and provenance	build-ontology-sidecar.py
Episode provenance	Record raw sources as stable episodes and track derived wiki/ontology lineage	build-episode-ledger.py
Relation quality	Identify weak `related-to` relations for replacement	check-ontology-relations.py
Relation rewrites	Apply reviewed relation changes with dry-run/apply modes	apply-ontology-relation-suggestions.py
Ontology loop	Manage weak relation budgets and confidence/evidence tiers	check-ontology-relations.py + build-ontology-sidecar.py
Canonical metrics	Refresh README/AGENTS metrics from repository facts	wiki-stats.py + update-metrics-snippets.py
Query router	Downrank registry-only hubs for normal knowledge questions	wiki-query.py
Graph hygiene	Prevent placeholder, unknown, trailing-link, and escaped-alias graph pollution	check-graph-hygiene.py + wiki_utils.py
Release automation	Bundle validation, metrics update, commit, tag, push, and GitHub Release steps with bare numeric release names such as `1.17`	release-wiki.py
Context compaction	Read compact sidecars for large ops outputs first, then retrieve originals by path and hash when needed	wiki-ops-compact.py
Korean prose lint	Detect translationese and AI-style prose signals locally without rewriting source files	wiki-humanize-metrics.py
Agent work quality	Reduce hidden assumptions, overdesign, unrelated edits, and unverified completion	Agent Behavioral Guardrails in `AGENTS.md`
Integrity	Automate structural quality checks	Tags, ontology, orphans, broken links, confidence decay, stubs, graph hygiene, related-to budget, action queue, dashboard, and relation quality
Quality gates	Enforce structure in PR workflows	wiki-quality-gates.py
Domain depth	Proven Microsoft Security coverage	Five-prefix tag system across hundreds of tags
Output variety	Create more than knowledge-base pages	PPTX, DOCX, HTML, Markdown, SVG, and Mermaid-ready outputs
External source absorption	Convert binary sources into Markdown	markitdown-first extraction with fallback engines
Audit and rollback	Keep every operation traceable	Git, append-only `log.md`, and immutable raw source policy
Visualization	Generate an interactive wiki graph	wiki-graph-viz.py with Louvain communities and HTML output
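
The Versioning row relies on `supersedes` / `superseded_by` links between pages. As a minimal sketch of how such a chain could be followed to the current page, assuming the `superseded_by` values have already been read from frontmatter into a plain dictionary (the `resolve_current` helper and the page names are hypothetical, not part of the kit):

```python
def resolve_current(page, superseded_by):
    """Follow superseded_by links until a page with no successor is reached.

    superseded_by maps an old page name to the page that replaced it
    (a hypothetical in-memory shape, not the kit's actual data model).
    """
    seen = set()
    while page in superseded_by:
        if page in seen:
            # A cycle means the metadata is inconsistent; stop instead of looping.
            raise ValueError(f"supersession cycle detected at {page!r}")
        seen.add(page)
        page = superseded_by[page]
    return page


chain = {"guide-2024": "guide-2025", "guide-2025": "guide-2026"}
print(resolve_current("guide-2024", chain))
```

Old pages stay in the repository, so history is preserved while queries can be pointed at the newest page in the chain.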

Canonical Metrics Block

For a new wiki project, run `scripts/wiki-stats.py --write-ops` and `scripts/update-metrics-snippets.py` to refresh this block from the actual repository metrics.

Metric	Value
Wiki pages	702
Ontology files	7
Total lines	63,046
Total words	356,661
Wikilinks	7,451 (10.6 average per page)
Tags	651
Raw source files	5,902
Ontology relations	740 (temporal sidecar basis)
Raw episodes	10,097
Git commits	170+

Graph (`graphify-out`)

Metric	Value
Nodes (pages)	674
Edges (wikilinks)	3,645
Communities (Louvain)	9
Connected components	1
Orphan nodes	0
Broken links	0

What's Included

Core Documents And Templates

File	Purpose	Use
`README.md`	This overview and operating guide	Read first
`AGENTS.md`	LLM agent schema v1.17	Copy into your project root and customize
`SETUP-GUIDE.md`	Step-by-step setup guide	Follow during setup
`CHANGELOG.md`	Template kit release history	Review version changes
`templates/`	Five wiki page templates	Copy into `templates/`
`starter-files/`	Starter `index.md`, `log.md`, and `overview.md` files	Copy into the project root
`ontology-templates/`	Starter ontology files	Copy into `wiki/ontology/`

Script Catalog (`scripts/`, 52 files)

Core linting and statistics

Script	Purpose
`wiki-stats.py`	Compute page, tag, confidence, and repository metrics
`find-orphans.py`	Detect pages with zero inbound links
`check-tags.py`	Validate tag prefix compliance
`scan-broken-links.py`	Scan broken wikilinks
`check-ontology.py`	Validate ontology wikilink integrity and relation codes
`check-confidence-decay.py`	Apply 90-day aging and 180-day stale classification
`sanitize-ingest.py`	Run the Ingest 0 PII precheck
`extract-raw-sources.py`	Convert PPTX/PDF/DOCX/XLSX files to Markdown with markitdown-first extraction

Bulk source absorption and cluster hubs

Script	Purpose
`find-uningested-raw.py`	Scan unreferenced raw files with NFC normalization and special-character matching
`auto-cluster-hubs.py`	Group unreferenced raw candidates and create source registry hub summaries
`absorb-remaining-uningested.py`	Absorb remaining unreferenced files into existing hubs with routing rules
`absorb-uningested-subhubs.py`	Split remaining raw candidates into source registry sub-hubs
`apply-default-confidence.py`	Apply policy-based confidence and `last_confirmed` defaults
`backfill-confidence.py`	Backfill missing confidence metadata with heuristics
`rebalance-confidence.py`	Re-evaluate high-trust source types such as `type/mslearn`
`auto-extract-triplets.py`	Provide an LLM-oriented ENTITIES/RELATIONS extraction skeleton
`append-ontology.py`	Append deduped triplets to ontology Markdown files
`fix-broken-wikilinks.py`	Repair known broken wikilinks through an aliases dictionary
`fix-hub-sources.py`	Repair damaged cluster hub `sources` YAML
`gen-hub-category-index.py`	Build body indexes that group hub sources by subfolder

Ontology, graph, and query operations

Script	Purpose
`build-ontology-sidecar.py`	Convert Markdown ontology relations into JSONL with weights, evidence, temporal fields, and provenance
`build-episode-ledger.py`	Record raw sources as stable episodes and map derived wiki pages and ontology relations
`check-ontology-relations.py`	Report weak `related-to` relations that can be replaced by canonical relations
`apply-ontology-relation-suggestions.py`	Apply reviewed relation rewrites with dry-run/apply support
`check-related-to-budget.py`	Enforce the weak `related-to` relation budget in CI
`compute-pagerank.py`	Generate raw PageRank and query-adjusted ranking
`wiki-query.py`	Route candidate pages using body text, tags, category boosts, ontology weight, and query-adjusted PageRank
`wiki-graph-viz.py`	Build a wikilink graph, Louvain communities, interactive HTML, and graph reports
`check-graph-hygiene.py`	Detect placeholder, unknown, and trailing wikilink graph pollution
`wiki_utils.py`	Provide shared wikilink, frontmatter, token parsing, and escaped-alias normalization utilities
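
To make the temporal sidecar fields concrete, here is a sketch of one JSONL record carrying `relation_id`, `episode_id`, `valid_at`, and `invalid_at`. The exact schema written by `build-ontology-sidecar.py` is not shown in this README, so the ID scheme, the `weight` default, and the helper names below are assumptions.

```python
import hashlib
import json


def relation_record(source, relation, target, episode_id, valid_at, invalid_at=None, weight=1.0):
    """Build one sidecar record; the hash-based relation_id is a hypothetical scheme."""
    key = f"{source}|{relation}|{target}"
    relation_id = "rel-" + hashlib.sha256(key.encode("utf-8")).hexdigest()[:12]
    return {
        "relation_id": relation_id,
        "source": source,
        "relation": relation,
        "target": target,
        "weight": weight,
        "episode_id": episode_id,   # raw episode the relation was derived from
        "valid_at": valid_at,       # ISO 8601 date string
        "invalid_at": invalid_at,   # None while the relation still holds
    }


def to_jsonl(records):
    """Serialize records as JSON Lines, one object per line."""
    return "".join(json.dumps(r, ensure_ascii=False, sort_keys=True) + "\n" for r in records)


def active_at(record, when):
    """True if the relation holds on the ISO date `when` (same-format ISO dates compare as strings)."""
    started = record["valid_at"] <= when
    ended = record["invalid_at"] is not None and when >= record["invalid_at"]
    return started and not ended
```

Keeping `invalid_at` on the record, rather than deleting the relation, lets a query ask what the graph looked like at an earlier date.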

Action queue, lifecycle, and operations dashboard

Script	Purpose
`wiki-action-queue.py`	Generate registry promotion, synthesis, tag normalization, maturity, and ranking-hint queues
`registry-promotion-lifecycle.py`	Track candidates through candidate, sampled, promoted, deferred, and rejected states
`sample-registry-candidate.py`	Select 3-5 representative source samples for registry review
`registry-promotion-workbench.py`	Build compact review packets for registry promotion candidates
`wiki-ops-dashboard.py`	Combine quality gates, queues, lifecycle, ontology sidecar, graph hygiene, graph delta, and episode metrics
`weekly-gap-report.py`	Generate a weekly gap report from action queue and quality signals
`identify-stubs.py`	Identify stub pages and summarize cleanup candidates
`analyze-large-hubs.py`	Identify oversized hubs and generate split plans
`build-raw-to-wiki-map.py`	Build raw-to-wiki reference maps and coverage reports
`generate-outputs-backlinks.py`	Add output backlinks to wiki pages
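
The promotion lifecycle states named above (candidate, sampled, promoted, deferred, rejected) suggest a small state machine. The README does not list which transitions `registry-promotion-lifecycle.py` permits, so the `ALLOWED` table below is purely an assumption for illustration.

```python
# Hypothetical transition table; the state names come from this README,
# the permitted moves between them do not.
ALLOWED = {
    "candidate": {"sampled", "deferred", "rejected"},
    "sampled": {"promoted", "deferred", "rejected"},
    "deferred": {"candidate"},
    "promoted": set(),
    "rejected": set(),
}


def advance(current, target):
    """Return the new state, or raise ValueError if the move is not allowed."""
    if current not in ALLOWED:
        raise ValueError(f"unknown state: {current!r}")
    if target not in ALLOWED[current]:
        raise ValueError(f"cannot move from {current!r} to {target!r}")
    return target


state = advance("candidate", "sampled")
state = advance(state, "promoted")
print(state)
```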

Context compaction and prose metrics

Script	Purpose
`wiki-ops-compact.py`	Create CCR-like compact Markdown/JSON sidecars for large wiki-ops JSON, JSONL, Markdown, and log outputs while preserving source path and SHA-256 retrieval metadata
`wiki-humanize-metrics.py`	Run stdlib-only local Korean prose lint for translationese, AI-style signals, connector habits, and over-polish risks without rewriting source files
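
The compact-first idea (read a small sidecar, then fetch the original by path and hash only when needed) can be sketched like this. The field names, the `max_lines` parameter, and both helpers are assumptions; the actual output format of `wiki-ops-compact.py` is not documented here.

```python
import hashlib


def compact_sidecar(source_path, content, max_lines=5):
    """Summarize a large text output while keeping retrieval metadata."""
    lines = content.splitlines()
    return {
        "source_path": source_path,
        "sha256": hashlib.sha256(content.encode("utf-8")).hexdigest(),
        "total_lines": len(lines),
        "head": lines[:max_lines],          # first lines only, as a cheap preview
        "truncated": len(lines) > max_lines,
    }


def verify_original(sidecar, content):
    """Check that retrieved content is the exact original the sidecar was built from."""
    return hashlib.sha256(content.encode("utf-8")).hexdigest() == sidecar["sha256"]
```

An agent would read the sidecar first and open `source_path` only if the preview is not enough, using the hash to confirm that the file has not changed since compaction.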

Release, metrics, tags, and folder operations

Script	Purpose
`wiki-quality-gates.py`	Enforce broken link, orphan, tag, stub, ontology, graph hygiene, and related-to budget gates
`apply-tag-aliases.py`	Apply tag alias migrations from `tag-aliases.yml`
`tag-aliases.yml`	Store tag normalization aliases
`update-metrics-snippets.py`	Refresh README/AGENTS metrics blocks from canonical metrics
`graph-delta-report.py`	Report graph changes against a Git reference
`release-wiki.py`	Run validation, metrics update, commit, tag, push, and GitHub Release steps; release tags and GitHub Release titles use only bare numeric versions such as `1.17`
`organize-collection-by-month.ps1`	Move direct collection files into monthly `YYYYMM/` folders
`organize-outputs-by-month.ps1`	Move output documents into monthly folders based on frontmatter or mtime
`organize-outputs-attachments-by-month.ps1`	Move output attachments into monthly attachment folders
`sync-to-obsidian.ps1`	Sync the wiki into an Obsidian vault

Quick Start (5 Minutes)

# 1. Create a project folder.
mkdir my-wiki && cd my-wiki

# 2. Create the folder structure.
mkdir -p raw/articles \
         raw/obsidian/Clippings \
         raw/obsidian/outputs \
         wiki/{entities,concepts,summaries,comparisons,synthesis,ontology} \
         outputs/wiki-ops graphify-out templates scripts

# 3. Copy the Owen-WIKI kit files.
cp <path-to>/owen-wiki/AGENTS.md ./AGENTS.md
cp <path-to>/owen-wiki/starter-files/* ./
cp <path-to>/owen-wiki/templates/* ./templates/
cp <path-to>/owen-wiki/ontology-templates/* ./wiki/ontology/
cp <path-to>/owen-wiki/scripts/* ./scripts/
mkdir -p .github/workflows && cp <path-to>/owen-wiki/.github/workflows/wiki-lint.yml ./.github/workflows/

# 4. Open AGENTS.md and customize the domain, paths, and operating rules.

# 5. Initialize Git when you are ready.
git init && printf ".venv/\nraw/extracted/\ngraphify-out/\n" > .gitignore

# 6. Add the first source into raw/ and ask your LLM agent to ingest it.
#    The agent should run sanitize-ingest.py before ingest.
#    Large collections can be absorbed with auto-cluster-hubs.py.

See SETUP-GUIDE.md for the full setup guide.

Architecture Overview

Knowledge Pipeline

raw/ inputs -> wiki/ curated knowledge + ontology -> outputs/ deliverables
                         ^                           ^
                         |                           |
                 relation extraction          gap-based generation

Four Layers

Layer	Owner	Role
`raw/`	User	Immutable source material
`wiki/`	LLM	Curated knowledge pages
`wiki/ontology/`	LLM	Relation graph and gap analysis inside the wiki layer
`outputs/`	Shared	Final deliverables and working drafts

Five Page Types

Type	Folder	Purpose	Example
Entity	`wiki/entities/`	People, organizations, tools, products, customers	`openai.md`, `python.md`
Concept	`wiki/concepts/`	Theories, frameworks, methods	`machine-learning.md`
Summary	`wiki/summaries/`	Source-based summaries	`attention-is-all-you-need.md`
Comparison	`wiki/comparisons/`	Comparison and trade-off analysis	`pytorch-vs-tensorflow.md`
Synthesis	`wiki/synthesis/`	Cross-source synthesis and hubs	`overview.md`

Core Workflows

Workflow	Trigger	Core Behavior
Ingest	New sources	PII precheck, triplet extraction, summary creation, entity/concept updates, ontology append
Query	User question	5-route discovery, relevance scoring, query routing, and synthesized answer
Lint	Periodic maintenance	Contradiction, orphan, gap, decay, tag, ontology, and quality gate checks
Ontology Update	Large changes	Incremental relation updates, sidecar rebuild, gap analysis, and overview refresh
Cluster Hub Absorb	Large raw additions	`find-uningested-raw.py` -> `auto-cluster-hubs.py` -> `absorb-remaining-uningested.py` -> `gen-hub-category-index.py`
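
The Query workflow combines several signals and downranks registry-only hubs. A rough sketch of that kind of weighted scoring follows. It does not reproduce the nine-factor scoring or `wiki-query.py`: the weights, the page dictionary shape, the `registry_only` flag, and the `topic/identity` tag are all hypothetical.

```python
# Hypothetical weights; the kit's real factors and values are not given in this README.
DEFAULT_WEIGHTS = {
    "body": 1.0,
    "tag": 2.0,
    "ontology": 0.5,
    "pagerank": 3.0,
    "registry_penalty": 0.5,
}


def score_page(page, query_terms, weights=DEFAULT_WEIGHTS):
    """Score one page for a query; a higher score means a better candidate."""
    terms = [t.lower() for t in query_terms]
    body = page["body"].lower()
    tags = [t.lower() for t in page["tags"]]
    body_hits = sum(1 for t in terms if t in body)
    tag_hits = sum(1 for t in terms if any(t in tag for tag in tags))
    score = (
        weights["body"] * body_hits
        + weights["tag"] * tag_hits
        + weights["ontology"] * page["ontology_weight"]
        + weights["pagerank"] * page["pagerank"]
    )
    if page["registry_only"]:
        score *= weights["registry_penalty"]  # downrank registry-only hubs
    return score


def rank(pages, query_terms):
    """Return page names ordered from best to worst score."""
    return sorted(pages, key=lambda name: score_page(pages[name], query_terms), reverse=True)
```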

Version Milestones

Version	Milestone
v1.3	Triplet-first ingest and metadata-based relevance scoring
v1.4	Confidence, lifecycle metadata, supersession, and PII precheck
v1.5	NetworkX / Louvain graph visualization
v1.6	Diagram standards and print-friendly palettes
v1.7	Auto cluster hubs and 100% raw conversion coverage pattern
v1.9	Action Queue and CI quality gates
v1.10	Ops Dashboard and promotion lifecycle
v1.11	Operations precision and relation quality reporting
v1.12	Curation automation and safe relation rewrites
v1.13	Ontology relation refinement loop
v1.14	Architecture hardening and canonical metrics
v1.15	Operational automation, query routing, graph hygiene, and release automation
v1.16	Temporal provenance and episode ledger
v1.17	Agent Behavioral Guardrails
v1.18	Context compaction and local Korean prose metrics

Version Compatibility

This kit is versioned whenever Owen's WIKI operating model changes. See CHANGELOG.md for release history.

License

MIT — free to use, modify, and distribute.
