I build deterministic governance infrastructure for AI systems.
Phionyx treats large language model outputs as noisy cognitive measurements rather than final answers. The goal is to place a verifiable governance runtime between AI systems and real-world action: safety gates, ethics gates, telemetry, evaluation standards, state evolution, and audit-first control.
Latest (2026-05): Phionyx Core v0.5.0 is live on PyPI (`pip install phionyx-core`) alongside 5 open-source companion packages that wire the runtime into MCP hosts, Inspect AI, LangChain / LangGraph, and the OpenAI Agents SDK. Phionyx Evaluation Standard v0.2.0 (released 2026-05-24) ships the Evidence-Oriented Runtime Telemetry Profile, a vendor-neutral JSON schema for governance evidence rows. See phionyx.ai for the runtime narrative and where to start.
The work organises around three audience entry points, mirrored on phionyx.ai:
AI output should not directly become action. Phionyx adds deterministic gates between model output and real-world action.
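As a rough illustration of the idea, the sketch below treats a model output as a proposal that must pass a fixed sequence of pure-function gates before anything executes. It is not the Phionyx API: `ProposedAction`, `allowlist_gate`, `path_gate`, and `evaluate` are hypothetical names invented for this example, and the two rules are placeholders.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass(frozen=True)
class ProposedAction:
    """A model output reinterpreted as a proposal, not a command (hypothetical type)."""
    name: str
    argument: str


# A gate is a pure function: the same proposal always yields the same verdict.
Gate = Callable[[ProposedAction], Tuple[bool, str]]


def allowlist_gate(action: ProposedAction) -> Tuple[bool, str]:
    """Refuse any action whose name is not explicitly allowlisted."""
    allowed = {"read_file", "summarise"}  # placeholder policy
    if action.name in allowed:
        return True, "action is allowlisted"
    return False, f"action '{action.name}' is not allowlisted"


def path_gate(action: ProposedAction) -> Tuple[bool, str]:
    """Refuse arguments that contain a parent-directory reference."""
    if ".." in action.argument:
        return False, "argument contains a parent-directory reference"
    return True, "argument is within bounds"


def evaluate(action: ProposedAction, gates: List[Gate]) -> Tuple[bool, List[str]]:
    """Run gates in order, stop at the first refusal, and return an audit log."""
    audit: List[str] = []
    for gate in gates:
        ok, reason = gate(action)
        audit.append(f"{gate.__name__}: {reason}")
        if not ok:
            return False, audit
    return True, audit
```

Because each gate is deterministic and the audit log records every verdict, the same proposal can be replayed later and is expected to produce the same decision.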
Repos that implement and demonstrate the pattern:
- phionyx-research — the core runtime; 46-block canonical pipeline, kill switch, HITL queue, ethics gate, audit chain. `pip install phionyx-core`.
- phionyx-mcp-server — MCP trust boundary; descriptor signing, signed envelopes, audit chain over third-party MCP tool calls.
- phionyx-pipeline-mcp — agent self-claim gate; verifies what the agent says it did against the repository's actual diff.
- hearthos — applied: bounded-authority household AI. Browser-only demo + policy gates. The Governance Trilogy, Book 1.
→ Read the full argument: phionyx.ai/bounded-authority
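The self-claim gate idea mentioned in the list above can be pictured as a set comparison between what an agent reports and what the repository diff shows. The sketch below is an assumption-laden toy, not the phionyx-pipeline-mcp implementation: `check_claim` and its field names are hypothetical, and the inputs are plain sets of file paths rather than a parsed diff.

```python
from typing import Dict, List, Set, Union


def check_claim(claimed_files: Set[str], diff_files: Set[str]) -> Dict[str, Union[bool, List[str]]]:
    """Compare the files an agent says it changed with the files the diff shows."""
    # Files the agent claims to have changed but the diff does not contain.
    unsupported = sorted(claimed_files - diff_files)
    # Files the diff contains but the agent never mentioned.
    undeclared = sorted(diff_files - claimed_files)
    return {
        "verified": not unsupported and not undeclared,
        "unsupported_claims": unsupported,
        "undeclared_changes": undeclared,
    }
```

In practice `diff_files` could come from something like `git diff --name-only`, so the verdict rests on the repository state rather than on the agent's own report.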
When AI characters drift, the story breaks. Phionyx detects narrative drift, state incoherence, and unsafe output before the scene reaches the player.
- phionyx-research ships the NPC drift reference trace under `examples/physics/` (source-inspectable today; end-to-end runnable from the v0.6.0 classifier surface).
- trace.phionyx.ai/school — School RPG demo (external surface) running the same coherence mechanism end-to-end.
→ Read the full argument: phionyx.ai/narrative-coherence
Every claim should be reproducible. Verify Phionyx through installable packages, tests, evidence rows, and public artefacts.
- phionyx-evaluation-standard — vendor-independent evaluation standard. v0.2.0 (released 2026-05-24) ships the Evidence-Oriented Runtime Telemetry Profile + JSON Schema + worked evidence rows.
- phionyx-eval-inspect — Inspect AI bridge. Runtime evidence exported into Inspect `.eval` evaluation logs. Replayable agent evaluations.
- phionyx_langchain_langgraph — LangChain + LangGraph adapters. Every chain / tool / LLM event + supervisor handoff becomes a signed, hash-chained envelope.
- phionyx_openai_agents — OpenAI Agents SDK tracing bridge. Every Trace and Span becomes a signed, hash-chained envelope.
→ Read the full Evidence Matrix: phionyx.ai/evidence
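For readers unfamiliar with the phrase "signed, hash-chained envelope" used in the list above, here is a minimal sketch of the general technique using only the Python standard library. It makes several assumptions that are not taken from the Phionyx packages: the envelope fields, the `make_envelope` and `verify_chain` helpers, and the use of HMAC-SHA256 as a stand-in for whatever signature scheme the real adapters use.

```python
import hashlib
import hmac
import json
from typing import Dict, List

GENESIS = "0" * 64  # placeholder "previous hash" for the first envelope


def make_envelope(event: Dict, prev_hash: str, key: bytes) -> Dict:
    """Wrap an event so that it commits to the envelope that came before it."""
    # Canonical JSON so the same event always serialises to the same bytes.
    payload = json.dumps(event, sort_keys=True, separators=(",", ":"))
    digest = hashlib.sha256((prev_hash + payload).encode("utf-8")).hexdigest()
    # HMAC is a symmetric stand-in here; a real system may use asymmetric signatures.
    signature = hmac.new(key, digest.encode("utf-8"), hashlib.sha256).hexdigest()
    return {"event": event, "prev_hash": prev_hash, "hash": digest, "signature": signature}


def verify_chain(envelopes: List[Dict], key: bytes) -> bool:
    """Recompute every hash and signature; any edit or reordering breaks the chain."""
    prev = GENESIS
    for env in envelopes:
        if env["prev_hash"] != prev:
            return False
        expected = make_envelope(env["event"], prev, key)
        if not hmac.compare_digest(env["hash"], expected["hash"]):
            return False
        if not hmac.compare_digest(env["signature"], expected["signature"]):
            return False
        prev = env["hash"]
    return True
```

Each envelope's hash covers the previous hash, so altering or removing an earlier event invalidates every envelope after it, which is what makes the log usable as audit evidence.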
- LLM output is not truth; it is a signal requiring governance.
- AI systems need runtime control, not only prompt-level safety.
- Safety, coherence, and telemetry should be structured before response release.
- Evaluation must include behavioural stability, not only benchmark performance.
- Human-facing AI should be explainable, auditable, and interruptible.
- Phionyx Evaluation Standard v0.2.0 — Evidence-Oriented Runtime Telemetry Profile (2026-05-24 · Release)
- Persistent Worlds Need Deterministic Governance (2026-05-22 · Substack post 5 · link)
- A model saying "fixed" is not evidence (2026-05-22 · X Article · link)
- MCP Connects Tools. Runtime Evidence Keeps Agents Accountable. (2026-05-19 · X Article · link)
- The Phionyx Architecture: Treating LLMs as Sensors, Not Oracles (2026-05-09 · Substack post 4 · link)
- Website: phionyx.ai — runtime evidence, bounded authority, narrative coherence
- Trace (narrative + School RPG demo): trace.phionyx.ai · @trace_phionyx
- Substack: phionyxresearch.substack.com
- X: @phionyx_ai
- ORCID: 0009-0002-3718-4010
If runtime evidence for agentic AI is a problem you have, watch phionyx-research to get email updates when new experiments ship.


