Applied Epistemic Engineering toolkit for AI-assisted development.
Intelligence proposes. Constraints decide. The ledger remembers.
specsmith treats belief systems like code: codable, testable, and deployable. It scaffolds epistemically-governed projects, stress-tests requirements as BeliefArtifacts, runs cryptographically-sealed trace vaults, and orchestrates AI agents under formal AEE governance.
0.11.0 — EU AI Act / NIST AI RMF compliance, context window management, and governance tools panel. Specsmith now ships a full compliance and auditability layer aligned to the EU AI Act (2024/1689) and the NIST AI Risk Management Framework 1.0. Every agent action is cryptographically sealed, every AI-generated output is disclosed, context windows are GPU-aware and protected against overflow, and a dedicated governance tools panel in Kairos surfaces compliance settings per-session and per-project.
specsmith governance-serve --port 7700 # Kairos governance REST API
specsmith sync # sync .specsmith/ from docs/ markdown
specsmith agent permissions-check git_push # check tool permission (REQ-012)
specsmith ollama gpu # detect GPU VRAM, recommend context size
specsmith export # generate full compliance report
It also co-installs the standalone epistemic Python library for direct use in any project:
from epistemic import AEESession # works in any Python 3.10+ project
from epistemic import BeliefArtifact, StressTester, CertaintyEngine
AEE treats requirements, decisions, and assumptions (the beliefs your project depends on) as engineering artifacts subject to the same discipline as code: version control, testing, and refactoring.
The 4-step core method: Frame → Disassemble → Stress-Test → Reconstruct
The 5 foundational axioms:
- Observability — every belief must be inspectable
- Falsifiability — every belief must be challengeable
- Irreducibility — beliefs decompose to atomic primitives
- Reconstructability — every failed belief can be rebuilt
- Convergence — stress-test + recovery always reaches Equilibrium
specsmith tracks your project through the full AEE development cycle:
🌱 Inception → 🏗 Architecture → 📋 Requirements → ✅ Test Spec
→ ⚙ Implementation → 🔬 Verification → 🚀 Release
specsmith phase # show current phase + readiness checklist
specsmith phase next # advance to the next phase (runs checks first)
specsmith phase set requirements # jump to a specific phase
specsmith phase list # list all phases
The current phase is persisted in scaffold.yml as aee_phase and displayed in the
Kairos Governance page. Each phase has a checklist of file/command criteria, recommended
commands, and a readiness percentage.
Recommended — via pipx (works with Kairos, any terminal, and CI):
pipx install specsmith # core CLI + epistemic library
pipx inject specsmith anthropic # + Claude support
pipx inject specsmith openai # + GPT / O-series support
pipx inject specsmith google-generativeai # + Gemini support
Or with pip:
pip install specsmith # core
pip install "specsmith[anthropic]" # + Claude
pip install "specsmith[openai]" # + GPT/O-series
pip install "specsmith[gemini]" # + Gemini
Update:
pipx upgrade specsmith
specsmith self-update
# New project (interactive)
specsmith init
# Adopt an existing project
specsmith import --project-dir ./my-project
# Check governance health
specsmith audit --project-dir ./my-project
# Run AEE stress-test on requirements
specsmith stress-test --project-dir ./my-project
# Full epistemic audit (certainty + logic knots + recovery proposals)
specsmith epistemic-audit --project-dir ./my-project
# Start the agentic REPL
specsmith run --project-dir ./my-project
# AG2 agent shell — Planner/Builder/Verifier over Ollama
specsmith agent status # check agent config + Ollama
specsmith agent plan "add logging" # plan only (no execution)
specsmith agent run "fix lint errors" # full Plan → Build → Verify
specsmith agent improve "add tests" # self-improvement with reports
specsmith agent verify # run Verifier on current state
specsmith agent reports # list improvement reports
# Check current AEE workflow phase
specsmith phase --project-dir ./my-project
.specsmith/ always mirrors the human-readable docs/ governance files.
Run specsmith sync after any change to docs/REQUIREMENTS.md or docs/TESTS.md:
specsmith sync # regenerate .specsmith/requirements.json + testcases.json
specsmith sync --check # CI mode: exits 1 if out of sync without writing
specsmith sync --json # emit sync result as JSON
specsmith agent permissions # show active permission profile
specsmith agent permissions-check git_push # check if git_push is allowed
specsmith agent permissions-check git_push --no-log # dry-run (no ledger write)
Configure in docs/SPECSMITH.yml:
agent:
  permissions:
    preset: standard # read_only | standard | extended | admin
    # Or custom:
    allow: [read_file, write_file, run_shell, git_status]
    deny: [git_push, git_create_pr]
specsmith is designed from the ground up for auditable, explainable, and human-overseen AI. It implements concrete compliance mechanisms mapped to the two major regulatory frameworks that govern AI systems in production today.
EU AI Act (Regulation 2024/1689) — The world's first comprehensive legal framework for AI, enforced across the European Union. High-risk AI systems must provide transparency, auditability, human oversight, and robustness. specsmith implements:
| EU AI Act Requirement | specsmith Mechanism |
|---|---|
| Art. 9 — Risk Management System | AEE verification loop with confidence scoring and equilibrium checks |
| Art. 12 — Logging & Record-Keeping | TraceVault SHA-256 chained ledger (tamper-evident, append-only) |
| Art. 13 — Transparency & Explainability | ai_disclosure block in every preflight response; /why in Nexus REPL |
| Art. 14 — Human Oversight | Human escalation threshold (--escalate-threshold); kill-switch CLI |
| Art. 15 — Accuracy & Robustness | Bounded retry (max 3×), confidence gates, hard context ceiling (REQ-247) |
| Art. 53 — GPAI Model Transparency | Provider + model name emitted in every ai_disclosure block |
NIST AI Risk Management Framework 1.0 (AI RMF): a voluntary US framework for managing AI risk across the AI lifecycle. specsmith addresses all four core functions:
| NIST AI RMF Function | specsmith Mechanism |
|---|---|
| GOVERN — Policies & accountability | Governance rules (H1–H13), permissions profile, scaffold.yml policy |
| MAP — Risk identification | AEE stress-test, belief graph, contradictions and uncertainty metrics |
| MEASURE — Risk analysis | Confidence scoring, epistemic equilibrium, specsmith epistemic-audit |
| MANAGE — Risk treatment | Kill-switch, escalation, bounded retry, safe-write backup, permissions deny-list |
Every agent action, decision, milestone, and audit gate is recorded as a JSONL entry in
.specsmith/trace.jsonl. Each entry contains a SHA-256 hash of its own content plus the
hash of the previous entry, forming a cryptographic chain:
{"seq":1, "type":"DECISION", "description":"...", "hash":"a3f9...", "prev":"genesis"}
{"seq":2, "type":"MILESTONE", "description":"...", "hash":"7c2b...", "prev":"a3f9..."}
Any modification to a past entry breaks every subsequent hash. specsmith trace verify
detects and reports the first corrupted entry. The file is append-only — overwrites are
blocked by safe_write. This satisfies EU AI Act Art. 12 (logging and record-keeping)
and NIST AI RMF GOVERN (accountability trail).
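The chaining described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not specsmith's implementation: the helper names `seal_entry` and `verify_chain` are hypothetical, and because the exact fields and serialization that specsmith hashes are not documented here, the sketch hashes a canonical JSON dump of each entry without its own `hash` field.

```python
import hashlib
import json
from typing import Optional


def _digest(body: dict) -> str:
    # Assumption: canonical JSON (sorted keys) is the hashed representation.
    return hashlib.sha256(json.dumps(body, sort_keys=True).encode("utf-8")).hexdigest()


def seal_entry(prev_hash: str, seq: int, entry_type: str, description: str) -> dict:
    """Build one entry whose hash covers its content plus the previous hash."""
    body = {"seq": seq, "type": entry_type, "description": description, "prev": prev_hash}
    return {**body, "hash": _digest(body)}


def verify_chain(entries: list) -> Optional[int]:
    """Return the seq of the first corrupted entry, or None if the chain is intact."""
    prev_hash = "genesis"
    for entry in entries:
        body = {key: value for key, value in entry.items() if key != "hash"}
        if entry["prev"] != prev_hash or entry["hash"] != _digest(body):
            return entry["seq"]
        prev_hash = entry["hash"]
    return None


first = seal_entry("genesis", 1, "DECISION", "adopt JSONL ledger")
second = seal_entry(first["hash"], 2, "MILESTONE", "requirements frozen")
intact = verify_chain([first, second])

# Editing a past entry leaves its stored hash stale, so verification flags it.
tampered_first = dict(first, description="edited later")
broken_at = verify_chain([tampered_first, second])
```

Because each entry stores the previous entry's hash, an attacker who rewrites one entry and recomputes its hash still breaks the `prev` link of the entry that follows.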
Every preflight response includes a mandatory ai_disclosure block:
{
  "ai_disclosure": {
    "governed_by": "specsmith",
    "governance_gated": true,
    "provider": "ollama",
    "model": "qwen2.5:14b",
    "spec_version": "0.11.0"
  }
}
This ensures every AI-generated output is traceable to its source model and version, meeting EU AI Act Art. 13 (transparency) and Art. 53 (GPAI transparency). It is impossible to suppress: the field is injected at the governance layer before any response is returned to the client.
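One way to picture "injected at the governance layer" is a wrapper that writes the disclosure last, so nothing upstream can replace it. The sketch below is illustrative only; `attach_disclosure` is a hypothetical helper, and the `decision` key in the sample payload is borrowed from the preflight fields listed elsewhere in this README.

```python
def attach_disclosure(payload: dict, provider: str, model: str, spec_version: str) -> dict:
    """Return a copy of payload carrying an ai_disclosure block callers cannot override."""
    disclosure = {
        "governed_by": "specsmith",
        "governance_gated": True,
        "provider": provider,
        "model": model,
        "spec_version": spec_version,
    }
    # The disclosure key is merged last, so a caller-supplied value is discarded.
    return {**payload, "ai_disclosure": disclosure}


response = attach_disclosure(
    {"decision": "accepted", "ai_disclosure": {"model": "spoofed"}},
    provider="ollama",
    model="qwen2.5:14b",
    spec_version="0.11.0",
)
```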
When an action's confidence is below the escalation threshold, specsmith sets
escalation_required: true and includes an escalation_reason in the preflight payload.
Kairos surfaces this as a confirmation dialog before execution proceeds.
specsmith preflight "deploy to production" --escalate-threshold 0.85 --json
# → escalation_required: true, escalation_reason: "confidence 0.71 < threshold 0.85"
This implements EU AI Act Art. 14 (human oversight) and NIST AI RMF MANAGE.
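The threshold comparison itself is simple. The following sketch reproduces the two payload fields shown above; `escalation_fields` is a hypothetical name, and omitting `escalation_reason` when no escalation is needed is an assumption rather than documented behaviour.

```python
def escalation_fields(confidence: float, threshold: float) -> dict:
    """Return the escalation fields for a preflight payload."""
    if confidence < threshold:
        return {
            "escalation_required": True,
            "escalation_reason": f"confidence {confidence:.2f} < threshold {threshold:.2f}",
        }
    # Assumption: no reason is attached when the action clears the threshold.
    return {"escalation_required": False}


low = escalation_fields(0.71, 0.85)
high = escalation_fields(0.90, 0.85)
```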
A kill-session CLI command and keyboard shortcut (surfaced in Kairos) immediately
terminates all active agent sessions and records a timestamped kill event in LEDGER.md:
specsmith kill-session # terminate all sessions, log kill event
specsmith kill-session --session abc123 # terminate a specific session
This satisfies EU AI Act Art. 14 §4 (ability to intervene and stop the AI system) and is required for certification of high-risk AI systems.
All governance file writes go through safe_write, which:
- Appends to LEDGER.md and .specsmith/ledger.jsonl, never truncates
- Backs up any file before overwriting it (timestamped .bak copy)
- Prevents accidental destruction of audit history
This satisfies EU AI Act Art. 12 (records must be kept for the lifetime of the system) and provides recovery capability per NIST AI RMF MANAGE.
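The two behaviours in the list above (append-only ledgers, backup before overwrite) can be sketched with the standard library. The function names and the backup filename pattern are hypothetical; the real `safe_write` may differ in both.

```python
import shutil
import tempfile
import time
from pathlib import Path
from typing import Optional


def backup_then_write(path: Path, text: str) -> Optional[Path]:
    """Copy an existing file to a timestamped .bak sibling, then overwrite it."""
    backup = None
    if path.exists():
        stamp = time.strftime("%Y%m%dT%H%M%S")
        # Assumption: the backup lives next to the original file.
        backup = path.with_name(f"{path.name}.{stamp}.bak")
        shutil.copy2(path, backup)
    path.write_text(text, encoding="utf-8")
    return backup


def append_line(path: Path, line: str) -> None:
    """Append one line to a ledger; mode 'a' never truncates existing content."""
    with path.open("a", encoding="utf-8") as handle:
        handle.write(line.rstrip("\n") + "\n")


workdir = Path(tempfile.mkdtemp())
target = workdir / "REQUIREMENTS.md"
target.write_text("v1\n", encoding="utf-8")
backup_path = backup_then_write(target, "v2\n")

ledger = workdir / "ledger.jsonl"
append_line(ledger, '{"seq": 1}')
append_line(ledger, '{"seq": 2}')
```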
Every agent tool call is gated through a permission profile. Tools outside the active profile are denied with exit code 3 and a ledger entry:
specsmith agent permissions-check git_push # exit 0 = allowed, exit 3 = denied
specsmith agent permissions # show active profile
Four built-in presets (read_only, standard, extended, admin) are available, plus full
custom allow/deny lists in .specsmith/config.yml. This implements NIST AI RMF GOVERN
(policy enforcement) and follows the principle of least privilege.
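A minimal sketch of an allow/deny gate, using the custom lists from the configuration example and the exit codes stated above. Two points are assumptions rather than documented behaviour: that deny entries take precedence over allow entries, and that tools absent from both lists are denied. `check_tool` is a hypothetical name.

```python
ALLOWED_EXIT = 0
DENIED_EXIT = 3


def check_tool(tool: str, allow: set, deny: set) -> int:
    """Return the exit code for a tool call under a custom allow/deny profile."""
    # Assumption: deny wins, and anything not explicitly allowed is denied.
    if tool in deny or tool not in allow:
        return DENIED_EXIT
    return ALLOWED_EXIT


allow = {"read_file", "write_file", "run_shell", "git_status"}
deny = {"git_push", "git_create_pr"}

push_code = check_tool("git_push", allow, deny)
read_code = check_tool("read_file", allow, deny)
unknown_code = check_tool("deploy_cluster", allow, deny)
```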
Before any shell command is executed, agent.safety.is_safe_command() classifies it
against a deny list of destructive patterns (rm -rf, git push origin main,
kubectl apply, cat .env, etc.). Denied commands are blocked and logged.
This implements NIST AI RMF MANAGE (risk treatment at the action level).
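To make the deny-list idea concrete, here is a small regex-based classifier covering the four patterns named above. It is a stand-in written for illustration, not the code behind `agent.safety.is_safe_command()`; the function name `looks_safe` and the exact regular expressions are hypothetical.

```python
import re

# Patterns taken from the examples in the text; the real deny list is longer.
DENY_PATTERNS = [
    re.compile(r"\brm\s+-rf\b"),
    re.compile(r"\bgit\s+push\s+origin\s+main\b"),
    re.compile(r"\bkubectl\s+apply\b"),
    re.compile(r"\bcat\s+\.env\b"),
]


def looks_safe(command: str) -> bool:
    """Return False when the command matches any destructive pattern."""
    return not any(pattern.search(command) for pattern in DENY_PATTERNS)


results = {
    command: looks_safe(command)
    for command in ["git status", "rm -rf dist", "cat .env", "git push origin main"]
}
```

Pattern matching of this kind is a coarse filter: it catches the literal forms listed but not every equivalent spelling of a destructive command, which is one reason to pair it with the permission profile.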
specsmith export generates a full compliance report containing:
- AI System Inventory — all providers, models, and versions used
- Risk Classification — AEE phase, confidence scores, open work items
- Human Oversight Controls — active permission profile, escalation settings, kill-switch state
- Audit Trail Summary — TraceVault chain length, last verification, any tampering
specsmith export --format markdown > compliance-report.md
specsmith export --format json > compliance-report.json
This report is suitable for submission to regulators, internal audit teams, or SOC-2 / ISO-42001 reviewers.
Compliance settings are layered:
- Global defaults: ~/.specsmith/config.yml (user-level defaults)
- Per-project policy: .specsmith/config.yml (committed to the repo)
- Per-session overrides: Kairos Governance panel or CLI flags
The Kairos Governance Tools Panel (Settings → Governance) exposes all compliance
controls in a live UI: escalation threshold, permission profile, kill-switch, audit log
viewer, and context window settings. Changes take effect immediately for the active
session and can optionally be written back to the per-project .specsmith/config.yml.
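The layering can be pictured as a recursive dictionary merge in which later layers win. The sketch assumes that precedence runs global, then project, then session (the order listed above); `merge_layers` and the `escalate_threshold` key are hypothetical, while the `context` keys match the configuration shown later in this README.

```python
def merge_layers(*layers: dict) -> dict:
    """Merge config dicts left to right; later layers override earlier ones."""
    merged: dict = {}
    for layer in layers:
        for key, value in layer.items():
            if isinstance(value, dict) and isinstance(merged.get(key), dict):
                # Nested sections are merged key by key instead of replaced.
                merged[key] = merge_layers(merged[key], value)
            else:
                merged[key] = value
    return merged


global_defaults = {"context": {"compression_threshold_pct": 80, "auto_compress": True}}
project_policy = {"context": {"compression_threshold_pct": 70}}
session_overrides = {"escalate_threshold": 0.85}  # hypothetical key name

effective = merge_layers(global_defaults, project_policy, session_overrides)
```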
specsmith enforces safe, efficient use of LLM context windows — especially critical when running local models via Ollama where the context limit directly affects GPU VRAM.
specsmith ollama gpu # detect GPU VRAM (NVIDIA + AMD supported)
specsmith ollama available # show models within your VRAM budget
VRAM tiers and recommended context sizes:
| VRAM | Recommended Context |
|---|---|
| < 6 GB (CPU or low-end GPU) | 4,096 tokens |
| 6–11 GB | 8,192 tokens |
| 12–19 GB | 16,384 tokens |
| 20 GB+ | 32,768 tokens |
Override via SPECSMITH_OLLAMA_CONTEXT_LENGTH or ollama.context_length in .specsmith/config.yml.
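The tier table maps directly onto a lookup function. In this sketch the function names are hypothetical, fractional VRAM values are assumed to fall into the lower tier (11.5 GB counts as the 6–11 GB tier), and only the environment-variable override is modelled, not the `ollama.context_length` config key.

```python
import os
from typing import Mapping, Optional

ENV_VAR = "SPECSMITH_OLLAMA_CONTEXT_LENGTH"


def recommended_context(vram_gb: float) -> int:
    """Map detected VRAM (in GB) to the recommended context size in tokens."""
    if vram_gb < 6:
        return 4096
    if vram_gb < 12:
        return 8192
    if vram_gb < 20:
        return 16384
    return 32768


def resolve_context_length(vram_gb: float, env: Optional[Mapping[str, str]] = None) -> int:
    """Prefer the environment override when set, else use the VRAM tier."""
    env = os.environ if env is None else env
    override = env.get(ENV_VAR)
    return int(override) if override else recommended_context(vram_gb)


tiers = [recommended_context(gb) for gb in (4, 8, 16, 24)]
overridden = resolve_context_length(24, {ENV_VAR: "8192"})
```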
The context fill tracker emits real-time JSONL events consumed by Kairos:
{"type": "context_fill", "used": 27500, "limit": 32768, "pct": 83.9}
Kairos displays a compact fill bar in the agent footer. When fill reaches the compression threshold (default 80%), specsmith signals that context summarization should run before the next turn.
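The event above can be reproduced with a one-line percentage calculation. The helper names are hypothetical, and rounding `pct` to one decimal place is inferred from the sample event rather than documented.

```python
import json


def context_fill_event(used: int, limit: int) -> dict:
    """Build a context_fill event with the fill percentage rounded to one decimal."""
    return {
        "type": "context_fill",
        "used": used,
        "limit": limit,
        "pct": round(100 * used / limit, 1),
    }


def needs_compression(event: dict, threshold_pct: float = 80.0) -> bool:
    """True once fill has reached the compression threshold."""
    return event["pct"] >= threshold_pct


event = context_fill_event(27500, 32768)
line = json.dumps(event)  # one JSONL line
compress = needs_compression(event)
```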
When fill reaches the compression threshold, specsmith automatically triggers conversation summarization — the current context is condensed to a compact summary that preserves key decisions and facts while freeing window space. This happens transparently before the next agent turn.
Configure in .specsmith/config.yml:
context:
  compression_threshold_pct: 80 # trigger summarization at 80% fill
  auto_compress: true # enable automatic compression
A hard reservation of 15% of the context window (minimum 2,048 tokens) is always
held back for the governance layer. Attempts to fill beyond the effective ceiling raise
ContextFullError — making it impossible to reach a state where even a compression
request cannot be processed. This is a safety invariant, not a configuration option.
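As a worked example of the reservation rule: with a 32,768-token window the reserve is 15% (4,915 tokens, rounded down), leaving an effective ceiling of 27,853; with a 4,096-token window the 2,048-token minimum applies instead. The sketch below encodes that arithmetic. The `ContextFullError` class here is a local stand-in, `effective_ceiling` and `ensure_room` are hypothetical names, and rounding the 15% down is an assumption.

```python
RESERVE_PCT = 15
MIN_RESERVE_TOKENS = 2048


class ContextFullError(RuntimeError):
    """Stand-in for the error raised when the effective ceiling is exceeded."""


def effective_ceiling(limit: int) -> int:
    """Tokens usable by the conversation after the governance reservation."""
    reserve = max(limit * RESERVE_PCT // 100, MIN_RESERVE_TOKENS)
    return limit - reserve


def ensure_room(used: int, limit: int) -> int:
    """Return the remaining usable tokens, or raise if the ceiling is exceeded."""
    ceiling = effective_ceiling(limit)
    if used > ceiling:
        raise ContextFullError(f"{used} tokens exceeds effective ceiling {ceiling}")
    return ceiling - used


remaining = ensure_room(27500, 32768)
try:
    ensure_room(28000, 32768)
    overflowed = False
except ContextFullError:
    overflowed = True
```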
Kairos is the companion Rust terminal runtime (BitConcepts/kairos). specsmith
acts as the governance backend: Kairos spawns specsmith governance-serve at startup
and routes all preflight and verify calls through it.
# Start the governance REST API (Kairos calls this automatically)
specsmith governance-serve --port 7700 --project-dir .
# Classify a natural-language utterance under Specsmith governance
specsmith preflight "fix the cleanup dry-run regression" --json
# Start the agentic REPL
specsmith run
> what does the cleanup module do? # read-only ask -> answered
> fix the cleanup dry-run regression # change -> Specsmith approves, runs
> delete the entire dist directory # destructive -> needs clarification
The Nexus runtime is specsmith's local-first agentic REPL, a governance-gated broker that sits between you and the LLM.
Every utterance passes through specsmith preflight before execution.
The broker classifies intent, matches requirements, and gates the action.
After execution, specsmith verify checks equilibrium. The /why command
shows the full governance trace.
# Interactive REPL with governance
specsmith run
nexus> fix the cleanup bug # broker classifies → accepts → executes → verifies
nexus> /why # show governance trace for last action
nexus> /exit
The Nexus broker:
- Preflight gate: every change goes through specsmith preflight
- Bounded retry: failed actions retry up to 3× with strategy classification
- Execution trace: every action is sealed in the cryptographic trace vault
- /why toggle: shows governance rationale in human-readable form
**How it works.** A natural-language **broker** classifies intent, infers scope from
your requirements, and asks Specsmith to **preflight** the request. Only when the
preflight decision is `accepted` does Nexus drive the AG2 orchestrator, and it does so
through a **bounded-retry harness** so a failing action cannot retry indefinitely. By default,
Nexus speaks plain English; toggle `/why` in the REPL to surface the underlying
requirement, test, and work-item identifiers Specsmith assigned.
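A bounded-retry harness can be as small as a counted loop. The sketch below is a generic illustration, not the code in `src/specsmith/agent/`: `run_with_bounded_retry` is a hypothetical name, "up to 3×" is read as three attempts in total, and the strategy classification step is not modelled.

```python
from typing import Callable, Optional


def run_with_bounded_retry(action: Callable[[int], bool], max_attempts: int = 3) -> Optional[int]:
    """Call action(attempt) until it succeeds; return the winning attempt or None."""
    for attempt in range(1, max_attempts + 1):
        if action(attempt):
            return attempt
    # The loop always terminates, so a failing action cannot run away.
    return None


flaky_calls = []


def flaky(attempt: int) -> bool:
    flaky_calls.append(attempt)
    return attempt == 2  # succeeds on the second try


failing_calls = []


def always_fails(attempt: int) -> bool:
    failing_calls.append(attempt)
    return False


succeeded_on = run_with_bounded_retry(flaky)
gave_up = run_with_bounded_retry(always_fails)
```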
**Pieces in this repo.**
- `specsmith preflight` — CLI subcommand emitting a deterministic governance JSON payload
(`decision`, `requirement_ids`, `test_case_ids`, `confidence_target`, `instruction`).
- `src/specsmith/agent/broker.py` — natural-language broker (intent + scope + narration).
- `src/specsmith/agent/repl.py` — Nexus REPL with the `/why` toggle and execution gate.
- `docker-compose.yml` — pinned vLLM `l1-nexus` model server with the Hermes tool-call parser.
- `scripts/nexus_smoke.py` — opt-in live smoke test (`NEXUS_LIVE=1` to run against
a running container).
---
## Kairos — Flagship Terminal Client
**[Kairos](https://github.com/BitConcepts/kairos)** is the recommended terminal client for specsmith.
Kairos spawns specsmith as a managed governance child process at startup and routes all
preflight, verify, and BYOE proxy calls through it. The Governance settings page shows live
specsmith status, version, and one-click update.
```bash
# Kairos starts specsmith automatically; or run manually:
specsmith governance-serve --port 7700 --project-dir .
```
The VS Code extension (specsmith-vscode) has been deprecated in favour of Kairos.
Use pipx install specsmith for standalone CLI usage from any terminal.
specsmith is open source and built by a small team. Every bit of support helps:
- ⭐ Star specsmith and kairos on GitHub
- 📣 Tell your friends and colleagues — word of mouth is our best marketing
- 🐛 Report bugs via GitHub Issues — even small ones help
- 💡 Suggest features via GitHub Discussions — we read every suggestion
- 🔧 Fix bugs and contribute — see CONTRIBUTING.md; PRs welcome
- 📝 Write about specsmith — blog posts, tutorials, and talks help the community grow
- ❤️ Sponsor BitConcepts — directly funds development
specsmith has first-class Ollama support, including:
specsmith ollama gpu # detect GPU and VRAM tier
specsmith ollama available # show catalog filtered by VRAM budget
specsmith ollama available --task code # filter by task type
specsmith ollama pull qwen2.5:14b # download a model
specsmith ollama suggest requirements # task-based recommendations
specsmith ollama list # show installed models
GPU-aware context sizing: 4K/8K/16K/32K tokens based on detected VRAM.
Override via SPECSMITH_OLLAMA_CONTEXT_LENGTH env var or ollama.context_length in .specsmith/config.yml.
specsmith supports FPGA-specific project types with full governance:
# scaffold.yml
type: fpga-rtl-amd # or fpga-rtl-intel / fpga-rtl-lattice / fpga-rtl
fpga_tools:
- vivado
- gtkwave
- vsg
- ghdl
- verilator
Supported tools: Synthesis: vivado, quartus, radiant, diamond, gowin. Simulation: ghdl, iverilog, verilator, modelsim, questasim, xsim. Waveform: gtkwave, surfer. Linting: vsg, verible, svlint. Formal: symbiyosys. OSS flow: yosys, nextpnr, openFPGALoader.
Governance: init import audit validate diff upgrade compress doctor export architect
AEE Epistemic: stress-test epistemic-audit belief-graph trace seal/verify/log integrate
Workflow: phase show/set/next/list ledger add/list req list/add/gaps/trace
Agent: run agent run/plan/status/verify/improve/reports agent providers/tools/skills
Ollama: ollama list/available/gpu/pull/suggest
Workspace: workspace init/audit/export
VCS: commit push sync branch pr status
Tools: tools scan [--fpga] tools install <tool> tools rules [--tool] [--list]
Tools: exec ps abort watch optimize credits self-update
Auth: auth set/list/remove/check
Patent: patent search/prior-art
Software: Python CLI/lib/web, Rust, Go, C/C++, .NET, Node.js/TypeScript, mobile, microservices, data/ML.
Hardware/Embedded: FPGA/RTL (Xilinx, Intel, Lattice, generic), Yocto BSP, embedded C/C++.
Documents: Technical specs, research papers, API specs, requirements management.
Business/Legal: Business plans, patent applications, compliance frameworks.
The standalone epistemic Python library works in any Python 3.10+ project — no specsmith coupling:
from epistemic import AEESession, BeliefArtifact, StressTester
session = AEESession("my-project", threshold=0.70)
session.add_belief(
artifact_id="HYP-001",
propositions=["The API always returns valid JSON"],
epistemic_boundary=["Valid auth token required"],
)
session.accept("HYP-001")
result = session.run()
print(result.summary())
# certainty=0.55, failures=2, equilibrium=False
Use cases: linguistics research, compliance pipelines, AI alignment, patent prosecution.
13 hard rules enforced by specsmith validate:
- H11: Every loop or blocking wait must have a timeout, fallback exit, and diagnostic message.
- H12: Windows multi-step automation goes into .cmd files, not inline shell invocations.
- H13: Agent tools must declare epistemic contracts (what they claim and what they cannot detect).
specsmith governs itself — the specsmith repo is a specsmith-managed project. Run specsmith audit
in this repo to check its governance health. This means every feature we add to specsmith is
immediately dogfooded on specsmith itself. Kairos
is the companion terminal and flagship client.
specsmith.readthedocs.io — Full manual: AEE primer, command reference, project types, tool registry, governance model, Ollama guide, Kairos integration.
MIT — Copyright (c) 2026 BitConcepts, LLC.