specsmith

CI Sponsor Docs PyPI Python 3.10+ License: MIT

Applied Epistemic Engineering toolkit for AI-assisted development.

Intelligence proposes. Constraints decide. The ledger remembers.

specsmith treats belief systems like code: codable, testable, and deployable. It scaffolds epistemically-governed projects, stress-tests requirements as BeliefArtifacts, runs cryptographically-sealed trace vaults, and orchestrates AI agents under formal AEE governance.

0.11.0 — EU AI Act / NIST AI RMF compliance, context window management, and governance tools panel. Specsmith now ships a full compliance and auditability layer aligned to the EU AI Act (2024/1689) and the NIST AI Risk Management Framework 1.0. Every agent action is cryptographically sealed, every AI-generated output is disclosed, context windows are GPU-aware and protected against overflow, and a dedicated governance tools panel in Kairos surfaces compliance settings per-session and per-project.

specsmith governance-serve --port 7700     # Kairos governance REST API
specsmith sync                              # sync .specsmith/ from docs/ markdown
specsmith agent permissions-check git_push # check tool permission (REQ-012)
specsmith ollama gpu                        # detect GPU VRAM, recommend context size
specsmith export                            # generate full compliance report

It also co-installs the standalone epistemic Python library for direct use in any project:

from epistemic import AEESession         # works in any Python 3.10+ project
from epistemic import BeliefArtifact, StressTester, CertaintyEngine

What is Applied Epistemic Engineering?

AEE treats requirements, decisions, and assumptions — the beliefs your project depends on — as engineering artifacts subject to the same discipline as code: version control, testing, and refactoring.

The 4-step core method: Frame → Disassemble → Stress-Test → Reconstruct

The 5 foundational axioms:

  1. Observability — every belief must be inspectable
  2. Falsifiability — every belief must be challengeable
  3. Irreducibility — beliefs decompose to atomic primitives
  4. Reconstructability — every failed belief can be rebuilt
  5. Convergence — stress-test + recovery always reaches Equilibrium

The AEE Workflow — 7 Phases

specsmith tracks your project through the full AEE development cycle:

🌱 Inception → 🏗 Architecture → 📋 Requirements → ✅ Test Spec
    → ⚙ Implementation → 🔬 Verification → 🚀 Release

specsmith phase          # show current phase + readiness checklist
specsmith phase next     # advance to the next phase (runs checks first)
specsmith phase set requirements  # jump to a specific phase
specsmith phase list     # list all phases

The current phase is persisted in scaffold.yml as aee_phase and displayed in the Kairos Governance page. Each phase has a checklist of file/command criteria, recommended commands, and a readiness percentage.


Install

Recommended — via pipx (works with Kairos, any terminal, and CI):

pipx install specsmith                    # core CLI + epistemic library
pipx inject specsmith anthropic           # + Claude support
pipx inject specsmith openai              # + GPT / O-series support
pipx inject specsmith google-generativeai # + Gemini support

Or with pip:

pip install specsmith                     # core
pip install "specsmith[anthropic]"       # + Claude
pip install "specsmith[openai]"          # + GPT/O-series
pip install "specsmith[gemini]"          # + Gemini

Update:

pipx upgrade specsmith
specsmith self-update

Quick Start

# New project (interactive)
specsmith init

# Adopt an existing project
specsmith import --project-dir ./my-project

# Check governance health
specsmith audit --project-dir ./my-project

# Run AEE stress-test on requirements
specsmith stress-test --project-dir ./my-project

# Full epistemic audit (certainty + logic knots + recovery proposals)
specsmith epistemic-audit --project-dir ./my-project

# Start the agentic REPL
specsmith run --project-dir ./my-project

# AG2 agent shell — Planner/Builder/Verifier over Ollama
specsmith agent status                    # check agent config + Ollama
specsmith agent plan "add logging"        # plan only (no execution)
specsmith agent run "fix lint errors"     # full Plan → Build → Verify
specsmith agent improve "add tests"       # self-improvement with reports
specsmith agent verify                    # run Verifier on current state
specsmith agent reports                   # list improvement reports

# Check current AEE workflow phase
specsmith phase --project-dir ./my-project

Machine State Sync

.specsmith/ always mirrors the human-readable docs/ governance files. Run specsmith sync after any change to docs/REQUIREMENTS.md or docs/TESTS.md:

specsmith sync                     # regenerate .specsmith/requirements.json + testcases.json
specsmith sync --check             # CI mode: exits 1 if out of sync without writing
specsmith sync --json              # emit sync result as JSON

Least-Privilege Agent Permissions (REQ-012)

specsmith agent permissions                      # show active permission profile
specsmith agent permissions-check git_push       # check if git_push is allowed
specsmith agent permissions-check git_push --no-log  # dry-run (no ledger write)

Configure in docs/SPECSMITH.yml:

agent:
  permissions:
    preset: standard       # read_only | standard | extended | admin
    # Or custom:
    allow: [read_file, write_file, run_shell, git_status]
    deny:  [git_push, git_create_pr]

AI Compliance & Governance

specsmith is designed from the ground up for auditable, explainable, and human-overseen AI. It implements concrete compliance mechanisms mapped to the two major regulatory frameworks that govern AI systems in production today.

Standards Coverage

EU AI Act (Regulation 2024/1689) — The world's first comprehensive legal framework for AI, enforced across the European Union. High-risk AI systems must provide transparency, auditability, human oversight, and robustness. specsmith implements:

| EU AI Act Requirement | specsmith Mechanism |
| --- | --- |
| Art. 9: Risk Management System | AEE verification loop with confidence scoring and equilibrium checks |
| Art. 12: Logging & Record-Keeping | TraceVault SHA-256 chained ledger (tamper-evident, append-only) |
| Art. 13: Transparency & Explainability | ai_disclosure block in every preflight response; /why in Nexus REPL |
| Art. 14: Human Oversight | Human escalation threshold (--escalate-threshold); kill-switch CLI |
| Art. 15: Accuracy & Robustness | Bounded retry (max 3×), confidence gates, hard context ceiling (REQ-247) |
| Art. 53: GPAI Model Transparency | Provider + model name emitted in every ai_disclosure block |

NIST AI Risk Management Framework 1.0 (AI RMF) — The US standard for managing AI risk across the AI lifecycle. specsmith addresses all four core functions:

| NIST AI RMF Function | specsmith Mechanism |
| --- | --- |
| GOVERN: Policies & accountability | Governance rules (H1–H13), permissions profile, scaffold.yml policy |
| MAP: Risk identification | AEE stress-test, belief graph, contradictions and uncertainty metrics |
| MEASURE: Risk analysis | Confidence scoring, epistemic equilibrium, specsmith epistemic-audit |
| MANAGE: Risk treatment | Kill-switch, escalation, bounded retry, safe-write backup, permissions deny-list |

How Each Compliance Mechanism Works

1. Tamper-Evident Audit Log — TraceVault (REQ-206)

Every agent action, decision, milestone, and audit gate is recorded as a JSONL entry in .specsmith/trace.jsonl. Each entry contains a SHA-256 hash of its own content plus the hash of the previous entry, forming a cryptographic chain:

{"seq":1, "type":"DECISION", "description":"...", "hash":"a3f9...", "prev":"genesis"}
{"seq":2, "type":"MILESTONE", "description":"...", "hash":"7c2b...", "prev":"a3f9..."}

Any modification to a past entry breaks every subsequent hash. specsmith trace verify detects and reports the first corrupted entry. The file is append-only — overwrites are blocked by safe_write. This satisfies EU AI Act Art. 12 (logging and record-keeping) and NIST AI RMF GOVERN (accountability trail).
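
The chaining described above can be illustrated with a minimal sketch. It is not specsmith's TraceVault code: the canonical JSON serialization, the helper names (`entry_hash`, `append_entry`, `first_corrupted`), and the choice to hash every field except `hash` are assumptions made for illustration. Only the field names and the `"genesis"` sentinel come from the example entries.

```python
import hashlib
import json
from typing import Optional

GENESIS = "genesis"  # "prev" value of the first entry, as in the example entries


def entry_hash(entry: dict) -> str:
    """SHA-256 over the entry's canonical JSON, excluding its own "hash" field.

    Assumption: the real serialization used by specsmith may differ.
    """
    body = {k: v for k, v in entry.items() if k != "hash"}
    canonical = json.dumps(body, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def append_entry(chain: list, entry_type: str, description: str) -> dict:
    """Append a new entry whose "prev" is the hash of the current last entry."""
    prev = chain[-1]["hash"] if chain else GENESIS
    entry = {
        "seq": len(chain) + 1,
        "type": entry_type,
        "description": description,
        "prev": prev,
    }
    entry["hash"] = entry_hash(entry)
    chain.append(entry)
    return entry


def first_corrupted(chain: list) -> Optional[int]:
    """Return the seq of the first entry that fails verification, or None."""
    prev = GENESIS
    for entry in chain:
        if entry["prev"] != prev or entry["hash"] != entry_hash(entry):
            return entry["seq"]
        prev = entry["hash"]
    return None
```

Because each entry's hash covers its `prev` field, rewriting an old entry and recomputing its hash still leaves the next entry pointing at the old value, so the verifier flags the break either at the edited entry or at its successor.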

2. AI Disclosure — Every Response (REQ-207)

Every preflight response includes a mandatory ai_disclosure block:

{
  "ai_disclosure": {
    "governed_by": "specsmith",
    "governance_gated": true,
    "provider": "ollama",
    "model": "qwen2.5:14b",
    "spec_version": "0.11.0"
  }
}

This ensures every AI-generated output is traceable to its source model and version, meeting EU AI Act Art. 13 (transparency) and Art. 53 (GPAI transparency). It is impossible to suppress — the field is injected at the governance layer before any response is returned to the client.
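
One way to make such a field hard to suppress is to set it last, on a copy of the payload, so that nothing supplied by the caller can override it. The sketch below shows that idea only. The function name `with_disclosure` and its parameters are hypothetical and are not part of specsmith's API.

```python
def with_disclosure(payload: dict, provider: str, model: str, spec_version: str) -> dict:
    """Return a copy of payload with the ai_disclosure block written last.

    Writing the key after copying means a caller-supplied "ai_disclosure"
    value (or an attempt to null it out) is always replaced.
    """
    sealed = dict(payload)
    sealed["ai_disclosure"] = {
        "governed_by": "specsmith",
        "governance_gated": True,
        "provider": provider,
        "model": model,
        "spec_version": spec_version,
    }
    return sealed
```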

3. Human Escalation — Configurable Threshold (REQ-209)

When an action's confidence is below the escalation threshold, specsmith sets escalation_required: true and includes an escalation_reason in the preflight payload. Kairos surfaces this as a confirmation dialog before execution proceeds.

specsmith preflight "deploy to production" --escalate-threshold 0.85 --json
# → escalation_required: true, escalation_reason: "confidence 0.71 < threshold 0.85"

This implements EU AI Act Art. 14 (human oversight) and NIST AI RMF MANAGE.
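
The threshold comparison can be sketched as follows. The function name `escalation_fields` is hypothetical, and the reason string simply mirrors the format shown in the example output. How specsmith computes the confidence value itself is not covered here.

```python
def escalation_fields(confidence: float, threshold: float) -> dict:
    """Build the escalation fields of a preflight payload.

    Escalation is required when confidence is strictly below the threshold.
    """
    required = confidence < threshold
    fields = {"escalation_required": required}
    if required:
        fields["escalation_reason"] = (
            f"confidence {confidence:.2f} < threshold {threshold:.2f}"
        )
    return fields
```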

4. Kill-Switch — Immediate Session Termination (REQ-210)

A kill-session CLI command and keyboard shortcut (surfaced in Kairos) immediately terminates all active agent sessions and records a timestamped kill event in LEDGER.md:

specsmith kill-session                   # terminate all sessions, log kill event
specsmith kill-session --session abc123  # terminate a specific session

This satisfies EU AI Act Art. 14 §4 (ability to intervene and stop the AI system) and is required for certification of high-risk AI systems.

5. Append-Only Safe Write — safe_write (REQ-213)

All governance file writes go through safe_write, which:

  • Appends to LEDGER.md and .specsmith/ledger.jsonl — never truncates
  • Backs up any file before overwriting it (timestamped .bak copy)
  • Prevents accidental destruction of audit history

This satisfies EU AI Act Art. 12 (records must be kept for the lifetime of the system) and provides recovery capability per NIST AI RMF MANAGE.
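
The two behaviours listed above (append without truncating, back up before overwriting) can be sketched with the standard library. This is an illustration, not the real `safe_write`: the function names and the backup file naming scheme are assumptions.

```python
import shutil
import time
from pathlib import Path
from typing import Optional


def backup_then_write(path: Path, text: str) -> Optional[Path]:
    """Overwrite path with text, first copying any existing file to a
    timestamped .bak sibling. Returns the backup path, or None if the
    file did not exist yet.

    Assumption: the "<name>.<timestamp>.bak" naming is illustrative only.
    Two writes within the same second would reuse the same backup name.
    """
    backup = None
    if path.exists():
        stamp = time.strftime("%Y%m%dT%H%M%S")
        backup = path.with_name(f"{path.name}.{stamp}.bak")
        shutil.copy2(path, backup)
    path.write_text(text, encoding="utf-8")
    return backup


def append_line(path: Path, line: str) -> None:
    """Append one line. Mode "a" never truncates existing content."""
    with path.open("a", encoding="utf-8") as fh:
        fh.write(line.rstrip("\n") + "\n")
```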

6. Least-Privilege Permissions (REQ-217, REQ-012)

Every agent tool call is gated through a permission profile. Tools outside the active profile are denied with exit code 3 and a ledger entry:

specsmith agent permissions-check git_push   # exit 0 = allowed, exit 3 = denied
specsmith agent permissions                  # show active profile

There are four built-in presets (read_only, standard, extended, admin), plus fully custom allow/deny lists in .specsmith/config.yml. This implements NIST AI RMF GOVERN (policy enforcement) and the principle of least privilege.
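
A custom allow/deny profile can be evaluated with a few lines of set logic. The sketch below assumes that deny takes precedence over allow and that anything not explicitly allowed is denied. Both are assumptions about the semantics, and `check_tool` is a hypothetical name. The exit codes match the ones documented above.

```python
EXIT_ALLOWED = 0
EXIT_DENIED = 3  # exit code documented for denied tools


def check_tool(tool: str, allow: set, deny: set) -> int:
    """Return the exit code for a tool under a custom allow/deny profile.

    Assumptions: deny wins over allow, and tools absent from the allow
    list are denied by default.
    """
    if tool in deny or tool not in allow:
        return EXIT_DENIED
    return EXIT_ALLOWED
```

With the custom lists from the configuration example, `check_tool("git_push", allow, deny)` yields 3 and `check_tool("read_file", allow, deny)` yields 0.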

7. Policy Guardrails — is_safe_command (REQ-220)

Before any shell command is executed, agent.safety.is_safe_command() classifies it against a deny list of destructive patterns (rm -rf, git push origin main, kubectl apply, cat .env, etc.). Denied commands are blocked and logged. This implements NIST AI RMF MANAGE (risk treatment at the action level).
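
A deny-list classifier of this kind is typically a set of regular expressions checked before execution. The sketch below covers only the four example patterns named above. The regexes and the function name `looks_safe` are illustrative assumptions, not the contents of `agent.safety`, and a production deny list would need to handle many more spellings (for example `rm -r -f`).

```python
import re

# Illustrative patterns for the four examples mentioned in the text.
DENY_PATTERNS = [
    re.compile(r"\brm\s+-[a-zA-Z]*(?:rf|fr)"),      # rm -rf / rm -fr
    re.compile(r"\bgit\s+push\s+origin\s+main\b"),  # push straight to main
    re.compile(r"\bkubectl\s+apply\b"),             # cluster mutation
    re.compile(r"\bcat\s+\S*\.env\b"),              # reading secrets files
]


def looks_safe(command: str) -> bool:
    """Return False if the command matches any deny pattern."""
    return not any(pattern.search(command) for pattern in DENY_PATTERNS)
```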

8. Compliance Export Report (REQ-208, REQ-215)

specsmith export generates a full compliance report containing:

  • AI System Inventory: all providers, models, and versions used
  • Risk Classification: AEE phase, confidence scores, open work items
  • Human Oversight Controls: active permission profile, escalation settings, kill-switch state
  • Audit Trail Summary: TraceVault chain length, last verification, any tampering

specsmith export --format markdown > compliance-report.md
specsmith export --format json > compliance-report.json

This report is suitable for submission to regulators, internal audit teams, or SOC-2 / ISO-42001 reviewers.

Compliance per Session and per Project

Compliance settings are layered:

  1. Global defaults: ~/.specsmith/config.yml (user-level defaults)
  2. Per-project policy: .specsmith/config.yml (committed to the repo)
  3. Per-session overrides: Kairos Governance panel or CLI flags

The Kairos Governance Tools Panel (Settings → Governance) exposes all compliance controls in a live UI: escalation threshold, permission profile, kill-switch, audit log viewer, and context window settings. Changes take effect immediately for the active session and can optionally be written back to the per-project .specsmith/config.yml.
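
Layering of this kind is commonly implemented as a deep merge in which later layers override earlier ones. The sketch below shows that pattern on plain dictionaries. The function name `merge_layers` is hypothetical, and whether specsmith merges nested keys or replaces whole sections is an assumption.

```python
def merge_layers(*layers: dict) -> dict:
    """Deep-merge dicts left to right; later layers override earlier ones.

    Nested dicts are merged key by key. Input dicts are not mutated.
    """
    merged: dict = {}
    for layer in layers:
        for key, value in layer.items():
            if isinstance(value, dict) and isinstance(merged.get(key), dict):
                merged[key] = merge_layers(merged[key], value)
            else:
                merged[key] = value
    return merged
```

Called as `merge_layers(global_cfg, project_cfg, session_cfg)`, a session that sets only `context.compression_threshold_pct` would keep the inherited `context.auto_compress` value.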


Context Window Management

specsmith enforces safe, efficient use of LLM context windows — especially critical when running local models via Ollama where the context limit directly affects GPU VRAM.

GPU-Aware Context Sizing (REQ-244)

specsmith ollama gpu                    # detect GPU VRAM (NVIDIA + AMD supported)
specsmith ollama available              # show models within your VRAM budget

VRAM tiers and recommended context sizes:

| VRAM | Recommended Context |
| --- | --- |
| < 6 GB (CPU or low-end GPU) | 4,096 tokens |
| 6–11 GB | 8,192 tokens |
| 12–19 GB | 16,384 tokens |
| 20 GB+ | 32,768 tokens |

Override via SPECSMITH_OLLAMA_CONTEXT_LENGTH or ollama.context_length in .specsmith/config.yml.
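
The tier table translates directly into a lookup. The sketch below treats each tier's lower bound as inclusive (so 11.5 GB falls in the 8,192 tier), which is an assumption since the table lists whole-number ranges. The function names are hypothetical, and only the environment-variable override is modelled, not the config-file one.

```python
import os
from typing import Mapping


def recommended_context(vram_gb: float) -> int:
    """Map detected VRAM (in GB) to the recommended context size in tokens."""
    if vram_gb >= 20:
        return 32768
    if vram_gb >= 12:
        return 16384
    if vram_gb >= 6:
        return 8192
    return 4096


def effective_context(vram_gb: float, env: Mapping[str, str] = os.environ) -> int:
    """Apply the SPECSMITH_OLLAMA_CONTEXT_LENGTH override if it is set."""
    override = env.get("SPECSMITH_OLLAMA_CONTEXT_LENGTH")
    return int(override) if override else recommended_context(vram_gb)
```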

Live Context Fill Indicator (REQ-245)

The context fill tracker emits real-time JSONL events consumed by Kairos:

{"type": "context_fill", "used": 27500, "limit": 32768, "pct": 83.9}

Kairos displays a compact fill bar in the agent footer. When fill reaches the compression threshold (default 80%), specsmith signals that context summarization should run before the next turn.
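
The event and the threshold check can be reproduced in a few lines. The field names follow the example event. The function names are hypothetical, and rounding `pct` to one decimal place is inferred from that example.

```python
import json


def context_fill_event(used: int, limit: int) -> str:
    """Serialize a context_fill event as one JSONL line (without newline)."""
    pct = round(100 * used / limit, 1)
    return json.dumps(
        {"type": "context_fill", "used": used, "limit": limit, "pct": pct}
    )


def should_compress(used: int, limit: int, threshold_pct: float = 80.0) -> bool:
    """True once fill reaches the compression threshold (default 80%)."""
    return 100 * used / limit >= threshold_pct
```

For 27,500 tokens used out of 32,768, the event reports `pct` 83.9 and `should_compress` returns True.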

Auto Context Compression (REQ-246)

When fill reaches the compression threshold, specsmith automatically triggers conversation summarization — the current context is condensed to a compact summary that preserves key decisions and facts while freeing window space. This happens transparently before the next agent turn.

Configure in .specsmith/config.yml:

context:
  compression_threshold_pct: 80   # trigger summarization at 80% fill
  auto_compress: true             # enable automatic compression

Hard Context Ceiling — Never 100% Full (REQ-247)

A hard reservation of 15% of the context window (minimum 2,048 tokens) is always held back for the governance layer. Attempts to fill beyond the effective ceiling raise ContextFullError — making it impossible to reach a state where even a compression request cannot be processed. This is a safety invariant, not a configuration option.
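
The arithmetic behind the ceiling is small enough to show in full. This sketch assumes the 15% reservation is rounded up to a whole token. The `ContextFullError` class here is a stand-in defined locally, and `effective_ceiling` and `check_fill` are hypothetical names.

```python
class ContextFullError(RuntimeError):
    """Local stand-in for the error named in the text."""


def effective_ceiling(limit: int) -> int:
    """Usable tokens after reserving 15% of the window (minimum 2,048)."""
    reserve = max(2048, -(-limit * 15 // 100))  # integer ceiling of 15%
    return limit - reserve


def check_fill(used: int, limit: int) -> None:
    """Raise ContextFullError if used tokens exceed the effective ceiling."""
    ceiling = effective_ceiling(limit)
    if used > ceiling:
        raise ContextFullError(f"{used} tokens exceeds ceiling {ceiling} of {limit}")
```

For a 32,768-token window the reservation is 4,916 tokens, leaving a ceiling of 27,852. For a 4,096-token window the 2,048-token minimum applies, so only half of the window is usable.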


Kairos + Governance REST API

Kairos is the companion Rust terminal runtime (BitConcepts/kairos). specsmith acts as the governance backend: Kairos spawns specsmith governance-serve at startup and routes all preflight and verify calls through it.

# Start the governance REST API (Kairos calls this automatically)
specsmith governance-serve --port 7700 --project-dir .

# Classify a natural-language utterance under Specsmith governance
specsmith preflight "fix the cleanup dry-run regression" --json

# Start the agentic REPL
specsmith run
> what does the cleanup module do?           # read-only ask -> answered
> fix the cleanup dry-run regression          # change -> Specsmith approves, runs
> delete the entire dist directory            # destructive -> needs clarification

Nexus

The Nexus runtime is specsmith's local-first agentic REPL — a governance-gated broker that sits between you and the LLM.

Every utterance passes through specsmith preflight before execution. The broker classifies intent, matches requirements, and gates the action. After execution, specsmith verify checks equilibrium. The /why command shows the full governance trace.

# Interactive REPL with governance
specsmith run
nexus> fix the cleanup bug         # broker classifies → accepts → executes → verifies
nexus> /why                         # show governance trace for last action
nexus> /exit

The Nexus broker:

  • Preflight gate: every change goes through specsmith preflight
  • Bounded retry: failed actions retry up to 3× with strategy classification
  • Execution trace: every action is sealed in the cryptographic trace vault
  • /why toggle: shows governance rationale in human-readable form

**How it works.** A natural-language **broker** classifies intent, infers scope from
your requirements, and asks Specsmith to **preflight** the request. Only when the
preflight decision is `accepted` does Nexus drive the AG2 orchestrator, and it does so
through a **bounded-retry harness** so a failing action cannot retry indefinitely. By default,
Nexus speaks plain English; toggle `/why` in the REPL to surface the underlying
requirement, test, and work-item identifiers Specsmith assigned.
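
A bounded-retry harness can be as simple as a counted loop that raises once the bound is exhausted. The sketch below assumes "up to 3×" means three attempts in total. It omits the strategy classification mentioned above, and `run_bounded` is a hypothetical name, not the harness in this repo.

```python
from typing import Callable

MAX_ATTEMPTS = 3  # assumption: three attempts in total


def run_bounded(action: Callable[[int], bool], max_attempts: int = MAX_ATTEMPTS) -> int:
    """Call action(attempt) until it returns True.

    Returns the number of attempts used. Raises RuntimeError once the
    bound is exhausted, so the loop can never run unbounded.
    """
    for attempt in range(1, max_attempts + 1):
        if action(attempt):
            return attempt
    raise RuntimeError(f"gave up after {max_attempts} attempts")
```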

**Pieces in this repo.**
- `specsmith preflight` — CLI subcommand emitting a deterministic governance JSON payload
  (`decision`, `requirement_ids`, `test_case_ids`, `confidence_target`, `instruction`).
- `src/specsmith/agent/broker.py` — natural-language broker (intent + scope + narration).
- `src/specsmith/agent/repl.py` — Nexus REPL with the `/why` toggle and execution gate.
- `docker-compose.yml` — pinned vLLM `l1-nexus` model server with the Hermes tool-call parser.
- `scripts/nexus_smoke.py` — opt-in live smoke test (`NEXUS_LIVE=1` to run against
  a running container).

---

## Kairos — Flagship Terminal Client

**[Kairos](https://github.com/BitConcepts/kairos)** is the recommended terminal client for specsmith.
Kairos spawns specsmith as a managed governance child process at startup and routes all
preflight, verify, and BYOE proxy calls through it. The Governance settings page shows live
specsmith status, version, and one-click update.

```bash
# Kairos starts specsmith automatically; or run manually:
specsmith governance-serve --port 7700 --project-dir .
```

The VS Code extension (specsmith-vscode) has been deprecated in favour of Kairos. Use pipx install specsmith for standalone CLI usage from any terminal.


Supporting specsmith

specsmith is open source and built by a small team. Every bit of support helps:

  • Star specsmith and kairos on GitHub
  • 📣 Tell your friends and colleagues — word of mouth is our best marketing
  • 🐛 Report bugs via GitHub Issues — even small ones help
  • 💡 Suggest features via GitHub Discussions — we read every suggestion
  • 🔧 Fix bugs and contribute — see CONTRIBUTING.md; PRs welcome
  • 📝 Write about specsmith — blog posts, tutorials, and talks help the community grow
  • ❤️ Sponsor BitConcepts — directly funds development

Ollama — Local LLMs (Zero API Cost)

specsmith has first-class Ollama support, including:

specsmith ollama gpu                    # detect GPU and VRAM tier
specsmith ollama available              # show catalog filtered by VRAM budget
specsmith ollama available --task code  # filter by task type
specsmith ollama pull qwen2.5:14b      # download a model
specsmith ollama suggest requirements  # task-based recommendations
specsmith ollama list                  # show installed models

GPU-aware context sizing: 4K/8K/16K/32K tokens based on detected VRAM. Override via SPECSMITH_OLLAMA_CONTEXT_LENGTH env var or ollama.context_length in .specsmith/config.yml.


FPGA / HDL Projects

specsmith supports FPGA-specific project types with full governance:

# scaffold.yml
type: fpga-rtl-amd          # or fpga-rtl-intel / fpga-rtl-lattice / fpga-rtl
fpga_tools:
  - vivado
  - gtkwave
  - vsg
  - ghdl
  - verilator

Supported tools:

  • Synthesis: vivado, quartus, radiant, diamond, gowin
  • Simulation: ghdl, iverilog, verilator, modelsim, questasim, xsim
  • Waveform: gtkwave, surfer
  • Linting: vsg, verible, svlint
  • Formal: symbiyosys
  • OSS flow: yosys, nextpnr, openFPGALoader


50+ CLI Commands

Governance: init import audit validate diff upgrade compress doctor export architect

AEE Epistemic: stress-test epistemic-audit belief-graph trace seal/verify/log integrate

Workflow: phase show/set/next/list ledger add/list req list/add/gaps/trace

Agent: run agent run/plan/status/verify/improve/reports agent providers/tools/skills

Ollama: ollama list/available/gpu/pull/suggest

Workspace: workspace init/audit/export

VCS: commit push sync branch pr status

Tools: tools scan [--fpga] tools install <tool> tools rules [--tool] [--list]

Tools: exec ps abort watch optimize credits self-update

Auth: auth set/list/remove/check

Patent: patent search/prior-art


35 Project Types

Software: Python CLI/lib/web, Rust, Go, C/C++, .NET, Node.js/TypeScript, mobile, microservices, data/ML.

Hardware/Embedded: FPGA/RTL (Xilinx, Intel, Lattice, generic), Yocto BSP, embedded C/C++.

Documents: Technical specs, research papers, API specs, requirements management.

Business/Legal: Business plans, patent applications, compliance frameworks.


epistemic Library

The standalone epistemic Python library works in any Python 3.10+ project — no specsmith coupling:

from epistemic import AEESession, BeliefArtifact, StressTester

session = AEESession("my-project", threshold=0.70)
session.add_belief(
    artifact_id="HYP-001",
    propositions=["The API always returns valid JSON"],
    epistemic_boundary=["Valid auth token required"],
)
session.accept("HYP-001")
result = session.run()
print(result.summary())
# certainty=0.55, failures=2, equilibrium=False

Use cases: linguistics research, compliance pipelines, AI alignment, patent prosecution.


Governance Rules (H1–H13)

13 hard rules enforced by specsmith validate, including:

  • H11 — Every loop or blocking wait must have a timeout, fallback exit, and diagnostic message.
  • H12 — Windows multi-step automation goes into .cmd files, not inline shell invocations.
  • H13 — Agent tools must declare epistemic contracts (what they claim and what they cannot detect).

The specsmith Bootstrap

specsmith governs itself — the specsmith repo is a specsmith-managed project. Run specsmith audit in this repo to check its governance health. This means every feature we add to specsmith is immediately dogfooded on specsmith itself. Kairos is the companion terminal and flagship client.

Documentation

specsmith.readthedocs.io — Full manual: AEE primer, command reference, project types, tool registry, governance model, Ollama guide, Kairos integration.

License

MIT — Copyright (c) 2026 BitConcepts, LLC.
