Claude Spec-Driven Development — as an executable contract.
csdd is a single Go binary that turns the Spec-Driven Development (SDD) workflow for Claude Code from "good intentions in markdown" into a contract that is validated mechanically — for humans and AI agents.
| 5 | 1 | ~8.7k |
|---|---|---|
| managed resources | binary, zero runtime deps | lines of Go |
AI agents write code fast. Too fast to ship without a contract.
- Scope creep. Without an explicit, testable requirement, the agent "interprets" — and every run interprets differently. There is no baseline to review against.
- Zero traceability. Code with no link to a requirement. Nobody knows why a function exists, or what breaks if it changes.
- Human review too late. The human only sees the result in the PR — when reverting is expensive. The decision should be visible before the code.
The answer is SDD — Spec-Driven Development: a contract layer (requirements → design → tasks) with human approval at every gate. csdd is the tool that makes that contract impossible to break by accident.
A CLI + TUI in a single Go binary — the only sanctioned author of the workflow's artifacts. You don't hand-edit frontmatter, spec.json, or task annotations — you generate from a template, edit the body, and validate.
Flag-driven, headless. Exposes 100% of the functionality so Claude Code, Cursor, Codex, or CI can drive the binary without reading the source.
csdd spec generate photo-albums --artifact requirements
Interactive interface (Bubble Tea). Running csdd with no arguments opens wizards and an artifact browser. Same operations, same rules.
csdd # no args → interactive TUI
🔑 Core principle: both surfaces call the same operation helpers. A single source of truth: what a human does in the TUI, an agent does identically via the CLI.
npx (cross-platform — fetches the right prebuilt binary for your OS/arch, no install):
npx @protonspy/csdd --help # run instantly, no install
npx @protonspy/csdd # interactive TUI
Prefer the short csdd command on your PATH? Install it globally:
npm install -g @protonspy/csdd # then: csdd
Prebuilt binaries: grab the archive for your platform from the releases page and put csdd on your PATH. From source: go install github.com/protonspy/csdd@latest.
The examples below call csdd directly. Running via npx? Prefix them with npx @protonspy/csdd, or alias it: alias csdd='npx @protonspy/csdd'.
| Resource | What it is | Location |
|---|---|---|
| 🧭 steering | Project memory loaded into every agent interaction. Standards and the why behind decisions. | .claude/steering/*.md |
| 📐 spec | Per-feature contract: spec.json + requirements + design + tasks (+ research/bugfix). | specs/<feature>/ |
| 🛠️ skill | Executable workflow bundle: SKILL.md + references + assets + scripts. | .claude/skills/<name>/ |
| 🤖 agent | Custom sub-agent with a least-privilege tool scope (reviewer, debugger…). | .claude/agents/<name>.md |
| 🔌 mcp | Model Context Protocol servers the agent can connect to. stdio or remote, never both. | .mcp.json |
Verbs per resource. Common base: create/init · list · show · delete. spec adds generate · approve · validate · status; mcp uses add · remove · enable · disable · validate; skill adds add-reference/script/asset · validate.
Two distinctions matter more than anything else.
The contract: the requirements, the File Structure Plan in the design, and the _Boundary:_ / _Depends:_ annotations on the tasks. 👤 Humans review and approve this.
The implementation: components, internals, sequencing within each task. How the contract is fulfilled. 🤖 The agent is free here, after approval.
The second distinction is the phase gates: no phase is generated before the previous one is approved by a human — and that is enforced mechanically, not by convention.
Four phases, three human gates:
Discovery → [gate] Requirements → [gate] Design → [gate] Tasks → Implementation
State lives in spec.json. Generating design while requirements is not approved fails — it's not a warning, it's exit code 2.
ready_for_implementation only becomes true after all 3 approvals. --force breaks the gate, but only with explicit human authorization (Quick Plan), and it shows up in history.
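The gate rule can be sketched in a few lines of Go. This is an illustration under assumptions, not csdd's actual code: `SpecState`, `checkGate`, and `readyForImplementation` are hypothetical names, and the real spec.json schema may differ.

```go
package main

import "fmt"

// phaseOrder lists the approvable phases in the order the workflow requires.
var phaseOrder = []string{"requirements", "design", "tasks"}

// SpecState is a hypothetical in-memory view of the approvals kept in spec.json.
type SpecState struct {
	Approved map[string]bool
}

// checkGate reports an error when an earlier phase is still unapproved.
// force skips the check, mirroring the --force escape hatch.
func checkGate(s SpecState, artifact string, force bool) error {
	idx := -1
	for i, p := range phaseOrder {
		if p == artifact {
			idx = i
		}
	}
	if idx < 0 {
		return fmt.Errorf("unknown artifact %q", artifact)
	}
	if force {
		return nil
	}
	for _, p := range phaseOrder[:idx] {
		if !s.Approved[p] {
			return fmt.Errorf("phase gate: '%s' must be approved before generating '%s'", p, artifact)
		}
	}
	return nil
}

// readyForImplementation is true only when all three phases are approved.
func readyForImplementation(s SpecState) bool {
	for _, p := range phaseOrder {
		if !s.Approved[p] {
			return false
		}
	}
	return true
}

func main() {
	s := SpecState{Approved: map[string]bool{}}
	fmt.Println(checkGate(s, "design", false)) // blocked: requirements not approved
	s.Approved["requirements"] = true
	fmt.Println(checkGate(s, "design", false)) // <nil>
	fmt.Println(readyForImplementation(s))     // false
}
```

A caller would map a non-nil result to exit code 2, so a script can tell a blocked gate apart from success.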
csdd spec generate albums --artifact design
✗ phase gate: 'requirements' must be
approved before generating 'design'.
# the right path:
csdd spec approve albums --phase requirements
✓ requirements approved

# 1 · bootstrap (once per repo)
csdd init --with-baseline
# 2 · create the feature workspace
csdd spec init photo-albums
# 3 · requirements → edit in EARS → validate → approve
csdd spec generate photo-albums --artifact requirements
csdd spec validate photo-albums # exit 2 = fix what it flags
csdd spec approve photo-albums --phase requirements
# 4 · design (blocked until step 3 passes) → 5 · tasks (same)
csdd spec generate photo-albums --artifact design # ... validate, approve
csdd spec generate photo-albums --artifact tasks # ... validate, approve
✓ spec.json: ready_for_implementation = true # implementation can begin
💡 Run csdd spec status <feature> between any two steps to see phase, approvals, and validation issues on a single screen.
Fixed, testable syntax, one behavior per criterion. SHALL — never should. Unique N.M IDs.
### Requirement 1: Album Management
1. WHEN a user creates an album
THEN the system SHALL persist it <500ms.
2. IF the name is empty
THEN the system SHALL return 400.
3. WHILE deleting THE SYSTEM SHALL
block new uploads.
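A mechanical check of these rules might look like the following Go sketch. It is a hedged illustration: `checkCriterion` is a hypothetical name, the keyword list is taken from the requirements gate described in this README, and matching uppercase keywords only is an assumption of the sketch.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// earsPrefixes are the openings a criterion may start with.
// Uppercase-only matching is an assumption of this sketch.
var earsPrefixes = []string{"WHEN ", "WHILE ", "IF ", "WHERE ", "THE SYSTEM "}

// shouldWord matches the forbidden modal verb in any letter case.
var shouldWord = regexp.MustCompile(`(?i)\bshould\b`)

// checkCriterion returns the list of rule violations for one criterion.
// An empty result means the criterion passes both checks.
func checkCriterion(text string) []string {
	var issues []string
	t := strings.TrimSpace(text)

	startsOK := false
	for _, p := range earsPrefixes {
		if strings.HasPrefix(t, p) {
			startsOK = true
			break
		}
	}
	if !startsOK {
		issues = append(issues, "must start with WHEN/WHILE/IF/WHERE/THE SYSTEM")
	}
	if shouldWord.MatchString(t) {
		issues = append(issues, "uses 'should'; write SHALL")
	}
	return issues
}

func main() {
	fmt.Println(checkCriterion("WHEN a user creates an album THEN the system SHALL persist it <500ms."))
	fmt.Println(checkCriterion("The system should save the album."))
}
```

Each check is a yes/no question about the text, which is what lets an agent rely on the result without interpretation.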
Each leaf traces requirements; parallelism is declared and verified.
- [ ] 2. AlbumService _Boundary: AlbumService_
- [ ] 2.1 create / rename / delete
_Requirements: 1.1, 1.2_
- [ ] 3. PhotoService _Boundary: PhotoService_ (P)
- [ ] 3.1 upload S3
_Requirements: 2.1_
_Depends: 1.2_
_Requirements:_ on every leaf · _Boundary:_ on every (P) · _Depends:_ between boundaries
(P) = runs in parallel. Two (P) tasks cannot share a boundary — the validator rejects it, guaranteeing safe parallel execution by agents.
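The parallelism rule is simple enough to sketch. The `Task` struct and `checkParallel` below are hypothetical and assume the tasks file has already been parsed; they only illustrate the check, not csdd's implementation.

```go
package main

import "fmt"

// Task is a hypothetical parsed task entry from tasks.md.
type Task struct {
	ID       string
	Boundary string // value of the _Boundary:_ annotation, if any
	Parallel bool   // true when the task is marked (P)
}

// checkParallel flags (P) tasks that lack a boundary and any pair of
// (P) tasks that declare the same boundary.
func checkParallel(tasks []Task) []string {
	var issues []string
	owner := map[string]string{} // boundary -> first (P) task that claimed it
	for _, t := range tasks {
		if !t.Parallel {
			continue
		}
		if t.Boundary == "" {
			issues = append(issues, fmt.Sprintf("task %s: (P) without _Boundary:_", t.ID))
			continue
		}
		if prev, taken := owner[t.Boundary]; taken {
			issues = append(issues, fmt.Sprintf("tasks %s and %s: (P) tasks share boundary %q", prev, t.ID, t.Boundary))
			continue
		}
		owner[t.Boundary] = t.ID
	}
	return issues
}

func main() {
	tasks := []Task{
		{ID: "2", Boundary: "AlbumService", Parallel: true},
		{ID: "3", Boundary: "PhotoService", Parallel: true},
		{ID: "4", Boundary: "PhotoService", Parallel: true},
	}
	for _, issue := range checkParallel(tasks) {
		fmt.Println("✗", issue)
	}
}
```

Sequential tasks are ignored on purpose: two non-parallel tasks may touch the same boundary because they never run at the same time.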
Once the gate is open, the agent implements in TDD:
RED (write the test) → GREEN (minimum to pass) → REFACTOR (clean under green) → widen the net (full suite + lint)
- One leaf task per invocation. Takes the ID from specs/<f>/tasks.md; doesn't batch tasks "to save time".
- RED fails for the right reason. A compile error doesn't count; the agent cites the failure before moving on.
- Never weakens a test to make the suite pass. New behavior = new RED.
Before reporting done: run the executable checks and produce real evidence — "compiles" and "looks right" are not done.
go test ./... ✓
lint ✓
typecheck ✓
build ✓
Each leaf traces _Requirements:_ → the test proves the requirement. Evidence beats assertion.
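Traceability is also checkable by machine. The Go sketch below extracts the IDs from a `_Requirements:_` annotation and compares them with a set of known requirement IDs; `requirementIDs` and `checkLeaf` are hypothetical helper names, and the regular expression assumes the annotation format shown in the task example.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// reqAnnotation captures the comma-separated IDs inside _Requirements: ..._.
var reqAnnotation = regexp.MustCompile(`_Requirements:\s*([^_]*)_`)

// requirementIDs returns the IDs a leaf task claims to implement,
// or nil when the annotation is absent.
func requirementIDs(leaf string) []string {
	m := reqAnnotation.FindStringSubmatch(leaf)
	if m == nil {
		return nil
	}
	var ids []string
	for _, part := range strings.Split(m[1], ",") {
		if id := strings.TrimSpace(part); id != "" {
			ids = append(ids, id)
		}
	}
	return ids
}

// checkLeaf reports a missing annotation or any ID that is not a known requirement.
func checkLeaf(leaf string, known map[string]bool) []string {
	ids := requirementIDs(leaf)
	if len(ids) == 0 {
		return []string{"leaf has no _Requirements:_ annotation"}
	}
	var issues []string
	for _, id := range ids {
		if !known[id] {
			issues = append(issues, "unknown requirement ID "+id)
		}
	}
	return issues
}

func main() {
	known := map[string]bool{"1.1": true, "1.2": true, "2.1": true}
	leaf := "- [ ] 2.1 create / rename / delete\n  _Requirements: 1.1, 1.2_"
	fmt.Println(requirementIDs(leaf))
	fmt.Println(len(checkLeaf(leaf, known)) == 0)
}
```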
code-reviewer → /csdd-commit → git push (pre-push gate) → Pull Request
- code-reviewer runs on the diff; resolve every Blocker before moving on.
- security-reviewer runs if the change touches auth / secrets / input; resolve Critical/High.
- Reviewers don't write: you apply the fix and re-review until clean.
# Conventional Commits, generated from the diff + spec
feat(photo-albums): add album rename
Implements photo-albums; tasks 2.1, 2.2.
# git push → hook runs the suite; red BLOCKS
git push
✗ pre-push: test gate failed — push blocked
Never commit with an open Blocker; never git push --no-verify. The PR carries evidence: spec links · completed tasks · real check output · risks.
main.go
(no args → TUI · with args → CLI)
│
┌────────────────────┴────────────────────┐
cmd/ · CLI tui/ · TUI
dispatcher, 1 file/resource Bubble Tea · wizards + browser
└──── both call the SAME operation helpers ────┘
│
workspace · paths · validator · templater · frontmatter · render
│
artifacts on disk: .claude/ · specs/ · CLAUDE.md · .mcp.json
(plain text, reviewable in a PR)
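The "both call the same helpers" shape in the diagram can be illustrated with a small Go sketch. All names here (`ApproveRequest`, `approvePhase`, `cliApprove`, `tuiApprove`) are hypothetical stand-ins; the point is only that each front-end gathers input its own way and then delegates to one function that holds the rules.

```go
package main

import (
	"errors"
	"flag"
	"fmt"
)

// ApproveRequest is a hypothetical input shared by both front-ends.
type ApproveRequest struct {
	Feature string
	Phase   string
}

// approvePhase stands in for a shared operation helper: all rules live here.
func approvePhase(req ApproveRequest) (string, error) {
	if req.Feature == "" || req.Phase == "" {
		return "", errors.New("feature and phase are required")
	}
	return fmt.Sprintf("✓ %s approved for %s", req.Phase, req.Feature), nil
}

// cliApprove parses "<feature> --phase <phase>" and delegates to the helper.
func cliApprove(args []string) (string, error) {
	if len(args) == 0 {
		return "", errors.New("usage: approve <feature> --phase <phase>")
	}
	fs := flag.NewFlagSet("approve", flag.ContinueOnError)
	phase := fs.String("phase", "", "phase to approve")
	if err := fs.Parse(args[1:]); err != nil {
		return "", err
	}
	return approvePhase(ApproveRequest{Feature: args[0], Phase: *phase})
}

// tuiApprove receives values a wizard would have collected and delegates too.
func tuiApprove(feature, phase string) (string, error) {
	return approvePhase(ApproveRequest{Feature: feature, Phase: phase})
}

func main() {
	fromCLI, _ := cliApprove([]string{"albums", "--phase", "requirements"})
	fromTUI, _ := tuiApprove("albums", "requirements")
	fmt.Println(fromCLI)
	fmt.Println(fromTUI == fromCLI) // both surfaces produce the same result
}
```

Because neither front-end contains validation of its own, a rule added to the helper applies to humans and agents at once.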
| Package | Responsibility | Why it matters |
|---|---|---|
| cmd/ | CLI surface. Dispatches resource action, flag parsing, 1 file per resource. Includes CLAUDE.md and .gitignore wiring. | The public contract: 100% of functionality, headless. |
| tui/ | Interactive front-end (Bubble Tea): menu, wizards, artifact browser. | Calls the same helpers as cmd. No duplicated logic. |
| internal/workspace | Resolves the .claude/ root by walking up the tree; validates kebab-case; enumerates phases and artifacts. | Defines what a workspace is and the valid names. |
| internal/paths | Centralizes the on-disk layout: .claude/, CLAUDE.md, .mcp.json, specs/. | The layout lives in exactly one place. |
| internal/validator | The mechanical checks: EARS, unique IDs, traceability, annotations, parallelism safety, skill structure. | The agent's "friend." Never asks for judgment, only true/false. Exit 2. |
| internal/templater | Renders templates embedded at compile time (go:embed). | A fully self-contained binary with zero runtime dependencies. |
| internal/frontmatter | Parser for a minimal subset of YAML (scalars, bool, inline arrays). | Does only what's needed: small, predictable surface. |
| internal/render | Terminal output helpers with color (respects NO_COLOR/TTY). | Consistent ✓ ✗ ! • messages in the CLI. |
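To give a feel for how small a "minimal subset of YAML" parser can be, here is a hedged Go sketch covering the three value kinds the table names: scalars, booleans, and inline arrays. `parseFrontmatter` and its helpers are hypothetical, and the `tools` key in the demo is illustrative only.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// parseFrontmatter reads "key: value" pairs between two --- lines.
// Values become bool, []string (inline arrays), or string.
func parseFrontmatter(doc string) (map[string]any, error) {
	lines := strings.Split(doc, "\n")
	if strings.TrimSpace(lines[0]) != "---" {
		return nil, errors.New("missing opening ---")
	}
	out := map[string]any{}
	for _, line := range lines[1:] {
		line = strings.TrimSpace(line)
		if line == "---" {
			return out, nil
		}
		if line == "" || strings.HasPrefix(line, "#") {
			continue
		}
		key, val, ok := strings.Cut(line, ":")
		if !ok {
			return nil, fmt.Errorf("invalid frontmatter line %q", line)
		}
		out[strings.TrimSpace(key)] = parseValue(strings.TrimSpace(val))
	}
	return nil, errors.New("missing closing ---")
}

// parseValue classifies one raw value.
func parseValue(v string) any {
	switch {
	case v == "true":
		return true
	case v == "false":
		return false
	case strings.HasPrefix(v, "[") && strings.HasSuffix(v, "]"):
		items := []string{}
		inner := strings.TrimSpace(v[1 : len(v)-1])
		if inner == "" {
			return items
		}
		for _, p := range strings.Split(inner, ",") {
			items = append(items, unquote(strings.TrimSpace(p)))
		}
		return items
	}
	return unquote(v)
}

// unquote strips one pair of matching single or double quotes.
func unquote(s string) string {
	if len(s) >= 2 && (s[0] == '"' || s[0] == '\'') && s[len(s)-1] == s[0] {
		return s[1 : len(s)-1]
	}
	return s
}

func main() {
	fm, err := parseFrontmatter("---\ninclusion: fileMatch\nfileMatchPattern: \"**/*.go\"\ntools: [Read, Grep]\n---\nbody")
	fmt.Println(fm, err)
}
```

Nested maps, multi-line strings, and anchors are deliberately out of scope in this sketch, which keeps the behavior easy to predict.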
- CLI = TUI, always. Both surfaces converge on the same helpers. There is no function only the TUI can do — which is why a headless agent has 100% of the power.
- Embedded templates. go:embed all:templates compiles the templates into the binary. You download one file and it works offline, with nothing to install.
- Mechanical, not opinionated, validation. The validator never asks for judgment: either the criterion starts with WHEN or it doesn't. Deterministic → an agent can trust the exit code.
- Artifacts are plain text. Everything becomes versionable markdown/JSON in .claude/ and specs/. Review happens in the PR, with the tools the team already uses.
The result: the CLI never stops you from doing the right thing — it stops you from doing the wrong thing without making the decision visible. Breaking a gate requires an explicit
--force, and that shows up in history.
| Gate | Checks |
|---|---|
| spec · requirements | Every criterion starts with WHEN/WHILE/IF/WHERE/THE SYSTEM · none uses should · ### Requirement N: headers unique |
| spec · design | Boundary Map and File Structure Plan sections present · every requirement ID appears in the traceability table · design.md ≤ 1000 lines (else split the spec) |
| spec · tasks | Every leaf has _Requirements:_ with real IDs · every (P) has a _Boundary:_ that matches the design · no (P) pair shares a boundary |
| skill · mcp · steering | SKILL.md ≤ 500 lines / ~5k tokens, refs cited · mcp: exactly 1 transport (stdio or url) · steering: valid inclusion, fileMatch has a pattern |
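As one example of how binary these checks are, the "exactly 1 transport" rule for MCP servers could be sketched as below. The `MCPServer` struct and its field names are assumptions made for illustration, not the real .mcp.json schema.

```go
package main

import "fmt"

// MCPServer is a hypothetical, simplified view of one server entry.
type MCPServer struct {
	Command string // set for a stdio transport
	URL     string // set for a remote transport
}

// validateTransport enforces that exactly one transport is configured.
func validateTransport(name string, s MCPServer) error {
	hasStdio := s.Command != ""
	hasURL := s.URL != ""
	switch {
	case hasStdio && hasURL:
		return fmt.Errorf("mcp %q: both stdio and url are set; choose one", name)
	case !hasStdio && !hasURL:
		return fmt.Errorf("mcp %q: no transport; set stdio or url", name)
	}
	return nil
}

func main() {
	fmt.Println(validateTransport("docs", MCPServer{URL: "https://example.com/mcp"}))
	fmt.Println(validateTransport("broken", MCPServer{Command: "server", URL: "https://example.com/mcp"}))
}
```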
Exit codes: 0 ok · 1 usage error · 2 validation failure. Scriptable in CI.
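One way a Go CLI can keep such an exit-code contract stable is to derive the code from the error's type in a single place. The sketch below is an assumption about structure, not csdd's source: `UsageError`, `ValidationError`, and `exitCode` are hypothetical, and treating every other error as 1 is a choice made here.

```go
package main

import (
	"errors"
	"fmt"
)

// Exit codes as documented: 0 ok, 1 usage error, 2 validation failure.
const (
	ExitOK         = 0
	ExitUsage      = 1
	ExitValidation = 2
)

// UsageError marks bad flags or arguments.
type UsageError struct{ Msg string }

func (e *UsageError) Error() string { return "usage: " + e.Msg }

// ValidationError carries the issues a validator found.
type ValidationError struct{ Issues []string }

func (e *ValidationError) Error() string {
	return fmt.Sprintf("%d validation issue(s)", len(e.Issues))
}

// exitCode maps an error, possibly wrapped, to the process exit code.
func exitCode(err error) int {
	var ve *ValidationError
	switch {
	case err == nil:
		return ExitOK
	case errors.As(err, &ve):
		return ExitValidation
	default:
		// Usage errors and anything unclassified map to 1 in this sketch.
		return ExitUsage
	}
}

func main() {
	wrapped := fmt.Errorf("spec albums: %w", &ValidationError{Issues: []string{"1.2 uses 'should'"}})
	fmt.Println(exitCode(nil))
	fmt.Println(exitCode(&UsageError{Msg: "missing --artifact"}))
	fmt.Println(exitCode(wrapped))
}
```

A CI step then only needs to fail the job on any non-zero status, and can treat 2 specifically as "the spec needs fixing".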
The workspace csdd writes is the layout Claude Code expects. csdd init bootstraps it and handles the wiring:
CLAUDE.md # entry point + steering imports
.claude/steering/*.md # @-referenced from CLAUDE.md
.claude/agents/*.md # sub-agents (Read, Grep…)
.claude/skills/<n>/ # skill bundles
.claude/commands/ # slash commands (/csdd-commit)
.claude/hooks/ # deterministic automation
specs/<feature>/ # SDD contracts
.mcp.json # MCP servers
Creating a steering automatically inserts @.claude/steering/<name> into a managed block of CLAUDE.md — idempotent, never clobbering manual edits.
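An idempotent managed block is a small string operation. The Go sketch below shows one way it could work; the marker comments and the `addSteeringImport` name are invented for this example, and csdd's real markers may look different.

```go
package main

import (
	"fmt"
	"strings"
)

// Hypothetical markers delimiting the region the tool is allowed to edit.
const (
	beginMarker = "<!-- csdd:steering:begin -->"
	endMarker   = "<!-- csdd:steering:end -->"
)

// addSteeringImport inserts an @-reference for the steering file into the
// managed block. Text outside the block is never modified, and adding the
// same name twice returns the document unchanged.
func addSteeringImport(doc, name string) string {
	ref := "@.claude/steering/" + name
	begin := strings.Index(doc, beginMarker)
	end := strings.Index(doc, endMarker)
	if begin < 0 || end < begin {
		// No managed block yet: append one after the existing content.
		if doc != "" && !strings.HasSuffix(doc, "\n") {
			doc += "\n"
		}
		return doc + beginMarker + "\n" + ref + "\n" + endMarker + "\n"
	}
	bodyStart := begin + len(beginMarker)
	body := doc[bodyStart:end]
	for _, line := range strings.Split(body, "\n") {
		if strings.TrimSpace(line) == ref {
			return doc // already present
		}
	}
	if !strings.HasSuffix(body, "\n") {
		body += "\n"
	}
	return doc[:bodyStart] + body + ref + "\n" + doc[end:]
}

func main() {
	doc := addSteeringImport("# Project\n", "tech")
	doc = addSteeringImport(doc, "tech") // no change on the second call
	fmt.Print(doc)
}
```

Restricting edits to the marked region is what protects manual changes: anything a person writes above or below the block is copied through byte for byte.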
What the team gains:
- Zero friction with Claude Code. Artifacts are read natively — no exporting or converting.
- Review where we already work. Specs and steering are text in a PR — diff, comment, approve.
- Least privilege by default. Sub-agents are born with Read, Grep; MCP with a restricted scope.
- CI validates the contract. A csdd spec validate in the pipeline blocks a broken spec before merge.
csdd is Claude Code-native, but the SDD artifacts aren't locked in. csdd export
converts the workspace to other agentic toolchains — a one-way, additive export
that lives alongside .claude/ (nothing is overwritten in place):
csdd export kiro # → .kiro/steering/*.md + .kiro/specs/<feature>/{requirements,design,tasks}.md
csdd export codex # → AGENTS.md (CLAUDE.md + steering inlined) + .codex/config.toml (MCP)
csdd export kiro --out ./build --force
- Kiro: steering frontmatter (inclusion: always|fileMatch|manual|auto, fileMatchPattern) is already Kiro-compatible, so steering copies verbatim; specs copy their SDD markdown (spec.json is dropped; Kiro tracks phase state in-IDE).
- Codex: Codex has no @-import, so the managed steering block in CLAUDE.md is replaced by the steering inlined into AGENTS.md; .mcp.json becomes [mcp_servers.*] tables in .codex/config.toml.
# build the binary
go build -o csdd .
# bootstrap a repo with baseline steering
csdd init --with-baseline
# take your first feature through to ready_for_implementation
csdd spec init my-feature
csdd spec generate my-feature --artifact requirements
Takeaways: The validator is your friend. The gate makes the decision visible. Contract before code: requirements → design → tasks, each approved by a human before the next. Always generate from a template; never hand-write frontmatter or spec.json. Least privilege everywhere.