Functional core, imperative shell: encode as essential principle by andrewemark · Pull Request #27 · driver-ai/driver-sdlc-plugin

andrewemark · 2026-05-26T20:37:40Z

Motivation

The plugin's curated test suites still felt low-value and mock-heavy even after /drvr:assess. We traced it to a structural cause in the planning skill:

- Test strategy was designed independently of architecture (test pyramid ratios, coverage quotas, "mock at unit boundaries").
- Under that framing, mock-heavy unit tests are structurally inevitable: the planner has nowhere to put logic except inside classes whose collaborators then get mocked.
- The "scaffolding vs durable" framing leaned on assessment to clean up later, but the KEEP criterion ("asserts observable behavior") was a weaker bar than what we actually want: a test a reader can use to understand the code.

The fix flips the dependency: architecture determines test shape, not the other way around.

What this PR does

Encodes functional core, imperative shell (Bernhardt; also Hexagonal / Ports-and-Adapters) as a load-bearing architectural commitment across every phase of the SDLC. Under this commitment:

- Pure-core code takes values in, returns values out: no I/O, no time, no randomness, no mutable shared state. Unit-tested with values in / values out and no mocks.
- Imperative shell performs I/O and calls into the core. Integration-tested against real I/O.
- A "unit test" that needs a mock for an internal module is a signal the boundary is broken: fix the architecture, not the test.
- Mocks are permitted only at hard external boundaries (third-party APIs without sandboxes, cost-bearing services, hardware absent in test).
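
As a rough illustration of the split described above, here is a minimal Python sketch. Everything in it (`LineItem`, `order_total_cents`, `parse_items`, `write_order_total`, the CSV-like input format) is hypothetical and not taken from the plugin; it only shows the shape: pure functions tested with plain values, and a thin shell tested against real files.

```python
from __future__ import annotations

import tempfile
from dataclasses import dataclass
from pathlib import Path


@dataclass(frozen=True)
class LineItem:
    name: str
    unit_price_cents: int
    quantity: int


# Pure core: values in, values out. No I/O, clock, randomness, or shared state.
def order_total_cents(items: list[LineItem], discount_percent: int) -> int:
    subtotal = sum(i.unit_price_cents * i.quantity for i in items)
    return subtotal - (subtotal * discount_percent) // 100


# Pure core: parsing is also just a value transformation.
def parse_items(text: str) -> list[LineItem]:
    items = []
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        name, price, qty = line.split(",")
        items.append(LineItem(name.strip(), int(price), int(qty)))
    return items


# Imperative shell: performs the I/O and delegates every decision to the core.
def write_order_total(src: Path, dst: Path, discount_percent: int) -> None:
    items = parse_items(src.read_text())
    dst.write_text(str(order_total_cents(items, discount_percent)))


# Unit test for the core: plain values, no mocks.
def test_order_total_applies_discount() -> None:
    items = [LineItem("widget", 250, 4), LineItem("gadget", 1000, 1)]
    assert order_total_cents(items, 10) == 1800


# Integration test for the shell: real files in a temporary directory.
def test_write_order_total_round_trip() -> None:
    with tempfile.TemporaryDirectory() as tmp:
        src, dst = Path(tmp) / "order.csv", Path(tmp) / "total.txt"
        src.write_text("widget, 250, 4\ngadget, 1000, 1\n")
        write_order_total(src, dst, 10)
        assert dst.read_text() == "1800"
```

Neither test needs a mock: the core test never touches the filesystem, and the shell test uses the real one.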

When the surrounding code isn't already in core/shell shape, the plugin steers each new feature toward extraction anyway — local mess from a clean extraction is preferred to a clean fit with an entangled neighbor. We're fighting architectural entropy actively.

Where it lands, by phase

| Phase | File | Change |
| --- | --- | --- |
| Constitution | `CLAUDE.md` + `templates/CLAUDE.md.template` | Principle as first Key Principle. Expanded in Engineering Practices. New `kind` task frontmatter field (`core`/`shell`/`both`). Plan schema requires a `### Core/Shell Decomposition` subsection under `## Architecture Fit`. |
| Research | `skills/research-guidance/SKILL.md` | Surfaces the natural decomposition during "How" research. Required section in overview. |
| Planning | `skills/planning-guidance/SKILL.md` (heaviest edit) | Test Strategy is now derived from a required Core/Shell Decomposition subsection. Test pyramid + coverage quotas + "mock at unit boundaries" replaced with: pure-core unit tests (values in / values out, no mocks) and shell integration tests (real I/O). "Scaffolding test" category removed; every test is now durable by construction. Added boundary self-review. |
| Validation | `commands/dry-run-plan.md` | New gap checks (mostly HIGH severity): missing decomposition, pure-core items that aren't pure, internal-module mocks in proposed tests, test/item classification mismatches. |
| Materialization | `skills/materialize-tasks/SKILL.md` | Reads the plan's decomposition, propagates per-task classification into task doc frontmatter (`kind`) and a `## Core/Shell` section. BLOCKs if plan lacks the decomposition. |
| Implementation | `skills/implementation-guidance/SKILL.md` | Sub-agent prompts now carry the core/shell rules + per-task classification. New "core/shell boundary" deviation category: needing a mock for an internal module is a stop-and-surface signal. |
| Assessment | `commands/assess.md` | PRUNE/KEEP/PROMOTE keyed on mock presence and core/shell membership. "When uncertain, KEEP" reversed for mock-heavy tests (PROMOTE instead so the architecture failure surfaces). New §FCIS boundary check in Step 4a; runs regardless of whether codebase has its own standards artifact. |
| Standards review | `agents/standards-review.md` | §FCIS check runs unconditionally (Step 2a). Codebase-specific standards layer on top (Step 2b, conditional). |
| Handoff | `commands/docs-artifacts.md` | New Functional Core, Imperative Shell section in `architecture.md` handoff template so reviewers can verify the boundary in the diff. |
| Pass-through | `commands/review.md`, `commands/setup.md` | Updated descriptions to reflect the new commitment. |
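
To make the new `kind` field concrete, here is a small Python sketch of how a task doc's frontmatter could be checked against the three allowed values. The `---`-delimited `key: value` layout, the `title` field in the usage note, and the function names are assumptions for illustration; the plugin's actual task doc format and any validation it performs may differ.

```python
from __future__ import annotations

# The three classifications named in this PR.
ALLOWED_KINDS = {"core", "shell", "both"}


def read_frontmatter(doc: str) -> dict[str, str]:
    """Parse a minimal `key: value` block delimited by `---` lines (assumed format)."""
    lines = doc.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}
    fields: dict[str, str] = {}
    for line in lines[1:]:
        if line.strip() == "---":
            return fields
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return {}  # no closing delimiter: treat the doc as having no frontmatter


def kind_problem(doc: str) -> str | None:
    """Return a description of what is wrong with `kind`, or None if it is valid."""
    kind = read_frontmatter(doc).get("kind")
    if kind is None:
        return "missing `kind`"
    if kind not in ALLOWED_KINDS:
        return f"unknown kind {kind!r}"
    return None
```

For example, `kind_problem("---\ntitle: x\nkind: core\n---\n")` returns `None`, while a doc whose frontmatter omits the field yields ``"missing `kind`"``. Both functions are pure, so they can be tested with string values alone.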

What's deliberately NOT in this PR

- Validation tests for the new contract. The existing `test_plan_doc_sections` probably doesn't know about the required `### Core/Shell Decomposition` subsection. Worth adding as a follow-up.
- CHANGELOG entry. Template version bumped to 1.1.0, but no CHANGELOG entry written.
- README mention. README doesn't currently surface the commitment; worth adding for plugin users browsing the repo.
- `skills/intent-guidance`, `skills/sdlc-orchestration`. Neither phase materially shapes architecture, so left untouched.

Backward compatibility

This is a real philosophical shift. Plans written before this change won't have a Core/Shell Decomposition subsection — materialize-tasks will BLOCK on them and route back to planning. That's intentional: we don't want pre-existing plans silently treated as conforming.
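
The gate described above could be sketched roughly as follows. This is a hypothetical Python illustration, not the skill's implementation (materialize-tasks is a prompt-driven skill); the function names and the sample plans are made up. It treats the plan as Markdown and looks for a `### Core/Shell Decomposition` heading nested under `## Architecture Fit`, ignoring the complication of headings inside fenced code blocks.

```python
from __future__ import annotations

import re

HEADING = re.compile(r"^(#{1,6})\s+(.*?)\s*$")


def has_core_shell_decomposition(plan_md: str) -> bool:
    """True if the subsection appears under `## Architecture Fit`."""
    in_architecture_fit = False
    for line in plan_md.splitlines():
        match = HEADING.match(line)
        if not match:
            continue
        level, title = len(match.group(1)), match.group(2)
        if level <= 2:
            # Any new top-level or second-level heading opens or closes the section.
            in_architecture_fit = level == 2 and title == "Architecture Fit"
        elif level == 3 and in_architecture_fit and title == "Core/Shell Decomposition":
            return True
    return False


def materialize_gate(plan_md: str) -> str:
    """Hypothetical verdict: BLOCK plans that lack the decomposition."""
    return "PROCEED" if has_core_shell_decomposition(plan_md) else "BLOCK"


NEW_PLAN = """# Plan
## Architecture Fit
### Core/Shell Decomposition
Pure core: pricing rules. Shell: HTTP handler.
## Test Strategy
"""

LEGACY_PLAN = """# Plan
## Architecture Fit
Fits the existing service layer.
## Test Strategy
"""
```

With these samples, `materialize_gate(NEW_PLAN)` returns `"PROCEED"` and `materialize_gate(LEGACY_PLAN)` returns `"BLOCK"`, which is the behavior the first two Test plan steps are meant to confirm by hand.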

Test plan

- Open a new feature with /drvr:feature, run through Research → Planning, and confirm the Core/Shell Decomposition prompts land in the expected places
- Try to materialize a plan missing the decomposition; confirm it BLOCKs
- Run /drvr:assess on a test suite known to contain mock-heavy tests; confirm they surface as PRUNE/PROMOTE rather than KEEP
- Run /drvr:review on a codebase without its own CLAUDE.md; confirm the §FCIS check still runs

🤖 Generated with Claude Code

…iple

Establishes the functional-core / imperative-shell architecture (Bernhardt; also Hexagonal / Ports-and-Adapters) as a load-bearing commitment that flows through every phase of the SDLC, not just a testing preference.

The motivating problem: assessment-curated test suites still felt low-value and mock-heavy. Root cause was that the planning skill designed test strategy independently of architecture (test pyramid, coverage targets, "mock at unit boundaries"), which made mock-heavy unit tests structurally inevitable. The fix is to invert the dependency: architecture determines test shape.

Changes:

- CLAUDE.md: Add the principle as the first Key Principle. Expand in Engineering Practices with how it flows through each phase. Register `kind` (core/shell/both) as a task frontmatter field and require a Core/Shell Decomposition subsection in plan Architecture Fit.
- skills/planning-guidance: Add the architectural commitment up front. Restructure Test Strategy to be derived from a required Core/Shell Decomposition subsection. Replace test-pyramid ratios, coverage quotas, and "mock at unit boundaries" guidance with: pure-core unit tests (values in / values out, no mocks) and shell integration tests (real I/O). Remove the "scaffolding test" category; every test is now durable by construction. Add boundary self-review.
- skills/implementation-guidance: Add core/shell rules to the sub-agent prompt template. Add "core/shell boundary" as a deviation category: needing a mock for an internal module is a stop-and-surface signal, not a thing to paper over with a mock.
- skills/materialize-tasks: Read the plan's Core/Shell Decomposition and propagate per-task classification into task doc frontmatter (`kind`) and a `## Core/Shell` section carrying the rules sub-agents enforce. BLOCK materialization for plans missing the decomposition.
- skills/research-guidance: Surface the natural core/shell decomposition for the feature during "How" research. Add Core/Shell Decomposition to the overview template so planning has the seam in hand.
- commands/assess: Rewrite PRUNE/KEEP/PROMOTE around mock presence and core/shell membership. Mock-on-internal-module is the strongest PRUNE signal. Reverse the "when uncertain, KEEP" default for mock-heavy tests; promote them so the architecture failure surfaces. Add §FCIS boundary check to code quality review (always runs, independent of whether the codebase has its own standards artifact).
- commands/dry-run-plan: Add boundary checks: plans missing the decomposition, pure-core items that aren't pure, internal-module mocks in proposed tests, and test/item classification mismatches all surface as gaps (mostly HIGH severity).
- agents/standards-review: Add §FCIS check (Step 2a, always runs) alongside codebase standards (Step 2b, conditional). Standards artifact path becomes optional.
- commands/docs-artifacts: Add Functional Core, Imperative Shell section to the architecture.md handoff template so reviewers can verify the boundary is intact in the diff.
- commands/review, commands/setup, templates/CLAUDE.md.template: Update descriptions to reflect the new commitment.

Steering: when the surrounding code isn't already in core/shell shape, the plugin pushes for extraction anyway. Local mess from a clean extraction is preferred to a clean fit with an entangled neighbor.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

…verdict

Three softenings to avoid forcing the functional-core / imperative-shell commitment onto code where it doesn't pay off, while keeping the pressure where it does.

1. Shell-only carve-out

Some features are genuinely shell-only by nature: thin CRUD endpoints with no business logic, webhook forwarders, glue code, integration wrappers. The principle's value (extracting pure logic for clean unit testing) doesn't apply because there's no meaningful pure logic to extract.

Plans can now declare "Pure core: (none — shell-only feature)" with a required Rationale that names the type of work. Vague rationales ("doesn't fit", "everything is I/O") still get pushed back at dry-run as MEDIUM-severity gaps. Specific rationales pass.

For shell-only plans: test strategy is integration-only; the pure-core checks in self-review, dry-run, assess, and standards-review are skipped; only shell-side checks apply. Routing/dispatch branching in shell-only code is no longer flagged as "shell carrying logic": the branching IS the feature.

2. Broader, justified mocking rule

The previous rule (mocks only at "hard external boundaries: third-party APIs without sandboxes, cost-bearing services, hardware absent in test") was too narrow. It rejected legitimate cases: mocking an internal LLM client (real calls cost money), mocking an internal billing collaborator (don't want to actually charge), mocking time in retry-logic tests (real wall clock makes tests flaky).

The rule now permits mocks of internal modules when the real collaborator is:

- External (third-party API with no sandbox)
- Expensive (real money per invocation)
- Non-deterministic in ways you can't control
- Absent in the test environment

Every mock must be named with its justification (a comment on the mock or a note in the test docstring). Unjustified mocks of internal modules remain FAIL. Mocks of pure-core logic remain FAIL (boundary failure). "The real thing is slow" or "I don't want to set up the DB" are explicitly called out as not-acceptable justifications.

3. N/A verdict in §FCIS checks

The §FCIS check in assess (Step 4a) and standards-review (Step 2a) was binary PASS/FAIL. That generated noise for cases where the decomposition genuinely doesn't apply: a routing function with URL-path branching, a framework-mandated hook with a one-line body, a justified-mock test, files attributed to a shell-only plan.

Both checks now support N/A, recorded for transparency but not counted as FAIL and not triggering fix workflows. Each N/A row includes a brief reason ("shell-only plan, routing dispatch", "mock of LLM client — justified as expensive") so reviewers can audit the exceptions.

Files changed: planning-guidance (decomposition template, test strategy template, mocking rules, constraints, confirm-approach, self-review, anti-patterns); dry-run-plan (checks 14-19 + gap-type table); assess (Step 4a verdicts + table example + opening framing); standards-review (Step 2a verdicts + example table); implementation-guidance (architectural commitment block, default, sub-agent prompt in both standards and no-standards variants, deviation table, anti-patterns); materialize-tasks (Step 3.5 shell-only handling, task doc template Core/Shell section).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
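
As a rough illustration of the broadened mocking rule in the commit message above (item 2), here is a hypothetical Python sketch of a justified mock. `LlmClient`, `build_prompt`, `clean_summary`, and `summarize_ticket` are invented for the example and are not part of the plugin; the point is only the shape: the pure pieces are tested with plain values, and the one mock carries a comment naming its justification (expensive: real money per invocation).

```python
from __future__ import annotations

from unittest import mock


class LlmClient:
    """Hypothetical internal client; a real call would be billed per request."""

    def complete(self, prompt: str) -> str:
        raise NotImplementedError("would call a paid API")


def build_prompt(ticket_text: str) -> str:
    """Pure core: prompt construction is plain string work."""
    return f"Summarize in one line:\n{ticket_text.strip()}"


def clean_summary(raw: str) -> str:
    """Pure core: collapse whitespace in whatever the model returned."""
    return " ".join(raw.split())


def summarize_ticket(client: LlmClient, ticket_text: str) -> str:
    """Shell: the only function that talks to the client."""
    return clean_summary(client.complete(build_prompt(ticket_text)))


def test_core_without_mocks() -> None:
    assert build_prompt("  printer jams  ") == "Summarize in one line:\nprinter jams"
    assert clean_summary("  Printer \n jams.  ") == "Printer jams."


def test_summarize_ticket_with_justified_mock() -> None:
    # Justified mock: LlmClient is expensive (real money per invocation).
    client = mock.Mock(spec=LlmClient)
    client.complete.return_value = "  Printer \n jams.  "
    assert summarize_ticket(client, "printer jams") == "Printer jams."
    client.complete.assert_called_once_with("Summarize in one line:\nprinter jams")
```

Under the rule as described, the same mock without the justification comment would remain a FAIL, and mocking `build_prompt` or `clean_summary` would be a boundary failure since both are pure-core logic.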

andrewemark requested review from adamtilton, cdeutsch, chriscasebolt-driver, dwhensley, eric-driverai, furnissj, ghiotto1, jkonkle and joeschmid as code owners May 26, 2026 20:37

andrewemark changed the title ~~Functional core, imperative shell: encode as load-bearing principle~~ Functional core, imperative shell: encode as essential principle May 26, 2026

andrewemark wants to merge 2 commits into `main` from `andrew/functional-core-imperative-shell`
