[DSPX-3302] (6/6) feature-orchestrate skill + cells-of-effort spec#455
[DSPX-3302] (6/6) feature-orchestrate skill + cells-of-effort spec#455dmihalcik-virtru wants to merge 6 commits into
Conversation
|
Important Review skippedDraft detected. Please check the settings in the CodeRabbit UI or the ⚙️ Run configurationConfiguration used: Organization UI Review profile: CHILL Plan: Pro Run ID: You can disable this status message by setting the Use the checkbox below for a quick retry:
✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Code Review
This pull request introduces the feature-orchestrate skill and a CLI command to automate multi-repo feature implementation using a 'cells of effort' model. This approach allows for granular task management within repositories via git worktrees and dependency-based parallel execution. Feedback from the reviewer focuses on enhancing the orchestrator's reliability and portability, specifically suggesting improvements for git branch handling during worktree setup, consolidating exception handling in subagents, providing configurability for root directories, broadening error catching for YAML parsing, and implementing a cap on parallel worker threads to prevent resource exhaustion.
| subprocess.check_call( | ||
| ["git", "-C", str(repo), "worktree", "add", str(wt), "-b", cell.branch], | ||
| ) |
There was a problem hiding this comment.
The git worktree add -b <branch> command will fail if the branch already exists in the repository. This is a common scenario when retrying a failed orchestration run where the worktree was removed but the branch remains. It's safer to check if the branch exists first and only use -b if it doesn't.
# Check if branch exists in the source repo
has_branch = subprocess.run(
["git", "-C", str(repo), "rev-parse", "--verify", cell.branch],
capture_output=True,
).returncode == 0
cmd = ["git", "-C", str(repo), "worktree", "add", str(wt)]
if not has_branch:
cmd.extend(["-b", cell.branch])
else:
cmd.append(cell.branch)
subprocess.check_call(cmd)| try: | ||
| wt = ensure_worktree(spec, cell) | ||
| except Exception as e: | ||
| return CellResult(cell, Path(), Path(), False, None, f"worktree: {e}") | ||
|
|
||
| ensure_subagent_settings(wt, cell.path) | ||
|
|
||
| transcripts_dir.mkdir(parents=True, exist_ok=True) | ||
| transcript = transcripts_dir / f"{spec.jira or spec.name}-{cell.key}.jsonl" | ||
|
|
||
| cmd = [ | ||
| "claude", "-p", | ||
| "--model", model, | ||
| "--permission-mode", "acceptEdits", | ||
| "--output-format", "stream-json", | ||
| "--verbose", | ||
| build_prompt(spec, cell), | ||
| ] | ||
| try: | ||
| with transcript.open("w", encoding="utf-8") as out: | ||
| completed = subprocess.run( | ||
| cmd, | ||
| cwd=wt, | ||
| stdout=out, | ||
| stderr=subprocess.STDOUT, | ||
| timeout=timeout_s, | ||
| ) | ||
| except subprocess.TimeoutExpired: | ||
| return CellResult(cell, wt, transcript, False, None, f"timed out after {timeout_s}s") | ||
|
|
||
| if completed.returncode != 0: | ||
| return CellResult(cell, wt, transcript, False, None, f"exit {completed.returncode}") | ||
|
|
||
| pr_url: str | None = None | ||
| for line in transcript.read_text(encoding="utf-8").splitlines(): | ||
| m = PR_URL_RE.search(line) | ||
| if m: | ||
| pr_url = m.group(0) | ||
| return CellResult(cell, wt, transcript, True, pr_url, None) |
There was a problem hiding this comment.
The current exception handling in run_cell is fragmented and misses several potential failure points (e.g., ensure_subagent_settings, transcript.open, or OSError from subprocess.run if the claude binary is missing). Wrapping the entire logic in a single try...except block ensures that any failure is captured and reported as a CellResult error, preventing thread crashes in the orchestrator.
wt = Path()
transcript = Path()
try:
wt = ensure_worktree(spec, cell)
ensure_subagent_settings(wt, cell.path)
transcripts_dir.mkdir(parents=True, exist_ok=True)
transcript = transcripts_dir / f"{spec.jira or spec.name}-{cell.key}.jsonl"
cmd = [
"claude", "-p",
"--model", model,
"--permission-mode", "acceptEdits",
"--output-format", "stream-json",
"--verbose",
build_prompt(spec, cell),
]
with transcript.open("w", encoding="utf-8") as out:
completed = subprocess.run(
cmd,
cwd=wt,
stdout=out,
stderr=subprocess.STDOUT,
timeout=timeout_s,
)
if completed.returncode != 0:
return CellResult(cell, wt, transcript, False, None, f"exit {completed.returncode}")
pr_url: str | None = None
for line in transcript.read_text(encoding="utf-8").splitlines():
m = PR_URL_RE.search(line)
if m:
pr_url = m.group(0)
return CellResult(cell, wt, transcript, True, pr_url, None)
except subprocess.TimeoutExpired:
return CellResult(cell, wt, transcript, False, None, f"timed out after {timeout_s}s")
except Exception as e:
return CellResult(cell, wt, transcript, False, None, str(e))| OPENTDF_ROOT = Path.home() / "Documents/GitHub/opentdf" | ||
| WORKTREES_ROOT = Path.home() / "Documents/GitHub/worktrees" |
There was a problem hiding this comment.
| try: | ||
| spec = load_spec(spec_path) | ||
| except ValueError as e: | ||
| typer.echo(f"Error: {e}", err=True) | ||
| raise typer.Exit(1) from e |
There was a problem hiding this comment.
The load_spec function uses ruamel.yaml.load, which can raise various YAMLError exceptions if the input file is malformed. The current try...except only catches ValueError. Catching a broader exception would make the CLI more robust against invalid YAML files.
| try: | |
| spec = load_spec(spec_path) | |
| except ValueError as e: | |
| typer.echo(f"Error: {e}", err=True) | |
| raise typer.Exit(1) from e | |
| try: | |
| spec = load_spec(spec_path) | |
| except Exception as e: | |
| typer.echo(f"Error: {e}", err=True) | |
| raise typer.Exit(1) from e |
| ) | ||
| failed.add(skipped.key) | ||
|
|
||
| with concurrent.futures.ThreadPoolExecutor(max_workers=max(1, len(runnable))) as ex: |
There was a problem hiding this comment.
Spawning one thread (and one claude -p process) per cell in a wave without a cap could lead to resource exhaustion (CPU/Memory) or API rate limiting if a feature has many independent cells. It's recommended to cap the max_workers to a reasonable value (e.g., 8).
max_workers = min(8, len(runnable)) if runnable else 1
with concurrent.futures.ThreadPoolExecutor(max_workers=max_workers) as ex:#450) ## Summary First PR in a five-part stack that introduces a multi-instance test harness and a Claude plugin for OpenTDF bug reproduction. This PR adds *only* the shared Pydantic schema in `otdf-sdk-mgr` — no consumers yet. - Adds `otdf_sdk_mgr.schema` with v2 models: `Scenario`, `Instance`, `PlatformPin`, `KasPin`, `SdkPin`, `ScenarioSdks`, `Suite`, etc. - `ScenarioSdks.encrypt` / `.decrypt` mirror xtest's existing `--sdks-encrypt` / `--sdks-decrypt` convention so a→b-only scenarios are first-class. - `python -m otdf_sdk_mgr.schema validate <path>` validates either a Scenario or an Instance file based on its `kind:`. - Adds `pydantic` + `ruamel.yaml` to `otdf-sdk-mgr/pyproject.toml`. - 6 unit tests covering round-trips, pin invariants, and unknown-field rejection. ## Stack 1. [**This PR**](#450) — Shared schema 2. [Platform installer + `install scenario`](#451) in `otdf-sdk-mgr` (builds on this) 3. `otdf-local` [multi-instance refactor](#452) + new CLI subcommands 4. `xtest/conftest.py` [integration](#453) (`--scenario`, `--instance`) 5. [Claude plugin](#454) (`.claude/skills/`, settings, plugin manifest) 6. #455 ## Test plan - [x] `cd otdf-sdk-mgr && uv run pytest tests/test_schema.py` — all 6 pass - [x] `uv run python -m otdf_sdk_mgr.schema validate <path>` accepts a valid scenarios.yaml and rejects unknown fields Jira: https://virtru.atlassian.net/browse/DSPX-3302 🤖 Generated with [Claude Code](https://claude.com/claude-code) <!-- This is an auto-generated comment: release notes by coderabbit.ai --> ## Summary by CodeRabbit * **New Features** * Added schema validation for OpenTDF Scenario and Instance YAML configurations with a new CLI command. * Introduced strict validation with cross-field constraints for SDK and platform configurations. * **Documentation** * Updated supported container formats from `nano` to `ztdf-ecwrap`. * **Dependencies** * Updated core package dependencies to support enhanced validation capabilities. <!-- review_stack_entry_start --> [](https://app.coderabbit.ai/change-stack/opentdf/tests/pull/450?utm_source=github_walkthrough&utm_medium=github&utm_campaign=change_stack) <!-- review_stack_entry_end --> <!-- end of auto-generated comment: release notes by coderabbit.ai --> --------- Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
5197c21 to
aae201c
Compare
b9c6214 to
7b646b2
Compare
X-Test Failure Report✅ java@main-main |
aae201c to
d6eacb2
Compare
…-3302)
Implements the second half of the multi-repo feature workflow. feature-design
(landed in PR 5) authors the spec and tests-side artifacts; feature-orchestrate
reads that spec, creates git worktrees, and fans claude -p subagents out in
topological waves so each cell of work proceeds in parallel where possible.
Spec schema change (informal — no Pydantic model yet):
The platform monorepo holds proto definitions, the Go SDK, KAS service code,
and shared libs, so a single feature often touches multiple "cells of effort"
inside one repo. The spec now expresses work as cells, not repos. Each cell
has a `path:` (which sibling repo to worktree from), a `branch:`, a `todo:`
list, and an optional `depends_on:` edge. Canonical example: every SDK cell
declares `depends_on: [platform-proto]` whenever the feature changes wire
format — the proto cell regenerates Go/Java/JS bindings before the SDKs can
adopt them.
- `otdf-sdk-mgr orchestrate run <spec>` (new CLI verb in `cli_orchestrate.py`):
parses the spec, topologically sorts cells by `depends_on`, creates worktrees
at `~/Documents/GitHub/worktrees/<JIRA-KEY>-<cell-key>/`, and dispatches one
`claude -p` subagent per cell. Cells in the same wave run in parallel via
`ThreadPoolExecutor`. Per-cell prompts embed the full spec body for
cross-cell context. `--dry-run`, `--only <cell>`, `--timeout`, `--model`
flags. Per-cell stdout captured to `.claude/tmp/runs/<KEY>-<cell>.jsonl`.
- Per-worktree `.claude/settings.json` auto-written if absent, with a minimal
allowlist tailored to the repo (`go`/`make`/`buf` for platform, `mvn` for
java-sdk, `npm` for web-sdk) plus universal `git`/`gh pr create`. Skipped
if the user has committed their own settings.
- `feature-orchestrate` SKILL.md: thin wrapper that surfaces the dry-run plan,
asks the user to confirm, then dispatches.
- `feature-design` SKILL.md Step 2/3: teaches the cell shape, `path:`,
`depends_on:`, and the proto-blocks-SDK pattern as the canonical example.
Branches are now per-cell (`<JIRA-KEY>-<cell-key>`) so concurrent worktrees
of the same repo don't collide.
- 10 new unit tests (`test_orchestrate.py`): load + validation, topological
waves with skip semantics, cycle detection (incl. diamond), worktree path
resolution. All passing alongside the existing 65.
- `xtest/features/{README,CLAUDE}.md`: cell-aware terminology.
- `settings.json` + `plugin.json`: `Bash(claude -p *)`, `Bash(git worktree *)`,
`Bash(gh pr create *)` allowlists; `Skill(feature-orchestrate)` registered.
Out of scope for this PR: status command, --retry, Pydantic model for Feature,
cross-PR linking automation, non-Sonnet subagent models. See plan file for
the full follow-up list.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…X-3302) Change Write(xtest/bug_*_test.py) to Write(xtest/**) — wildcards must sit at path boundaries, not in filename patterns. Consolidates and simplifies xtest write permissions across both settings.json and plugin.json. Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>
…e fixes (DSPX-2399/DSPX-3302) Add pure ML-KEM-768 (FIPS 203) as a new TDF key-wrapping mechanism: - feature_type "mechanism-mlkem" in tdfs.py (platform gate >= 0.15.0) - mlkem:768 algorithm + enum 13 in abac.py - key_mlkem768 / attribute_with_mlkem768_key fixtures in fixtures/keys.py - test_mlkem768_roundtrip in test_pqc.py (dormant until SDK supports land) - xtest/features/mechanism-mlkem.yaml spec (DSPX-2399, 5 cells) - xtest/scenarios/mechanism-mlkem.yaml scenario Fix feature-orchestrate unattended-run failures: - Add golangci-lint to platform/otdfctl REPO_ALLOW (was blocking platform-service) - Disable commit.gpgsign in new worktrees (was blocking java-sdk via 1Password) - Add deferred signing section to final report with retroactive sign commands Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>
…D workflow (DSPX-3302) The orchestrator now handles the `tests` cell directly — push branch + open draft PR — in Wave 1 alongside `platform-proto`, rather than skipping it. This makes the dormant test definitions visible to downstream SDK CI jobs early, enabling a TDD/BDD approach where each per-repo PR can validate against the test suite as it adds its `supports <feature>` case. - Add TESTS_REPO constant and run_tests_cell() (push + idempotent gh pr create) - Default skip set is now empty; --only still works as before - Dry-run labels the tests cell action=push+draft-PR - _dispatch() routes tests cell to run_tests_cell, others to run_cell - Deferred signing section excludes the tests cell (user signed those commits) - Update SKILL.md description and body to reflect TDD lifecycle + new behavior - Fix module docstring (no longer says tests cell is skipped) Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
… (DSPX-3302) Re-running `orchestrate run` after a partial failure now resumes from where it left off: cells whose branch already has an open PR are skipped (reported as OK with the existing URL), and only failed/pending cells are dispatched. - Add check_existing_pr() helper (gh pr list --head <branch>) - run_cell() checks for existing PR before creating worktree or launching subagent - --force flag bypasses the idempotency check (re-run even with existing PR) - Dry-run annotates already-done cells with [PR EXISTS: <url>] - SKILL.md: update Step 1/2 + replace "When to use partial runs" with "Resumption and partial runs" covering fix-and-retry, staging, and --force Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…rompt (DSPX-3302) Subagents now receive an explanation of why `supports <feature>` must be added to the SDK CLI wrapper and how xtest uses it to activate dormant tests. Includes the pattern for probing SDK capabilities and a reminder that the scenario(s) are listed in the PR body. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
7b646b2 to
5495629
Compare
|
X-Test Failure Report✅ java@main-main |



Summary
Adds
feature-orchestrate— the second half of the multi-repo feature workflow thatfeature-design(PR 5) started. Reads a feature spec atxtest/features/<name>.yaml, creates git worktrees, and fansclaude -psubagents out in topological waves to implement each cell of work in parallel where possible.What it does
feature-orchestrateskill (tests/.claude/skills/feature-orchestrate/SKILL.md) — thin wrapper that surfaces a dry-run plan, asks the user to confirm, then dispatches.otdf-sdk-mgr orchestrate run <spec>(tests/otdf-sdk-mgr/src/otdf_sdk_mgr/cli_orchestrate.py) — the engine. Parses the spec, topologically sorts cells bydepends_on, creates worktrees at~/Documents/GitHub/worktrees/<JIRA-KEY>-<cell-key>/, dispatches oneclaude -psubagent per cell. Cells in the same wave run in parallel viaThreadPoolExecutor. Each per-cell prompt embeds the full spec body for cross-cell context. Flags:--dry-run,--only <cell>(repeatable),--timeout,--model.Spec schema change (cells, not repos)
The Feature spec format evolves from "one entry per repo" to "one entry per cell of effort." The platform monorepo holds proto definitions, the Go SDK, KAS service code, and shared libs, so a single feature often touches multiple cells inside one repo. Each cell now has a
path:(which sibling repo to worktree from), abranch:, atodo:list, and an optionaldepends_on:edge.Canonical example: every SDK cell declares
depends_on: [platform-proto]whenever the feature changes wire format — the proto cell regenerates Go/Java/JS bindings before the SDKs can adopt them.feature-design's SKILL.md Step 2/3 is updated in this PR to teach the new cell shape. Branches are now per-cell (<JIRA-KEY>-<cell-key>) so concurrent worktrees of the same repo don't collide.Subagent dispatch
Each per-cell subagent:
claude -p --model sonnet --permission-mode acceptEditsinside its worktree.path/branch/todo+ house-style commit guidance (subject embeds(DSPX-XXXX), noJira:footer,Co-Authored-By: Claude).gh pr create --draft, prints the PR URL as the last line of stdout.--timeoutto override).The orchestrator pre-writes a minimal
.claude/settings.jsoninto each worktree if one isn't already there, allowlisting the repo-type-appropriate test commands (go/make/buffor platform,mvnfor java-sdk,npmfor web-sdk) plus universalgit/gh pr create. User-committed.claude/settings.jsonfiles are left alone.Stack
This PR is stacked on #454; merge that one first.
Test plan
cd tests/otdf-sdk-mgr && uv run pytest tests/test_orchestrate.py(10 new tests covering topological sort, cycle detection, diamond dependencies, schema validation, worktree path resolution)uv run otdf-sdk-mgr orchestrate run <spec> --dry-run— verify topo order and per-cell worktree paths print correctlyuv run otdf-sdk-mgr orchestrate run <spec> --only platform-proto— verify worktree creation, subagent dispatch, and PR URL extraction on one celldepends_on) — verify wave ordering and parallel dispatch within a wave.claude/tmp/runs/<KEY>-<cell>.jsonl— confirm the subagent invokedgh pr create --draftand printed a PR URLOut of scope (deferred follow-ups)
feature-orchestrate status <spec>— pollsgh pr listto update on PR progress per cell.feature-orchestrate --retry <cell-key>— re-launch a failed subagent (v1 just reports the failure and tells the user to re-invoke with--only <cell>).Featuremodel + schema-dump entry (informal YAML for v1).--model sonnet).--permission-mode acceptEdits+ auto-written.claude/settings.json; a tighter per-repo allowlist is a follow-up.Jira: https://virtru.atlassian.net/browse/DSPX-3302
🤖 Generated with Claude Code