From 6ecb12641c93c9aa3b76c2d522f5c6ba3ea2a55a Mon Sep 17 00:00:00 2001
From: Mikhail Petrov
Date: Thu, 2 Jul 2026 17:01:00 +0300
Subject: [PATCH] docs(changelog): document commits missing from [Unreleased]; fix map-release CHANGELOG gate awk bug
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

CHANGELOG.md was missing entries for #317, #307, #301/#300, #292/#289, #280,
and #279, and the Claude-side adversarial review feature (2991e7e) — all
landed after v3.20.0 but never documented.

Also fix the /map-release skill's Gate 12 completeness check: the awk
range pattern /## \[Unreleased\]/,/## \[/ collapses to a single line
when start and end match the same line, which they do for the
"## [Unreleased]" heading itself (it matches both patterns). This made
the gate always report 0 CHANGELOG entries. Switched to an explicit
flag-based awk in all three occurrences (count, review display,
semver-analysis extraction) and dropped the now-unneeded trailing
`sed '$d'`.
---
 .claude/skills/map-release/SKILL.md                  | 12 ++++++++----
 CHANGELOG.md                                         |  7 +++++++
 src/mapify_cli/templates/skills/map-release/SKILL.md | 12 ++++++++----
 .../templates_src/skills/map-release/SKILL.md.jinja  | 12 ++++++++----
 4 files changed, 31 insertions(+), 12 deletions(-)

diff --git a/.claude/skills/map-release/SKILL.md b/.claude/skills/map-release/SKILL.md
index d1ae7960..3a20ccdc 100644
--- a/.claude/skills/map-release/SKILL.md
+++ b/.claude/skills/map-release/SKILL.md
@@ -198,8 +198,12 @@ if [[ -n "$LAST_TAG" ]]; then
  # maintenance commits, which otherwise make this heuristic chase its own fixes.
  COMMITS_SINCE=$(git log ${LAST_TAG}..HEAD --no-merges --format="%s" | awk '!/^(docs\(changelog\)|chore\(release\):)/ { count++ } END { print count + 0 }')

- # Count CHANGELOG entries in [Unreleased] section
- CHANGELOG_ENTRIES=$(awk '/## \[Unreleased\]/,/## \[/' CHANGELOG.md | grep -cE "^- " || echo "0")
+ # Count CHANGELOG entries in [Unreleased] section.
+ # NOTE: a range-pattern awk (/start/,/end/) collapses to the single
+ # matching line when start and end match the SAME line — and "##
+ # [Unreleased]" matches both "/## \[Unreleased\]/" and "/## \[/". Use an
+ # explicit flag instead so the range spans past the heading line itself.
+ CHANGELOG_ENTRIES=$(awk '/^## \[Unreleased\]/{f=1;next} /^## \[/{f=0} f' CHANGELOG.md | grep -cE "^- " || echo "0")

  echo "Counted commits since $LAST_TAG: $COMMITS_SINCE"
  echo "(excluding docs(changelog) and chore(release) maintenance commits)"
@@ -216,7 +220,7 @@ if [[ -n "$LAST_TAG" ]]; then
  echo "════════════════════════════════════════════════════════"
  echo ""
  echo "Current CHANGELOG [Unreleased] content:"
- awk '/## \[Unreleased\]/,/## \[/' CHANGELOG.md | sed '$d'
+ awk '/^## \[Unreleased\]/{f=1;next} /^## \[/{f=0} f' CHANGELOG.md
  echo ""

  # Ask user to update CHANGELOG
@@ -289,7 +293,7 @@ Read CHANGELOG.md [Unreleased] section to determine bump type:

 ```bash
 # Extract unreleased changes
-UNRELEASED_CHANGES=$(awk '/## \[Unreleased\]/,/## \[/' CHANGELOG.md | sed '$d')
+UNRELEASED_CHANGES=$(awk '/^## \[Unreleased\]/{f=1;next} /^## \[/{f=0} f' CHANGELOG.md)
 ```

 **Semantic Versioning Rules:**
diff --git a/CHANGELOG.md b/CHANGELOG.md
index 161d23c0..c4c5a27f 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -18,9 +18,16 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
 - **Parallel-wave merge coordinator for worktree isolation (`merge_wave_worktrees`, part of #284 Phase 2).** Wires the existing wave/DAG scheduler to per-subtask worktree isolation so a parallel wave's independent subtasks each run in their own worktree and are accepted **atomically**. Every worktree of a wave is cut off the same base (HEAD at wave start), so they cannot be merged one at a time — the first `merge_subtask_worktree` advances HEAD and the next trips `BASE_DIVERGED`. The new coordinator relaxes *only* that guard to a wave-scoped form: it refuses **external** HEAD movement (`EXTERNAL_HEAD_MOVED`) but allows the sibling divergence each in-wave squash-merge creates. It derives `wave_base_sha` from the sidecar (never a caller parameter), preflights every worktree (commit + per-worktree guards + pre-merge verify) BEFORE touching the working branch, then squash-merges each accepted worktree **by frozen SHA in sorted id order** (one runner commit per subtask — the one-commit-per-subtask contract holds), then runs **one post-wave full gate on the merged tree inside the same transaction**. It is **all-or-nothing** (council-reviewed, conv `c29d6fa9`): any textual conflict, commit failure, or post-wave-gate failure rolls the whole working branch back to the wave base via `git reset --hard` + `git clean -fd` (squash leaves no `MERGE_HEAD`, so `git merge --abort` is never used; MAP runtime state is excluded from the clean) and leaves **every** worktree intact for retry — no partial-wave state ever survives. Safety extras: an advisory `flock` serializes coordinators (`MERGE_IN_PROGRESS`); attached-/clean-target preconditions; conflicted paths are attributed back to the subtasks that touched them (declared-disjoint `affected_files` is only a scheduler hint, so actual changed-file overlap is reported as advisory telemetry while git's textual conflict stays the hard guard). The shared `_wt_freeze_and_verify` primitive (commit + guards + pre-merge verify) is extracted once and reused by both the single-subtask and wave merge paths. CLI: `merge_wave_worktrees [--branch B] [--verify-cmd CMD…] [--skip-verify] [--post-wave-cmd CMD…] [--skip-post-wave]`. Phase 3 (context-budget hooks) remains open on #284.
 - **Per-subtask git worktree isolation for `/map-efficient` (`worktree.isolation`, part of #284).** Opt-in, OFF by default. When enabled, each subtask's Actor runs in a dedicated, throwaway git worktree and its result is squash-merged back into the working branch ONLY after the configured `verification_checks` pass IN the worktree (a **pre-merge** gate, strictly stronger than today's post-commit check) — a rejected attempt (Monitor `valid=false` / Evaluator fail) is discarded so the working branch is never touched by a bad attempt. The Python step runner owns the whole lifecycle and every safety guard (producer-owns-parse): `create_subtask_worktree` (crash-safe remove-and-recreate; guards: not-a-repo, protected-ref, nested-worktree refusal, active-git-op, `subtask_id` ref/path sanitization, dirty-main refusal, submodule init), `merge_subtask_worktree` (guards run BEFORE the working branch is touched: base-divergence `git merge-base` check, runtime-state-in-diff, configurable bulk-deletion threshold `worktree.max_deletions`, submodule-pointer change, detached-HEAD, then the pre-merge verify gate; accept = `git merge --squash` + one runner-authored commit, never `--no-ff`, preserving one-commit-per-subtask), `discard_subtask_worktree` (atomic reject, idempotent, optional `--save-patch` forensics), and `worktree_isolation_status` (reconciles recorded vs live worktrees). Worktrees are stored OUT of the working tree under the repo's git common dir (`/map-framework/worktrees/`), so `git clean -fdx`, recursive scanners, and accidental commits can never touch them; MAP runtime state (`.map//...`) always resolves against the main checkout — state-mutating commands refuse if invoked from inside a managed worktree (the silent state-desync footgun). Every guard returns a structured `{kind, message}` the skill branches on. Config keys `worktree.{isolation,max_deletions}`; new `worktree` manifest stage; `.map//worktrees.json` sidecar. Design was llm-council-reviewed (runner-owned worktrees over harness-native `isolation="worktree"`; squash-merge over `--no-ff`; always-discard on reject; pre-merge verification + crash-safe retry + atomic reject folded in so the slice is not a no-op; explicit state-root separation). Phase 2 (wave/DAG parallelism) and Phase 3 (context-budget hooks) remain open on #284.
 - **Cross-AI peer review for `/map-review` (`--cross-ai `, part of #288).** `/map-review --cross-ai codex|gemini|claude|opencode` dispatches the review to an INDEPENDENT external AI CLI for a true second opinion (different model/vendor, fresh context with no shared session). The dispatch, parsing, normalization, and untrusted-wrapping all live in the Python step runner (`run_cross_ai_review` / `dispatch_cross_ai_review`, producer-owns-parse) — the skill only handles consent and presentation. Egress is **double-consent**: the per-run `--cross-ai` flag AND `review.cross_ai.enabled: true` in `.map/config.yaml` (off by default) are both required, because the diff/code leaves the machine. Mandatory guardrails: a **high-confidence outbound secret scan** (private keys, AWS/GitHub/Google/Slack credentials) BLOCKS dispatch before the subprocess and surfaces only the pattern name, never the value; the external CLI is invoked `shell=False` with a literal-argv adapter and a configurable timeout; the returned findings ALWAYS enter context behind an `EXTERNAL UNTRUSTED REFERENCE` fence (link/injection scan, applied deterministically in Python so the model cannot skip it) and are advisory-only (`source: cross_ai`, never auto-applied); same-vendor runtimes (`claude`) are honestly labeled `independent_vendor: false`. Any dispatch failure (disabled, CLI missing, not authenticated, timeout, non-JSON output, secret-blocked) degrades non-blockingly and falls back to the in-session review. Config keys `review.cross_ai.{enabled,runtime,timeout_seconds}`. Design was llm-council-reviewed (Python-owned dispatch; single-runtime slice with `--cross-ai all` consensus deferred to a follow-up slice).
+- **Adversarial multi-perspective code review (`/map-review --adversarial`).** Runs three parallel independent reviewers with isolated contexts instead of a single monitor pass: Blind Hunter (diff-only, unbiased by stated intent), Edge Case Hunter (diff + repo read; null handling, boundaries, error paths), and Acceptance Auditor (diff + spec + artifacts; missed requirements, AC gaps). Adds a `--quick` flag (Blind + Acceptance, skips Edge Case) and a `--show-raw-findings` debug flag. Findings use a structured severity/category/evidence/failure_mode schema, deduplicated via deterministic clustering with corroboration signals, and rolled up into a unified report with a convergence section and all-clear statements. New `build_adversarial_review_prompts()` / `aggregate_adversarial_findings()` in the step runner, plus an `adversarial-reference.md` workflow doc. This is the Claude-side feature the Codex port (above) mirrors.
+- **`mapify tokenreport` dashboard, history, estimate, and export modes (closes #289).** `token_report_dashboard()` adds a box-drawing visual layout (session summary, per-subtask bar chart, per-agent/model breakdowns, vs-previous-session comparison); `record_session_snapshot()` persists `token_history.jsonl` for `token_report_history()` trend analysis; `token_report_estimate()` gives a weighted cost projection; `token_report_json()` / `token_report_csv()` support CI/export. New CLI flags: `--dashboard`, `--history`, `--json`, `--csv`, `--estimate`, `--finalize`.
+- **Learned rules scoped by `path_glob` (closes #280).** Rules with a `paths:` frontmatter key are now filtered before Actor context and personal-rules injection, and only load when the agent is working on matching files — aligning with Claude Code's hierarchical rule-loading pattern instead of injecting every learned rule into every subtask regardless of relevance.
+- **Auto-created GitHub Release in the release CI workflow (closes #279).** `release.yml` now uses `softprops/action-gh-release@v2` to auto-create the GitHub Release (with a changelog excerpt) on tag publish, with the required `contents: write` / `id-token: write` permissions. The manual Phase 5.4 (`gh release create`) step is dropped from the `/map-release` skill; the summary/checklist now reference the auto-created release URL instead.

 ### Fixed
 - **`detect_actor_files_changed_mismatch` no longer false-positives on MAP-only subtask artifacts (closes #277).** The actor files-changed gate validated every declared file against `_current_subtask_changed_files`, which derives from `git diff`/`git status` and strips the gitignored framework trees (`.map/`, `.codex/`, `.agents/`). A subtask whose only declared `affected_files` entry was a MAP artifact (e.g. `.map//verification-summary.md`) therefore always reported `status_mismatch=true` with a false "Actor declared files it did not write" recovery instruction, making MAP-only documentation/verification subtasks look like truncated actor edits. The detector now partitions declared files: git-tracked files keep the diff check, while MAP-internal artifacts are validated by filesystem existence + non-empty content (a missing or empty artifact is still a real mismatch). MAP-artifact validation is independent of git availability, so a MAP-only subtask is never forced into a false mismatch by a git error. A new shared `_is_map_internal_artifact` helper de-duplicates the framework-tree prefix list used by both the strip filter and the new validation path.
+- **Workflow-context injection no longer fires on a terminal `COMPLETE` state (closes #317).** When `step_state.json` has `current_step_id` or `current_step_phase` equal to `"COMPLETE"`, `format_reminder()` now returns `None` immediately via a terminal-state guard, so the hook emits `{}` instead of a misleading "REQUIRED: Complete phase COMPLETE" banner after a workflow has already finished. Added a `_TERMINAL_STEP_IDS` frozenset constant and regression tests covering both the subprocess-integration and unit (`format_reminder`) paths.
+- **`record_test_baseline` timeout is now fail-safe, not fail-open (closes #307).** When the baseline subprocess timed out it never finished, so `baseline_failures` was always `[]` — indistinguishable from a genuinely clean suite, silently treating any pre-existing failure as "not pre-existing" and defeating the regression-vs-pre-existing distinction. Status is now `"timed_out"` (distinct from `"baseline_failures"`); a new `baseline_complete: bool` field is `false` on timeout so downstream code can check it before trusting an empty baseline; `list_baseline_failures` propagates `baseline_complete`/`timed_out` and emits a `warning` key when the stored baseline is incomplete. Default `timeout_seconds` raised from 120 to 600 to give most suites room to finish; `--timeout` still accepts an explicit value.
+- **Bare-basename spec citations now auto-resolve instead of hard-failing (closes #301, closes #300).** `validate_spec_citations.py` resolves a bare filename citation (e.g. `api.ts:80`) automatically when it is unique in the repo; an ambiguous bare basename now produces a non-blocking warning instead of a hard error, and a genuinely missing file gets a clearer error message. Separately, `/map-plan` Step 0's research-agent now writes its full report directly to disk (with the pipe-based fallback kept), documenting the `SendMessage` vs. new-`Agent()` footgun for future skill authors.

 ## [3.20.0] - 2026-06-26

diff --git a/src/mapify_cli/templates/skills/map-release/SKILL.md b/src/mapify_cli/templates/skills/map-release/SKILL.md
index d1ae7960..3a20ccdc 100644
--- a/src/mapify_cli/templates/skills/map-release/SKILL.md
+++ b/src/mapify_cli/templates/skills/map-release/SKILL.md
@@ -198,8 +198,12 @@ if [[ -n "$LAST_TAG" ]]; then
  # maintenance commits, which otherwise make this heuristic chase its own fixes.
  COMMITS_SINCE=$(git log ${LAST_TAG}..HEAD --no-merges --format="%s" | awk '!/^(docs\(changelog\)|chore\(release\):)/ { count++ } END { print count + 0 }')

- # Count CHANGELOG entries in [Unreleased] section
- CHANGELOG_ENTRIES=$(awk '/## \[Unreleased\]/,/## \[/' CHANGELOG.md | grep -cE "^- " || echo "0")
+ # Count CHANGELOG entries in [Unreleased] section.
+ # NOTE: a range-pattern awk (/start/,/end/) collapses to the single
+ # matching line when start and end match the SAME line — and "##
+ # [Unreleased]" matches both "/## \[Unreleased\]/" and "/## \[/". Use an
+ # explicit flag instead so the range spans past the heading line itself.
+ CHANGELOG_ENTRIES=$(awk '/^## \[Unreleased\]/{f=1;next} /^## \[/{f=0} f' CHANGELOG.md | grep -cE "^- " || echo "0")

  echo "Counted commits since $LAST_TAG: $COMMITS_SINCE"
  echo "(excluding docs(changelog) and chore(release) maintenance commits)"
@@ -216,7 +220,7 @@ if [[ -n "$LAST_TAG" ]]; then
  echo "════════════════════════════════════════════════════════"
  echo ""
  echo "Current CHANGELOG [Unreleased] content:"
- awk '/## \[Unreleased\]/,/## \[/' CHANGELOG.md | sed '$d'
+ awk '/^## \[Unreleased\]/{f=1;next} /^## \[/{f=0} f' CHANGELOG.md
  echo ""

  # Ask user to update CHANGELOG
@@ -289,7 +293,7 @@ Read CHANGELOG.md [Unreleased] section to determine bump type:

 ```bash
 # Extract unreleased changes
-UNRELEASED_CHANGES=$(awk '/## \[Unreleased\]/,/## \[/' CHANGELOG.md | sed '$d')
+UNRELEASED_CHANGES=$(awk '/^## \[Unreleased\]/{f=1;next} /^## \[/{f=0} f' CHANGELOG.md)
 ```

 **Semantic Versioning Rules:**
diff --git a/src/mapify_cli/templates_src/skills/map-release/SKILL.md.jinja b/src/mapify_cli/templates_src/skills/map-release/SKILL.md.jinja
index d1ae7960..3a20ccdc 100644
--- a/src/mapify_cli/templates_src/skills/map-release/SKILL.md.jinja
+++ b/src/mapify_cli/templates_src/skills/map-release/SKILL.md.jinja
@@ -198,8 +198,12 @@ if [[ -n "$LAST_TAG" ]]; then
  # maintenance commits, which otherwise make this heuristic chase its own fixes.
  COMMITS_SINCE=$(git log ${LAST_TAG}..HEAD --no-merges --format="%s" | awk '!/^(docs\(changelog\)|chore\(release\):)/ { count++ } END { print count + 0 }')

- # Count CHANGELOG entries in [Unreleased] section
- CHANGELOG_ENTRIES=$(awk '/## \[Unreleased\]/,/## \[/' CHANGELOG.md | grep -cE "^- " || echo "0")
+ # Count CHANGELOG entries in [Unreleased] section.
+ # NOTE: a range-pattern awk (/start/,/end/) collapses to the single
+ # matching line when start and end match the SAME line — and "##
+ # [Unreleased]" matches both "/## \[Unreleased\]/" and "/## \[/". Use an
+ # explicit flag instead so the range spans past the heading line itself.
+ CHANGELOG_ENTRIES=$(awk '/^## \[Unreleased\]/{f=1;next} /^## \[/{f=0} f' CHANGELOG.md | grep -cE "^- " || echo "0")

  echo "Counted commits since $LAST_TAG: $COMMITS_SINCE"
  echo "(excluding docs(changelog) and chore(release) maintenance commits)"
@@ -216,7 +220,7 @@ if [[ -n "$LAST_TAG" ]]; then
  echo "════════════════════════════════════════════════════════"
  echo ""
  echo "Current CHANGELOG [Unreleased] content:"
- awk '/## \[Unreleased\]/,/## \[/' CHANGELOG.md | sed '$d'
+ awk '/^## \[Unreleased\]/{f=1;next} /^## \[/{f=0} f' CHANGELOG.md
  echo ""

  # Ask user to update CHANGELOG
@@ -289,7 +293,7 @@ Read CHANGELOG.md [Unreleased] section to determine bump type:

 ```bash
 # Extract unreleased changes
-UNRELEASED_CHANGES=$(awk '/## \[Unreleased\]/,/## \[/' CHANGELOG.md | sed '$d')
+UNRELEASED_CHANGES=$(awk '/^## \[Unreleased\]/{f=1;next} /^## \[/{f=0} f' CHANGELOG.md)
 ```

 **Semantic Versioning Rules:**
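
Reviewer note: the range-pattern collapse described in the commit message can be reproduced outside the skill. The following is a minimal sketch and not part of the patch. The sample changelog and the helper names `sample_changelog`, `count_with_range`, and `count_with_flag` are hypothetical, made up for this illustration; only the two awk programs are taken from the diff.

```shell
#!/usr/bin/env bash
# Hypothetical sample changelog, invented for this demonstration.
sample_changelog() {
  cat <<'EOF'
# Changelog

## [Unreleased]

### Added
- first entry
- second entry

### Fixed
- third entry

## [1.0.0] - 2020-01-01
- released entry
EOF
}

# Old form: the range opens on "## [Unreleased]" and closes on that same
# line, because the heading also matches the end pattern /## \[/.
count_with_range() {
  sample_changelog | awk '/## \[Unreleased\]/,/## \[/' | grep -cE '^- ' || true
}

# New form: set a flag on the heading, skip the heading itself, and clear
# the flag on the next "## [" heading. Lines print only while f is set.
count_with_flag() {
  sample_changelog | awk '/^## \[Unreleased\]/{f=1;next} /^## \[/{f=0} f' | grep -cE '^- ' || true
}

echo "range pattern: $(count_with_range)"   # prints "range pattern: 0"
echo "flag pattern:  $(count_with_flag)"    # prints "flag pattern:  3"
```

The sketch uses `|| true` rather than the skill's `|| echo "0"`: `grep -c` already prints `0` before exiting non-zero, so an extra `echo` would emit a second line in the zero-match case.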