Add wrong-location guardrails: PreToolUse hook + artifact validator#116
Add wrong-location guardrails: PreToolUse hook + artifact validator#116KjellKod wants to merge 19 commits into
Conversation
Two PR #116 review findings: 1) Validator silently accepted missing or misplaced canonical artifacts (Codex inline, scripts/quest_artifact_postflight.py:214). The loop iterated declared_paths only, so an "artifacts": [] handoff returned 0 without recording mismatches, and a canonical-named file at a misplaced sub-path inside the role boundary (e.g. phase_01_plan/nested/plan.md) passed every per-path check. Fix: in run(), compare resolved expected_paths to resolved declared_paths before the per-declared loop. For each expected canonical path that is not declared or not present on disk at the canonical location, emit a "missing" mismatch record. Remove the existence check from _check_one to avoid double-emitting when a declared canonical path is also flagged by the coverage step. AC4 ("mismatches cause non-zero exit and are not silently accepted") now holds for the empty-artifacts and misplaced cases as well. Three new tests cover the failure modes: - empty artifacts array for a role with expected paths - omitted canonical artifact - misplaced canonical-named file inside the phase boundary 2) Revert .ai/allowlist.json model pins. The all-Claude pin was scoped to the wrong-location-guardrails quest only; the per-quest snapshot at .quest/<id>/logs/allowlist_snapshot.json preserves that record. Keeping the global pin on main would change repo-default Quest routing for every checkout. Restored the prior mixed-model defaults. Tests: 587 unit + 2 perf (was 584 + 2). Both new mismatch behaviors covered. Quest/Co-Authored by Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> in collaboration with the repository author
There was a problem hiding this comment.
6 issues found across 18 files
Prompt for AI agents (unresolved issues)
Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.
<file name=".skills/quest/delegation/workflow.md">
<violation number="1" location=".skills/quest/delegation/workflow.md:184">
P2: The postflight example uses a generic `handoff.json` path that is incorrect for several role-specific handoff filenames.</violation>
</file>
<file name=".claude/hooks/branch-dir-context.sh">
<violation number="1" location=".claude/hooks/branch-dir-context.sh:20">
P2: Escape newlines in the working directory before printing; otherwise a valid path containing `\n`/`\r` breaks the hook's single-line stdout contract.</violation>
</file>
<file name="tests/unit/test_branch_dir_context_hook.sh">
<violation number="1" location="tests/unit/test_branch_dir_context_hook.sh:1">
P2: This test file is not wired into the repo’s test runner, so CI will never execute it. Add a shell-test step or convert it to a pytest module so the new coverage actually runs.</violation>
</file>
<file name="tests/unit/test_workflow_postflight_wireup.py">
<violation number="1" location="tests/unit/test_workflow_postflight_wireup.py:70">
P2: Scope this check to the actual postflight step, not any filename mention in the file.</violation>
<violation number="2" location="tests/unit/test_workflow_postflight_wireup.py:135">
P2: Don't end the scan on the first bold subheading; it truncates the artifact-preparation block early.</violation>
</file>
<file name="scripts/quest_artifact_postflight.py">
<violation number="1" location="scripts/quest_artifact_postflight.py:136">
P2: Catch `OSError`, `UnicodeDecodeError`, and `json.JSONDecodeError` when reading the handoff file, and ensure `data` is a dictionary before calling `.get()`. This avoids unhandled exceptions (e.g., `FileNotFoundError` or `AttributeError`) that could crash the postflight validator if the orchestrator text-fallback didn't write a valid JSON file.
(Based on your team's feedback about catching OSError and UnicodeDecodeError when reading UTF-8 files to avoid crashing quest completion.) [FEEDBACK_USED]</violation>
</file>
Tip: cubic can generate docs of your entire codebase and keep them up to date. Try it here.
Re-trigger cubic
Capture the design and analysis surfaced by PR #116 review: today the only place to set per-role model assignments is the shared .ai/allowlist.json, so a one-quest override either ships globally or has to be reverted by hand after the quest closes (the exact friction Codex flagged on PR #116). The proposed design lets each quest carry its own active orchestration: - After preflight + route selection but before folder creation, show the current models block and offer a one-keystroke confirm or a short per-role chooser. - Write the chosen models block into a per-quest orchestration.json alongside the existing allowlist snapshot. - Make every workflow.md role-dispatch site read from .quest/<id>/orchestration.json instead of .ai/allowlist.json. - Resume invocations never re-prompt; the chosen orchestration is locked at quest start. The idea doc includes my analysis (feasibility, usefulness, honest counter-position) and a recommended quest prompt for when it is ready to implement. Quest/Co-Authored by Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> in collaboration with the repository author
There was a problem hiding this comment.
2 issues found across 4 files (changes from recent commits).
Prompt for AI agents (unresolved issues)
Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.
<file name=".ai/allowlist.json">
<violation number="1">
P2: This change introduces inconsistent model routing by switching several roles from `claude` to `gpt-5.5`, which conflicts with the stated allowlist pinning and can cause non-deterministic behavior across quest phases.</violation>
</file>
<file name="scripts/quest_artifact_postflight.py">
<violation number="1" location="scripts/quest_artifact_postflight.py:228">
P2: Comparing coverage with `Path.resolve()` lets a symlinked artifact satisfy multiple expected outputs, so an omitted canonical file can still pass validation.</violation>
</file>
Tip: Review your code locally with the cubic CLI to iterate faster.
Re-trigger cubic
Six P2 findings from the cubic AI reviewer, all valid: - workflow.md postflight invocation example used a generic handoff.json path. Most sub-agent roles write role-suffixed handoff files (handoff_plan-reviewer-a.json, handoff_arbiter.json, handoff_fixer.json, etc.). The example now uses <expected_handoff_path_for_role> with a pointer back to the role-to-filename table earlier in the same section. - branch-dir-context.sh single-line stdout contract did not survive paths containing LF/CR (legal in POSIX path components). Collapse embedded LF/CR to spaces with `tr` before formatting the output line so the consumer still sees exactly one line. - tests/unit/test_branch_dir_context_hook.sh was never invoked by the test job because pytest does not collect shell files. Added tests/unit/test_branch_dir_context_hook_runner.py that subprocesses bash to execute the shell suite and surfaces failure as a pytest failure with captured stdout/stderr. - test_workflow_postflight_wireup.py #1 (references_postflight_script) asserted only that the literal appeared anywhere in workflow.md. Now scoped to occurrences inside the Handoff File Polling section so a stray comment elsewhere can't satisfy the wire-in contract. - test_workflow_postflight_wireup.py #5 (no_pre_invocation_postflight_reference) terminated its scan on the first bold subheading after the Artifact preparation bullet, which truncated the block early (e.g., the `**Codex sandbox permissions:**` sub-item closed the scan before reaching the bullet's tail). Switched to a numbered-list-item boundary (`^\d+\. `) so the scan covers the full bullet block until item 6 starts. - quest_artifact_postflight.py _load_declared_artifacts could crash on unreadable handoff bytes or malformed JSON (FileNotFoundError, UnicodeDecodeError, JSONDecodeError) and on non-dict payloads (AttributeError on .get()). Wrapped both read paths and added a dict guard. Failure cases now return an empty declared list and the coverage check flags every expected canonical path as missing — the validator no longer crashes when the orchestrator falls back to text parsing. - quest_artifact_postflight.py coverage check used Path.resolve(), which follows symlinks. Two distinct expected paths could collapse onto one declared symlink target, satisfying coverage for an omitted canonical artifact. Switched the identity comparison to os.path.normpath(os.path.abspath(...)) so distinct canonical paths remain distinct. Existence still uses .exists() (symlink-followed) — that's the right semantic for "is the file there?". Tests: 588 unit (was 587, +1 shell-runner wrapper) + 2 perf green. Quest/Co-Authored by Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> in collaboration with the repository author
There was a problem hiding this comment.
0 issues found across 5 files (changes from recent commits).
Requires human review: Auto-approval blocked by 1 unresolved issue from previous reviews.
Re-trigger cubic
There was a problem hiding this comment.
0 issues found across 2 files (changes from recent commits).
Requires human review: This PR adds two new guardrails (a PreToolUse hook and a post-invocation artifact validator) that run on every Edit/Write tool call and every sub-agent handoff, which touches critical runtime paths; the 2688-line change includes new scripts, workflow logic, and configuration that require human...
Re-trigger cubic
There was a problem hiding this comment.
1 issue found across 2 files (changes from recent commits).
Prompt for AI agents (unresolved issues)
Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.
<file name="scripts/quest_artifact_postflight.py">
<violation number="1" location="scripts/quest_artifact_postflight.py:373">
P1: This early return skips validation for every non-current-quest path, so missing workspace files and sibling `.quest/` artifacts are silently accepted.</violation>
</file>
Tip: Review your code locally with the cubic CLI to iterate faster.
Re-trigger cubic
There was a problem hiding this comment.
2 issues found across 2 files (changes from recent commits).
Prompt for AI agents (unresolved issues)
Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.
<file name="scripts/quest_artifact_postflight.py">
<violation number="1" location="scripts/quest_artifact_postflight.py:384">
P2: Check for sibling-quest paths before the existence test so wrong-location writes are not downgraded to `missing`.</violation>
<violation number="2" location="scripts/quest_artifact_postflight.py:393">
P2: This `.quest` check is too broad: it will reject shared paths like `.quest/cache/...`, not just sibling quest directories.</violation>
</file>
Tip: Review your code locally with the cubic CLI to iterate faster.
Re-trigger cubic
There was a problem hiding this comment.
0 issues found across 2 files (changes from recent commits).
Requires human review: This PR modifies core orchestration workflow, hook configuration, and adds a new artifact validation script, all of which are high-impact changes that require human review to ensure correctness and avoid unintended side effects.
Re-trigger cubic
There was a problem hiding this comment.
1 issue found across 3 files (changes from recent commits).
Prompt for AI agents (unresolved issues)
Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.
<file name="scripts/quest_artifact_postflight.py">
<violation number="1" location="scripts/quest_artifact_postflight.py:416">
P1: In worktree mode, shared `.quest` paths (like `cache/` or `backlog/`) and sibling quest artifacts will incorrectly trigger a `traversal_outside_repo` mismatch because they are validated against `workspace_root` instead of `repo_root`.
(Based on your team's feedback about Non-current-quest path validation with shared .quest allowlist.) [FEEDBACK_USED]</violation>
</file>
Tip: Review your code locally with the cubic CLI to iterate faster.
Re-trigger cubic
There was a problem hiding this comment.
0 issues found across 2 files (changes from recent commits).
Auto-approved: These changes add two additive, non-halting guardrails—an observational pre-edit hook and an advisory artifact-path validator—backed by comprehensive tests, with no modifications to critical infrastructure, authentication, or core business logic, and the orchestrator treats any validator failure as
Re-trigger cubic
Six P2 findings from the cubic AI reviewer, all valid: - workflow.md postflight invocation example used a generic handoff.json path. Most sub-agent roles write role-suffixed handoff files (handoff_plan-reviewer-a.json, handoff_arbiter.json, handoff_fixer.json, etc.). The example now uses <expected_handoff_path_for_role> with a pointer back to the role-to-filename table earlier in the same section. - branch-dir-context.sh single-line stdout contract did not survive paths containing LF/CR (legal in POSIX path components). Collapse embedded LF/CR to spaces with `tr` before formatting the output line so the consumer still sees exactly one line. - tests/unit/test_branch_dir_context_hook.sh was never invoked by the test job because pytest does not collect shell files. Added tests/unit/test_branch_dir_context_hook_runner.py that subprocesses bash to execute the shell suite and surfaces failure as a pytest failure with captured stdout/stderr. - test_workflow_postflight_wireup.py #1 (references_postflight_script) asserted only that the literal appeared anywhere in workflow.md. Now scoped to occurrences inside the Handoff File Polling section so a stray comment elsewhere can't satisfy the wire-in contract. - test_workflow_postflight_wireup.py #5 (no_pre_invocation_postflight_reference) terminated its scan on the first bold subheading after the Artifact preparation bullet, which truncated the block early (e.g., the `**Codex sandbox permissions:**` sub-item closed the scan before reaching the bullet's tail). Switched to a numbered-list-item boundary (`^\d+\. `) so the scan covers the full bullet block until item 6 starts. - quest_artifact_postflight.py _load_declared_artifacts could crash on unreadable handoff bytes or malformed JSON (FileNotFoundError, UnicodeDecodeError, JSONDecodeError) and on non-dict payloads (AttributeError on .get()). Wrapped both read paths and added a dict guard. Failure cases now return an empty declared list and the coverage check flags every expected canonical path as missing — the validator no longer crashes when the orchestrator falls back to text parsing. - quest_artifact_postflight.py coverage check used Path.resolve(), which follows symlinks. Two distinct expected paths could collapse onto one declared symlink target, satisfying coverage for an omitted canonical artifact. Switched the identity comparison to os.path.normpath(os.path.abspath(...)) so distinct canonical paths remain distinct. Existence still uses .exists() (symlink-followed) — that's the right semantic for "is the file there?". Tests: 588 unit (was 587, +1 shell-runner wrapper) + 2 perf green. Quest/Co-Authored by Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> in collaboration with the repository author
PR #116 review feedback: the rationale for using bash parameter expansion over ``tr`` invoked hypothetical PATH-restricted sandboxes missing coreutils — YAGNI for the platforms we support. Keep the builtin implementation; drop the justification prose. Quest/Co-Authored by Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> in collaboration with the repository author
PR #116 review (Codex, Must fix): the per-path boundary check used ``resolved.relative_to(boundary_dir)``, which accepts any depth under the phase directory. A handoff declaring both the real canonical path (satisfying coverage) and an off-spec ``phase_XX/nested/<canonical>.md`` slipped through — ``_check_one`` accepted the nested entry because it was under boundary and had a canonical basename. Require exact parent equality (``resolved.parent == boundary_dir``). Expected canonical paths share the same phase dir by construction, so the tighter invariant is consistent with the artifact contract. Regression pin: ``test_extra_canonical_named_file_in_nested_subdir_ records_outside_boundary`` declares both the canonical and the nested duplicate and asserts the nested declaration records ``outside_boundary``. Quest/Co-Authored by Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> Co-Authored-By: Codex <noreply@openai.com> in collaboration with the repository author
…inks PR #116 review (Codex, Must fix): a canonical quest path implemented as a symlink to a workspace file (e.g. ``pr_description.md`` -> ``scripts/ foo.py``) slipped through. ``_check_one`` classified by the resolved target, treated the symlink as a workspace file, and accepted it; the expected-identity comparison in coverage matched on both sides because both resolve to the same target. Two related fixes in ``_check_one``/``run``: 1) Classify by the DECLARED path's directory, not the resolved target. ``declared_path.parent.resolve() / declared_path.name`` canonicalizes the worktree ``.quest`` directory symlink while leaving any file-level symlink visible. A symlink at a canonical quest path is now classified as a quest artifact and reaches the boundary check. 2) Resolve ``boundary_dir``'s parent only, not the file. ``Path( expected_paths[0]).resolve().parent`` would follow a file-level symlink on the canonical path and corrupt the boundary to wherever the link pointed. Use ``Path(expected_paths[0]).parent.resolve()`` so the boundary stays anchored to the phase directory regardless of whether the canonical artifact was provisioned as a real file or a symlink. Regression pin: ``test_quest_artifact_symlink_escaping_boundary_records_ outside_boundary`` provisions ``pr_description.md`` as a symlink to ``<workspace>/scripts/foo.py``, declares it alongside a real second canonical, and asserts ``outside_boundary`` on the symlink declaration. Tests: 781 unit/integration green (one pre-existing dashboard test failure on this branch is unrelated). Quest/Co-Authored by Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> Co-Authored-By: Codex <noreply@openai.com> in collaboration with the repository author
881b1dd to
1823130
Compare
* Add per-quest orchestration override idea doc Copy proposal from PR #116 branch onto orchestration-override for review. Proposes promoting the per-quest allowlist snapshot to the source of truth for role dispatch, with a startup confirm/override step, so per-quest model choices no longer require editing the repo allowlist. Quest/Co-Authored by Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> in collaboration with KjellKod * Add per-quest orchestration override Promote per-quest config to source of truth for role-to-model dispatch. A new .quest/<id>/orchestration.json (written by the startup chooser) replaces .ai/allowlist.json as the read source at every workflow.md dispatch site, so per-quest model choices no longer require editing the repo allowlist. - SKILL.md: new sub-step 8.5 (chooser + writer) in Quest Folder Creation, new sub-step 1a (resume migration from snapshot). - workflow.md: 6 dispatch sites + 5 Codex MCP templates + mapping table footer + Slot A headers swapped to read from .quest/<id>/orchestration.json. - quest_validate-quest-state.sh: new validate_orchestration_json asserting presence and active-mode role coverage at every phase transition. orchestration_compat: "legacy" opt-out scoped to in-flight quests predating this change. - scripts/quest_runtime/orchestration.py: testable contract for chooser writer, override parser, migration, availability check. - tests/test-quest-orchestration.sh (new, 12 cases) and 13 new cases in tests/test-validate-quest-state.sh cover the chooser parse contract, override validation, resume migration, and the grep contract on workflow.md. - AGENTS.md and ideas/2026-05-18-per-quest-orchestration-override.md updated with closure notes. Quest/Co-Authored by Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> in collaboration with KjellKod * Archive orchestration-override quest journal Quality tier: Platinum. Quest delivered the per-quest orchestration override end-to-end: chooser, dispatch swap, validator, runtime helper, 64+12 tests, and the legacy compat vent for in-flight quests. Quest/Co-Authored by Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> in collaboration with KjellKod * Address orchestration review findings * Validate orchestration snapshot shape * Tighten orchestration fallback validation * Remove orchestration validation bypass * Handle Claude-family orchestration models * Split orchestration runtime availability checks * Align orchestration dispatch docs * updated * Clean orchestration journal whitespace * Keep sharpen ux defaults portable * Fix orchestration default availability handling * updated * Rewrite install-posture doc to cover both install modes The doc argued outside-in was the default and in-repo a rare exception, but the shipped installer makes in-repo the primary, tooling-supported path. Reframe it to present both modes honestly with trade-offs and a choose-when guide, add the standard YAML frontmatter, and index it in the architecture README. Quest/Co-Authored by Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> in collaboration with KjellKod <kjell.hedstrom@gmail.com> * Ship .quest-manifest so installed repos stay in sync The installer fetched .quest-manifest only to learn what to copy, then discarded it, so target repos kept a stale manifest. Newly installed skills and scripts were then flagged as missing from the manifest by strict validation in the target repo. List .quest-manifest in [copy-as-is] and special-case it to always overwrite with upstream (no prompt, even in --force), since it is bookkeeping that must match the installed file set. A locally modified manifest is preserved as .quest-manifest.quest_updated so an intentional customization stays recoverable. Add tests for the overwrite+backup and pristine-update paths. Quest/Co-Authored by Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> in collaboration with KjellKod <kjell.hedstrom@gmail.com> * Clarify Quest ID timestamp handling Guard against preformatted timestamp strings in the Quest ID helper and make the startup instruction pass datetime.now() explicitly. Quest/Co-Authored by Co-Authored-By: Codex <noreply@openai.com> in collaboration with KjellKod <kjell.hedstrom@gmail.com> * updated ideas * Sharpen native-runtime-dispatch idea with Claude facts and slot model Reframe the proposal so it matches how Quest actually dispatches today and removes hedges that would invite theatre on the way to implementation. - Claude context isolation: fresh context in is automatic (no fork_context equivalent — stronger than it); the real exposure is the Task tool result returning to the orchestrator. Containment is the compact-handoff-out contract, not an open research item. - Note that native Claude Task dispatch is already the live path for Claude-family roles; this proposal codifies and guard-rails it. - Add a slot/role-type/runtime/permission-key distinction so the A/B reviewer mixed-runtime case (the default, even) is documented as normal: runtime is resolved per slot, A+B can be any family mix, the orchestrator model does not constrain slot runtimes, and one shared subagent_type definition is reused across slots and families. - State the permission posture plainly: role_permissions are not runtime-enforced today, native dispatch is enforcement-neutral. Cross-link the enforcement-activation roadmap. - Require that orchestration.json's specific same-family model variants (e.g. claude-haiku, claude-opus-4.7) are passed explicitly rather than silently voided by model: inherit. - Add a Documentation Updates Required section so user-facing docs are treated as definition-of-done, not a follow-up. - Drop the speed framing; argue native because it is first-party and actively optimized by the model vendors, with bridges/MCP scoped to cross-family glue. Quest/Co-Authored by Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> in collaboration with KjellKod <kjell.hedstrom@gmail.com> --------- Co-authored-by: Claude <noreply@anthropic.com>
Consolidates two 2026-04-15 sub-ideas (PreToolUse branch/dir hook and sub-agent artifact-path validation) into one shippable brief targeting the top friction class in the eval data: wrong-branch / wrong-directory edits. Brief includes goal, deliverables, acceptance criteria, explicit out-of-scope, kill criteria, and a ready /quest prompt. Sets every role in .ai/allowlist.json to claude so the upcoming wrong-location-guardrails quest runs Claude-only end-to-end. Per the file's own header, takes effect on next quest run. Quest/Co-Authored by Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> in collaboration with KjellKod <kjell.hedstrom@gmail.com>
Flip status on the consolidated brief and the two superseded sub-ideas from proposed to in-progress, add superseded_by pointers on the two sub-ideas, and refresh the ideas/README.md index with the new row and updated statuses. Quest/Co-Authored by Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> in collaboration with KjellKod <kjell.hedstrom@gmail.com>
Add an explicit deliverable and acceptance criterion to the wrong-location guardrails brief: when the quest ships, move the three in-progress idea docs (the consolidated 2026-05-17 brief and the two superseded 2026-04-15 sub-ideas) into ideas/archive/, add a done-index row pointing at the quest journal entry, and drop the active rows from the Execution Discipline and Observability section so the index reflects shipped state. Quest/Co-Authored by Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> in collaboration with KjellKod <kjell.hedstrom@gmail.com>
Two additive guardrails that ship to every Quest install via .quest-manifest and target the top friction class in the eval data (wrong-branch / wrong-directory / wrong nested .quest/ edits, 55 "wrong approach" events across recent sessions). - .claude/hooks/branch-dir-context.sh: PreToolUse Edit|Write hook prints [quest-context] branch=<name|no-git> dir=<path>. Exit 0 in git, non-git, and detached-HEAD contexts; never blocks the wrapped tool call. .claude/settings.json wiring is strictly additive; existing SessionStart and PostToolUse hooks are preserved verbatim. - scripts/quest_artifact_postflight.py: filesystem-only validator that compares declared handoff artifacts against expected_artifacts_for_role() boundaries. Five mismatch reasons (missing, outside_boundary, noncanonical_filename, nested_quest_path, path_traversal); exits non-zero on mismatch and appends a JSON record to .quest/<id>/logs/path_compliance .log. Orchestrator policy is accepted_with_warnings (non-halting) per the brief's kill criterion. - .skills/quest/delegation/workflow.md: new postflight step appended at the END of the Handoff File Polling pattern. Items 1-6 keep their numbers so the 10 hard-coded "Handoff File Polling §5/§6" cross-references elsewhere in the doc don't drift. Two contract tests lock both the placement and the cross-reference counts (§5 x 6, §6 x 4) against future edits. - AGENTS.md: Wrong-location guardrails section names both guardrails, documents how to disable each, and surfaces the installer caveat that customized .claude/settings.json files receive a .quest_updated sidecar and require a one-time manual merge to pick up the PreToolUse entry (merge-carefully is checksum compare, not JSON-aware merge). - pyproject.toml: registers the perf marker so latency tests opt out cleanly under non-perf CI runs. - Tests (tests/unit/): hook contexts, validator branches plus latency target (<50 ms) and regression cap (<200 ms), workflow contract + cross-reference counts, manifest entries + AGENTS.md literals, archival housekeeping. 584 unit tests pass; 2 perf tests pass. - Archival: the three in-progress idea docs (2026-05-17-wrong-location-guardrails, 2026-04-15-pretooluse-branch-dir- verification-hook, 2026-04-15-subagent-path-constraints-hardening) move to ideas/archive/ via git mv; ideas/README.md Done Index gets a row pointing at docs/quest-journal/wrong-location-guardrails_2026-05-18.md. Quest history: 3 plan iterations, 0 fix iterations. Reviewer B caught the installer merge-carefully gap at iteration 1 and the 10-cross-reference regression introduced by iteration 2's chosen placement; both were cross- verified empirically before the arbiter ruled. Final iteration approved with 0 findings from both code reviewers. Quest/Co-Authored by Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> in collaboration with the repository author
Two PR #116 review findings: 1) Validator silently accepted missing or misplaced canonical artifacts (Codex inline, scripts/quest_artifact_postflight.py:214). The loop iterated declared_paths only, so an "artifacts": [] handoff returned 0 without recording mismatches, and a canonical-named file at a misplaced sub-path inside the role boundary (e.g. phase_01_plan/nested/plan.md) passed every per-path check. Fix: in run(), compare resolved expected_paths to resolved declared_paths before the per-declared loop. For each expected canonical path that is not declared or not present on disk at the canonical location, emit a "missing" mismatch record. Remove the existence check from _check_one to avoid double-emitting when a declared canonical path is also flagged by the coverage step. AC4 ("mismatches cause non-zero exit and are not silently accepted") now holds for the empty-artifacts and misplaced cases as well. Three new tests cover the failure modes: - empty artifacts array for a role with expected paths - omitted canonical artifact - misplaced canonical-named file inside the phase boundary 2) Revert .ai/allowlist.json model pins. The all-Claude pin was scoped to the wrong-location-guardrails quest only; the per-quest snapshot at .quest/<id>/logs/allowlist_snapshot.json preserves that record. Keeping the global pin on main would change repo-default Quest routing for every checkout. Restored the prior mixed-model defaults. Tests: 587 unit + 2 perf (was 584 + 2). Both new mismatch behaviors covered. Quest/Co-Authored by Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> in collaboration with the repository author
Six P2 findings from the cubic AI reviewer, all valid: - workflow.md postflight invocation example used a generic handoff.json path. Most sub-agent roles write role-suffixed handoff files (handoff_plan-reviewer-a.json, handoff_arbiter.json, handoff_fixer.json, etc.). The example now uses <expected_handoff_path_for_role> with a pointer back to the role-to-filename table earlier in the same section. - branch-dir-context.sh single-line stdout contract did not survive paths containing LF/CR (legal in POSIX path components). Collapse embedded LF/CR to spaces with `tr` before formatting the output line so the consumer still sees exactly one line. - tests/unit/test_branch_dir_context_hook.sh was never invoked by the test job because pytest does not collect shell files. Added tests/unit/test_branch_dir_context_hook_runner.py that subprocesses bash to execute the shell suite and surfaces failure as a pytest failure with captured stdout/stderr. - test_workflow_postflight_wireup.py #1 (references_postflight_script) asserted only that the literal appeared anywhere in workflow.md. Now scoped to occurrences inside the Handoff File Polling section so a stray comment elsewhere can't satisfy the wire-in contract. - test_workflow_postflight_wireup.py #5 (no_pre_invocation_postflight_reference) terminated its scan on the first bold subheading after the Artifact preparation bullet, which truncated the block early (e.g., the `**Codex sandbox permissions:**` sub-item closed the scan before reaching the bullet's tail). Switched to a numbered-list-item boundary (`^\d+\. `) so the scan covers the full bullet block until item 6 starts. - quest_artifact_postflight.py _load_declared_artifacts could crash on unreadable handoff bytes or malformed JSON (FileNotFoundError, UnicodeDecodeError, JSONDecodeError) and on non-dict payloads (AttributeError on .get()). Wrapped both read paths and added a dict guard. Failure cases now return an empty declared list and the coverage check flags every expected canonical path as missing — the validator no longer crashes when the orchestrator falls back to text parsing. - quest_artifact_postflight.py coverage check used Path.resolve(), which follows symlinks. Two distinct expected paths could collapse onto one declared symlink target, satisfying coverage for an omitted canonical artifact. Switched the identity comparison to os.path.normpath(os.path.abspath(...)) so distinct canonical paths remain distinct. Existence still uses .exists() (symlink-followed) — that's the right semantic for "is the file there?". Tests: 588 unit (was 587, +1 shell-runner wrapper) + 2 perf green. Quest/Co-Authored by Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> in collaboration with the repository author
Codex flagged a false-positive in the coverage check introduced in 50d16e7: ``expected_artifacts_for_role(...)`` includes the role's own handoff filename (planner → ``("plan.md", "handoff.json")``, builder → ``("pr_description.md", "builder_feedback_discussion.md", "handoff.json")``, etc.), but the agent contracts in workflow.md do NOT ask roles to list their own handoff file in ``handoff.artifacts``. Roles declare deliverables there. The validator was therefore marking every well-formed handoff as ``missing`` the handoff file itself — every real run would have surfaced an ``accepted_with_warnings`` warning for an undeclared envelope. The empirical check: the in-tree builder handoff for the wrong-location quest declares ``pr_description.md`` + ``builder_feedback_discussion.md`` + touched source files, and not ``handoff.json``. With the prior coverage logic, that valid handoff would have logged a missing record. Fix: the validator is invoked on a specific ``--handoff`` path; that file's existence and identity are implicit. Exclude its normalized path from the coverage comparison. Per-declared checks still run for the handoff if a role chooses to declare it. Tests: - ``test_empty_artifacts_array_fails_for_role_with_expected_paths`` now expects 1 missing (``plan.md``) instead of 2; the handoff exclusion is the design. - Replaced ``test_omitted_canonical_artifact_fails`` (which baked the bug into the suite by asserting handoff-omission is a failure) with ``test_omitted_canonical_deliverable_fails`` covering a builder that omits ``builder_feedback_discussion.md``. - Added ``test_undeclared_handoff_file_does_not_record_missing`` pinning the new behavior with the real-world shape (handoff omitted from ``artifacts``; validator returns 0). Tests: 589 unit (was 588, +1 regression test) + 2 perf green. Quest/Co-Authored by Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> in collaboration with the repository author
Codex flagged the second false-positive in this validator: builder and fixer ARTIFACTS contracts legitimately list changed source/test/doc files alongside the canonical quest deliverables. The previous ``_check_one`` treated every declared path as a quest artifact and applied the boundary + canonical-name checks, so a normal successful build emitted ``outside_boundary`` for every workspace file. The wrong-location-guardrails builder handoff itself declares 16 paths outside ``phase_02_implementation/`` (the role boundary): the hook script, settings.json, the validator, manifest, AGENTS.md, the README, three archived idea docs, pyproject, five test files. Under the prior contract, that valid handoff would have logged 16 mismatch records. Fix: classify each declared path before applying per-path checks. * Quest-artifact path — resolves under ``<repo_root>/.quest/<id>/``. Full check set: traversal, nested ``.quest``, role boundary, canonical filename. The brief's wrong-location intent applies here. * Workspace-file path — anywhere else inside the repo. Only the traversal check applies. Boundary and canonical-name aren't meaningful for changed source files. Coverage check (run-level) is unchanged: every canonical quest artifact must still be declared. Workspace files don't affect coverage because they're not in ``expected_artifacts_for_role(...)``. Test impact: * ``test_outside_boundary_records_outside_boundary`` previously declared ``<workspace>/src/plan.md`` (outside ``.quest/``) which under the new classification is a workspace file, no mismatch. The test was asserting the wrong contract. Rewrote it to declare ``.quest/<id>/phase_03_review/plan.md`` for a planner whose boundary is ``phase_01_plan/`` — still inside ``.quest/`` but wrong phase, so ``outside_boundary`` fires correctly. * Added ``test_workspace_files_outside_quest_pass_through``: builder declares the same canonical-plus-workspace shape as the real wrong-location-guardrails handoff (16+ paths). Validator must return 0 with no records. Regression pin against the 16-false-positive failure mode. * Refreshed the module docstring to document the path classification and the full mismatch-reason set (the prior listing omitted ``unsupported_role_or_phase`` which the implementation can emit). Tests: 590 unit (was 589, +1 workspace-pass-through regression test) + 2 perf green. Quest/Co-Authored by Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> in collaboration with the repository author
Cubic flagged two real gaps in d01f20c's workspace-vs-quest path classification: 1) Workspace-file paths got traversal-only treatment, so a builder could declare ``scripts/foo.py`` without ever writing it and the validator would silently accept. That is the exact path-drift failure mode the brief targets. 2) Paths under ``.quest/`` belonging to a sibling quest's directory (``.quest/<OTHER-id>/foo.md``) were classified as workspace files and passed. Writing into another quest's space is the precise wrong-location failure mode this validator exists to catch. Tightened the non-current-quest branch to match cubic's suggestion: * Workspace-file path must exist on disk; missing → ``missing``. * Path inside ``.quest/`` but not under our quest_root → sibling-quest cross-write → ``outside_boundary``. * True workspace file (outside ``.quest/`` entirely, exists on disk) passes — unchanged. Two new regression tests: * ``test_workspace_file_missing_on_disk_records_missing`` — builder declares a workspace path that was never written. Validator records ``missing`` and exits non-zero. * ``test_sibling_quest_artifact_records_outside_boundary`` — builder reaches into a sibling quest's directory. Validator records ``outside_boundary`` and exits non-zero. The sibling file is written to disk so the assertion can't be shadowed by ``missing``. Module docstring refreshed to spell out the workspace + sibling-quest semantics. Tests: 592 unit (was 590, +2 regression pins) + 2 perf green. Quest/Co-Authored by Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> in collaboration with the repository author
Two follow-up gaps in 3822f6b's non-current-quest branch, both flagged by cubic: 1) Sibling-quest detection ran AFTER the existence check, so a declared path under another quest's directory that happened not to exist was reported as ``missing`` rather than ``outside_boundary``. The real failure is the cross-quest declaration; whether the write actually happened is secondary. Reordered to check sibling-quest membership first. 2) The ``relative_to(repo_root / ".quest")`` test was too broad: it treated every path under ``.quest/`` as a sibling-quest cross-write, including legitimate shared infrastructure paths. Three paths are cross-quest by design and must pass: * ``.quest/cache/...`` — preflight stores host-verified bridge probes. * ``.quest/backlog/deferred_findings.jsonl`` — canonical review- intelligence deferred-findings log. * ``.quest/audit.log`` — persistent tool-call audit across quest runs. Empirically present in this repo: ``ls .quest/`` shows ``audit.log`` and ``backlog/`` alongside the per-quest dir. ``archive`` is intentionally NOT shared: writing into ``.quest/archive/`` mid-build would be a real wrong-location failure, archival is the orchestrator's completion-time ceremony. Added ``_SHARED_DOT_QUEST_NAMES`` frozenset and check the first segment of the path's relative-to-``.quest/`` decomposition against it. Paths under shared names take the workspace-file pass; paths under any other name (quest-id-keyed sibling dirs) fire ``outside_boundary``. Two new regression tests: * ``test_sibling_quest_missing_on_disk_still_outside_boundary`` — pins the reorder. Builder declares a sibling-quest path that was never written. Validator MUST record ``outside_boundary``, NOT ``missing``. * ``test_shared_dot_quest_paths_pass_through`` — pins the allowlist. Builder declares ``.quest/cache/...``, ``.quest/backlog/...``, and ``.quest/audit.log`` alongside canonical deliverables. Validator returns 0 with no records. Tests: 594 unit (was 592, +2 regression pins) + 2 perf green. Quest/Co-Authored by Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> in collaboration with the repository author
Codex flagged a worktree-mode defect in the validator: ``.quest/<id>/`` lives in the original repo, but builder and fixer roles edit files in ``source_workspace_root`` (the worktree). The validator derived ``repo_root = quest_dir.parent.parent`` for everything and used it for both quest-artifact and workspace-file checks. In worktree mode that meant: * A declared relative path like ``scripts/foo.py`` resolved against the original repo (where the file isn't), so existence failed. * An absolute worktree path (``/path/to/worktree/scripts/foo.py``) sat outside the original repo's tree, so the traversal check fired ``traversal_outside_repo``. Both are false positives on every well-formed worktree build. Fix: thread a separate ``workspace_root`` through the validator. * ``repo_root`` — the directory containing ``.quest/<id>/``. Anchors quest-artifact path checks and the sibling-quest detector. * ``workspace_root`` — where the agent was editing. Anchors workspace- file path checks: relative-path resolution in ``handoff.artifacts``, traversal containment, existence on disk. In non-worktree mode the two are the same and existing tests still pass. In worktree mode the orchestrator passes ``--workspace-root <source_workspace_root>``. The CLI adds ``--workspace-root`` (optional, defaults to the repo root so non-worktree invocations work unchanged). ``run(...)`` gains the matching kwarg. ``_check_one`` picks the traversal anchor based on path classification (repo_root for quest artifacts, workspace_root for workspace files). ``_load_declared_artifacts`` resolves relative paths against ``workspace_root`` per the workflow doctrine that orchestrators pass absolute quest-artifact paths in worktree mode. Workflow doc updated to include ``--workspace-root <source_workspace _root>`` in the example invocation, with a paragraph explaining when the flag matters. Two new regression pins: * ``test_worktree_mode_workspace_files_resolve_against_worktree`` — builder handoff with absolute quest-artifact paths plus a worktree-relative ``scripts/foo.py``; validator returns 0 only when ``workspace_root`` is the worktree. * ``test_worktree_mode_without_workspace_root_flag_breaks`` — same shape but ``workspace_root`` omitted; validator MUST record ``traversal_outside_repo`` for the worktree path. Documents why the flag matters and guards against future removal. Tests: 596 unit (was 594, +2 worktree-mode regression pins) + 2 perf green. Quest/Co-Authored by Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> in collaboration with the repository author
Cubic flagged a regression introduced by b316c3c: the new ``workspace_root`` traversal anchor was selected by ``is_quest_artifact`` (true only for our-quest paths). In worktree mode that anchored both shared ``.quest/`` infrastructure paths (``cache/``, ``backlog/``, ``audit.log``) and sibling-quest paths (``.quest/<OTHER-id>/...``) to the worktree root — but those paths live in the original repo, not the worktree. The traversal check then falsely emitted ``traversal_outside_repo``. The right invariant: any path under ``<repo>/.quest/`` belongs with the original repo regardless of mode. Only paths outside ``.quest/`` are workspace files. Fix: replace the ``is_quest_artifact`` traversal-anchor selector with an ``in_dot_quest_tree`` selector (whether the resolved path is anywhere under ``<repo>/.quest/``). Paths under ``.quest/`` anchor to ``repo_root``; everything else anchors to ``workspace_root``. The downstream sibling/shared classification reuses the same ``relative_to(.quest)`` computation instead of recomputing it. In non-worktree mode the behavior is unchanged because both roots are the same path. Two new regression pins: * ``test_worktree_mode_shared_dot_quest_paths_pass`` — repo and worktree in separate temp dirs; builder declares ``.quest/cache/...``, ``.quest/backlog/...``, and ``.quest/audit.log`` alongside canonical deliverables. Validator returns 0; under the b316c3c code these three would have logged ``traversal_outside_repo``. * ``test_worktree_mode_sibling_quest_records_outside_boundary`` — repo and worktree separate; builder declares a sibling-quest path. The validator MUST report ``outside_boundary``, NOT ``traversal_outside_repo``. Tests: 598 unit (was 596, +2 worktree-shared/sibling regression pins) + 2 perf green. Quest/Co-Authored by Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> in collaboration with the repository author
Two follow-ups from the worktree-mode design discussion:
1) Document the ``<worktree>/.quest -> <repo>/.quest`` symlink
invariant in two places so the next reviewer doesn't have to
re-derive why the validator handles worktree mode the way it does:
- Validator module docstring gets a "Worktree symlink invariant"
paragraph explaining that the three forms an agent might write
for the same quest artifact (absolute repo path, absolute
worktree path, worktree-relative path) all canonicalize to the
same on-disk inode under the repo via Path.resolve(). The
``--workspace-root`` flag only governs non-``.quest/`` workspace
files.
- AGENTS.md's "Wrong-location guardrails" section gets a one-line
callout pointing to the same invariant.
2) Add ``test_worktree_mode_combined_smoke`` exercising all four
classifications in a single handoff (our canonical quest artifact,
true workspace file, shared ``.quest/cache/`` path, sibling-quest
cross-write) with the ``.quest`` symlink in place. The previous
pins covered each classification individually; this one pins them
acting together so a future refactor that reshuffles classification
order cannot silently break one path while another still passes.
The test uses canonical repo paths for quest artifacts per the
workflow.md contract (orchestrators pass canonical paths even in
worktree mode). The symlink is present as a structural element but
not used to declare paths.
Tests: 599 unit (was 598, +1 combined smoke) + 2 perf green.
Quest/Co-Authored by
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
in collaboration with the repository author
Codex flagged a real interaction left over from eff55b3: the coverage check identity used ``os.path.normpath/abspath`` (cubic's anti-symlink- collision defense from a prior round), which does NOT follow the ``<worktree>/.quest -> <repo>/.quest`` symlink. In worktree mode an agent's worktree-relative declaration like ``.quest/<id>/.../pr_description.md`` resolves through ``_load_declared_artifacts`` against ``workspace_root`` to ``<worktree>/.quest/.../pr_description.md`` — then the coverage identity compares it against the canonical ``<repo>/.quest/.../pr_description.md`` and emits a false ``missing``, even though ``_check_one`` would later resolve the symlink and accept the path. My previous commit (eff55b3) papered over this by leaning on the ``workflow.md`` doctrine that orchestrators pass canonical repo paths even in worktree mode. That's brittle — a well-formed handoff with a worktree-rooted relative entry is also legitimate, and the validator should be robust to either form. Fix: split the coverage identity by path location. * **Paths under ``<repo>/.quest/``** (either directly or via the worktree ``.quest`` symlink): canonicalize through ``Path.resolve()`` so the worktree mount and the repo mount produce the same identity for the same on-disk inode. * **Non-``.quest/`` paths** (true workspace files): keep the ``os.path.normpath(os.path.abspath(...))`` identity. Cubic's anti-symlink-collision defense remains intact for genuine workspace files — two distinct workspace files that happen to symlink to one target stay distinct identities. This narrows cubic's defense to where it actually applies (workspace files), and trusts ``expected_artifacts_for_role`` to return logically distinct paths inside ``.quest/`` so resolving through the symlink cannot collapse them. Updated the module docstring's "Worktree symlink invariant" section to reflect the new behavior (validator actively canonicalizes, not just relies on the orchestrator contract). New regression test: ``test_worktree_mode_quest_artifact_via_symlinked_mount`` — agent declares one quest artifact as an absolute worktree path and one as a worktree-relative path, with the ``.quest`` symlink in place. Coverage must NOT emit ``missing`` for either; both paths resolve to canonical repo inodes. Tests: 600 unit (was 599, +1 symlink-declare regression pin) + 2 perf green. Quest/Co-Authored by Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> in collaboration with the repository author
Two Codex findings on ``9e00998``, both ``Should fix``, both real:
1) The orchestrator instruction in ``workflow.md`` surfaced records
from ``.quest/<id>/logs/path_compliance.log`` to the user. That log
is the append-only audit trail across every validator invocation in
the quest. Reading it for current-run attribution surfaces stale
records from earlier roles, retries, or iterations and
misattributes them to the active handoff.
Fix: the CLI now emits one JSON line per current-run mismatch to
stdout (in addition to the log append). Stdout stays empty on a
clean run (exit 0). The orchestrator surfaces stdout to the user
instead of reading the log; the log remains the forensic trail.
``_emit_mismatch_lines`` is a new helper alongside
``_append_mismatch_lines``. Both run-level return sites (validator
exception path and the normal mismatch path) emit + append in
lockstep.
``workflow.md`` step 7 updated to reflect the new attribution
contract — surface stdout, treat the log as audit-only.
2) ``.claude/hooks/branch-dir-context.sh`` collapses LF/CR in the
working directory with ``tr``. The hook conditionally handles
``git`` being absent (PATH-restricted sandboxes), but the same
sandboxes can drop ``tr`` too — both live in ``/usr/bin``. A
``tr: command not found`` on stderr would break the hook's
no-stderr / single-line contract.
Fix: use bash parameter expansion ``${var//pattern/replacement}``
for the LF/CR substitution. It's a builtin (bash 2.0+, 1996), no
PATH lookup, no external command. The hook now has zero external
dependencies on the formatting path — only ``git`` remains, and
the surrounding code already handles its absence.
Two new regression pins under ``TestCliStdoutContract``:
* ``test_clean_run_emits_no_stdout`` — exit 0 ⇒ stdout empty. Lets
callers use stdout presence as a "had mismatches" signal.
* ``test_mismatch_run_emits_only_current_records_to_stdout`` — two
sequential invocations on the same quest dir (builder, then
planner). Each invocation's stdout contains ONLY that run's
mismatches. The persistent log retains BOTH (audit trail intact).
Pins the stale-attribution failure mode Codex flagged.
Tests: 602 unit (was 600, +2 stdout-contract pins) + 2 perf green.
Quest/Co-Authored by
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
in collaboration with the repository author
PR #116 review feedback: the rationale for using bash parameter expansion over ``tr`` invoked hypothetical PATH-restricted sandboxes missing coreutils — YAGNI for the platforms we support. Keep the builtin implementation; drop the justification prose. Quest/Co-Authored by Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> in collaboration with the repository author
PR #116 review (Codex, Must fix): the per-path boundary check used ``resolved.relative_to(boundary_dir)``, which accepts any depth under the phase directory. A handoff declaring both the real canonical path (satisfying coverage) and an off-spec ``phase_XX/nested/<canonical>.md`` slipped through — ``_check_one`` accepted the nested entry because it was under boundary and had a canonical basename. Require exact parent equality (``resolved.parent == boundary_dir``). Expected canonical paths share the same phase dir by construction, so the tighter invariant is consistent with the artifact contract. Regression pin: ``test_extra_canonical_named_file_in_nested_subdir_ records_outside_boundary`` declares both the canonical and the nested duplicate and asserts the nested declaration records ``outside_boundary``. Quest/Co-Authored by Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> Co-Authored-By: Codex <noreply@openai.com> in collaboration with the repository author
…inks PR #116 review (Codex, Must fix): a canonical quest path implemented as a symlink to a workspace file (e.g. ``pr_description.md`` -> ``scripts/ foo.py``) slipped through. ``_check_one`` classified by the resolved target, treated the symlink as a workspace file, and accepted it; the expected-identity comparison in coverage matched on both sides because both resolve to the same target. Two related fixes in ``_check_one``/``run``: 1) Classify by the DECLARED path's directory, not the resolved target. ``declared_path.parent.resolve() / declared_path.name`` canonicalizes the worktree ``.quest`` directory symlink while leaving any file-level symlink visible. A symlink at a canonical quest path is now classified as a quest artifact and reaches the boundary check. 2) Resolve ``boundary_dir``'s parent only, not the file. ``Path( expected_paths[0]).resolve().parent`` would follow a file-level symlink on the canonical path and corrupt the boundary to wherever the link pointed. Use ``Path(expected_paths[0]).parent.resolve()`` so the boundary stays anchored to the phase directory regardless of whether the canonical artifact was provisioned as a real file or a symlink. Regression pin: ``test_quest_artifact_symlink_escaping_boundary_records_ outside_boundary`` provisions ``pr_description.md`` as a symlink to ``<workspace>/scripts/foo.py``, declares it alongside a real second canonical, and asserts ``outside_boundary`` on the symlink declaration. Tests: 781 unit/integration green (one pre-existing dashboard test failure on this branch is unrelated). Quest/Co-Authored by Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> Co-Authored-By: Codex <noreply@openai.com> in collaboration with the repository author
ad22640 to
aec30b1
Compare
Summary
Ship two additive guardrails that ride out through the installer to every Quest user, targeting the top friction class in the eval data (wrong-branch / wrong-directory / wrong nested
.quest/edits — 55 "wrong approach" events across recent sessions).PreToolUse Edit|Writehook that prints branch + working directory before every edit.workflow.md's Handoff File Polling pattern.Important
Existing Quest installs with a customized
.claude/settings.jsonwill receive a.quest_updatedsidecar on the next installer run and must merge the newPreToolUse Edit|Writeblock manually. The installer'smerge-carefullymode is checksum-compare, not JSON-aware merge. The caveat is documented inAGENTS.md; the manifest contract test asserts the caveat string stays present.Changes
.claude/hooks/branch-dir-context.sh— prints[quest-context] branch=<name|no-git> dir=<path>; exits 0 in git, non-git, and detached-HEAD contexts; never blocks the wrapped tool call..claude/settings.json— additivePreToolUse Edit|Writeentry; existingSessionStartandPostToolUse Write|Edithooks preserved verbatim.scripts/quest_artifact_postflight.py— filesystem-only check againstexpected_artifacts_for_role(...). Five mismatch reasons (missing,outside_boundary,noncanonical_filename,nested_quest_path,path_traversal) plus defensiveunsupported_role_or_phase. Non-zero exit on mismatch; appends one JSON line per mismatch to.quest/<id>/logs/path_compliance.log. Orchestrator policy isaccepted_with_warnings(non-halting per kill criterion). Latency target<50 ms, regression cap<200 ms..skills/quest/delegation/workflow.md— new postflight step appended at the end of the Handoff File Polling pattern. Items 1–6 keep their numbers so the 10 hard-codedHandoff File Polling §5/§6cross-references elsewhere in the doc don't drift..quest-manifest— both new files added under[copy-as-is]so downstream installs receive them.AGENTS.md— "Wrong-location guardrails" section naming both guardrails, the disable mechanism, and the installer caveat.ideas/README.md— Done Index row added; three active rows removed.docs/quest-journal/wrong-location-guardrails_2026-05-18.md— quest journal with celebration data.ideas/2026-05-17-wrong-location-guardrails.md,ideas/2026-04-15-pretooluse-branch-dir-verification-hook.md,ideas/2026-04-15-subagent-path-constraints-hardening.mdmoved toideas/archive/viagit mv.tests/unit/test_branch_dir_context_hook.sh— hook contexts (git, non-git, detached-HEAD), stdout format, exit-0 guarantee.tests/unit/test_quest_artifact_postflight.py— 12 cases including latency target and regression cap behindpytest.mark.perf.tests/unit/test_workflow_postflight_wireup.py— placement (end-of-section) + cross-reference count assertions (§5 × 6,§6 × 4).tests/unit/test_manifest_guardrails_entries.py—[copy-as-is]entries + AGENTS.md literals +merge-carefullyposture forsettings.json.tests/unit/test_ideas_archive_guardrails_housekeeping.py— archive paths exist + Done Index row points at the journal.pyproject.toml— registersperfmarker so non-perf runs filter cleanly..ai/allowlist.json— every role pinned toclaudefor this quest (committed earlier in the branch).Validation
pyproject.toml. Action:python3 -m pytest tests/unit/ -q -m "not perf". Expected: 584 tests pass, 0 failures. Thenpython3 -m pytest tests/unit/test_quest_artifact_postflight.py -q -m perffor the two latency tests (<50 mstypical and<200 msstress).bash tests/unit/test_branch_dir_context_hook.sh. Expected: 5 cases pass — git repo with a named branch, non-git directory, detached-HEAD, stdout format check, exit-0 guarantee.EditorWritetool call (or runbash .claude/hooks/branch-dir-context.shdirectly). Expected: one line[quest-context] branch=claude/review-intelligence-roadmap-i0my2 dir=/home/user/quest(or equivalent) printed to stdout. The wrapped tool call is unaffected.python3 scripts/quest_artifact_postflight.py --quest-dir .quest/wrong-location-guardrails_2026-05-18__0003 --role planner --phase plan(substitute any role/phase). Expected: exit 0 (no mismatches againstexpected_artifacts_for_role). Then run with a fake declared artifact pointing outside.quest/: expected non-zero exit and a JSON record appended to.quest/wrong-location-guardrails_2026-05-18__0003/logs/path_compliance.logwithreason: "outside_boundary".grep -n "quest_artifact_postflight" .skills/quest/delegation/workflow.md(expect the line to be after the "Three-tier fallback ladder" header at ~line 143) andgrep -c "Handoff File Polling §5" .skills/quest/delegation/workflow.md(expect6) andgrep -c "Handoff File Polling\*\* §6" .skills/quest/delegation/workflow.md(expect4).Watch for: a customized
.claude/settings.jsonin your install — the installer will not automatically merge the newPreToolUse Edit|Writeentry into it. Pick up the new block from the.quest_updatedsidecar by hand.Notes
merge-carefullygap that would have shipped the hook to no one with a customizedsettings.json, and (2) the 10-cross-reference rot that mid-list step insertion would have caused inworkflow.md. Both were verified empirically (line numbers, grep counts) before the arbiter ruled. Final iteration approved with 0 findings from both code reviewers.run(...). Orchestrators read the workflow doc and call the CLI. Auto-invocation is logged as a follow-up backlog item.accepted_with_warnings(non-halting). CLI exits non-zero, structured log persists, orchestrator surfaces a warning. Tighten to halting only once false-positive rate is empirically zero in the field..quest/backlog/deferred_findings.jsonlfor the next quest touching the postflight validator (docstring drift, fragile test heuristic,--quest-modehelp-text polish).▐▛███▜▌ ▝▜█████▛▘ ▘▘ ▝▝ Quest/Co-Authored by Co-Authored-By: Claude Opus 4.7 in collaboration with the repository authorGenerated by Claude Code