Skip to content

fix: resolve Windows Codex spawn failures for local review runs#250

Open
anagnorisis2peripeteia wants to merge 3 commits into
openclaw:mainfrom
anagnorisis2peripeteia:fix/windows-codex-spawn
Open

fix: resolve Windows Codex spawn failures for local review runs#250
anagnorisis2peripeteia wants to merge 3 commits into
openclaw:mainfrom
anagnorisis2peripeteia:fix/windows-codex-spawn

Conversation

@anagnorisis2peripeteia
Copy link
Copy Markdown

@anagnorisis2peripeteia anagnorisis2peripeteia commented Jun 3, 2026

Summary

  • Add CODEX_BIN env var support following the existing GIT_BIN pattern
  • Add shell: process.platform === "win32" to both Codex spawnSync call sites
  • Add focused tests for CODEX_BIN override and stdin prompt delivery

Motivation

On Windows, npm installs codex as codex.cmd. Node's spawnSync cannot execute .cmd files without shell: true, causing ENOENT or EINVAL errors when ClawSweeper tries to spawn Codex for review or assist.

Proof

Tested on Windows 11, Node 24.12.0, Codex CLI 0.130.0.

Before (no CODEX_BIN, no shell):

codex failure detail: spawnSync codex ENOENT
review_codex_elapsed_ms: 261
review_status: failed

After (CODEX_BIN + shell: true on win32):

Full local review completed against openclaw/openclaw#89834:

[review] shard=0/1 start #89834 (1/1)
[review] shard=0/1 start-comment=existing #89834
[review] shard=0/1 done #89834 (1/1) decision=keep_open confidence=medium

review_codex_elapsed_ms: 368114
review_status: complete

Codex consumed the 119k char prompt via stdin (- flag with input: prompt), executed the review with --output-schema and --output-last-message, and produced structured review output with actionable findings. The review correctly identified a missing applyPluginTextReplacements call on commentary text in the reviewed PR.

Tests

  • CODEX_BIN override: platform-aware fake binary (.cmd on Windows, shebang on Unix) receives structured output and produces a valid decision
  • stdin delivery: verifies the prompt body reaches the Codex subprocess via stdin (Unix; Windows skipped since shell: true changes stdin handling)

Files changed

  • src/clawsweeper.ts -- runCodex() and runCodexAssist(): add CODEX_BIN env var + shell: win32
  • test/clawsweeper.test.ts -- 2 focused tests

Test plan

  • Windows: ENOENT/EINVAL resolved, Codex subprocess launches
  • Windows: full 6-minute review completed with structured output
  • CODEX_BIN test passes (platform-aware fake binary)
  • stdin delivery test passes (Unix)
  • Mac/Linux: no behavioral change (shell is false, CODEX_BIN defaults to "codex")

@clawsweeper
Copy link
Copy Markdown
Contributor

clawsweeper Bot commented Jun 3, 2026

Codex review: needs maintainer review before merge. Reviewed June 3, 2026, 6:36 PM ET / 22:36 UTC.

Summary
The PR adds a CODEX_BIN override and Windows shell: true handling to ClawSweeper Codex subprocess launchers, plus focused subprocess tests.

Reproducibility: yes. source-reproducible: current main hardcodes spawnSync("codex", ...) without Windows shell handling at the affected Codex launcher sites, and the PR body provides Windows before/after terminal output. I did not execute a Windows reproduction from this Linux checkout.

Review metrics: 2 noteworthy metrics.

  • Diff surface: 4 files changed, 152 additions, 4 deletions. The patch is still narrow, but it touches core Codex subprocess launch behavior and tests.
  • Codex launchers updated: 4 spawnSync Codex sites changed. The latest head covers the main review, assist, commit-review, and PR close-coverage proof launchers instead of leaving the previous review-adjacent gap.

Merge readiness
Overall: 🐚 platinum hermit
Proof: 🦞 diamond lobster
Patch quality: 🐚 platinum hermit
Result: ready for maintainer review.

Overall follows the weaker of proof and patch quality, so missing proof can cap an otherwise strong patch.

Rank-up moves:

  • [P2] Wait for required checks or run pnpm run check on Node 24 before merge.

Risk before merge

  • [P1] Windows shell: true changes process invocation semantics for ClawSweeper Codex automation paths; the PR body's Windows terminal proof is strong, but maintainers still need to accept that runtime behavior before merge.
  • [P1] The PR body does not show a full pnpm run check result, so final merge confidence should come from required checks or a maintainer-run Node 24 check.

Maintainer options:

  1. Merge after standard validation (recommended)
    Maintainers can accept the Windows-specific shell behavior if required checks or a Node 24 pnpm run check pass, because the latest head now covers the review-adjacent spawnSync launchers and includes Windows before/after proof.
  2. Ask for stronger Windows validation
    If maintainers want more confidence in Windows shell/stdin behavior, ask for a Windows pnpm run check or targeted test output before merge.

Next step before merge

  • No automated repair remains; maintainers need to decide whether the Windows shell launch behavior and available validation are enough to merge.

Security
Cleared: The diff changes local process invocation options and tests only; it does not add dependencies, broaden GitHub permissions, or alter secret handling.

Review details

Best possible solution:

Land the narrow Windows-safe Codex launcher update for review-adjacent subprocesses after standard validation, and leave broader repair-lane Windows support to a separate follow-up if maintainers want that scope.

Do we have a high-confidence way to reproduce the issue?

Yes, source-reproducible: current main hardcodes spawnSync("codex", ...) without Windows shell handling at the affected Codex launcher sites, and the PR body provides Windows before/after terminal output. I did not execute a Windows reproduction from this Linux checkout.

Is this the best way to solve the issue?

Yes for the stated local-review failure: the patch follows the Windows .cmd shell requirement and now applies that policy consistently to the review-adjacent spawnSync launchers. It should not be treated as full repair-lane Windows support.

AGENTS.md: found and applied where relevant.

Codex review notes: model gpt-5.5, reasoning high; reviewed against 74f63a090af7.

Label changes

Label changes:

  • add rating: 🐚 platinum hermit: Overall readiness is 🐚 platinum hermit; proof is 🦞 diamond lobster and patch quality is 🐚 platinum hermit.
  • add status: 👀 ready for maintainer look: ClawSweeper has no concrete contributor-facing blocker left for this PR. Sufficient (terminal): The PR body includes after-fix Windows terminal output showing a completed local review with structured output after the pre-fix spawn failure.
  • remove status: ⏳ waiting on author: Current PR status label is status: 👀 ready for maintainer look.
  • remove rating: 🦐 gold shrimp: Current PR rating is rating: 🐚 platinum hermit, so this older rating label is no longer current.

Label justifications:

  • P2: The PR fixes a real Windows local-review automation failure with limited blast radius to Codex subprocess launch behavior.
  • merge-risk: 🚨 automation: The diff changes how ClawSweeper launches Codex subprocesses on Windows, which affects review and proof automation behavior beyond what Linux CI alone settles.
  • rating: 🐚 platinum hermit: Overall readiness is 🐚 platinum hermit; proof is 🦞 diamond lobster and patch quality is 🐚 platinum hermit.
  • status: 👀 ready for maintainer look: ClawSweeper has no concrete contributor-facing blocker left for this PR. Sufficient (terminal): The PR body includes after-fix Windows terminal output showing a completed local review with structured output after the pre-fix spawn failure.
  • proof: sufficient: Contributor real behavior proof is sufficient. The PR body includes after-fix Windows terminal output showing a completed local review with structured output after the pre-fix spawn failure.
Evidence reviewed

What I checked:

  • AGENTS.md policy read: Repository guidance was read fully and applied because this PR changes core ClawSweeper automation and tests. (AGENTS.md:1, 74f63a090af7)
  • Current main hardcodes Codex launch: Current main still invokes spawnSync("codex", ...) without a Windows shell or CODEX_BIN override in the main review path, matching the reported Windows .cmd launcher failure class. (src/clawsweeper.ts:6181, 74f63a090af7)
  • Prior blocker addressed: The latest PR head extends the same CODEX_BIN plus Windows shell handling to src/pr-close-coverage-proof.ts, addressing the previous review's remaining local-review proof launcher concern. (src/pr-close-coverage-proof.ts:258, 0c024d4aa917)
  • Commit review launcher included: The latest PR head also updates src/commit-sweeper.ts, so commit-review Codex spawnSync use follows the same launch policy. (src/commit-sweeper.ts:300, 0c024d4aa917)
  • Focused tests added: The PR adds a custom CODEX_BIN subprocess test and a Unix stdin-delivery test for the main runCodex path; Windows stdin behavior is covered by the contributor's terminal proof rather than automated cross-platform test execution here. (test/clawsweeper.test.ts:13873, 0c024d4aa917)
  • Contributor real behavior proof: The PR body includes Windows 11 before/after terminal output showing the pre-fix Codex spawn failure and an after-fix completed local review with structured output. (0c024d4aa917)

Likely related people:

  • brokemac79: Git blame points the current main review, assist, and commit-review Codex subprocess launch code at the superseded-PR close safety work. (role: introduced behavior; confidence: high; commits: f2ec021eb55c; files: src/clawsweeper.ts, src/commit-sweeper.ts)
  • Jesse Merhi: Git blame shows the PR close-coverage proof Codex launcher was introduced with the close-coverage proof feature. (role: adjacent feature owner; confidence: high; commits: 2af278ebe246; files: src/pr-close-coverage-proof.ts, src/clawsweeper.ts, test/clawsweeper.test.ts)
  • Tak Hoffman: Recent commits by this author touched the same review automation and proof-nudge surfaces around src/clawsweeper.ts and tests. (role: recent adjacent contributor; confidence: medium; commits: 74f63a090af7, 54422d977482, 666e707a8288; files: src/clawsweeper.ts, test/clawsweeper.test.ts)
What the crustacean ranks mean
  • 🦀 challenger crab: rare, exceptional readiness with strong proof, clean implementation, and convincing validation.
  • 🦞 diamond lobster: very strong readiness with only minor maintainer review expected.
  • 🐚 platinum hermit: good normal PR, likely mergeable with ordinary maintainer review.
  • 🦐 gold shrimp: useful signal, but proof or patch confidence is still limited.
  • 🦪 silver shellfish: thin signal; proof, validation, or implementation needs work.
  • 🧂 unranked krab: not merge-ready because proof is missing/unusable or there are serious correctness or safety concerns.
  • 🌊 off-meta tidepool: rating does not apply to this item.

Shiny media proof means a screenshot, video, or linked artifact directly shows the changed behavior. Runtime, network, CSP, and security claims still need visible diagnostics.

How this review workflow works
  • ClawSweeper keeps one durable marker-backed review comment per issue or PR.
  • Re-runs edit this comment so the latest verdict, findings, and automation markers stay together instead of adding duplicate bot comments.
  • A fresh review can be triggered by eligible @clawsweeper re-review comments, exact-item GitHub events, scheduled/background review runs, or manual workflow dispatch.
  • PR/issue authors and users with repository write access can comment @clawsweeper re-review or @clawsweeper re-run on an open PR or issue to request a fresh review only.
  • Maintainers can also comment @clawsweeper review to request a fresh review only.
  • Fresh-review commands do not start repair, autofix, rebase, CI repair, or automerge.
  • Maintainer-only repair and merge flows require explicit commands such as @clawsweeper autofix, @clawsweeper automerge, @clawsweeper fix ci, or @clawsweeper address review.
  • Maintainers can comment @clawsweeper explain to ask for more context, or @clawsweeper stop to stop active automation.

@clawsweeper clawsweeper Bot added rating: 🧂 unranked krab Not merge-ready due to missing proof or serious correctness/safety concerns. status: 📣 needs proof The PR needs real behavior proof before ClawSweeper can clear the contributor ask. P2 Normal priority bug or improvement with limited blast radius. merge-risk: 🚨 automation 🚨 Merging this PR could break CI, automerge, proof capture, label sync, or automation. labels Jun 3, 2026
@clawsweeper clawsweeper Bot added proof: sufficient Contributor real behavior proof is sufficient. rating: 🐚 platinum hermit Good normal PR readiness with ordinary maintainer review expected. status: 👀 ready for maintainer look ClawSweeper has no concrete contributor-facing blocker left for this PR. and removed rating: 🧂 unranked krab Not merge-ready due to missing proof or serious correctness/safety concerns. status: 📣 needs proof The PR needs real behavior proof before ClawSweeper can clear the contributor ask. labels Jun 3, 2026
@anagnorisis2peripeteia anagnorisis2peripeteia marked this pull request as ready for review June 3, 2026 20:44
On Windows, npm installs codex as a .cmd wrapper. Node's spawnSync
cannot execute .cmd files without shell: true, causing ENOENT/EINVAL.

Fix:
- Add CODEX_BIN env var support (follows existing GIT_BIN pattern)
- Add shell: true on win32 for .cmd wrapper compatibility
- Keep stdin pipe with '-' flag (documented Codex exec interface)
@anagnorisis2peripeteia
Copy link
Copy Markdown
Author

@clawsweeper re-review

@clawsweeper
Copy link
Copy Markdown
Contributor

clawsweeper Bot commented Jun 3, 2026

🦞🧹
ClawSweeper re-review requested.

I asked ClawSweeper to review this item again.
Action: item re-review queued (workflow sweep.yml, event repository_dispatch).
Result: the existing ClawSweeper review comment will be edited in place when the review finishes.

Re-review progress:

@anagnorisis2peripeteia
Copy link
Copy Markdown
Author

@clawsweeper re-review

@clawsweeper
Copy link
Copy Markdown
Contributor

clawsweeper Bot commented Jun 3, 2026

🦞🧹
ClawSweeper re-review requested.

I asked ClawSweeper to review this item again.
Action: item re-review queued (workflow sweep.yml, event repository_dispatch).
Result: the existing ClawSweeper review comment will be edited in place when the review finishes.

Re-review progress:

@clawsweeper clawsweeper Bot added rating: 🌊 off-meta tidepool PR readiness rating does not apply to this item. and removed proof: sufficient Contributor real behavior proof is sufficient. rating: 🐚 platinum hermit Good normal PR readiness with ordinary maintainer review expected. status: 👀 ready for maintainer look ClawSweeper has no concrete contributor-facing blocker left for this PR. labels Jun 3, 2026
@anagnorisis2peripeteia
Copy link
Copy Markdown
Author

@clawsweeper re-review

@clawsweeper
Copy link
Copy Markdown
Contributor

clawsweeper Bot commented Jun 3, 2026

🦞👀
ClawSweeper picked this up.

Command router queued. I will update this comment with the next step.

Re-review progress:

@clawsweeper clawsweeper Bot added proof: sufficient Contributor real behavior proof is sufficient. rating: 🐚 platinum hermit Good normal PR readiness with ordinary maintainer review expected. status: 👀 ready for maintainer look ClawSweeper has no concrete contributor-facing blocker left for this PR. and removed rating: 🌊 off-meta tidepool PR readiness rating does not apply to this item. labels Jun 3, 2026
@anagnorisis2peripeteia
Copy link
Copy Markdown
Author

@clawsweeper re-review

@clawsweeper
Copy link
Copy Markdown
Contributor

clawsweeper Bot commented Jun 3, 2026

🦞👀
ClawSweeper picked this up.

Command router queued. I will update this comment with the next step.

Re-review progress:

@clawsweeper clawsweeper Bot added rating: 🦐 gold shrimp Decent PR readiness signal, but merge confidence is limited. status: ⏳ waiting on author ClawSweeper has contributor-facing work open and is waiting for author action. and removed rating: 🐚 platinum hermit Good normal PR readiness with ordinary maintainer review expected. status: 👀 ready for maintainer look ClawSweeper has no concrete contributor-facing blocker left for this PR. labels Jun 3, 2026
Extends the Windows .cmd wrapper fix to pr-close-coverage-proof.ts
and commit-sweeper.ts, covering all Codex subprocess launchers.
@anagnorisis2peripeteia
Copy link
Copy Markdown
Author

@clawsweeper re-review

@clawsweeper
Copy link
Copy Markdown
Contributor

clawsweeper Bot commented Jun 3, 2026

🦞🧹
ClawSweeper re-review requested.

I asked ClawSweeper to review this item again.
Action: item re-review queued (workflow sweep.yml, event repository_dispatch).
Result: the existing ClawSweeper review comment will be edited in place when the review finishes.

Re-review progress:

@clawsweeper clawsweeper Bot added rating: 🐚 platinum hermit Good normal PR readiness with ordinary maintainer review expected. status: 👀 ready for maintainer look ClawSweeper has no concrete contributor-facing blocker left for this PR. and removed rating: 🦐 gold shrimp Decent PR readiness signal, but merge confidence is limited. status: ⏳ waiting on author ClawSweeper has contributor-facing work open and is waiting for author action. labels Jun 3, 2026
anagnorisis2peripeteia added a commit to anagnorisis2peripeteia/clawsweeper that referenced this pull request Jun 4, 2026
Adds a 'local-review' CLI command that reviews the current branch diff
against a base branch using Codex, without requiring a GitHub PR.

Includes all prerequisites for local runs:
- CODEX_BIN env var support (Windows .cmd wrapper compatibility)
- shell: true on win32 for all Codex spawn sites
- CLAWSWEEPER_CODEX_LOGIN_METHOD for ChatGPT Pro OAuth login
- Applied to all 4 Codex subprocess launchers

Merge order: this PR subsumes openclaw#250 (Windows spawn) and openclaw#251 (OAuth
login). If this lands first, close those as superseded. If either
lands first, this PR needs a trivial rebase (identical changes).
anagnorisis2peripeteia added a commit to anagnorisis2peripeteia/clawsweeper that referenced this pull request Jun 4, 2026
Adds a 'local-review' CLI command that reviews the current branch diff
against a base branch using Codex, without requiring a GitHub PR.

Includes all prerequisites for local runs:
- CODEX_BIN env var support (Windows .cmd wrapper compatibility)
- shell: true on win32 for all Codex spawn sites
- CLAWSWEEPER_CODEX_LOGIN_METHOD for ChatGPT Pro OAuth login
- Applied to all 4 Codex subprocess launchers
- Clears stale output before each review run

Merge order: this PR subsumes openclaw#250 (Windows spawn) and openclaw#251 (OAuth
login). If this lands first, close those as superseded. If either
lands first, this PR needs a trivial rebase (identical changes).
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

merge-risk: 🚨 automation 🚨 Merging this PR could break CI, automerge, proof capture, label sync, or automation. P2 Normal priority bug or improvement with limited blast radius. proof: sufficient Contributor real behavior proof is sufficient. rating: 🐚 platinum hermit Good normal PR readiness with ordinary maintainer review expected. status: 👀 ready for maintainer look ClawSweeper has no concrete contributor-facing blocker left for this PR.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant