Harden CI AI review execution by xingbowang · Pull Request #14796 · facebook/rocksdb

xingbowang · 2026-05-27T20:55:17Z

Summary

Run Codex CI review, classifier, query, and recovery invocations non-interactively with --ask-for-approval never --sandbox read-only instead of --dangerously-bypass-approvals-and-sandbox.
Filter OPENAI_API_KEY, ANTHROPIC_API_KEY, and GITHUB_TOKEN out of shell environments inherited by Codex-managed commands.
Remove Write and Task from Claude review allowed_tools while retaining read-only repository inspection tools.

Test Plan

git diff --check upstream/main...2026_05_27_codex_review
make check-workflow-yaml
make check-sources
codex --ask-for-approval never --sandbox read-only exec review --help

T272955643 identified that fork PR content is passed into AI review prompts while the Codex path previously ran with `--dangerously-bypass-approvals-and-sandbox`. Keep fork PR reviews enabled, but remove that unsafe execution mode from the shared analysis workflow. Run every Codex classifier, review, query, and recovery invocation non-interactively with `--ask-for-approval never --sandbox read-only` instead. This keeps CI from blocking on approvals while preventing model-driven shell commands from writing to the workspace, and it filters `OPENAI_API_KEY`, `ANTHROPIC_API_KEY`, and `GITHUB_TOKEN` out of the shell environment inherited by those commands. Also remove `Write` and `Task` from the main Claude review calls. Claude still has read-only repository tools for analysis, but prompt-injected PR content no longer gets a direct path to modify checkout files that later `github-script` steps execute. Test Plan: `git diff --check` `make check-workflow-yaml` `make check-sources` `codex --ask-for-approval never --sandbox read-only exec review --help`

github-actions · 2026-05-27T20:56:53Z

✅ clang-tidy: No findings on changed lines

Completed in 0.0s.

github-actions · 2026-05-27T21:12:49Z

🟡 Codex Code Review

Auto-triggered after CI passed — reviewing commit a5da299

❌ Codex review failed before producing findings.

WARNING: proceeding, even though we could not update PATH: Refusing to create helper binaries under temporary dir "/tmp" (codex_home: AbsolutePathBuf("/tmp/codex-home"))
error: the argument '--base <BRANCH>' cannot be used with '[PROMPT]'

Usage: codex exec review --commit <SHA> --base <BRANCH> --title <TITLE> --model <MODEL> --config <key=value> --dangerously-bypass-approvals-and-sandbox --output-last-message <FILE> [PROMPT]

For more information, try '--help'.

ℹ️ About this response

Generated by Codex CLI.
Review methodology: claude_md/code_review.md

Limitations:

Codex may miss context from files not in the diff
Large PRs may be truncated
Always apply human judgment to AI suggestions

Commands:

/codex-review [context] — Request a code review
/codex-query <question> — Ask about the PR or codebase

github-actions · 2026-05-27T21:19:50Z

✅ Claude Code Review

Auto-triggered after CI passed — reviewing commit a5da299

Summary

Good security hardening PR that replaces the dangerous --dangerously-bypass-approvals-and-sandbox flag with proper --sandbox read-only and secret filtering. The primary review workflow is preserved. One medium-risk item needs verification before merge.

High-severity findings (0):

No high-severity findings.

Full review (click to expand)

Findings

🔴 HIGH

None.

🟡 MEDIUM

M1. Verify `--sandbox read-only` allows `--output-last-message` to write files — all Codex invocations

Issue: The PR applies --sandbox read-only to all 8 Codex invocations while simultaneously using --output-last-message codex-review-output.md (and similar) to write review output. If the read-only sandbox prevents the Codex CLI process itself from writing files (not just spawned shell commands), the review output will be lost.
Root cause: The sandbox scope (CLI process vs. spawned commands) is not documented in the PR, and the test plan (codex ... --help) only validates flag acceptance, not actual file-writing behavior.
Suggested fix: Before merging, run a live test: codex --ask-for-approval never --sandbox read-only exec -m <model> --output-last-message /tmp/test-output.txt - <<< "Say hello" and verify the output file is created. If it fails, the --output-last-message flag may need to be handled differently (e.g., piping stdout).

M2. Max-turns recovery quality degradation for Claude reviews — `.github/workflows/ai-review-analysis.yml:625,1255`

Issue: The CI review prompt (claude_md/ci_review_prompt.md) instructs Claude to write intermediate files (review-findings.md, context.md, findings-*.md) using the Write tool. With Write removed, Claude cannot create review-findings.md. The recovery logic (line 644) checks fs.existsSync('review-findings.md') to decide whether to run a recovery formatting session. Without this file, needs_recovery will always be false.
Root cause: The review prompt was designed to work with Write tool access; the prompt was not updated alongside this hardening change.
Impact: This is NOT a workflow-breaking issue. The primary review output flows through the execution log (parse-claude-review.js reads resultMessage.result from the JSON execution log). When Claude hits max turns without review-findings.md, the comment builder falls back to extracting the last substantial assistant text from the execution log (line 75). Reviews still produce output; only the recovery formatting is lost.
Suggested fix: Either (a) update claude_md/ci_review_prompt.md to remove file-writing instructions (since they're now no-ops), or (b) accept the quality degradation in the max-turns edge case. Option (a) is recommended to avoid confusing the model with impossible instructions.

🟢 LOW / NIT

L1. Secret exclusion list could be more comprehensive — all Codex invocations

Issue: The shell_environment_policy.exclude only lists OPENAI_API_KEY, ANTHROPIC_API_KEY, and GITHUB_TOKEN. Other runner-provided variables like ACTIONS_RUNTIME_TOKEN or ACTIONS_CACHE_URL are not explicitly excluded.
Root cause: The shell_environment_policy.inherit="core" setting is the primary control, limiting inheritance to core env vars (PATH, HOME, etc.). The exclude list is defense-in-depth.
Suggested fix: Consider adding ACTIONS_RUNTIME_TOKEN and ACTIONS_CACHE_URL to the exclude list for additional defense-in-depth, though inherit="core" should already prevent their inheritance.

L2. Review prompt contains impossible instructions after this change

Issue: The review prompt (claude_md/ci_review_prompt.md) still instructs Claude to "append your findings to review-findings.md using the Write tool" and "write the complete review to review-findings.md." With Write removed, Claude will attempt these operations and receive tool restriction errors, wasting turns.
Suggested fix: Update the review prompt to remove or replace file-writing instructions. Claude should be instructed to output the final review as its response text instead.

Cross-Component Analysis

Component	Impact	Status
Claude primary review flow	No impact — output via execution log	Safe
Claude max-turns recovery	Degraded — falls back to raw text extraction	Acceptable
Codex primary review flow	Depends on sandbox scope for `--output-last-message`	Needs verification
Codex recovery flow	Same sandbox dependency	Needs verification
Comment builder (Claude)	No impact — reads execution log	Safe
Comment builder (Codex)	Falls back to log tail if output file missing	Graceful degradation
Comment poster workflow	No impact — reads uploaded artifact	Safe
Secret filtering	Effective with `inherit="core"` as primary control	Good

Positive Observations

Replacing --dangerously-bypass-approvals-and-sandbox with specific controls is a significant security improvement. The flag name itself indicates it was never intended for production use.
The shell_environment_policy.inherit="core" + explicit exclude list provides layered defense against secret leakage.
YAML injection prevention is properly maintained throughout (user inputs via env: blocks, not inline ${{ }}).
The permission model (contents: read, actions: read, checks: read, pull-requests: read) follows principle of least privilege.
All 8 Codex invocations are consistently updated — no invocations were missed.
The security comment accurately describes the new posture.

ℹ️ About this response

Generated by Claude Code.
Review methodology: claude_md/code_review.md

Limitations:

Claude may miss context from files not in the diff
Large PRs may be truncated
Always apply human judgment to AI suggestions

Commands:

/claude-review [context] — Request a code review
/claude-query <question> — Ask about the PR or codebase

meta-cla Bot added the CLA Signed label May 27, 2026

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Harden CI AI review execution#14796

Harden CI AI review execution#14796
xingbowang wants to merge 1 commit into
facebook:mainfrom
xingbowang:2026_05_27_codex_review

xingbowang commented May 27, 2026

Uh oh!

github-actions Bot commented May 27, 2026

Uh oh!

github-actions Bot commented May 27, 2026

Uh oh!

github-actions Bot commented May 27, 2026

Findings

🔴 HIGH

🟡 MEDIUM

M1. Verify `--sandbox read-only` allows `--output-last-message` to write files — all Codex invocations

M2. Max-turns recovery quality degradation for Claude reviews — `.github/workflows/ai-review-analysis.yml:625,1255`

🟢 LOW / NIT

L1. Secret exclusion list could be more comprehensive — all Codex invocations

L2. Review prompt contains impossible instructions after this change

Cross-Component Analysis

Positive Observations

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

xingbowang commented May 27, 2026

Summary

Test Plan

Uh oh!

github-actions Bot commented May 27, 2026

✅ clang-tidy: No findings on changed lines

Uh oh!

github-actions Bot commented May 27, 2026

🟡 Codex Code Review

Uh oh!

github-actions Bot commented May 27, 2026

✅ Claude Code Review

Summary

Findings

🔴 HIGH

🟡 MEDIUM

M1. Verify --sandbox read-only allows --output-last-message to write files — all Codex invocations

M2. Max-turns recovery quality degradation for Claude reviews — .github/workflows/ai-review-analysis.yml:625,1255

🟢 LOW / NIT

L1. Secret exclusion list could be more comprehensive — all Codex invocations

L2. Review prompt contains impossible instructions after this change

Cross-Component Analysis

Positive Observations

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

M1. Verify `--sandbox read-only` allows `--output-last-message` to write files — all Codex invocations

M2. Max-turns recovery quality degradation for Claude reviews — `.github/workflows/ai-review-analysis.yml:625,1255`