fix(bash-guard): deny git verbs inside command substitutions#223
fix(bash-guard): deny git verbs inside command substitutions#223antfleet-ops wants to merge 1 commit into
Conversation
The skeletonizer strips echo/printf args and heredoc bodies before matching against the git deny-list, treating them as literal data the shell writes to a file. That holds for plain text — but `echo $(git fetch origin)` and `echo \`git push\`` are evaluated by the shell BEFORE echo ever runs, so the denied git operation executes while the skeleton no longer contains `git fetch` / `git push`. Same trick works inside printf args, double-quoted strings, and unquoted heredoc bodies. Add a pre-check that scans the raw command for git deny-list verbs nested inside `$(...)` substitutions or backticks BEFORE the skeletonizer runs. If found, deny with a substitution-specific message. Backticks and `$(...)` always expand regardless of the wrapping command, so this is safe to apply across all contexts. Trade-off: documentation that literally contains the string "$(git ...)" (even inside a single-quoted heredoc, where it would otherwise remain literal) is also denied. The PR body discusses why this is acceptable — agents can split the string or escape the dollar sign. Regression coverage: - echo $(git fetch origin) — primary reproduction - printf "%s" "$(git push)" - echo `git push origin main` — backtick form - cat <<EOF\n$(git pull)\nEOF — unquoted heredoc form - echo "starting $(git checkout main)" — double-quoted form - echo $(date) — must still ALLOW non-git substitutions Existing benign cases (single-quoted heredoc git docs, plain echo args mentioning git, external curl, bash -c with real git fetch) verified unchanged by the local test harness. Found via AntFleet two-model bench review (Claude Opus 4.7 + GPT-5): AntFleet/bench-roboco#2 (comment)
|
Thanks for opening your first pull request on RoboCo! Quick checklist before review (most of these are enforced by CI, but worth a glance):
See CONTRIBUTING.md for the full workflow and the Code of Conduct for the community standards we follow. Welcome aboard — a maintainer will review shortly. |
|
I have read the CLA Document and I hereby sign the CLA You can retrigger this bot by commenting recheck in this Pull Request. Posted by the CLA Assistant Lite bot. |
rennf93
left a comment
There was a problem hiding this comment.
REQUEST_CHANGES — external PR #223: fix(bash-guard): deny git verbs inside command substitutions
The diff adds a pre-check (using python re on raw ... before the skeletonizer can strip echo/printf/heredoc wrappers. This is a reasonable attempt to close a real bypass, but the PR as submitted has path, design, and over-reach issues.
Per-criterion / per-finding review
-
Path mismatch — change will not take effect
- File: docker/scripts/bash-guard-hook.sh (see diff --- / +++ headers)
- Line ~1 and insertion context after "low=..."
- Expected: patch must edit the source file that is actually loaded at runtime: scripts/bash-guard-hook.sh (referenced by orchestrator.py:1250 as "/app/scripts/bash-guard-hook.sh", grok_cli_config.py:64 as ROBOCO_BASH_GUARD_HOOK, and the Claude PreToolUse config).
- Actual: diff targets docker/scripts/bash-guard-hook.sh (a path that does not exist in the current tree and is only referenced in stale comments). The live skeletonizer (current script lines 55-82) and git check (line 97) are untouched by this patch.
-
Pre-check violates SKIP_GIT contract (grok git denials must stay graceful)
- File: docker/scripts/bash-guard-hook.sh
- Lines: 48-72 (the entire new
subst_git_deny=$(printf...block and theif [[ "$subst_git_deny" == "yes" ]]; then ... exit 2) - Expected: the new git-sub check must be inside
if [[ "${ROBOCO_GUARD_SKIP_GIT:-}" != "1" ]]; then ... fi(and the later git check already is). See script header comments (lines 17-22) and bash_guard_hook_config: on grok the hook runs with SKIP=1 precisely so git ops produce recoverable --deny errors rather than hook exit-2 hard cancels. - Actual: the substitution detector and early deny run unconditionally for every Bash command. Consequence:
echo $(git push)(or any) under Grok will now cancel the whole agent run instead of letting the agent adapt.
-
False positives on non-executing literal text (quoted heredocs, single-quoted strings)
- File: docker/scripts/bash-guard-hook.sh
- Lines: 48-60 (the python that scans raw $cmd for subst_re/backtick_re then git_re)
- Expected: only patterns that the shell will expand at runtime should be denied. The existing skeletonizer intentionally strips bodies of heredocs (quoted or not) and args to echo/printf precisely to allow documentation strings containing "git commit" (see current comments lines 48-54 and the heredoc opener logic lines 59-78). A literal inside <<'EOF' or '...' must not trigger git verb blocks.
- Actual: the raw-cmd pre-check fires on the textual sequence "$(git " or "
git <verb>" anywhere. This denies docs even inside single-quoted heredocs (as the PR comment itself warns) and any single-quoted string. The comment understates scope: impact is not limited to "heredoc body (even a single-quoted one)".
-
Incomplete test for the documented trade-off
- File: tests/unit/scripts/test_bash_guard.py
- Lines ~252+ (new tests after test_still_denies_printf_piped_into_git_apply_path)
- Expected: either (a) a test that a quoted-heredoc literal
$(git ...)continues to be allowed (if we wanted to preserve docs), or (b) an explicit test + comment that it is now denied, demonstrating the authors considered the impact. - Actual: only the unquoted-heredoc sub case is added (test_denies_unquoted_heredoc_body_with_git_substitution). The quoted literal case that will now be denied has no corresponding test.
-
Detector fails open on error
- File: docker/scripts/bash-guard-hook.sh
- Lines: the python -c ... 2>/dev/null; if [[ "$subst_git_deny" == "yes" ]]
- Expected: like the skeletonizer sentinel (lines 86-90), parse/detection failure for a security rule should fail closed (deny or fall back to full-cmd inspection) rather than silently allow.
- Actual: non-"yes" output (empty string, traceback suppressed, etc.) yields "no" and proceeds to skeletonizer/main checks, potentially allowing a git verb in a substitution that the detector failed to recognize.
-
Nit: opportunity to strengthen primary git regex (not required for this PR)
- The pre-check's git_re uses prefix class
[\s;&|()]which will catch "(git verb)". The main check at current line 97 uses only[[:space:];&|]and therefore misses top-level "(git ...)" without a space or ;&| before the g. - Not a defect of the submitted change, but the same file now has two different boundary rules.
- The pre-check's git_re uses prefix class
Positive notes
- The new tests (echo
$(git ...), printf "$ (git ...)", backticks, double-quoted subs, non-git subs) correctly exercise the motivating bypass and are placed appropriately. - The python sub-regex tolerates one nested paren level as documented; backtick handling is simple but sufficient.
- Commit message style matches project convention.
Recommended next steps for contributor
- Rebase the patch onto current tree and target the real paths: scripts/bash-guard-hook.sh and the matching test file under tests/.
- Wrap the subst pre-check (and its deny) inside the existing
if [[ "${ROBOCO_GUARD_SKIP_GIT:-}" != "1" ]]block so grok behavior is unchanged. - Either (a) make the detector heredoc/quote-aware (harder; consider re-using/extending the skeletonizer), or (b) explicitly accept+document+test the broader false-positive surface for literal "$(git ...)" mentions.
- Consider whether other execution wrappers (eval, var indirection, $0 etc.) need similar treatment; out of scope for this PR but related.
This review is complete. One change request covers all findings.
Closes the real bypass antfleet flagged in PR #223 (credit to them for the finding): a denied git verb inside $(...) or backticks is expanded by the shell before the wrapping echo/printf runs, so the skeletonizer's strip hid it from the git check. This reworks it correctly where #223's fix could not land: - targets the live source (docker/scripts/bash-guard-hook.sh, COPY'd to /app/scripts/), not a path that doesn't exist; - runs INSIDE the ROBOCO_GUARD_SKIP_GIT guard, so on grok it stays the native --deny's job and never hard-cancels the run (#223 ran it unconditionally); - excludes single-quoted strings and heredoc bodies (literal / data, matching the skeletonizer), so a README documenting git verbs isn't a false positive; - fails closed (a non-sentinel / python failure denies). 44 bash-guard tests pass (5 new: dollar/backtick/double-quoted substitution deny, single-quoted literal allow, grok-skip allow). Co-authored-by: Renn F <rennf93@users.noreply.github.com>
|
Superseded by #226 (merged to master): we reworked the same command-substitution bypass so it runs inside the SKIP_GIT guard (doesn't hard-cancel Grok), excludes single-quoted/heredoc literals, and fails closed. Thanks for the catch. |
Summary
docker/scripts/bash-guard-hook.shskeletonizes the command before applying its git deny-list — heredoc bodies andecho/printfargs are stripped on the grounds that they're literal data the shell writes to a file. That's true for plain text, butecho $(git fetch origin)is evaluated by the shell before echo runs. The substitution executes the denied git op while the skeleton no longer containsgit fetch/git push. The same trick works insideprintfargs, double-quoted strings, backticks, and unquoted heredoc bodies.The hook's header documents its purpose as denying shell-level git network / auth ops; this bypass undermines that for any agent that can issue a Bash tool call.
Reproduction
PreToolUse JSON payload with
tool_input.commandset toecho $(git fetch origin):$(git fetch origin)and the denied op runs.What changed
Add a pre-check that runs BEFORE the existing skeletonizer. It scans the raw command for git deny-list verbs nested inside
$(...)substitutions or backticks; if found, the hook denies with a substitution-specific message that points back at the role's MCP verbs. Command substitutions are always expanded by the shell, so the check is safe to apply across all wrapping contexts.Trade-off
Literal documentation that contains the exact string
$(git ...)(even inside a single-quoted heredoc, where it would otherwise remain literal) is also denied. The collision rate is low in practice — agents writing docs about command substitution can split the string or escape the dollar sign. The security benefit of catching the real bypass outweighs the rare false positive.A fuller fix could parse heredoc openers to know which are quoted and skip the substitution scan for quoted bodies, but adds complexity without much surface gain.
Tests
New denials covered:
echo $(git fetch origin)— primary reproductionprintf "%s" "$(git push)"echo `git push origin main`— backtick formcat <<EOF+ body containing$(git pull)— unquoted heredoc formecho "starting $(git checkout main)"— double-quoted formNew allow guard:
echo $(date)andVAR="$(hostname)"— must still passExisting benign cases (single-quoted heredoc git docs, plain echo text mentioning git, external curl,
bash -c "... && git fetch") verified unchanged by the local shell harness.How it was found
AntFleet's two-model security review (Claude Opus 4.7 + GPT-5) on a mirror of this repo. Both models independently flagged the same defect.