feat(AGENT-25): add [scan] include/exclude path filters#17

Merged: limaronaldo merged 4 commits into main from feat/agent-25-path-filters on Jun 14, 2026

@limaronaldo

Summary

Implements AGENT-25: [scan] include and [scan] exclude glob filters in .agentshield.toml, relative to the scan root. exclude wins when both match. The filter runs before source-file collection/parsing across all adapters (MCP, OpenClaw, Hermes, CrewAI, LangChain, GPT Actions, Cursor Rules) and is surfaced in scan --explain as Path filters.

Two commits initially (the two follow-up commits are described in the review comment below):

  1. feat(AGENT-25): the feature. Adds config parsing, glob compilation, precedence (included && !excluded), per-adapter filtering, and --explain visibility. Updates the docs (README.md, AGENTS.md) with a reference to the local Huly skill. Adds tests/path_filters.rs (include-only, exclude-only, exclude-wins precedence, explain visibility).
  2. fix(AGENT-25): closes a bypass found in code review (see below).
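
For reviewers who want the precedence rule in one place, here is a minimal sketch of `included && !excluded`. It is not the code in src/config/mod.rs: `PathFilterSketch` is a hypothetical stand-in for `ScanPathFilter`, patterns are compared by exact string equality instead of compiled globs, and treating an empty include list as "everything is included" is an assumption.

```rust
/// Hypothetical stand-in for the PR's `ScanPathFilter`; the real type's
/// fields and constructor are not shown in this description.
pub struct PathFilterSketch {
    include: Vec<String>,
    exclude: Vec<String>,
}

impl PathFilterSketch {
    pub fn new(include: &[&str], exclude: &[&str]) -> Self {
        Self {
            include: include.iter().map(|s| s.to_string()).collect(),
            exclude: exclude.iter().map(|s| s.to_string()).collect(),
        }
    }

    /// Placeholder matcher: exact equality stands in for glob matching.
    fn matches_any(patterns: &[String], rel_path: &str) -> bool {
        patterns.iter().any(|p| p == rel_path)
    }

    /// A path is scanned only if it is included and not excluded,
    /// so `exclude` wins whenever both lists match.
    pub fn allows_path(&self, rel_path: &str) -> bool {
        // Assumption: no include patterns means every path is included.
        let included =
            self.include.is_empty() || Self::matches_any(&self.include, rel_path);
        let excluded = Self::matches_any(&self.exclude, rel_path);
        included && !excluded
    }
}
```

With `include = ["src/a.py", "src/b.py"]` and `exclude = ["src/b.py"]`, only `src/a.py` passes: `src/b.py` matches both lists and is dropped, and `src/c.py` is never included.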

Code review fix (metadata-derived findings)

Review found the first commit filtered parsed source files but not metadata read directly from the scan root, so excluded files still produced findings — violating the acceptance criterion "Excluded files do not produce findings".

Reproduced before the fix:

  • exclude = ["package.json"] still emitted SHIELD-009 / SHIELD-012 located at package.json
  • exclude = ["tools.json"] still emitted SHIELD-008 from excluded tool metadata

The fix threads ScanPathFilter into the MCP tools.json load path and into the shared parse_dependencies / parse_provenance helpers (used by all adapters), gating every root-relative metadata file on allows_path: requirements.txt, Pipfile.lock, poetry.lock, uv.lock, package.json, package-lock.json, pyproject.toml, tools.json.
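
The gating idea can be sketched as follows. This is an illustration only: `gated_metadata_files` and `ROOT_METADATA_FILES` are hypothetical names, and the filter is passed as a closure rather than the real `ScanPathFilter`. The file list is the one given above.

```rust
/// Root-relative metadata files named in this PR description.
const ROOT_METADATA_FILES: [&str; 8] = [
    "requirements.txt",
    "Pipfile.lock",
    "poetry.lock",
    "uv.lock",
    "package.json",
    "package-lock.json",
    "pyproject.toml",
    "tools.json",
];

/// Returns the metadata files a helper would still be allowed to read.
/// `allows_path` stands in for the filter's `allows_path` check; any file it
/// rejects is never opened, so it cannot contribute findings.
pub fn gated_metadata_files(allows_path: impl Fn(&str) -> bool) -> Vec<&'static str> {
    ROOT_METADATA_FILES
        .iter()
        .copied()
        .filter(|&name| allows_path(name))
        .collect()
}
```

Calling `gated_metadata_files(|p| p != "package.json")` keeps the other seven files and skips `package.json`, which is the behavior the regression tests check at the findings level.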

Added regression tests for exclude = ["package.json"], ["requirements.txt"], and ["tools.json"] — these failed before the fix (RED) and pass after (GREEN).

Test plan

  • cargo test — 253 passed
  • cargo test --all-features — 328 passed
  • cargo test --no-default-features — 247 passed
  • cargo clippy --all-targets --all-features -- -D warnings — no issues
  • cargo fmt --all -- --check — clean
  • git diff --check — clean
  • Manual repro: with exclude active → 0 findings; without exclude → findings return (detectors intact)

Closes AGENT-25.

Add `[scan] include` and `[scan] exclude` globs to .agentshield.toml,
relative to the scan root. `exclude` wins when both match. The filter
runs before source-file collection/parsing across all adapters (MCP,
OpenClaw, Hermes, CrewAI, LangChain, GPT Actions, Cursor Rules) and is
surfaced in `scan --explain` as "Path filters".

- src/config/mod.rs: parse include/exclude, compile glob patterns,
  precedence (included && !excluded)
- adapters: apply path filter before collecting source files
- src/ux.rs: report active path filters in --explain output
- README.md / AGENTS.md: document the new config keys; reference the
  local Huly skill at skills/huly/SKILL.md
- tests/path_filters.rs: include-only, exclude-only, exclude-wins
  precedence, and explain visibility

Code review found that `[scan] exclude`/`include` filtered parsed source
files but not metadata read directly from the scan root, so excluded
files still produced findings — violating the acceptance criterion
"Excluded files do not produce findings".

Thread `ScanPathFilter` into the MCP `tools.json` load path and into the
shared `parse_dependencies` / `parse_provenance` helpers (used by all
adapters), gating every root-relative metadata file
(requirements.txt, Pipfile.lock, poetry.lock, uv.lock, package.json,
package-lock.json, pyproject.toml, tools.json) on `allows_path`.

Reproduced before fix: `exclude = ["package.json"]` still emitted
SHIELD-009/SHIELD-012; `exclude = ["tools.json"]` still emitted
SHIELD-008. All suppressed after the fix; detectors still fire when
the files are not excluded.

- src/adapter/mcp.rs: gate tools.json load + parse_dependencies/
  parse_provenance metadata files on the path filter
- src/adapter/{crewai,cursor_rules,gpt_actions,hermes,langchain}.rs:
  pass the active filter into the shared helpers
- tests/path_filters.rs: regression tests for excluded package.json,
  requirements.txt, and tools.json metadata findings

limaronaldo commented Jun 14, 2026

Review follow-up for AGENT-25 addressed on this branch.

Fixed in c1425b6:

  • path filters now use explicit case-sensitive glob matching with literal path separators, so *.py no longer crosses directories;
  • leading ./ or / patterns normalize relative to the scan root;
  • trailing slash patterns such as legacy/ match directory contents;
  • glob semantics are covered for **/generated/**, root/nested generated directories, and case-sensitivity;
  • README and starter config comments now document the expected semantics.
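
The semantics listed above can be illustrated with a small std-only matcher. This is a sketch written for this comment, not the implementation in c1425b6 (which may rely on a glob crate): `normalize_pattern` and `glob_matches` are hypothetical names, only `*` and `**` are supported, and rewriting a trailing slash to `/**` is an assumed way to get "matches directory contents".

```rust
/// Assumed normalization: drop a leading "./" or "/", and turn a trailing
/// slash (e.g. "legacy/") into "legacy/**" so it covers directory contents.
pub fn normalize_pattern(raw: &str) -> String {
    let trimmed = raw
        .strip_prefix("./")
        .or_else(|| raw.strip_prefix('/'))
        .unwrap_or(raw);
    if trimmed.ends_with('/') {
        format!("{trimmed}**")
    } else {
        trimmed.to_string()
    }
}

/// Matches one path segment; `*` matches any run of bytes inside the segment.
/// Comparison is byte-for-byte, so matching is case-sensitive.
fn segment_matches(pat: &[u8], text: &[u8]) -> bool {
    match pat.split_first() {
        None => text.is_empty(),
        Some((&b'*', rest)) => (0..=text.len()).any(|i| segment_matches(rest, &text[i..])),
        Some((&c, rest)) => text.first() == Some(&c) && segment_matches(rest, &text[1..]),
    }
}

/// Matches whole paths segment by segment; only a `**` segment may span
/// zero or more directories, so `*` never crosses a `/`.
fn segments_match(pat: &[&str], path: &[&str]) -> bool {
    match pat.split_first() {
        None => path.is_empty(),
        Some((seg, rest)) if *seg == "**" => {
            (0..=path.len()).any(|i| segments_match(rest, &path[i..]))
        }
        Some((seg, rest)) => match path.split_first() {
            Some((head, tail)) => {
                segment_matches(seg.as_bytes(), head.as_bytes()) && segments_match(rest, tail)
            }
            None => false,
        },
    }
}

/// `rel_path` is expected to be relative to the scan root, using `/`.
pub fn glob_matches(raw_pattern: &str, rel_path: &str) -> bool {
    let pattern = normalize_pattern(raw_pattern);
    let pat: Vec<&str> = pattern.split('/').collect();
    let path: Vec<&str> = rel_path.split('/').collect();
    segments_match(&pat, &path)
}
```

Under these assumptions, `*.py` matches `a.py` but not `src/a.py`, `legacy/` matches `legacy/old/tool.py`, and `**/generated/**` matches both `generated/x.rs` and `a/b/generated/x.rs` but not `Generated/x.rs`.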

Fixed in 8fd6784:

  • metadata path-filter regression tests now assert absence of supply-chain findings, not only absence of one manifest location, so the previous SHIELD-012 migration/vacuous-test gap is covered.
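
To show why a location-only assertion can pass vacuously, here is a small sketch. The `Finding` struct and both helpers are hypothetical, not the types used in tests/path_filters.rs, and the example finding reported at `package-lock.json` is invented purely for illustration.

```rust
/// Hypothetical finding shape: a rule ID plus the file it is reported at.
#[derive(Debug, Clone, PartialEq)]
pub struct Finding {
    pub rule_id: &'static str,
    pub location: &'static str,
}

/// Weak check: only proves nothing is reported at one manifest path.
/// It still passes if the same rule fires with a different location.
pub fn no_finding_at(findings: &[Finding], location: &str) -> bool {
    findings.iter().all(|f| f.location != location)
}

/// Stronger check: proves none of the given rule IDs fired anywhere.
pub fn no_findings_for_rules(findings: &[Finding], rule_ids: &[&str]) -> bool {
    findings.iter().all(|f| !rule_ids.contains(&f.rule_id))
}
```

If a scan with `exclude = ["package.json"]` still produced a SHIELD-012 finding reported at some other location, `no_finding_at(&findings, "package.json")` would pass while `no_findings_for_rules(&findings, &["SHIELD-009", "SHIELD-012"])` would fail, which is the gap the stronger assertion closes.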

Local verification passed:

  • cargo fmt --all -- --check
  • cargo test (258 tests)
  • cargo clippy --all-targets --all-features -- -D warnings
  • cargo test --no-default-features
  • cargo clippy --all-targets --no-default-features -- -D warnings
  • CLI smoke with exclude = ["legacy/"] confirmed the excluded file and SHIELD-001 do not appear in JSON output.

@limaronaldo limaronaldo merged commit 005680f into main Jun 14, 2026
9 checks passed
@limaronaldo limaronaldo deleted the feat/agent-25-path-filters branch June 14, 2026 16:15