common: support .trufflehogignore auto-discovery (#2687) #4941
ChrisJr404 wants to merge 2 commits into trufflesecurity:main
Add a .trufflehogignore file format that trufflehog auto-discovers at each scan root, in the spirit of .gitignore / .gitleaksignore. The file uses gitignore-style globs (one per line, '#' for comments) and its patterns are appended to the filter's exclude set, so users can commit ignore rules next to their code instead of maintaining the existing --exclude-paths regex file out-of-band.

This is the most-upvoted ergonomic ask in the repo (29 reactions on issue trufflesecurity#2687) and brings parity with the .gitleaksignore pattern trufflehog users coming from gitleaks already know.

Implementation:

* New IgnoreFileName constant ('.trufflehogignore') and a Filter.AddTrufflehogIgnoreFiles(roots...) helper that walks the supplied scan roots, parses each ignore file (skipping '#' comments and blank lines), and appends compiled patterns to the filter's exclude FilterRuleSet. Roots are deduped so the same ignore file is loaded once even when --paths repeats.
* New globToRegex helper that converts gitignore-style globs to anchored regexes. Supported syntax mirrors gitignore: '*' (single-segment), '**' / '**/' / '/**' (zero or more dir segments), '?' (single char), '/'-prefix anchor, '/'-suffix dir match. Character classes ('[...]') and '!' re-includes return a clear error rather than silently doing the wrong thing.
* Auto-discovery wired into the filesystem source's Init so passing --paths=/repo with a /repo/.trufflehogignore in place automatically applies the ignore rules. Git source is not auto-discovered here (the workspace is checked out into a temp dir on each scan, so a file-based discovery doesn't have a stable root); users on the git source can still point --exclude-paths at their .trufflehogignore.

Tests:

* TestAddTrufflehogIgnoreFiles: full happy path with vendor/, *.lock, /secrets/known.json, src/**/*.test.go in one ignore file. Asserts ten path/exclusion pairs covering anchored, unanchored, glob, and ** semantics.
* TestAddTrufflehogIgnoreFiles_NoFile: no error when the ignore file is missing.
* TestAddTrufflehogIgnoreFiles_DedupeRoots: same root passed three times loads the file once.
* TestAddTrufflehogIgnoreFiles_RejectsNegation: '!keep_me.go' surfaces a clear 're-include patterns ... not yet supported' error so users who copy-paste a .gitignore aren't silently fooled.
* TestGlobToRegex: table-driven, pins '*.lock', 'vendor/', '/secrets/key.txt', 'src/**/*.go' (with zero-depth match), 'foo?bar'.
* Existing TestFilterBasic / TestFilterFromFile and the broader pkg/sources/filesystem suite continue to pass.

Closes trufflesecurity#2687
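
As a rough illustration of the parsing rules described in this commit message (skip '#' comments and blank lines, reject '!' re-includes and '[...]' character classes), here is a minimal Go sketch. The function name parseIgnoreLines, the error wording, and the sample file contents are assumptions made for this example; they are not taken from the PR's code.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// parseIgnoreLines is a hypothetical helper, not the PR's implementation.
// It returns the glob patterns found in ignore-file content, skipping blank
// lines and '#' comments, and rejecting syntax the PR lists as unsupported.
func parseIgnoreLines(content string) ([]string, error) {
	var patterns []string
	scanner := bufio.NewScanner(strings.NewReader(content))
	lineNo := 0
	for scanner.Scan() {
		lineNo++
		line := strings.TrimSpace(scanner.Text())
		if line == "" || strings.HasPrefix(line, "#") {
			continue // blank line or comment
		}
		if strings.HasPrefix(line, "!") {
			return nil, fmt.Errorf("line %d: re-include patterns (%q) are not yet supported", lineNo, line)
		}
		if strings.ContainsAny(line, "[]") {
			return nil, fmt.Errorf("line %d: character classes (%q) are not supported", lineNo, line)
		}
		patterns = append(patterns, line)
	}
	if err := scanner.Err(); err != nil {
		return nil, err
	}
	return patterns, nil
}

func main() {
	// Sample content built from the patterns named in the test description.
	const sample = `# dependencies and lockfiles
vendor/
*.lock

/secrets/known.json
src/**/*.test.go
`
	patterns, err := parseIgnoreLines(sample)
	fmt.Println(patterns, err)

	// A pattern copied from a .gitignore that uses negation is rejected.
	_, err = parseIgnoreLines("!keep_me.go\n")
	fmt.Println(err)
}
```

Failing on the first unsupported line, rather than skipping it, follows the commit message's stated goal of not silently doing the wrong thing.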
The trailing-/** branch in globToRegex had:
if i+2 == len(body) && body[i] == '/' && body[i+1] == '*' && body[i+2] == '*' {
When i+2 == len(body), accessing body[i+2] indexes past the end of
the slice. The branch was effectively dead code (the bounds check
made it unreachable without panicking), but a real .trufflehogignore
entry like 'build/**' would never match the intended trailing-glob
case anyway. Fix the index: i indexes the leading '/', so i+2 must be
the last valid byte ('*'), which means i+3 == len(body).
Adds two new globToRegex cases that would have caught this:
- 'build/**' matches build/foo, build/foo/bar/baz, build/x.txt;
misses 'build' and 'src/build.go'.
- '**/test' matches test, src/test, a/b/c/test; misses 'testless'
and 'test.go'.
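
To make the corrected bounds check concrete, here is a simplified Go sketch of a glob-to-regex converter that uses the i+3 == len(body) form for the trailing '/**' case. It is an illustration under stated assumptions, not the PR's globToRegex: the name globToRegexSketch, the '(?:^|/)' prefix for unanchored globs, and the exact regex fragments are guesses, and the '!' / '[...]' error handling is omitted.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// globToRegexSketch is a simplified, hypothetical stand-in for the PR's
// globToRegex. It handles only '*', '?', and '**'.
func globToRegexSketch(glob string) string {
	anchored := strings.HasPrefix(glob, "/")
	body := strings.TrimSuffix(strings.TrimPrefix(glob, "/"), "/")

	var b strings.Builder
	if anchored {
		b.WriteString("^")
	} else {
		b.WriteString("(?:^|/)") // assumption: unanchored globs match at any segment boundary
	}
	for i := 0; i < len(body); {
		switch {
		// Trailing "/**": i indexes the '/', so i+2 is the last valid byte
		// and the length check must be i+3 == len(body).
		case i+3 == len(body) && body[i] == '/' && body[i+1] == '*' && body[i+2] == '*':
			b.WriteString("/.+")
			i += 3
		// "**/": zero or more directory segments (the sketch does not
		// verify that the "**" starts a path segment).
		case strings.HasPrefix(body[i:], "**/"):
			b.WriteString("(?:.*/)?")
			i += 3
		case strings.HasPrefix(body[i:], "**"):
			b.WriteString(".*")
			i += 2
		case body[i] == '*':
			b.WriteString("[^/]*") // stays within one path segment
			i++
		case body[i] == '?':
			b.WriteString("[^/]")
			i++
		default:
			b.WriteString(regexp.QuoteMeta(body[i : i+1]))
			i++
		}
	}
	b.WriteString("(?:$|/)")
	return b.String()
}

// matches reports whether path matches the regex derived from glob.
func matches(glob, path string) bool {
	return regexp.MustCompile(globToRegexSketch(glob)).MatchString(path)
}

func main() {
	for _, p := range []string{"build/foo", "build/foo/bar/baz", "build", "src/build.go"} {
		fmt.Printf("build/** vs %-18s %v\n", p, matches("build/**", p))
	}
	for _, p := range []string{"test", "a/b/c/test", "testless", "test.go"} {
		fmt.Printf("**/test  vs %-18s %v\n", p, matches("**/test", p))
	}
}
```

With the length check placed first, Go's short-circuit evaluation guarantees that body[i+2] is read only when it is in range.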
Cursor Bugbot has reviewed your changes and found 1 potential issue.
Reviewed by Cursor Bugbot for commit 43bce16.
    var b strings.Builder
    if anchored {
        b.WriteString("^")
Anchored glob patterns silently fail with non-dot scan roots
High Severity
`globToRegex` converts leading-`/` globs (e.g., `/secrets/known.json`) into a regex anchored with `^` (`^secrets/known\.json(?:$|/)`). However, the filesystem source passes full paths (rooted at the scan directory) to `ShouldExclude` and `Pass`, e.g., `/home/user/project/secrets/known.json` or `myproject/secrets/known.json`. The `^` anchor forces matching at position 0, so the pattern never matches unless the scan root happens to be `.`. The unit tests pass only because they use bare relative paths like `"secrets/known.json"` rather than paths prefixed by a scan root.
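
The mismatch can be reproduced with the regex quoted in this finding. The Go sketch below first tests the three example paths against the anchored pattern, then shows one possible remedy: make each path relative to its scan root before matching. The helper excludedRelative is hypothetical; it is not part of the PR and not necessarily how the maintainers would resolve the issue.

```go
package main

import (
	"fmt"
	"path/filepath"
	"regexp"
)

// anchored is the regex the review quotes for the glob /secrets/known.json.
var anchored = regexp.MustCompile(`^secrets/known\.json(?:$|/)`)

// excludedRelative is a hypothetical remedy, not the PR's code: it makes the
// path relative to the scan root before testing the anchored pattern.
func excludedRelative(root, path string) bool {
	rel, err := filepath.Rel(root, path)
	if err != nil {
		return false // path cannot be expressed relative to root
	}
	return anchored.MatchString(filepath.ToSlash(rel))
}

func main() {
	// Direct matching: only the bare relative path satisfies the ^ anchor.
	for _, p := range []string{
		"secrets/known.json",
		"myproject/secrets/known.json",
		"/home/user/project/secrets/known.json",
	} {
		fmt.Printf("%-40s matched=%v\n", p, anchored.MatchString(p))
	}

	// Matching after relativizing against the scan root.
	fmt.Println(excludedRelative("/home/user/project", "/home/user/project/secrets/known.json"))
	fmt.Println(excludedRelative("myproject", "myproject/secrets/known.json"))
}
```

Relativizing keeps the gitignore meaning of a leading '/' (relative to the directory that holds the ignore file) without loosening the anchor itself.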


Summary
Closes #2687.
Adds a `.trufflehogignore` file that trufflehog auto-discovers at each filesystem scan root, in the spirit of `.gitignore` / `.gitleaksignore`. Users can now commit ignore rules next to their code instead of maintaining a separate `--exclude-paths` regex file out-of-band.
This is the most-upvoted ergonomic ask in the repo (29 reactions on the issue) and brings parity with the `.gitleaksignore` pattern trufflehog users coming from gitleaks already know.
→ `trufflehog filesystem --directory .` excludes those paths from scanning automatically. No flag needed.
What's in this PR
Implementation notes
Verification
Test coverage (new)
Existing `TestFilterBasic` / `TestFilterFromFile` continue to pass unmodified.
Notes for review
Note
Medium Risk
Medium risk because it changes filesystem scan coverage by dynamically excluding paths based on repository-local ignore files, and introduces new glob-to-regex translation logic that could unintentionally over/under-match patterns.
Overview
Adds automatic discovery of `.trufflehogignore` at each filesystem scan root and merges its gitignore-style glob patterns into the existing exclude filter, with verbose logging of loaded ignore files.

Introduces a `globToRegex` converter plus ignore-file parsing that treats missing files as a no-op, dedupes roots, and fails fast on unsupported syntax (notably `!` negation and `[...]` character classes), along with new unit tests covering discovery behavior and glob semantics.