Add Shannon entropy analysis to detect obfuscated payloads #97
Open
contemas-tschmidt wants to merge 2 commits into
Adds --entropy flag that flags PHP files containing string literals with unusually high entropy (>= 5.3 bits/char by default), which is characteristic of encrypted, XOR-obfuscated or base64-packed payloads that evade signature-based detection.

New options:
  --entropy             Enable entropy-based scanning
  --entropy-threshold   Override threshold (default: 5.3 bits/char)

Targets quoted string literals >= 40 chars only, avoiding noise from short strings. Integrates cleanly with --no-stop so both pattern and entropy hits are reported per file.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Owner
This is a very interesting idea. I'll test it a bit on my malicious code samples before merging.
Adds --ast flag (requires nikic/php-parser via composer) that parses PHP files as an AST and flags constructs that evade signature-based detection:

- eval() with any non-literal argument (dynamic code execution)
- Dynamic function calls via variable: $func(...)
- create_function(): runtime code compilation
- assert() with dynamic argument (code execution in PHP < 8)
- preg_replace() with /e modifier (arbitrary code via regex)

The visitor is isolated in ast_visitor.php and loaded only at runtime after confirming vendor/autoload.php exists, so the scanner continues to work without any dependencies for users who do not install php-parser.

nikic/php-parser is listed under suggest (not require) to keep the PHP >= 5.3.0 minimum intact; v3 supports PHP 5.5+, v4 supports 7.0+, v5 supports 7.4+. composer.lock is added to .gitignore.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
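The AST approach described in this commit can be illustrated with a small analogy. The sketch below is not the PR's ast_visitor.php: it uses Python's standard ast module as a stand-in for nikic/php-parser, and the names SINKS, DynamicCodeVisitor and find_dynamic_code are hypothetical. It covers only the first check in the list (a code-execution call whose argument is not a literal), to show how a visitor can flag a construct by its shape rather than by a signature.

```python
import ast

# Hypothetical set of "code execution" call names for this Python analogy.
SINKS = {"eval", "exec"}


class DynamicCodeVisitor(ast.NodeVisitor):
    """Collects (line, name) pairs for sink calls with a non-literal argument."""

    def __init__(self):
        self.findings = []

    def visit_Call(self, node):
        if isinstance(node.func, ast.Name) and node.func.id in SINKS:
            # A literal first argument is fixed in the source; anything
            # else (variable, concatenation, call result) is built at runtime.
            if node.args and not isinstance(node.args[0], ast.Constant):
                self.findings.append((node.lineno, node.func.id))
        # Keep walking so nested calls are inspected too.
        self.generic_visit(node)


def find_dynamic_code(source):
    visitor = DynamicCodeVisitor()
    visitor.visit(ast.parse(source))
    return visitor.findings
```

Because the check works on the parsed tree, renaming variables or reformatting the call does not hide it, which is the property the commit relies on for payloads that evade signatures.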
Summary
- Adds a --entropy flag that flags PHP files containing string literals with unusually high Shannon entropy (≥ 5.3 bits/char by default)
- Adds a --entropy-threshold option to override the default threshold

Why entropy analysis?
Signature-based scanners only detect known malware. Attackers increasingly use:
High-entropy string literals are a reliable indicator of such techniques. A string of 60+ chars with entropy > 5.3 bits/char is almost certainly encoded/encrypted data.
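For illustration, the per-literal check could look roughly like the Python sketch below. It is not the PR's PHP code: the function names, the simplified regex for quoted literals (no heredoc/nowdoc or comment handling), and the constants are assumptions that only mirror the 5.3 bits/char threshold and 40-character minimum quoted in the commit message.

```python
import math
import re
from collections import Counter

# Assumed defaults, taken from the values quoted in this PR.
DEFAULT_THRESHOLD = 5.3  # bits per character
MIN_LENGTH = 40          # shorter literals are skipped

# Simplified matcher for single- or double-quoted literals with
# backslash escapes; a real PHP tokenizer would be more precise.
STRING_LITERAL = re.compile(
    r"'((?:\\.|[^'\\])*)'" r'|"((?:\\.|[^"\\])*)"',
    re.DOTALL,
)


def shannon_entropy(text):
    """Shannon entropy of the character distribution, in bits per character."""
    if not text:
        return 0.0
    n = len(text)
    return -sum((c / n) * math.log2(c / n) for c in Counter(text).values())


def high_entropy_literals(source, threshold=DEFAULT_THRESHOLD,
                          min_length=MIN_LENGTH):
    """Return (literal, entropy) pairs for long literals at or above threshold."""
    hits = []
    for match in STRING_LITERAL.finditer(source):
        literal = match.group(1) if match.group(1) is not None else match.group(2)
        if len(literal) < min_length:
            continue
        entropy = shannon_entropy(literal)
        if entropy >= threshold:
            hits.append((literal, entropy))
    return hits
```

One property worth keeping in mind when tuning the threshold: a string of n characters cannot exceed log2(n) bits/char, so a 40-character literal tops out at about 5.32. Near the minimum length, only literals with almost no repeated characters can reach 5.3; longer literals have more headroom.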
Integration
Fits cleanly into the existing scan() pipeline: runs after all pattern checks, respects --no-stop, and uses the same printPath() output format.

Works alongside pattern detection: if a file triggers both a signature and high entropy, both findings are reported (with --no-stop).

Test
🤖 Generated with Claude Code