perf: optimize all_zeros using fast bytes comparison#3078
Conversation
There was a problem hiding this comment.
Please add bug fixes, new features, breaking changes and anything else you think is worthwhile mentioning to the master (unreleased) section of CHANGELOG.md. If no CHANGELOG update is needed add the following to the PR description: [x] No CHANGELOG update needed
There was a problem hiding this comment.
Code Review
This pull request optimizes the all_zeros helper function by replacing a generator expression with a direct bytes comparison to leverage C-level performance. A review comment identifies a correctness issue where the new implementation fails to compare correctly against other bytes-like objects such as bytearray or memoryview. The reviewer suggests using not any(bytez) as a more robust, type-agnostic, and efficient alternative that also benefits from short-circuiting.
CHANGELOG updated or no update needed, thanks! 😄
williballenthin
left a comment
There was a problem hiding this comment.
nice.
thank you for:
- doing the benchmark, and
- including the inline comment so there's a record of the lesson
mr-tz
left a comment
There was a problem hiding this comment.
We may also need to check floss for these issues.
Optimization: Fast C-level check in
all_zerosDescription
In
capa/features/extractors/helpers.py:all_zeros, the code used a Python generator expressionall(b == 0 for b in ...)to check if a buffer was entirely composed of null bytes. This introduced significant overhead due to Python bytecode execution for each byte.Fix
Refactored the function to use
bytez == b'\x00' * len(bytez). This delegates the operation to the highly optimized C implementation of Python's bytes comparison.Trade-offs
capafor instruction and data references).len(bytez). However, sincecaparestricts the buffer size toMAX_BYTES_FEATURE_SIZE(256 bytes) in all calling contexts, the memory overhead is negligible (at most 256 bytes per call) and far outweighed by the performance gains.Checklist