perf: release the GIL during parsing #91
Merged
parse_email holds the GIL for the entire call, but the actual parse is pure
Rust and never touches the Python interpreter. Holding the GIL serializes all
concurrent parse_email calls onto a single core.
Wrap the parse in py.detach() (formerly allow_threads) so the GIL is released
for its duration. The byte payload is already an owned copy, so nothing borrows
from a Python object while the GIL is released; errors and the PyMail are built
after re-attaching, where the interpreter is required.
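The ordering described above (copy while attached, parse while detached, build results after re-attaching) can be modelled without PyO3. The following is a minimal sketch under stated assumptions: `FAKE_GIL` is a plain `std::sync::Mutex` standing in for the interpreter lock, and `parse`, `Parsed`, `ParseError`, and `parse_email_model` are hypothetical names for illustration, not the project's code or PyO3's API.

```rust
use std::sync::Mutex;

// Stand-in for the interpreter lock. This is NOT PyO3's GIL, only a model of it.
static FAKE_GIL: Mutex<()> = Mutex::new(());

// Hypothetical result types, for illustration only.
#[derive(Debug, PartialEq)]
struct Parsed {
    header_lines: usize,
}

#[derive(Debug, PartialEq)]
struct ParseError(String);

// Pure computation: it reads only its argument, so it is safe to run
// without holding the lock.
fn parse(bytes: &[u8]) -> Result<Parsed, ParseError> {
    if bytes.is_empty() {
        return Err(ParseError("empty message".to_string()));
    }
    let text = String::from_utf8_lossy(bytes);
    // Count the lines before the first blank line (the header block).
    let header_lines = text.lines().take_while(|l| !l.is_empty()).count();
    Ok(Parsed { header_lines })
}

fn parse_email_model(payload: &[u8]) -> Result<String, String> {
    // 1. Lock held: take an owned copy so nothing borrows from the caller.
    let guard = FAKE_GIL.lock().unwrap();
    let owned: Vec<u8> = payload.to_vec();
    drop(guard);

    // 2. Lock released: other threads can take the lock while this parse runs.
    let outcome = parse(&owned);

    // 3. Lock re-acquired: build the caller-facing value or error.
    let _guard = FAKE_GIL.lock().unwrap();
    match outcome {
        Ok(p) => Ok(format!("mail with {} header lines", p.header_lines)),
        Err(ParseError(msg)) => Err(msg),
    }
}
```

Calling `parse_email_model(b"Subject: hi\nFrom: a@b\n\nbody\n")` yields `Ok("mail with 2 header lines")`. Only steps 1 and 3 contend for the lock, which is why concurrent callers stop serializing on the parse itself.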
Single-thread latency is unchanged. Multi-threaded throughput on a 12-core box
(large_message.eml, best-of-3) goes from flat to near-linear:
threads   before   after
1         645/s    646/s
2         635/s    1247/s
4         637/s    2405/s
8         636/s    4691/s (7.4x)
No behavior change; all 91 correctness tests pass.
Signed-off-by: yuriyryabikov <22548029+kurok@users.noreply.github.com>
What
parse_email held the GIL for the entire call, but the parse itself (mailparse + base64/charset decoding) is pure Rust and never touches the Python interpreter. Holding the GIL serializes every concurrent parse_email call onto a single core.
This wraps the parse in py.detach() (PyO3 0.29's rename of allow_threads) so the GIL is released for its duration. The byte payload is already an owned copy (payload_to_bytes), so nothing borrows from a Python object while the GIL is released; the ParseError and PyMail are built after re-attaching, where the interpreter is needed.
Benchmark
Single-thread latency: unchanged (min 1.369 ms vs 1.369 ms baseline).
Multi-threaded throughput (large_message.eml, N threads × 300 parses, best-of-3, 12-core machine): baseline throughput is flat regardless of thread count (the GIL serializes parsing); after this change it scales near-linearly with cores (7.4x at 8 threads). This is the win that matters for any service parsing many emails concurrently.
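The PR does not show the harness behind these numbers. As a rough sketch of the same measurement shape (N threads, a fixed number of calls per thread, best of several runs), the following uses only the Rust standard library. `work` is a hypothetical stand-in for one parse call, and `throughput`, `best_of`, and `speedup` are illustrative names, not part of the project.

```rust
use std::hint::black_box;
use std::thread;
use std::time::Instant;

// Hypothetical workload standing in for a single parse call.
fn work(data: &[u8]) -> u64 {
    data.iter()
        .fold(0u64, |acc, &b| acc.wrapping_mul(31).wrapping_add(b as u64))
}

// Runs `threads` workers, each making `iters` calls, and returns calls per second.
fn throughput(threads: usize, iters: usize, data: &[u8]) -> f64 {
    let start = Instant::now();
    // Scoped threads are all joined before `scope` returns.
    thread::scope(|s| {
        for _ in 0..threads {
            s.spawn(|| {
                for _ in 0..iters {
                    // black_box keeps the optimizer from discarding the call.
                    black_box(work(black_box(data)));
                }
            });
        }
    });
    (threads * iters) as f64 / start.elapsed().as_secs_f64()
}

// Best-of-`runs`: keep the highest throughput observed across the runs.
fn best_of(runs: usize, threads: usize, iters: usize, data: &[u8]) -> f64 {
    (0..runs)
        .map(|_| throughput(threads, iters, data))
        .fold(0.0, f64::max)
}

// Ratio used for the "(7.4x)" style annotation.
fn speedup(before: f64, after: f64) -> f64 {
    after / before
}
```

With the reported 8-thread figures, `speedup(636.0, 4691.0)` is about 7.38, which matches the 7.4x in the commit message and corresponds to roughly 92% of ideal 8-way scaling.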
Risk
cargo clippy --release clean.
detach requires the closure and its return type to be Ungil (i.e. Send): Vec<u8>, mail_parser::Mail, and MailParseError are all Send, and no Py/Python value crosses the boundary, so this is enforced at compile time.
[…] Send across the GIL release). A zero-copy alternative would remove that copy but is mutually exclusive with releasing the GIL; for concurrent/server workloads, GIL release is the larger win.
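To illustrate how a Send bound turns the "nothing interpreter-bound crosses the boundary" rule into a compile-time check, here is a small std-only sketch. `run_detached` and `count_bytes` are hypothetical helpers that mirror the shape of the bound described above; they are not PyO3's actual signature, and Ungil is approximated here by Send.

```rust
use std::thread;

// Hypothetical helper: both the closure and its return value must be Send.
// The closure runs on another thread so that the bound is genuinely needed.
fn run_detached<F, T>(f: F) -> T
where
    F: FnOnce() -> T + Send,
    T: Send,
{
    thread::scope(|s| s.spawn(f).join().expect("worker panicked"))
}

// An owned Vec<u8> is Send, so moving it into the closure is accepted.
fn count_bytes(payload: Vec<u8>) -> usize {
    run_detached(move || payload.len())
}

// By contrast, capturing a non-Send value is rejected by the compiler:
//
//     let shared = std::rc::Rc::new(vec![1u8, 2, 3]);
//     run_detached(move || shared.len()); // error: `Rc<Vec<u8>>` cannot be sent
//                                         // between threads safely
```

The design point is that the check needs no tests or runtime assertions: a closure that captured an interpreter-bound handle would fail to build, in the same way the `Rc` example does here.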