perf(interpreter): add unlikely() hint to gas-block charge in run() hot loop#810
Open
zhoutianxia1 wants to merge 3 commits into
Open
perf(interpreter): add unlikely() hint to gas-block charge in run() hot loop#810zhoutianxia1 wants to merge 3 commits into
zhoutianxia1 wants to merge 3 commits into
Conversation
Two compile errors prevented grey from building on Windows: 1. build.rs: PathBuf::display() emits Windows backslash separators which Rust's lexer then interprets as escape sequences inside include_bytes! string literals (e.g. \j, \g, \o). Fix by replacing '\\' with '/' before embedding the path into the generated service_blobs.rs. 2. node.rs: tokio::signal::unix is not available on Windows. Wrap the SIGTERM, SIGUSR1, and SIGHUP signal registrations in #[cfg(unix)] and provide std::future::pending::<()>() placeholders on non-Unix so the select! arms compile and are simply never selected.
…ot loop Gas-block boundaries are sparse (~5-15% of instructions). The �b_gas_cost > 0 guard is almost always false, but without an explicit hint LLVM may not lay out the fall-through path optimally. Changes: - Add #![feature(core_intrinsics)] + #![allow(internal_features)] to crates/javm/src/lib.rs (project is already nightly-only per rust-toolchain.toml) - Import core::intrinsics::unlikely in interpreter/mod.rs - Wrap �b_gas_cost > 0 check with unlikely() so LLVM moves the charge block to a cold section, keeping the sequential hot path as fall-through Expected effect: 1-3% I-cache improvement on instruction-dense workloads (blake2b, keccak, ed25519). No change in semantics or gas values. Also add docs/interpreter-gas-branch-misprediction-analysis.md with full frequency analysis, cache impact study, and micro-architectural cost estimate. Refs: jarchain#400 (Optimize javm interpreter performance)
Contributor
Genesis ReviewComparison targets:
How to reviewPost a comment with the following format (rank from best to worst): Use the short commit hashes above and To meta-review another reviewer's comment, react with 👍 or 👎. |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Gas-block boundaries are sparse (~5-15% of instructions for typical workloads like blake2b, keccak, ed25519). The
bb_gas_cost > 0guard in therun()hot loop is almost always false, but without an explicit hint LLVM cannot prove this and may not lay out the fall-through path as the primary code stream.This PR adds
core::intrinsics::unlikely()around the guard so LLVM moves the gas-charge block to a cold section, keeping the sequential hot path as fall-through code - reducing I-cache footprint per iteration.Changes
grey/crates/javm/src/lib.rs: add#![feature(core_intrinsics)]+#![allow(internal_features)]. The project already requires nightly (rust-toolchain.tomlpins nightly edition 2024).grey/crates/javm/src/interpreter/mod.rs: importcore::intrinsics::unlikely; wrapbb_gas_cost > 0check.docs/interpreter-gas-branch-misprediction-analysis.md: full analysis including frequency study, cache impact, micro-architectural cost estimate, recommendations.Analysis
Key findings:
unlikelymoves the cold charge block out-of-line, shrinking the hot path cache footprintCorrectness
Zero change in semantics or gas values.
unlikelyis a pure code-generation hint.cargo check -p javmpasses with no errors and no warnings.Refs: #400