feat(memory): add single-fault-guard pointer-chain read primitives#87
Conversation
Add plausible_userspace_ptr, seh_resolve_chain, seh_read_chain<T>, and seh_read_chain_bytes for resolving and reading multi-level pointer chains under one fault guard. Include a hot-path memory usage guide, unit tests, and a standalone microbenchmark.
📝 WalkthroughWalkthroughThis PR adds guarded multi-level pointer-chain primitives and an inline pointer plausibility predicate, implements SEH/MinGW-safe chain resolution and chain-read helpers, introduces tests for chain semantics and non-default-constructible typed reads, and ships hot-path documentation plus a microbenchmark measuring validation/read costs and contention. ChangesPointer-chain resolution and hot-path memory access
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes Possibly related PRs
🚥 Pre-merge checks | ✅ 4✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 2
🧹 Nitpick comments (1)
tests/test_memory_chain.cpp (1)
16-170: ⚡ Quick winWrap this suite in a fixture.
This new file uses raw
TEST(...)cases, but the repo test contract fortests/**/*.cpprequires a::testing::Testsubclass withSetUp()/TearDown(). Even if the hooks are empty today, converting these toTEST_Fkeeps the suite consistent and gives you a safe place to reset memory state if future cases start touching the cache.As per coding guidelines: "
tests/**/*.cpp: Use GoogleTest framework. One test file per module:tests/test_<module>.cppmirrorssrc/<module>.cpp. Each suite must use a::testing::Testsubclass withSetUp()/TearDown()for temp file cleanup."🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@tests/test_memory_chain.cpp` around lines 16 - 170, The tests must be converted to use a GoogleTest fixture class (e.g., declare class MemoryTest : public ::testing::Test with SetUp() and TearDown() methods, even if empty) and replace all TEST(...) macros in this file (MemoryPlausiblePtr, MemorySehResolveChain, MemorySehReadChain cases) with TEST_F(MemoryTest, ...), so each case runs under the fixture and you have a place to reset memory state or caches later; ensure the fixture class is defined above the tests and referenced by each TEST_F invocation.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@include/DetourModKit/memory.hpp`:
- Around line 382-397: The template seh_read currently requires
std::is_default_constructible_v<T> and declares a T value, which prevents
reading trivially copyable but non-default-constructible types; change seh_read
(and the typed chained-read overloads) to only require
std::is_trivially_copyable_v<T>, allocate raw storage (e.g.,
std::array<std::byte, sizeof(T)> or std::aligned_storage_t<sizeof(T),
alignof(T)>) and call seh_read_bytes(addr, storage.data(), sizeof(T)), then
return std::bit_cast<T>(storage) on success (or std::nullopt on failure); remove
the default-constructible constraint and any direct T value declaration so the
byte-wise read works for non-default-constructible trivially copyable types.
In `@tests/bench_memory.cpp`:
- Around line 174-185: make_churn_pool can return an empty pool which causes
division-by-zero / invalid indexing in run_contention and run_probe; after
filling the pool in make_churn_pool check if pool.empty() and fail fast (e.g.,
throw a descriptive std::runtime_error or call std::exit after printing an
error) so callers never receive an empty vector; include a clear message
referencing the failed VirtualAlloc allocations so the benchmark setup error is
explicit.
---
Nitpick comments:
In `@tests/test_memory_chain.cpp`:
- Around line 16-170: The tests must be converted to use a GoogleTest fixture
class (e.g., declare class MemoryTest : public ::testing::Test with SetUp() and
TearDown() methods, even if empty) and replace all TEST(...) macros in this file
(MemoryPlausiblePtr, MemorySehResolveChain, MemorySehReadChain cases) with
TEST_F(MemoryTest, ...), so each case runs under the fixture and you have a
place to reset memory state or caches later; ensure the fixture class is defined
above the tests and referenced by each TEST_F invocation.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 994799ac-b0f9-4e1a-a7c5-08ce3bbe2408
📒 Files selected for processing (8)
AGENTS.mddocs/misc/hot-path-memory.mddocs/tests/README.mdinclude/DetourModKit/memory.hppsrc/memory.cpptests/CMakeLists.txttests/bench_memory.cpptests/test_memory_chain.cpp
- seh_read and both seh_read_chain<T> overloads now require only std::is_trivially_copyable_v<T>, read into raw std::array<std::byte> storage, and return via std::bit_cast, so non-default-constructible trivially-copyable types can be read. - Build the benchmarks with the library Release LTO so a std::thread-using bench links under MinGW (works around a GCC LTO multiple-definition bug on a mixed non-LTO/LTO link). - make_churn_pool fails fast when no pages can be allocated. - Add a non-default-constructible read test and a memory benchmark analysis doc.
There was a problem hiding this comment.
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
tests/bench_memory.cpp (1)
249-250:⚠️ Potential issue | 🟡 Minor | ⚡ Quick winAlign DIRECT-path comments with actual benchmark behavior.
These comments say DIRECT is “one
__tryper probe”, but the code does raw volatile dereferences withoutseh_*calls. Please reword to avoid overstating the measured path.Suggested wording update
- // predicate and reads under one __try per probe. This measures both, + // predicate and reads directly (raw dereference baseline). This measures both, @@ - // DIRECT = read inside one per-probe __try (the SEH fast path). + // DIRECT = raw reads (dereference baseline; no per-read validation gate).Also applies to: 432-433
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@tests/bench_memory.cpp` around lines 249 - 250, Update the inline comments that describe the DIRECT path so they accurately reflect the implementation: change any claim like "one __try per probe" to indicate that DIRECT does raw volatile dereferences (no seh_* calls or SEH __try/__except usage) and therefore measures the non-SEH/unchecked access path; apply this wording change in the comment blocks around the DIRECT-path descriptions (the comment near the volatile dereference usage and the corresponding second comment later in the file).
🧹 Nitpick comments (1)
include/DetourModKit/memory.hpp (1)
393-396: ⚡ Quick winInitialize
storageat declaration.
std::array<std::byte, sizeof(T)> storage;is left uninitialized here (and identically at Line 489). It is not a correctness bug today becausebit_castonly runs on the success path whereseh_read_bytesfully populates the buffer, but it deviates from the project's initialization rule.Note
seh_read_bytesis an opaque (non-inline) call, so the zero-init won't be elided on this benchmarked hot path. If the few-byte cost is unacceptable, keep it uninitialized but add a//comment documenting the deliberate omission and the full-write contract.♻️ Brace-initialize the storage buffers
- std::array<std::byte, sizeof(T)> storage; + std::array<std::byte, sizeof(T)> storage{}; if (!seh_read_bytes(addr, storage.data(), sizeof(T)))Apply the same change at Line 489:
- std::array<std::byte, sizeof(T)> storage; + std::array<std::byte, sizeof(T)> storage{}; if (!seh_read_chain_bytes(base, offsets, storage.data(), sizeof(T)))As per coding guidelines: "Do not add uninitialized variables -- always initialize at declaration with brace syntax or assignment."
Also applies to: 489-492
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@include/DetourModKit/memory.hpp` around lines 393 - 396, The local std::array buffer (storage) must be initialized at declaration to satisfy the project's rule against uninitialized variables; change the two declarations of std::array<std::byte, sizeof(T)> storage; to brace-initialize (e.g., std::array<std::byte, sizeof(T)> storage{};) so storage is zero-initialized before calling seh_read_bytes and then std::bit_cast<T>(storage) is safe; if you intentionally want to avoid the zeroing for performance, instead leave it uninitialized but add a clear comment next to the declaration documenting the deliberate omission and the full-write contract guaranteed by seh_read_bytes and why that makes it safe.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Outside diff comments:
In `@tests/bench_memory.cpp`:
- Around line 249-250: Update the inline comments that describe the DIRECT path
so they accurately reflect the implementation: change any claim like "one __try
per probe" to indicate that DIRECT does raw volatile dereferences (no seh_*
calls or SEH __try/__except usage) and therefore measures the non-SEH/unchecked
access path; apply this wording change in the comment blocks around the
DIRECT-path descriptions (the comment near the volatile dereference usage and
the corresponding second comment later in the file).
---
Nitpick comments:
In `@include/DetourModKit/memory.hpp`:
- Around line 393-396: The local std::array buffer (storage) must be initialized
at declaration to satisfy the project's rule against uninitialized variables;
change the two declarations of std::array<std::byte, sizeof(T)> storage; to
brace-initialize (e.g., std::array<std::byte, sizeof(T)> storage{};) so storage
is zero-initialized before calling seh_read_bytes and then
std::bit_cast<T>(storage) is safe; if you intentionally want to avoid the
zeroing for performance, instead leave it uninitialized but add a clear comment
next to the declaration documenting the deliberate omission and the full-write
contract guaranteed by seh_read_bytes and why that makes it safe.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 870dabb6-e0b6-42fa-bbc7-99c12c345413
📒 Files selected for processing (7)
AGENTS.mddocs/analysis/memory_bench_v3.x/README.mddocs/misc/hot-path-memory.mdinclude/DetourModKit/memory.hpptests/CMakeLists.txttests/bench_memory.cpptests/test_memory_chain.cpp
✅ Files skipped from review due to trivial changes (2)
- docs/misc/hot-path-memory.md
- AGENTS.md
🚧 Files skipped from review as they are similar to previous changes (2)
- tests/test_memory_chain.cpp
- tests/CMakeLists.txt
What
New
Memoryprimitives for resolving and reading multi-level (Cheat-Engine-style) pointer chains under one fault guard:plausible_userspace_ptr(p)-- constexpr x64 user-mode pointer check (no syscall, no memory access)seh_resolve_chain(base, {offsets...})-- resolve a chain to its final addressseh_read_chain<T>(base, {offsets...})-- resolve and read a typed valueseh_read_chain_bytes(base, {offsets...}, out, n)-- resolve and read a raw rangeThe whole walk runs in one fault guard (a single
__tryon MSVC, VirtualQuery-guarded per link on MinGW). Intermediate links are plausibility-screened, so a faulting or implausible link aborts the walk and returnsnullopt/false.Why
Gating every hot-path read with
is_readablecosts a lock plus a possibleVirtualQueryper field and is a time-of-check/time-of-use illusion. These primitives read directly under one guard instead. Usage guidance indocs/misc/hot-path-memory.md.Benchmark
tests/bench_memory.cpp, MSVC 2022 release (/O2,-DDMK_BUILD_BENCHMARKS=ON), 200k iterations x 15 samples, median per call. One machine, illustrative; run it for your own target.Per-call cost (warm cache where applicable):
read_ptr_uncheckedseh_read<u64>is_readablewarm HITis_readablecold MISS (VirtualQuery)Pointer chain, 6 links, warm cache:
seh_read_chain<u64>(one guard)is_readablebefore each)The chain primitive is ~29x faster than gating each link, and a single SEH-guarded read (7.4 ns) is within ~2x of a raw load because the MSVC
__tryis table-driven.Hot-path probe model (8 reads across ~3 cache-missing objects), per probe:
is_readableper read)Gating costs ~69x the mean and a much worse tail (p99 10500 ns vs 400 ns). At 256 such probes per frame that is 9.4% of a 16.67 ms frame gated vs 0.14% direct.
Summary by CodeRabbit
New Features
Documentation
Tests