Summarize getter storage/mem-copy loops as (Storage)CopySlice#1063
Open
msooseth wants to merge 1 commit into
Open
Summarize getter storage/mem-copy loops as (Storage)CopySlice#1063msooseth wants to merge 1 commit into
msooseth wants to merge 1 commit into
Conversation
031373c to
0f3765a
Compare
4 tasks
0f3765a to
825350b
Compare
b5cda57 to
825350b
Compare
add missing GetterDetection.hs (fixup for 79720af) The previous commit referenced EVM.GetterDetection from hevm.cabal and src/EVM.hs but the file itself was not staged, leaving the branch unbuildable from a fresh checkout. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> tests: cover string/array public getters Extend the GetterDetection test group with end-to-end cases for the other public-getter shapes the issue (#697) is concerned with: - string public name — same SLOAD->MSTORE shape as bytes - uint256[] public arr — no copy loop; detection must not mis-fire - uint8[10] public arr — fixed-size; no copy loop either - string[] public strs — element-wise string copy still loops Also adds TODO-storagecopyslice.md tracking the residual work on this branch. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> tests: soundness check for the bytes-getter summary Adds 'bytes-getter-summary-sound' which exercises the same contract as the existing no-partial test but additionally asserts: - no Cex results from checkAssert (no false-positive panics introduced by the summary); - no Error results (no StorageCopySlice with symbolic numWords leaking to exprToSMT, which would currently surface as a Left). Catches both a wrong summary (false Cex) and the SMT fallback gap called out in TODO-storagecopyslice.md task #3. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> SMT: cap StorageCopySlice unroll and align error wording with CopySlice Two small SMT-encoder changes around StorageCopySlice: - Add an explicit unroll cap (4096 words) for the (Lit numWords) case so a pathologically large concrete count can't blow up the encoder before SMT even sees the query. - Use the same error wording as CopySlice for the symbolic-size case, making it clear this is the same limitation that already exists for CopySlice rather than a separate bug. A real fix for the symbolic-numWords case (over-approximating to a fresh AbstractBuf so it surfaces as Partial instead of Error) would need to touch the buf-discovery preprocessing — same fix that CopySlice would need. Tracked in TODO-storagecopyslice.md. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> SymExec: fail loudly when summary can't find the executing contract The fallback to AbstractStore(LitAddr 0) silently aimed the summary at slot 0 of a wrong address when the executing contract was missing from env.contracts. That's a real-bug-masker (wrong-answer-not-crash) for a case that, by EVM-state invariants, can't actually happen. Replace it with internalError so we crash loudly if the invariant is ever broken. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> GetterDetection: drop unreachable empty-stack arms The inner case-expressions for OpPop/OpMload/OpMstore/OpMstore8/OpSload/ OpSwap all had `[] -> Nothing` arms that GHC was flagging as redundant: the outer `go{Mem,Stor} _ [] _ _ _ = Nothing` clause already catches the empty-stack case, so by the time we reach the inner case `stk` is provably non-empty. Removes 7 redundant-pattern warnings. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> docs: changelog + common-options entry for --skip-getter-loops Add a CHANGELOG line under [Unreleased]/Added and a documentation section in doc/src/common-options.md describing what the flag does, when to reach for it, and the conservative detector scope. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> Let's add TODO. Trun on skipGetterLoops SymExec: demote symbolic-size CopySlice leaves to Partial Adds a new PartialExec variant, SymbolicCopySliceSize, and a check in verifyInputsWithHandler that, after simplifying the leaf and its SMT props, looks for any CopySlice or StorageCopySlice whose size is not a Lit. If found, the leaf is replaced with a Partial(SymbolicCopySliceSize) before checkSatWithProps runs, so the case surfaces as a graceful Partial rather than a hard Error from exprToSMT's "symbolically sized region not currently implemented" path. Closes task #3 in TODO-storagecopyslice.md. Also adds two cross-contract tests: - cross-contract-bytes-getter-with-detection C deploys A, seeds A's length slot via inline assembly (so the long-string branch of myBytes() iterates symbolically), then cross-calls a.myBytes() and asserts the decoded length. With skipGetterLoops=True the summary fires; the ReadWord for b.length is at a symbolic offset against memory containing symbolic-offset WriteWords and a symbolic-size CopySlice, which the simplifier cannot fold without interval analysis. The new demotion kicks in and we verify: no Cex (no false-positive panic) and no Error. - cross-contract-bytes-getter-without-detection Same setup with skipGetterLoops=False and a small maxIter; the symbolic getter loop must blow the budget and surface as Partial. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> Expr: sound disjointness rule for readWord/WriteWord with bounded offset Adds a new readWord clause that elides a WriteWord whose offset is of the form `Add (Lit c) X`, when X has a derivable static upper bound and the write region is provably disjoint from the read in EVM (mod 2^256) arithmetic. Backed by a new conservative `upperBound :: Expr EWord -> Maybe W256` helper that handles Lit, And-with-Lit, Mod-by-Lit, Div-by- Lit, Mul/Add with one Lit operand, and SHR by a Lit shift. The disjointness check uses Integer arithmetic to detect wrap on both the read and write byte ranges before concluding the write can be skipped, so the rule does not assume non-wrapping addition. Tests: - 5 local Expr unit tests in memoryTests: * positive: WriteWord at Add(Lit, And-bounded) skipped * positive: WriteWord at Add(Lit, Mul-And-bounded) skipped * negative: unbounded X — not skipped * negative: wrap-risk near maxBound — not skipped * negative: read region overlaps write min offset — not skipped - 1 end-to-end contract test: a function that reads from `below[0]` at 0x80 after writing to `arr[idx]` at `0xa0 + 32 * (And 0xff idx)`. No Partial / Cex / Error. Also updates three pre-existing tests that asserted the old "symbolic-size CopySlice surfaces as Error" behaviour to instead assert the new task-#3 behaviour (graceful Partial with SymbolicCopySliceSize). Both branches of the relevant assert paths carry the symbolic-size buf in their constraints, so the path count is 2. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> GetterDetection: refuse detection on layout-extraction failure If extractMemStackLayout / extractStorageStackLayout returned Nothing, analyseBody would still insert a CopyLoop with a Nothing layout; the runtime then applied a no-op summary but still took the exit branch, silently skipping the loop body with stale memory. Pull layout extraction into the detection's Maybe chain so detection fails when extraction fails, and turn the Nothing arms in the apply* functions into internalError to make any future regression loud rather than silent. Also: - Flip --skip-getter-loops to --no-skip-getter-loops so the CLI matches the library default (on). Update the docs accordingly. - Update the two hand-crafted GetterDetection unit tests so the comparison includes the original loop bound at stack[2] (DUP1 DUP4 instead of DUP2 DUP2), matching how Solidity actually generates these loops. Without this, both compare operands were Computed and the now-strict layout extractor (correctly) rejected them. - Drop the leftover Data.List import in SymExecTests now that getResError-based assertions are gone. - CHANGELOG: trim the storage-copy-slice entry and add bullets for the readWord disjointness rule and the symbolic-size->Partial demotion. - Remove TODO-storagecopyslice.md and storagecopyslice.md scratchpads from the repo (and the two test comments referencing them). - Dedupe a doubled paragraph in verifyInputsWithHandler. Condense comments and add bytes[] getter-loop Foundry tests Trim verbose multi-line comments across the storagecopyslice work to one- or two-liners; no behavioral change. Rename a pattern binding in simplifyNoLitToKeccak for readability. Add test/contracts/pass/getterLoopBytesArray.t.sol with four Forge-style scenarios exercising the SLOAD->MSTORE copy loop in auto-generated `bytes[] public` getters (single-contract, cross-contract consumer, NFT registry, Merkle-proof reader), and wire them into FoundryTests. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> Cleanup Expr: fold single-word StorageCopySlice to WriteWord simplifyNoLitToKeccak already collapses a zero-word StorageCopySlice to its dst. A numWords=1 StorageCopySlice is, by definition, a single WriteWord of SLoad(baseSlot, store) at dstOff -- so rewrite it as such. Sound for symbolic baseSlot/dstOff and abstract store, and it turns a StorageCopySlice node (which exprToSMT only handles via a bounded unroll) into a WriteWord-of-SLoad that encodes and folds cleanly. Mirrors the existing 32-byte CopySlice -> WriteWord rule in the copySlice smart constructor. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> SMT: lower storageCopySliceUnrollLimit from 4096 to 32 A StorageCopySlice that reaches exprToSMT with a concrete numWords is unrolled into a WriteWord(SLoad) chain. 4096 words is 8192 SMT nodes for a single slice -- well past the point where the query is tractable. In practice readByte/readWord fold concrete-offset reads before encoding, so a slice this large reaching the encoder signals an intractable query anyway; refusing it early is better than emitting it. No tests depend on the old value; the getter-loop tests all copy 2-3 words. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> GetterDetection: deduplicate the stack-layout extractors extractMemStackLayout and extractStorageStackLayout were ~90 lines each and near-identical -- the only real difference was MLOAD vs SLOAD and the result record's field names. Merge the abstract-stack simulation into a single extractStackLayout parameterised by the load opcode, returning a plain (load, bound, store) depth triple that the two thin wrappers rename into LoopStackInfo / StorageLoopStackInfo. Also tighten the rest of the module, all behaviour-preserving: - buildJumpdestSet: a one-line list comprehension - findBackwardJumps: a list comprehension instead of manual index recursion + a checkPrev helper - analyseBody: factor the two near-identical CopyLoop literals into a local mkLoop - fold duplicated OpJump/OpJumpi branches via a shared isJumpOp, and the five comparison opcodes via isCmpOp Net ~120 lines smaller; GetterDetection test group unchanged and green. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> GetterDetection: drop mem-to-mem loop summarization Storage-to-mem detection is sufficient for the public-getter case this PR targets; keeping the mem-to-mem branch added a parallel code path (LoopKind, memStackLayout, applyMemCopyLoopSummary) with no validated in-tree need. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
825350b to
2aba41a
Compare
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Fixes #697 — public-getter copy loops blowing up symbolic execution.
What
Expr BufnodeStorageCopySlicewith read-through rules, SMT unroll (capped 4096 words), CSE/Format/Traversals.EVM.GetterDetection: detects Solidity SLOAD→MSTORE and MLOAD→MSTORE copy loops (both backward-JUMPI and backward-JUMP+forward-exit-JUMPI shapes). Refuses detection if stack-layout extraction fails.SymExecconsultsContract.getterLoopsat JUMPI, takes the exit branch, and replaces memory with a singleCopySlice/StorageCopySlicesummary node so concrete-offset reads resolve directly.PartialExecvariantSymbolicCopySliceSize: symbolic-sizeCopySlice/StorageCopySliceleaves now surface asPartialinstead of a hard SMTError.readWorddisjointness rule: skips aWriteWordatAdd (Lit c) XwhenXhas a derivable static upper bound and byte ranges are provably disjoint mod 2^256.--no-skip-getter-loops(library default is on).Tests
StorageCopySliceread-through unit tests.bytes,string,string[]collapse to non-Partial;uint256[],uint8[10]confirm no mis-fire.Cexand noError.Checklist
doc/src/common-options.md)