merger: add optional validation of one-block files before merging #150
Closed
YaroShkvorets wants to merge 1 commit into
The streaming bundle reader byte-concatenates one-block files into the merged bundle without any decoding, so a one-block file with a corrupted block payload is silently propagated into the merged-blocks store, poisoning every downstream consumer of the bundle (index-builder, firehose, substreams, ...).

This adds a --merger-validate-one-block-files flag (default false, preserving the existing pure-streaming behavior). When enabled, each one-block file is buffered (one at a time, memory stays bounded to a single block) and validated before being written out: the DBIN framing and pbbstream.Block envelope must decode, the payload must be non-empty and valid protobuf wire-format (checked generically, without knowledge of the chain-specific block type), and the file must contain exactly one block.

A corrupted file fails the merge with an error naming the file, so the operator can re-extract it instead of discovering the corruption much later in a consumer.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
843aa9a to 3376b73
Problem
The merger's streaming bundle reader byte-concatenates one-block files into the merged bundle without any decoding. A one-block file containing a corrupted block payload (e.g. produced by a faulty reader node) is silently propagated into the merged-blocks store, where it later poisons every consumer of the bundle (index-builder, firehose, substreams), typically surfacing much later as a bare "proto: cannot parse invalid wire-format data" error (see #149).

We hit this in production on a MegaETH deployment: a single block with an invalid payload inside an otherwise-healthy 100-block bundle (valid zstd, valid DBIN framing, valid block envelope; only the inner payload was garbage).
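To make the failure mode concrete, here is a minimal sketch of a pure byte-concatenating merge. It is not the merger's actual code: `concatBytes` and the sample byte slices are hypothetical, and real one-block files carry DBIN framing that is ignored here. The point is only that a copy which never decodes its inputs cannot notice a garbage payload.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
)

// concatBytes mimics a pure-streaming merge (hypothetical helper): every
// input is copied into the bundle byte-for-byte, in order, with no decoding.
func concatBytes(files ...[]byte) ([]byte, error) {
	readers := make([]io.Reader, 0, len(files))
	for _, f := range files {
		readers = append(readers, bytes.NewReader(f))
	}
	var bundle bytes.Buffer
	if _, err := io.Copy(&bundle, io.MultiReader(readers...)); err != nil {
		return nil, err
	}
	return bundle.Bytes(), nil
}

func main() {
	good := []byte{0x0a, 0x02, 'o', 'k'} // a well-formed length-delimited field
	corrupt := []byte{0xff, 0xff, 0xff}  // garbage standing in for a bad payload

	bundle, err := concatBytes(good, corrupt)
	// The copy succeeds: nothing in this path ever looks at the bytes.
	fmt.Println(len(bundle), err)
}
```

The merge reports success even though the second input is not decodable, which is why the corruption only shows up when a consumer finally parses the bundle.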
Change
New --merger-validate-one-block-files flag, default false: the existing pure-streaming behavior is preserved unless explicitly opted in.

When enabled, each one-block file is buffered (one file at a time, so memory stays bounded to a single block) and validated before being written into the bundle:
- the DBIN framing and pbbstream.Block envelope must decode
- the payload must be non-empty
- the payload must be valid protobuf wire-format, checked generically by unmarshaling into emptypb.Empty with DiscardUnknown, so no knowledge of the chain-specific block type is needed
- the file must contain exactly one block

A corrupted file fails the merge with an error naming the file and block, so the operator can re-extract the block instead of discovering the corruption much later in a downstream consumer.
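The buffer-then-validate flow can be sketched roughly as follows. This is an illustration under stated assumptions, not the code in this PR: the PR checks the payload through emptypb.Empty with DiscardUnknown, whereas this sketch swaps in a hand-rolled, standard-library-only walk over the top-level protobuf fields so that it runs without dependencies. `checkWireFormat`, `appendValidated`, the file names, and the error wording are all hypothetical; the sketch treats the whole file as the payload (no DBIN framing or block envelope) and rejects the deprecated group wire types that a real protobuf parser would accept.

```go
package main

import (
	"bytes"
	"encoding/binary"
	"errors"
	"fmt"
	"io"
)

var errTruncated = errors.New("field value runs past end of payload")

// checkWireFormat walks the top-level fields of a serialized protobuf message
// and returns the first framing error. It needs no knowledge of the schema.
func checkWireFormat(b []byte) error {
	if len(b) == 0 {
		return errors.New("empty payload")
	}
	for len(b) > 0 {
		tag, n := binary.Uvarint(b)
		if n <= 0 {
			return errors.New("malformed field tag")
		}
		b = b[n:]
		if tag>>3 == 0 {
			return errors.New("field number 0 is not allowed")
		}
		switch wireType := tag & 7; wireType {
		case 0: // varint
			_, vn := binary.Uvarint(b)
			if vn <= 0 {
				return errors.New("malformed varint value")
			}
			b = b[vn:]
		case 1: // 64-bit fixed
			if len(b) < 8 {
				return errTruncated
			}
			b = b[8:]
		case 2: // length-delimited: varint length, then that many bytes
			size, ln := binary.Uvarint(b)
			if ln <= 0 {
				return errors.New("malformed length prefix")
			}
			b = b[ln:]
			if size > uint64(len(b)) {
				return errTruncated
			}
			b = b[size:]
		case 5: // 32-bit fixed
			if len(b) < 4 {
				return errTruncated
			}
			b = b[4:]
		default: // groups (3, 4) are not handled in this sketch; 6 and 7 are invalid
			return fmt.Errorf("unsupported wire type %d", wireType)
		}
	}
	return nil
}

// appendValidated buffers a single file, validates it, and only then writes it
// to the bundle, so a bad file never reaches the output.
func appendValidated(out io.Writer, name string, file io.Reader) error {
	payload, err := io.ReadAll(file)
	if err != nil {
		return fmt.Errorf("reading %s: %w", name, err)
	}
	if err := checkWireFormat(payload); err != nil {
		return fmt.Errorf("validating %s: %w", name, err)
	}
	_, err = out.Write(payload)
	return err
}

func main() {
	var bundle bytes.Buffer
	fmt.Println(appendValidated(&bundle, "good.bin", bytes.NewReader([]byte{0x08, 0x96, 0x01})))
	fmt.Println(appendValidated(&bundle, "bad.bin", bytes.NewReader([]byte{0x12, 0x05, 'h', 'i'})))
	fmt.Println(bundle.Len()) // only the validated file's bytes were written
}
```

Because only one file is held in memory at a time, the cost of validation stays bounded by the largest single block rather than by the bundle size.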
Testing
- validateOneBlockFile (corrupt payload, empty payload, truncated file, trailing data)
- TestStreamingBundleReader_ReadSimpleFiles runs both modes
- TestStreamingBundleReader_CorruptPayloadPassesWithoutValidation
- MergeAndStore failure path covered by TestMergerIO_MergeUploadCorruptBlockWithValidation

🤖 Generated with Claude Code