feat: add firehose protobuf field annotations and fireproto package#157

Draft
maoueh wants to merge 9 commits into develop from feature/tagging-chain-proto-fields

Conversation

@maoueh

@maoueh maoueh commented Jun 11, 2026

Contributor

Summary

  • Add pb/firehose/options.proto defining two FieldOptions extensions: (firehose.transactions) marks the repeated field holding block transactions, (firehose.nondeterministic) marks fields that may differ between nodes (e.g. gas used, fees)
  • Add fireproto package with WalkNonDeterministicFields, ClearNonDeterministicFields, and FindTransactionsField — enables generic tooling to locate/clear tagged fields without chain-specific knowledge
  • Pin github.com/streamingfast/protox to published commit cd8a8cf which adds WalkMessageInstanceFields (instance-level DFS walker with cycle prevention)
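The "instance-level DFS walker with cycle prevention" mentioned above can be pictured with a small stdlib-only sketch. This is not the protox implementation, which is not shown in this PR; `Node` and `Walk` are hypothetical names standing in for message instances and the walker. It only illustrates the general idea of tracking visited instances by pointer identity so that a self-referencing graph does not recurse forever.

```go
package main

import "fmt"

// Node is a hypothetical stand-in for a message instance that can
// reference other instances, possibly forming a cycle.
type Node struct {
	Name     string
	Children []*Node
}

// Walk visits every node reachable from root exactly once, depth-first.
// The seen set is keyed by pointer identity; it is what stops the
// recursion when a node is reachable through more than one path.
func Walk(root *Node, visit func(n *Node)) {
	seen := make(map[*Node]bool)
	var dfs func(n *Node)
	dfs = func(n *Node) {
		if n == nil || seen[n] {
			return
		}
		seen[n] = true
		visit(n)
		for _, c := range n.Children {
			dfs(c)
		}
	}
	dfs(root)
}

func main() {
	block := &Node{Name: "block"}
	tx := &Node{Name: "tx"}
	block.Children = []*Node{tx}
	tx.Children = []*Node{block} // cycle back to the root

	Walk(block, func(n *Node) { fmt.Println(n.Name) })
}
```

Without the `seen` check, the `block -> tx -> block` reference would recurse until the stack overflows.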

How annotations look on a chain-specific Block proto

A chain team annotates their generated proto like this:

syntax = "proto3";
package sf.example.type.v1;

import "sf/firehose/type/v1/options.proto";

message Block {
  uint64 number      = 1;
  bytes  hash        = 2;
  bytes  parent_hash = 3;

  // Exactly one field gets (firehose.transactions) — lets generic tooling
  // locate transactions without knowing the chain field name or number.
  repeated Transaction transactions = 4 [(firehose.transactions) = true];

  // Fields tagged nondeterministic are cleared before deterministic comparisons
  // (e.g. diffing two nodes' views of the same block).
  uint64 gas_used      = 5 [(firehose.nondeterministic) = true];
  bytes  fees_burned   = 6 [(firehose.nondeterministic) = true];
  string validator_tag = 7 [(firehose.nondeterministic) = true];
}

message Transaction {
  bytes  hash      = 1;
  uint64 gas_limit = 2;

  // Per-node fee estimation may differ.
  bytes gas_fee = 3 [(firehose.nondeterministic) = true];
}

Generic consumer code (no chain knowledge required):

block := &examplev1.Block{ /* ... */ }

// Find transactions without knowing field name/number:
fd := fireproto.FindTransactionsField(block)
txs := block.ProtoReflect().Get(fd).List() // txs.Len() == len(block.Transactions)

// Clear node-specific noise before comparison:
a, b := fetchFromNode1(), fetchFromNode2()
fireproto.ClearNonDeterministicFields(a)
fireproto.ClearNonDeterministicFields(b)
// a and b now differ only on content, not per-node metadata
assert.True(t, proto.Equal(a, b)) // proto.Equal ignores internal message state that assert.Equal would also compare

Test plan

  • go test ./fireproto/... — 5 tests covering find, walk, clear, and no-annotation paths
  • go build ./... — no compilation errors
  • Verify pb/firehose/options.pb.go exports E_Transactions and E_Nondeterministic extension vars

maoueh added 8 commits June 10, 2026 10:18
Add replace directive pointing github.com/streamingfast/protox to the
local branch at /Users/maoueh/work/sf/protox which contains the new
WalkMessageInstanceFields function needed for tagging chain proto fields.
…c proto fields

Adds the fireproto package with WalkNonDeterministicFields, ClearNonDeterministicFields,
and FindTransactionsField utilities that use the firehose field option extensions to
locate and manipulate annotated proto fields without chain-specific knowledge.
@sduchesneau

sduchesneau commented Jun 11, 2026

Contributor

🔍 Vulnerabilities of ghcr.io/streamingfast/firehose-core:a71ab1f-amd64

📦 Image Reference ghcr.io/streamingfast/firehose-core:a71ab1f-amd64
digest: sha256:57fa5481fb45e4349f86b07583d62db41f47c5667e14f0922b1579f6d944a1cb
vulnerabilities: critical: 0, high: 0, medium: 0, low: 0
platform: linux/amd64
size: 204 MB
packages: 505
📦 Base Image ubuntu:24.04
also known as:
  • c136481b8f4cd58cb213a22ee358ea1047bc963a338c61cdccb518e233409f86
  • noble
  • noble-20260509.1
digest: sha256:023f8a753c22258c9fe2d0005a7d28258038da7d620e9f93e9ad78aa266f9f11
vulnerabilities: critical: 0, high: 1, medium: 26, low: 11

Copilot AI left a comment


Pull request overview

This PR introduces custom protobuf field annotations for Firehose block semantics and adds a fireproto helper package to discover and clear annotated fields generically (without chain-specific logic). It also pins github.com/streamingfast/protox to a commit that provides an instance-level message field walker used by the new utilities.

Changes:

  • Add pb/firehose/options.proto (+ generated options.pb.go) defining (firehose.transactions) and (firehose.nondeterministic) FieldOptions extensions.
  • Add new fireproto package utilities: WalkNonDeterministicFields, ClearNonDeterministicFields, and FindTransactionsField (+ tests/docs).
  • Update dependencies (protox pin, zap bump) and document the feature in CHANGELOG.md.

Reviewed changes

Copilot reviewed 7 out of 9 changed files in this pull request and generated 4 comments.

Summary per file:

| File | Description |
| --- | --- |
| pb/firehose/options.proto | Defines custom FieldOptions extensions for transactions and nondeterministic fields. |
| pb/firehose/options.pb.go | Generated Go bindings exporting the extension descriptors. |
| go.mod | Adds github.com/streamingfast/protox and bumps go.uber.org/zap. |
| go.sum | Adds checksums for protox and updates zap checksums. |
| fireproto/walker.go | Implements generic discovery/walk/clear utilities using protox and the new annotations. |
| fireproto/walker_test.go | Adds tests using dynamic descriptors to validate find/walk/clear behavior. |
| fireproto/log_test.go | Initializes package logger for tests (consistent with repo patterns). |
| fireproto/doc.go | Package documentation and usage examples for the new utilities. |
| CHANGELOG.md | Documents the new proto options and fireproto package under Unreleased. |
Files not reviewed (1)
  • pb/firehose/options.pb.go: Language not supported


Comment thread fireproto/walker_test.go
Comment on lines +25 to +46
txFDP := &descriptorpb.FileDescriptorProto{
Name: new("fireproto_test_tx.proto"),
Syntax: new("proto3"),
Package: new("fireproto.test"),
Options: &descriptorpb.FileOptions{GoPackage: new("fireproto/test;fireprototest")},
MessageType: []*descriptorpb.DescriptorProto{
{
Name: new("Tx"),
Field: []*descriptorpb.FieldDescriptorProto{
{
Name: new("hash"),
Number: new(int32(1)),
Type: descriptorpb.FieldDescriptorProto_TYPE_STRING.Enum(),
Label: descriptorpb.FieldDescriptorProto_LABEL_OPTIONAL.Enum(),
JsonName: new("hash"),
},
{
Name: new("fee"),
Number: new(int32(2)),
Type: descriptorpb.FieldDescriptorProto_TYPE_BYTES.Enum(),
Label: descriptorpb.FieldDescriptorProto_LABEL_OPTIONAL.Enum(),
JsonName: new("fee"),
Comment thread fireproto/walker_test.go
Comment on lines +59 to +63
Name: new("fireproto_test_block.proto"),
Syntax: new("proto3"),
Package: new("fireproto.test"),
Dependency: []string{"fireproto_test_tx.proto"},
Options: &descriptorpb.FileOptions{GoPackage: new("fireproto/test;fireprototest")},
Comment thread fireproto/walker_test.go
Comment on lines +65 to +74
{
Name: new("Block"),
Field: []*descriptorpb.FieldDescriptorProto{
{
Name: new("transactions"),
Number: new(int32(1)),
Type: descriptorpb.FieldDescriptorProto_TYPE_MESSAGE.Enum(),
Label: descriptorpb.FieldDescriptorProto_LABEL_REPEATED.Enum(),
TypeName: new(".fireproto.test.Tx"),
JsonName: new("transactions"),
Comment thread fireproto/walker_test.go
Comment on lines +81 to +86
{
Name: new("gas"),
Number: new(int32(2)),
Type: descriptorpb.FieldDescriptorProto_TYPE_BYTES.Enum(),
Label: descriptorpb.FieldDescriptorProto_LABEL_OPTIONAL.Enum(),
JsonName: new("gas"),

Contributor Author


This is a new Go feature that lets you create a pointer directly to a literal value, added in Go 1.26. Aren't we compatible already?
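For context on the compatibility question: Go 1.26 lets the built-in `new` take an expression, so `new("hash")` yields a `*string`. On older toolchains the same effect is usually obtained with a small generic helper. The sketch below is illustrative only; `ptr` and `FieldSpec` are hypothetical names, not part of this PR, and the block compiles on any Go version with generics (1.18+).

```go
package main

import "fmt"

// ptr returns a pointer to a copy of v. With a Go 1.26 toolchain,
// new(v) can be written instead and this helper is unnecessary.
func ptr[T any](v T) *T {
	return &v
}

// FieldSpec is a hypothetical struct with optional (pointer) fields,
// loosely shaped like a descriptor's Name/Number pair.
type FieldSpec struct {
	Name   *string
	Number *int32
}

func main() {
	f := FieldSpec{
		Name:   ptr("hash"),   // Go 1.26: new("hash")
		Number: ptr(int32(1)), // Go 1.26: new(int32(1))
	}
	fmt.Println(*f.Name, *f.Number)
}
```

Whether the helper is needed depends on the `go` directive in go.mod and on the toolchains used by CI and downstream consumers.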

@sduchesneau sduchesneau self-requested a review June 12, 2026 16:58
@sduchesneau

Contributor

Cool, good idea.

  1. Using it in substreams creates a kind of circular dependency. Could this all be moved to pbgo?
  2. I'm interested in a performance comparison with what is in 'substreams' for the known Ethereum type when iterating on transactions, since this approach is more dynamic, but it's probably similar.
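The comparison asked for in point 2 could be framed roughly as follows. This is a sketch under assumptions: Go's `reflect` package stands in for protobuf reflection, `Block`, `Tx`, `sumStatic`, and `sumDynamic` are hypothetical, and the absolute numbers say nothing about fireproto or substreams. It only shows the shape of a static-versus-dynamic benchmark using `testing.Benchmark`, which works outside `go test`.

```go
package main

import (
	"fmt"
	"reflect"
	"testing"
)

type Tx struct{ GasLimit uint64 }
type Block struct{ Transactions []*Tx }

// sink keeps the compiler from discarding the benchmarked work.
var sink uint64

// sumStatic iterates with compile-time knowledge of the type.
func sumStatic(b *Block) (total uint64) {
	for _, tx := range b.Transactions {
		total += tx.GasLimit
	}
	return total
}

// sumDynamic locates the list and value fields by name at runtime.
// msg must be a non-nil pointer to a struct with those fields.
func sumDynamic(msg any, listField, valueField string) (total uint64) {
	list := reflect.ValueOf(msg).Elem().FieldByName(listField)
	for i := 0; i < list.Len(); i++ {
		total += list.Index(i).Elem().FieldByName(valueField).Uint()
	}
	return total
}

func main() {
	b := &Block{}
	for i := 0; i < 1000; i++ {
		b.Transactions = append(b.Transactions, &Tx{GasLimit: uint64(i)})
	}

	static := testing.Benchmark(func(tb *testing.B) {
		for i := 0; i < tb.N; i++ {
			sink = sumStatic(b)
		}
	})
	dynamic := testing.Benchmark(func(tb *testing.B) {
		for i := 0; i < tb.N; i++ {
			sink = sumDynamic(b, "Transactions", "GasLimit")
		}
	})

	fmt.Println("static: ", static)
	fmt.Println("dynamic:", dynamic)
}
```

A real comparison would swap the two functions for the typed Ethereum iteration in substreams and a loop over the field returned by `FindTransactionsField`, using representative block sizes.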
