diag(governance): dev-node battery + isCoherent mismatch dump + hash-mismatch analysis#876
diag(governance): dev-node battery + isCoherent mismatch dump + hash-mismatch analysis#876Shitikyan wants to merge 8 commits into
Conversation
Two minimal changes that together let testing/scripts/run-scenario.sh
and the testenv:* suites run end-to-end on a fresh local devnet:
- .dockerignore re-includes testing/loadgen/ so the demos-devnet-node
image ships testing/loadgen/src/main.ts. The parent testing/ exclude
was hiding it, breaking docker-compose.perf.yml's loadgen service
with "Module not found".
- data/genesis.json funds the 4 docker-compose devnet identities (the
addresses derived from testing/devnet/identities/node{1-4}.identity)
so loadgen scenarios can sign + broadcast native txs without hitting
"Insufficient balance: required N, available 0".
Verified locally: testenv:doctor 4/4 healthy, sanity scenarios ok=true,
consensus_tx_inclusion confirms txs past block 18.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…mismatch analysis Adds three pieces of diagnostic infrastructure for the governance hash mismatch surfaced on dev.node2 while exercising `DemosTransactions.proposeNetworkUpgrade` end-to-end: 1. `scripts/dev-node-battery.ts` — single-command smoke test against any deployed RPC: native pay → stake → validators list → governance propose → vote → unstake (arm) → final state, with per-stage tx polling via `getTransactionStatus` nodeCall and a self-contained markdown report writer. Stages 1/3/6 confirm cleanly on dev.node2; stages 4/5 fail with `[Tx Validation] [SIGNATURE ERROR] Transaction hash mismatch` — the failure that drove this PR. 2. `test-reports/dev-node-battery-FINAL.md` — clean run captured 2026-05-28T08:55Z against dev.node2 (node v0.9.8 / a095794, dirty, osDenomination active). 7/10 stages pass; native flow proven. 3. `test-reports/governance-hash-mismatch-analysis.md` — root-cause write-up: the node and SDK serializers (`src/forks/serializerGate.ts` vs `@kynesyslabs/demosdk/denomination/serializerGate.js`) differ in two places — node doesn't walk `gcr_edits[]`, and node rebuilds `transaction_fee` in fixed key order instead of spreading. Local round-trip from the SDK's post-sign canonical shape produces matching bytes for both PAY and PROPOSE, so the dev.node2-specific divergence is not yet pinned — the deployed binary is `dirty=true` and we have no log access. The same report documents why every existing governance test (10 files) misses this: they all cut the wire at the SDK-builder→node-validator boundary. SDK PR kynesyslabs/sdks#90 closes the round-trip coverage gap from the SDK side. Companion debug log in `Transaction.isCoherent` — sibling to PR #870's GCREdit-mismatch dump. When the full content hash diverges, emit the tx type, both hashes, and the bytes the node hashed so the next dev.node2 repro can pinpoint the diverging byte from logs alone. Pure logging — no consensus/validation behaviour change. Strip after root cause is found. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Qodo reviews are paused for this user.Troubleshooting steps vary by plan Learn more → On a Teams plan? Using GitHub Enterprise Server, GitLab Self-Managed, or Bitbucket Data Center? |
|
Warning Review limit reached
More reviews will be available in 48 minutes and 7 seconds. Learn how PR review limits work. Your organization has run out of usage credits. Purchase more in the billing tab. ⌛ How to resolve this issue?After more reviews become available, a review can be triggered using the We recommend that you space out your commits to avoid hitting the rate limit. 🚦 How do rate limits work?CodeRabbit enforces hourly rate limits for each developer per organization. Our paid plans include higher PR review limits than trial, open-source, and free plans. In all cases, reviews become available again over time. During sustained high-volume PR review activity, CodeRabbit may temporarily slow when the next review becomes available. Please see our Fair Usage Limits Policy for further information. ℹ️ Review info⚙️ Run configurationConfiguration used: Organization UI Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (1)
WalkthroughThis PR adds comprehensive transaction hash mismatch diagnosis and end-to-end testing infrastructure. It enhances ChangesTransaction Hash Mismatch Diagnosis and Testing
Estimated code review effort🎯 4 (Complex) | ⏱️ ~60 minutes Possibly related PRs
Suggested labels
Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
Greptile SummaryThis PR lands three diagnostic pieces for the governance hash mismatch observed on
Confidence Score: 5/5Safe to merge; all changes are diagnostic-only with no behavioural impact on production paths. The transaction.ts change is a pure log addition on an already-rejected transaction path and cannot affect consensus or correctness. The battery script is a standalone diagnostic tool that never runs in CI or production. The proposalId guard and skipped-stage count fixes are correct. test-reports/dev-node-battery-FINAL.md — committed report was captured from an earlier version of the script and no longer accurately represents what the current script produces. Important Files Changed
Sequence DiagramsequenceDiagram
participant Script as dev-node-battery.ts
participant SDK as Demos SDK
participant Node as dev.node2 RPC
participant Log as node log
Script->>SDK: pay / stake / unstake
SDK->>Node: broadcast signed tx
Node-->>Script: 200 OK included
Script->>SDK: proposeNetworkUpgrade()
SDK->>SDK: sign() compute tx.hash SDK serializer
SDK->>Node: confirm(tx)
Node->>Node: isCoherent() serializeTransactionContent()
Node->>Log: log.warn type sdkHash derivedHash serialized
Node-->>SDK: SIGNATURE ERROR hash mismatch
SDK-->>Script: throws proposalId stays empty
Note over Script: Stage 5 vote skipped
Reviews (5): Last reviewed commit: "review: guard proposalId behind successf..." | Re-trigger Greptile |
- `Transaction.isCoherent` mismatch dump now uses `log.warn` instead of `log.error` (Greptile P1). Hash mismatch is an expected, recoverable condition during investigation — error-level would light up on-call alerts in any monitoring stack watching this node. Collapsed the four separate log lines into one to avoid log spam. - Drop unused `existsSync` import from `dev-node-battery.ts` (Greptile P2). - Track skipped stages separately so the summary headline isn't misleading (Greptile P2). A skipped L2PS stage no longer counts as a failure: "7/8 stages passed (+ 1 skipped)" instead of "7/10 stages passed". Added `skipped?: boolean` to `StageResult`, marked every L2PS/governance/unstake skip path, and switched the table icon selector to `s.skipped` instead of a brittle notes string match. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
… gitignore mnemonic SDK signatures take a single argument — the `Transaction` (for confirm) or `RPCResponseWithValidityData` (for broadcast). The `demos` instance is bound as `this` on the method; passing it as a second positional arg was tolerated at runtime (JavaScript ignores extras) but TypeScript correctly flagged it under `tsc --noEmit`. Eight call sites cleaned up. Also adds `stress-test-mnemonic` (the funded test wallet used by `dev-node-battery.ts`) to .gitignore so it can sit alongside `.manual-test-mnemonic` without risk of getting staged on a future `git add .`. The file itself is operator-local and was never tracked. Verified: battery still runs end-to-end against dev.node2, same 7/9 + 1 skipped result. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
… default `scripts/dev-node-battery.ts:17` sets the default RPC to a plain-HTTP URL because the deployed dev node listens on plain HTTP — there is no TLS terminator in front of it. SonarCloud's typescript:S5332 rule flagged this as a Quality Gate failure on PR #876. The default is just convenience — an operator pointing the battery at a TLS-terminated endpoint overrides via `RPC=https://...`. Inline NOSONAR with explanatory comment so the next reviewer sees why the HTTP default is the right choice for this specific URL. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
There was a problem hiding this comment.
🧹 Nitpick comments (1)
test-reports/governance-hash-mismatch-analysis.md (1)
13-15: 💤 Low valueConsider adding language identifiers to fenced code blocks.
The fenced code blocks at lines 13-15 and 176-183 are missing language specifiers. Adding identifiers (e.g.,
```textor```plaintext) improves syntax highlighting and tooling support.📝 Proposed fix
Line 13:
-``` +```text [Tx Validation] [SIGNATURE ERROR] Transaction hash mismatchLine 176: ```diff -``` +```text SDK builder ─── sign ─── confirm ─── isCoherent ─── handleGovernanceTx ─── GCR apply │ │ │ │Also applies to: 176-183
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@test-reports/governance-hash-mismatch-analysis.md` around lines 13 - 15, Add a language identifier (e.g., "text" or "plaintext") to the fenced code blocks that are currently unspecified; specifically update the block containing "[Tx Validation] [SIGNATURE ERROR] Transaction hash mismatch" and the ASCII diagram starting with "SDK builder ─── sign ─── confirm ─── isCoherent ─── handleGovernanceTx ─── GCR apply" so they become "```text" fenced blocks to improve syntax highlighting and tooling support.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Nitpick comments:
In `@test-reports/governance-hash-mismatch-analysis.md`:
- Around line 13-15: Add a language identifier (e.g., "text" or "plaintext") to
the fenced code blocks that are currently unspecified; specifically update the
block containing "[Tx Validation] [SIGNATURE ERROR] Transaction hash mismatch"
and the ASCII diagram starting with "SDK builder ─── sign ─── confirm ───
isCoherent ─── handleGovernanceTx ─── GCR apply" so they become "```text" fenced
blocks to improve syntax highlighting and tooling support.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 1cf50ecd-0cf4-49ed-adae-889d2e37e83c
📒 Files selected for processing (5)
.gitignorescripts/dev-node-battery.tssrc/libs/blockchain/transaction.tstest-reports/dev-node-battery-FINAL.mdtest-reports/governance-hash-mismatch-analysis.md
`proposalId = randomUUID()` ran before `await demos.confirm(tx)`. When `confirm()` throws (current dev.node2 behaviour — hash mismatch), the exception is caught by `runStage`, but the outer `proposalId` is already a non-empty UUID. The `if (proposalId)` guard at Stage 5 was therefore truthy, firing a vote against a proposal that never existed on chain and producing a misleading ❌ "Proposal not found" instead of ⏭️. Mint the UUID locally as `id`, only promote to the outer `proposalId` after `poll.status === "included"` — mirrors the `stakeOk` pattern in Stage 3. Verified: re-running battery against dev.node2 now shows Stage 5 correctly skipped as ⏭️. Summary line went from `7/9 stages passed (+ 1 skipped)` to `6/7 stages passed (+ 2 skipped)` — the false positive in the headline is gone. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Summary
Three diagnostic pieces for the governance hash mismatch surfaced on
dev.node2while exercisingDemosTransactions.proposeNetworkUpgradeend-to-end:scripts/dev-node-battery.ts— single-command smoke test against any deployed RPC: native pay → stake → validators list → governance propose → vote → unstake (arm) → final state. Per-stage tx polling viagetTransactionStatusnodeCall; self-contained markdown report writer.test-reports/dev-node-battery-FINAL.md— clean run captured2026-05-28T08:55Zagainstdev.node2.demos.sh:53552(v0.9.8 /a0957941, dirty, osDenomination active). 7/10 stages green; native flow proven.test-reports/governance-hash-mismatch-analysis.md— root-cause write-up. Two serializer asymmetries documented (node doesn't walkgcr_edits[], node rebuildstransaction_feewithout spread); pure local round-trip from the SDK's post-sign canonical shape matches bytes for both PAY and PROPOSE, so the dev.node2-specific divergence is not yet pinned. The deployed binary isdirty=trueand we have no log access — hence the debug log below.src/libs/blockchain/transaction.ts— sibling of PR debug(validation): emit raw gcr_edits dump on mismatch + version canary (DEBUG-ONLY) #870's GCREdit-mismatch dump. WhenisCoherentdiverges, emit the tx type, both hashes, and the serialized bytes so the next dev.node2 repro pinpoints the diverging field from logs alone. Pure logging, strip after root cause is found.Why this is worth landing as diagnostics, not a fix
Without log access to dev.node2 + the binary being
dirty=true, a speculative serializer fix risks chasing the wrong asymmetry. The combination of (a) battery script, (b)isCoherentlog dump, and (c) analysis report gives the next on-call enough to find the diverging byte in one re-deploy cycle.The SDK-side coverage gap that let this slip through is closed in a separate test PR: kynesyslabs/sdks#90 adds
networkUpgrade+networkUpgradeVotefixtures toroundTripHash.test.ts. That PR is byte-equality-only and would have failed CI on any future SDK serializer drift for governance content.Repro
# stress-test-mnemonic must be present + funded on the target RPC RPC=http://dev.node2.demos.sh:53552 bunx tsx scripts/dev-node-battery.tsStages 4 + 5 fail with
[Tx Validation] [SIGNATURE ERROR] Transaction hash mismatch. Native pay / stake / unstake all green.Test plan
test-reports/dev-node-battery-FINAL.mdsrc/libs/blockchain/transaction.tschange is debug-log only — no behaviour change, mirrors the pattern shipped in PR debug(validation): emit raw gcr_edits dump on mismatch + version canary (DEBUG-ONLY) #870[TX] isCoherent mismatch dump.serializedlog line from the failing propose tx, attach to a follow-upCompanion PR
networkUpgrade/networkUpgradeVoteroundTripHash fixtures + TS6 override fixups)🤖 Generated with Claude Code
Summary by CodeRabbit
New Features
Bug Fixes
Documentation