plan(tracker): e2e tests for CLI + composed-state lifecycle #10
Draft
dordor12 wants to merge 2 commits into
Lands the implementation plan for tracker/test/e2e/. Two surfaces: binary CLI (go build + os/exec, version + config validate exit-code matrix) and composed-state lifecycle (config -> ledger -> registry along the wiring path internal/server.Run will eventually follow: starter grant -> SignedBalance round-trip, registry candidate matching against config-derived filter, chain integrity across orchestrator restart). Seven tasks, one commit each, no production-code changes. Listener / broker / admin / federation e2e are explicitly deferred to plans that land alongside those subsystems. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Adds a "Network surface - deferred test shapes" section sketching the three e2e shapes the eventual internal/stunturn plan should land against: loopback STUN binding, loopback TURN relay round-trip + per-seeder rate limit, and a build-tagged netns NAT-simulation matrix for the spec §11 hole-punching acceptance. Updates the out-of-scope bullet and table of contents to point at the new section. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
dordor12 added a commit that referenced this pull request on May 6, 2026
* docs(plans): tracker admission persistence - initial draft (tasks 1-5 full TDD)

Plan 3 of the admission subsystem trilogy. Covers admission.tlog with CRC32C framing + batched/sync fsync + 1 GiB rotation; snapshot file format with magic 0xADMSNAP1 + atomic write; StartupReplay with snapshot fallback; 11 admin handlers; ~20 Prometheus metrics; and acceptance hardening (§10 #9-20).

Tasks 1-5 are full TDD with verbatim test/impl/commit blocks. Tasks 6-15 are scoped outlines; full TDD code follows in subsequent commits before execution begins, matching the plan 2 expand-first pattern.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* docs(plans): tracker admission persistence - expand tasks 6-8 to full TDD

Tasks 6-8 now have verbatim test/impl/run/commit blocks matching the plan-2 expansion pattern:

- Task 6: snapshot emitter goroutine + retention pruning
- Task 7: StartupReplay (snapshot fallback + tlog replay + ledger cross-check + degraded mode)
- Task 8: OnLedgerEvent → tlog persist-then-apply ordering, replay suppression flag

Tasks 9-15 still outline-only; expansion continues in subsequent commits before execution begins.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* docs(plans): tracker admission persistence - expand tasks 9-11 to full TDD

Tasks 9-11 now have verbatim test/impl/run/commit blocks:

- Task 9: §7.1-7.4 failure-mode integration + degraded-mode behavior
- Task 10: 11 admin handlers, RegisterMux, BasicAuthGuard
- Task 11: writeOperatorOverride helper + operator-context key

Tasks 12-15 still outline-only; expansion continues in subsequent commits before execution begins.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* docs(plans): tracker admission persistence - expand tasks 12-15 to full TDD

Tasks 12-15 now have verbatim test/impl/run/commit blocks:

- Task 12: Prometheus Collector + ~20 metrics catalogue
- Task 13: §10 #9-12 persistence/recovery acceptance
- Task 14: §10 #13-16 performance acceptance + benchmarks
- Task 15: §10 #17-20 security acceptance + final integration

Plan 3 is now complete: 15 tasks, every step in red→green→commit form, ready for execution starting with Task 1.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(tracker/admission): TLogRecord framing + CRC32C

Lands the on-disk frame format from admission-design §4.3:

    length(4) | seq(8) | ts(8) | kind(1) | payload | crc32c(4)

CRC32C uses stdlib hash/crc32 with the Castagnoli polynomial (0x82f63b78). Tests pin the table against crc32.MakeTable so a future import-path change can't silently switch us off Castagnoli.

unmarshal returns sentinel ErrTLogTruncated vs ErrTLogCorrupt so replay can distinguish "trailing partial frame from a crash" from "real CRC mismatch on a complete frame": the former heals by truncating to the last good record; the latter surfaces to the operator.

Includes the full kind enum (settlement, dispute_filed, dispute_resolved, heartbeat_bucket_roll, snapshot_mark, operator_override, transfer, starter_grant). TLogKindDispute aliases TLogKindDisputeFiled so Task 8's persistEvent wiring can route both filed and upheld disputes through one write path.

No I/O yet; pure bytes-in/bytes-out. The writer + rotation lands in the next task; OnLedgerEvent integration in a follow-up.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(tracker/admission): tlog payload types per kind

Each TLogKind carries a typed payload: SettlementPayload, DisputePayload, SnapshotMarkPayload, OperatorOverridePayload, TransferPayload, StarterGrantPayload.

Marshal/unmarshal pairs use big-endian fixed-width encoding for primitives + length-prefixed bytes for variable-length fields (operator_id, action, params, snapshot path). Each type implements MarshalBinary / UnmarshalBinary so persistEvent (Task 8) and applyTLogRecord (Task 7) can route generically.

OperatorOverridePayload also carries a Ts unix-second field so audit records survive a clock skew in admin tooling.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(tracker/admission): tlog writer - batched fsync + dispute sync + rotation

tlogWriter wraps the active admission.tlog file with three concerns:

- per-Append routing by kind (disputes synchronous, others batched)
- flushInterval ticker driving Sync() on the batched soft-state
- size-triggered rotation at rotationBytes (production default 1 GiB)

Disputes have no ledger backing if lost (admission-design §4.3 "stricter durability"), so they pay the per-write fsync cost. Settlements / transfers / starter_grants / heartbeat-rolls are recoverable from ledger replay, so they ride the periodic batch.

LastSeq() returns the highest seq observed; the snapshot emitter (Task 6) uses it to stamp snapshot files.

Open-then-Append-to-existing test pins behavior on restart; concurrent-append test exercises the mutex under -race.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(tracker/admission): tlog reader + file enumeration

readTLogFile parses one tlog file end-to-end with two distinct error shapes:

- ErrTLogTruncated at tail: silently healed (post-crash state). Caller takes lastGoodOffset as the true file end.
- ErrTLogCorrupt mid-file: propagated to the operator (admission-design §7.2). Pre-corruption records are still returned so replay can apply them before halting.

enumerateTLogFiles returns rotated files in seq order followed by the active file. Tests cover out-of-order naming on disk so the sort matters.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(tracker/admission): snapshot file format + atomic write/read

Per admission-design §4.3:

    magic(4) | format_version(4) | seq(8) | ts(8) |
    consumers_count(4) | repeated ConsumerState |
    seeders_count(4) | repeated SeederState |
    trailer_crc32(4)

Magic 0xADMSNAP1 encoded as ASCII 'A','D','M','S' (0x41444D53); format_version handles the numeric suffix. Atomic write via <path>.tmp + fsync + rename. Read validates magic, format_version, and trailer CRC; any failure surfaces an error so StartupReplay (Task 7) can fall back to the next-older snapshot.

ConsumerState encodes FirstSeenAt + LastBalanceSeen + 30 day-buckets each for {settlement, dispute, flow}. SeederState encodes 10 MinuteBuckets + heartbeat metadata.

The 600s emit goroutine + retention pruning lands in Task 6; load-into-Subsystem during replay lands in Task 7.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(tracker/admission): periodic snapshot emitter + retention pruning

Per admission-design §4.3 + §6.5:

- runSnapshotEmitOnce: writeSnapshot at current tlog.LastSeq(), prune to SnapshotsRetained, append TLogKindSnapshotMark to the tlog.
- snapshotState: per-shard deep-copy under RLock so live mutation cannot tear a write.

WithSnapshotPrefix / WithTLogPath / WithSnapshotsRetained Options let tests drive the cycle without real timers; runSnapshotEmitOnce is the test seam.

Also fixes a latent race in events.go: applySettlement / applyTransfer / applyStarterGrant / applyDispute now hold the per-shard mutex during mutation, so concurrent snapshotState reads under RLock observe a consistent state. The race was masked by plan-2 tests that didn't exercise concurrent observers.

Open() opens the tlog writer when WithTLogPath is set; Close() shuts it down after the aggregator goroutine exits.

The ticker-driven background goroutine + auto-replay land in Task 7's StartupReplay wiring; this task's runSnapshotEmitOnce is the test seam both rely on.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(tracker/admission): StartupReplay + OnLedgerEvent persistence

Task 7, StartupReplay (admission-design §5.7):

- Walk newest→oldest snapshots; first that loads is applied.
- All-corrupt → degraded mode (decisions still flow).
- Empty/no-prefix → clean first-boot (NOT degraded).
- Replay tlog records with seq > snapshot.seq; mid-file CRC halts and surfaces ErrTLogCorrupt; trailing-frame truncation heals.
- LedgerSource cross-check fills any gap between local tlog and authoritative ledger. v1 default is null.

Task 8, OnLedgerEvent → tlog write-through:

Each branch persists its kind-specific payload before mutating in-memory state. Disputes get synchronous fsync via the writer's kind-routing; settlements / transfers / starter_grants ride the batched fsync. s.replaying suppresses tlog writes during StartupReplay so applyTLogRecord doesn't double-write.

WithLedgerSource / WithSkipAutoReplay / DegradedMode / SnapshotLoadFailures / TLogCorruptions Options + accessors used by Task 9 metrics + Task 12 Collector.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* test(tracker/admission): §7.1-7.4 failure-mode + degraded-mode integration

Pins admission-design §7.x failure scenarios:

- §7.1 Crash mid-OnLedgerEvent → ledger cross-check fills tlog gap.
- §7.2 Mid-tlog corruption → replay halts, decisions still flow, TLogCorruptions counter bumps.
- §7.3 Snapshot corruption → fall back to next-older, counter bumps.
- §7.4 All snapshots corrupt → DegradedMode active, decisions still flow.

The accessors + counters that these tests verify (DegradedMode, SnapshotLoadFailures, TLogCorruptions) land in Task 7's replay.go; this commit only adds the integration tests.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(tracker/admission): admin handlers + operator override audit trail

Per admission-design §9.1 (admin) and §4.3 (audit):

11 admin routes under /admission/, each gated by a MuxGuard. BasicAuthGuard ships as the canonical bearer-token middleware; tests inject a fake validator. RegisterMux mounts everything on a caller-supplied http.ServeMux so the tracker control-plane plan can wire admission into a real listener.

    GET    /status                       queue + supply snapshot
    GET    /queue                        ranked queue contents
    GET    /consumer/{id}                signals + composite score
    GET    /seeder/{id}                  heartbeat + headroom
    POST   /queue/drain                  body {n}, OPERATOR_OVERRIDE
    POST   /queue/eject/{request_id}     OPERATOR_OVERRIDE
    POST   /snapshot                     force runSnapshotEmitOnce
    POST   /recompute/{consumer_id}      re-derive (queued)
    GET    /peers/blocklist              hex-encoded peer IDs
    POST   /peers/blocklist/{peer_id}    OPERATOR_OVERRIDE
    DELETE /peers/blocklist/{peer_id}    OPERATOR_OVERRIDE

writeOperatorOverride centralizes the audit-trail write: every mutating admin handler appends a TLogKindOperatorOverride record carrying {operator_id, action, params, ts}. operator_id flows from the request context (WithOperatorContext) or "anonymous" when missing.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(tracker/admission): Prometheus Collector - ~20 metrics

Per admission-design §9.2:

- Hot-path: decisions_total{result}, queue_depth, pressure, supply_total_headroom, demand_rate_ewma, decision_duration_seconds (histogram)
- Attestations: attestations_issued_total, validation_failures{stage}, attestation_age_seconds, trial_tier_decisions_total
- Persistence: tlog_replay_gap_entries, tlog_corruption_records_total, snapshot_load_failures_total, snapshot_emit_failures_total, degraded_mode_active (dynamic gauge)
- Operational: clock_jump_detected_total{direction}, fetchheadroom_timeouts_total, rejections_total{reason}, pressure_threshold_crossing_total{direction}, seeders_contributing

Decide bumps decisions_total, rejections_total{reason}, queue_depth, decision_duration_seconds. publishSupply mirrors pressure / supply_total_headroom / seeders_contributing. Collector() returns a composite Collector ready to register with the tracker control-plane's metrics registry.

Adds prometheus/client_golang v1.23.2 to tracker module deps.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* test(tracker/admission): §10 #9-20 acceptance - persistence + perf + security

Tasks 13, 14, 15 land the spec's §10 acceptance harness in three files:

acceptance_persistence_test.go (§10 #9-12):

- #9 Crash mid-OnLedgerEvent → ledger cross-check fills tlog gap.
- #10 tlog mid-record corruption → replay halts, decisions still flow, TLogCorruptions counter bumps.
- #11 Latest snapshot deleted → next-older loads + tlog catches up, 400-entry recovery < 30s.
- #12 All snapshots corrupted → DegradedMode + decisions still flow.

perf_bench_test.go (§10 #13-16):

- #13 BenchmarkDecide_NoAttestation + TestPerformance_S10_13 pin Decide latency: avg < 1ms.
- #14 Sustained 500 decides at low pressure keeps queue drained.
- #15 SupplySnapshot updates within aggregator-tick window.
- #16 tlog write rate is 1:1 with ledger event rate.

acceptance_security_test.go (§10 #17-20):

- #17 Forged attestation (body tampered post-sign) → score falls back to TrialTierScore.
- #18 Ejected peer's attestation discarded; consumer falls through.
- #19 Inflated peer score clamped at MaxAttestationScoreImported.
- #20 /admission/queue returns 401 without operator token, 200 with.

helpers_test.go grows allowAllPeerSet / rejectAllPeerSet fixtures and signedTestAttestation{,WithScore} for §10 #17-19. openTempSubsystem now takes testing.TB so benchmarks can call it.

Each test name + leading comment mirrors the §10 spec language so the binding from spec-line to test is searchable.

Coverage: 82.8% of statements; race-clean across 10 repeat runs.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
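The frame layout quoted in the TLogRecord framing commit (length | seq | ts | kind | payload | crc32c) can be illustrated with a small round-trip sketch. This is not the tracker's code: the function and error names are hypothetical stand-ins, and two details the commit message does not state are assumed here, namely that the length field counts payload bytes only and that the frame header is big-endian.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
	"hash/crc32"
)

// Castagnoli table from the standard library (polynomial 0x82f63b78).
var castagnoli = crc32.MakeTable(crc32.Castagnoli)

// Hypothetical sentinels mirroring the truncated-vs-corrupt split.
var (
	errTruncated = errors.New("frame truncated")
	errCorrupt   = errors.New("frame crc mismatch")
)

const (
	headerLen  = 4 + 8 + 8 + 1 // length | seq | ts | kind
	trailerLen = 4             // crc32c
)

// marshalFrame encodes one record. Assumption: length = len(payload),
// and the CRC covers every byte before the trailer.
func marshalFrame(seq, ts uint64, kind byte, payload []byte) []byte {
	end := headerLen + len(payload)
	buf := make([]byte, end+trailerLen)
	binary.BigEndian.PutUint32(buf[0:4], uint32(len(payload)))
	binary.BigEndian.PutUint64(buf[4:12], seq)
	binary.BigEndian.PutUint64(buf[12:20], ts)
	buf[20] = kind
	copy(buf[headerLen:end], payload)
	binary.BigEndian.PutUint32(buf[end:], crc32.Checksum(buf[:end], castagnoli))
	return buf
}

// unmarshalFrame decodes one record, reporting a short buffer as
// errTruncated and a checksum mismatch on a complete frame as errCorrupt.
func unmarshalFrame(buf []byte) (seq, ts uint64, kind byte, payload []byte, err error) {
	if len(buf) < headerLen+trailerLen {
		return 0, 0, 0, nil, errTruncated
	}
	end := headerLen + int(binary.BigEndian.Uint32(buf[0:4]))
	if len(buf) < end+trailerLen {
		return 0, 0, 0, nil, errTruncated
	}
	want := binary.BigEndian.Uint32(buf[end : end+trailerLen])
	if crc32.Checksum(buf[:end], castagnoli) != want {
		return 0, 0, 0, nil, errCorrupt
	}
	return binary.BigEndian.Uint64(buf[4:12]), binary.BigEndian.Uint64(buf[12:20]),
		buf[20], buf[headerLen:end], nil
}

func main() {
	frame := marshalFrame(7, 1700000000, 3, []byte("hello"))

	_, _, _, payload, err := unmarshalFrame(frame)
	fmt.Println(string(payload), err)

	// A frame cut short by a crash reads as truncated.
	_, _, _, _, err = unmarshalFrame(frame[:len(frame)-1])
	fmt.Println(errors.Is(err, errTruncated))

	// A flipped payload byte on a complete frame reads as corrupt.
	bad := append([]byte(nil), frame...)
	bad[headerLen] ^= 0xff
	_, _, _, _, err = unmarshalFrame(bad)
	fmt.Println(errors.Is(err, errCorrupt))
}
```

One limitation of this sketch: a corrupted length field that claims more bytes than remain is reported as truncation, so a real reader would need the file position to tell a damaged mid-file frame from a torn tail.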
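Acceptance item §10 #20 above (401 without an operator token, 200 with one) can be exercised with nothing more than net/http/httptest. The sketch below is a stand-in, not the tracker's BasicAuthGuard: `bearerGuard`, `statusFor`, `newDemoMux`, and the token value are all hypothetical, and the only behavior taken from the commit message is the 401/200 split on /admission/queue with an injected validator.

```go
package main

import (
	"crypto/subtle"
	"fmt"
	"net/http"
	"net/http/httptest"
	"strings"
)

// bearerGuard (hypothetical) rejects any request whose Authorization
// header does not carry a bearer token accepted by validate.
func bearerGuard(validate func(token string) bool, next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		const prefix = "Bearer "
		auth := r.Header.Get("Authorization")
		if !strings.HasPrefix(auth, prefix) || !validate(strings.TrimPrefix(auth, prefix)) {
			http.Error(w, "unauthorized", http.StatusUnauthorized)
			return
		}
		next.ServeHTTP(w, r)
	})
}

// statusFor sends an in-process GET to h and returns the response status.
// An empty token means no Authorization header is set.
func statusFor(h http.Handler, path, token string) int {
	req := httptest.NewRequest(http.MethodGet, path, nil)
	if token != "" {
		req.Header.Set("Authorization", "Bearer "+token)
	}
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, req)
	return rec.Code
}

// newDemoMux mounts a placeholder queue handler behind the guard, using a
// fake validator in the way the commit message says the tests do.
func newDemoMux() http.Handler {
	const want = "test-operator-token" // illustrative value only
	validate := func(tok string) bool {
		return subtle.ConstantTimeCompare([]byte(tok), []byte(want)) == 1
	}
	queue := http.HandlerFunc(func(w http.ResponseWriter, _ *http.Request) {
		fmt.Fprintln(w, "[]")
	})
	mux := http.NewServeMux()
	mux.Handle("/admission/queue", bearerGuard(validate, queue))
	return mux
}

func main() {
	mux := newDemoMux()
	fmt.Println(statusFor(mux, "/admission/queue", ""))                    // 401
	fmt.Println(statusFor(mux, "/admission/queue", "test-operator-token")) // 200
}
```

Passing the validator as a function keeps the guard testable without a real credential store, which matches the "tests inject a fake validator" note in the admin-handlers commit.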
Summary
Implementation plan for `tracker/test/e2e/`. Two test surfaces, scoped to what the tracker actually exposes today:

- Binary CLI: `go build` the tracker, exec `version` and `config validate` via `os/exec`, assert the spec §3.3 exit-code matrix (0 / 1 / 2 / 3) through real `os.Exit`.
- Composed-state lifecycle (config → ledger → registry): `cfg.Ledger.StoragePath` → open the ledger orchestrator → seed the registry. Drives a starter-grant → `SignedBalance` round-trip, registry candidate matching against a `cfg.Broker`-derived filter, and chain integrity across orchestrator restart. Same wiring path `internal/server.Run` will follow when that subsystem lands.

Seven tasks, one commit each, no production-code changes. Wall-time budget < 3s for the full e2e package.
Out of scope (deferred to plans that land alongside those subsystems)

- Listener (`internal/server`, `internal/api`, `internal/session`)
- Broker (`internal/broker`)
- Admin (`internal/admin`)
- Federation (`internal/federation`)

When the listener subsystem lands, `helpers_test.go` gains a `dialPlugin` helper and the lifecycle tests grow from "compose modules in-process" to "drive modules through the wire": same scenarios, larger surface.

Test plan
This PR ships the plan only; no executable tests yet. Reviewing for plan correctness:
- `tracker/test/e2e/` directory layout (currently empty)
- `config.Load`, `ledger.Open/WithClock/Close`, `storage.Open`, `registry.New/DefaultShardCount/Filter`, `Ledger.IssueStarterGrant/SignedBalance/AssertChainIntegrity/Tip`
- `docs/superpowers/specs/tracker/2026-04-25-tracker-internal-config-design.md` §3.3
- `tracker/test/e2e/testdata/` (no cross-package testdata coupling)
- `make -C tracker test` and `make -C tracker lint` green checks included in Task 7

Generated by Claude Code