fix(broker): widen LIST_GROUPS api versions #142

Merged: novatechflow merged 8 commits into KafScale:main from kamir:feat/broker-smoke-and-protocol-fixes on Jun 2, 2026

@kamir (Collaborator) commented May 16, 2026

Summary

This PR now contains a single open-source broker compatibility fix:

  • widen the advertised LIST_GROUPS API version range from 5..5 to 0..5
  • keep the existing handler unchanged; the coordinator path is already version-agnostic
  • add a regression test that locks the advertised range

Why

Older Kafka admin clients negotiate LIST_GROUPS in the 0..4 range. Advertising only 5..5 causes admin tooling such as kafka-consumer-groups --list to fail negotiation even though KafScale already handles the request.
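The negotiation failure can be illustrated with a minimal sketch. It assumes the client picks the highest version present in both its own range and the broker's advertised range; `VersionRange` and `negotiate` are hypothetical names for illustration, not KafScale or Kafka client code.

```go
package main

import "fmt"

// VersionRange is an inclusive [Min, Max] span of protocol versions.
// Hypothetical type used only for this illustration.
type VersionRange struct {
	Min, Max int16
}

// negotiate returns the highest version contained in both ranges, or false
// when the ranges do not overlap (the case a client reports as unsupported).
func negotiate(client, broker VersionRange) (int16, bool) {
	lo := client.Min
	if broker.Min > lo {
		lo = broker.Min
	}
	hi := client.Max
	if broker.Max < hi {
		hi = broker.Max
	}
	if lo > hi {
		return 0, false
	}
	return hi, true
}

func main() {
	// An older admin client that speaks LIST_GROUPS 0..4, per the description above.
	client := VersionRange{Min: 0, Max: 4}
	for _, broker := range []VersionRange{{5, 5}, {0, 5}} {
		if v, ok := negotiate(client, broker); ok {
			fmt.Printf("broker %d..%d: use v%d\n", broker.Min, broker.Max, v)
		} else {
			fmt.Printf("broker %d..%d: no common version\n", broker.Min, broker.Max)
		}
	}
}
```

With the broker advertising 5..5 the ranges are disjoint, so negotiation fails before any request reaches the handler. With 0..5 the same client settles on version 4.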

Scope

  • broker API version advertisement only
  • no INIT_PRODUCER_ID support
  • no standalone Helm chart changes

Verification

  • go test ./cmd/broker -run 'TestHandlerApiVersionsUnsupported|TestGenerateApiVersionsAdvertisesListGroupsCompatibility' -count=1
  • go test ./pkg/broker -run TestCoordinatorListDescribeGroups -count=1
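For readers without the repository at hand, a check that locks an advertised range might look roughly like the sketch below. The `apiVersion` type, `advertisedVersions`, and `checkRange` are hypothetical stand-ins, not the broker's actual code or the test added in this PR. The only fact taken from the Kafka protocol is that LIST_GROUPS uses API key 16.

```go
package main

import "fmt"

// LIST_GROUPS is API key 16 in the Kafka protocol.
const apiKeyListGroups int16 = 16

// apiVersion mirrors one entry of an ApiVersions response.
// Hypothetical type for this sketch.
type apiVersion struct {
	APIKey     int16
	MinVersion int16
	MaxVersion int16
}

// advertisedVersions is a hypothetical stand-in for the broker's
// advertised table; only the entry relevant to this PR is listed.
func advertisedVersions() []apiVersion {
	return []apiVersion{
		{APIKey: apiKeyListGroups, MinVersion: 0, MaxVersion: 5},
	}
}

// checkRange returns an error unless key is advertised with exactly
// the range [wantMin, wantMax].
func checkRange(entries []apiVersion, key, wantMin, wantMax int16) error {
	for _, e := range entries {
		if e.APIKey != key {
			continue
		}
		if e.MinVersion != wantMin || e.MaxVersion != wantMax {
			return fmt.Errorf("api key %d advertised as %d..%d, want %d..%d",
				key, e.MinVersion, e.MaxVersion, wantMin, wantMax)
		}
		return nil
	}
	return fmt.Errorf("api key %d is not advertised", key)
}

func main() {
	if err := checkRange(advertisedVersions(), apiKeyListGroups, 0, 5); err != nil {
		fmt.Println("FAIL:", err)
		return
	}
	fmt.Println("ok: LIST_GROUPS advertised as 0..5")
}
```

Asserting the exact range, rather than only that version 0 is included, means an accidental narrowing or widening would both be caught.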

Notes

The earlier INIT_PRODUCER_ID stub and standalone smoke-test Helm chart were removed from this PR because they are not appropriate for the open-source repo in their current form.

kamir and others added 3 commits May 16, 2026 11:25
Minimal chart that deploys a single kafscale-broker Pod pointing at external
etcd + S3 (MinIO or cloud). Intended for quick smoke tests and blueprint
convergence on KIND — NOT a replacement for the full operator-based chart at
deploy/helm/kafscale/.

Used by scalytics-all-in-one bp-001 Ops Foundation smoke suite:
COMP-kafscale-01 (pod Ready) + COMP-kafscale-02 (Kafka TCP reachable).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Resolves OPS-005 item #2. The Java admin client (kafka-consumer-groups,
AdminClient.listConsumerGroups, Schema Registry) negotiates LIST_GROUPS in
range [0,4]. Advertising a narrow 5-5 window caused:

  UnsupportedVersionException: Error listing groups ...
  The broker does not support LIST_GROUPS with version in range [0,4].
  The supported range is [5,5].

The underlying h.coordinator.ListGroups handler is version-agnostic; the
encoder handles v0 just as well as v5. The fix is one line — widen the
advertised range.

Verified: kafka-consumer-groups --bootstrap-server kafscale-broker:9092
--list now exits 0 (pre-fix it failed with UnsupportedVersionException).

The remaining OPS-005 items (INIT_PRODUCER_ID, transactional APIs, Schema
Registry NPE on verifySchemaTopic) are substantive broker-engineering
work and are not addressed here.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Adds a minimum-viable implementation of the Kafka INIT_PRODUCER_ID API
(API key 22). Allocates a monotonically-increasing producer ID with epoch 0;
does not yet track sequence numbers or deduplicate on replay. Sufficient to
unblock Java AdminClient default producers, franz-go idempotent producers,
and Schema Registry's producer-init probe.

Changes:
  - pkg/protocol/api.go: add APIKeyInitProducerID = 22
  - cmd/broker/main.go:
      * handler gains nextProducerID int64 (atomic allocator)
      * dispatch case for *kmsg.InitProducerIDRequest returns pid + epoch=0
      * apiVersions: InitProducerID moves from unsupported to {0, 4}
      * import sync/atomic

Verified:
  - `kafka-console-producer --producer-property enable.idempotence=true`
    now succeeds (was: UnsupportedVersionException).
  - kaf-mirror (franz-go) replicates primary→standby end-to-end:
    PRIMARY offsets = STANDBY offsets, measured lag <1s.
  - SCEN-bp002-06_Replication scenario test flipped SKIP → PASS.

Known limitations (production correctness gap, tracked in OPS-005):
  - no sequence-number tracking: duplicate-on-retry semantics not enforced
  - no epoch management: fencing of stale producers on rebalance not implemented
  - PID allocator is process-local, not persisted across broker restart

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
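The process-local allocator described in the commit message above (later removed from this PR) could be sketched as follows. This is an illustration under stated assumptions, not the removed code: `pidAllocator` and `allocate` are hypothetical names, and starting the IDs at 0 is an assumption the commit message does not confirm.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// pidAllocator hands out monotonically increasing producer IDs.
// Hypothetical type; its state lives only in this process, so IDs
// restart after a broker restart (the limitation noted above).
type pidAllocator struct {
	next int64
}

// allocate returns the next producer ID together with a fixed epoch of 0.
// It tracks no sequence numbers and performs no epoch fencing.
func (a *pidAllocator) allocate() (pid int64, epoch int16) {
	// AddInt64 returns the incremented value; subtract 1 so the first ID is 0 (assumed).
	return atomic.AddInt64(&a.next, 1) - 1, 0
}

func main() {
	var a pidAllocator
	var wg sync.WaitGroup
	ids := make([]int64, 8)
	for i := range ids {
		wg.Add(1)
		go func(i int) {
			defer wg.Done()
			ids[i], _ = a.allocate()
		}(i)
	}
	wg.Wait()
	// Each goroutine received a distinct ID; their order in the slice is not deterministic.
	fmt.Println(ids)
}
```

The atomic increment makes concurrent requests safe without a mutex, but it does nothing for the correctness gaps listed above: duplicates on retry, stale-producer fencing, and persistence across restarts all remain unhandled.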
@novatechflow (Collaborator) commented

Tests fail (license), pls fix.

@novatechflow (Collaborator) commented

@kamir - please see my last comment.

@novatechflow changed the title from "feat(broker): INIT_PRODUCER_ID + LIST_GROUPS fixes + standalone Helm chart for smoke tests" to "fix(broker): widen LIST_GROUPS api versions" on Jun 2, 2026
@novatechflow novatechflow merged commit b0a1e12 into KafScale:main Jun 2, 2026
9 checks passed
2 participants