[improve][broker] PIP-483: namespace + topic auto split/merge policy overrides #26008

Open
merlimat wants to merge 4 commits into apache:master from merlimat:st-autoscale-policy-override

merlimat commented Jun 12, 2026

Contributor

Follow-up to #25980, completing PIP-483: the auto split/merge policy can now be overridden per namespace and per topic, resolved most-specific-wins on top of the broker defaults. This also addresses the review question on #25980 about controlling maxSegments / minSegments / maxDagDepth per scalable topic — application is lazy: the controller picks up override changes on its next evaluation and converges using the load stats.

Modifications

  • AutoScalePolicyOverride — all-optional override carrying the same knobs as the broker config (caps, cooldowns, merge window, the eight rate thresholds, enabled). Unset fields fall through to the next layer; enabled = false opts a namespace or topic out entirely.
  • Storage: Policies.scalableTopicAutoScalePolicy (namespace level, following the autoTopicCreationOverride pattern) and ScalableTopicMetadata.autoScalePolicy (topic level, broker-internal + admin wire shapes). SegmentLayout.toMetadata now takes the original record and carries over all non-layout fields; without this, every split/merge CAS would have silently dropped the per-topic override.
  • Resolution: AutoScaleConfig.resolve(conf, nsOverride, topicOverride) layers the overrides and runs the existing invariant validation on the combined result.
  • Set-time validation — the namespace-level set validates the override against the broker defaults; the topic-level set validates against the broker defaults and the current namespace override, rejecting invalid combinations with a 412. This check is best-effort: the namespace override can change after a topic override was validated against it, and broker defaults can change across restarts, so a stored combination can still become invalid later.
  • Runtime resilience — if the controller resolves an invalid combination at evaluation time, it treats auto split/merge as disabled for the topic and warn-logs the reason on each evaluation until an operator fixes the overrides, rather than failing the evaluation chain.
  • Controller: evaluateAndAct resolves the effective policy per evaluation from the metadata-cache-backed namespace policies + topic metadata, so override changes take effect on the next tick with no controller restart or leadership cycle.
  • Admin API: admin.scalableTopics().set/get/removeAutoScalePolicy(topic) and admin.namespaces().set/get/removeScalableTopicAutoScalePolicy(namespace), with REST endpoints (POST/GET/DELETE .../autoScalePolicy and .../scalableTopicAutoScalePolicy) guarded by the new PolicyName.SCALABLE_TOPIC_AUTO_SCALE. The GETs return 204 when no override is set.
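The most-specific-wins layering listed above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the PR's code: the `Policy` and `PolicyOverride` records, their three fields, and the `apply` helper are hypothetical stand-ins (the real override carries many more knobs), and invariant validation is left out here.

```java
// Hypothetical, trimmed-down model of the layering: two knobs plus `enabled`.
final class OverrideLayeringSketch {

    /** Fully resolved policy; every field has a value. */
    record Policy(boolean enabled, int maxSegments, double splitMsgRateIn) {}

    /** One override layer; a null field means "not set, fall through". */
    record PolicyOverride(Boolean enabled, Integer maxSegments, Double splitMsgRateIn) {}

    /** Apply one layer on top of a base policy; a null layer is the identity. */
    static Policy apply(Policy base, PolicyOverride o) {
        if (o == null) {
            return base;
        }
        return new Policy(
                o.enabled() != null ? o.enabled() : base.enabled(),
                o.maxSegments() != null ? o.maxSegments() : base.maxSegments(),
                o.splitMsgRateIn() != null ? o.splitMsgRateIn() : base.splitMsgRateIn());
    }

    /** Broker defaults, then namespace, then topic: the most specific set value wins. */
    static Policy resolve(Policy brokerDefaults, PolicyOverride ns, PolicyOverride topic) {
        return apply(apply(brokerDefaults, ns), topic);
    }
}
```

In this sketch a namespace layer of `(false, 16, null)` combined with a topic layer of `(true, null, null)` resolves to enabled, `maxSegments` 16, and the broker default split threshold: each field is taken from the most specific layer that sets it.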

Follow-up (separate PR)

pulsar-admin CLI bindings (CmdScalableTopics / CmdNamespaces) for the new set/get/remove operations — this PR exposes them via the Java admin client and REST only.

Verifying this change

  • AutoScaleConfigTest — override layering (topic wins over namespace over broker), null-overrides identity, invalid-combination rejection.
  • ScalableTopicControllerAutoScaleTest — namespace enabled=false suppresses splits; topic override wins over namespace; per-topic maxSegments caps splits; the override survives a split's metadata rewrite; an invalid stored combination falls back to disabled without failing the evaluation.
  • ScalableTopicAutoScalePolicyTest — end-to-end admin round-trips at both levels through the real HTTP path, including 412 on invariant violation (same-layer and cross-layer) and 404 on a missing topic.
  • SegmentLayoutTest: toMetadata round-trips all non-layout fields.
  • Full org.apache.pulsar.broker.service.scalable.* suite + checkstyle across the five touched modules.

merlimat added 2 commits June 11, 2026 20:29
…overrides

Follow-up to the auto split/merge core (apache#25980): per-namespace and per-topic policy overrides, resolved most-specific-wins on top of the broker defaults.

- AutoScalePolicyOverride: all-optional override carrying the same knobs as the broker config (caps, cooldowns, window, the eight rate thresholds, enabled). Unset fields fall through; enabled=false opts a namespace or topic out entirely.
- Storage: Policies.scalableTopicAutoScalePolicy (namespace) and ScalableTopicMetadata.autoScalePolicy (topic, both broker-internal and admin wire shape). SegmentLayout.toMetadata now takes the original record and carries over all non-layout fields — without this, every split/merge CAS would silently drop the topic override.
- Resolution: AutoScaleConfig.resolve(conf, nsOverride, topicOverride) layers the overrides via toBuilder and runs the existing invariant validation on the result, so an override that is only invalid in combination (e.g. merge threshold raised above the default split threshold) is rejected.
- Controller: evaluateAndAct resolves the effective policy per evaluation from the (cached) namespace policies + topic metadata, so override changes take effect on the next tick with no controller restart.
- Admin API: admin.scalableTopics().set/get/removeAutoScalePolicy(topic) and admin.namespaces().set/get/removeScalableTopicAutoScalePolicy(ns), with REST endpoints, PolicyName.SCALABLE_TOPIC_AUTO_SCALE authorization, and 412 on invariant violations at set time.

Tests: resolve-layering + invalid-combination units; controller integration (namespace disable, topic-wins-over-namespace, per-topic maxSegments cap, override survives a split's metadata rewrite); end-to-end admin round-trips at both levels incl. 412 and 404 paths.
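To illustrate why `toMetadata` needs the original record, here is a small sketch of the carry-over. All types are hypothetical simplifications: `TopicMetadata`, `Layout`, the string-valued `autoScalePolicy`, and the `split` helper are assumptions made for the example and do not mirror the real `ScalableTopicMetadata` or `SegmentLayout` shapes.

```java
import java.util.ArrayList;
import java.util.List;

final class MetadataCarryOverSketch {

    /** Hypothetical stored record: layout fields plus one non-layout field. */
    record TopicMetadata(long epoch, List<String> segments, String autoScalePolicy) {}

    /** Hypothetical in-memory layout used while computing a split or merge. */
    record Layout(long epoch, List<String> segments) {

        /** Lossy variant: rebuilds the record from layout fields only. */
        TopicMetadata toMetadataLossy() {
            return new TopicMetadata(epoch, segments, null);
        }

        /** Carry-over variant: copies non-layout fields from the original record. */
        TopicMetadata toMetadata(TopicMetadata original) {
            return new TopicMetadata(epoch, segments, original.autoScalePolicy());
        }
    }

    /** Replace one segment with two children and bump the epoch. */
    static Layout split(TopicMetadata current, String target) {
        List<String> next = new ArrayList<>();
        for (String s : current.segments()) {
            if (s.equals(target)) {
                next.add(s + "-a");
                next.add(s + "-b");
            } else {
                next.add(s);
            }
        }
        return new Layout(current.epoch() + 1, List.copyOf(next));
    }
}
```

With the lossy variant, the record written back after a split has no override, so the next evaluation would fall back to the namespace or broker layer without anyone having removed the topic-level setting.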
… arg

The all-args constructor grew a parameter when autoScalePolicy was added to ScalableTopicMetadata; update the admin-api test accordingly and assert the new field round-trips.

lhotari commented Jun 12, 2026

Member

I performed a local review with Claude Code Fable 5 and it produced the following findings. Please check them.

Summary

Clean, well-structured follow-up to #25980. The layering design (AutoScaleConfig.resolve = broker defaults → namespace → topic, most-specific-wins per field), the SegmentLayout.toMetadata(original) fix so split/merge CAS rewrites don't drop the per-topic override, and the lazy per-evaluation resolution in the controller all match the description, and the test coverage is good (including the override-survives-split case). There is one real gap: the set-time validation doesn't cover the namespace + topic combination, and when that combination is invalid the controller's evaluation fails on every tick, silently disabling auto split/merge for the topic.

Findings

  1. [BUG] Individually-valid namespace and topic overrides can combine into an invalid policy, which then permanently kills auto split/merge for the topic at runtime — ScalableTopics.java (internalSetAutoScalePolicy), NamespacesBase.java (internalSetScalableTopicAutoScalePolicyAsync), ScalableTopicController.resolveAutoScaleConfig.

    The namespace endpoint validates resolve(conf, override, null) and the topic endpoint validates resolve(conf, null, override) — each layer is only ever checked against the broker defaults, never against the other layer. Example: broker defaults splitMsgRateIn=1000, mergeMsgRateIn=100. A namespace override of mergeMsgRateInThreshold=500 passes (1000 > 500). A topic override of splitMsgRateInThreshold=200 passes (200 > 100). Combined, split=200 ≤ merge=500 violates the hysteresis invariant. The same composition problem exists for minSegments/maxSegments across layers, and for stored overrides that become invalid after a broker restart with changed scalableTopic* defaults (validation only ever ran against the config of the broker that handled the admin call).

    At runtime, resolveAutoScaleConfig then throws IllegalArgumentException inside thenCombine on every evaluation; runAutoScaleSafely catches it and logs a WARN, so auto split/merge is silently dead for that topic with no admin-visible signal. (The in-flight flag is correctly released via whenComplete, so there's no leak — it just never works again until someone fixes the override.)

    Suggested fix, two parts: (a) at topic-level set time, validate against the current namespace override (resolve(conf, currentNsOverride, override)) — it's a cheap cached read and closes the common case; (b) since (a) is still racy (the namespace override can change afterwards, and broker defaults can change across restarts), make the controller resilient: catch IllegalArgumentException in resolveAutoScaleConfig and fall back to a defined behavior (e.g. treat as disabled, or drop the most-specific layer) with a clear log, rather than failing the whole evaluation chain.

  2. [INTENT MISMATCH] The PR description and the code comment in ScalableTopics.internalSetAutoScalePolicy overstate the validation guarantee. The description says an override "only invalid in combination … is rejected with a 412 at set time", and the comment reasons "(the namespace layer can only have been valid the same way)" — that reasoning doesn't hold, per finding 1: both layers being valid against broker defaults does not make their combination valid. Worth rewording even if the behavior gap is fixed.

  3. [QUALITY] No pulsar-admin CLI bindings. CmdScalableTopics and CmdNamespaces exist in pulsar-client-tools, and sibling policies like autoTopicCreation have CLI commands; the new set/get/remove endpoints are reachable only via the Java admin client or raw REST. Fine as a stated follow-up, but worth noting in the PR if intentional.

  4. [QUALITY] Minor REST doc inaccuracy: the topic-level GET advertises "200 … empty body if no override is set", but asyncResponse.resume(null) produces a 204 in that case (the namespace GET has the same behavior). The Java client handles it (returns null), but the OpenAPI description doesn't match the wire behavior — ScalableTopics.java getAutoScalePolicy.

  5. [QUALITY] Test hygiene in ScalableTopicAutoScalePolicyTest: testNamespaceLevelRoundTrip and testInvalidOverrideRejected mutate the shared namespace of SharedPulsarBaseTest. If an assertion fails between set and remove, the enabled=false namespace override leaks into other tests sharing that broker. A finally-style cleanup (or a dedicated namespace) would make it robust.
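The numeric example in finding 1 can be reproduced with a short sketch. It is a simplified model under stated assumptions: `Thresholds`, `ThresholdOverride`, and `accepts` are hypothetical names, and only the split-greater-than-merge hysteresis invariant is checked.

```java
final class CrossLayerValidationSketch {

    /** Resolved pair of thresholds for one rate metric. */
    record Thresholds(double splitMsgRateIn, double mergeMsgRateIn) {}

    /** One override layer; null means "not set". */
    record ThresholdOverride(Double splitMsgRateIn, Double mergeMsgRateIn) {}

    /** Layer defaults, namespace, topic, then validate the combined result. */
    static Thresholds resolve(Thresholds defaults, ThresholdOverride ns, ThresholdOverride topic) {
        Thresholds t = defaults;
        for (ThresholdOverride o : new ThresholdOverride[] {ns, topic}) {
            if (o == null) {
                continue;
            }
            t = new Thresholds(
                    o.splitMsgRateIn() != null ? o.splitMsgRateIn() : t.splitMsgRateIn(),
                    o.mergeMsgRateIn() != null ? o.mergeMsgRateIn() : t.mergeMsgRateIn());
        }
        if (t.splitMsgRateIn() <= t.mergeMsgRateIn()) {
            throw new IllegalArgumentException("split threshold " + t.splitMsgRateIn()
                    + " must be greater than merge threshold " + t.mergeMsgRateIn());
        }
        return t;
    }

    /** True when the given layers resolve to a valid combination. */
    static boolean accepts(Thresholds defaults, ThresholdOverride ns, ThresholdOverride topic) {
        try {
            resolve(defaults, ns, topic);
            return true;
        } catch (IllegalArgumentException e) {
            return false;
        }
    }
}
```

With defaults of split 1000 and merge 100, a namespace layer setting merge to 500 and a topic layer setting split to 200 each pass when checked alone (`accepts(d, ns, null)` and `accepts(d, null, topic)`), while `accepts(d, ns, topic)` fails because 200 is not greater than 500.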

Smaller things checked and found fine: the 404 path for missing topics works (readModifyUpdate propagates NotFoundException, mapped to 404 in both endpoints); moving the enabled check inside the autoScaleInFlight window is harmless; the ScalableTopicMetadata all-args constructor signature change is acceptable since PIP-483 is unreleased; the new PolicyName enum constant is appended safely.

merlimat added 2 commits June 12, 2026 08:39
… on invalid resolve

The namespace and topic overrides were each validated only against the broker defaults, never against each other — two individually-valid layers could combine into an invalid policy (e.g. ns raises a merge threshold, topic lowers the matching split threshold), after which the controller's per-evaluation resolve threw on every tick and auto split/merge was silently dead for the topic.

Two-part fix, as suggested in review:
- The topic-level set now resolves against the current namespace override (cached read) and rejects the combination with 412. The misleading comment claiming the namespace layer 'can only have been valid the same way' is gone; the check is documented as best-effort since the namespace override can change afterwards and broker defaults can change across restarts.
- The controller is resilient to a stored combination that is (or has become) invalid: resolveAutoScaleConfig catches the invariant violation, warn-logs it on each evaluation, and treats auto split/merge as disabled for the topic — predictable and visible, instead of failing the evaluation chain.

Tests: REST-level 412 for a cross-layer conflict (and acceptance of the same override once the conflicting layer is removed); controller-level fallback (evaluation completes, no action taken) for an invalid stored combination.
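The controller-side fallback can be pictured roughly as follows. This is a sketch under assumptions, not the actual `resolveAutoScaleConfig`: the `resolveOrDisabled` and `evaluate` helpers, the `Supplier`-based resolver, and the returned action strings are all hypothetical, and `java.util.logging` stands in for whatever logger the broker uses.

```java
import java.util.Optional;
import java.util.function.Supplier;
import java.util.logging.Logger;

final class ResilientResolveSketch {

    private static final Logger LOG = Logger.getLogger("ResilientResolveSketch");

    /** Returns the resolved policy, or empty when the stored layers are invalid together. */
    static <T> Optional<T> resolveOrDisabled(String topic, Supplier<T> resolver) {
        try {
            return Optional.of(resolver.get());
        } catch (IllegalArgumentException e) {
            // Logged on every evaluation so the condition stays visible to operators.
            LOG.warning(() -> "auto split/merge disabled for " + topic + ": " + e.getMessage());
            return Optional.empty();
        }
    }

    /** One evaluation tick: an invalid combination means "no action", not a failed chain. */
    static String evaluate(String topic, Supplier<Integer> maxSegmentsResolver) {
        return resolveOrDisabled(topic, maxSegmentsResolver)
                .map(max -> "EVALUATE_LOAD")
                .orElse("NO_ACTION");
    }
}
```

The design point is that the evaluation completes normally in both cases; only the outcome differs, which keeps the rest of the evaluation chain unaffected by a bad stored override.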
…ET endpoints

Both GET endpoints resume with null when no override is set, which the JAX-RS layer turns into a 204 with no body — the OpenAPI text claimed a 200 with an empty body. Document the 204 explicitly at both levels (the Java admin client already maps it to null).
@merlimat

Contributor Author

Thanks for the thorough review @lhotari — all findings addressed:

  1. Cross-layer validation gap (bug) — fixed in 2 parts as suggested (c09f665):

    • The topic-level set now resolves against the current namespace override (cached read) and rejects the combination with 412. Covered by a new test that sets a conflicting namespace override, asserts the topic-level 412, and asserts the same topic override is accepted once the conflicting layer is removed.
    • The controller is resilient to a stored combination that is (or has become) invalid: resolveAutoScaleConfig catches the invariant violation, warn-logs the reason on each evaluation, and treats auto split/merge as disabled for the topic instead of failing the evaluation chain. Also covered by a new controller test.
  2. Overstated validation claim — the misleading code comment is gone (replaced by one documenting the best-effort nature of the set-time check), and the PR description has been updated accordingly.

  3. CLI bindings — intentional omission; now stated in the PR description as a follow-up PR (CmdScalableTopics / CmdNamespaces).

  4. GET 204 vs documented 200-empty — the OpenAPI annotations on both GET endpoints now document the 204 no-override response (2152340).

  5. Test hygiene — this one is a non-issue in practice: SharedPulsarBaseTest creates a fresh namespace per test method and force-deletes it in @AfterMethod, so a mid-test failure cannot leak the enabled=false override into other tests — the cleanup is structural rather than per-test. (The new cross-layer test uses try/finally anyway, since it manipulates the namespace override mid-test.)

lhotari left a comment

Member

LGTM. Just one remaining question: when a user changes the overrides for a namespace or topic, is there some way to trigger autoscaling immediately? Let's say that the user increases the minSegments or maxSegments where the current number of segments is out of the bounds. I would assume that such an operation would be preferred over the current manual autosplit/merge operations available in the Admin REST API for scalable topics.

The default could continue to be that autoscaling simply applies the new rules the next time the load-based rules trigger an action. Otherwise, changing a policy at the namespace level could cause many split/merge operations at once, so enforcing the minSegments or maxSegments boundaries could be a topic-level operation.

Related to minSegments, is it already possible to create a topic with a given amount of minSegments initially, or does it always start from 1? Perhaps there would need to be a separate initialSegments in that case since minSegments seems to be meant for the minimum number of segments where further merges would no longer take place.

@merlimat

Contributor Author

> LGTM. Just one remaining question: when a user changes the overrides for a namespace or topic, is there some way to trigger autoscaling immediately? Let's say that the user increases the minSegments or maxSegments where the current number of segments is out of the bounds. I would assume that such an operation would be preferred over the current manual autosplit/merge operations available in the Admin REST API for scalable topics.

Right now, it would be checked at the next interval, though it could be good to have a way to correct it immediately.

I'd leave it out of the scope of this PR though, since the behavior on a namespace config change should also be considered.

> Related to minSegments, is it already possible to create a topic with a given amount of minSegments initially, or does it always start from 1? Perhaps there would need to be a separate initialSegments in that case since minSegments seems to be meant for the minimum number of segments where further merges would no longer take place.

It's already possible to specify the number of segments at topic creation. Eventually that number would be adjusted by auto split/merge.
