[improve][broker] PIP-483: namespace + topic auto split/merge policy overrides#26008
[improve][broker] PIP-483: namespace + topic auto split/merge policy overrides#26008merlimat wants to merge 4 commits into
Conversation
…overrides Follow-up to the auto split/merge core (apache#25980): per-namespace and per-topic policy overrides, resolved most-specific-wins on top of the broker defaults. - AutoScalePolicyOverride: all-optional override carrying the same knobs as the broker config (caps, cooldowns, window, the eight rate thresholds, enabled). Unset fields fall through; enabled=false opts a namespace or topic out entirely. - Storage: Policies.scalableTopicAutoScalePolicy (namespace) and ScalableTopicMetadata.autoScalePolicy (topic, both broker-internal and admin wire shape). SegmentLayout.toMetadata now takes the original record and carries over all non-layout fields — without this, every split/merge CAS would silently drop the topic override. - Resolution: AutoScaleConfig.resolve(conf, nsOverride, topicOverride) layers the overrides via toBuilder and runs the existing invariant validation on the result, so an override that is only invalid in combination (e.g. merge threshold raised above the default split threshold) is rejected. - Controller: evaluateAndAct resolves the effective policy per evaluation from the (cached) namespace policies + topic metadata, so override changes take effect on the next tick with no controller restart. - Admin API: admin.scalableTopics().set/get/removeAutoScalePolicy(topic) and admin.namespaces().set/get/removeScalableTopicAutoScalePolicy(ns), with REST endpoints, PolicyName.SCALABLE_TOPIC_AUTO_SCALE authorization, and 412 on invariant violations at set time. Tests: resolve-layering + invalid-combination units; controller integration (namespace disable, topic-wins-over-namespace, per-topic maxSegments cap, override survives a split's metadata rewrite); end-to-end admin round-trips at both levels incl. 412 and 404 paths.
… arg The all-args constructor grew a parameter when autoScalePolicy was added to ScalableTopicMetadata; update the admin-api test accordingly and assert the new field round-trips.
|
I performed a local review with Claude Code Fable 5 and it found these findings. Please check them. SummaryClean, well-structured follow-up to #25980. The layering design ( Findings
Smaller things checked and found fine: the 404 path for missing topics works ( |
… on invalid resolve The namespace and topic overrides were each validated only against the broker defaults, never against each other — two individually-valid layers could combine into an invalid policy (e.g. ns raises a merge threshold, topic lowers the matching split threshold), after which the controller's per-evaluation resolve threw on every tick and auto split/merge was silently dead for the topic. Two-part fix, as suggested in review: - The topic-level set now resolves against the current namespace override (cached read) and rejects the combination with 412. The misleading comment claiming the namespace layer 'can only have been valid the same way' is gone; the check is documented as best-effort since the namespace override can change afterwards and broker defaults can change across restarts. - The controller is resilient to a stored combination that is (or has become) invalid: resolveAutoScaleConfig catches the invariant violation, warn-logs it on each evaluation, and treats auto split/merge as disabled for the topic — predictable and visible, instead of failing the evaluation chain. Tests: REST-level 412 for a cross-layer conflict (and acceptance of the same override once the conflicting layer is removed); controller-level fallback (evaluation completes, no action taken) for an invalid stored combination.
…ET endpoints Both GET endpoints resume with null when no override is set, which the JAX-RS layer turns into a 204 with no body — the OpenAPI text claimed a 200 with an empty body. Document the 204 explicitly at both levels (the Java admin client already maps it to null).
|
Thanks for the thorough review @lhotari — all findings addressed:
|
lhotari
left a comment
There was a problem hiding this comment.
LGTM. Just one remaining question: when a user changes the overrides for a namespace or topic, is there some way to trigger autoscaling immediately? Let's say that the user increases the minSegments or maxSegments where the current number of segments is out of the bounds. I would assume that such an operation would be preferred over the current manual autosplit/merge operations available in the Admin REST API for scalable topics.
The default action could continue to be that autoscaling would just use the new rules when the load based rules trigger an action. Otherwise changing a policy at namespace level could cause a lot of split/merge operations at once so enforcing the autoscaling rules about the minSegments or maxSegments boundaries could be a topic level operation.
Related to minSegments, is it already possible to create a topic with a given amount of minSegments initially, or does it always start from 1? Perhaps there would need to be a separate initialSegments in that case since minSegments seems to be meant for the minimum number of segments where further merges would no longer take place.
Right now, it would be check at next interval. Though it could be good have a way to immediately correct. I'd leave it out of the scope of this PR though, since the behavior on a namespace config change should also be considered.
It's already possible to specify the number of segments on the topic creation. Eventually that would be adjusted by auto split/merge |
Follow-up to #25980, completing PIP-483: the auto split/merge policy can now be overridden per namespace and per topic, resolved most-specific-wins on top of the broker defaults. This also addresses the review question on #25980 about controlling
maxSegments/minSegments/maxDagDepthper scalable topic — application is lazy: the controller picks up override changes on its next evaluation and converges using the load stats.Modifications
AutoScalePolicyOverride— all-optional override carrying the same knobs as the broker config (caps, cooldowns, merge window, the eight rate thresholds,enabled). Unset fields fall through to the next layer;enabled = falseopts a namespace or topic out entirely.Policies.scalableTopicAutoScalePolicy(namespace level, following theautoTopicCreationOverridepattern) andScalableTopicMetadata.autoScalePolicy(topic level, broker-internal + admin wire shapes).SegmentLayout.toMetadatanow takes the original record and carries over all non-layout fields — without this, every split/merge CAS would have silently dropped the per-topic override.AutoScaleConfig.resolve(conf, nsOverride, topicOverride)layers the overrides and runs the existing invariant validation on the combined result.evaluateAndActresolves the effective policy per evaluation from the metadata-cache-backed namespace policies + topic metadata, so override changes take effect on the next tick with no controller restart or leadership cycle.admin.scalableTopics().set/get/removeAutoScalePolicy(topic)andadmin.namespaces().set/get/removeScalableTopicAutoScalePolicy(namespace), with REST endpoints (POST/GET/DELETE .../autoScalePolicyand.../scalableTopicAutoScalePolicy) guarded by the newPolicyName.SCALABLE_TOPIC_AUTO_SCALE. The GETs return 204 when no override is set.Follow-up (separate PR)
pulsar-adminCLI bindings (CmdScalableTopics/CmdNamespaces) for the new set/get/remove operations — this PR exposes them via the Java admin client and REST only.Verifying this change
AutoScaleConfigTest— override layering (topic wins over namespace over broker), null-overrides identity, invalid-combination rejection.ScalableTopicControllerAutoScaleTest— namespaceenabled=falsesuppresses splits; topic override wins over namespace; per-topicmaxSegmentscaps splits; the override survives a split's metadata rewrite; an invalid stored combination falls back to disabled without failing the evaluation.ScalableTopicAutoScalePolicyTest— end-to-end admin round-trips at both levels through the real HTTP path, including 412 on invariant violation (same-layer and cross-layer) and 404 on a missing topic.SegmentLayoutTest—toMetadataround-trips all non-layout fields.org.apache.pulsar.broker.service.scalable.*suite + checkstyle across the five touched modules.