Cosmos: add cold-start metadata cache hedging (Java port of dotnet #5923) #49517

Draft
NaluTripician wants to merge 2 commits into main from nalutripician/cosmos-metadata-hedging

Summary

Ports cold-start metadata cache hedging from the .NET Cosmos SDK (Azure/azure-cosmos-dotnet-v3#5923) to the Java SDK, adapted to the reactive (Project Reactor) model.

On cold-start metadata cache population (Collection read), the SDK now proactively dispatches a hedged cross-region request when the primary region hasn't responded within an SDK-derived threshold, reducing cold-start tail latency during regional brown-outs / PPAF failover.

What's included

New package com.azure.cosmos.implementation.metadatahedging:

  • MetadataHedgingStrategy (one per client): executeAsync races a primary regional read against a hedge dispatched after a fixed threshold (1.5s = first-attempt + 500ms), returns the first acceptable winner and cancels the loser. Non-blocking per-client Semaphore budget; fail-fast hedging when the primary returns a regional failure before the threshold; shared failure classification (isRegionalFailure, isAcceptableWinner with the hedge-branch 401/403 overlay).
  • Supporting types: MetadataHedgingContext, MetadataHedgingResult, MetadataHedgeDiagnostics, MetadataHedgeEligibility, MetadataHedgeSkipReason, HedgeBranch.
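The race described above can be sketched with JDK types only. This is a minimal illustration, not the PR's Reactor-based `MetadataHedgingStrategy`: the class `HedgedReadSketch`, its `execute` method, and the helper `propagatePrimaryFailure` are hypothetical names, every primary failure is treated as a regional failure, the hedge-branch 401/403 overlay is not modeled, and cancellation of the losing branch is omitted.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.function.Supplier;

/** Hypothetical stand-in for the hedging race; not the PR's MetadataHedgingStrategy. */
final class HedgedReadSketch {
    private final Semaphore budget;      // caps concurrent hedges (assumed per client)
    private final long thresholdMillis;  // delay before the hedge is dispatched

    HedgedReadSketch(int maxConcurrentHedges, long thresholdMillis) {
        this.budget = new Semaphore(maxConcurrentHedges);
        this.thresholdMillis = thresholdMillis;
    }

    <T> CompletableFuture<T> execute(Supplier<CompletableFuture<T>> primary,
                                     Supplier<CompletableFuture<T>> hedge) {
        CompletableFuture<T> result = new CompletableFuture<>();
        AtomicBoolean hedgeStarted = new AtomicBoolean(false);
        CompletableFuture<T> primaryCall = primary.get();

        Runnable startHedge = () -> {
            // Never hedge once a winner exists, and never hedge twice.
            if (result.isDone() || !hedgeStarted.compareAndSet(false, true)) {
                return;
            }
            // Non-blocking budget check: with no permit, fall back to primary-only.
            if (!budget.tryAcquire()) {
                propagatePrimaryFailure(primaryCall, result);
                return;
            }
            CompletableFuture<T> hedgeCall;
            try {
                hedgeCall = hedge.get();
            } catch (RuntimeException ex) {
                // A throw while building the hedge must still release the permit below.
                hedgeCall = CompletableFuture.failedFuture(ex);
            }
            hedgeCall.whenComplete((value, error) -> {
                budget.release();
                if (error == null) {
                    result.complete(value);
                } else {
                    propagatePrimaryFailure(primaryCall, result);
                }
            });
        };

        primaryCall.whenComplete((value, error) -> {
            if (error == null) {
                result.complete(value);   // primary wins
            } else {
                startHedge.run();         // fail-fast: hedge before the threshold
            }
        });
        // The timer task is not cancelled here; it is a no-op once a winner exists.
        CompletableFuture.delayedExecutor(thresholdMillis, TimeUnit.MILLISECONDS)
            .execute(startHedge);
        return result;
    }

    // Surfaces the primary's error once no hedge can still succeed.
    private static <T> void propagatePrimaryFailure(CompletableFuture<T> primaryCall,
                                                    CompletableFuture<T> result) {
        primaryCall.whenComplete((value, error) -> {
            if (error != null) {
                result.completeExceptionally(error);
            }
        });
    }
}
```

For example, `new HedgedReadSketch(1, 1500).<String>execute(primaryRead, hedgeRead)` would wait 1.5s before dispatching `hedgeRead`, unless `primaryRead` fails first, in which case the hedge is dispatched immediately.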

Wiring & config:

  • Configs.getMetadataHedgingForColdStartEnabled() — tri-state opt-in (null=follow PPAF, true=force on, false=off) via COSMOS.METADATA_HEDGING_FOR_COLD_START_ENABLED system property / env var. Default behavior is unchanged (the strategy is only constructed when PPAF is enabled or the customer explicitly opts in).
  • RxClientCollectionCache — injects GlobalEndpointManager + strategy (backward-compatible constructors); cold-start Collection reads route through the strategy with a region-targeted sender that handles both master-key and AAD auth per cloned branch.
  • RxDocumentClientImpl — builds the strategy via createIfEnabled, resolving PPAF state from the per-partition automatic-failover manager.
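The tri-state opt-in can be illustrated with a small sketch. The class `ColdStartHedgingOptIn` and its methods are hypothetical; how `Configs` actually parses the value and how `createIfEnabled` consumes it are not shown in this description, so the parsing rule below (blank means unset, otherwise `Boolean.parseBoolean`) is an assumption. Only the property name comes from the PR text.

```java
/** Hypothetical illustration of the tri-state opt-in; not the PR's Configs code. */
final class ColdStartHedgingOptIn {
    // Property name as given in the PR description.
    static final String PROPERTY = "COSMOS.METADATA_HEDGING_FOR_COLD_START_ENABLED";

    /** Returns null when unset (follow PPAF), otherwise the explicit choice. */
    static Boolean parseTriState(String raw) {
        if (raw == null || raw.trim().isEmpty()) {
            return null;
        }
        // Assumption: anything other than "true" (case-insensitive) means off.
        return Boolean.parseBoolean(raw.trim());
    }

    /** An explicit opt-in or opt-out wins; otherwise the PPAF state decides. */
    static boolean shouldCreateStrategy(Boolean optIn, boolean ppafEnabled) {
        return optIn != null ? optIn : ppafEnabled;
    }
}
```

A caller could combine the two as `shouldCreateStrategy(parseTriState(System.getProperty(PROPERTY)), ppafEnabled)`, which leaves behavior unchanged when the property is unset and PPAF is disabled.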

Testing

  • azure-cosmos compiles; checkstyle and spotbugs both pass (not disabled).
  • New MetadataHedgingStrategyTest: 19/19 unit tests pass covering opt-in resolution, createIfEnabled, regional-failure / acceptable-winner classification, all eligibility skip reasons, and executeAsync race paths (ineligible→primary-only, primary-wins-no-hedge, hedge-wins-on-primary-regional-failure, budget-exhausted fallback).

Scope notes

Phase-1 scope mirrors the .NET phased rollout. Deferred follow-ups: PartitionKeyRangeCache hedging (Java path goes through the full readPartitionKeyRanges query pipeline, not a direct store-model call), OpenTelemetry metrics, and the Gateway kill-switch account flag (hard-wired false in .NET Phase 1 too).

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>


Introduces MetadataHedgingStrategy: bounded cross-region hedging for cold-start metadata cache population. Races a primary regional read against a hedge dispatched after a fixed SDK-derived threshold, returning the first acceptable winner. Wired into RxClientCollectionCache cold-start Collection reads, gated by a tri-state Configs opt-in that follows PPAF when unset. Includes 19 unit tests.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- Fix latent bug: hedgeOutcome was cached, decoupling the hedge timer from downstream cancellation so a spurious hedge could fire ~threshold after the primary already won. Removed .cache() on the hedge branch (single consumer) so merge cancellation cancels the timer; added a regression test (fast primary win must not leak a late hedge).

- Guard the per-client budget permit against an assembly-time throw between tryAcquire and the doFinally release; single AtomicBoolean now guards every release path against double-release.
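A minimal sketch of the double-release guard, assuming a plain `java.util.concurrent.Semaphore` budget. `PermitGuard` is a hypothetical name and is not taken from the PR; the point is only that one `AtomicBoolean` makes `release()` idempotent, so an error path and a finally-style path can both call it safely.

```java
import java.util.concurrent.Semaphore;
import java.util.concurrent.atomic.AtomicBoolean;

/** Hypothetical holder for one budget permit that can be released at most once. */
final class PermitGuard {
    private final Semaphore budget;
    private final AtomicBoolean released = new AtomicBoolean(false);

    private PermitGuard(Semaphore budget) {
        this.budget = budget;
    }

    /** Non-blocking acquire; returns null when the budget is exhausted. */
    static PermitGuard tryAcquire(Semaphore budget) {
        return budget.tryAcquire() ? new PermitGuard(budget) : null;
    }

    /** Safe to call from every exit path; only the first call returns the permit. */
    void release() {
        if (released.compareAndSet(false, true)) {
            budget.release();
        }
    }
}
```

With this shape, code that throws between acquiring the permit and attaching the completion callback can call `release()` in a catch block, and the callback can call it again later without inflating the budget.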

- Thread isColdStart through RxCollectionCache.getByRid/getByName so forced metadata refreshes no longer hedge; only cold-start cache-miss population does.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>