fix(storage): retry throttled fts metadata listing #6994

Merged
BubbleCal merged 1 commit into main from yang/azure-fts-indexing-throttle-fix
May 29, 2026

@BubbleCal
Contributor

Bug Fix

What is the bug?

Distributed FTS indexing can encounter transient object-store list failures during partition metadata discovery, especially Azure ServerBusy and account egress-limit responses. These failures were not retried consistently on the Lance side. In addition, FTS metadata listing could swallow list stream errors and later report the misleading error "No partition metadata files found".

What issues or incorrect behavior does the bug cause?

Transient cloud throttling can fail FTS index builds instead of backing off and resuming. When the list failure happens while discovering partition metadata, the user can lose the original service error body, request ID, or throttle reason and see an empty-directory error instead. List streams also did not feed the AIMD limiter before the underlying object-store list request started.
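
The masking described above can be illustrated with a small sketch. It is not the code from builder.rs: the function names are hypothetical, and the list stream is modeled as a plain Vec of Result values. The first function shows one way a stream error can be dropped and turned into the empty-directory message, and the second shows the propagating shape that keeps the original service error.

```rust
/// Error text quoted in the PR description.
const NO_FILES: &str = "No partition metadata files found";

/// Hypothetical buggy shape: list errors are filtered out, so a failed
/// listing looks exactly like an empty directory.
fn collect_swallowing(items: Vec<Result<String, String>>) -> Result<Vec<String>, String> {
    let files: Vec<String> = items.into_iter().filter_map(Result::ok).collect();
    if files.is_empty() {
        return Err(NO_FILES.to_string());
    }
    Ok(files)
}

/// Hypothetical fixed shape: the first list error is returned as-is, and the
/// no-files error is kept only for a listing that really produced nothing.
fn collect_propagating(items: Vec<Result<String, String>>) -> Result<Vec<String>, String> {
    let files = items.into_iter().collect::<Result<Vec<String>, String>>()?;
    if files.is_empty() {
        return Err(NO_FILES.to_string());
    }
    Ok(files)
}
```

With a single throttling error as input, the first function reports the no-files message while the second returns the throttling error unchanged. Both agree on a genuinely empty listing.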

How does this PR fix the problem?

  • Add delayed retry with conservative jittered exponential backoff to object-store list retry streams, while preserving immediate return for non-retryable errors.
  • Resume retried list streams from the last successfully yielded key.
  • Acquire the AIMD list token before creating list and list_with_offset delegate streams, then continue observing yielded items and errors.
  • Expand throttle classification for targeted cloud throttling messages, including Azure ServerBusy, account egress limits, HTTP 429, and known rate-limit phrases.
  • Remove the provider-side client_max_retries=0 AIMD-disable guard for AWS, Azure, and GCP. Users should set lance_aimd_max_retries=0 to explicitly disable Lance AIMD throttling.
  • Propagate real FTS metadata list stream errors immediately while preserving the existing no-files error for genuinely empty metadata directories.
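
The retry behavior in the list above can be sketched roughly as follows. This is a synchronous, std-only illustration, not the implementation in list_retry.rs or throttle.rs. All names here (is_throttled, backoff_delay, list_with_retry, list_page) are hypothetical, the matched phrases are examples rather than Lance's actual pattern list, and the base delay and cap are arbitrary. The sleep and jitter sources are passed in as closures so the sketch stays deterministic.

```rust
use std::time::Duration;

/// Illustrative throttle check. The phrases are examples only; a bare "429"
/// substring match is deliberately naive.
fn is_throttled(msg: &str) -> bool {
    let m = msg.to_ascii_lowercase();
    ["serverbusy", "egress is over the account limit", "429", "too many requests"]
        .iter()
        .any(|p| m.contains(p))
}

/// Exponential backoff capped at `cap`, jittered within the upper half of the
/// window. `jitter_permille` (0..=1000) stands in for a random draw.
fn backoff_delay(attempt: u32, base: Duration, cap: Duration, jitter_permille: u32) -> Duration {
    let exp = base.saturating_mul(2u32.saturating_pow(attempt)).min(cap);
    let half = exp / 2;
    half + half * jitter_permille.min(1000) / 1000
}

/// Lists all keys, retrying throttled failures and resuming after the last
/// key already yielded. `list_page(offset)` is a hypothetical stand-in for a
/// list_with_offset-style call: it returns keys strictly greater than
/// `offset`, plus an optional error that cut the listing short.
fn list_with_retry(
    max_retries: u32,
    mut list_page: impl FnMut(Option<&str>) -> (Vec<String>, Option<String>),
    mut sleep: impl FnMut(Duration),
    mut jitter: impl FnMut() -> u32,
) -> Result<Vec<String>, String> {
    let mut keys: Vec<String> = Vec::new();
    let mut attempt = 0;
    loop {
        // Resume from the last successfully yielded key, if any.
        let last = keys.last().cloned();
        let (page, err) = list_page(last.as_deref());
        keys.extend(page);
        match err {
            None => return Ok(keys),
            // Retryable: back off, then resume instead of restarting.
            Some(e) if is_throttled(&e) && attempt < max_retries => {
                let delay = backoff_delay(
                    attempt,
                    Duration::from_millis(100),
                    Duration::from_secs(5),
                    jitter(),
                );
                sleep(delay);
                attempt += 1;
            }
            // Non-retryable or out of retries: return immediately.
            Some(e) => return Err(e),
        }
    }
}
```

Resuming from the last yielded key avoids re-listing keys that were already delivered, which matters when the retry itself counts against the throttled account.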

No public Rust, Python, or Java APIs are added.
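
For readers unfamiliar with the AIMD limiter mentioned in the fix list, the general additive-increase/multiplicative-decrease pattern looks roughly like the sketch below. It illustrates the technique only: the struct, field names, and constants are hypothetical and are not taken from Lance's throttle.rs.

```rust
/// Minimal AIMD rate controller (hypothetical names and fields).
struct Aimd {
    rate: f64,     // current allowed request rate
    min: f64,      // floor for the rate
    max: f64,      // ceiling for the rate
    increase: f64, // additive step applied after a success
    decrease: f64, // multiplicative factor (< 1.0) applied after a throttle
}

impl Aimd {
    /// Additive increase: probe for more capacity slowly.
    fn on_success(&mut self) {
        self.rate = (self.rate + self.increase).min(self.max);
    }

    /// Multiplicative decrease: back off quickly when the service throttles.
    fn on_throttle(&mut self) {
        self.rate = (self.rate * self.decrease).max(self.min);
    }
}
```

The asymmetry is the point of the design: a throttle response cuts the rate sharply, while recovery happens in small steps so the limiter does not immediately run into the same limit again.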

Validation

  • cargo test -p lance-io list_retry
  • CARGO_TARGET_DIR=/tmp/lance-fc2c-target cargo test -p lance-io throttle
  • CARGO_TARGET_DIR=/tmp/lance-fc2c-target cargo test -p lance-index list_metadata_files
  • CARGO_TARGET_DIR=/tmp/lance-fc2c-target cargo test -p lance merge_existing_index_segments_supports_fts_segments
  • cargo fmt --all
  • CARGO_TARGET_DIR=/tmp/lance-fc2c-target cargo clippy --all --tests --benches -- -D warnings

@github-actions github-actions Bot added the bug label May 29, 2026
@codecov
codecov Bot commented May 29, 2026

Codecov Report

❌ Patch coverage is 80.92910% with 78 lines in your changes missing coverage. Please review.

| Files with missing lines | Patch % | Lines |
| --- | --- | --- |
| rust/lance-index/src/scalar/inverted/builder.rs | 51.56% | 31 Missing ⚠️ |
| rust/lance-io/src/object_store/list_retry.rs | 87.17% | 24 Missing and 1 partial ⚠️ |
| rust/lance-io/src/object_store/throttle.rs | 85.33% | 22 Missing ⚠️ |

@BubbleCal BubbleCal requested review from westonpace and wkalt May 29, 2026 10:03
@BubbleCal BubbleCal marked this pull request as ready for review May 29, 2026 10:03
@BubbleCal BubbleCal requested a review from Xuanwo May 29, 2026 10:14
@Xuanwo
Collaborator

Xuanwo commented May 29, 2026

Is it possible for us to not rely on listing?

@BubbleCal
Contributor Author

Let's remove listing in the next PR!

@BubbleCal BubbleCal merged commit 334ffb7 into main May 29, 2026
38 of 39 checks passed
@BubbleCal BubbleCal deleted the yang/azure-fts-indexing-throttle-fix branch May 29, 2026 14:59