feat: add count modes to LostInTheMiddleRanker #11441

Closed
kota-wilson wants to merge 1 commit into deepset-ai:main from kota-wilson:feat/lost-in-the-middle-count-mode

kota-wilson (Contributor) commented May 30, 2026

Related Issues

Proposed Changes:

  • Add a count_mode parameter to LostInTheMiddleRanker with word, char, and token modes.
  • Add tokenizer_encoding for token counting with tiktoken.
  • Preserve the existing word-based word_count_threshold behavior by default.
  • Update tests, docs, and release notes.
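For readers skimming the list above, here is a minimal sketch of what the three count modes could look like. It is not the code from this PR: `count_units` is a hypothetical helper, and the assumption is that word mode splits on whitespace, char mode uses the string length, and token mode uses `tiktoken.get_encoding(...).encode(...)`.

```python
from typing import Literal

CountMode = Literal["word", "char", "token"]


def count_units(text: str, count_mode: CountMode = "word", tokenizer_encoding: str = "o200k_base") -> int:
    """Hypothetical helper: measure `text` in the unit chosen by `count_mode`."""
    if count_mode == "word":
        # Whitespace-separated words (assumed to match the existing word-based behavior).
        return len(text.split())
    if count_mode == "char":
        # Plain character count, including spaces.
        return len(text)
    if count_mode == "token":
        # Imported lazily so word and char modes work without tiktoken installed.
        import tiktoken

        return len(tiktoken.get_encoding(tokenizer_encoding).encode(text))
    raise ValueError(f"Unsupported count_mode: {count_mode!r}")
```

With this sketch, `count_units("lost in the middle")` counts 4 words and `count_units("lost in the middle", "char")` counts 18 characters; the token count depends on the chosen encoding.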

How did you test it?

  • hatch -e test run pytest test/components/rankers/test_lost_in_the_middle.py -q passed: 24 passed.
  • hatch run fmt-check haystack/components/rankers/lost_in_the_middle.py test/components/rankers/test_lost_in_the_middle.py passed.
  • hatch run test:types haystack/components/rankers/lost_in_the_middle.py passed.
  • hatch run test:types test/components/rankers/test_lost_in_the_middle.py passed.
  • npm run build passed in docs-website.
  • pre-commit run --files ... passed.
  • git diff --cached --check passed before commit.

Notes for the reviewer

The word_count_threshold parameter name is preserved for backward compatibility; the unit is selected by count_mode. Token mode follows the existing RecursiveDocumentSplitter default encoding (o200k_base).
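To illustrate the note above, the sketch below shows how one threshold value yields different cut-offs depending on the unit. It is an assumption-laden illustration, not the ranker's implementation: `take_within_budget` is a hypothetical helper, and the policy of keeping the document that crosses the threshold is assumed for the example.

```python
from typing import Callable, List


def take_within_budget(texts: List[str], threshold: int, count: Callable[[str], int]) -> List[str]:
    """Hypothetical helper: keep texts in order until the running total reaches `threshold`."""
    kept: List[str] = []
    total = 0
    for text in texts:
        kept.append(text)
        total += count(text)
        # Assumed policy: the text that crosses the threshold is still kept.
        if total >= threshold:
            break
    return kept


docs = ["alpha beta", "gamma delta epsilon", "zeta"]

# Same threshold value, different units.
by_words = take_within_budget(docs, 4, lambda t: len(t.split()))
by_chars = take_within_budget(docs, 4, len)
```

Here the threshold of 4 keeps the first two documents when counting words but only the first when counting characters, which is why the unit chosen by count_mode matters even though the parameter name stays the same.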

Checklist

  • I have read the contributors guidelines and the code of conduct.
  • I have updated the related issue with new insights and changes.
  • I have added unit tests and updated the docstrings.
  • I've used one of the conventional commit types for my PR title: fix:, feat:, build:, chore:, ci:, docs:, style:, refactor:, perf:, test: and added ! in case the PR includes breaking changes.
  • I have documented my code.
  • I have added a release note file, following the contributors guidelines.
  • I have run pre-commit hooks and fixed any issue.

@kota-wilson kota-wilson requested a review from a team as a code owner May 30, 2026 19:22
@kota-wilson kota-wilson requested review from anakin87 and removed request for a team May 30, 2026 19:22
vercel Bot commented May 30, 2026

@kota-wilson is attempting to deploy a commit to the deepset Team on Vercel.

A member of the Team first needs to authorize it.

@github-actions github-actions Bot added the topic:tests and type:documentation labels May 30, 2026
anakin87 (Member) commented Jun 1, 2026

Hey @kota-wilson, thanks for this PR.

As indicated in #11351 (comment), "we would like to keep functionality as is for now".

If you see a real need for this change, please comment or discuss on #11351.

@anakin87 anakin87 closed this Jun 1, 2026
@kota-wilson kota-wilson deleted the feat/lost-in-the-middle-count-mode branch June 2, 2026 03:56
Labels

topic:tests type:documentation Improvements on the docs

Development

Successfully merging this pull request may close these issues.

feat: support token-based budget in LostInTheMiddleRanker

2 participants