feat: add LiteLLM backend for multi-provider benchmarking #800

Open
RheagalFire wants to merge 2 commits into vllm-project:main from RheagalFire:feat/add-litellm-provider

RheagalFire commented Jun 16, 2026

Summary

Adds a new litellm backend that routes generation requests through the LiteLLM SDK, enabling benchmarking across 100+ providers (Anthropic, Gemini, Bedrock, Groq, Cohere, Mistral, etc.) via a unified interface. Timing instrumentation matches the existing OpenAI HTTP backend so benchmark results are directly comparable.

Details

  • New LiteLLMBackend and LiteLLMBackendArgs following the existing Backend / BackendArgs registration pattern
  • Uses litellm.acompletion(stream=True) with drop_params=True for cross-provider compatibility
  • Reuses ChatCompletionsRequestHandler.format() to build messages from GenerationRequest.columns
  • Lazy-loaded via guidellm.extras.litellm so the optional dep doesn't break imports when not installed
  • litellm>=1.80.0,<1.87.0 added as optional dependency under [project.optional-dependencies].litellm
  • 21 unit tests covering args, registration, lifecycle, streaming dispatch, timing, and token usage
  • All ruff checks pass
  • All 392 existing backend tests still pass
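
The lazy-loading point above can be illustrated with a small sketch. This is not the PR's guidellm.extras.litellm module: LazyModule is a hypothetical helper, and the install hint assumes the package is published as guidellm with a litellm extra, as the dependency bullet suggests.

```python
import importlib
from types import ModuleType
from typing import Any, Optional


class LazyModule:
    """Hypothetical helper: defer importing an optional dependency until
    one of its attributes is first accessed."""

    def __init__(self, module_name: str, extra: str) -> None:
        self._module_name = module_name
        self._extra = extra
        self._module: Optional[ModuleType] = None

    def _load(self) -> ModuleType:
        # Import on first use and cache the module for later accesses.
        if self._module is None:
            try:
                self._module = importlib.import_module(self._module_name)
            except ImportError as exc:
                # Assumed install hint; the real message may differ.
                raise ImportError(
                    f"'{self._module_name}' is required for this backend. "
                    f"Install it with: pip install 'guidellm[{self._extra}]'"
                ) from exc
        return self._module

    def __getattr__(self, name: str) -> Any:
        # Only called for attributes not found on the wrapper itself,
        # so the private fields set in __init__ never reach this path.
        return getattr(self._load(), name)


# Creating the wrapper never imports anything, so importing the package
# keeps working even when the optional dependency is absent.
litellm = LazyModule("litellm", extra="litellm")

# Demonstration with a stdlib module so the sketch runs anywhere.
lazy_json = LazyModule("json", extra="demo")
print(lazy_json.dumps({"a": 1}))
```

With this shape, the ImportError only surfaces when the backend is actually used, and it carries the install hint instead of a bare missing-module error.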

Test Plan

  • Run pytest tests/unit/backends/litellm/ -v to verify unit tests
  • Run pytest tests/unit/backends/ -v to verify no regressions in existing backends
  • Live E2E verified with anthropic/claude-sonnet-4-6 via Azure Foundry:
    args = LiteLLMBackendArgs(
        model="anthropic/claude-sonnet-4-6",
        api_key="...",
        api_base="...",
        max_tokens=50,
    )
    
    Confirmed: streaming works, TTFT signal fires, token counts captured, timing fields populated.
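
As a rough illustration of what the TTFT and timing checks above measure, the sketch below times an arbitrary async stream of text deltas. It is not the PR's implementation: StreamTimings, time_stream, and fake_stream are hypothetical names, and the fake stream stands in for the chunks that litellm.acompletion(stream=True) would yield.

```python
import asyncio
import time
from dataclasses import dataclass
from typing import AsyncIterator, Optional


@dataclass
class StreamTimings:
    """Hypothetical timing record; field names are illustrative only."""

    request_start: float
    first_token: Optional[float] = None
    last_token: Optional[float] = None
    chunks: int = 0
    text: str = ""

    @property
    def ttft(self) -> Optional[float]:
        # Time to first token, in seconds, or None if nothing was streamed.
        if self.first_token is None:
            return None
        return self.first_token - self.request_start


async def time_stream(stream: AsyncIterator[str]) -> StreamTimings:
    """Consume a stream of text deltas and record per-request timings."""
    timings = StreamTimings(request_start=time.perf_counter())
    async for delta in stream:
        if not delta:
            continue  # ignore empty deltas (e.g. role-only chunks)
        now = time.perf_counter()
        if timings.first_token is None:
            timings.first_token = now  # the TTFT signal fires here
        timings.last_token = now
        timings.chunks += 1
        timings.text += delta
    return timings


async def fake_stream() -> AsyncIterator[str]:
    """Stand-in for a provider stream; a real backend would yield the
    content deltas from the provider's streaming response instead."""
    for delta in ["", "Hello", ", ", "world"]:
        await asyncio.sleep(0.01)
        yield delta


result = asyncio.run(time_stream(fake_stream()))
print(f"text={result.text!r} chunks={result.chunks} ttft={result.ttft:.4f}s")
```

Recording the first and last token timestamps against a single monotonic clock is what lets results from different backends be compared on the same scale.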

Related Issues

  • N/A (new feature)

  • "I certify that all code in this PR is my own, except as noted below."

Use of AI

  • Includes code generated or substantially modified by an AI agent
  • Includes tests generated or substantially modified by an AI agent

git log

commit 8e17f9b
Author: RheagalFire <arishalam121@gmail.com>
Date: Tue Jun 16 23:51:47 2026 +0530

feat: add LiteLLM backend for multi-provider benchmarking

Generated-by: Claude claude-opus-4-6
Signed-off-by: RheagalFire <arishalam121@gmail.com>

commit 93c94ca
Author: Aarish Irani <rheagalfire@gmail.com>
Date: Wed Jul 1 20:00:26 2026 +0530

fix: raise litellm minimum to 1.83, remove upper cap, regen lockfile

- Bump minimum litellm version to >=1.83.0 (post supply-chain fix)
- Remove <1.87.0 upper bound to allow latest releases
- Regenerate uv.lock via tox run -e lock
- Fix ruff formatting issues

Signed-off-by: Aarish Irani <rheagalfire@gmail.com>

Generated-by: Claude claude-opus-4-6
Signed-off-by: RheagalFire <arishalam121@gmail.com>
Signed-off-by: Aarish Irani <rheagalfire@gmail.com>

mergify Bot commented Jun 16, 2026

Contributor

Hi @RheagalFire, the DCO check has failed. Please click on DCO in the Checks section for instructions on how to resolve this.

RheagalFire force-pushed the feat/add-litellm-provider branch 2 times, most recently from d0d1bda to 1b1135f, on June 16, 2026 18:21
Generated-by: Claude claude-opus-4-6
Signed-off-by: RheagalFire <arishalam121@gmail.com>
RheagalFire force-pushed the feat/add-litellm-provider branch from 1b1135f to 8e17f9b on June 16, 2026 18:22
RheagalFire (Author)

cc @sjmonson

sjmonson left a comment

Collaborator

Just a quick issue I noticed. Also, I am guessing you didn't run the pre-check locally, since at minimum you will need to regenerate the lock file after changing dependencies. You can do that with tox run -e lock, and a plain tox run will run all the other CI tasks if you want a faster feedback loop.

Comment thread pyproject.toml Outdated
- Bump minimum litellm version to >=1.83.0 (post supply-chain fix)
- Remove <1.87.0 upper bound to allow latest releases
- Regenerate uv.lock via tox run -e lock
- Fix ruff formatting issues

Signed-off-by: Aarish Irani <rheagalfire@gmail.com>
RheagalFire (Author)

@sjmonson Fixed the version bounds.
