Consider using vLLM for LLM inference #15

Open. Opened on Mar 12, 2026.

Instead of using just Hugging Face Transformers, consider using vLLM:

https://github.com/vllm-project/vllm

vLLM performs continuous batching of incoming requests and is written primarily in Python (with C++/CUDA kernels underneath).
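As a rough sketch of what the switch could look like, the snippet below uses vLLM's offline `LLM.generate` API to produce completions for a list of prompts. The helper names `build_prompts` and `generate_with_vllm`, the prompt template, the model `facebook/opt-125m`, and the sampling values are all placeholders chosen for illustration, not anything this project uses today.

```python
from typing import List


def build_prompts(questions: List[str]) -> List[str]:
    """Wrap raw questions in a simple prompt template (hypothetical format)."""
    return [f"Question: {q}\nAnswer:" for q in questions]


def generate_with_vllm(prompts: List[str], model: str = "facebook/opt-125m") -> List[str]:
    """Return one completion per prompt using vLLM's offline inference API.

    vLLM is imported lazily so this module can be loaded on machines
    where it is not installed (it needs a supported accelerator).
    """
    from vllm import LLM, SamplingParams

    # Placeholder sampling settings; tune for the actual use case.
    sampling = SamplingParams(temperature=0.8, top_p=0.95, max_tokens=64)
    llm = LLM(model=model)

    # All prompts are submitted together; the engine schedules and
    # batches them internally instead of running them one by one.
    outputs = llm.generate(prompts, sampling)
    return [out.outputs[0].text for out in outputs]
```

Calling `generate_with_vllm(build_prompts(["What is vLLM?"]))` would return a list with one completion string per prompt. Whether this replaces the Transformers path or sits behind a flag is a design decision left open here.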
