Extremely slow (3 t/s) and can't setup llama.cpp + a couple of errors

Hey there, I have just tried Mux, and I face two problems that make the app non-usable at the moment.

1. Getting extremely low speeds with Qwen 3.6 35b a3b:
- Using OpenAI compatible URL for LM Studio, I get 3 t/s speed (note: my average LM Studio speed is 72 t/s)

2. Using Qwen 3.5 9b, it works fast but I get a couple of "Invalid type for 'input'." errors during chats:
- Speed: LM Studio ~ 86 t/s while Mux is ~80 t/s

<img width="991" height="737" alt="Image" src="https://github.com/user-attachments/assets/52b735dd-9dfc-4645-94a2-1aa86320b43b" />

3. When I set llama.cpp up via OpenAI compatible URL, then try to chat, it keeps showing "Cannot determine type of 'item'
" error:

<img width="920" height="211" alt="Image" src="https://github.com/user-attachments/assets/8faefa11-b445-4bac-903f-3dd47c1934fd" />

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Extremely slow (3 t/s) and can't setup llama.cpp + a couple of errors #3186

Metadata

Assignees

Labels

Type

Fields

Projects

Milestone

Relationships

Development

Extremely slow (3 t/s) and can't setup llama.cpp + a couple of errors #3186

Description

Metadata

Metadata

Assignees

Labels

Type

Fields

Projects

Milestone

Relationships

Development

Issue actions