Prerequisites
Expected Behavior
Llama.embed() should successfully compute embeddings when called on a model constructed with embeddings=True.
Current Behavior
Llama.embed() raises a TypeError immediately, before any embedding is computed:
TypeError: LlamaBatch.add_sequence() missing 1 required positional argument: 'logits_array'
The cause: Llama.embed() in llama_cpp/llama.py (around line 1678) calls add_sequence with three positional arguments:
self._batch.add_sequence(tokens, p_batch, logits_all)
But LlamaBatch.add_sequence in llama_cpp/_internals.py (around line 1013) requires four:
def add_sequence(
    self,
    token_array: Sequence[int],
    pos_array: Sequence[int],
    seq_ids: Sequence[Sequence[int]],
    logits_array: Sequence[bool],
):
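For context, the TypeError is ordinary positional binding: the three legacy arguments land on token_array, pos_array, and seq_ids, leaving logits_array unfilled. A minimal standalone sketch with a stub function (not the real LlamaBatch) reproduces the same error:

from typing import Sequence

def add_sequence(
    token_array: Sequence[int],
    pos_array: Sequence[int],
    seq_ids: Sequence[Sequence[int]],
    logits_array: Sequence[bool],
) -> None:
    ...

# Legacy 3-arg call: tokens -> token_array, seq_id -> pos_array,
# logits_all -> seq_ids; logits_array is left unbound.
add_sequence([1, 2, 3], 0, True)
# TypeError: add_sequence() missing 1 required positional argument: 'logits_array'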
llama_cpp/llama_embedding.py (around line 262) already calls add_sequence correctly with the four-arg shape — the call site in Llama.embed() was apparently missed during the LlamaBatch.add_sequence refactor.
Environment and Context
- Hardware: x86_64, NVIDIA GeForce RTX 4090
- OS: Windows 10 22H2
- Python 3.12.9
- llama-cpp-python 0.3.36 (CUDA 12.8 prebuilt wheel)
$ python --version
Python 3.12.9
$ pip show llama-cpp-python | findstr Version
Version: 0.3.36
Failure Information (for bugs)
This is a clean regression — LlamaBatch.add_sequence was refactored from a 3-arg signature to a 4-arg one, and the call sites were updated everywhere except in Llama.embed(). llama_embedding.py shows what the new shape should look like for the embedding code path.
Steps to Reproduce
from llama_cpp import Llama
m = Llama(model_path="path/to/model.gguf", embeddings=True)
m.embed("hello")
Result:
TypeError: LlamaBatch.add_sequence() missing 1 required positional argument: 'logits_array'
Failure Logs
Traceback (most recent call last):
  File "...\Lib\site-packages\llama_cpp\llama.py", line 1678, in embed
    self._batch.add_sequence(tokens, p_batch, logits_all)
TypeError: LlamaBatch.add_sequence() missing 1 required positional argument: 'logits_array'
Suggested fix
Mirror the call shape already used in llama_cpp/llama_embedding.py:
# In llama.py Llama.embed(), replace:
self._batch.add_sequence(tokens, p_batch, logits_all)
# With something like:
self._batch.add_sequence(
    token_array=tokens,
    pos_array=list(range(len(tokens))),
    seq_ids=[p_batch],  # or [[p_batch]] * len(tokens) if per-token seq-id lists are expected
    logits_array=(
        [True] * len(tokens)
        if logits_all
        else [False] * (len(tokens) - 1) + [True]
    ),
)
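As a quick sanity check of the logits_array expression (a hypothetical 4-token prompt; the logits_all=False branch assumes last-token pooling):

tokens = [101, 102, 103, 104]  # hypothetical token ids

# logits_all=True: logits requested at every position.
assert [True] * len(tokens) == [True, True, True, True]

# logits_all=False: logits requested only at the final position.
assert [False] * (len(tokens) - 1) + [True] == [False, False, False, True]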
Workaround
Monkey-patching LlamaBatch.add_sequence to detect legacy 3-arg calls, synthesize the missing pos_array, and remap the remaining arguments works as a stopgap; a sketch follows below. This was hit while running Tencent's HY-Motion text-to-motion model, whose text encoder uses Llama.embed() against GGUF Qwen3 weights.
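A minimal sketch of that monkey-patch, assuming the 4-arg signature shown above and that the legacy third argument is a bool logits_all (the per-token seq_ids shape is a guess, not verified against llama_embedding.py):

import llama_cpp._internals as internals

_orig_add_sequence = internals.LlamaBatch.add_sequence

def _patched_add_sequence(self, *args, **kwargs):
    # Detect the legacy 3-positional-arg shape: (tokens, seq_id, logits_all).
    if len(args) == 3 and not kwargs:
        tokens, seq_id, logits_all = args
        return _orig_add_sequence(
            self,
            token_array=tokens,
            pos_array=list(range(len(tokens))),
            seq_ids=[[seq_id]] * len(tokens),  # assumed: one seq-id list per token
            logits_array=(
                [True] * len(tokens)
                if logits_all
                else [False] * (len(tokens) - 1) + [True]
            ),
        )
    # New-style calls pass through unchanged.
    return _orig_add_sequence(self, *args, **kwargs)

internals.LlamaBatch.add_sequence = _patched_add_sequence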