fix: honor skip_special_tokens from extra_body on the vLLM completion path by DongjiGao · Pull Request #1488 · NVIDIA-NeMo/Skills

DongjiGao · 2026-06-15T18:41:03Z

Recreated from an in-repo branch so CI can run on it (fork PRs do not receive CI secrets). Replaces #1478.

What

The vLLM text-completion request builder hardcoded skip_special_tokens: False at the top level, while _build_request_body could simultaneously carry a user-supplied skip_special_tokens inside extra_body. The field was sourced twice, so any extra_body.skip_special_tokens was silently dropped on the completion path.

This pops skip_special_tokens out of extra_body into the single top-level field (default False), so inference.extra_body.skip_special_tokens is honored on the completion path — matching how the chat path already behaves.

Why

skip_special_tokens=false keeps special tokens (e.g. speaker tags) in the decoded output, which multispeaker ASR needs. The chat endpoint already respects this via extra_body; this makes the completion endpoint consistent, without adding any new config surface.

Usage (just extra_body, no new options):

++inference.extra_body.skip_special_tokens=false

Test

tests/test_vllm_completion.py:

default → top-level skip_special_tokens=False, absent from extra_body
extra_body.skip_special_tokens=True → lifted to the single top-level field, not duplicated

Summary by CodeRabbit

New Features
- The skip_special_tokens parameter in inference requests is now configurable via request options, providing greater control over token processing behavior.
Tests
- Added tests to validate skip_special_tokens parameter handling and configuration.

… path The text-completion request hardcoded `skip_special_tokens: False` at the top level while `_build_request_body` could also carry a user-supplied `skip_special_tokens` inside `extra_body`, so the field was sourced twice and the extra_body value was silently dropped. Pop it from extra_body into the single top-level field (default False) so `inference.extra_body.skip_special_tokens` is honored on the completion path, matching how the chat path already behaves (e.g. `=false` keeps speaker tags for multispeaker ASR). Signed-off-by: Dongji Gao <dongjig@nvidia.com>

coderabbitai · 2026-06-15T18:43:13Z

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: c04ba3bb-20e8-4bdb-affc-59256eb92628

📥 Commits

Reviewing files that changed from the base of the PR and between da85a88 and 35bf527.

📒 Files selected for processing (2)

nemo_skills/inference/model/vllm.py
tests/test_vllm_completion.py

📝 Walkthrough

Walkthrough

VLLMModel._build_completion_request_params is refactored to build request_extra_body once via _build_request_body(...), then use pop("skip_special_tokens", False) to extract that field (defaulting to False) instead of hardcoding it. A new test module adds two unit tests verifying the default and override behaviors.

Changes

skip_special_tokens extraction from extra_body

Layer / File(s)	Summary
Completion request assembly and unit tests `nemo_skills/inference/model/vllm.py`, `tests/test_vllm_completion.py`	`_build_completion_request_params` now calls `_build_request_body(...)` once into `request_extra_body`, pops `skip_special_tokens` from it (defaulting to `False`), and sets `extra_body` to the same dict. Two unit tests assert that `skip_special_tokens` defaults to `False` with no duplication in `extra_body`, and that a caller-supplied `skip_special_tokens=True` is promoted to the top-level field and absent from `extra_body`.

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~8 minutes

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 60.00% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (4 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The PR title 'fix: honor skip_special_tokens from extra_body on the vLLM completion path' directly and specifically describes the main change: fixing the vLLM completion endpoint to properly handle the skip_special_tokens parameter from extra_body.
Linked Issues check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check	✅ Passed	Check skipped because no linked issues were found for this pull request.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

📝 Generate docstrings

Create stacked PR
Commit on current branch

🧪 Generate unit tests (beta)

Create PR with unit tests
Commit unit tests in branch fix/preserve-special-tokens

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

DongjiGao mentioned this pull request Jun 15, 2026

fix: honor skip_special_tokens from extra_body on the vLLM completion path #1478

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

fix: honor skip_special_tokens from extra_body on the vLLM completion path#1488

fix: honor skip_special_tokens from extra_body on the vLLM completion path#1488
DongjiGao wants to merge 1 commit into
mainfrom
fix/preserve-special-tokens

DongjiGao commented Jun 15, 2026 •

edited by coderabbitai Bot

Loading

Uh oh!

coderabbitai Bot commented Jun 15, 2026

Walkthrough

Changes

Estimated code review effort

❌ Failed checks (1 warning)

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Uh oh!

Conversation

DongjiGao commented Jun 15, 2026 • edited by coderabbitai Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

What

Why

Test

Summary by CodeRabbit

Uh oh!

coderabbitai Bot commented Jun 15, 2026

Walkthrough

Changes

Estimated code review effort

❌ Failed checks (1 warning)

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

DongjiGao commented Jun 15, 2026 •

edited by coderabbitai Bot

Loading