feat(vllm-model): consume native dynamo token data by jthomson04 · Pull Request #1784 · NVIDIA-NeMo/Gym

jthomson04 · 2026-06-26T18:56:52Z

Summary

- consume Dynamo-native token IDs from response.nvext.engine_data when a chat-completion response includes them
- attach prompt_token_ids, generation_token_ids, and generation_log_probs from the native response data without calling /tokenize
- keep the existing /tokenize fallback only for normal vLLM-style responses that do not include engine_data
- fail fast on malformed engine_data instead of silently falling back to /tokenize

The NeMo-RL Dynamo wrapper is responsible for requesting nvext.engine_data and supplying Dynamo nvext.token_data; this Gym change only consumes the response-side native token data.
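The consume, fall back, or fail fast behavior described in the summary can be sketched roughly as follows. This is an illustration only, not the code in app.py: the function name, the error type, and the key names inside engine_data are assumptions (the PR names only the attached fields, not the layout of engine_data), and treating a null engine_data as absent is also an assumption.

```python
from typing import Any, Optional


class MalformedEngineDataError(ValueError):
    """Hypothetical error type for engine_data that is present but unusable."""


def _token_ids(engine_data: dict[str, Any], key: str) -> list[int]:
    # Key names inside engine_data are assumed for illustration.
    value = engine_data.get(key)
    is_int_list = isinstance(value, list) and all(
        isinstance(t, int) and not isinstance(t, bool) for t in value
    )
    if not is_int_list:
        raise MalformedEngineDataError(f"engine_data.{key} must be a list of ints")
    return value


def extract_native_token_info(response: dict[str, Any]) -> Optional[dict[str, Any]]:
    """Return token info from nvext.engine_data, or None if it is absent.

    None means "normal vLLM-style response": the caller keeps using /tokenize.
    Present-but-malformed data raises instead of silently falling back.
    """
    nvext = response.get("nvext")
    engine_data = nvext.get("engine_data") if isinstance(nvext, dict) else None
    if engine_data is None:
        return None
    if not isinstance(engine_data, dict):
        raise MalformedEngineDataError("engine_data must be an object")

    prompt_ids = _token_ids(engine_data, "prompt_token_ids")
    generation_ids = _token_ids(engine_data, "generation_token_ids")

    log_probs = engine_data.get("generation_log_probs")
    if not isinstance(log_probs, list) or len(log_probs) != len(generation_ids):
        raise MalformedEngineDataError(
            "engine_data.generation_log_probs must match generation_token_ids in length"
        )
    if not all(isinstance(p, (int, float)) and not isinstance(p, bool) for p in log_probs):
        raise MalformedEngineDataError("engine_data.generation_log_probs must be numbers")

    return {
        "prompt_token_ids": prompt_ids,
        "generation_token_ids": generation_ids,
        "generation_log_probs": [float(p) for p in log_probs],
    }
```

Returning None only when engine_data is missing keeps the two paths distinct: a response that tried to provide native token data but got it wrong surfaces as an error rather than being masked by a re-tokenization.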

Companion NeMo-RL PR: jthomson04/RL#9

Validation

- uv run pytest responses_api_models/vllm_model/tests/test_app.py -q -k TokenIDInformation -> 3 passed, 66 deselected
- uv run ruff check responses_api_models/vllm_model/app.py responses_api_models/vllm_model/tests/test_app.py

…nize

When per-message prompt_token_ids/generation_token_ids are attached to assistant messages (training mode), populate the top-level required_prefix_token_ids field on both the chat-completion request and the separate tokenize request. Mirrors NeMoRLOpenAIChatRequestMixin auto-derive in nemo-rl's custom vLLM serving (vllm_worker_async.py).

Without this, Dynamo - which has the splice machinery server-side but no auto-derive - re-tokenizes the chat history each turn, breaking the byte-level token-contiguity invariant on multi-turn rollouts.

The fix must apply to BOTH endpoints because the contiguity assert in nemo_rl/environments/nemo_gym.py reads prompt_token_ids from the tokenize response, not the chat response. Patching only chat fails at the tokenize step.

Signed-off-by: jthomson04 <jwillthomson19@gmail.com>
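A rough sketch of the auto-derive step this commit message describes is shown below. It assumes the required prefix is the most recent assistant message's prompt_token_ids followed by its generation_token_ids; that derivation rule and both helper names are assumptions made for illustration, not taken from the diff or from NeMoRLOpenAIChatRequestMixin.

```python
from typing import Any, Optional


def derive_required_prefix_token_ids(messages: list[dict[str, Any]]) -> Optional[list[int]]:
    """Return the token prefix the server should splice in, or None.

    Assumption: the latest assistant message carrying both token fields
    holds the full history so far (its prompt plus its own generation).
    """
    for message in reversed(messages):
        if message.get("role") != "assistant":
            continue
        prompt_ids = message.get("prompt_token_ids")
        generation_ids = message.get("generation_token_ids")
        if prompt_ids is not None and generation_ids is not None:
            return list(prompt_ids) + list(generation_ids)
    return None


def with_required_prefix(body: dict[str, Any]) -> dict[str, Any]:
    """Return a copy of a request body with required_prefix_token_ids set.

    Intended to be applied to BOTH the chat-completion body and the
    tokenize body, so the two endpoints see the same prefix.
    """
    prefix = derive_required_prefix_token_ids(body.get("messages", []))
    if prefix is None:
        return body  # nothing attached (not training mode): leave unchanged
    return {**body, "required_prefix_token_ids": prefix}
```

Running the same helper over both request bodies is the point of the sketch: if only the chat request carried the prefix, the tokenize response could still come from a fresh re-tokenization and diverge from the tokens recorded on the previous turn.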

copy-pr-bot · 2026-06-26T18:56:56Z

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

jthomson04 mentioned this pull request Jun 26, 2026

feat(dynamo): add native token wrapper transport jthomson04/RL#9

Merged

feat(vllm-model): consume native dynamo token data

0be56ef

Signed-off-by: jthomson04 <jwillthomson19@gmail.com>

jthomson04 force-pushed the codex/dynamo-native-token-transport branch from 683aa26 to 0be56ef Compare June 26, 2026 20:08

jthomson04 changed the title ~~feat(vllm-model): support dynamo native token transport~~ feat(vllm-model): consume native dynamo token data Jun 26, 2026

jthomson04 wants to merge 2 commits into NVIDIA-NeMo:main from jthomson04:codex/dynamo-native-token-transport
