feat(dynamo): add native token wrapper transport #9

Merged
jthomson04 merged 1 commit into dynamo-k8s-integration from codex/dynamo-native-token-wrapper on Jun 26, 2026

jthomson04 (Owner) commented Jun 26, 2026

Summary

  • add a NeMo-RL HTTP token wrapper in front of Dynamo chat completions when the Dynamo generation path exposes an OpenAI-compatible rollout server
  • have the wrapper render chat prompts to token IDs, send them to Dynamo via nvext.token_data, and request only nvext.engine_data
  • require successful Dynamo chat responses to include native nvext.engine_data.prompt_token_ids and completion_token_ids, so Gym does not fall back to Dynamo /tokenize
  • share prefix-substitution logic between the existing vLLM worker path and the Dynamo wrapper
  • pass tokenizer state into Dynamo generation from GRPO, while keeping direct /completions calls pointed at the real Dynamo frontend
  • update DGD worker manifests to launch Dynamo workers with --enable-rl and handle newer rl worker discovery/pause semantics for MX refits
  • update the Gym submodule pointer for native Dynamo engine_data token consumption; companion Gym PR: NVIDIA-NeMo/Gym#1784 ("feat(vllm-model): consume native dynamo token data")
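
For reviewers, the "require native token IDs" rule from the summary can be pictured with a minimal sketch. It is not the code in token_wrapper.py: the function name, the exception class, and the assumption that `nvext` sits at the top level of the response body are all hypothetical and chosen only for illustration. Only the field names `nvext.engine_data.prompt_token_ids` and `completion_token_ids` come from the summary above.

```python
from typing import Any


class MissingEngineDataError(ValueError):
    """Hypothetical error for a response without native token IDs."""


def extract_engine_token_ids(response: dict[str, Any]) -> tuple[list[int], list[int]]:
    """Return (prompt_token_ids, completion_token_ids) or fail loudly.

    Assumes the response carries a top-level "nvext" object; the real
    response layout may differ.
    """
    nvext = response.get("nvext") or {}
    engine_data = nvext.get("engine_data") or {}

    extracted: list[list[int]] = []
    for key in ("prompt_token_ids", "completion_token_ids"):
        value = engine_data.get(key)
        # Reject missing or malformed fields instead of re-tokenizing,
        # so the caller never needs a /tokenize fallback.
        if not isinstance(value, list) or not all(isinstance(t, int) for t in value):
            raise MissingEngineDataError(f"response is missing nvext.engine_data.{key}")
        extracted.append(value)

    return extracted[0], extracted[1]
```

Failing fast here, rather than silently re-tokenizing, is what keeps the token IDs used for training identical to the ones the engine actually consumed and produced.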

Validation

  • git diff --cached --check
  • uv run ruff check nemo_rl/algorithms/grpo.py nemo_rl/models/generation/dynamo/dynamo_generation.py nemo_rl/models/generation/dynamo/token_wrapper.py nemo_rl/models/generation/vllm/vllm_worker_async.py nemo_rl/utils/prefix_reuse.py tests/unit/models/generation/test_dynamo_generation.py tests/unit/models/generation/test_dynamo_token_wrapper.py tests/unit/models/generation/test_vllm_generation.py tests/unit/utils/test_prefix_reuse.py
  • uv run pytest tests/unit/models/generation/test_dynamo_token_wrapper.py tests/unit/utils/test_prefix_reuse.py tests/unit/models/generation/test_dynamo_generation.py -q -> 44 passed, 1 warning
  • uv run pytest tests/unit/models/generation/test_vllm_generation.py -q -k replace_prefix_tokens -> 5 passed, 37 deselected
  • in 3rdparty/Gym-workspace/Gym: uv run pytest responses_api_models/vllm_model/tests/test_app.py -q -k TokenIDInformation -> 3 passed, 66 deselected
  • in 3rdparty/Gym-workspace/Gym: uv run ruff check responses_api_models/vllm_model/app.py responses_api_models/vllm_model/tests/test_app.py

Signed-off-by: jthomson04 <jwillthomson19@gmail.com>
@github-actions

✅ Submodule Fast-Forward Check Results

Check based on commit: 94d4852 (PR #9 from codex/dynamo-native-token-wrapper)

✅ Submodules that are properly updated:

Gym: ✅ PR branch is ahead of dynamo-k8s-integration branch (fast-forward)

All submodule changes look good! ✨

@jthomson04 jthomson04 marked this pull request as ready for review June 26, 2026 20:17
@jthomson04 jthomson04 merged commit 8b9cbfd into dynamo-k8s-integration Jun 26, 2026
10 of 11 checks passed