[rollout, vllm] feat: reject server-side tool parser for TITO agent rollout #6844
Open
Jiang020609 wants to merge 1 commit into
…ollout Follow-up to verl-project#6560. Per review feedback that RL rollout uses token-in-token-out (TITO) generation with client-side AgentLoop tool parsing, reject server-side vLLM tool parser args (enable_auto_tool_choice, tool_call_parser, tool_parser_plugin) when multi-turn / tool-agent rollout is enabled, instead of exposing them. Add CPU coverage for multi_turn, tool_agent, and function_tool_path configurations. Co-authored-by: OpenAI Codex <codex@openai.com>
Contributor
Code Review
This pull request introduces validation logic in RolloutConfig to prevent the configuration of server-side vLLM tool parser arguments when client-side AgentLoop tool parsing is used (such as in multi-turn RL rollouts). It also adds corresponding unit tests to verify that the appropriate ValueError is raised under these conditions. There are no review comments, so I have no feedback to provide.
What does this PR do?
Follow-up to #6560. That PR proposed exposing server-side vLLM tool-calling config (`enable_auto_tool_choice`, `tool_call_parser`, `tool_parser_plugin`) in the rollout config. Per review feedback there, RL rollout uses token-in-token-out (TITO) generation, so tool parsing happens client-side in the AgentLoop and these server-side parser args should not be passed to vLLM at all.
This PR implements that guidance as a fail-fast guardrail: when a vLLM rollout enables the AgentLoop tool path (multi-turn, a non-default `default_agent_loop`, `tool_config_path`, or `function_tool_path`), configuring any of the server-side vLLM tool parser args in `rollout.engine_kwargs.vllm` now raises a clear `ValueError` instead of being silently passed through to the engine (where it is redundant/conflicting with client-side parsing).

Plain chat-completion rollouts (multi-turn disabled) are unaffected and may still set these args.
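The trigger condition described above can be sketched as a small predicate. This is an illustrative sketch, not the code in the diff: the flat key names (`multi_turn_enable` and friends) and the default agent loop name `single_turn_agent` are assumptions made for the example, and the real config nests these fields.

```python
from typing import Any, Mapping

# Assumption: this name stands in for the default agent loop; the real
# default is whatever the rollout config defines.
ASSUMED_DEFAULT_AGENT_LOOP = "single_turn_agent"


def uses_agent_loop_tools(rollout: Mapping[str, Any]) -> bool:
    """Return True if any of the four triggers named in the PR text is active.

    The flat key layout is hypothetical and chosen only to keep the sketch short.
    """
    agent_loop = rollout.get("default_agent_loop", ASSUMED_DEFAULT_AGENT_LOOP)
    return bool(
        rollout.get("multi_turn_enable", False)         # multi-turn rollout
        or agent_loop != ASSUMED_DEFAULT_AGENT_LOOP     # non-default agent loop
        or rollout.get("tool_config_path")              # tool config supplied
        or rollout.get("function_tool_path")            # function tools supplied
    )
```

Under these assumptions, `uses_agent_loop_tools({"default_agent_loop": "tool_agent"})` is true, while an empty mapping (a plain chat-completion rollout) is false.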
Checklist Before Starting
`[{modules}] {type}: {description}`

Test
Config validation is CPU-testable; no training experiment is needed (no change
to training dynamics — this only rejects an invalid config combination).
python -m pytest tests/workers/config/test_rollout_config_on_cpu.py -q   # 6 passed

The test file is named `*_on_cpu.py`, so it is auto-collected by `.github/workflows/cpu_unit_tests.yml` (which runs `tests/**/test_*_on_cpu.py` on CPU). Coverage includes:

- `tool_agent` rejects server-side parser args
- `function_tool_path` rejects server-side parser args
- `engine_kwargs.vllm` (e.g. `gpu_memory_utilization`) is untouched

API and Usage Example
No new config fields. Behavior change only: an invalid combination now fails early.
Design & Code Changes
- `verl/workers/config/rollout.py`: in `RolloutConfig.__post_init__`, after the existing rollout validations, detect AgentLoop tool usage and, for the `vllm` backend, raise `ValueError` if any of `{enable_auto_tool_choice, tool_call_parser, tool_parser_plugin}` is set to a non-default value.
- `tests/workers/config/test_rollout_config_on_cpu.py`: new CPU unit tests covering the rejection paths and the allowed paths.
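For readers unfamiliar with the pattern, here is a minimal sketch of a fail-fast check in a dataclass `__post_init__`. It is not the code from the diff: `RolloutConfigSketch` and its `agent_loop_tools_enabled` flag are hypothetical stand-ins, and "non-default" is approximated as "truthy", which is an assumption about the defaults.

```python
from dataclasses import dataclass, field
from typing import Any

# The three server-side vLLM tool parser args named in this PR.
SERVER_SIDE_TOOL_PARSER_ARGS = (
    "enable_auto_tool_choice",
    "tool_call_parser",
    "tool_parser_plugin",
)


@dataclass
class RolloutConfigSketch:
    """Hypothetical stand-in for RolloutConfig with only the fields the check needs."""

    name: str = "vllm"
    # Stand-in for the result of the AgentLoop tool-usage detection.
    agent_loop_tools_enabled: bool = False
    engine_kwargs: dict[str, dict[str, Any]] = field(default_factory=dict)

    def __post_init__(self) -> None:
        # Only the vllm backend with client-side tool parsing is restricted.
        if self.name != "vllm" or not self.agent_loop_tools_enabled:
            return
        vllm_kwargs = self.engine_kwargs.get("vllm") or {}
        # Assumption: any truthy value counts as "set to a non-default value".
        conflicting = [k for k in SERVER_SIDE_TOOL_PARSER_ARGS if vllm_kwargs.get(k)]
        if conflicting:
            raise ValueError(
                f"Server-side vLLM tool parser args {conflicting} conflict with "
                "client-side AgentLoop tool parsing; remove them from "
                "rollout.engine_kwargs.vllm."
            )
```

With this sketch, constructing `RolloutConfigSketch(agent_loop_tools_enabled=True, engine_kwargs={"vllm": {"tool_call_parser": "hermes"}})` raises `ValueError`. The same kwargs with the flag left at `False` are accepted, and unrelated keys such as `gpu_memory_utilization` never trigger the check.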
Checklist Before Submitting
`cpu_unit_tests.yml` via the `*_on_cpu.py` suffix.
`recipe` submodule.