[rollout, vllm] feat: reject server-side tool parser for TITO agent rollout #6844

Open
Jiang020609 wants to merge 1 commit into verl-project:main from Jiang020609:fix/vllm-tito-tool-parser-config
@Jiang020609

What does this PR do?

Follow-up to #6560. That PR proposed exposing server-side vLLM tool-calling
config (enable_auto_tool_choice, tool_call_parser, tool_parser_plugin) in
the rollout config. Per review feedback there, RL rollout uses
token-in-token-out (TITO) generation, so tool parsing happens client-side
in the AgentLoop and these server-side parser args should not be passed to
vLLM at all.

This PR implements that guidance as a fail-fast guardrail: when a vLLM
rollout enables the AgentLoop tool path (multi-turn, a non-default
default_agent_loop, tool_config_path, or function_tool_path), configuring
any of the server-side vLLM tool parser args in rollout.engine_kwargs.vllm
now raises a clear ValueError instead of being silently passed through to the
engine (where they are redundant or conflict with client-side parsing).

Plain chat-completion rollouts (multi-turn disabled) are unaffected and may
still set these args.
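
The four triggers listed above can be sketched as a single predicate. This is a minimal illustration, not the code in this PR: the function name, the flat dict layout, the placement of tool_config_path and function_tool_path under multi_turn, and the default loop name "single_turn_agent" are all assumptions made for the example.

```python
from typing import Any, Mapping

# Assumed name of the default agent loop; the PR only says "non-default".
DEFAULT_AGENT_LOOP = "single_turn_agent"


def uses_agent_loop_tools(rollout: Mapping[str, Any]) -> bool:
    """Return True if any of the four AgentLoop tool triggers is configured.

    Hypothetical helper: key names and nesting are assumptions for this sketch.
    """
    multi_turn = rollout.get("multi_turn") or {}
    agent = rollout.get("agent") or {}
    agent_loop = agent.get("default_agent_loop") or DEFAULT_AGENT_LOOP
    return bool(
        multi_turn.get("enable")                 # multi-turn rollout enabled
        or agent_loop != DEFAULT_AGENT_LOOP      # e.g. "tool_agent"
        or multi_turn.get("tool_config_path")    # tool config file supplied
        or multi_turn.get("function_tool_path")  # function tools supplied
    )
```

A plain chat-completion rollout leaves all four triggers unset, so the predicate returns False and the parser args are never inspected.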

Checklist Before Starting

Test

Config validation is CPU-testable; no training experiment is needed (no change
to training dynamics — this only rejects an invalid config combination).

python -m pytest tests/workers/config/test_rollout_config_on_cpu.py -q
# 6 passed

The test file is named *_on_cpu.py, so it is auto-collected by
.github/workflows/cpu_unit_tests.yml (which runs tests/**/test_*_on_cpu.py
on CPU). Coverage includes:

  • vLLM multi-turn rejects server-side parser args (dataclass and dict config)
  • vLLM tool_agent rejects server-side parser args
  • vLLM function_tool_path rejects server-side parser args
  • chat-completion (multi-turn disabled) still allows the args
  • unrelated engine_kwargs.vllm (e.g. gpu_memory_utilization) is untouched
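
Because the tests exercise both dataclass and dict configs, the validation has to read fields from either shape. One way this could be done is sketched below; read_field and MultiTurnStub are hypothetical names for illustration and are not taken from the PR.

```python
from collections.abc import Mapping
from dataclasses import dataclass
from typing import Any


@dataclass
class MultiTurnStub:
    """Hypothetical stand-in for a multi-turn config dataclass."""

    enable: bool = False


def read_field(cfg: Any, key: str, default: Any = None) -> Any:
    """Read `key` from a mapping or from an attribute-style config object."""
    if cfg is None:
        return default
    if isinstance(cfg, Mapping):
        return cfg.get(key, default)
    return getattr(cfg, key, default)


# Both config shapes yield the same answer.
enabled_from_dataclass = read_field(MultiTurnStub(enable=True), "enable", False)
enabled_from_dict = read_field({"enable": True}, "enable", False)
```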

API and Usage Example

No new config fields. Behavior change only: an invalid combination now fails early.

from verl.workers.config import RolloutConfig, MultiTurnConfig

# Raises ValueError: server-side tool parser args are not allowed for TITO AgentLoop rollout
RolloutConfig(
    name="vllm",
    multi_turn=MultiTurnConfig(enable=True),
    engine_kwargs={"vllm": {"enable_auto_tool_choice": True, "tool_call_parser": "hermes"}},
)

# Still allowed without multi-turn (plain chat-completion rollout)
RolloutConfig(
    name="vllm",
    multi_turn=MultiTurnConfig(enable=False),
    engine_kwargs={"vllm": {"enable_auto_tool_choice": True, "tool_call_parser": "hermes"}},
)

Design & Code Changes

  • verl/workers/config/rollout.py: in RolloutConfig.__post_init__, after the
    existing rollout validations, detect AgentLoop tool usage and, for the vllm
    backend, raise ValueError if any of {enable_auto_tool_choice,
    tool_call_parser, tool_parser_plugin} is set to a non-default value.
  • tests/workers/config/test_rollout_config_on_cpu.py: new CPU unit tests
    covering the rejection paths and the allowed paths.
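
The __post_init__ check described above might look roughly like the following. This is a sketch under stated assumptions, not the PR's implementation: RolloutConfigSketch and its agent_loop_tools flag are hypothetical stand-ins, and "non-default" is approximated here as any truthy value (so False, None, and "" are treated as defaults).

```python
from dataclasses import dataclass, field
from typing import Any

SERVER_SIDE_TOOL_PARSER_ARGS = (
    "enable_auto_tool_choice",
    "tool_call_parser",
    "tool_parser_plugin",
)


@dataclass
class RolloutConfigSketch:
    """Hypothetical stand-in for RolloutConfig, reduced to the relevant fields."""

    name: str = "vllm"
    agent_loop_tools: bool = False  # stands in for the AgentLoop detection step
    engine_kwargs: dict[str, Any] = field(default_factory=dict)

    def __post_init__(self) -> None:
        # Only the vllm backend with AgentLoop tool usage is guarded.
        if self.name != "vllm" or not self.agent_loop_tools:
            return
        vllm_kwargs = self.engine_kwargs.get("vllm") or {}
        # Assumption: falsy values (False, None, "") count as "default".
        offending = [k for k in SERVER_SIDE_TOOL_PARSER_ARGS if vllm_kwargs.get(k)]
        if offending:
            raise ValueError(
                f"Server-side vLLM tool parser args {offending} are not allowed "
                "for TITO AgentLoop rollout; tool calls are parsed client-side."
            )
```

Listing the offending keys in the message keeps the failure actionable, and unrelated keys such as gpu_memory_utilization pass through untouched because only the three named args are inspected.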

Checklist Before Submitting

  • Read the Contribute Guide.
  • Apply pre-commit checks (ruff / ruff-format / mypy pass; files unchanged by formatters).
  • Add / Update the documentation. — N/A: no new config surface; this only rejects an already-invalid combination.
  • Add unit test(s) to the CI workflow — auto-collected by cpu_unit_tests.yml via the *_on_cpu.py suffix.
  • Not related to the recipe submodule.

…ollout

Follow-up to verl-project#6560. Per review feedback that RL rollout uses
token-in-token-out (TITO) generation with client-side AgentLoop tool
parsing, reject server-side vLLM tool parser args (enable_auto_tool_choice,
tool_call_parser, tool_parser_plugin) when multi-turn / tool-agent rollout
is enabled, instead of exposing them. Add CPU coverage for multi_turn,
tool_agent, and function_tool_path configurations.

Co-authored-by: OpenAI Codex <codex@openai.com>

@gemini-code-assist gemini-code-assist Bot left a comment

Code Review

This pull request introduces validation logic in RolloutConfig to prevent the configuration of server-side vLLM tool parser arguments when client-side AgentLoop tool parsing is used (such as in multi-turn RL rollouts). It also adds corresponding unit tests to verify that the appropriate ValueError is raised under these conditions. There are no review comments, so I have no feedback to provide.
