Add timeout for tool calls#1466
Conversation
|
No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review info⚙️ Run configurationConfiguration used: Path: .coderabbit.yaml Review profile: CHILL Plan: Enterprise Run ID: 📒 Files selected for processing (2)
🚧 Files skipped from review as they are similar to previous changes (1)
📝 WalkthroughWalkthroughAdds a per-tool-call timeout: config field on GenerationTaskConfig, propagated into get_tool_calling_model and ToolCallingWrapper, where asyncio.wait_for enforces the timeout and tests validate timeout and validation behavior. ChangesPer-Tool-Call Timeout Support
Sequence DiagramssequenceDiagram
participant GenerationTask
participant get_tool_calling_model
participant ToolCallingWrapper
GenerationTask->>get_tool_calling_model: setup_llm() forwards tool_call_timeout_s
get_tool_calling_model->>ToolCallingWrapper: passes tool_call_timeout_s to __init__
Note over ToolCallingWrapper: stores timeout for use during execution
sequenceDiagram
participant ToolCallingWrapper
participant asyncio as asyncio.wait_for
participant ToolManager
ToolCallingWrapper->>asyncio: wait_for(ToolManager.execute_tool(), tool_call_timeout_s)
alt Completes in time
ToolManager->>ToolCallingWrapper: returns result
else Timeout exceeded
asyncio->>ToolCallingWrapper: raises asyncio.TimeoutError
Note over ToolCallingWrapper: returns {"error": "tool call timed out"}
end
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes 🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Comment |
There was a problem hiding this comment.
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
nemo_skills/inference/model/tool_call.py (1)
53-69:⚠️ Potential issue | 🟠 Major | ⚡ Quick winValidate timeout bounds at construction.
tool_call_timeout_s <= 0will make tool execution fail immediately, which is easy to misconfigure and hard to diagnose. Enforce> 0orNone(disable) up front.Proposed fix
def __init__( self, model: BaseModel, tool_modules: list[str] | None = None, tool_overrides: dict | None = None, additional_config: dict | None = None, schema_overrides: dict | None = None, max_tool_calls: int = -1, tool_call_timeout_s: float | None = 300.0, ): self.model = model additional_config = additional_config or {} @@ self.schema_overrides = load_schema_overrides(schema_overrides) self.schema_mappings = {} # Built when tools are listed self.max_tool_calls = max_tool_calls + if tool_call_timeout_s is not None and tool_call_timeout_s <= 0: + raise ValueError("tool_call_timeout_s must be > 0 or None.") self.tool_call_timeout_s = tool_call_timeout_s🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@nemo_skills/inference/model/tool_call.py` around lines 53 - 69, Validate the tool_call_timeout_s parameter in the constructor where tool_call_timeout_s is assigned: ensure it is either None or > 0 and raise a ValueError with a clear message if tool_call_timeout_s is <= 0; update the assignment to set self.tool_call_timeout_s only after this check (referencing the parameter name tool_call_timeout_s and the instance attribute self.tool_call_timeout_s).
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@tests/test_mcp_clients.py`:
- Around line 241-243: SlowTool.execute currently ignores its inputs and always
sleeps; update SlowTool.execute to validate inputs and raise immediately on
unexpected values instead of proceeding to the timeout. Specifically, in
SlowTool.execute check that tool_name equals the expected tool identifier(s)
used by the tests (and/or that required keys exist in the arguments dict and
that no unknown keys are present), and raise a clear exception (e.g.,
ValueError/TypeError) when validation fails; only if validation passes keep the
artificial delay/return value. This ensures the test fails fast on wrong routing
or argument shapes.
---
Outside diff comments:
In `@nemo_skills/inference/model/tool_call.py`:
- Around line 53-69: Validate the tool_call_timeout_s parameter in the
constructor where tool_call_timeout_s is assigned: ensure it is either None or >
0 and raise a ValueError with a clear message if tool_call_timeout_s is <= 0;
update the assignment to set self.tool_call_timeout_s only after this check
(referencing the parameter name tool_call_timeout_s and the instance attribute
self.tool_call_timeout_s).
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Enterprise
Run ID: 3bac962a-3743-4972-90b0-24e3a9aff5b9
📒 Files selected for processing (4)
nemo_skills/inference/generate.pynemo_skills/inference/model/__init__.pynemo_skills/inference/model/tool_call.pytests/test_mcp_clients.py
Kipok
left a comment
There was a problem hiding this comment.
@gwarmstrong can you also check this? I think it's probably appropriate to have some client side timeout as currently, e.g. search tools don't see to have any configured (and in principle this can be used with external tool implementations that we don't have control over I guess). But ideally we somehow also enforce this on our own tool implementations so that the tool calls handle timeout appropriately directly, otherwise we might have a bunch of orphan processes
Signed-off-by: tamohannes <hovhannes.tamoyan@gmail.com>
e805ffa to
28de790
Compare
- Reject non-positive tool_call_timeout_s at construction (must be > 0, or None to disable). - SlowTool test helper fails fast on unexpected tool name / arguments. - Add a constructor-validation test. Signed-off-by: tamohannes <hovhannes.tamoyan@gmail.com>
28de790 to
1a42c17
Compare
|
Addressed the bot findings: validate @gwarmstrong on the bigger point — the wrapper-level |
Summary
Adds
GenerationTaskConfig.tool_call_timeout_swith a default of300.0seconds and wraps tool execution in a wall-clock timeout. When a tool times out, generation receives a model-visible tool error instead of waiting indefinitely.Why
If retrieval/tool/environment calls hang while an inference server is allocated, GPUs can sit idle until the Slurm job is cancelled by idle-GPU monitoring. A bounded timeout lets the sample fail or continue cleanly.
Tests
python -m pytest tests/test_mcp_clients.py::test_tool_calling_wrapper_times_out_slow_tool_call -qpython -m ruff check nemo_skills/inference/generate.py nemo_skills/inference/model/__init__.py nemo_skills/inference/model/tool_call.py tests/test_mcp_clients.pySummary by CodeRabbit
New Features
Tests