Add timeout for tool calls by tamohannes · Pull Request #1466 · NVIDIA-NeMo/Skills

tamohannes · 2026-05-28T21:21:13Z

Summary

Adds GenerationTaskConfig.tool_call_timeout_s with a default of 300.0 seconds and wraps tool execution in a wall-clock timeout. When a tool times out, generation receives a model-visible tool error instead of waiting indefinitely.

Why

If retrieval/tool/environment calls hang while an inference server is allocated, GPUs can sit idle until the Slurm job is cancelled by idle-GPU monitoring. A bounded timeout lets the sample fail or continue cleanly.

Tests

python -m pytest tests/test_mcp_clients.py::test_tool_calling_wrapper_times_out_slow_tool_call -q
python -m ruff check nemo_skills/inference/generate.py nemo_skills/inference/model/__init__.py nemo_skills/inference/model/tool_call.py tests/test_mcp_clients.py

Summary by CodeRabbit

New Features
- Added configurable per-tool-call execution timeout (default 300s). Tool calls that exceed the timeout are aborted and return a structured error indicating the call timed out, improving robustness of tool integrations. Initialization validates timeout values and rejects non-positive values.
Tests
- Added tests verifying timeout enforcement for slow tools and validation that non-positive timeout values are rejected.

coderabbitai · 2026-05-28T21:25:09Z

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 695ae641-002c-4e88-b141-60978c691834

📥 Commits

Reviewing files that changed from the base of the PR and between 28de790 and 1a42c17.

📒 Files selected for processing (2)

nemo_skills/inference/model/tool_call.py
tests/test_mcp_clients.py

🚧 Files skipped from review as they are similar to previous changes (1)

nemo_skills/inference/model/tool_call.py

📝 Walkthrough

Adds a per-tool-call timeout: config field on GenerationTaskConfig, propagated into get_tool_calling_model and ToolCallingWrapper, where asyncio.wait_for enforces the timeout and tests validate timeout and validation behavior.
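The enforcement step described above can be sketched roughly as follows. This is a minimal illustration, not the PR's actual code: `slow_tool`, `execute_with_timeout`, and the shape of the error dict are hypothetical names chosen for the example, and only `asyncio.wait_for` and `asyncio.TimeoutError` come from the walkthrough.

```python
import asyncio
from typing import Optional


async def slow_tool(delay_s: float) -> dict:
    """Hypothetical stand-in for a tool call that may hang."""
    await asyncio.sleep(delay_s)
    return {"result": "ok"}


async def execute_with_timeout(coro, timeout_s: Optional[float]) -> dict:
    """Await a tool coroutine under a wall-clock limit.

    timeout_s=None disables the limit, because asyncio.wait_for
    waits indefinitely when its timeout is None.
    """
    try:
        return await asyncio.wait_for(coro, timeout=timeout_s)
    except asyncio.TimeoutError:
        # Assumed error shape: a dict the model can read as a tool error.
        return {"error": f"tool call timed out after {timeout_s}s"}


# A fast call completes normally; a slow call is converted into an error dict.
fast = asyncio.run(execute_with_timeout(slow_tool(0.01), timeout_s=1.0))
slow = asyncio.run(execute_with_timeout(slow_tool(5.0), timeout_s=0.05))
print(fast)
print(slow)
```

Returning the error as data rather than re-raising lets generation continue with a model-visible failure, which matches the behavior the summary describes.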

Changes

Per-Tool-Call Timeout Support

Layer / File(s)	Summary
Timeout Configuration Contract `nemo_skills/inference/generate.py`	`GenerationTaskConfig` adds `tool_call_timeout_s: float | None = 300.0` field.
Configuration Propagation Through Stack `nemo_skills/inference/generate.py`, `nemo_skills/inference/model/__init__.py`	`GenerationTask.setup_llm()` forwards `tool_call_timeout_s` to `get_tool_calling_model()`, which accepts and forwards it to `ToolCallingWrapper` constructor.
Timeout Enforcement in ToolCallingWrapper `nemo_skills/inference/model/tool_call.py`	`ToolCallingWrapper` imports `asyncio`, accepts a `tool_call_timeout_s` constructor arg, validates it, and wraps tool execution with `asyncio.wait_for`; `asyncio.TimeoutError` is caught and returned as a structured timeout error.
Timeout Test Validation `tests/test_mcp_clients.py`	Adds `SlowTool` and tests: one asserting a slow tool call returns a timeout error, another parametrized test asserting non-positive `tool_call_timeout_s` raises `ValueError`.

Sequence Diagrams

sequenceDiagram
  participant GenerationTask
  participant get_tool_calling_model
  participant ToolCallingWrapper
  GenerationTask->>get_tool_calling_model: setup_llm() forwards tool_call_timeout_s
  get_tool_calling_model->>ToolCallingWrapper: passes tool_call_timeout_s to __init__
  Note over ToolCallingWrapper: stores timeout for use during execution

sequenceDiagram
  participant ToolCallingWrapper
  participant asyncio as asyncio.wait_for
  participant ToolManager
  ToolCallingWrapper->>asyncio: wait_for(ToolManager.execute_tool(), tool_call_timeout_s)
  alt Completes in time
    ToolManager->>ToolCallingWrapper: returns result
  else Timeout exceeded
    asyncio->>ToolCallingWrapper: raises asyncio.TimeoutError
    Note over ToolCallingWrapper: returns {"error": "tool call timed out"}
  end

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (4 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The title 'Add timeout for tool calls' directly and concisely describes the main change: introducing a timeout mechanism for tool execution.
Linked Issues check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check	✅ Passed	Check skipped because no linked issues were found for this pull request.

coderabbitai

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)

nemo_skills/inference/model/tool_call.py (1)

53-69: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Validate timeout bounds at construction.

tool_call_timeout_s <= 0 will make tool execution fail immediately, which is easy to misconfigure and hard to diagnose. Enforce > 0 or None (disable) up front.

Proposed fix

     def __init__(
         self,
         model: BaseModel,
         tool_modules: list[str] | None = None,
         tool_overrides: dict | None = None,
         additional_config: dict | None = None,
         schema_overrides: dict | None = None,
         max_tool_calls: int = -1,
         tool_call_timeout_s: float | None = 300.0,
     ):
         self.model = model
         additional_config = additional_config or {}
@@
         self.schema_overrides = load_schema_overrides(schema_overrides)
         self.schema_mappings = {}  # Built when tools are listed
         self.max_tool_calls = max_tool_calls
+        if tool_call_timeout_s is not None and tool_call_timeout_s <= 0:
+            raise ValueError("tool_call_timeout_s must be > 0 or None.")
         self.tool_call_timeout_s = tool_call_timeout_s
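To illustrate why a non-positive value is worth rejecting up front, here is a small standalone sketch. `validate_tool_call_timeout` is a hypothetical helper that mirrors the proposed check outside the class; the second half shows that `asyncio.wait_for` with a timeout of 0 raises a timeout error before the awaited coroutine makes any progress.

```python
import asyncio
from typing import Optional


def validate_tool_call_timeout(timeout_s: Optional[float]) -> Optional[float]:
    """Hypothetical helper: accept None (disabled) or a positive number."""
    if timeout_s is not None and timeout_s <= 0:
        raise ValueError("tool_call_timeout_s must be > 0 or None.")
    return timeout_s


async def _never_finishes() -> None:
    await asyncio.sleep(3600)


async def zero_timeout_fails_immediately() -> bool:
    """Show that timeout=0 times out without waiting for the coroutine."""
    try:
        await asyncio.wait_for(_never_finishes(), timeout=0)
    except asyncio.TimeoutError:
        return True
    return False


# Collect the values the validator rejects.
rejected = []
for bad in (0, -1.0):
    try:
        validate_tool_call_timeout(bad)
    except ValueError:
        rejected.append(bad)

timed_out = asyncio.run(zero_timeout_fails_immediately())
print(rejected, timed_out)
```

Without the constructor check, a misconfigured value of 0 would surface only as every tool call timing out, which is harder to trace back to the config.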

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@nemo_skills/inference/model/tool_call.py` around lines 53 - 69, Validate the
tool_call_timeout_s parameter in the constructor where tool_call_timeout_s is
assigned: ensure it is either None or > 0 and raise a ValueError with a clear
message if tool_call_timeout_s is <= 0; update the assignment to set
self.tool_call_timeout_s only after this check (referencing the parameter name
tool_call_timeout_s and the instance attribute self.tool_call_timeout_s).

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@tests/test_mcp_clients.py`:
- Around line 241-243: SlowTool.execute currently ignores its inputs and always
sleeps; update SlowTool.execute to validate inputs and raise immediately on
unexpected values instead of proceeding to the timeout. Specifically, in
SlowTool.execute check that tool_name equals the expected tool identifier(s)
used by the tests (and/or that required keys exist in the arguments dict and
that no unknown keys are present), and raise a clear exception (e.g.,
ValueError/TypeError) when validation fails; only if validation passes keep the
artificial delay/return value. This ensures the test fails fast on wrong routing
or argument shapes.

---

Outside diff comments:
In `@nemo_skills/inference/model/tool_call.py`:
- Around line 53-69: Validate the tool_call_timeout_s parameter in the
constructor where tool_call_timeout_s is assigned: ensure it is either None or >
0 and raise a ValueError with a clear message if tool_call_timeout_s is <= 0;
update the assignment to set self.tool_call_timeout_s only after this check
(referencing the parameter name tool_call_timeout_s and the instance attribute
self.tool_call_timeout_s).

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 3bac962a-3743-4972-90b0-24e3a9aff5b9

📥 Commits

Reviewing files that changed from the base of the PR and between b620e79 and e805ffa.

📒 Files selected for processing (4)

nemo_skills/inference/generate.py
nemo_skills/inference/model/__init__.py
nemo_skills/inference/model/tool_call.py
tests/test_mcp_clients.py

Kipok

@gwarmstrong can you also check this? I think it's probably appropriate to have some client-side timeout, since currently, e.g., search tools don't seem to have any configured (and in principle this can be used with external tool implementations that we don't have control over, I guess). But ideally we would also enforce this in our own tool implementations so that the tool calls handle timeouts appropriately themselves; otherwise we might end up with a bunch of orphan processes.

- Reject non-positive tool_call_timeout_s at construction (must be > 0, or None to disable).
- SlowTool test helper fails fast on unexpected tool name / arguments.
- Add a constructor-validation test.

Signed-off-by: tamohannes <hovhannes.tamoyan@gmail.com>

tamohannes · 2026-06-10T23:05:06Z

Addressed the bot findings: validate tool_call_timeout_s > 0 at construction, and the SlowTool test fails fast on unexpected args.
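A fail-fast test helper along these lines might look like the sketch below. The class body, `EXPECTED_TOOL`, `EXPECTED_KEYS`, and the `outcome` helper are assumptions made for illustration; the real `SlowTool` in tests/test_mcp_clients.py may differ.

```python
import asyncio


class SlowTool:
    """Hypothetical test helper: validates its inputs, then sleeps."""

    EXPECTED_TOOL = "slow_tool"  # assumed tool name
    EXPECTED_KEYS = {"query"}    # assumed argument keys

    def __init__(self, delay_s: float = 10.0):
        self.delay_s = delay_s

    async def execute(self, tool_name: str, arguments: dict) -> dict:
        # Fail immediately on wrong routing instead of hiding it behind a timeout.
        if tool_name != self.EXPECTED_TOOL:
            raise ValueError(f"unexpected tool name: {tool_name!r}")
        if set(arguments) != self.EXPECTED_KEYS:
            raise TypeError(f"unexpected arguments: {sorted(arguments)}")
        await asyncio.sleep(self.delay_s)
        return {"result": "done"}


async def outcome(tool_name: str, arguments: dict) -> str:
    """Classify what happens to one call under a short timeout."""
    try:
        await asyncio.wait_for(SlowTool().execute(tool_name, arguments), timeout=0.05)
    except asyncio.TimeoutError:
        return "timeout"
    except (ValueError, TypeError) as exc:
        return type(exc).__name__
    return "completed"


good = asyncio.run(outcome("slow_tool", {"query": "x"}))
wrong_name = asyncio.run(outcome("other_tool", {"query": "x"}))
wrong_args = asyncio.run(outcome("slow_tool", {"unknown": 1}))
print(good, wrong_name, wrong_args)
```

With validation in place, a test that expects a timeout can no longer pass by accident when the call was routed to the wrong tool or given the wrong argument shape.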

@gwarmstrong on the bigger point: the wrapper-level wait_for cancels the await, but the underlying tool process can keep running, so a timeout can still leave an orphan. Enforcing timeouts inside the tool implementations themselves is worth a follow-up; happy to scope it.
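One possible shape for that follow-up, sketched under assumptions: a tool that launches a child process can catch the cancellation delivered by `asyncio.wait_for` and terminate the child before re-raising. `run_tool_process` and the `started` list are hypothetical and exist only so the example can inspect the child afterwards; none of this is code from the PR.

```python
import asyncio
import sys


async def run_tool_process(seconds: float, started: list) -> int:
    """Hypothetical tool that runs a child process and cleans it up on cancel."""
    proc = await asyncio.create_subprocess_exec(
        sys.executable, "-c", f"import time; time.sleep({seconds})"
    )
    started.append(proc)  # recorded only so the demo can check the exit status
    try:
        return await proc.wait()
    except asyncio.CancelledError:
        # wait_for cancels this coroutine on timeout; kill and reap the child
        # so it does not outlive the tool call.
        proc.kill()
        await proc.wait()
        raise


async def demo():
    started = []
    try:
        await asyncio.wait_for(run_tool_process(60, started), timeout=1.0)
    except asyncio.TimeoutError:
        pass
    # A non-None returncode means the child has exited and been reaped.
    return started[0].returncode


exit_code = asyncio.run(demo())
print(exit_code)
```

This only covers processes the tool itself spawns within the same event loop; tools backed by remote services or separate servers would need their own server-side limits.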

coderabbitai Bot reviewed May 28, 2026

Comment thread tests/test_mcp_clients.py

Kipok reviewed Jun 9, 2026

Add timeout for tool calls

1208c40

Signed-off-by: tamohannes <hovhannes.tamoyan@gmail.com>

tamohannes force-pushed the tamohannes/waste-tool-timeout branch from e805ffa to 28de790 Compare June 10, 2026 22:56

tamohannes force-pushed the tamohannes/waste-tool-timeout branch from 28de790 to 1a42c17 Compare June 10, 2026 23:02

Add timeout for tool calls #1466
tamohannes wants to merge 2 commits into main from tamohannes/waste-tool-timeout