Add timeout for tool calls #1466

Open
tamohannes wants to merge 2 commits into main from tamohannes/waste-tool-timeout

@tamohannes tamohannes commented May 28, 2026

Collaborator

Summary

Adds GenerationTaskConfig.tool_call_timeout_s with a default of 300.0 seconds and wraps tool execution in a wall-clock timeout. When a tool times out, generation receives a model-visible tool error instead of waiting indefinitely.
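For readers skimming the thread, a minimal sketch of the behavior described above. It is not the PR's code: `execute_with_timeout`, `slow_tool`, `fast_tool`, and the exact error payload are hypothetical names chosen for illustration. The sketch only assumes that the tool call is an awaitable.

```python
import asyncio
from typing import Any, Awaitable, Optional


async def execute_with_timeout(
    tool_call: Awaitable[Any], timeout_s: Optional[float] = 300.0
) -> Any:
    """Await a tool call; turn a timeout into an error payload (hypothetical helper)."""
    try:
        # timeout=None makes wait_for block until completion (timeout disabled).
        return await asyncio.wait_for(tool_call, timeout=timeout_s)
    except asyncio.TimeoutError:
        # Returned as a tool result so the model can see it, rather than raised.
        return {"error": f"tool call timed out after {timeout_s} seconds"}


async def slow_tool() -> dict:
    # Stand-in for a hanging retrieval or environment call.
    await asyncio.sleep(10)
    return {"result": "done"}


async def fast_tool() -> dict:
    return {"result": "done"}
```

With a short limit, `asyncio.run(execute_with_timeout(slow_tool(), timeout_s=0.05))` returns the error payload instead of blocking for the full ten seconds, while `fast_tool()` passes through unchanged.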

Why

If retrieval/tool/environment calls hang while an inference server is allocated, GPUs can sit idle until the Slurm job is cancelled by idle-GPU monitoring. A bounded timeout lets the sample fail or continue cleanly.

Tests

  • python -m pytest tests/test_mcp_clients.py::test_tool_calling_wrapper_times_out_slow_tool_call -q
  • python -m ruff check nemo_skills/inference/generate.py nemo_skills/inference/model/__init__.py nemo_skills/inference/model/tool_call.py tests/test_mcp_clients.py

Summary by CodeRabbit

  • New Features

    • Added configurable per-tool-call execution timeout (default 300s). Tool calls that exceed the timeout are aborted and return a structured error indicating the call timed out, improving robustness of tool integrations. Initialization validates timeout values and rejects non-positive values.
  • Tests

    • Added tests verifying timeout enforcement for slow tools and validation that non-positive timeout values are rejected.

coderabbitai Bot commented May 28, 2026

Contributor

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 695ae641-002c-4e88-b141-60978c691834

📥 Commits

Reviewing files that changed from the base of the PR and between 28de790 and 1a42c17.

📒 Files selected for processing (2)
  • nemo_skills/inference/model/tool_call.py
  • tests/test_mcp_clients.py
🚧 Files skipped from review as they are similar to previous changes (1)
  • nemo_skills/inference/model/tool_call.py

📝 Walkthrough

Adds a per-tool-call timeout: config field on GenerationTaskConfig, propagated into get_tool_calling_model and ToolCallingWrapper, where asyncio.wait_for enforces the timeout and tests validate timeout and validation behavior.

Changes

Per-Tool-Call Timeout Support

| Layer | File(s) | Summary |
| --- | --- | --- |
| Timeout Configuration Contract | nemo_skills/inference/generate.py | GenerationTaskConfig adds tool_call_timeout_s: float \| None = 300.0 field. |
| Configuration Propagation Through Stack | nemo_skills/inference/generate.py, nemo_skills/inference/model/__init__.py | GenerationTask.setup_llm() forwards tool_call_timeout_s to get_tool_calling_model(), which accepts and forwards it to ToolCallingWrapper constructor. |
| Timeout Enforcement in ToolCallingWrapper | nemo_skills/inference/model/tool_call.py | ToolCallingWrapper imports asyncio, accepts a tool_call_timeout_s constructor arg, validates it, and wraps tool execution with asyncio.wait_for; asyncio.TimeoutError is caught and returned as a structured timeout error. |
| Timeout Test Validation | tests/test_mcp_clients.py | Adds SlowTool and tests: one asserting a slow tool call returns a timeout error, another parametrized test asserting non-positive tool_call_timeout_s raises ValueError. |
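The propagation path summarized in this table can be sketched roughly as follows. `TaskConfigSketch`, `WrapperSketch`, and `build_wrapper` are hypothetical stand-ins for GenerationTaskConfig, ToolCallingWrapper, and the setup_llm() to get_tool_calling_model() hop. The real classes carry many more fields and arguments that are omitted here.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class TaskConfigSketch:
    # Stand-in for GenerationTaskConfig; only the new field is shown.
    tool_call_timeout_s: Optional[float] = 300.0


class WrapperSketch:
    # Stand-in for ToolCallingWrapper: validate first, then store the timeout.
    def __init__(self, tool_call_timeout_s: Optional[float] = 300.0):
        if tool_call_timeout_s is not None and tool_call_timeout_s <= 0:
            raise ValueError("tool_call_timeout_s must be > 0 or None.")
        self.tool_call_timeout_s = tool_call_timeout_s


def build_wrapper(cfg: TaskConfigSketch) -> WrapperSketch:
    # Stand-in for the config -> factory -> wrapper forwarding chain.
    return WrapperSketch(tool_call_timeout_s=cfg.tool_call_timeout_s)
```

Validating in the wrapper constructor rather than only in the config means a bad value is rejected regardless of which caller constructs the wrapper.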

Sequence Diagrams

sequenceDiagram
  participant GenerationTask
  participant get_tool_calling_model
  participant ToolCallingWrapper
  GenerationTask->>get_tool_calling_model: setup_llm() forwards tool_call_timeout_s
  get_tool_calling_model->>ToolCallingWrapper: passes tool_call_timeout_s to __init__
  Note over ToolCallingWrapper: stores timeout for use during execution
sequenceDiagram
  participant ToolCallingWrapper
  participant asyncio as asyncio.wait_for
  participant ToolManager
  ToolCallingWrapper->>asyncio: wait_for(ToolManager.execute_tool(), tool_call_timeout_s)
  alt Completes in time
    ToolManager->>ToolCallingWrapper: returns result
  else Timeout exceeded
    asyncio->>ToolCallingWrapper: raises asyncio.TimeoutError
    Note over ToolCallingWrapper: returns {"error": "tool call timed out"}
  end

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title 'Add timeout for tool calls' directly and concisely describes the main change: introducing a timeout mechanism for tool execution. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |

@coderabbitai coderabbitai Bot left a comment

Contributor

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
nemo_skills/inference/model/tool_call.py (1)

53-69: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Validate timeout bounds at construction.

tool_call_timeout_s <= 0 will make tool execution fail immediately, which is easy to misconfigure and hard to diagnose. Enforce > 0 or None (disable) up front.

Proposed fix
     def __init__(
         self,
         model: BaseModel,
         tool_modules: list[str] | None = None,
         tool_overrides: dict | None = None,
         additional_config: dict | None = None,
         schema_overrides: dict | None = None,
         max_tool_calls: int = -1,
         tool_call_timeout_s: float | None = 300.0,
     ):
         self.model = model
         additional_config = additional_config or {}
@@
         self.schema_overrides = load_schema_overrides(schema_overrides)
         self.schema_mappings = {}  # Built when tools are listed
         self.max_tool_calls = max_tool_calls
+        if tool_call_timeout_s is not None and tool_call_timeout_s <= 0:
+            raise ValueError("tool_call_timeout_s must be > 0 or None.")
         self.tool_call_timeout_s = tool_call_timeout_s
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@nemo_skills/inference/model/tool_call.py` around lines 53 - 69, Validate the
tool_call_timeout_s parameter in the constructor where tool_call_timeout_s is
assigned: ensure it is either None or > 0 and raise a ValueError with a clear
message if tool_call_timeout_s is <= 0; update the assignment to set
self.tool_call_timeout_s only after this check (referencing the parameter name
tool_call_timeout_s and the instance attribute self.tool_call_timeout_s).
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@tests/test_mcp_clients.py`:
- Around line 241-243: SlowTool.execute currently ignores its inputs and always
sleeps; update SlowTool.execute to validate inputs and raise immediately on
unexpected values instead of proceeding to the timeout. Specifically, in
SlowTool.execute check that tool_name equals the expected tool identifier(s)
used by the tests (and/or that required keys exist in the arguments dict and
that no unknown keys are present), and raise a clear exception (e.g.,
ValueError/TypeError) when validation fails; only if validation passes keep the
artificial delay/return value. This ensures the test fails fast on wrong routing
or argument shapes.
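
A rough sketch of what such a fail-fast test double could look like. `SlowToolSketch`, the expected tool name, and the expected argument keys are assumptions made for illustration; the actual SlowTool interface in tests/test_mcp_clients.py may differ.

```python
import asyncio


class SlowToolSketch:
    """Hypothetical test double that validates inputs before sleeping."""

    EXPECTED_TOOL = "slow_tool"  # assumed tool identifier
    EXPECTED_KEYS = {"query"}    # assumed argument keys

    def __init__(self, delay_s: float = 10.0):
        self.delay_s = delay_s

    async def execute(self, tool_name: str, arguments: dict) -> dict:
        # Fail immediately on wrong routing instead of waiting for the timeout.
        if tool_name != self.EXPECTED_TOOL:
            raise ValueError(f"unexpected tool name: {tool_name!r}")
        # Reject missing or unknown argument keys.
        if set(arguments) != self.EXPECTED_KEYS:
            raise ValueError(f"unexpected arguments: {sorted(arguments)}")
        # Only a well-formed call reaches the artificial delay.
        await asyncio.sleep(self.delay_s)
        return {"result": "finished"}
```

With this shape, a mis-routed call surfaces as a ValueError right away, so a timeout in the test can only come from the delay itself.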

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 3bac962a-3743-4972-90b0-24e3a9aff5b9

📥 Commits

Reviewing files that changed from the base of the PR and between b620e79 and e805ffa.

📒 Files selected for processing (4)
  • nemo_skills/inference/generate.py
  • nemo_skills/inference/model/__init__.py
  • nemo_skills/inference/model/tool_call.py
  • tests/test_mcp_clients.py

Comment thread tests/test_mcp_clients.py

@Kipok Kipok left a comment

Collaborator

@gwarmstrong can you also check this? I think it's probably appropriate to have some client-side timeout, since currently, e.g., search tools don't seem to have any configured (and in principle this can be used with external tool implementations that we don't control, I guess). But ideally we would also enforce this in our own tool implementations so that the tool calls handle timeouts directly; otherwise we might end up with a bunch of orphan processes.

Signed-off-by: tamohannes <hovhannes.tamoyan@gmail.com>
@tamohannes tamohannes force-pushed the tamohannes/waste-tool-timeout branch from e805ffa to 28de790 Compare June 10, 2026 22:56
- Reject non-positive tool_call_timeout_s at construction (must be > 0, or None to disable).
- SlowTool test helper fails fast on unexpected tool name / arguments.
- Add a constructor-validation test.

Signed-off-by: tamohannes <hovhannes.tamoyan@gmail.com>
@tamohannes tamohannes force-pushed the tamohannes/waste-tool-timeout branch from 28de790 to 1a42c17 Compare June 10, 2026 23:02
@tamohannes

Collaborator Author

Addressed the bot findings: validate tool_call_timeout_s > 0 at construction, and the SlowTool test fails fast on unexpected args.

@gwarmstrong on the bigger point — the wrapper-level wait_for cancels the await but the underlying tool process can keep running, so a timeout can still leave an orphan. Enforcing timeouts inside the tool implementations themselves is worth a follow-up; happy to scope it.
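
To make the orphan concern concrete, here is a sketch of what in-tool enforcement could look like for a subprocess-backed tool. This is an assumption about one possible follow-up, not code from this PR: `run_with_hard_timeout` is a hypothetical helper, and real tools may not be subprocess-based at all.

```python
import asyncio
import sys
from typing import List


async def run_with_hard_timeout(cmd: List[str], timeout_s: float) -> dict:
    """Run cmd as a child process and kill it if it exceeds timeout_s."""
    proc = await asyncio.create_subprocess_exec(
        *cmd,
        stdout=asyncio.subprocess.PIPE,
        stderr=asyncio.subprocess.PIPE,
    )
    try:
        stdout, _ = await asyncio.wait_for(proc.communicate(), timeout=timeout_s)
        return {"result": stdout.decode(), "returncode": proc.returncode}
    except asyncio.TimeoutError:
        # Cancelling the await alone would leave the child running.
        proc.kill()
        # Reap the child so it does not linger as a zombie.
        await proc.wait()
        return {"error": "tool call timed out", "returncode": proc.returncode}
```

Note that `proc.kill()` only terminates the direct child. A tool that spawns its own children would need process-group handling, which this sketch leaves out.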
