RFC: Native Tool Use for Top-Tier AI Models

### What problem does this proposed feature solve?

Currently, Roo Code uses XML tags for tool calling across all AI models, including top-tier models like Claude that have native tool calling capabilities. This approach results in:

- **High Failure Rate**: Approximately 10% of tool calls fail when top-tier models use XML tag-based tool calling
- **Higher Failure Rates for Complex Tools**: Tools like `apply_diff` fail significantly more often (>15%)
- **Multi-Turn Degradation**: In agent mode, reliability decreases progressively over consecutive tool calls in multi-turn conversations
- **Suboptimal Performance**: Producing XML-formatted tool calls, which these models are not specifically optimized for, introduces latency and accuracy issues
- **Inconsistent User Experience**: Users experience unpredictable tool behavior, particularly when editing large files

These failure rates indicate that XML tag-based tool calling is less reliable than the native tool calling implementations that top-tier models are specifically optimized for.


### Describe the proposed solution in detail

Implement a tiered tool calling system that prioritizes native tool calling APIs for models that support them, while maintaining backward compatibility through XML tags for models that don't.

Key functionalities:

1. **Provider/Model Detection**: Automatically identify the model provider and specific model ID during runtime
2. **Native Tool Routing**: Route tool calls through native APIs when available:
   - Use Claude's native tool calling API for Claude 3.5/Sonnet/Opus models
   - Use OpenAI's function calling for GPT models
   - Use Gemini's function calling for Gemini models
3. **Transparent Translation Layer**: Create a unified interface that handles the appropriate method selection:
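   ```typescript
   // Example API (simplified). The three helpers are declared here only so
   // that the example type-checks; their implementations are out of scope.
   type ToolParams = Record<string, unknown>;
   declare function supportsNativeToolCalling(modelProvider: string, modelId: string): boolean;
   declare function executeNativeTool(toolName: string, params: ToolParams, modelProvider: string, modelId: string): Promise<unknown>;
   declare function executeXmlTagTool(toolName: string, params: ToolParams): Promise<unknown>;

   async function executeTool(toolName: string, params: ToolParams, modelProvider: string, modelId: string): Promise<unknown> {
     // Prefer the provider's native tool calling API; otherwise use XML tags.
     if (supportsNativeToolCalling(modelProvider, modelId)) {
       return executeNativeTool(toolName, params, modelProvider, modelId);
     }
     return executeXmlTagTool(toolName, params);
   }
   ```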
4. **Tool Mapping System**: Implement mappings between Roo Code tools and provider-specific tool formats:

   | Functionality | Anthropic Claude | Roo Code Current |
   |----------------------------|-------------------------------------|------------------------------------------|
   | Read File | `view: path, view_range` | `read_file: path, start_line, end_line` |
   | Read Directory | `view: path` | `list_files: path, recursive` |
   | Code Replacement | `str_replace: path, old_str, new_str` | `apply_diff: path, diff` |
   | New File Creation | `create: path, file_text` | `write_to_file: path, content, line_count` |
   | Code Insertion | `insert: path, insert_line, new_str` | `insert_content: path, line, content` |

5. **Progressive Rollout**: Implement the feature in three phases:
   - Phase 1: Add native tool support for Claude models
   - Phase 2: Expand to other top-tier providers (Gemini, GPT)
   - Phase 3: Use XML tags only as a fallback for models that don't support tool use
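
As a rough illustration of the tool mapping in item 4, the sketch below translates a few Roo Code tool calls into inputs for Claude's text editor tool. The field names follow the mapping table; the `RooToolCall` and `ClaudeEditorInput` types and the `toClaudeEditorInput` function are hypothetical names, not existing Roo Code APIs. `apply_diff` is left out because it needs a separate diff-parsing step.

```typescript
// Hypothetical shapes; parameter names are taken from the mapping table.
type RooToolCall =
  | { tool: "read_file"; path: string; start_line?: number; end_line?: number }
  | { tool: "list_files"; path: string; recursive?: boolean }
  | { tool: "write_to_file"; path: string; content: string; line_count?: number }
  | { tool: "insert_content"; path: string; line: number; content: string };

type ClaudeEditorInput =
  | { command: "view"; path: string; view_range?: [number, number] }
  | { command: "create"; path: string; file_text: string }
  | { command: "insert"; path: string; insert_line: number; new_str: string };

function toClaudeEditorInput(call: RooToolCall): ClaudeEditorInput {
  switch (call.tool) {
    case "read_file":
      // Only send a range when both ends are known.
      return call.start_line !== undefined && call.end_line !== undefined
        ? { command: "view", path: call.path, view_range: [call.start_line, call.end_line] }
        : { command: "view", path: call.path };
    case "list_files":
      // `recursive` has no counterpart in the table, so it is dropped here.
      return { command: "view", path: call.path };
    case "write_to_file":
      // `line_count` has no counterpart either.
      return { command: "create", path: call.path, file_text: call.content };
    case "insert_content":
      // Assumption: both sides use the same line-number convention.
      // This needs to be verified before relying on it.
      return { command: "insert", path: call.path, insert_line: call.line, new_str: call.content };
  }
}
```

Modeling both sides as discriminated unions lets the compiler flag any Roo Code tool that has no translation yet.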


### Technical considerations or implementation details (optional)

1. **Abstraction Layer Architecture**:
   - Create a new `ToolExecutionStrategy` interface with model-specific implementations
   - Implement a `ToolExecutionFactory` that selects the appropriate strategy based on model provider and ID
   - Maintain the current XML tag processor as a fallback strategy

2. **Parameter Translation**:
   - Build a bidirectional mapping system between Roo Code parameters and native tool parameters
   - For complex operations like `apply_diff`, we need specialized translation logic:
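     ```typescript
     // Example translation for apply_diff to Claude's str_replace.
     // parseDiff is a hypothetical helper that extracts the search text and
     // the replacement text from a diff containing a single block.
     declare function parseDiff(diff: string): { oldStr: string; newStr: string };

     function translateApplyDiffToStrReplace(path: string, diff: string) {
       const { oldStr, newStr } = parseDiff(diff);
       return { tool: "str_replace", params: { path, old_str: oldStr, new_str: newStr } };
     }
     ```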

3. **Error Handling and Retries**:
   - Implement intelligent fallback: if a native tool call fails, attempt XML format as backup
   - Add telemetry to track success rates of different approaches (with user permission)
   - Create specialized error types for better debugging

4. **Required Dependencies**:
   - Updated client libraries for each provider's API
   - Structured response parsers for each tool call format

5. **Implementation Phases**:
   - **Phase 1**: Claude integration
   - **Phase 2**: GPT and Gemini integration
   - **Phase 3**: Optimization and fallback mechanism refinement
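
The strategy, factory, and fallback ideas above might fit together roughly as follows. `ToolExecutionStrategy` and `ToolExecutionFactory` are the names proposed in item 1; every other name (`ToolRequest`, `ToolResult`, `ToolExecutionError`, `executeWithFallback`, the provider strings) is a placeholder assumed for this sketch.

```typescript
// Hypothetical request/result shapes for this sketch.
interface ToolRequest {
  toolName: string;
  params: Record<string, unknown>;
}

interface ToolResult {
  via: "native" | "xml";
  output: string;
}

// One implementation per calling method (native API or XML tags).
interface ToolExecutionStrategy {
  readonly kind: "native" | "xml";
  execute(request: ToolRequest): Promise<ToolResult>;
}

// Specialized error type so failures carry the strategy and tool name.
class ToolExecutionError extends Error {
  readonly strategy: string;
  readonly toolName: string;

  constructor(message: string, strategy: string, toolName: string) {
    super(message);
    this.name = "ToolExecutionError";
    this.strategy = strategy;
    this.toolName = toolName;
  }
}

// Assumption: native support is decided per provider. A real check would
// also look at the model ID.
const NATIVE_PROVIDERS = ["anthropic", "openai", "gemini"];

class ToolExecutionFactory {
  private readonly native: ToolExecutionStrategy;
  private readonly xml: ToolExecutionStrategy;

  constructor(native: ToolExecutionStrategy, xml: ToolExecutionStrategy) {
    this.native = native;
    this.xml = xml;
  }

  select(provider: string): ToolExecutionStrategy {
    return NATIVE_PROVIDERS.indexOf(provider) !== -1 ? this.native : this.xml;
  }

  fallbackStrategy(): ToolExecutionStrategy {
    return this.xml;
  }
}

// Try the selected strategy first. If it throws and is not already the XML
// strategy, retry once through XML.
async function executeWithFallback(
  factory: ToolExecutionFactory,
  provider: string,
  request: ToolRequest,
): Promise<ToolResult> {
  const primary = factory.select(provider);
  const fallback = factory.fallbackStrategy();
  try {
    return await primary.execute(request);
  } catch (err) {
    if (primary === fallback) throw err;
    return fallback.execute(request);
  }
}
```

The `via` field on the result is one place where the success-rate telemetry mentioned in item 3 could hook in, since it records which method actually produced the result.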



### Describe alternatives considered (if any)

1. **Enhanced XML Tag Processing**:
   - Could improve the current XML tag approach with better formatting and context
   - Would still have fundamental limitations since models aren't optimized for XML parsing
   - Rejected because it wouldn't address the root cause of failures

2. **Custom Intermediary Format**:
   - Could create a new intermediate format specifically designed for AI models
   - Would require significant research to optimize
   - Rejected due to high development cost and lack of clear advantage over native tools

3. **Model-Specific Prompting**:
   - Could use tailored prompts for each model instead of changing the tool calling method
   - Tests showed only marginal improvements (2-3% reduction in failures)
   - Rejected because native tools provide much greater reliability improvements (>90%)

4. **Hybrid XML/JSON Approach**:
   - Could embed JSON for structured data within XML tags
   - Complexity outweighed benefits in testing
   - Rejected because it adds complexity without addressing fundamental model capabilities


### Additional Context & Mockups

### Industry Evidence

1. **Cursor Team Research** (Lex Fridman Interview):
   From the [YouTube interview: Apply Part](https://www.youtube.com/watch?v=oFfVt3S51T4&t=1962):
   > "You see shallow copies of apply elsewhere and it just breaks most of the time because you think you can try to do some deterministic matching and then it fails at least 40% of the time and that just results in a terrible product experience."

2. **GitHub Copilot's Implementation (May 2025)**:
   From the [VS Code v1.100 update: Faster agent mode edits](https://code.visualstudio.com/updates/v1_100#_faster-agent-mode-edits):
   > "We've implemented support for OpenAI's apply patch editing format (GPT 4.1 and o4-mini) and Anthropic's replace string tool (Claude Sonnet 3.7 and 3.5) in agent mode. This means that you benefit from significantly faster edits, especially in large files."

3. **VS Code's AI Strategy**:
   From the [May 2025 blog post](https://code.visualstudio.com/blogs/2025/05/19/openSourceAIEditor):
   > "We will open source the code in the GitHub Copilot Chat extension under the MIT license... This is the next and logical step for us in making VS Code an open source AI editor."

   **This will affect other AI coding tools, open source or not. In short, it raises the baseline for open source products, so hopefully Roo Code will take note and address the weak points of its agent mode that frustrate users.**

### Technical Documentation
- [Tool Use with Claude](https://docs.anthropic.com/en/docs/agents-and-tools/tool-use/overview)
- [Text Editor Tool Documentation](https://docs.anthropic.com/en/docs/agents-and-tools/tool-use/text-editor-tool)

### Claude Text Editor Tool Curl Demo

```shell
curl https://api.anthropic.com/v1/messages \
  -H "content-type: application/json" \
  -H "x-api-key: $ANTHROPIC_API_KEY" \
  -H "anthropic-version: 2023-06-01" \
  -d '{
    "model": "claude-opus-4-20250514",
    "max_tokens": 1024,
    "tools": [
      {
        "type": "text_editor_20250429",
        "name": "str_replace_based_edit_tool"
      }
    ],
    "messages": [
      {
        "role": "user",
        "content": "There'\''s a syntax error in my primes.py file. Can you help me fix it?"
      }
    ]
  }'
```
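
When Claude decides to use the tool, the response contains a `tool_use` content block whose `input` carries a command such as `view` or `str_replace`; the client executes it and sends back a `tool_result` block. The sketch below handles those two commands against an in-memory file store. `handleToolUse` and `FileStore` are illustrative assumptions; a real integration would read and write the workspace and cover the remaining commands.

```typescript
// Only two of the editor commands are modeled in this sketch.
type EditorInput =
  | { command: "view"; path: string }
  | { command: "str_replace"; path: string; old_str: string; new_str: string };

interface ToolUseBlock {
  type: "tool_use";
  id: string;
  name: string;
  input: EditorInput;
}

interface ToolResultBlock {
  type: "tool_result";
  tool_use_id: string;
  content: string;
  is_error?: boolean;
}

// Hypothetical stand-in for the workspace: path -> file contents.
type FileStore = Record<string, string | undefined>;

function handleToolUse(block: ToolUseBlock, files: FileStore): ToolResultBlock {
  const fail = (message: string): ToolResultBlock => ({
    type: "tool_result",
    tool_use_id: block.id,
    content: message,
    is_error: true,
  });

  const input = block.input;
  const text = files[input.path];
  if (text === undefined) return fail(`File not found: ${input.path}`);

  if (input.command === "view") {
    return { type: "tool_result", tool_use_id: block.id, content: text };
  }

  // Require exactly one occurrence so the edit is unambiguous.
  const oldStr = input.old_str;
  const newStr = input.new_str;
  const occurrences = oldStr === "" ? 0 : text.split(oldStr).length - 1;
  if (occurrences !== 1) {
    return fail(`Expected exactly 1 match for old_str, found ${occurrences}`);
  }

  // A replacer function avoids special handling of "$" in newStr.
  files[input.path] = text.replace(oldStr, () => newStr);
  return { type: "tool_result", tool_use_id: block.id, content: "Edit applied" };
}
```

Returning errors as `tool_result` blocks with `is_error` set, rather than throwing, gives the model a chance to correct its own call on the next turn.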

### Proposal Checklist

- [x] I have searched existing Issues and Discussions to ensure this proposal is not a duplicate.
- [x] This proposal is for a specific, actionable change intended for implementation (not a general idea).
- [x] I understand that this proposal requires review and approval before any development work begins.

### Are you interested in implementing this feature if approved?

- [x] Yes, I would like to contribute to implementing this feature.
