Gemini CLI agent is not currently reliable with AssetOpsBench + TokenRouter #413

Description

@ChathurangiShyalika

Summary

We investigated adding gemini-cli-agent as a CLI-based agent for AssetOpsBench, but it is currently not reliable for benchmark runs with TokenRouter-backed models.

The main blocker is that Gemini CLI expects Gemini-compatible auth, model, and tool-call behavior, while our preferred TokenRouter model path (tokenrouter/MiniMax-M3) is OpenAI-compatible. Even with a Gemini-style TokenRouter route, the run failed during streamed tool calling.

What we tried

Smoke command:

uv run gemini-cli-agent \
  --model-id tokenrouter_gemini/google/gemma-4-26b-a4b-it \
  --workspace-dir /tmp/assetopsbench-gemini/smoke \
  --allow-files \
  --allow-bash \
  --show-trajectory \
  "What is the current time?"

Observed behavior:

After auth/config fixes, Gemini CLI was able to start and call an AssetOpsBench MCP tool:

tool: mcp_utilities_current_time_english
output: {
  "english": "2026-06-25 17:45:00",
  "iso": "2026-06-25T17:45:00.614808Z"
}

But the run ended with:

Gemini CLI error: Invalid stream: The model returned an empty response or malformed tool call.

It also attempted invalid tool calls:

Tool "generic_tool" not found. Did you mean one of: "read_file", "grep_search", "glob"?
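
For illustration, the kind of name check behind that error can be sketched as follows. This is a minimal sketch, not Gemini CLI's actual implementation: `REGISTERED_TOOLS` and `check_tool_name` are hypothetical names, and the tool list is only what appears in this issue's logs.

```python
from difflib import get_close_matches

# Hypothetical registry: names copied from the logs in this issue.
# The real set is whatever Gemini CLI registers at runtime.
REGISTERED_TOOLS = (
    "read_file",
    "grep_search",
    "glob",
    "mcp_utilities_current_time_english",
)


def check_tool_name(name, registered=REGISTERED_TOOLS):
    """Return (is_known, suggestions) for a tool name emitted by the model."""
    if name in registered:
        return True, []
    # cutoff=0.0 means we always return the n closest names, however distant.
    return False, get_close_matches(name, registered, n=3, cutoff=0.0)


is_known, suggestions = check_tool_name("generic_tool")
print(is_known, suggestions)
```

A check like this only detects the bad name; it does not explain why the model emitted `generic_tool` in the first place.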

Why this is a blocker

AssetOpsBench depends on reliable MCP tool use for operational data access. In this test, Gemini CLI could reach MCP, but the model/tool stream was malformed, causing the agent to fail before producing a usable answer.
This makes Gemini CLI unsuitable for benchmark runs with TokenRouter at the moment.

Root causes / likely causes

tokenrouter/MiniMax-M3 is OpenAI-compatible, not Gemini API-compatible, so it cannot be used directly with Gemini CLI.
The Gemini-compatible TokenRouter route starts, but its streamed tool calls are not in a form Gemini CLI accepts (see the "Invalid stream" error above).
The model emitted a call to generic_tool, which is not registered in Gemini CLI, suggesting a tool-call schema mismatch.
Therefore, Gemini CLI + TokenRouter is not currently reliable for AssetOpsBench MCP-heavy scenarios.
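
To make the schema difference concrete, here is a minimal sketch of how an OpenAI-style tool call differs from a Gemini-style `functionCall` part. OpenAI-compatible responses carry arguments as a JSON string under `function.arguments`, while the Gemini API carries them as an object under `functionCall.args`. The helper name `openai_tool_call_to_gemini_part` and the example `id` are hypothetical; this is not what TokenRouter or Gemini CLI does internally.

```python
import json


def openai_tool_call_to_gemini_part(tool_call):
    """Convert one OpenAI-style tool call into a Gemini-style functionCall part."""
    fn = tool_call["function"]
    # OpenAI-compatible APIs send arguments as a JSON-encoded string.
    raw_args = fn.get("arguments") or "{}"
    args = json.loads(raw_args)  # raises json.JSONDecodeError if malformed
    if not isinstance(args, dict):
        raise ValueError("tool arguments must decode to a JSON object")
    # Gemini-style parts carry arguments as a structured object.
    return {"functionCall": {"name": fn["name"], "args": args}}


example = {
    "id": "call_0",  # hypothetical id, for illustration only
    "type": "function",
    "function": {
        "name": "mcp_utilities_current_time_english",
        "arguments": "{}",
    },
}
part = openai_tool_call_to_gemini_part(example)
print(part)
```

Any adapter between the two formats has to perform this kind of translation on every streamed chunk, which is one plausible place for the malformed tool calls seen above to originate.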

Expected behavior

The Gemini CLI agent should:
Start with a TokenRouter-backed model config.
Call AssetOpsBench MCP tools correctly.
Receive valid streamed tool-call responses.
Produce a final answer and trajectory JSON.

Actual behavior

The agent starts and calls at least one MCP tool, but fails with malformed stream/tool-call errors.
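
For triaging repeated smoke runs, a small classifier over the captured output could flag the two failure signatures reported here. This is a sketch under assumptions: `classify_run` is a hypothetical helper, and it matches only the exact error strings quoted in this issue.

```python
import re

# Error signatures copied verbatim from the observed output in this issue.
INVALID_STREAM = (
    "Invalid stream: The model returned an empty response or malformed tool call."
)
UNKNOWN_TOOL = re.compile(r'Tool "([^"]+)" not found')


def classify_run(output):
    """Summarize which known failure signatures appear in captured run output."""
    unknown_tools = UNKNOWN_TOOL.findall(output)
    invalid_stream = INVALID_STREAM in output
    return {
        "invalid_stream": invalid_stream,
        "unknown_tools": unknown_tools,
        "ok": not invalid_stream and not unknown_tools,
    }


sample = (
    'Tool "generic_tool" not found. Did you mean one of: '
    '"read_file", "grep_search", "glob"?\n'
    "Gemini CLI error: " + INVALID_STREAM
)
print(classify_run(sample))
```

An `ok` result here only means neither signature appeared; it does not confirm that a final answer or trajectory JSON was produced.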
