Eval bug: LFM2.5-8B-A1B tool parser rejects documented <|tool_call_start|>[...]<|tool_call_end|> format #23838

@chdlc


Name and Version

version: 9378 (09e7b76)
built with Clang 19.1.5 for Windows x86_64

Operating systems

Windows

GGML backends

CUDA

Hardware

i7-13620H - RTX 4070

Models

https://huggingface.co/LiquidAI/LFM2.5-8B-A1B-GGUF

Problem description & steps to reproduce

Problem description

I'm seeing a parser failure with tool calls from LiquidAI/LFM2.5-8B-A1B-GGUF when using llama-server.

Normal chat completions work. The issue only shows up when tools are provided.

The model generates this tool call:

<|tool_call_start|>[web_search_exa(query='current temperature in Mexico City', numResults=5)]<|tool_call_end|>

llama-server then returns a 500 error:

{
  "error": {
    "code": 500,
    "message": "Failed to parse input at pos 395: <|tool_call_start|>[web_search_exa(query='current temperature in Mexico City', numResults=5)]<|tool_call_end|>",
    "type": "server_error"
  }
}

From the LFM2.5 model card, this is the documented tool-call format: a Python-style function call wrapped with <|tool_call_start|> and <|tool_call_end|>. The model is producing the documented format, but the llama.cpp parser is rejecting it.
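To make the expected behavior concrete, here is a minimal Python sketch of how the documented format can be parsed. It is not the llama.cpp PEG parser; `parse_tool_calls` is a hypothetical helper that leans on Python's `ast` module, and it assumes keyword arguments only (positional arguments are ignored).

```python
import ast

START = "<|tool_call_start|>"
END = "<|tool_call_end|>"


def parse_tool_calls(text: str) -> list[dict]:
    """Extract Python-style tool calls wrapped in the LFM2.5 markers."""
    begin = text.find(START)
    if begin == -1:
        return []  # no tool call in this output
    finish = text.find(END, begin)
    if finish == -1:
        raise ValueError("missing <|tool_call_end|>")
    body = text[begin + len(START):finish].strip()

    # The body is a bracketed list of call expressions: [f(a=1), g(b='x')]
    tree = ast.parse(body, mode="eval")
    if not isinstance(tree.body, ast.List):
        raise ValueError("expected a bracketed list of calls")

    calls = []
    for node in tree.body.elts:
        if not (isinstance(node, ast.Call) and isinstance(node.func, ast.Name)):
            raise ValueError("expected a plain function call")
        # literal_eval only accepts literals, so no model output is executed.
        args = {kw.arg: ast.literal_eval(kw.value) for kw in node.keywords}
        calls.append({"name": node.func.id, "arguments": args})
    return calls


sample = (
    "<|tool_call_start|>[web_search_exa(query='current temperature in "
    "Mexico City', numResults=5)]<|tool_call_end|>"
)
print(parse_tool_calls(sample))
# [{'name': 'web_search_exa', 'arguments': {'query': 'current temperature in Mexico City', 'numResults': 5}}]
```

The string the server rejects parses cleanly under these assumptions, which is consistent with the output being well-formed and the failure lying in parser selection rather than in the generated text.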

Prior art: PR #20251 added a dedicated PEG parser for LFM2's Python-style tool calls, but it only detected LFM2 via the <|tool_list_start|> / <|tool_list_end|> tags. PR #21242 then added LFM2.5 support. My build (09e7b76c9, May 28 2026) includes PR #21242, yet the issue persists, which suggests either a regression or incomplete detection for LFM2.5 specifically.

Possible root cause: In common/chat.cpp, the is_lfm2_template() function detects LFM2 templates by checking for <|tool_list_start|> and <|tool_list_end|>. LFM2.5's template does not use these tags -- it only uses <|tool_call_start|> / <|tool_call_end|>. This means LFM2.5 may not be routed to the Python-style parser at all.
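The suspected mismatch can be illustrated with a small Python sketch. This is not the code in common/chat.cpp; both functions and both template strings are hypothetical stand-ins, written only to show how a check keyed on the tool-list tags would miss a template that carries only the tool-call tags.

```python
LIST_TAGS = ("<|tool_list_start|>", "<|tool_list_end|>")
CALL_TAGS = ("<|tool_call_start|>", "<|tool_call_end|>")


def detect_by_list_tags(template: str) -> bool:
    """Hypothetical stand-in for a check that requires the tool-list tags."""
    return all(tag in template for tag in LIST_TAGS)


def detect_by_call_tags(template: str) -> bool:
    """Hypothetical broader check keyed on the tool-call tags instead."""
    return all(tag in template for tag in CALL_TAGS)


# Toy fragments for illustration only, not the real chat templates.
lfm2_like = (
    "<|tool_list_start|>{{ tools }}<|tool_list_end|>"
    "<|tool_call_start|>{{ calls }}<|tool_call_end|>"
)
lfm25_like = "{{ tools }}<|tool_call_start|>{{ calls }}<|tool_call_end|>"

for name, template in (("LFM2-like", lfm2_like), ("LFM2.5-like", lfm25_like)):
    print(name, detect_by_list_tags(template), detect_by_call_tags(template))
# LFM2-like True True
# LFM2.5-like False True
```

If the real detection behaves like the first function, an LFM2.5 template would fall through to a generic parser, which would match the error above. A check on the tool-call tags alone might be too broad if other model families use the same tags, so this is only a starting point for investigation.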

Steps to reproduce

  1. Start llama-server with Jinja enabled and the official chat_template.jinja from LiquidAI/LFM2.5-8B-A1B:
llama-server \
  -hf LiquidAI/LFM2.5-8B-A1B-GGUF:Q4_K_M \
  --jinja \
  --chat-template-file /path/to/chat_template.jinja
  2. Send this request to /v1/chat/completions:
{
  "model": "LiquidAI/LFM2.5-8B-A1B-GGUF:Q4_K_M",
  "messages": [
    {
      "role": "user",
      "content": "Search for the current temperature in Mexico City."
    }
  ],
  "tools": [
    {
      "type": "function",
      "function": {
        "name": "web_search_exa",
        "description": "Search the web",
        "parameters": {
          "type": "object",
          "properties": {
            "query": {
              "type": "string"
            },
            "numResults": {
              "type": "integer"
            }
          },
          "required": ["query"]
        }
      }
    }
  ],
  "tool_choice": "auto",
  "parse_tool_calls": true,
  "temperature": 0,
  "max_tokens": 256
}
  3. The server returns:
{
  "error": {
    "code": 500,
    "message": "Failed to parse input at pos 395: <|tool_call_start|>[web_search_exa(query='current temperature in Mexico City', numResults=5)]<|tool_call_end|>",
    "type": "server_error"
  }
}

I also tried the same request with "parse_tool_calls": false but it still fails with the same parser error.
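For convenience, the request above can be scripted. The sketch below uses only the Python standard library and assumes the server listens on http://localhost:8080; adjust `base_url` if your setup differs. `build_payload`, `post_chat`, and `main` are helper names chosen for this example.

```python
import json
import urllib.error
import urllib.request


def build_payload(parse_tool_calls: bool) -> dict:
    """Build the request body from the reproduction steps."""
    return {
        "model": "LiquidAI/LFM2.5-8B-A1B-GGUF:Q4_K_M",
        "messages": [
            {
                "role": "user",
                "content": "Search for the current temperature in Mexico City.",
            }
        ],
        "tools": [
            {
                "type": "function",
                "function": {
                    "name": "web_search_exa",
                    "description": "Search the web",
                    "parameters": {
                        "type": "object",
                        "properties": {
                            "query": {"type": "string"},
                            "numResults": {"type": "integer"},
                        },
                        "required": ["query"],
                    },
                },
            }
        ],
        "tool_choice": "auto",
        "parse_tool_calls": parse_tool_calls,
        "temperature": 0,
        "max_tokens": 256,
    }


def post_chat(base_url: str, payload: dict) -> tuple[int, dict]:
    """POST the payload and return (HTTP status, decoded JSON body)."""
    request = urllib.request.Request(
        base_url + "/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    try:
        with urllib.request.urlopen(request) as response:
            return response.status, json.load(response)
    except urllib.error.HTTPError as err:
        # Error responses (such as the 500 above) still carry a JSON body.
        return err.code, json.loads(err.read().decode("utf-8"))


def main(base_url: str = "http://localhost:8080") -> None:
    """Send the request with parse_tool_calls set to true, then false."""
    for flag in (True, False):
        status, body = post_chat(base_url, build_payload(flag))
        message = body.get("error", {}).get("message", "no error")
        print(f"parse_tool_calls={flag}: HTTP {status}: {message}")
```

Calling `main()` against a running server sends both variants back to back, so the two responses can be compared directly.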

Chat template reference

The official LFM2.5 chat_template.jinja renders tool calls as (from the render_tool_calls macro):

<|tool_call_start|>[func_name(arg1='value1', arg2=value2)]<|tool_call_end|>

Note: the template does not contain <|tool_list_start|> or <|tool_list_end|> -- only <|tool_call_start|> / <|tool_call_end|>. This is different from LFM2 which uses both pairs.
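As a rough Python approximation of what that rendering amounts to (not the Jinja macro itself), the sketch below formats a call as keyword arguments between the two tags. `render_tool_call` is a hypothetical helper, and using Python's `repr` for argument values is an assumption that happens to match the single-quoted strings and bare integers seen in the model output.

```python
def render_tool_call(name: str, arguments: dict) -> str:
    """Format one tool call in the Python-style syntax shown above."""
    # repr() gives 'single-quoted' strings and bare numbers.
    rendered = ", ".join(f"{key}={value!r}" for key, value in arguments.items())
    return f"<|tool_call_start|>[{name}({rendered})]<|tool_call_end|>"


print(
    render_tool_call(
        "web_search_exa",
        {"query": "current temperature in Mexico City", "numResults": 5},
    )
)
# <|tool_call_start|>[web_search_exa(query='current temperature in Mexico City', numResults=5)]<|tool_call_end|>
```

The rendered string is identical to the one reported in the error message, so the model output matches the shape the template itself produces for prior tool calls.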

First Bad Commit

Likely related to the autoparser refactoring (#18675), or to the LFM2.5-specific handling in #21242 not fully covering this case.

Relevant log output

Request with parse_tool_calls=true:

{"error":{"code":500,"message":"Failed to parse input at pos 395: <|tool_call_start|>[web_search_exa(query='current temperature in Mexico City', numResults=5)]<|tool_call_end|>","type":"server_error"}}

Request with parse_tool_calls=false:

{"error":{"code":500,"message":"Failed to parse input at pos 395: <|tool_call_start|>[web_search_exa(query='current temperature in Mexico City', numResults=5)]<|tool_call_end|>","type":"server_error"}}
