Bug: glm-4.7-flash tagged COST_EFFECTIVE but is a thinking model — outputs reasoning traces instead of direct responses #93

@stackbilt-admin

Bug

`@cf/zai-org/glm-4.7-flash` is currently tagged `active, [COST_EFFECTIVE, BALANCED, TOOL_CALLING, LONG_CONTEXT]` in `model-catalog`. This causes it to be selected as the first-choice model for summary and cost-effective routing.

The model outputs chain-of-thought reasoning traces instead of direct responses, making it unsuitable for any route class where the caller expects a clean answer. Example of what users get when glm-4.7-flash is selected for a summary turn:

```
  1. Analyze the user's request:
    • Topic: Local LLM gateway proxy.
    • Constraint: "In one sentence."
    • Why do people use it? ...
    • Draft 1: A local LLM gateway proxy is software that...
    • *Draft 2 (More technical
```

The response is cut off mid-reasoning at `max_tokens`, so the model never reaches the actual answer.

Other models likely affected (same pattern)

Two additional models are expected to have the same issue if they are added to the catalog:

  • `@cf/deepseek-ai/deepseek-r1-distill-qwen-32b` — DeepSeek R1 distill, reasoning model
  • `@cf/qwen/qwq-32b` — QwQ is explicitly a reasoning model

Fix

Either:

  1. Remove `COST_EFFECTIVE` and `BALANCED` use cases from `glm-4.7-flash` — keep it only under an explicit `REASONING` or `ANALYTICAL` use case
  2. Add a `thinkingModel: true` flag to the catalog entry so routers can filter these out for direct-response routing
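
A minimal sketch of option 2, assuming a catalog entry shape that is not confirmed by `model-catalog`: the `CatalogEntry` interface, the `candidatesFor` helper, the `directResponse` option, and the `example/direct-model` placeholder ID are all hypothetical and exist only to illustrate the filtering idea.

```typescript
// Hypothetical catalog shape. Only the use-case tags and the proposed
// `thinkingModel` flag come from this issue; everything else is assumed.
type UseCase =
  | "COST_EFFECTIVE"
  | "BALANCED"
  | "TOOL_CALLING"
  | "LONG_CONTEXT"
  | "REASONING";

interface CatalogEntry {
  id: string;
  status: "active" | "inactive";
  useCases: UseCase[];
  thinkingModel?: boolean; // proposed flag; absent means "not a thinking model"
}

const catalog: CatalogEntry[] = [
  {
    id: "@cf/zai-org/glm-4.7-flash",
    status: "active",
    useCases: ["COST_EFFECTIVE", "BALANCED", "TOOL_CALLING", "LONG_CONTEXT"],
    thinkingModel: true,
  },
  {
    id: "example/direct-model", // placeholder, not a real model ID
    status: "active",
    useCases: ["COST_EFFECTIVE"],
  },
];

// Return active entries tagged with `useCase`, in catalog order. When the
// caller needs a direct response, entries flagged as thinking models are
// skipped.
function candidatesFor(
  useCase: UseCase,
  entries: CatalogEntry[],
  opts: { directResponse: boolean },
): CatalogEntry[] {
  return entries.filter(
    (e) =>
      e.status === "active" &&
      e.useCases.includes(useCase) &&
      !(opts.directResponse && e.thinkingModel === true),
  );
}
```

With a filter like this, a summary route could call `candidatesFor("COST_EFFECTIVE", catalog, { directResponse: true })` and never see `glm-4.7-flash`, while a route that wants reasoning output could pass `directResponse: false` and keep it eligible without retagging.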

When `@cf/deepseek-ai/deepseek-r1-distill-qwen-32b` and `@cf/qwen/qwq-32b` are added (see companion issue), apply the same handling.

Discovered via

bildy end-to-end smoke test: live summary routing selected `glm-4.7-flash` and returned a reasoning trace to Claude Code instead of a summary response.
