Bug: glm-4.7-flash tagged COST_EFFECTIVE but is a thinking model — outputs reasoning traces instead of direct responses

## Bug

`@cf/zai-org/glm-4.7-flash` is currently tagged `active, [COST_EFFECTIVE, BALANCED, TOOL_CALLING, LONG_CONTEXT]` in `model-catalog`. This causes it to be selected as the **first-choice model for summary and cost-effective routing**.

The model outputs chain-of-thought reasoning traces instead of direct responses, making it unsuitable for any route class where the caller expects a clean answer. Example of what users get when glm-4.7-flash is selected for a summary turn:

```
1. **Analyze the user's request:**
   * **Topic:** Local LLM gateway proxy.
   * **Constraint:** "In one sentence."
   * Why do people use it? ...
   * *Draft 1:* A local LLM gateway proxy is software that...
   * *Draft 2 (More technical
```

The response hits `max_tokens` and is cut off mid-reasoning, before the model reaches the actual answer.

## Models expected to be affected (same pattern)

Two additional models are expected to have the same issue if added to the catalog:
- `@cf/deepseek-ai/deepseek-r1-distill-qwen-32b`: DeepSeek R1 distill, a reasoning model
- `@cf/qwen/qwq-32b`: QwQ is explicitly a reasoning model

## Fix

Either:
1. Remove the `COST_EFFECTIVE` and `BALANCED` use cases from `glm-4.7-flash` and keep it only under an explicit `REASONING` or `ANALYTICAL` use case.
2. Add a `thinkingModel: true` flag to the catalog entry so routers can filter these models out for direct-response routing.

When `@cf/deepseek-ai/deepseek-r1-distill-qwen-32b` and `@cf/qwen/qwq-32b` are added (see companion issue), apply the same handling.
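
A minimal sketch of how option 2 could look on the router side. The `CatalogEntry` shape, the `candidatesFor` helper, and the `example/direct-model` entry are hypothetical and only for illustration; the real `model-catalog` types may differ. Only the use-case tags and the proposed `thinkingModel` flag come from this issue.

```typescript
// Hypothetical catalog shape. Only the tag names and `thinkingModel`
// come from this issue; everything else is an assumption.
type UseCase =
  | "COST_EFFECTIVE"
  | "BALANCED"
  | "TOOL_CALLING"
  | "LONG_CONTEXT"
  | "REASONING";

interface CatalogEntry {
  id: string;
  status: "active" | "inactive";
  useCases: UseCase[];
  thinkingModel?: boolean; // proposed flag from option 2
}

const catalog: CatalogEntry[] = [
  {
    id: "@cf/zai-org/glm-4.7-flash",
    status: "active",
    useCases: ["COST_EFFECTIVE", "BALANCED", "TOOL_CALLING", "LONG_CONTEXT"],
    thinkingModel: true,
  },
  {
    id: "example/direct-model", // placeholder, not a real catalog entry
    status: "active",
    useCases: ["COST_EFFECTIVE"],
  },
];

// Returns active entries tagged with `useCase`. When the caller needs a
// direct response, entries flagged as thinking models are skipped.
function candidatesFor(
  entries: CatalogEntry[],
  useCase: UseCase,
  opts: { directResponse: boolean },
): CatalogEntry[] {
  return entries.filter(
    (e) =>
      e.status === "active" &&
      e.useCases.some((u) => u === useCase) &&
      !(opts.directResponse && e.thinkingModel === true),
  );
}

const summaryCandidates = candidatesFor(catalog, "COST_EFFECTIVE", {
  directResponse: true,
});
const reasoningTolerant = candidatesFor(catalog, "COST_EFFECTIVE", {
  directResponse: false,
});
```

With this filter, the tags on `glm-4.7-flash` can stay as they are while summary routing skips it. Option 1 achieves the same result by editing the tags instead, at the cost of losing the model for callers that can tolerate reasoning output.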

## Discovered via

bildy end-to-end smoke test: live summary routing selected `glm-4.7-flash` and returned a reasoning trace to Claude Code instead of a summary response.
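
A smoke-test assertion along these lines might catch the same regression in future. This is a heuristic sketch, not the existing test: the `CompletionResult` shape and the `finishReason` values are assumptions modeled on common chat-completion APIs, and the marker patterns are taken only from the trace quoted in this issue.

```typescript
// Hypothetical result shape; `finishReason: "length"` is assumed to mean
// the response was cut off at `max_tokens`.
interface CompletionResult {
  text: string;
  finishReason: "stop" | "length";
}

// Heuristic markers lifted from the reasoning trace shown in this issue.
const TRACE_MARKERS: RegExp[] = [/^\s*1\.\s+\*\*Analyze/i, /\*Draft \d+/];

// True when the response was truncated AND looks like a reasoning trace.
function looksLikeTruncatedReasoning(r: CompletionResult): boolean {
  const hasMarker = TRACE_MARKERS.some((re) => re.test(r.text));
  return r.finishReason === "length" && hasMarker;
}

const traceSample: CompletionResult = {
  text: "1. **Analyze the user's request:**\n   * *Draft 1:* A local LLM gateway proxy is software that...",
  finishReason: "length",
};
const directSample: CompletionResult = {
  text: "A local LLM gateway proxy routes requests to model backends.",
  finishReason: "stop",
};
```

Here `looksLikeTruncatedReasoning(traceSample)` evaluates to `true` and `looksLikeTruncatedReasoning(directSample)` to `false`. Pattern matching like this is brittle across models, so it works best as a tripwire in the smoke test rather than as routing logic.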
