Add missing CF Workers AI models: llama-3.3-70b-fp8-fast, qwen2.5-coder-32b, gpt-oss-20b, mistral-small-3.1, qwen3-30b, llama-3.2-1b/3b, kimi-k2.7-code

## Summary

Running bildy (https://github.com/Stackbilt-dev/bildy) against the live CF Workers AI model catalog via `GET /accounts/{id}/ai/models/search?task=Text+Generation` reveals 8 models missing from `model-catalog` that are material to routing quality.
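For anyone who wants to reproduce the comparison, here is a minimal sketch of the diff. It is not bildy's actual code: the function names are hypothetical, and it assumes the search endpoint returns the standard Cloudflare v4 envelope with each model id under `result[].name`. It reads only the first page of results, so pagination would need handling on a large catalog.

```python
import json
import urllib.parse
import urllib.request

API_BASE = "https://api.cloudflare.com/client/v4"


def fetch_live_model_names(account_id: str, api_token: str) -> set[str]:
    """Return the Text Generation model names reported by the search endpoint."""
    # urlencode produces task=Text+Generation, matching the query in this issue.
    query = urllib.parse.urlencode({"task": "Text Generation"})
    url = f"{API_BASE}/accounts/{account_id}/ai/models/search?{query}"
    request = urllib.request.Request(
        url, headers={"Authorization": f"Bearer {api_token}"}
    )
    with urllib.request.urlopen(request) as response:
        payload = json.load(response)
    # Assumption: each entry in "result" carries the model id under "name".
    return {entry["name"] for entry in payload.get("result", [])}


def missing_from_catalog(live: set[str], catalog: set[str]) -> list[str]:
    """Return models that are live on Workers AI but absent from the local catalog."""
    return sorted(live - catalog)
```

Passing the set of ids already present in `model-catalog` as `catalog` should yield the candidates listed in the table below, before the exclusions are applied.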

## Missing models (CF API confirms all active)

| Model | Suggested use cases | Notes |
|---|---|---|
| `@cf/meta/llama-3.3-70b-instruct-fp8-fast` | `BALANCED`, `HIGH_PERFORMANCE`, `TOOL_CALLING` | Best value model on Workers AI right now. Fast FP8 quant. Primary gap for `planning` route class. |
| `@cf/qwen/qwen2.5-coder-32b-instruct` | `COST_EFFECTIVE` (code) | Purpose-built for code generation. Critical for `code_draft` routing. |
| `@cf/openai/gpt-oss-20b` | `COST_EFFECTIVE`, `TOOL_CALLING` | Smaller sibling of `gpt-oss-120b` (already active). Good cheap tool-calling option. |
| `@cf/mistralai/mistral-small-3.1-24b-instruct` | `BALANCED`, `TOOL_CALLING` | Strong quality/cost ratio, vision capable. |
| `@cf/qwen/qwen3-30b-a3b-fp8` | `BALANCED`, `HIGH_PERFORMANCE` | Qwen3 generation, state-of-the-art for cost tier. |
| `@cf/meta/llama-3.2-1b-instruct` | `COST_EFFECTIVE` | Ultra-cheap for dead-simple summary/classify turns. |
| `@cf/meta/llama-3.2-3b-instruct` | `COST_EFFECTIVE` | Pair with 1b above as step-up model. |
| `@cf/moonshotai/kimi-k2.7-code` | `TOOL_CALLING`, `BALANCED` | Code-focused variant of kimi-k2.6 (already active). |
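As a rough illustration of how these suggestions could be written down as data, the sketch below maps each model id to the use-case tags from the table. The dictionary shape and the `models_for` helper are hypothetical and are not the real `model-catalog` schema; only the ids and tags come from this issue.

```python
# Hypothetical shape: model id -> suggested use-case tags, copied from the table.
PROPOSED_MODELS: dict[str, tuple[str, ...]] = {
    "@cf/meta/llama-3.3-70b-instruct-fp8-fast": ("BALANCED", "HIGH_PERFORMANCE", "TOOL_CALLING"),
    "@cf/qwen/qwen2.5-coder-32b-instruct": ("COST_EFFECTIVE",),
    "@cf/openai/gpt-oss-20b": ("COST_EFFECTIVE", "TOOL_CALLING"),
    "@cf/mistralai/mistral-small-3.1-24b-instruct": ("BALANCED", "TOOL_CALLING"),
    "@cf/qwen/qwen3-30b-a3b-fp8": ("BALANCED", "HIGH_PERFORMANCE"),
    "@cf/meta/llama-3.2-1b-instruct": ("COST_EFFECTIVE",),
    "@cf/meta/llama-3.2-3b-instruct": ("COST_EFFECTIVE",),
    "@cf/moonshotai/kimi-k2.7-code": ("TOOL_CALLING", "BALANCED"),
}


def models_for(use_case: str) -> list[str]:
    """Return the proposed model ids tagged with the given use case, in table order."""
    return [name for name, tags in PROPOSED_MODELS.items() if use_case in tags]
```

For example, `models_for("HIGH_PERFORMANCE")` would return the Llama 3.3 70B and Qwen3 30B entries.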

## Impact on bildy routing

Without `llama-3.3-70b-fp8-fast`, bildy's `planning` and `code_draft` route classes fall back to `gpt-oss-120b`, which is in the right capability tier but is not the most cost-effective option. Without `qwen2.5-coder-32b`, code generation routing has no purpose-built model.

## Intentionally excluded

- LoRA variants (`-lora` suffix): fine-tune infrastructure, not general routing
- `@cf/meta/llama-guard-3-8b`: safety classifier, not chat
- `@cf/aisingapore/gemma-sea-lion-v4-27b-it`: SEA-localized, niche
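
The exclusions above could be applied mechanically when diffing against the live catalog. A minimal sketch, assuming LoRA variants are always identifiable by a trailing `-lora` in the model id; `EXCLUDED_MODELS` and `is_routing_candidate` are hypothetical names, not part of bildy or `model-catalog`.

```python
# Models excluded by name, per the list above.
EXCLUDED_MODELS = {
    "@cf/meta/llama-guard-3-8b",                 # safety classifier, not chat
    "@cf/aisingapore/gemma-sea-lion-v4-27b-it",  # SEA-localized, niche
}


def is_routing_candidate(model_name: str) -> bool:
    """Return True unless the model is a LoRA variant or explicitly excluded."""
    # Assumption: LoRA variants always end in "-lora".
    if model_name.endswith("-lora"):
        return False
    return model_name not in EXCLUDED_MODELS
```

Filtering the live model list through `is_routing_candidate` before the comparison keeps these entries out of the "missing" set.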
