Releases · Stackbilt-dev/llm-providers

13 Jun 11:47

stackbilt-admin

v1.16.0

b826a5d

v1.16.0 — Full CF Workers AI integration

Latest

What's new

4 new Cloudflare Workers AI models

| Model | Context | Tools | Use cases |
| --- | --- | --- | --- |
| `@cf/nvidia/nemotron-3-120b-a12b` | 256K | ✓ | `HIGH_PERFORMANCE`, `TOOL_CALLING`, `LONG_CONTEXT` |
| `anthropic/claude-opus-4.8` | 1M | — | `HIGH_PERFORMANCE`, `LONG_CONTEXT` (CF-managed Anthropic) |
| `@cf/deepseek-ai/deepseek-r1-distill-qwen-32b` | 80K | — | `RESEARCH` (thinking model) |
| `@cf/qwen/qwq-32b` | 24K | — | `RESEARCH` (thinking model) |

`thinkingModel` routing guard

New ModelCapabilities.thinkingModel?: boolean flag marks models that output chain-of-thought reasoning traces. rankModels() now excludes all thinkingModel: true entries from every use-case pool except RESEARCH, preventing them from winning direct-response routes (summary, chat, tool calling, etc.).

Affected models: @cf/deepseek-ai/deepseek-r1-distill-qwen-32b, @cf/qwen/qwq-32b, @cf/zai-org/glm-4.7-flash.
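The guard described above can be pictured as a filter applied before ranking. The following is a minimal sketch under stated assumptions, not the package's `rankModels()` implementation: `ModelEntry`, `UseCase`, and `candidatePool` are hypothetical names, and the `example/mislabeled-thinker` entry is invented purely to show that a thinking model stays out of non-RESEARCH pools even if its catalog entry lists other use cases.

```typescript
// Hypothetical shapes for illustration; the package's real types may differ.
type UseCase = "RESEARCH" | "HIGH_PERFORMANCE" | "TOOL_CALLING" | "LONG_CONTEXT";

interface ModelEntry {
  id: string;
  useCases: UseCase[];
  thinkingModel?: boolean;
}

// Builds the candidate pool for one use case. Thinking models are dropped
// from every pool except RESEARCH, regardless of what their entry lists.
function candidatePool(models: ModelEntry[], useCase: UseCase): ModelEntry[] {
  return models.filter((m) => {
    if (m.useCases.indexOf(useCase) === -1) return false;
    if (m.thinkingModel === true && useCase !== "RESEARCH") return false;
    return true;
  });
}

const catalog: ModelEntry[] = [
  {
    id: "@cf/nvidia/nemotron-3-120b-a12b",
    useCases: ["HIGH_PERFORMANCE", "TOOL_CALLING", "LONG_CONTEXT"],
  },
  { id: "@cf/qwen/qwq-32b", useCases: ["RESEARCH"], thinkingModel: true },
  // Invented entry: lists TOOL_CALLING but is flagged as a thinking model.
  {
    id: "example/mislabeled-thinker",
    useCases: ["TOOL_CALLING", "RESEARCH"],
    thinkingModel: true,
  },
];

const toolPool = candidatePool(catalog, "TOOL_CALLING").map((m) => m.id);
const researchPool = candidatePool(catalog, "RESEARCH").map((m) => m.id);
```

Filtering before ranking, rather than lowering a score, is what makes the exclusion unconditional: a thinking model cannot win a direct-response route no matter how the remaining candidates score.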

Anthropic-via-CF response normalizer

anthropic/claude-opus-4.8 uses the Anthropic message wire format through the Workers AI binding. The provider now routes these through a dedicated formatter (system as a top-level field, not a system role message) and normalizes the content[{type, text}] + stop_reason response shape to the standard LLMResponse contract.
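A rough sketch of the two transformations this paragraph describes: lifting system messages into a top-level `system` field, and flattening `content[{type, text}]` plus `stop_reason` into a simpler response. All type and function names here are hypothetical, the `NormalizedResponse` shape is a stand-in for the package's `LLMResponse`, and the `stop_reason` to finish-reason mapping is an assumption rather than the provider's documented behavior.

```typescript
interface ChatMessage {
  role: "system" | "user" | "assistant";
  content: string;
}

type Turn = { role: "user" | "assistant"; content: string };

interface AnthropicStyleRequest {
  system?: string;
  messages: Turn[];
}

interface AnthropicStyleResponse {
  content: { type: string; text?: string }[];
  stop_reason: string | null;
}

// Stand-in for the package's LLMResponse; the real contract has more fields.
interface NormalizedResponse {
  message: string;
  finishReason: string;
}

// Moves system messages out of the message list into a top-level field.
function toAnthropicRequest(messages: ChatMessage[]): AnthropicStyleRequest {
  const systemParts: string[] = [];
  const turns: Turn[] = [];
  for (const m of messages) {
    const role = m.role;
    if (role === "system") systemParts.push(m.content);
    else turns.push({ role, content: m.content });
  }
  const request: AnthropicStyleRequest = { messages: turns };
  if (systemParts.length > 0) request.system = systemParts.join("\n\n");
  return request;
}

// Concatenates text blocks and maps stop_reason onto a generic finish reason.
function normalizeAnthropicResponse(raw: AnthropicStyleResponse): NormalizedResponse {
  const message = raw.content
    .filter((block) => block.type === "text" && typeof block.text === "string")
    .map((block) => block.text as string)
    .join("");
  // Assumed mapping; unknown or null stop reasons fall back to "stop".
  const finishReasons: Record<string, string> = {
    end_turn: "stop",
    stop_sequence: "stop",
    max_tokens: "length",
    tool_use: "tool_calls",
  };
  const mapped = raw.stop_reason !== null ? finishReasons[raw.stop_reason] : undefined;
  return { message, finishReason: mapped ?? "stop" };
}

const sketchRequest = toAnthropicRequest([
  { role: "system", content: "Be brief." },
  { role: "user", content: "Hello" },
]);
const sketchResponse = normalizeAnthropicResponse({
  content: [{ type: "text", text: "Hi " }, { type: "text", text: "there" }],
  stop_reason: "max_tokens",
});
```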

GLM-4.7-Flash reclassification (completes #93)

@cf/zai-org/glm-4.7-flash is now RESEARCH-only with thinkingModel: true, fully excluding it from all direct-response routing pools (not just deprioritized as in v1.15.0).

10 Jun 12:21

stackbilt-admin

v1.14.5

b1a1da8

v1.14.5

Patch fix for Cloudflare bad-input error classification (issue #91).

Fixed

Cloudflare InvalidRequestError wrapping — CloudflareProvider now catches Workers AI AiError: Bad input responses and re-throws them as InvalidRequestError instead of propagating the raw AiError. Callers and gateways can now distinguish non-retryable bad-input failures from transient infrastructure errors without parsing raw message strings.
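The wrapping pattern can be sketched as follows. This is an illustration only: the `InvalidRequestError` class below is a local stand-in for the one the package exports, `classify` and `runClassified` are hypothetical helpers, and detecting the bad-input case by matching the error message is an assumption about how the provider recognizes it.

```typescript
// Local stand-in; the package's InvalidRequestError may carry more fields.
class InvalidRequestError extends Error {
  readonly original: unknown;

  constructor(message: string, original: unknown) {
    super(message);
    this.name = "InvalidRequestError";
    this.original = original;
    // Keeps instanceof working when compiled to older targets.
    Object.setPrototypeOf(this, InvalidRequestError.prototype);
  }
}

// Assumption: the binding's bad-input failure is recognizable by its message.
function isBadInputError(err: unknown): err is Error {
  return err instanceof Error && /bad input/i.test(err.message);
}

// Converts bad-input failures to a typed error; everything else passes through.
function classify(err: unknown): unknown {
  return isBadInputError(err) ? new InvalidRequestError(err.message, err) : err;
}

// Wraps any provider call so callers only ever see classified errors.
async function runClassified<T>(call: () => Promise<T>): Promise<T> {
  try {
    return await call();
  } catch (err) {
    throw classify(err);
  }
}

const wrappedSample = classify(new Error("AiError: Bad input: missing prompt"));
const passthroughSample = classify(new Error("upstream timeout"));
```

With a typed error, a gateway can skip retries by checking `err instanceof InvalidRequestError` instead of inspecting message text itself.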

10 Jun 10:28

stackbilt-admin

v1.14.4

1cc5210

v1.14.4

Patch compatibility fix for Cloudflare local gateway streaming.

- CloudflareProvider.streamResponse now normalizes non-streaming chat-completion JSON responses through the same parser used by generateResponse.
- Preserves Cloudflare Kimi/Workers AI output when llm-gateway asks Cloudflare for JSON and synthesizes Claude/Codex client streams.
- Includes reasoning_content fallback from v1.14.3 for Cloudflare reasoning models that return content:null on truncated responses.

Validation:
- npm run typecheck
- npm test: 444/444
- npm run test:package
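To illustrate the idea, here is a small sketch of turning a non-streaming chat-completion body into stream-like chunks, with a fallback to `reasoning_content` when `content` is null. The field names `choices`, `message`, `content`, and `reasoning_content` follow the note above; `StreamChunkSketch`, `extractText`, and `toSyntheticChunks` are hypothetical, and the real provider's chunk shape is not shown in these notes.

```typescript
interface ChatCompletionJson {
  choices?: {
    message?: { content?: string | null; reasoning_content?: string | null };
    finish_reason?: string | null;
  }[];
}

// Hypothetical chunk shape used only for this sketch.
interface StreamChunkSketch {
  delta: string;
  done: boolean;
}

// Prefers message.content; falls back to reasoning_content when content
// is null, missing, or empty (for example on a truncated response).
function extractText(body: ChatCompletionJson): string {
  const message = body.choices?.[0]?.message;
  const content = message?.content;
  if (typeof content === "string" && content.length > 0) return content;
  return message?.reasoning_content ?? "";
}

// Emits the full text as a single delta followed by a terminal chunk, so a
// streaming consumer can handle a JSON response without a separate code path.
function toSyntheticChunks(body: ChatCompletionJson): StreamChunkSketch[] {
  return [
    { delta: extractText(body), done: false },
    { delta: "", done: true },
  ];
}

const truncatedChunks = toSyntheticChunks({
  choices: [{ message: { content: null, reasoning_content: "partial trace" } }],
});
```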

09 Jun 22:19

stackbilt-admin

v1.14.2

273b314

v1.14.2

v1.14.2 — 2026-06-09

Workers AI catalog expansion for Cloudflare credit-backed gateway routing.

Added

Cloudflare Kimi K2.6 — adds @cf/moonshotai/kimi-k2.6 as an active Workers AI catalog entry with long context, tool calling, vision, and structured-agent workload metadata.
Cloudflare GLM-4.7-Flash — adds @cf/zai-org/glm-4.7-flash as an active fast/balanced Workers AI catalog entry with long context and tool-calling metadata.
Cloudflare DeepSeek V4 Pro — adds the dashboard model slug deepseek/deepseek-v4-pro as an active high-performance Workers AI catalog entry for reasoning and coding routes.

This release triggers .github/workflows/publish.yml, which will run CI and publish @stackbilt/llm-providers@1.14.2 to npm with provenance.

08 Jun 08:30

stackbilt-admin

v1.14.1

a1741c1

v1.14.1

Patch release for Cerebras OpenAI-compatible tool-call responses that omit message.content. Includes PR #88 compatibility fix and release metadata from PR #89.

07 Jun 11:10

stackbilt-admin

v1.14.0

5105ae5

v1.14.0

@stackbilt/llm-providers v1.14.0

Worker gateway route-planning surface from issue #87. Additive only — no breaking changes.

Added

getGatewayRoutePlan() helper — packages canonical normalization, catalog routing, cache hints, capability checks, degradations, and warnings into a single Worker-friendly object for use behind OpenAI-compatible, Ollama-style, or Anthropic-compatible API routers. Accepts either compatibility LLMRequest or CanonicalLLMRequest input.
Route plan types — GatewayRoutePlan, GatewayRouteRequirements, GatewayRouteCapabilityReport, and GatewayRouteCachePlan describe the shape. Storage-agnostic — consumers map plan.cache onto their own KV / Cache API / D1 / R2 implementation.
LoRA degradation reporting — when a request carries lora and routing selects a non-Cloudflare provider, the plan reports a stripped degradation and warns that Cloudflare adapter ids are forwarded to Workers AI without validation.
Route plan tests — src/__tests__/gateway-routing.test.ts covers canonical→plan mapping, cache-hint handling, LoRA-on/off-Cloudflare paths, and built-in tool capability mismatches.
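The LoRA reporting rule can be sketched in isolation. This is one reading of the note above, not the `getGatewayRoutePlan()` implementation: `LoraPlanNotes`, `planLora`, the degradation entry shape, and the provider name strings are all hypothetical.

```typescript
// Hypothetical report shape; the real GatewayRoutePlan types are not shown here.
interface LoraPlanNotes {
  forwardedLora?: string;
  degradations: { feature: string; action: string }[];
  warnings: string[];
}

// Decides what happens to a request's LoRA adapter id for the chosen provider.
function planLora(lora: string | undefined, provider: string): LoraPlanNotes {
  const notes: LoraPlanNotes = { degradations: [], warnings: [] };
  if (lora === undefined) return notes;

  if (provider === "cloudflare") {
    // Adapter id is passed through; nothing checks that it exists.
    notes.forwardedLora = lora;
    notes.warnings.push("LoRA adapter id is forwarded to Workers AI without validation");
  } else {
    // Other providers cannot apply the adapter, so the plan records the loss.
    notes.degradations.push({ feature: "lora", action: "stripped" });
  }
  return notes;
}

const onCloudflare = planLora("my-adapter", "cloudflare");
const offCloudflare = planLora("my-adapter", "groq");
```

Reporting the stripped adapter in the plan, instead of dropping it silently, lets the router decide whether to proceed, pick another provider, or surface the degradation to the client.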

Validation

21 test files / 441 tests passing locally and in CI
tsc --noEmit clean
Published with npm provenance

See the full entry in CHANGELOG.md.

06 Jun 10:57

stackbilt-admin

v1.13.1

15b5f1c

v1.13.1 — Groq tool-call content fix

Fixed

GroqProvider now accepts tool-call-only assistant responses where Groq omits message.content while preserving finishReason: "tool_calls" and populated toolCalls.
Adds regression coverage for omitted content plus message.tool_calls.
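A minimal sketch of the parsing rule this fix implies: an assistant message with no `content` is valid as long as it carries tool calls. The raw field names `content` and `tool_calls` come from the note above; `ParsedAssistant`, `parseAssistant`, and the decision to throw on a fully empty message are assumptions for illustration.

```typescript
interface RawToolCall {
  id: string;
  function: { name: string; arguments: string };
}

interface RawAssistantMessage {
  content?: string | null;
  tool_calls?: RawToolCall[];
}

// Hypothetical parsed shape mirroring the finishReason / toolCalls names above.
interface ParsedAssistant {
  message: string;
  finishReason: string;
  toolCalls: RawToolCall[];
}

function parseAssistant(raw: RawAssistantMessage, finishReason: string): ParsedAssistant {
  const toolCalls = raw.tool_calls ?? [];
  const hasContent = typeof raw.content === "string";
  // Only reject when the message carries neither text nor tool calls.
  if (!hasContent && toolCalls.length === 0) {
    throw new Error("Assistant message has neither content nor tool calls");
  }
  return { message: raw.content ?? "", finishReason, toolCalls };
}

const toolOnly = parseAssistant(
  {
    tool_calls: [
      { id: "call_1", function: { name: "lookup", arguments: "{\"q\":\"x\"}" } },
    ],
  },
  "tool_calls",
);
```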

Fixes #86.

06 Jun 09:48

stackbilt-admin

v1.13.0

f3a3674

v1.13.0 - Cloudflare Workers AI cache binding support

Added

Cloudflare Workers AI run options now translate CacheHints.sessionId into x-session-affinity for provider-prefix and both cache strategies, including streaming and raw vision calls.
CloudflareConfig.gateway exposes typed Workers AI binding Gateway options and merges request cache metadata into the third env.AI.run() argument.
Cloudflare usage parsing now normalizes Workers AI cached input token counts into TokenUsage.cachedInputTokens.
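As an illustration of the first two items, the sketch below builds an options object of the kind that might be passed as the third `env.AI.run()` argument. Everything here is an assumption except the `x-session-affinity` header name and the `provider-prefix` and `both` strategy names taken from the notes: the interface names, the `extraHeaders` field, the gateway option fields, and the choice to require an explicit strategy are all hypothetical.

```typescript
// Hypothetical shapes; the package's CacheHints and CloudflareConfig.gateway
// types are not reproduced in these notes.
interface CacheHintsSketch {
  sessionId?: string;
  strategy?: string;
}

interface GatewayOptionsSketch {
  id: string;
  skipCache?: boolean;
  cacheTtl?: number;
}

interface RunOptionsSketch {
  gateway?: GatewayOptionsSketch;
  extraHeaders?: Record<string, string>;
}

// Strategies for which a session id becomes an affinity header.
const AFFINITY_STRATEGIES = ["provider-prefix", "both"];

function buildRunOptions(
  hints: CacheHintsSketch | undefined,
  gateway?: GatewayOptionsSketch,
): RunOptionsSketch {
  const options: RunOptionsSketch = {};
  if (gateway) options.gateway = gateway;
  if (
    hints?.sessionId &&
    hints.strategy !== undefined &&
    AFFINITY_STRATEGIES.indexOf(hints.strategy) !== -1
  ) {
    options.extraHeaders = { "x-session-affinity": hints.sessionId };
  }
  return options;
}

const affinityOptions = buildRunOptions(
  { sessionId: "sess-42", strategy: "both" },
  { id: "my-gateway", cacheTtl: 300 },
);
const plainOptions = buildRunOptions({ sessionId: "sess-42", strategy: "none" });
```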

Validation

npm run typecheck
npm test
npm run test:package
npm audit --omit=dev

Closes #84.

05 Jun 20:19

stackbilt-admin

v1.12.0

c96f50e

v1.12.0 — Canonical provider contract

Canonical provider contract hardening from issue #81 / PR #82. Additive only.

Added

Exported canonical provider contract types, including CanonicalLLMRequest, CanonicalLLMResponse, and related helper types.
Added normalizeLLMRequest() to map compatibility LLMRequest fields into the canonical shape.
Added canonicalToLLMRequest() to convert canonical requests back into existing adapter input while providers migrate internally.
Added normalizeLLMResponse() for stable canonical response routing metadata, fallback/degradation fields, normalized error slots, and provider-extra metadata.
Added contract tests covering OpenAI-compatible, Anthropic-compatible, Groq/Cerebras, NVIDIA, and Cloudflare adapter preparation without live API calls.
Documented the gateway boundary: client protocol -> gateway adapter -> CanonicalLLMRequest -> llm-providers -> vendor API.

Validation

npm run typecheck
npm test
npm run test:package
npm audit --omit=dev

31 May 10:58

stackbilt-admin

v1.11.0

faacc42

v1.11.0

@stackbilt/llm-providers v1.11.0

Reliability and gateway-routing hardening. Additive APIs plus bug fixes from issues #61, #62, #63, #64, #65, and #67.

Added

Generated VERSION export synced from package.json during build/test/publish paths.
Streaming usage reconciliation across OpenAI, Groq, Cerebras, NVIDIA, Anthropic, and Cloudflare streams.
CircuitBreaker, CircuitBreakerManager, and ExhaustionRegistry persistence APIs for Workers KV/D1/Redis/Durable Objects.
Workload-aware model defaults via ModelWorkloadClass, ModelPreferenceMap, getRecommendedModelForWorkload(), and getProviderDefaultModelForWorkload().
Provider-agnostic TokenUsage.cacheWriteInputTokens for cache write/create telemetry.
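To show why persistence matters for a circuit breaker in a stateless Worker, here is a minimal sketch of a breaker whose state can be snapshotted and restored. It is not the package's `CircuitBreaker` API: `SketchBreaker`, `BreakerSnapshot`, and every method name are hypothetical, and the storage backend (KV, D1, Redis, or a Durable Object) is left to the caller, who would store the JSON string.

```typescript
// Hypothetical snapshot shape; the package's persistence format is not shown here.
interface BreakerSnapshot {
  failures: number;
  openedAt: number | null;
}

class SketchBreaker {
  private failures = 0;
  private openedAt: number | null = null;
  private readonly threshold: number;
  private readonly cooldownMs: number;

  constructor(threshold: number, cooldownMs: number) {
    this.threshold = threshold;
    this.cooldownMs = cooldownMs;
  }

  // Closed circuits always allow; open circuits allow a probe after cooldown.
  canRequest(now: number): boolean {
    if (this.openedAt === null) return true;
    return now - this.openedAt >= this.cooldownMs;
  }

  recordFailure(now: number): void {
    this.failures += 1;
    if (this.failures >= this.threshold) this.openedAt = now;
  }

  recordSuccess(): void {
    this.failures = 0;
    this.openedAt = null;
  }

  snapshot(): BreakerSnapshot {
    return { failures: this.failures, openedAt: this.openedAt };
  }

  restore(state: BreakerSnapshot): void {
    this.failures = state.failures;
    this.openedAt = state.openedAt;
  }
}

// One isolate trips the breaker and serializes its state.
const first = new SketchBreaker(2, 1000);
first.recordFailure(1000);
first.recordFailure(1000);
const persisted = JSON.stringify(first.snapshot());

// A later isolate restores it and still sees the circuit as open.
const second = new SketchBreaker(2, 1000);
second.restore(JSON.parse(persisted) as BreakerSnapshot);
const blockedDuringCooldown = !second.canRequest(1500);
const allowedAfterCooldown = second.canRequest(2000);
```

Passing `now` explicitly instead of calling `Date.now()` inside the class keeps the sketch deterministic and easy to test.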

Fixed

Vision requests now reject or skip providers that cannot process images instead of silently dropping image content.
Anthropic URL images now throw ConfigurationError instead of lossy placeholder conversion.
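The first fix amounts to checking capabilities before dispatch rather than discarding image input. A small sketch of that check follows, with hypothetical names throughout (`ProviderInfo`, `RequestSketch`, `eligibleProviders`, and the placeholder provider names); the package's actual error type for the no-provider case is not stated in these notes, so a plain `Error` is used.

```typescript
interface ProviderInfo {
  name: string;
  supportsVision: boolean;
}

interface RequestSketch {
  text: string;
  images?: string[];
}

// Returns the providers that can serve the request. Requests with images
// skip non-vision providers and fail loudly if none remain.
function eligibleProviders(req: RequestSketch, providers: ProviderInfo[]): ProviderInfo[] {
  const needsVision = (req.images?.length ?? 0) > 0;
  const eligible = needsVision ? providers.filter((p) => p.supportsVision) : providers;
  if (eligible.length === 0) {
    throw new Error("No configured provider can process image input");
  }
  return eligible;
}

const configured: ProviderInfo[] = [
  { name: "text-only-provider", supportsVision: false },
  { name: "vision-provider", supportsVision: true },
];

const forImageRequest = eligibleProviders(
  { text: "Describe this", images: ["data:image/png;base64,AAAA"] },
  configured,
).map((p) => p.name);
const forTextRequest = eligibleProviders({ text: "Hello" }, configured).map((p) => p.name);
```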

Validation

npm run typecheck
npm test (19 files, 421 tests)
npm run build
npm run test:package
npm audit --omit=dev
