feat: add Claude Opus 4.8 support across Anthropic, Bedrock, and Vertex providers #386
📝 Walkthrough
Adds claude-opus-4-8 model metadata across Anthropic, Bedrock, and Vertex; integrates the model into Anthropic prompt-caching and Vertex 1M-context paths; extends Bedrock to detect Opus/Sonnet 4.7+ and send adaptive-thinking payloads (omitting temperature); and adds tests covering model metadata, request construction, and token-handling.
Changes
Multi-Provider Model Support and Bedrock Adaptive Thinking
Sequence Diagram(s)
sequenceDiagram
participant Client
participant AnthropicHandler
participant BedrockHandler
participant VertexHandler
Client->>AnthropicHandler: createMessage(claude-opus-4-8)
AnthropicHandler->>AnthropicHandler: Apply ephemeral cache preprocessing
AnthropicHandler->>AnthropicHandler: Add prompt-caching-2024-07-31 beta
AnthropicHandler-->>Client: request built
Client->>BedrockHandler: createMessage(anthropic.claude-opus-4-8, enableReasoning)
BedrockHandler->>BedrockHandler: Detect adaptive-thinking model
BedrockHandler->>BedrockHandler: thinking type: adaptive, output_config.effort: xhigh
BedrockHandler->>BedrockHandler: Omit temperature
BedrockHandler-->>Client: InvokeModel request
Client->>VertexHandler: getModel(claude-opus-4-8, vertex1MContext)
VertexHandler->>VertexHandler: Apply 1M-context beta header
VertexHandler-->>Client: ModelInfo with 1M context window
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~25 minutes
🚥 Pre-merge checks | ✅ 4 | ❌ 1
❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@src/api/providers/bedrock.ts`:
- Around line 448-452: The completePrompt flow still unconditionally injects
temperature into the Bedrock inferenceConfig causing 400s for adaptive-thinking
models; update the completePrompt implementation (the inferenceConfig
construction in the completePrompt function) to guard temperature the same way
createMessage does by checking isAdaptiveThinkingModel and omitting temperature
when true, using modelConfig.temperature or this.options.modelTemperature only
when not adaptive-thinking; ensure the same symbol names (completePrompt,
inferenceConfig, isAdaptiveThinkingModel, modelConfig) are used so behavior
matches createMessage.
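The guard described in this comment can be sketched as follows. This is a minimal illustration, not the handler's actual code: `ModelConfigSketch`, `InferenceConfigSketch`, and `buildInferenceConfig` are hypothetical names, the adaptive-thinking check is passed in as a boolean, and the precedence between the model-level temperature and the user option is an assumption.

```typescript
// Hypothetical shapes for illustration; the real handler types may differ.
interface InferenceConfigSketch {
	maxTokens: number
	temperature?: number
}

interface ModelConfigSketch {
	maxTokens: number
	temperature?: number
}

// Builds the inferenceConfig for a non-streaming request.
// `isAdaptiveThinkingModel` stands in for the result of the handler's own check.
function buildInferenceConfig(
	modelConfig: ModelConfigSketch,
	isAdaptiveThinkingModel: boolean,
	modelTemperature?: number,
): InferenceConfigSketch {
	const inferenceConfig: InferenceConfigSketch = { maxTokens: modelConfig.maxTokens }

	// Assumed precedence: model-level value first, then the user option.
	const temperature = modelConfig.temperature ?? modelTemperature

	// Adaptive-thinking models reject sampling parameters, so the key is
	// left out entirely rather than set to undefined or a default.
	if (!isAdaptiveThinkingModel && temperature !== undefined) {
		inferenceConfig.temperature = temperature
	}
	return inferenceConfig
}
```

Omitting the key, rather than sending a default value, matters here because the review reports a 400 whenever temperature is present for these models.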
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro Plus
Run ID: 943f0c9b-4c4b-4eb2-a01c-9403cec98c02
📒 Files selected for processing (10)
- packages/types/src/providers/anthropic.ts
- packages/types/src/providers/bedrock.ts
- packages/types/src/providers/vertex.ts
- src/api/providers/__tests__/anthropic-vertex.spec.ts
- src/api/providers/__tests__/anthropic.spec.ts
- src/api/providers/__tests__/bedrock.spec.ts
- src/api/providers/anthropic.ts
- src/api/providers/bedrock.ts
- src/api/providers/openai.ts
- src/shared/__tests__/api.spec.ts
dfcfff8 to e2d630a
Actionable comments posted: 0 |
…g models
Addresses CodeRabbit review on Zoo-Code-Org#386. completePrompt was unconditionally sending temperature in its inferenceConfig, which causes a 400 error for Claude Opus/Sonnet 4.7 and 4.8 (sampling parameters were removed by Anthropic for these models). createMessage already guarded this, but the non-stream path did not.
- Extract the adaptive-thinking detection into a private isAdaptiveThinkingModel(modelId) method (parseBaseModelId-aware, so cross-region/global prefixes are handled).
- Reuse it in both createMessage and completePrompt so the two request paths stay consistent.
- Add two regression tests: completePrompt omits temperature for opus-4-8 and still sends it for opus-4-6.
64 bedrock tests pass, check-types clean.
|
Nice! Thank you for your contribution. I recently added Bedrock to our e2e test harness. Would you be able to add a smoke test there for 4.8?
|
Good catch by @coderabbitai — fixed in
What changed:
Validation: 64 bedrock tests pass, |
|
|
Actionable comments posted: 0 |
Adds Claude Opus 4.7 to the Bedrock native model registry with:
- Full ModelInfo (maxTokens, contextWindow, pricing, cache config)
- supportsReasoningBudget: true (enables thinking budget in UI)
- cachableFields for multi-point prompt caching
- 1M context tier pricing
- Global Inference support
Without this entry, custom model usage falls back to guessModelInfoFromId(), which lacks supportsReasoningBudget and cachableFields, causing "too many tokens" errors during parallel file injection (no cache = tokens accumulate).
Note: Pricing estimated based on claude-opus-4-6-v1. To be verified against Bedrock console pricing page before merge.
…rature
Claude Opus/Sonnet 4.7 introduced breaking API changes:
- temperature/top_p/top_k removed (causes 400 error)
- thinking.type 'enabled' + budget_tokens removed (causes 400 error)
- New thinking.type 'adaptive' with output_config.effort levels
- New display: 'summarized' option to surface thinking content
Changes:
- Detect Gen 4.7+ models via baseModelId.includes('opus-4-7' | 'sonnet-4-7')
- Omit temperature from inferenceConfig for 4.7+ models
- Use thinking: { type: 'adaptive', display: 'summarized' } for 4.7+
- Set output_config.effort: 'xhigh' (highest level for coding/agentic tasks)
- Maintain full backward compatibility with 4.6 and earlier models
- Expanded BedrockAdditionalModelFields interface to support both formats
References:
- Claude 4.7 release notes (Apr 16, 2026)
- effort levels: low | medium | high | xhigh | max
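The two thinking formats named in this commit message can be put side by side in a short sketch. The field names and values (`type: 'adaptive'`, `display: 'summarized'`, `output_config.effort`, `budget_tokens`, the effort levels) come from the message itself; the type names and the `buildThinkingFields` helper are hypothetical, and the `includes('opus-4-7' | 'sonnet-4-7')` shorthand is read here as two separate substring checks, which is an assumption.

```typescript
// Effort levels as listed in the commit message.
type Effort = "low" | "medium" | "high" | "xhigh" | "max"

// Hypothetical union covering both request formats.
type ThinkingFieldsSketch =
	| { thinking: { type: "adaptive"; display: "summarized" }; output_config: { effort: Effort } }
	| { thinking: { type: "enabled"; budget_tokens: number } }

// Assumed reading of the detection shorthand: match either substring.
function isGen47Model(baseModelId: string): boolean {
	return baseModelId.includes("opus-4-7") || baseModelId.includes("sonnet-4-7")
}

function buildThinkingFields(baseModelId: string, budgetTokens: number): ThinkingFieldsSketch {
	if (isGen47Model(baseModelId)) {
		// 4.7+ format: no budget_tokens, effort goes in output_config.
		return {
			thinking: { type: "adaptive", display: "summarized" },
			output_config: { effort: "xhigh" },
		}
	}
	// 4.6 and earlier keep the budget-based format.
	return { thinking: { type: "enabled", budget_tokens: budgetTokens } }
}
```

Modelling the two formats as a discriminated union keeps a caller from mixing `budget_tokens` into an adaptive payload, which the commit message says results in a 400.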
…gistries
- Register claude-opus-4-8 in anthropicModels with 1M context, 128k output, supportsReasoningBudget, supportsReasoningBinary, supportsTemperature: false (mirrors 4.7 - no breaking API changes per the official migration guide).
- Register anthropic.claude-opus-4-8 in bedrockModels with cache points, cachableFields, and 1M context tier pricing.
- Register claude-opus-4-8 in vertexModels with the same shape.
- Add anthropic.claude-opus-4-8 to BEDROCK_1M_CONTEXT_MODEL_IDS and BEDROCK_GLOBAL_INFERENCE_MODEL_IDS.
- Add claude-opus-4-8 to VERTEX_1M_CONTEXT_MODEL_IDS.
… detection)
- Anthropic provider: add claude-opus-4-8 to both prompt-caching switch statements so it gets the same handling as 4.7 (native 1M context, no beta header required).
- Bedrock provider: rename isGen47Model -> isAdaptiveThinkingModel and expand the pattern to match opus-4-7, opus-4-8, sonnet-4-7, sonnet-4-8. 4.8 inherits the same adaptive-thinking + temperature-rejection contract from 4.7 with no breaking API changes.
- OpenAI-compatible provider: update comment to mention 4.8 alongside 4.7; no logic change (already honors the supportsTemperature: false flag).
The rename describes the capability (adaptive thinking) rather than a specific generation, making future Claude versions easier to support.
- anthropic.spec.ts: 5 cases mirroring 4.7 (1M-beta-header guard, adaptive thinking ON/OFF, custom maxTokens, getModel info).
- anthropic-vertex.spec.ts: 1M context tier pricing for Vertex Opus 4.8.
- shared/api.spec.ts: getModelMaxOutputTokens hybrid-token handling on 4.8.
- bedrock.spec.ts: new 'Claude 4.7+ adaptive thinking' block with 5 cases covering 4.7 + 4.8 adaptive thinking, reasoning-off behaviour, a 4.6 regression guard (budget_tokens + temperature), and cross-region prefix detection (us.anthropic.claude-opus-4-8).
235 unit tests pass, 0 type errors. Validated live end-to-end via Bedrock Global Inference (global.anthropic.claude-opus-4-8).
Addresses @edelauna's review request on Zoo-Code-Org#386 to cover 4.8 in the new Bedrock e2e harness.
Mirrors the existing user-agent smoke test but re-points the provider at us.anthropic.claude-opus-4-8. Since 4.8 is an adaptive-thinking model, this exercises the request path that omits temperature (and sends thinking.type "adaptive" when reasoning is enabled), proving a Bedrock round-trip completes without a 400.
Runs against the binary-event-stream mock server in CI and against real AWS when BEDROCK_LIVE_E2E=true. The original 4.7-era test is left untouched; model id is overridable via BEDROCK_OPUS_48_MODEL_ID.
check-types clean (tsconfig.esm.json).
Codecov Report
❌ Patch coverage is
2bd49d7 to 506bc9a
Done — added in I rebased onto latest Details, keeping it consistent with your harness:
I kept it as a parity/connectivity smoke test to match the harness's intent. If you'd prefer a deeper assertion — e.g. extending the mock server to capture the request body and asserting
|
🧹 Nitpick comments (1)
src/api/providers/__tests__/bedrock.spec.ts (1)
76-82: 💤 Low value
Consider consistent naming: "ZooCode" instead of "Zoo Code".
The title uses "Zoo Code" (two words), but the assertion on line 90 checks for "ZooCode#" (one word). Consider using "ZooCode" throughout for consistency with the actual user-agent marker.
📝 Suggested title adjustment
- it("should identify itself as Zoo Code in the AWS client app id", () => {
+ it("should identify itself as ZooCode in the AWS client app id", () => {
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/api/providers/__tests__/bedrock.spec.ts` around lines 76 - 82, Update the test title to use the same "ZooCode" spelling as the assertion: change the it(...) description that currently reads "should identify itself as Zoo Code in the AWS client app id" to use "ZooCode" (one word) so it matches the expectation checked by expect.stringMatching(/^ZooCode#/), ensuring consistency between the test description and the mockBedrockRuntimeClient assertion.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Nitpick comments:
In `@src/api/providers/__tests__/bedrock.spec.ts`:
- Around line 76-82: Update the test title to use the same "ZooCode" spelling as
the assertion: change the it(...) description that currently reads "should
identify itself as Zoo Code in the AWS client app id" to use "ZooCode" (one
word) so it matches the expectation checked by
expect.stringMatching(/^ZooCode#/), ensuring consistency between the test
description and the mockBedrockRuntimeClient assertion.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro Plus
Run ID: b941adf2-e8c0-46a7-9c5e-f744d3e8de00
📒 Files selected for processing (11)
- apps/vscode-e2e/src/suite/providers/bedrock.test.ts
- packages/types/src/providers/anthropic.ts
- packages/types/src/providers/bedrock.ts
- packages/types/src/providers/vertex.ts
- src/api/providers/__tests__/anthropic-vertex.spec.ts
- src/api/providers/__tests__/anthropic.spec.ts
- src/api/providers/__tests__/bedrock.spec.ts
- src/api/providers/anthropic.ts
- src/api/providers/bedrock.ts
- src/api/providers/openai.ts
- src/shared/__tests__/api.spec.ts
✅ Files skipped from review due to trivial changes (1)
- src/api/providers/openai.ts
🚧 Files skipped from review as they are similar to previous changes (8)
- src/api/providers/anthropic.ts
- src/api/providers/__tests__/anthropic-vertex.spec.ts
- src/shared/__tests__/api.spec.ts
- packages/types/src/providers/anthropic.ts
- src/api/providers/__tests__/anthropic.spec.ts
- packages/types/src/providers/bedrock.ts
- packages/types/src/providers/vertex.ts
- src/api/providers/bedrock.ts
Codecov flagged the sonnet-4-7 and sonnet-4-8 branches of isAdaptiveThinkingModel as uncovered: they have no Bedrock registry entry yet (future-proof guards), so no existing test reached them.
Add a focused unit test that calls the private method directly (same pattern the suite already uses for parseBaseModelId / getPrefixForRegion), covering:
- all four positive patterns: opus-4-7, opus-4-8, sonnet-4-7, sonnet-4-8
- cross-region / global prefixes (us./eu./global.) via parseBaseModelId
- negative cases: opus-4-6, sonnet-4-6, claude-3-5, nova
Brings patch coverage to 100%. 68 bedrock tests pass, check-types clean.
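The detection and the cases this commit covers can be sketched roughly as below. This is an illustration under stated assumptions: the real `parseBaseModelId` is not shown on this page, so the version here only strips the three prefixes the commit message names (`us.`, `eu.`, `global.`), and the model IDs in the usage lines are illustrative.

```typescript
// Assumed prefix list, taken from the commit message; the real helper may know more.
const REGION_PREFIXES = ["us.", "eu.", "global."]

// Simplified stand-in for the handler's parseBaseModelId.
function parseBaseModelId(modelId: string): string {
	const prefix = REGION_PREFIXES.find((p) => modelId.startsWith(p))
	return prefix ? modelId.slice(prefix.length) : modelId
}

// The four patterns the PR says trigger the adaptive-thinking branch.
const ADAPTIVE_THINKING_PATTERNS = ["opus-4-7", "opus-4-8", "sonnet-4-7", "sonnet-4-8"]

function isAdaptiveThinkingModel(modelId: string): boolean {
	const baseModelId = parseBaseModelId(modelId)
	return ADAPTIVE_THINKING_PATTERNS.some((pattern) => baseModelId.includes(pattern))
}

// Illustrative IDs, one per category of test case.
const positives = ["anthropic.claude-opus-4-7", "global.anthropic.claude-opus-4-8", "us.anthropic.claude-sonnet-4-8"]
const negatives = ["anthropic.claude-opus-4-6-v1", "anthropic.claude-3-5-sonnet-20241022-v2:0", "amazon.nova-pro-v1:0"]
const allPositivesMatch = positives.every(isAdaptiveThinkingModel)
const noNegativesMatch = negatives.every((id) => !isAdaptiveThinkingModel(id))
```

Keeping the patterns in one array means supporting a later generation is a one-line change, which matches the stated motivation for naming the method after the capability rather than the version.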
edelauna left a comment
Nice! Thanks for adding the e2e tests 🔥
outputPrice: 25.0, // $25 per million output tokens (≤200K context)
cacheWritesPrice: 6.25, // $6.25 per million tokens
cacheReadsPrice: 0.5, // $0.50 per million tokens
supportsReasoningBudget: true,
Missing supportsReasoningBinary: true — the Anthropic provider entry has it (anthropic.ts:126). Without it, anthropic-vertex.ts resolves thinking via getModelParams → getAnthropicReasoning and emits { type: "enabled", budget_tokens: N }. Anthropic docs confirm type: "enabled" is not supported on Opus 4.7+ (returns 400).
Adding the flag alone won't fully fix it — anthropic-vertex.ts also needs to call getAnthropicProviderReasoning directly (like anthropic.ts:65) instead of relying on reasoning from getModelParams.
contextWindow: 200_000, // Default 200K, extendable to 1M with beta flag 'context-1m-2025-08-07'
supportsImages: true,
supportsPromptCache: true,
supportsReasoningBudget: true,
Missing supportsTemperature: false — without it ApiOptions.tsx:787 still shows the temperature slider for this model even though temperature is silently stripped by isAdaptiveThinkingModel.
contextWindow: 200_000, // Default 200K, extendable to 1M with beta flag 'context-1m-2025-08-07'
supportsImages: true,
supportsPromptCache: true,
supportsReasoningBudget: true,
Same for Opus 4.8 — supportsTemperature: false should be included here too.
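Why the missing flag is visible to users can be shown with a small sketch. How ApiOptions.tsx actually decides to render the slider is not shown on this page, so the check below (treating only an explicit `false` as "hide") is an assumption, and `ModelInfoSketch` and `shouldShowTemperatureControl` are hypothetical names.

```typescript
// Hypothetical subset of the registry's ModelInfo fields.
interface ModelInfoSketch {
	supportsReasoningBudget?: boolean
	supportsTemperature?: boolean
}

// Assumed UI rule: the slider is shown unless the flag is explicitly false.
function shouldShowTemperatureControl(info: ModelInfoSketch): boolean {
	return info.supportsTemperature !== false
}

// Entry as flagged in the review: the flag is absent.
const entryWithoutFlag: ModelInfoSketch = { supportsReasoningBudget: true }

// Entry as the reviewer suggests: the flag is set.
const entryWithFlag: ModelInfoSketch = { supportsReasoningBudget: true, supportsTemperature: false }

const sliderShownWithoutFlag = shouldShowTemperatureControl(entryWithoutFlag)
const sliderShownWithFlag = shouldShowTemperatureControl(entryWithFlag)
```

Under that assumption, an absent flag leaves the slider on screen while the provider silently drops the value, which is the mismatch the two comments above describe.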
Summary
Adds full support for Claude Opus 4.8 (claude-opus-4-8 / anthropic.claude-opus-4-8) across the Anthropic, AWS Bedrock, and Google Vertex providers. Opus 4.8 was released by Anthropic on May 28, 2026 as the successor to Opus 4.7.
This builds directly on the adaptive-thinking compatibility layer introduced for Opus 4.7 in #316. Per Anthropic's official migration guide, there are no breaking API changes between 4.7 and 4.8: 4.8 inherits the same adaptive-thinking contract, the same removal of sampling parameters, the same native 1M context window, and the same prompt-cache behaviour. The change is therefore small and surgical.
What changed
Model registry (packages/types)
- anthropic.ts: registered claude-opus-4-8 with 1M context, 128k max output, supportsReasoningBudget, supportsReasoningBinary, and supportsTemperature: false (mirrors 4.7).
- bedrock.ts: registered anthropic.claude-opus-4-8 with prompt-cache points, cachableFields, and 1M tiered pricing; added it to BEDROCK_1M_CONTEXT_MODEL_IDS and BEDROCK_GLOBAL_INFERENCE_MODEL_IDS.
- vertex.ts: registered claude-opus-4-8 with the same shape; added it to VERTEX_1M_CONTEXT_MODEL_IDS.
Provider logic (src/api/providers)
- anthropic.ts: added claude-opus-4-8 to both prompt-caching switch statements so it gets the same handling as 4.7 (native 1M context, no context-1m-2025-08-07 beta header required).
- bedrock.ts: renamed isGen47Model → isAdaptiveThinkingModel and expanded the pattern to match opus-4-7, opus-4-8, sonnet-4-7, and sonnet-4-8. The rename describes the actual capability (adaptive thinking) rather than a single generation, making future Claude versions easier to plug in. The branch still emits thinking: { type: "adaptive", display: "summarized" } + output_config: { effort: "xhigh" } and omits temperature.
- openai.ts: comment-only update mentioning 4.8; no logic change (already honors the supportsTemperature: false flag).
Tests
- anthropic.spec.ts: 5 new cases mirroring 4.7 (1M-beta-header guard, adaptive thinking ON/OFF, custom maxTokens, getModel info).
- anthropic-vertex.spec.ts: 1M context tier pricing for Vertex Opus 4.8.
- shared/api.spec.ts: getModelMaxOutputTokens hybrid-token handling on 4.8.
- bedrock.spec.ts: new Claude 4.7+ adaptive thinking (Opus 4.7 / Opus 4.8) describe block with 5 cases covering:
  - adaptive thinking with xhigh effort for 4.7
  - adaptive thinking with xhigh effort for 4.8
  - reasoning-off behaviour
  - a 4.6 regression guard (budget_tokens + temperature)
  - cross-region prefix detection (us.anthropic.claude-opus-4-8)
This also closes the unit-test coverage gap that existed for the Bedrock adaptive-thinking branch in #316.
Fast Mode — out of scope
Opus 4.8 introduces a "fast mode" (2.5x faster, $10/$50 per 1M tokens). Per Anthropic docs it is not available on the Claude Platform on AWS / Bedrock and direct-API access is gated behind a waitlist, so it is intentionally excluded from this PR.
Other 4.8 features (no client change needed)
The following 4.8 additions are server-side optimizations that the existing client pipeline benefits from transparently, so no code change is required:
- role: "system" messages
- refusal content block + documented stop_details
Backward compatibility
✅ 100% backward compatible. The adaptive-thinking branch only activates for opus-4-7, opus-4-8, sonnet-4-7, and sonnet-4-8. Claude 4.6 and earlier continue to use the budget_tokens thinking format and still send temperature (covered by a new regression test).
Pricing note
Testing
- packages/types and src
- global.anthropic.claude-opus-4-8 (Global Inference Profile). Adaptive thinking active with xhigh effort, no 400 errors on temperature or thinking.type.enabled.
- global.anthropic.claude-opus-4-7 still works after the isGen47Model → isAdaptiveThinkingModel rename.
References