Respect adapter endpoint routing in CLI stream retries #12516

Open
lyonsno wants to merge 1 commit into continuedev:main from lyonsno:fix-cli-stream-routing-10474

@lyonsno lyonsno commented May 29, 2026

Summary

Fixes #10474.

The CLI retry helper was routing Responses-capable model names directly to llmApi.responsesStream() whenever that optional method existed. That bypassed the OpenAI adapter's apiBase guard, so OpenAI-compatible providers and proxies could be sent to /v1/responses even when they only support /v1/chat/completions.

This change keeps endpoint selection inside the adapter by having chatCompletionStreamWithBackoff() delegate streaming calls to the required llmApi.chatCompletionStream() contract. The OpenAI adapter still uses the Responses API for official OpenAI endpoints when appropriate, but custom apiBase values stay on the chat-completions path.
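
A minimal sketch of the delegation described above, not the PR's actual diff. Only the names `chatCompletionStreamWithBackoff`, `chatCompletionStream`, and `responsesStream` come from this description; the `LlmApi` and `Chunk` shapes, the parameter list, and the retry policy are assumptions made for illustration.

```typescript
// Hypothetical chunk and adapter shapes, for illustration only.
type Chunk = { text: string };

interface LlmApi {
  // Required contract: the adapter decides which HTTP endpoint to call.
  chatCompletionStream(body: { model: string }): AsyncGenerator<Chunk>;
  // Optional method; in this sketch the CLI helper never calls it directly.
  responsesStream?(body: { model: string }): AsyncGenerator<Chunk>;
}

// Sketch of the retry helper after the change: always delegate to the
// required method so endpoint selection stays inside the adapter.
async function* chatCompletionStreamWithBackoff(
  llmApi: LlmApi,
  body: { model: string },
  maxAttempts = 3, // assumed value
  baseDelayMs = 100, // assumed value
): AsyncGenerator<Chunk> {
  for (let attempt = 1; ; attempt++) {
    let started = false;
    try {
      for await (const chunk of llmApi.chatCompletionStream(body)) {
        started = true;
        yield chunk;
      }
      return;
    } catch (err) {
      // Do not retry once chunks were emitted (it would duplicate output),
      // and give up after the last allowed attempt.
      if (started || attempt >= maxAttempts) throw err;
      // Exponential delay between attempts (assumed policy).
      const delayMs = baseDelayMs * 2 ** (attempt - 1);
      await new Promise((resolve) => setTimeout(resolve, delayMs));
    }
  }
}
```

Because the helper only knows about `chatCompletionStream()`, an adapter configured with a custom `apiBase` can keep the request on `/v1/chat/completions` without the CLI second-guessing it.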

Changes

  • Remove the CLI helper's direct responsesStream() routing branch.
  • Add a regression test proving a Responses-capable request with both adapter methods present still goes through chatCompletionStream().
  • Update the streaming/tool-call preservation test so it fails if the CLI calls responsesStream() directly.
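
The regression check in the list above could be shaped roughly like this. It is a framework-free sketch rather than the test added in this PR: `FakeLlmApi`, `makeRecordingAdapter`, `recordCalls`, and the placeholder model name are all hypothetical, and the real suite may use a test runner's mocks instead.

```typescript
// Hypothetical adapter shape; only the two method names come from the PR.
interface FakeLlmApi {
  chatCompletionStream(body: { model: string }): AsyncGenerator<string>;
  responsesStream(body: { model: string }): AsyncGenerator<string>;
}

// Build a fake adapter that records which streaming method was invoked.
function makeRecordingAdapter(): { api: FakeLlmApi; calls: string[] } {
  const calls: string[] = [];
  const api: FakeLlmApi = {
    async *chatCompletionStream() {
      calls.push("chatCompletionStream");
      yield "ok";
    },
    async *responsesStream() {
      calls.push("responsesStream");
      // Fail loudly if the helper under test routes here directly.
      throw new Error("CLI must not call responsesStream() directly");
    },
  };
  return { api, calls };
}

// Drive any stream helper against the fake and return the recorded calls.
async function recordCalls(
  helper: (api: FakeLlmApi, body: { model: string }) => AsyncIterable<string>,
): Promise<string[]> {
  const { api, calls } = makeRecordingAdapter();
  // Placeholder standing in for a Responses-capable model name.
  const body = { model: "example-responses-capable-model" };
  for await (const _chunk of helper(api, body)) {
    // Drain the stream; chunk contents are irrelevant here.
  }
  return calls;
}
```

A passing run would see exactly one recorded call, `"chatCompletionStream"`, even though both methods are present on the adapter.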

Verification

  • npm --prefix extensions/cli test
  • npm --prefix packages/openai-adapters test -- --run
  • npm --prefix extensions/cli run build
  • npm --prefix extensions/cli run test:e2e
  • npm --prefix extensions/cli run test:smoke
  • npm --prefix extensions/cli run typecheck
  • git diff --check

I also ran two manual regression smokes:

  • a local OpenAI-compatible proxy with a Responses-capable model name and custom apiBase, which hit /v1/chat/completions once and /v1/responses zero times;
  • the official OpenAI path with a Responses-capable model, to confirm adapter-owned Responses routing still streams successfully.

@lyonsno lyonsno requested a review from a team as a code owner May 29, 2026 04:33
@dosubot dosubot Bot added the size:S This PR changes 10-29 lines, ignoring generated files. label May 29, 2026
github-actions Bot commented May 29, 2026

All contributors have signed the CLA ✍️ ✅
Posted by the CLA Assistant Lite bot.

@cubic-dev-ai cubic-dev-ai Bot left a comment

No issues found across 3 files

@chatgpt-codex-connector

💡 Codex Review

setFetchedModelsList((prev) =>
  selectedProvider.provider === providerAtFetchTime ? models : prev,
);

P2: Guard fetched models with current provider state

If the user clicks the refresh icon and then switches providers before the request completes, this closure still compares against the provider value captured when the request started, so the condition is always true and the old provider's fetched models are inserted into the newly selected provider's model list. This can make the form offer models with the wrong providerOptions until another selection clears the list.
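
One way to picture the guard this comment asks for, as a framework-free sketch. `createModelListStore` and its methods are hypothetical and stand in for component state; in a React component the live provider value would usually be read from a ref rather than from the render-time closure.

```typescript
// Hypothetical store standing in for the form's provider and model state.
function createModelListStore(initialProvider: string) {
  let provider = initialProvider; // always holds the current selection
  let models: string[] = [];

  return {
    // Switching providers clears the previously fetched list.
    selectProvider(next: string): void {
      provider = next;
      models = [];
    },
    // Call when a refresh starts; returns the callback for the response.
    beginFetch(): (fetched: string[]) => void {
      const providerAtFetchTime = provider;
      return (fetched: string[]) => {
        // Compare against the live value when the response arrives,
        // not against a value captured alongside providerAtFetchTime.
        if (provider === providerAtFetchTime) {
          models = fetched;
        }
      };
    },
    getModels(): string[] {
      return models;
    },
  };
}
```

With this shape, a response that arrives after the user has switched providers is dropped instead of being written into the new provider's list.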


const base = apiBase || "https://generativelanguage.googleapis.com/v1beta/";
const url = new URL("models", base);

P2: Normalize Gemini API base before appending models

When apiBase is provided without a trailing slash, new URL("models", base) replaces the final path segment instead of appending to it; for example https://generativelanguage.googleapis.com/v1beta becomes https://generativelanguage.googleapis.com/models?... rather than /v1beta/models. That makes model fetching fail for the common custom-base spelling without a trailing slash.
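
A small sketch of the normalization suggested here, assuming the only fix needed is a guaranteed trailing slash. `buildModelsUrl` is a hypothetical helper name; the default base is the one shown in the snippet above.

```typescript
// Hypothetical helper: append "models" to the base without dropping
// the final path segment when the trailing slash is missing.
function buildModelsUrl(apiBase?: string): URL {
  const base = apiBase || "https://generativelanguage.googleapis.com/v1beta/";
  // new URL(relative, base) resolves against the base's "directory", so a
  // base ending in "/v1beta" would lose "v1beta" unless a slash is added.
  const normalized = base.endsWith("/") ? base : `${base}/`;
  return new URL("models", normalized);
}

// Both spellings resolve to the same /v1beta/models path.
console.log(buildModelsUrl("https://generativelanguage.googleapis.com/v1beta").href);
console.log(buildModelsUrl("https://generativelanguage.googleapis.com/v1beta/").href);
```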

lyonsno commented May 29, 2026

I have read the CLA Document and I hereby sign the CLA

Recheck

Projects

Status: Todo

Development

Successfully merging this pull request may close these issues.

CLI: exponentialBackoff bypasses apiBase check for Responses API routing