Skip to content

Add multi-provider feature parity improvements inspired by pi-mono#9

Open
vijaysharm wants to merge 1 commit into
mainfrom
claude/analyze-pi-mono-comparison-Js14L
Open

Add multi-provider feature parity improvements inspired by pi-mono#9
vijaysharm wants to merge 1 commit into
mainfrom
claude/analyze-pi-mono-comparison-Js14L

Conversation

@vijaysharm

Copy link
Copy Markdown
Owner
  • Add OpenAIChatCompletionsProvider for /v1/chat/completions API, unlocking
    Groq, Together, xAI, OpenRouter, Mistral, Ollama, and other compatible endpoints
  • Add OpenAICompletionsCompat flags to handle provider-specific quirks
    (maxTokensField, supportsDeveloperRole, requiresToolResultName, etc.)
  • Add ConversationTransformer for cross-provider message normalization:
    tool call ID sanitization (Anthropic 64-char, Mistral 9-char limits),
    orphaned tool call repair, and empty message stripping
  • Add CacheRetention enum and prompt caching support in LLMRequestOptions,
    with Anthropic cache_control implementation
  • Extend TokenUsage with cacheReadTokens/cacheWriteTokens and add
    estimatedCost() for dollar cost calculation from token rates
  • Add ReasoningEffort.max for maximum reasoning budget (adaptive thinking
    for Claude 4.x models, 32k budget for Gemini)
  • Add ServiceTier enum (auto/flex/priority) for OpenAI tiered pricing
  • Update all existing providers (OpenAI, Anthropic, Gemini, ClaudeCode)
    to handle new ReasoningEffort.max case and service tier parameter

https://claude.ai/code/session_012D8VW6xmUwXbdzA4cDjffP

- Add OpenAIChatCompletionsProvider for /v1/chat/completions API, unlocking
  Groq, Together, xAI, OpenRouter, Mistral, Ollama, and other compatible endpoints
- Add OpenAICompletionsCompat flags to handle provider-specific quirks
  (maxTokensField, supportsDeveloperRole, requiresToolResultName, etc.)
- Add ConversationTransformer for cross-provider message normalization:
  tool call ID sanitization (Anthropic 64-char, Mistral 9-char limits),
  orphaned tool call repair, and empty message stripping
- Add CacheRetention enum and prompt caching support in LLMRequestOptions,
  with Anthropic cache_control implementation
- Extend TokenUsage with cacheReadTokens/cacheWriteTokens and add
  estimatedCost() for dollar cost calculation from token rates
- Add ReasoningEffort.max for maximum reasoning budget (adaptive thinking
  for Claude 4.x models, 32k budget for Gemini)
- Add ServiceTier enum (auto/flex/priority) for OpenAI tiered pricing
- Update all existing providers (OpenAI, Anthropic, Gemini, ClaudeCode)
  to handle new ReasoningEffort.max case and service tier parameter

https://claude.ai/code/session_012D8VW6xmUwXbdzA4cDjffP
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants