Skip to content

Add support for alternate LLM providers #74

Description

@heeki

Add support for alternate LLM providers

Overview

Add support for invoking models beyond AWS Bedrock, including 3rd party public endpoints (e.g., OpenAI, Anthropic API, Cohere) and privately hosted LLM routers (e.g., LiteLLM). This will provide users with flexibility to choose their preferred model providers and enable deployment in environments where Bedrock may not be available or optimal.

Requirements

R1: Multi-Provider Model Selection

Users should be able to select models from other model providers, public or private. This includes:

  • Support for major public LLM providers (OpenAI, Anthropic API, Cohere, Google Vertex AI)
  • Support for privately hosted LiteLLM model router endpoints
  • Support for custom OpenAI-compatible endpoints
  • Model selection UI that displays available models across all configured providers
  • Provider-specific authentication and authorization
  • Fallback behavior when a provider is unavailable

Technical Considerations

Provider Abstraction Layer

  • LLM Provider Interface: Create an abstract provider interface that normalizes:
    • Model invocation (prompt/messages format)
    • Response streaming
    • Token usage tracking
    • Error handling
    • Rate limiting and retry logic
  • Provider Implementations: Implement concrete providers:
    • BedrockProvider (existing, refactored)
    • OpenAIProvider (OpenAI API and compatible endpoints)
    • AnthropicProvider (direct Anthropic API)
    • LiteLLMProvider (privately hosted LiteLLM router)
    • VertexAIProvider (Google Vertex AI)
    • CohereProvider (Cohere API)

Configuration Management

  • Update etc/environment.sh to include provider-specific configuration:
    • Provider type (bedrock, openai, anthropic, litellm, etc.)
    • API endpoints (base URLs for custom/private deployments)
    • Authentication method (API keys, IAM, OAuth)
    • Model availability and pricing
  • Store sensitive credentials in AWS Secrets Manager, not in environment files
  • Support per-agent provider configuration (some agents may use different providers)
  • Environment-specific provider configurations (dev vs production)

Backend Changes

  • Provider Factory: Create a factory pattern to instantiate the correct provider based on configuration
  • Provider Registry: Maintain a registry of available providers and their models
  • Cost Tracking: Extend cost tracking to support different provider pricing models
    • Token-based pricing (most providers)
    • Request-based pricing
    • Custom pricing for private deployments
  • Response Normalization: Normalize responses across providers to maintain consistent API contracts
  • Streaming Support: Ensure streaming works consistently across providers

Frontend Changes

  • Model Selection UI: Update model selection dropdown to show:
    • Provider name/logo
    • Model name
    • Model capabilities (context length, vision support, etc.)
    • Estimated pricing
  • Provider Status: Display provider availability status
  • Error Handling: Provider-specific error messages and troubleshooting

Security Considerations

  • API Key Management:
    • Store API keys in AWS Secrets Manager
    • Never log or expose API keys in responses
    • Support key rotation
  • VPC Integration: Private LiteLLM deployments may require VPC connectivity (see Issue Add support for VPC-enabled agents #73)
  • Rate Limiting: Implement provider-specific rate limiting to avoid quota issues
  • Audit Logging: Log which provider/model was used for each invocation

Dependencies

  • Install provider SDKs as needed:
    • openai (Python SDK for OpenAI and compatible endpoints)
    • anthropic (Python SDK for Anthropic API)
    • cohere (Python SDK for Cohere)
    • litellm (Python SDK for LiteLLM router)
    • google-cloud-aiplatform (for Vertex AI)

Acceptance Criteria

AC1: Provider Abstraction

  • Provider interface defined with clear contract
  • At least 3 providers implemented (Bedrock + 2 others)
  • All providers support streaming responses
  • Provider factory and registry implemented
  • Unit tests for each provider implementation

AC2: Configuration

  • Provider configuration added to etc/environment.sh
  • API keys stored in AWS Secrets Manager
  • Documentation for configuring each provider
  • Support for multiple simultaneous providers
  • Environment-specific provider configs work

AC3: Model Selection

  • UI displays models from all configured providers
  • Users can select and invoke models from any provider
  • Model metadata (context length, pricing) displayed correctly
  • Provider status/health check visible in UI

AC4: Cost Tracking

  • Cost tracking works for all providers
  • Provider name shown in cost dashboards
  • Cost formatting uses provider-specific pricing
  • Costs are accurately estimated with ~ prefix

AC5: Error Handling

  • Provider-specific errors handled gracefully
  • Fallback to other providers when configured
  • Clear error messages for authentication failures
  • Rate limiting properly handled with retries

AC6: Testing

  • Integration tests with mock provider endpoints
  • Unit tests for provider abstraction layer
  • Manual testing with at least 2 real providers
  • Load testing with different providers

Implementation Notes

Suggested Approach

Phase 1: Provider Abstraction (Week 1)

  1. Design and implement provider interface in backend/
  2. Refactor existing Bedrock integration to use new interface
  3. Create provider factory and registry
  4. Add unit tests

Phase 2: OpenAI Provider (Week 1-2)

  1. Implement OpenAIProvider class
  2. Add OpenAI SDK dependency with uv
  3. Configure API key in Secrets Manager
  4. Test with OpenAI API and OpenAI-compatible endpoints
  5. Update cost tracking for OpenAI pricing

Phase 3: LiteLLM Provider (Week 2)

  1. Implement LiteLLMProvider class
  2. Add LiteLLM SDK dependency
  3. Support custom endpoint configuration for private deployments
  4. Test with public LiteLLM demo endpoint first
  5. Document private deployment setup

Phase 4: Frontend Integration (Week 2-3)

  1. Update model selection API to return multi-provider models
  2. Update frontend UI to display provider information
  3. Add provider status indicators
  4. Update cost dashboard to show provider-specific costs

Phase 5: Additional Providers (Week 3)

  1. Implement Anthropic API provider
  2. Implement other providers as needed
  3. Comprehensive testing across all providers

Key Files to Modify/Create

Backend:

  • backend/providers/ (new directory):
    • base.py: Abstract provider interface
    • bedrock.py: Refactored Bedrock provider
    • openai.py: OpenAI provider implementation
    • litellm.py: LiteLLM provider implementation
    • anthropic.py: Anthropic API provider
    • factory.py: Provider factory
    • registry.py: Provider registry
  • backend/models/: Update data models for provider metadata
  • backend/api/: Update endpoints to support provider selection
  • backend/cost/: Extend cost calculation for multiple providers
  • backend/requirements.txt or pyproject.toml: Add provider SDKs

Configuration:

  • etc/environment.sh: Add provider configuration
  • iac/: SAM templates for Secrets Manager secrets

Frontend:

  • frontend/src/components/ModelSelector.tsx: Multi-provider model selection
  • frontend/src/components/ProviderStatus.tsx: Provider health indicators
  • frontend/src/pages/CostDashboardPage.tsx: Update to show provider in costs

Tests:

  • backend/tests/providers/: Provider-specific tests
  • backend/tests/integration/: Multi-provider integration tests

Provider Priority

  1. OpenAI - Most widely used, good for testing abstraction
  2. LiteLLM - Enables private deployments and multi-provider routing
  3. Anthropic API - Direct access to Claude models outside AWS
  4. Others - Cohere, Vertex AI (lower priority)

Cost Tracking Updates

Each provider has different pricing models:

  • OpenAI: Per 1K tokens (separate input/output pricing)
  • Anthropic API: Per 1M tokens (separate input/output pricing)
  • Bedrock: Per 1K tokens (varies by model)
  • LiteLLM: Depends on backend provider

Update formatCost functions to handle provider-specific pricing and maintain consistency across:

  • CostDashboardPage.tsx
  • InvocationDetailPage.tsx
  • LatencySummary.tsx
  • InvocationTable.tsx

Testing Strategy

  • Unit Tests: Mock provider responses and test interface implementation
  • Integration Tests: Use provider SDK test modes or mock servers
  • Manual Testing: Test with real provider accounts (low quota/cost)
  • Load Testing: Verify rate limiting and retry logic
  • Cost Validation: Ensure cost calculations match provider billing

References

Dependencies

Priority

High - Increases platform flexibility and reduces vendor lock-in

Estimated Effort

Large (3 weeks) - Requires significant refactoring, new integrations, and thorough testing

Metadata

Metadata

Assignees

Labels

enhancementNew feature or request

Projects

No projects

Milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions