Add support for alternate LLM providers

# Add support for alternate LLM providers

## Overview
Add support for invoking models beyond AWS Bedrock, including third-party public endpoints (e.g., OpenAI, Anthropic API, Cohere) and privately hosted LLM routers (e.g., LiteLLM). This gives users the flexibility to choose their preferred model providers and enables deployment in environments where Bedrock is unavailable or not the best fit.

## Requirements

### R1: Multi-Provider Model Selection
Users should be able to select models from providers other than Bedrock, whether public or private. This includes:
- Support for major public LLM providers (OpenAI, Anthropic API, Cohere, Google Vertex AI)
- Support for privately hosted LiteLLM model router endpoints
- Support for custom OpenAI-compatible endpoints
- Model selection UI that displays available models across all configured providers
- Provider-specific authentication and authorization
- Fallback behavior when a provider is unavailable

## Technical Considerations

### Provider Abstraction Layer
- **LLM Provider Interface**: Create an abstract provider interface that normalizes:
  - Model invocation (prompt/messages format)
  - Response streaming
  - Token usage tracking
  - Error handling
  - Rate limiting and retry logic
- **Provider Implementations**: Implement concrete providers:
  - `BedrockProvider` (existing, refactored)
  - `OpenAIProvider` (OpenAI API and compatible endpoints)
  - `AnthropicProvider` (direct Anthropic API)
  - `LiteLLMProvider` (privately hosted LiteLLM router)
  - `VertexAIProvider` (Google Vertex AI)
  - `CohereProvider` (Cohere API)
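
A minimal sketch of what the provider interface could look like, assuming a Python backend. The names `LLMProvider`, `ProviderResponse`, `ProviderError`, and `EchoProvider` are hypothetical and only illustrate the contract (invocation, streaming, token usage, a common error type); they are not existing code in this repository.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import Iterator


@dataclass(frozen=True)
class ProviderResponse:
    """Provider-agnostic result of one invocation (hypothetical shape)."""
    text: str
    input_tokens: int
    output_tokens: int
    provider: str
    model: str


class ProviderError(Exception):
    """Common error type that concrete providers would map SDK errors onto."""


class LLMProvider(ABC):
    """Hypothetical base class; each concrete provider wraps its own SDK."""

    name: str

    @abstractmethod
    def invoke(self, model: str, messages: list[dict]) -> ProviderResponse:
        """Run one non-streaming completion and return a normalized response."""

    @abstractmethod
    def stream(self, model: str, messages: list[dict]) -> Iterator[str]:
        """Yield text chunks as they arrive."""


class EchoProvider(LLMProvider):
    """Stand-in implementation, used here only to exercise the interface."""

    name = "echo"

    def invoke(self, model: str, messages: list[dict]) -> ProviderResponse:
        text = messages[-1]["content"]
        words = len(text.split())  # crude token stand-in, not a real tokenizer
        return ProviderResponse(text, words, words, self.name, model)

    def stream(self, model: str, messages: list[dict]) -> Iterator[str]:
        yield from messages[-1]["content"].split()
```

Callers would depend only on `LLMProvider`, so a `BedrockProvider` or `OpenAIProvider` can be swapped in without changing the calling code.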

### Configuration Management
- Update `etc/environment.sh` to include provider-specific configuration:
  - Provider type (bedrock, openai, anthropic, litellm, etc.)
  - API endpoints (base URLs for custom/private deployments)
  - Authentication method (API keys, IAM, OAuth)
  - Model availability and pricing
- Store sensitive credentials in AWS Secrets Manager, not in environment files
- Support per-agent provider configuration (some agents may use different providers)
- Environment-specific provider configurations (dev vs production)
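
One way the backend could read this configuration is sketched below. The variable names `LLM_PROVIDER`, `LLM_BASE_URL`, and `LLM_API_KEY_SECRET_ID` are assumptions for illustration, not existing settings in `etc/environment.sh`. The environment carries only the name of a Secrets Manager secret, never the key itself, and the sketch assumes Bedrock authenticates through IAM while the other providers need a secret.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional

# Provider types taken from the list above; extend as providers are added.
SUPPORTED_PROVIDERS = {"bedrock", "openai", "anthropic", "litellm"}


@dataclass(frozen=True)
class ProviderConfig:
    provider: str
    base_url: Optional[str]   # only needed for custom or private endpoints
    secret_id: Optional[str]  # Secrets Manager secret name, not the API key


def load_provider_config(env: Mapping[str, str] = os.environ) -> ProviderConfig:
    """Build a ProviderConfig from environment variables (hypothetical names)."""
    provider = env.get("LLM_PROVIDER", "bedrock").strip().lower()
    if provider not in SUPPORTED_PROVIDERS:
        raise ValueError(f"Unsupported LLM_PROVIDER: {provider!r}")

    secret_id = env.get("LLM_API_KEY_SECRET_ID") or None
    # Assumption: Bedrock uses IAM credentials, everything else uses an API key.
    if provider != "bedrock" and secret_id is None:
        raise ValueError(f"{provider} requires LLM_API_KEY_SECRET_ID")

    return ProviderConfig(provider, env.get("LLM_BASE_URL") or None, secret_id)
```

Passing the environment as a mapping keeps the loader easy to test and leaves room for per-agent overrides later.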

### Backend Changes
- **Provider Factory**: Create a factory pattern to instantiate the correct provider based on configuration
- **Provider Registry**: Maintain a registry of available providers and their models
- **Cost Tracking**: Extend cost tracking to support different provider pricing models
  - Token-based pricing (most providers)
  - Request-based pricing
  - Custom pricing for private deployments
- **Response Normalization**: Normalize responses across providers to maintain consistent API contracts
- **Streaming Support**: Ensure streaming works consistently across providers
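
As an illustration of response normalization, the sketch below maps two provider payloads onto one shape. It assumes the JSON layouts of the OpenAI Chat Completions response (`choices[].message.content`, `usage.prompt_tokens`, `usage.completion_tokens`) and the Anthropic Messages response (`content[]` text blocks, `usage.input_tokens`, `usage.output_tokens`); these should be verified against the API references listed below before use. `NormalizedResponse` and the two helper names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NormalizedResponse:
    text: str
    input_tokens: int
    output_tokens: int


def from_openai(payload: dict) -> NormalizedResponse:
    """Normalize a Chat Completions style payload."""
    usage = payload.get("usage", {})
    return NormalizedResponse(
        text=payload["choices"][0]["message"]["content"] or "",
        input_tokens=usage.get("prompt_tokens", 0),
        output_tokens=usage.get("completion_tokens", 0),
    )


def from_anthropic(payload: dict) -> NormalizedResponse:
    """Normalize a Messages style payload, joining all text blocks."""
    text = "".join(
        block["text"] for block in payload["content"] if block.get("type") == "text"
    )
    usage = payload.get("usage", {})
    return NormalizedResponse(
        text=text,
        input_tokens=usage.get("input_tokens", 0),
        output_tokens=usage.get("output_tokens", 0),
    )
```

With both payloads reduced to the same fields, cost tracking and the API layer need no provider-specific branches.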

### Frontend Changes
- **Model Selection UI**: Update model selection dropdown to show:
  - Provider name/logo
  - Model name
  - Model capabilities (context length, vision support, etc.)
  - Estimated pricing
- **Provider Status**: Display provider availability status
- **Error Handling**: Provider-specific error messages and troubleshooting

### Security Considerations
- **API Key Management**: 
  - Store API keys in AWS Secrets Manager
  - Never log or expose API keys in responses
  - Support key rotation
- **VPC Integration**: Private LiteLLM deployments may require VPC connectivity (see Issue #73)
- **Rate Limiting**: Implement provider-specific rate limiting to avoid quota issues
- **Audit Logging**: Log which provider/model was used for each invocation
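
To keep API keys out of logs, one option is a redacting `logging.Filter`. This is a sketch under stated assumptions: the pattern only catches keys that start with `sk-` or follow `Bearer `, and would need extending for other key formats.

```python
import logging
import re

# Assumption: secrets look like "sk-..." or appear after "Bearer ".
_SECRET_PATTERN = re.compile(r"(sk-[A-Za-z0-9_\-]{8,}|(?<=Bearer )\S+)")


class RedactSecretsFilter(logging.Filter):
    """Replace anything that looks like a credential before it is emitted."""

    def filter(self, record: logging.LogRecord) -> bool:
        # Render the message first so secrets passed as %-args are covered too.
        rendered = record.getMessage()
        record.msg = _SECRET_PATTERN.sub("[REDACTED]", rendered)
        record.args = ()
        return True  # keep the record, only its text is changed
```

Attaching the filter to handlers (`handler.addFilter(RedactSecretsFilter())`) rather than to a single logger also covers records propagated from child loggers.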

### SDK Dependencies
- Install provider SDKs as needed:
  - `openai` (Python SDK for OpenAI and compatible endpoints)
  - `anthropic` (Python SDK for Anthropic API)
  - `cohere` (Python SDK for Cohere)
  - `litellm` (Python SDK for LiteLLM router)
  - `google-cloud-aiplatform` (for Vertex AI)

## Acceptance Criteria

### AC1: Provider Abstraction
- [ ] Provider interface defined with clear contract
- [ ] At least 3 providers implemented (Bedrock + 2 others)
- [ ] All providers support streaming responses
- [ ] Provider factory and registry implemented
- [ ] Unit tests for each provider implementation

### AC2: Configuration
- [ ] Provider configuration added to `etc/environment.sh`
- [ ] API keys stored in AWS Secrets Manager
- [ ] Documentation for configuring each provider
- [ ] Support for multiple simultaneous providers
- [ ] Environment-specific provider configs work

### AC3: Model Selection
- [ ] UI displays models from all configured providers
- [ ] Users can select and invoke models from any provider
- [ ] Model metadata (context length, pricing) displayed correctly
- [ ] Provider status/health check visible in UI

### AC4: Cost Tracking
- [ ] Cost tracking works for all providers
- [ ] Provider name shown in cost dashboards
- [ ] Cost formatting uses provider-specific pricing
- [ ] Estimated costs are displayed with a `~` prefix

### AC5: Error Handling
- [ ] Provider-specific errors handled gracefully
- [ ] Fallback to other providers when configured
- [ ] Clear error messages for authentication failures
- [ ] Rate limiting properly handled with retries
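
The fallback and retry behavior could be sketched as follows. `RateLimitError` and `ProviderUnavailable` are hypothetical exception types that provider wrappers would raise, and the retry count and delays are placeholder values, not tuned recommendations.

```python
import time
from typing import Callable, Sequence


class RateLimitError(Exception):
    """Hypothetical: raised by a provider wrapper when it is rate limited."""


class ProviderUnavailable(Exception):
    """Hypothetical: raised when a provider cannot be reached or rejects the call."""


def invoke_with_fallback(
    providers: Sequence[tuple[str, Callable[[str], str]]],
    prompt: str,
    max_retries: int = 3,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> tuple[str, str]:
    """Try providers in order and return (provider_name, response_text).

    Rate limits are retried with exponential backoff; other failures
    move straight on to the next configured provider.
    """
    errors: list[str] = []
    for name, call in providers:
        for attempt in range(max_retries):
            try:
                return name, call(prompt)
            except RateLimitError:
                sleep(base_delay * 2 ** attempt)  # 0.5s, 1s, 2s, ...
            except ProviderUnavailable as exc:
                errors.append(f"{name}: {exc}")
                break  # retrying will not help, fall back immediately
        else:
            errors.append(f"{name}: rate limited after {max_retries} attempts")
    raise ProviderUnavailable("; ".join(errors))
```

Returning the provider name alongside the text lets audit logging and cost tracking record which provider actually served the request.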

### AC6: Testing
- [ ] Integration tests with mock provider endpoints
- [ ] Unit tests for provider abstraction layer
- [ ] Manual testing with at least 2 real providers
- [ ] Load testing with different providers

## Implementation Notes

### Suggested Approach

#### Phase 1: Provider Abstraction (Week 1)
1. Design and implement provider interface in `backend/`
2. Refactor existing Bedrock integration to use new interface
3. Create provider factory and registry
4. Add unit tests
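
For step 3, a combined factory and registry might look like the sketch below. `ProviderRegistry` and its methods are hypothetical, and the two provider classes are empty placeholders that stand in for the real implementations; their constructor arguments and default values are illustrative only.

```python
from typing import Callable, Dict


class ProviderRegistry:
    """Maps a provider type string to a constructor (hypothetical design)."""

    def __init__(self) -> None:
        self._factories: Dict[str, Callable[..., object]] = {}

    def register(self, provider_type: str) -> Callable:
        """Class decorator that records a provider under `provider_type`."""
        def decorator(cls):
            self._factories[provider_type] = cls
            return cls
        return decorator

    def create(self, provider_type: str, **config) -> object:
        """Instantiate the provider registered for `provider_type`."""
        try:
            factory = self._factories[provider_type]
        except KeyError:
            known = ", ".join(sorted(self._factories)) or "none"
            raise ValueError(
                f"Unknown provider {provider_type!r}; registered: {known}"
            ) from None
        return factory(**config)

    def available(self) -> list[str]:
        return sorted(self._factories)


registry = ProviderRegistry()


@registry.register("bedrock")
class BedrockProvider:
    def __init__(self, region: str = "us-east-1") -> None:
        self.region = region


@registry.register("openai")
class OpenAIProvider:
    def __init__(self, base_url: str = "https://api.openai.com/v1") -> None:
        self.base_url = base_url
```

Pointing the same provider class at an OpenAI-compatible endpoint is then a configuration change, for example `registry.create("openai", base_url="http://localhost:4000")`.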

#### Phase 2: OpenAI Provider (Week 1-2)
1. Implement `OpenAIProvider` class
2. Add OpenAI SDK dependency with `uv`
3. Configure API key in Secrets Manager
4. Test with OpenAI API and OpenAI-compatible endpoints
5. Update cost tracking for OpenAI pricing

#### Phase 3: LiteLLM Provider (Week 2)
1. Implement `LiteLLMProvider` class
2. Add LiteLLM SDK dependency
3. Support custom endpoint configuration for private deployments
4. Test with public LiteLLM demo endpoint first
5. Document private deployment setup

#### Phase 4: Frontend Integration (Week 2-3)
1. Update model selection API to return multi-provider models
2. Update frontend UI to display provider information
3. Add provider status indicators
4. Update cost dashboard to show provider-specific costs

#### Phase 5: Additional Providers (Week 3)
1. Implement Anthropic API provider
2. Implement other providers as needed
3. Comprehensive testing across all providers

### Key Files to Modify/Create

**Backend:**
- `backend/providers/` (new directory):
  - `base.py`: Abstract provider interface
  - `bedrock.py`: Refactored Bedrock provider
  - `openai.py`: OpenAI provider implementation
  - `litellm.py`: LiteLLM provider implementation
  - `anthropic.py`: Anthropic API provider
  - `factory.py`: Provider factory
  - `registry.py`: Provider registry
- `backend/models/`: Update data models for provider metadata
- `backend/api/`: Update endpoints to support provider selection
- `backend/cost/`: Extend cost calculation for multiple providers
- `backend/requirements.txt` or `pyproject.toml`: Add provider SDKs

**Configuration:**
- `etc/environment.sh`: Add provider configuration
- `iac/`: SAM templates for Secrets Manager secrets

**Frontend:**
- `frontend/src/components/ModelSelector.tsx`: Multi-provider model selection
- `frontend/src/components/ProviderStatus.tsx`: Provider health indicators
- `frontend/src/pages/CostDashboardPage.tsx`: Update to show provider in costs

**Tests:**
- `backend/tests/providers/`: Provider-specific tests
- `backend/tests/integration/`: Multi-provider integration tests

### Provider Priority
1. **OpenAI** - Most widely used, good for testing abstraction
2. **LiteLLM** - Enables private deployments and multi-provider routing
3. **Anthropic API** - Direct access to Claude models outside AWS
4. **Others** - Cohere, Vertex AI (lower priority)

### Cost Tracking Updates
Pricing models differ by provider:
- **OpenAI**: Per 1K tokens (separate input/output pricing)
- **Anthropic API**: Per 1M tokens (separate input/output pricing)
- **Bedrock**: Per 1K tokens (varies by model)
- **LiteLLM**: Depends on backend provider

Update `formatCost` functions to handle provider-specific pricing and maintain consistency across:
- `CostDashboardPage.tsx`
- `InvocationDetailPage.tsx`
- `LatencySummary.tsx`
- `InvocationTable.tsx`
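
Because providers quote prices per different token units, it may help to normalize the unit before any formatting happens. The sketch below shows the arithmetic in Python for the backend side; the frontend `formatCost` helpers would mirror it. `Pricing`, `estimate_cost`, and `format_cost` are hypothetical names, and the prices in the usage example are placeholders, not real provider rates.

```python
from dataclasses import dataclass
from decimal import Decimal


@dataclass(frozen=True)
class Pricing:
    input_price: Decimal   # price per `unit_tokens` input tokens
    output_price: Decimal  # price per `unit_tokens` output tokens
    unit_tokens: int       # 1_000 or 1_000_000, as quoted by the provider


def estimate_cost(pricing: Pricing, input_tokens: int, output_tokens: int) -> Decimal:
    """Return the estimated cost for one invocation in the pricing currency."""
    total = pricing.input_price * input_tokens + pricing.output_price * output_tokens
    return total / pricing.unit_tokens


def format_cost(cost: Decimal) -> str:
    """Render an estimate with the `~` prefix used in the dashboards."""
    return f"~${cost:.4f}"


# Placeholder prices: the same rate expressed per 1M and per 1K tokens.
per_million = Pricing(Decimal("3"), Decimal("15"), 1_000_000)
per_thousand = Pricing(Decimal("0.003"), Decimal("0.015"), 1_000)
print(format_cost(estimate_cost(per_million, 1000, 500)))
print(format_cost(estimate_cost(per_thousand, 1000, 500)))
```

Both calls print the same estimate, which is the property the dashboards need when mixing providers. `Decimal` is used to avoid floating-point drift when many small costs are summed.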

### Testing Strategy
- **Unit Tests**: Mock provider responses and test interface implementation
- **Integration Tests**: Use provider SDK test modes or mock servers
- **Manual Testing**: Test with real provider accounts (low quota/cost)
- **Load Testing**: Verify rate limiting and retry logic
- **Cost Validation**: Ensure cost calculations match provider billing
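
A unit test along these lines could cover the "mock provider responses" case without network access. `ExampleProvider` and its injected `client.complete` method are hypothetical; the point is that the SDK client is passed in, so `unittest.mock.Mock` can replace it.

```python
import unittest
from unittest.mock import Mock


class ExampleProvider:
    """Hypothetical provider wrapper; `client` is injected so tests can mock it."""

    def __init__(self, client) -> None:
        self._client = client

    def invoke(self, model: str, prompt: str) -> dict:
        raw = self._client.complete(model=model, prompt=prompt)
        return {
            "text": raw["text"],
            "total_tokens": raw["input_tokens"] + raw["output_tokens"],
        }


class ExampleProviderTest(unittest.TestCase):
    def test_invoke_normalizes_response(self):
        client = Mock()
        client.complete.return_value = {
            "text": "hi",
            "input_tokens": 3,
            "output_tokens": 1,
        }

        result = ExampleProvider(client).invoke("some-model", "hello")

        self.assertEqual(result, {"text": "hi", "total_tokens": 4})
        client.complete.assert_called_once_with(model="some-model", prompt="hello")
```

Tests of this shape can be run with `python -m unittest` and extended per provider under `backend/tests/providers/`.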

## References
- [LiteLLM Documentation](https://docs.litellm.ai/)
- [OpenAI API Reference](https://platform.openai.com/docs/api-reference)
- [Anthropic API Documentation](https://docs.anthropic.com/claude/reference/getting-started-with-the-api)
- [Cohere API Documentation](https://docs.cohere.com/)
- [Google Vertex AI Documentation](https://cloud.google.com/vertex-ai/docs)

## Dependencies
- Issue #73 (VPC-Enabled Agents) - May be needed for private LiteLLM deployments

## Priority
High - Increases platform flexibility and reduces vendor lock-in

## Estimated Effort
Large (3 weeks) - Requires significant refactoring, new integrations, and thorough testing


Add support for alternate LLM providers #74
