Add favorite_free_models ranking for free-tier routing #104

Merged: BillJr99 merged 3 commits into main from claude/ranked-favorite-models-3qta7v on Jun 19, 2026

@BillJr99

Owner

Summary

Adds support for a favorite_free_models configuration option that allows users to specify a ranked list of models to prioritize when routing through free-tier virtual endpoints (*/free and the free tier of llmproxy/loadbalanced).
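
As a rough illustration, the option is a top-level list in config.json, ordered from most to least preferred. The sketch below parses such a fragment in Python. Only the key name comes from this PR; the `load_favorites` helper and the choice of example IDs are illustrative assumptions, not the project's code.

```python
import json

# Hypothetical config.json fragment; only the key name is taken from the PR.
CONFIG_TEXT = """
{
  "favorite_free_models": [
    "google/gemini-flash",
    "gpt-4o-mini"
  ]
}
"""


def load_favorites(config_text: str) -> list[str]:
    """Return the ranked favorites list, or an empty list when the key is absent."""
    config = json.loads(config_text)
    favorites = config.get("favorite_free_models", [])
    # Index 0 is the highest-ranked favorite.
    return [str(item) for item in favorites]


favorites = load_favorites(CONFIG_TEXT)
```

Treating a missing key as an empty list keeps routing unchanged for configs that never set the option, which is an assumption about the default rather than documented behavior.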

Key Changes

  • Core routing logic (server.py):

    • Implemented _apply_favorite_free_ordering() function that promotes favorite models to the front of the candidate pool in ranked order
    • Matching is case-insensitive and strips :variant suffixes (e.g., :free, :nitro)
    • Supports both bare model IDs (e.g., gpt-4o-mini) and qualified IDs (e.g., openai/gpt-4o-mini)
    • Silently skips favorites not in the pool (e.g., cost-observed models)
    • Integrated into two routing paths: quality-ordered bucket selection and capability-aware ordering for free virtuals
  • Admin API (admin.py):

    • Added GET /admin/api/favorite-free-models endpoint to retrieve the current list
    • Added PUT /admin/api/favorite-free-models endpoint to update the list with validation
    • Included favorite_free_models in the config export endpoint
  • Admin UI (index.html):

    • Added "Favorite free models" panel in Models & Categorizations tab
    • Grouped model picker by provider for easy selection
    • Reorderable list with up/down buttons to adjust ranking
    • Remove button for each entry
    • Real-time persistence to server
  • Documentation (README.md):

    • Added configuration example showing favorite_free_models usage
    • Documented matching behavior (case-insensitive, variant suffix stripping)
    • Explained cost-observation persistence and re-promotion behavior
    • Noted that favorites only affect free-tier routing, not other virtual endpoints
  • Tests (test_favorite_ordering.py, test_admin_api.py):

    • Comprehensive unit tests for _apply_favorite_free_ordering() covering edge cases
    • Admin API tests for GET/PUT endpoints and validation
    • Tests for case-insensitivity, variant suffix handling, and order preservation
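
The promotion step described under "Core routing logic" can be sketched as a stable partition of the candidate pool. This is a minimal sketch, not the actual `_apply_favorite_free_ordering()` from server.py: the names `candidate_keys` and `promote_favorites` are hypothetical, candidates are modeled as plain strings, and bare-ID matching is left out for brevity.

```python
def candidate_keys(candidate: str) -> set[str]:
    """Lowercased candidate ID, with and without a trailing :variant suffix."""
    lowered = candidate.lower()
    return {lowered, lowered.split(":", 1)[0]}


def promote_favorites(pool: list[str], favorites: list[str]) -> list[str]:
    """Move favorites found in the pool to the front, in favorites order."""
    promoted: list[str] = []
    for favorite in favorites:
        wanted = favorite.lower()
        for candidate in pool:
            if candidate not in promoted and wanted in candidate_keys(candidate):
                promoted.append(candidate)
    # Favorites missing from the pool are skipped; everything else keeps
    # its original relative order.
    rest = [candidate for candidate in pool if candidate not in promoted]
    return promoted + rest


pool = ["meta/llama-3:free", "openai/gpt-4o-mini", "google/gemini-flash:free"]
ranked = promote_favorites(
    pool, ["Google/Gemini-Flash", "not/in-pool", "openai/gpt-4o-mini"]
)
# ranked == ["google/gemini-flash:free", "openai/gpt-4o-mini", "meta/llama-3:free"]
```

Building the promoted list first and appending the remainder is what preserves the relative order of non-favorites.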

Implementation Details

  • Favorites are only promoted if they are currently believed-free; cost-observed models are automatically skipped without requiring config changes
  • Non-favorite candidates maintain their original relative order after promotion
  • Promotion layers on top of the existing routing logic (request-fit triage, capability ordering): favorites move to the front, and those passes still determine the order of the remaining candidates
  • Admin UI updates the picker when models are discovered and persists changes immediately
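
The PUT endpoint is described as validating updates, but this PR text does not spell out the rules. One plausible shape, offered purely as an assumption, is to accept only a list of non-empty strings and to drop case-insensitive duplicates while keeping the first (highest-ranked) occurrence. The `validate_favorites` helper below is hypothetical and is not taken from admin.py.

```python
def validate_favorites(payload: object) -> list[str]:
    """Return a cleaned favorites list or raise ValueError (assumed rules)."""
    if not isinstance(payload, list):
        raise ValueError("favorite_free_models must be a list")
    cleaned: list[str] = []
    seen: set[str] = set()
    for entry in payload:
        if not isinstance(entry, str) or not entry.strip():
            raise ValueError("each favorite must be a non-empty string")
        model_id = entry.strip()
        key = model_id.lower()
        if key not in seen:  # keep the first, highest-ranked occurrence
            seen.add(key)
            cleaned.append(model_id)
    return cleaned


cleaned = validate_favorites([" google/gemini-flash ", "Google/Gemini-Flash", "gpt-4o-mini"])
```

Rejecting bad input outright, rather than silently repairing it, would let the admin UI surface the error immediately, which fits the auto-save flow described above.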

https://claude.ai/code/session_01XSnxDp6BPeErjdqembchkZ

claude added 3 commits June 19, 2026 01:28
…iority

Introduces a top-level `favorite_free_models` list in config.json. When routing
through any `*/free` virtual endpoint or the free tier of `llmproxy/loadbalanced`,
models in this list are promoted to the front of the candidate pool in ranked order
before the normal capacity/request-fit/capability algorithm handles the rest.

A favorite is only used if it currently passes `_is_model_free()` (believed_free and
not cost-observed); it is silently skipped otherwise. Removing a model from
believed_free (cost observed at runtime) leaves it in favorite_free_models so it
re-promotes automatically when a future sync restores its free status.
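
The skip-then-re-promote behavior can be sketched as follows. This assumes free status is tracked with two sets, `believed_free` and `cost_observed`; that representation, and the `is_model_free` and `usable_favorites` helpers, are stand-ins for illustration and not the project's `_is_model_free()`.

```python
def is_model_free(model_id: str, believed_free: set[str], cost_observed: set[str]) -> bool:
    """A model counts as free when it is believed free and no cost was observed."""
    return model_id in believed_free and model_id not in cost_observed


def usable_favorites(
    favorites: list[str], believed_free: set[str], cost_observed: set[str]
) -> list[str]:
    """Favorites that would currently be promoted, in ranked order."""
    return [m for m in favorites if is_model_free(m, believed_free, cost_observed)]


favorites = ["google/gemini-flash", "openai/gpt-4o-mini"]
believed_free = {"google/gemini-flash", "openai/gpt-4o-mini"}
cost_observed: set[str] = set()

before = usable_favorites(favorites, believed_free, cost_observed)

# A cost is observed at runtime: the model loses its free status,
# but the favorites list itself is left untouched.
cost_observed.add("google/gemini-flash")
believed_free.discard("google/gemini-flash")
during = usable_favorites(favorites, believed_free, cost_observed)

# A later sync restores the free status, so the favorite is promoted again.
believed_free.add("google/gemini-flash")
cost_observed.discard("google/gemini-flash")
after = usable_favorites(favorites, believed_free, cost_observed)
```

Because the check runs at routing time against current state, no config edit is needed in either direction.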

Changes:
- server.py: _apply_favorite_free_ordering() helper + call in _proxy_endpoint()
  (after all ordering passes, for is_free_virtual) and inside
  _loadbalanced_ordered_candidates() on the free tier bucket only
- admin.py: GET/PUT /admin/api/favorite-free-models endpoint; favorite_free_models
  included in GET /admin/api/config response
- static/admin/index.html: Favorite free models card in Models & Categorizations tab
  with grouped-by-provider picker, ranked list with up/down/remove, auto-save
- tests/test_favorite_ordering.py: unit tests for _apply_favorite_free_ordering()
- tests/test_admin_api.py: API endpoint tests
- README.md: documents favorite_free_models in config schema and cycling description

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01XSnxDp6BPeErjdqembchkZ
Specifying "x/y" in favorite_free_models now matches both "x/y" and "x/y:free"
(or any :variant suffix), which is the common pattern for OpenRouter free-tier
model IDs. Matching strips the suffix from the candidate before comparing, so
users don't need to know whether the upstream ID includes the suffix.

Both bare ("gemini-flash") and provider-qualified ("google/gemini-flash") forms
match with or without a suffix on the candidate. Exact matches with a suffix
("google/gemini-flash:free") also continue to work.
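
A minimal sketch of the matching rules described above, assuming a favorite is compared against four forms of the candidate ID: full, suffix-stripped, bare, and bare suffix-stripped. The helper names are hypothetical, and treating a favorite that carries its own suffix as an exact-only match is an inference from the text, not confirmed behavior.

```python
def strip_variant(model_id: str) -> str:
    """Drop a trailing :variant suffix such as :free or :nitro."""
    return model_id.split(":", 1)[0]


def bare(model_id: str) -> str:
    """Drop a leading provider prefix such as google/."""
    return model_id.split("/", 1)[-1]


def favorite_matches(favorite: str, candidate: str) -> bool:
    """Case-insensitive match of a favorite against one pool candidate."""
    fav = favorite.lower()
    cand = candidate.lower()
    stripped = strip_variant(cand)
    forms = {cand, stripped, bare(cand), bare(stripped)}
    return fav in forms


# Cases taken from the commit message above.
qualified = favorite_matches("google/gemini-flash", "google/gemini-flash:free")
bare_id = favorite_matches("gemini-flash", "google/gemini-flash:free")
exact = favorite_matches("google/gemini-flash:free", "google/gemini-flash:free")
other_provider = favorite_matches("openai/gpt-4o-mini", "other/gpt-4o-mini")
```

Stripping the suffix from the candidate only, rather than from both sides, means a favorite written with an explicit suffix does not match a different variant of the same model under this sketch.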

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01XSnxDp6BPeErjdqembchkZ
… import

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01XSnxDp6BPeErjdqembchkZ
@BillJr99 merged commit dea4afe into main on Jun 19, 2026
4 checks passed