Add favorite_free_models ranking for free-tier routing #104
Merged
…iority

Introduces a top-level `favorite_free_models` list in config.json. When routing through any `*/free` virtual endpoint or the free tier of `llmproxy/loadbalanced`, models in this list are promoted to the front of the candidate pool in ranked order before the normal capacity/request-fit/capability algorithm handles the rest.

A favorite is only used if it currently passes `_is_model_free()` (believed_free and not cost-observed); it is silently skipped otherwise. Removing a model from believed_free (cost observed at runtime) leaves it in favorite_free_models so it re-promotes automatically when a future sync restores its free status.

Changes:
- server.py: _apply_favorite_free_ordering() helper + call in _proxy_endpoint() (after all ordering passes, for is_free_virtual) and inside _loadbalanced_ordered_candidates() on the free tier bucket only
- admin.py: GET/PUT /admin/api/favorite-free-models endpoint; favorite_free_models included in GET /admin/api/config response
- static/admin/index.html: Favorite free models card in Models & Categorizations tab with grouped-by-provider picker, ranked list with up/down/remove, auto-save
- tests/test_favorite_ordering.py: unit tests for _apply_favorite_free_ordering()
- tests/test_admin_api.py: API endpoint tests
- README.md: documents favorite_free_models in config schema and cycling description

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01XSnxDp6BPeErjdqembchkZ
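The promotion step described in this commit can be illustrated with a minimal sketch. This is not the code from server.py: the function name, its signature, and the `is_free` callback (standing in for `_is_model_free()`) are assumptions made for illustration, and favorites are compared by exact ID only.

```python
from typing import Callable, Iterable, List


def apply_favorite_free_ordering(
    candidates: List[str],
    favorites: Iterable[str],
    is_free: Callable[[str], bool],
) -> List[str]:
    """Hypothetical sketch: move free favorites to the front in ranked order.

    Favorites that are absent from the pool, or that fail the is_free check,
    are skipped silently. All other candidates keep their existing order.
    """
    promoted: List[str] = []
    for fav in favorites:  # favorites are iterated in rank order
        for cand in candidates:
            if cand == fav and cand not in promoted and is_free(cand):
                promoted.append(cand)
    rest = [c for c in candidates if c not in promoted]
    return promoted + rest


# Example pool: "b/y" is a favorite but is not currently free.
pool = ["a/x", "b/y", "c/z"]
free_now = {"a/x", "c/z"}
ordered = apply_favorite_free_ordering(
    pool, ["c/z", "b/y", "missing/model"], lambda m: m in free_now
)
```

Because the favorites list itself is never modified, a model that fails the free check today is promoted again as soon as the check passes, which matches the re-promotion behavior described above.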
Specifying "x/y" in favorite_free_models now matches both "x/y" and "x/y:free"
(or any :variant suffix), which is the common pattern for OpenRouter free-tier
model IDs. Matching strips the suffix from the candidate before comparing, so
users don't need to know whether the upstream ID includes the suffix.
Both bare ("gemini-flash") and provider-qualified ("google/gemini-flash") forms
match with or without a suffix on the candidate. Exact matches with a suffix
("google/gemini-flash:free") also continue to work.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01XSnxDp6BPeErjdqembchkZ
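The matching rules in this commit message can be sketched as follows. The helper names (`strip_variant`, `bare_name`, `favorite_matches`) are hypothetical and the actual implementation in server.py may differ; the sketch only encodes the cases the message lists, with the suffix stripped from the candidate side.

```python
def strip_variant(model_id: str) -> str:
    """Drop a ':variant' suffix such as ':free' (assumed ID format)."""
    return model_id.split(":", 1)[0]


def bare_name(model_id: str) -> str:
    """Return the part after the last '/', i.e. the ID without its provider."""
    return model_id.rsplit("/", 1)[-1]


def favorite_matches(favorite: str, candidate: str) -> bool:
    """Hypothetical matcher for one favorite entry against one candidate ID."""
    if favorite == candidate:
        return True  # exact match, including any suffix
    base = strip_variant(candidate)
    if favorite == base:
        return True  # "x/y" matches "x/y:free"
    if "/" not in favorite:
        # Bare favorite: compare against the candidate without its provider,
        # both with and without the variant suffix.
        return favorite in (bare_name(candidate), bare_name(base))
    return False
```

In this sketch a favorite written with a suffix does not match an unsuffixed candidate, since only the candidate is stripped before comparing.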
… import
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01XSnxDp6BPeErjdqembchkZ
Summary

Adds support for a `favorite_free_models` configuration option that allows users to specify a ranked list of models to prioritize when routing through free-tier virtual endpoints (`*/free` and the free tier of `llmproxy/loadbalanced`).

Key Changes

Core routing logic (`server.py`):
- `_apply_favorite_free_ordering()` function that promotes favorite models to the front of the candidate pool in ranked order
- Matching ignores `:variant` suffixes (e.g., `:free`, `:nitro`)
- Matching accepts both bare IDs (e.g., `gpt-4o-mini`) and qualified IDs (e.g., `openai/gpt-4o-mini`)

Admin API (`admin.py`):
- `GET /admin/api/favorite-free-models` endpoint to retrieve the current list
- `PUT /admin/api/favorite-free-models` endpoint to update the list with validation
- `favorite_free_models` included in the config export endpoint

Admin UI (`index.html`):

Documentation (`README.md`):
- `favorite_free_models` usage

Tests (`test_favorite_ordering.py`, `test_admin_api.py`):
- Tests for `_apply_favorite_free_ordering()` covering edge cases

Implementation Details
https://claude.ai/code/session_01XSnxDp6BPeErjdqembchkZ