feat(router): add Owl Alpha + LongCat model-level routing #9
- Add EXCLUDED_FROM_BALANCED set for LongCat platform exclusion
- Add EXCLUDED_MODELS_FROM_BALANCED map for Owl Alpha model exclusion
- Filter excluded models in balanced mode before scoring
- Extract hasValidKeys() helper for key capacity checks
- Add Owl Alpha smart preference check in smart mode
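The balanced-mode exclusions listed above could look roughly like the following. This is a minimal sketch, not the code from the PR: the `RouteEntry` shape, the helper names `isExcludedFromBalanced` and `filterForBalanced`, and the exact contents of the two collections are assumptions inferred from the commit message.

```typescript
// Hypothetical entry shape; the real type in router.ts is not shown on this page.
interface RouteEntry {
  platform: string;
  model_id: string;
}

// Platform-level exclusion: every model on these platforms is skipped in balanced mode.
// (Assumed contents, based on "LongCat platform exclusion".)
const EXCLUDED_FROM_BALANCED: ReadonlySet<string> = new Set<string>(["longcat"]);

// Model-level exclusion, keyed by platform.
// (Assumed contents, based on "Owl Alpha model exclusion".)
const EXCLUDED_MODELS_FROM_BALANCED: ReadonlyMap<string, ReadonlySet<string>> =
  new Map<string, ReadonlySet<string>>([["openrouter", new Set<string>(["owl-alpha"])]]);

// True when the entry must not take part in balanced-mode scoring.
function isExcludedFromBalanced(entry: RouteEntry): boolean {
  if (EXCLUDED_FROM_BALANCED.has(entry.platform)) return true;
  return EXCLUDED_MODELS_FROM_BALANCED.get(entry.platform)?.has(entry.model_id) ?? false;
}

// Filter candidates before scoring, preserving their relative order.
function filterForBalanced<T extends RouteEntry>(candidates: readonly T[]): T[] {
  return candidates.filter((entry) => !isExcludedFromBalanced(entry));
}
```

Keeping the platform set and the per-model map separate lets other models on the same platform as Owl Alpha stay eligible, which a platform-only set could not express.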
No actionable comments were generated in the recent review. 🎉
📝 Walkthrough
This PR disables sticky-session pinning for balanced auto-routing via an early empty return from getSessionKey(), adds balanced-mode router exclusions and hasValidKeys() validation, implements smart-mode LongCat/Owl Alpha preference ordering, migrates provider-level bans to model-level skips, and updates tests accordingly.
Changes
Sticky Session Disable and Model-Level Routing
Sequence Diagram
sequenceDiagram
participant Client
participant Proxy
participant Router
participant StickyStore
participant SkipModels
Client->>Proxy: POST request (routingMode)
Proxy->>Proxy: getSessionKey(routingMode)
alt routingMode == "balanced"
Proxy-->>StickyStore: (no lookup — returns '')
else
Proxy->>StickyStore: lookup session key
end
Proxy->>Router: routeRequest(request, skipModels)
Router->>SkipModels: apply balanced exclusions & sticky cooldown checks
Router->>Router: hasValidKeys(platform, modelId) checks
Router-->>Proxy: selected route
Proxy->>StickyStore: set/clear sticky model if needed (skipped on balanced)
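The balanced branch of the diagram can be sketched as follows. This is an illustration under stated assumptions: the `getSessionKey` signature, the `clientSessionId` parameter, the key format, the numeric model id, and the in-memory `stickyStore` are all hypothetical stand-ins for whatever proxy.ts actually uses.

```typescript
// Hypothetical signature; the real getSessionKey() may take different arguments.
function getSessionKey(routingMode: string, clientSessionId: string): string {
  // Balanced auto-routing opts out of sticky sessions: an empty key means "no lookup".
  if (routingMode === "balanced") return "";
  return `${routingMode}:${clientSessionId}`;
}

// In-memory stand-in for the sticky store (assumption: model ids are numbers).
const stickyStore = new Map<string, number>();

// Look up a pinned model; balanced requests never reach the store.
function lookupStickyModel(routingMode: string, clientSessionId: string): number | undefined {
  const key = getSessionKey(routingMode, clientSessionId);
  return key === "" ? undefined : stickyStore.get(key);
}

// Pin a model after a successful route; skipped entirely for balanced requests.
function rememberStickyModel(routingMode: string, clientSessionId: string, modelDbId: number): void {
  const key = getSessionKey(routingMode, clientSessionId);
  if (key !== "") stickyStore.set(key, modelDbId);
}
```

Returning an empty key from one place means both the lookup and the set/clear paths skip the store without each needing its own routing-mode check.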
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~45 minutes
🚥 Pre-merge checks | ✅ 4 | ❌ 1
❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
Code Review
This pull request disables sticky sessions on the balanced routing endpoint by returning an empty session key, and updates the smart routing endpoint to prefer both LongCat and Owl Alpha models when valid keys exist. It also introduces balanced-mode exclusions for these models and updates the test suite accordingly. The review feedback identifies a potential issue in the smart preference sorting logic: if the LongCat preference is not active but Thompson sampling happens to sort a longcat model first, the preferred openrouter/owl-alpha entry is placed behind that non-preferred longcat entry. The suggested fix is to track explicitly whether the LongCat preference was applied.
  if (routingMode === 'smart') {
    const longcatEntries = sorted.filter(e => e.platform === 'longcat');
    if (longcatEntries.length > 0) {
-     // Check if any LongCat key passes rate-limit checks
-     const lcKeys = db.prepare(
-       'SELECT * FROM api_keys WHERE platform = ? AND enabled = 1 AND status != ?'
-     ).all('longcat', 'invalid') as KeyRow[];
-     if (lcKeys.length > 0) {
-       const sampleEntry = longcatEntries[0];
-       const lcLimits = {
-         rpm: sampleEntry.rpm_limit,
-         rpd: sampleEntry.rpd_limit,
-         tpm: sampleEntry.tpm_limit,
-         tpd: sampleEntry.tpd_limit,
-       };
-       const hasCapacity = lcKeys.some(key =>
-         !isOnCooldown(sampleEntry.platform, sampleEntry.model_id, key.id) &&
-         canMakeRequest(sampleEntry.platform, sampleEntry.model_id, key.id, lcLimits) &&
-         canUseTokens(sampleEntry.platform, sampleEntry.model_id, key.id, estimatedTokens, lcLimits)
-       );
-       if (hasCapacity) {
-         // Move all LongCat entries to front, preserving relative score order
-         const others = sorted.filter(e => e.platform !== 'longcat');
-         sorted.length = 0;
-         sorted.push(...longcatEntries, ...others);
+     const sampleEntry = longcatEntries[0];
+     const lcLimits = {
+       rpm: sampleEntry.rpm_limit,
+       rpd: sampleEntry.rpd_limit,
+       tpm: sampleEntry.tpm_limit,
+       tpd: sampleEntry.tpd_limit,
+     };
+     // T1.3: Use hasValidKeys() helper
+     if (hasValidKeys(sampleEntry.platform, sampleEntry.model_id, lcLimits, estimatedTokens)) {
+       // Move all LongCat entries to front, preserving relative score order
+       const others = sorted.filter(e => e.platform !== 'longcat');
+       sorted.length = 0;
+       sorted.push(...longcatEntries, ...others);
      }
    }

+   // T1.4: Owl Alpha smart preference — move openrouter/owl-alpha to front if any key has capacity
+   const owlAlphaEntry = sorted.find(e => e.platform === 'openrouter' && e.model_id === 'owl-alpha');
+   if (owlAlphaEntry) {
+     const oaLimits = {
+       rpm: owlAlphaEntry.rpm_limit,
+       rpd: owlAlphaEntry.rpd_limit,
+       tpm: owlAlphaEntry.tpm_limit,
+       tpd: owlAlphaEntry.tpd_limit,
+     };
+     if (hasValidKeys(owlAlphaEntry.platform, owlAlphaEntry.model_id, oaLimits, estimatedTokens)) {
+       // Remove owl-alpha from its current position
+       const owlIdx = sorted.indexOf(owlAlphaEntry);
+       if (owlIdx >= 0) {
+         sorted.splice(owlIdx, 1);
+       }
+       // Insert after LongCat entries (if any are at the front), preserving relative score order
+       const lastLongcatIdx = sorted.findIndex(e => e.platform !== 'longcat');
+       if (lastLongcatIdx === -1) {
+         // All entries are LongCat — append owl-alpha at the end
+         sorted.push(owlAlphaEntry);
+       } else {
+         // Insert right after the last LongCat entry
+         sorted.splice(lastLongcatIdx, 0, owlAlphaEntry);
+       }
+       console.log('[Router] Owl Alpha preference active — moving openrouter/owl-alpha to front');
+     }
+   }
  }
The current logic for placing the preferred openrouter/owl-alpha entry uses sorted.findIndex(e => e.platform !== 'longcat') to find the insertion index. However, if LongCat was not preferred (e.g., due to no valid keys) but a longcat entry happened to be sorted at index 0 by chance of Thompson sampling, lastLongcatIdx will evaluate to 1. This incorrectly places the preferred owl-alpha entry at index 1 (behind a non-preferred longcat entry with no valid keys), leading to an unnecessary key validation check and fallback during routing.
Instead of scanning the array with findIndex, we can track whether the LongCat preference was actually applied (lcPreferred). If it was, we insert owl-alpha right after the preferred LongCat entries (at index longcatEntries.length). Otherwise, we insert it at the very front (index 0). This is more robust and avoids scanning the array.
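The failure case described here can be reproduced in isolation. The sketch below is illustrative only: the `Entry` type, the helpers `insertIdxByScan`, `insertIdxByFlag`, and `placeOwlAlpha`, and the `lc-chat` and `groq` entries are hypothetical, and the key-capacity checks are left out so that only the insertion index differs.

```typescript
interface Entry {
  platform: string;
  model_id: string;
}

// Insertion index as in the reviewed code: position of the first non-LongCat entry.
function insertIdxByScan(rest: readonly Entry[]): number {
  const idx = rest.findIndex((e) => e.platform !== "longcat");
  return idx === -1 ? rest.length : idx;
}

// Insertion index as in the suggestion: depends only on whether LongCat was promoted.
function insertIdxByFlag(lcPreferred: boolean, longcatCount: number): number {
  return lcPreferred ? longcatCount : 0;
}

// Remove owl-alpha from its current position and re-insert it at the computed index.
function placeOwlAlpha(
  sorted: readonly Entry[],
  insertIdxFor: (rest: readonly Entry[]) => number,
): Entry[] {
  const owl = sorted.find((e) => e.platform === "openrouter" && e.model_id === "owl-alpha");
  if (!owl) return [...sorted];
  const rest = sorted.filter((e) => e !== owl);
  const out = [...rest];
  out.splice(insertIdxFor(rest), 0, owl);
  return out;
}

// LongCat happens to be sampled first but was NOT preferred (e.g. no valid keys).
const sampled: Entry[] = [
  { platform: "longcat", model_id: "lc-chat" }, // hypothetical model id
  { platform: "groq", model_id: "some-model" }, // hypothetical platform/model
  { platform: "openrouter", model_id: "owl-alpha" },
];

// Scan-based placement leaves owl-alpha behind the non-preferred LongCat entry.
const orderByScan = placeOwlAlpha(sampled, insertIdxByScan).map((e) => e.platform);
// Flag-based placement puts owl-alpha first because lcPreferred is false.
const orderByFlag = placeOwlAlpha(sampled, () => insertIdxByFlag(false, 1)).map((e) => e.platform);
```

With this input the scan yields the order longcat, openrouter, groq, while the flag yields openrouter, longcat, groq.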
if (routingMode === 'smart') {
  let lcPreferred = false;
  const longcatEntries = sorted.filter(e => e.platform === 'longcat');
  if (longcatEntries.length > 0) {
    const sampleEntry = longcatEntries[0];
    const lcLimits = {
      rpm: sampleEntry.rpm_limit,
      rpd: sampleEntry.rpd_limit,
      tpm: sampleEntry.tpm_limit,
      tpd: sampleEntry.tpd_limit,
    };
    // T1.3: Use hasValidKeys() helper
    if (hasValidKeys(sampleEntry.platform, sampleEntry.model_id, lcLimits, estimatedTokens)) {
      // Move all LongCat entries to front, preserving relative score order
      const others = sorted.filter(e => e.platform !== 'longcat');
      sorted.length = 0;
      sorted.push(...longcatEntries, ...others);
      lcPreferred = true;
    }
  }
  // T1.4: Owl Alpha smart preference — move openrouter/owl-alpha to front if any key has capacity
  const owlAlphaEntry = sorted.find(e => e.platform === 'openrouter' && e.model_id === 'owl-alpha');
  if (owlAlphaEntry) {
    const oaLimits = {
      rpm: owlAlphaEntry.rpm_limit,
      rpd: owlAlphaEntry.rpd_limit,
      tpm: owlAlphaEntry.tpm_limit,
      tpd: owlAlphaEntry.tpd_limit,
    };
    if (hasValidKeys(owlAlphaEntry.platform, owlAlphaEntry.model_id, oaLimits, estimatedTokens)) {
      // Remove owl-alpha from its current position
      const owlIdx = sorted.indexOf(owlAlphaEntry);
      if (owlIdx >= 0) {
        sorted.splice(owlIdx, 1);
      }
      // Insert after LongCat entries if they were preferred, otherwise at the front
      const insertIdx = lcPreferred ? longcatEntries.length : 0;
      sorted.splice(insertIdx, 0, owlAlphaEntry);
      console.log('[Router] Owl Alpha preference active — moving openrouter/owl-alpha to front');
    }
  }
}
- Add Owl Alpha sticky cooldown check alongside LongCat
- Change all LongCat error handling from provider-level to model-level banning
- Add identical model-level handling for Owl Alpha (openrouter/owl-alpha)
- Replace banPlatformFromSession + addProviderModelsToSkipModels with skipModels.add(route.modelDbId)
- Clear sticky preference when pinned to specific model on errors
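The difference between provider-level banning and model-level skipping might be illustrated like this. It is a rough sketch with assumed names: the `Route` shape, the `catalogue` array, the numeric ids, and the helpers `skipProvider`, `skipModel`, and `remaining` are hypothetical, and `skipProvider` only approximates what banPlatformFromSession plus addProviderModelsToSkipModels are described as doing.

```typescript
// Hypothetical route shape; only modelDbId is named in the commit message.
interface Route {
  platform: string;
  modelId: string;
  modelDbId: number; // assumption: numeric database id
}

const owlAlpha: Route = { platform: "openrouter", modelId: "owl-alpha", modelDbId: 1 };
const otherOpenRouter: Route = { platform: "openrouter", modelId: "other-model", modelDbId: 2 };
const longcatChat: Route = { platform: "longcat", modelId: "lc-chat", modelDbId: 3 };

// Hypothetical catalogue of routable models.
const catalogue: readonly Route[] = [owlAlpha, otherOpenRouter, longcatChat];

// Provider-level handling: every model on the failing platform is skipped.
function skipProvider(skipModels: Set<number>, failed: Route): void {
  for (const r of catalogue) {
    if (r.platform === failed.platform) skipModels.add(r.modelDbId);
  }
}

// Model-level handling: only the failing model is skipped.
function skipModel(skipModels: Set<number>, failed: Route): void {
  skipModels.add(failed.modelDbId);
}

// Models still eligible for the next retry, as "platform/model" strings.
function remaining(skipModels: ReadonlySet<number>): string[] {
  return catalogue
    .filter((r) => !skipModels.has(r.modelDbId))
    .map((r) => `${r.platform}/${r.modelId}`);
}
```

In this toy catalogue, a failure on openrouter/owl-alpha under model-level handling leaves the other OpenRouter model available for the retry, whereas provider-level handling removes both.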
🧹 Nitpick comments (1)
server/src/services/router.ts (1)
572-578: 💤 Low value
Variable name lastLongcatIdx is misleading.
The variable actually holds the index of the first non-LongCat entry, not the last LongCat entry. The logic is correct, but the name could confuse future readers.
Suggested rename for clarity
- const lastLongcatIdx = sorted.findIndex(e => e.platform !== 'longcat');
- if (lastLongcatIdx === -1) {
+ const firstNonLongcatIdx = sorted.findIndex(e => e.platform !== 'longcat');
+ if (firstNonLongcatIdx === -1) {
    // All entries are LongCat — append owl-alpha at the end
    sorted.push(owlAlphaEntry);
  } else {
    // Insert right after the last LongCat entry
-   sorted.splice(lastLongcatIdx, 0, owlAlphaEntry);
+   sorted.splice(firstNonLongcatIdx, 0, owlAlphaEntry);
  }
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@server/src/services/router.ts` around lines 572 - 578, the variable lastLongcatIdx is misnamed because it actually stores the index of the first non-LongCat element. Rename the identifier (e.g., lastLongcatIdx -> firstNonLongcatIdx) throughout the block where sorted is used, and update the surrounding comments to match ("first non-LongCat" / "insert right after the last LongCat" as appropriate) so the name accurately reflects its semantics. Ensure the conditional check (=== -1) and the splice/push logic referencing that variable are updated to the new name as well.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro Plus
Run ID: 9942f308-b710-4fd0-94eb-7f1864071b4b
📒 Files selected for processing (9)
- .roo/specs/disable-sticky-on-auto/design.md
- .roo/specs/disable-sticky-on-auto/requirements.md
- .roo/specs/disable-sticky-on-auto/tasks.md
- .roo/specs/owl-alpha-longcat-model-routing/design.md
- .roo/specs/owl-alpha-longcat-model-routing/requirements.md
- .roo/specs/owl-alpha-longcat-model-routing/tasks.md
- server/src/__tests__/routes/provider-session-ban.test.ts
- server/src/routes/proxy.ts
- server/src/services/router.ts
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@server/src/routes/proxy.ts`:
- Line 1234: The code calls skipModels.add(route.modelDbId) but route is only
declared later in the retry loop and is undefined here; replace that usage to
add the already-available DB id from preferredModel (i.e. call
skipModels.add(preferredModel) or skipModels.add(preferredModelModelDbId
depending on the shape of preferredModel) instead of route.modelDbId) so the
cooldown path doesn't throw a ReferenceError; update the skipModels.add call
near where preferredModel is in scope and remove reliance on the later-declared
route variable.
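The scoping problem and the proposed fix can be sketched in a few lines. Everything here is an assumption made for illustration: the `PreferredModel` shape with a `modelDbId` field, the `onCooldown` flag, and the `routeWithRetries` function do not come from proxy.ts.

```typescript
// Hypothetical shape of the sticky/preferred model available before the retry loop.
interface PreferredModel {
  modelDbId: number;
  onCooldown: boolean;
}

function routeWithRetries(
  preferred: PreferredModel | undefined,
  candidateIds: readonly number[],
): number | undefined {
  const skipModels = new Set<number>();

  // The cooldown path runs before the retry loop, so only `preferred` is in scope here.
  // Referring to `route` at this point is the bug the review describes.
  if (preferred?.onCooldown) {
    skipModels.add(preferred.modelDbId);
  }

  for (const modelDbId of candidateIds) {
    const route = { modelDbId }; // `route` exists only inside the loop body
    if (skipModels.has(route.modelDbId)) continue;
    return route.modelDbId;
  }
  return undefined;
}
```

Under these assumptions, a preferred model on cooldown is skipped and the next candidate is chosen, without touching any variable declared inside the loop.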
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro Plus
Run ID: 12bec45e-da83-4ebc-9de7-7b71c080604a
📒 Files selected for processing (2)
- .roo/specs/owl-alpha-longcat-model-routing/tasks.md
- server/src/routes/proxy.ts
✅ Files skipped from review due to trivial changes (1)
- .roo/specs/owl-alpha-longcat-model-routing/tasks.md
Code Review Summary
Status: No Issues Found | Recommendation: Merge
Files Reviewed (5 files)
Reviewed by nemotron-3-super-120b-a12b-20230311:free · 495,738 tokens
Implements the owl-alpha-longcat-model-routing spec.