feat(router): add Owl Alpha + LongCat model-level routing #9

Merged
vi70x3 merged 5 commits into main from spec/owl-alpha-longcat-model-routing
Jun 2, 2026

Conversation

@vi70x3

@vi70x3 vi70x3 commented Jun 2, 2026

Collaborator

Implements the owl-alpha-longcat-model-routing spec.

Summary by CodeRabbit

  • New Features

    • Smart routing now prefers LongCat and Owl Alpha at model level when valid keys exist; balanced auto excludes them.
    • Added sticky cooldown behavior for Owl Alpha.
  • Bug Fixes

    • Disabled sticky-session behavior for balanced auto routing.
    • Error handling changed from provider/platform bans to model-level skips for LongCat and Owl Alpha.
  • Tests

    • Added balanced-mode coverage and updated smart-mode routing tests.
  • Documentation

    • Added design, requirements, and task specs for these behaviors.

vi70x3 added 2 commits June 2, 2026 06:02
- Add EXCLUDED_FROM_BALANCED set for LongCat platform exclusion
- Add EXCLUDED_MODELS_FROM_BALANCED map for Owl Alpha model exclusion
- Filter excluded models in balanced mode before scoring
- Extract hasValidKeys() helper for key capacity checks
- Add Owl Alpha smart preference check in smart mode
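
The balanced-mode exclusion described in this commit can be sketched roughly as below. The constant names come from the commit message; their contents, the `ChainEntry` shape, and the `filterForBalanced` helper are assumptions for illustration, not the actual code in router.ts.

```typescript
// Hypothetical entry shape; the real router.ts type is not shown in this PR.
interface ChainEntry {
  platform: string;
  model_id: string;
}

// Names from the commit message; the contents are assumed.
const EXCLUDED_FROM_BALANCED = new Set<string>(["longcat"]);
const EXCLUDED_MODELS_FROM_BALANCED = new Map<string, Set<string>>([
  ["openrouter", new Set<string>(["owl-alpha"])],
]);

// Hypothetical helper: drop excluded platforms and models before scoring.
function filterForBalanced(chain: ChainEntry[]): ChainEntry[] {
  return chain.filter((entry) => {
    // Whole-platform exclusion (LongCat).
    if (EXCLUDED_FROM_BALANCED.has(entry.platform)) return false;
    // Per-model exclusion (openrouter/owl-alpha).
    const excludedModels = EXCLUDED_MODELS_FROM_BALANCED.get(entry.platform);
    if (excludedModels !== undefined && excludedModels.has(entry.model_id)) return false;
    return true;
  });
}
```

Keeping the platform-level set and the model-level map separate means other openrouter models stay eligible in balanced mode while only owl-alpha is filtered out.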
@coderabbitai

coderabbitai Bot commented Jun 2, 2026

Review Change Stack

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro Plus

Run ID: 90ec7e3e-b39a-4762-be9d-f94afb23bcde

📥 Commits

Reviewing files that changed from the base of the PR and between 73d8dda and 8ddc74f.

📒 Files selected for processing (2)
  • server/src/routes/proxy.ts
  • server/src/services/router.ts
🚧 Files skipped from review as they are similar to previous changes (2)
  • server/src/services/router.ts
  • server/src/routes/proxy.ts

📝 Walkthrough


This PR disables sticky-session pinning for balanced auto-routing via an early empty return from getSessionKey(), adds balanced-mode router exclusions and hasValidKeys() validation, implements smart-mode LongCat/Owl Alpha preference ordering, migrates provider-level bans to model-level skips, and updates tests accordingly.

Changes

Sticky Session Disable and Model-Level Routing

| Layer | File(s) | Summary |
| --- | --- | --- |
| Feature specs and design documentation | .roo/specs/disable-sticky-on-auto/*, .roo/specs/owl-alpha-longcat-model-routing/* | Designs, requirements, and tasks describing disabling sticky for balanced mode and model-level routing/preferences for Owl Alpha and LongCat, including sticky cooldown and testing checklists. |
| Sticky session disable for balanced mode | server/src/routes/proxy.ts, server/src/__tests__/routes/provider-session-ban.test.ts | getSessionKey() short-circuits to '' when routingMode === 'balanced', preventing downstream sticky operations; tests assert balanced mode ignores sticky entries, returns empty keys, forces ban checks false, and prevents entry creation. |
| Router model exclusions and key capacity validation | server/src/services/router.ts | Adds balanced-mode exclusion constants for longcat and openrouter/owl-alpha; introduces hasValidKeys() to validate key eligibility (cooldown/capacity); computes a filtered chain for balanced routing and derives intelligence rank range from it. |
| Smart mode model preferences and sticky cooldown | server/src/services/router.ts | Refactors smart-mode preference ordering: longcat moved to front only when hasValidKeys() succeeds; openrouter/owl-alpha repositioned to follow longcat (or appended); logs when owl-alpha preference activates; sticky cooldown extended to owl-alpha. |
| Proxy model-level skips and sticky clearing on errors | server/src/routes/proxy.ts | Multiple truncation/5xx/mid-stream/retryable paths now add only the failing model to skipModels (by route.modelDbId) for LongCat and Owl Alpha, clear sticky pins when pinned to those models, and avoid provider-wide bans; adds owl-alpha sticky cooldown exclusion. |
| Test suite refactoring to smart mode and integration coverage | server/src/__tests__/routes/provider-session-ban.test.ts | Migrated sticky-session tests to smart-mode (session keys, ban detection/creation, TTL expiry, reset behavior); added balanced-mode no-op suite; updated integration tests for ban lifecycle and skipModels; minor truncation sample tweak. |
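
As a rough illustration of the sticky-session row above, the sketch below shows how an empty session key can turn every downstream sticky operation into a no-op. Only the name getSessionKey() and the `''` return for balanced mode come from the walkthrough; the `clientId` parameter, the key format, and the `stickyStore`, `lookupSticky`, and `pinSticky` names are hypothetical.

```typescript
type RoutingMode = "balanced" | "smart";

// Hypothetical signature: the real getSessionKey() inputs are not shown in this PR.
function getSessionKey(routingMode: RoutingMode, clientId: string): string {
  // Balanced mode short-circuits to an empty key, so nothing can be pinned.
  if (routingMode === "balanced") return "";
  return `session:${clientId}`;
}

// Hypothetical in-memory sticky store: session key -> pinned model DB id.
const stickyStore = new Map<string, number>();

function lookupSticky(sessionKey: string): number | undefined {
  // An empty key never matches an entry.
  return sessionKey === "" ? undefined : stickyStore.get(sessionKey);
}

function pinSticky(sessionKey: string, modelDbId: number): void {
  // An empty key never creates an entry.
  if (sessionKey !== "") stickyStore.set(sessionKey, modelDbId);
}
```

Guarding on the empty key in one place keeps the callers unchanged: they can call the sticky helpers unconditionally and balanced requests simply fall through.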

Sequence Diagram

sequenceDiagram
  participant Client
  participant Proxy
  participant Router
  participant StickyStore
  participant SkipModels

  Client->>Proxy: POST request (routingMode)
  Proxy->>Proxy: getSessionKey(routingMode)
  alt routingMode == "balanced"
    Proxy-->>StickyStore: (no lookup — returns '')
  else
    Proxy->>StickyStore: lookup session key
  end
  Proxy->>Router: routeRequest(request, skipModels)
  Router->>SkipModels: apply balanced exclusions & sticky cooldown checks
  Router->>Router: hasValidKeys(platform, modelId) checks
  Router-->>Proxy: selected route
  Proxy->>StickyStore: set/clear sticky model if needed (skipped on balanced)

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

Poem

🐰 Hopping through routes with a curious cheer,

Balanced lets go of the sticky-thread gear.
Owl and LongCat take ordered turns to play,
Smart mode prefers while balanced stays away.
Tests hum along as the routing finds its way.

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 60.00% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title accurately reflects the main change: implementing model-level routing for Owl Alpha and LongCat, which is documented across multiple spec files and implemented in router.ts and proxy.ts. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |


@gemini-code-assist gemini-code-assist Bot left a comment


Code Review

This pull request disables sticky sessions on the balanced routing endpoint by returning an empty session key, and updates the smart routing endpoint to prefer both LongCat and Owl Alpha models when valid keys exist. It also introduces balanced-mode exclusions for these models and updates the test suite accordingly. The review feedback points out a potential issue in the smart preference sorting logic where the preferred openrouter/owl-alpha entry could be incorrectly placed behind a non-preferred longcat entry if the LongCat preference was not active but a longcat model happened to be sorted first by Thompson sampling, suggesting a more robust tracking of whether the preference was applied.

Important

The consumer version of Gemini Code Assist on GitHub is being sunset. Starting June 18, 2026, new organization installations will be blocked, and all code review activity will officially cease on July 17, 2026.
For more details on the timeline and next steps, please review the Help Documentation.

Comment on lines 537 to 583
   if (routingMode === 'smart') {
     const longcatEntries = sorted.filter(e => e.platform === 'longcat');
     if (longcatEntries.length > 0) {
-      // Check if any LongCat key passes rate-limit checks
-      const lcKeys = db.prepare(
-        'SELECT * FROM api_keys WHERE platform = ? AND enabled = 1 AND status != ?'
-      ).all('longcat', 'invalid') as KeyRow[];
-      if (lcKeys.length > 0) {
-        const sampleEntry = longcatEntries[0];
-        const lcLimits = {
-          rpm: sampleEntry.rpm_limit,
-          rpd: sampleEntry.rpd_limit,
-          tpm: sampleEntry.tpm_limit,
-          tpd: sampleEntry.tpd_limit,
-        };
-        const hasCapacity = lcKeys.some(key =>
-          !isOnCooldown(sampleEntry.platform, sampleEntry.model_id, key.id) &&
-          canMakeRequest(sampleEntry.platform, sampleEntry.model_id, key.id, lcLimits) &&
-          canUseTokens(sampleEntry.platform, sampleEntry.model_id, key.id, estimatedTokens, lcLimits)
-        );
-        if (hasCapacity) {
-          // Move all LongCat entries to front, preserving relative score order
-          const others = sorted.filter(e => e.platform !== 'longcat');
-          sorted.length = 0;
-          sorted.push(...longcatEntries, ...others);
+      const sampleEntry = longcatEntries[0];
+      const lcLimits = {
+        rpm: sampleEntry.rpm_limit,
+        rpd: sampleEntry.rpd_limit,
+        tpm: sampleEntry.tpm_limit,
+        tpd: sampleEntry.tpd_limit,
+      };
+      // T1.3: Use hasValidKeys() helper
+      if (hasValidKeys(sampleEntry.platform, sampleEntry.model_id, lcLimits, estimatedTokens)) {
+        // Move all LongCat entries to front, preserving relative score order
+        const others = sorted.filter(e => e.platform !== 'longcat');
+        sorted.length = 0;
+        sorted.push(...longcatEntries, ...others);
       }
     }
+
+    // T1.4: Owl Alpha smart preference — move openrouter/owl-alpha to front if any key has capacity
+    const owlAlphaEntry = sorted.find(e => e.platform === 'openrouter' && e.model_id === 'owl-alpha');
+    if (owlAlphaEntry) {
+      const oaLimits = {
+        rpm: owlAlphaEntry.rpm_limit,
+        rpd: owlAlphaEntry.rpd_limit,
+        tpm: owlAlphaEntry.tpm_limit,
+        tpd: owlAlphaEntry.tpd_limit,
+      };
+      if (hasValidKeys(owlAlphaEntry.platform, owlAlphaEntry.model_id, oaLimits, estimatedTokens)) {
+        // Remove owl-alpha from its current position
+        const owlIdx = sorted.indexOf(owlAlphaEntry);
+        if (owlIdx >= 0) {
+          sorted.splice(owlIdx, 1);
+        }
+        // Insert after LongCat entries (if any are at the front), preserving relative score order
+        const lastLongcatIdx = sorted.findIndex(e => e.platform !== 'longcat');
+        if (lastLongcatIdx === -1) {
+          // All entries are LongCat — append owl-alpha at the end
+          sorted.push(owlAlphaEntry);
+        } else {
+          // Insert right after the last LongCat entry
+          sorted.splice(lastLongcatIdx, 0, owlAlphaEntry);
+        }
+        console.log('[Router] Owl Alpha preference active — moving openrouter/owl-alpha to front');
+      }
+    }
   }
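
The hasValidKeys() helper itself is not shown in this thread. Based on the inline check it replaces, it might look roughly like the sketch below. The real helper takes (platform, model_id, limits, estimatedTokens) and presumably loads the keys itself; here the key list and the three checks are passed in explicitly so the example is self-contained, and every type and parameter name is an assumption.

```typescript
// Hypothetical shapes; the real types in router.ts are not shown in this PR.
interface Limits {
  rpm: number | null;
  rpd: number | null;
  tpm: number | null;
  tpd: number | null;
}

interface KeyRow {
  id: number;
}

// The three checks named in the removed inline code, injected as dependencies.
interface KeyChecks {
  isOnCooldown(platform: string, modelId: string, keyId: number): boolean;
  canMakeRequest(platform: string, modelId: string, keyId: number, limits: Limits): boolean;
  canUseTokens(platform: string, modelId: string, keyId: number, tokens: number, limits: Limits): boolean;
}

// Sketch: true if at least one key is off cooldown and within request and token limits.
function hasValidKeysSketch(
  keys: KeyRow[],
  checks: KeyChecks,
  platform: string,
  modelId: string,
  limits: Limits,
  estimatedTokens: number,
): boolean {
  return keys.some(
    (key) =>
      !checks.isOnCooldown(platform, modelId, key.id) &&
      checks.canMakeRequest(platform, modelId, key.id, limits) &&
      checks.canUseTokens(platform, modelId, key.id, estimatedTokens, limits),
  );
}
```

Extracting the check this way lets the LongCat and Owl Alpha branches share one definition of "has capacity" instead of duplicating the three-part condition.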


medium

The current logic for placing the preferred openrouter/owl-alpha entry uses sorted.findIndex(e => e.platform !== 'longcat') to find the insertion index. However, if LongCat was not preferred (e.g., due to no valid keys) but a longcat entry happened to be sorted at index 0 by chance of Thompson sampling, lastLongcatIdx will evaluate to 1. This incorrectly places the preferred owl-alpha entry at index 1 (behind a non-preferred longcat entry with no valid keys), leading to an unnecessary key validation check and fallback during routing.

Instead of scanning the array with findIndex, we can track whether the LongCat preference was actually applied (lcPreferred). If it was, we insert owl-alpha right after the preferred LongCat entries (at index longcatEntries.length). Otherwise, we insert it at the very front (index 0). This is more robust and avoids scanning the array.

  if (routingMode === 'smart') {
    let lcPreferred = false;
    const longcatEntries = sorted.filter(e => e.platform === 'longcat');
    if (longcatEntries.length > 0) {
      const sampleEntry = longcatEntries[0];
      const lcLimits = {
        rpm: sampleEntry.rpm_limit,
        rpd: sampleEntry.rpd_limit,
        tpm: sampleEntry.tpm_limit,
        tpd: sampleEntry.tpd_limit,
      };
      // T1.3: Use hasValidKeys() helper
      if (hasValidKeys(sampleEntry.platform, sampleEntry.model_id, lcLimits, estimatedTokens)) {
        // Move all LongCat entries to front, preserving relative score order
        const others = sorted.filter(e => e.platform !== 'longcat');
        sorted.length = 0;
        sorted.push(...longcatEntries, ...others);
        lcPreferred = true;
      }
    }

    // T1.4: Owl Alpha smart preference — move openrouter/owl-alpha to front if any key has capacity
    const owlAlphaEntry = sorted.find(e => e.platform === 'openrouter' && e.model_id === 'owl-alpha');
    if (owlAlphaEntry) {
      const oaLimits = {
        rpm: owlAlphaEntry.rpm_limit,
        rpd: owlAlphaEntry.rpd_limit,
        tpm: owlAlphaEntry.tpm_limit,
        tpd: owlAlphaEntry.tpd_limit,
      };
      if (hasValidKeys(owlAlphaEntry.platform, owlAlphaEntry.model_id, oaLimits, estimatedTokens)) {
        // Remove owl-alpha from its current position
        const owlIdx = sorted.indexOf(owlAlphaEntry);
        if (owlIdx >= 0) {
          sorted.splice(owlIdx, 1);
        }
        // Insert after LongCat entries if they were preferred, otherwise at the front
        const insertIdx = lcPreferred ? longcatEntries.length : 0;
        sorted.splice(insertIdx, 0, owlAlphaEntry);
        console.log('[Router] Owl Alpha preference active — moving openrouter/owl-alpha to front');
      }
    }
  }
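
The difference between the two insertion strategies can be checked in isolation. The sketch below reproduces the case from the comment above, where a LongCat entry sits at index 0 without having been preferred. The `Entry` shape, the helper names, and the sample model ids other than owl-alpha are hypothetical.

```typescript
// Hypothetical minimal entry shape for the demonstration.
interface Entry {
  platform: string;
  model_id: string;
}

// Strategy from the PR: insert owl-alpha before the first non-LongCat entry.
function placeWithFindIndex(sorted: Entry[], owl: Entry): Entry[] {
  const rest = sorted.filter((e) => e !== owl);
  const firstNonLongcatIdx = rest.findIndex((e) => e.platform !== "longcat");
  if (firstNonLongcatIdx === -1) {
    rest.push(owl);
  } else {
    rest.splice(firstNonLongcatIdx, 0, owl);
  }
  return rest;
}

// Strategy from the suggestion: use a flag that records whether LongCat was preferred.
function placeWithFlag(sorted: Entry[], owl: Entry, lcPreferred: boolean, longcatCount: number): Entry[] {
  const rest = sorted.filter((e) => e !== owl);
  rest.splice(lcPreferred ? longcatCount : 0, 0, owl);
  return rest;
}

const longcatEntry: Entry = { platform: "longcat", model_id: "lc-model" };
const otherEntry: Entry = { platform: "other-platform", model_id: "other-model" };
const owlEntry: Entry = { platform: "openrouter", model_id: "owl-alpha" };

// LongCat was sampled first but has no valid keys, so lcPreferred is false.
const sampled: Entry[] = [longcatEntry, otherEntry, owlEntry];

const viaFindIndex = placeWithFindIndex(sampled, owlEntry).map((e) => e.model_id);
const viaFlag = placeWithFlag(sampled, owlEntry, false, 1).map((e) => e.model_id);

console.log(viaFindIndex); // [ 'lc-model', 'owl-alpha', 'other-model' ]
console.log(viaFlag);      // [ 'owl-alpha', 'lc-model', 'other-model' ]
```

With findIndex, owl-alpha lands behind the non-preferred LongCat entry; with the flag it goes to the very front, which matches the stated intent of the preference.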

- Add Owl Alpha sticky cooldown check alongside LongCat
- Change all LongCat error handling from provider-level to model-level banning
- Add identical model-level handling for Owl Alpha (openrouter/owl-alpha)
- Replace banPlatformFromSession + addProviderModelsToSkipModels with skipModels.add(route.modelDbId)
- Clear sticky preference when pinned to specific model on errors
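
A minimal sketch of the model-level handling listed in this commit, assuming a `Route` shape with `platform`, `modelId`, and `modelDbId` fields and a sticky map keyed by session key. Only skipModels and route.modelDbId appear in the PR; `isModelLevelTarget`, `handleModelFailure`, and the sticky map are hypothetical names.

```typescript
// Hypothetical route shape; only modelDbId is named in the PR.
interface Route {
  platform: string;
  modelId: string;
  modelDbId: number;
}

// Assumed predicate: LongCat (any model) and openrouter/owl-alpha get model-level handling.
function isModelLevelTarget(route: Route): boolean {
  return (
    route.platform === "longcat" ||
    (route.platform === "openrouter" && route.modelId === "owl-alpha")
  );
}

// Skip only the failing model and clear the sticky pin if it points at that model.
// Returns true when model-level handling was applied.
function handleModelFailure(
  route: Route,
  skipModels: Set<number>,
  sticky: Map<string, number>,
  sessionKey: string,
): boolean {
  if (!isModelLevelTarget(route)) return false; // other providers keep their existing handling (not shown)
  skipModels.add(route.modelDbId);
  if (sessionKey !== "" && sticky.get(sessionKey) === route.modelDbId) {
    sticky.delete(sessionKey);
  }
  return true;
}
```

Because only one model id enters skipModels, the remaining models on the same platform stay routable for the rest of the request, which a platform-wide ban would have prevented.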

@coderabbitai coderabbitai Bot left a comment


🧹 Nitpick comments (1)
server/src/services/router.ts (1)

572-578: 💤 Low value

Variable name lastLongcatIdx is misleading.

The variable actually holds the index of the first non-LongCat entry, not the last LongCat entry. The logic is correct, but the name could confuse future readers.

Suggested rename for clarity
-        const lastLongcatIdx = sorted.findIndex(e => e.platform !== 'longcat');
-        if (lastLongcatIdx === -1) {
+        const firstNonLongcatIdx = sorted.findIndex(e => e.platform !== 'longcat');
+        if (firstNonLongcatIdx === -1) {
           // All entries are LongCat — append owl-alpha at the end
           sorted.push(owlAlphaEntry);
         } else {
           // Insert right after the last LongCat entry
-          sorted.splice(lastLongcatIdx, 0, owlAlphaEntry);
+          sorted.splice(firstNonLongcatIdx, 0, owlAlphaEntry);
         }
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@server/src/services/router.ts` around lines 572 - 578, The variable
lastLongcatIdx is misnamed because it actually stores the index of the first
non-LongCat element; rename the identifier (e.g., lastLongcatIdx ->
firstNonLongcatIdx) throughout the block in the function/method where sorted is
used and update the surrounding comments to match ("first non-LongCat" / "insert
right after the last LongCat" as appropriate) so the name accurately reflects
its semantics; ensure the conditional check (=== -1) and the splice/push logic
referencing that variable are updated to the new name as well.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro Plus

Run ID: 9942f308-b710-4fd0-94eb-7f1864071b4b

📥 Commits

Reviewing files that changed from the base of the PR and between a468601 and c98bb7d.

📒 Files selected for processing (9)
  • .roo/specs/disable-sticky-on-auto/design.md
  • .roo/specs/disable-sticky-on-auto/requirements.md
  • .roo/specs/disable-sticky-on-auto/tasks.md
  • .roo/specs/owl-alpha-longcat-model-routing/design.md
  • .roo/specs/owl-alpha-longcat-model-routing/requirements.md
  • .roo/specs/owl-alpha-longcat-model-routing/tasks.md
  • server/src/__tests__/routes/provider-session-ban.test.ts
  • server/src/routes/proxy.ts
  • server/src/services/router.ts

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@server/src/routes/proxy.ts`:
- Line 1234: The code calls skipModels.add(route.modelDbId), but route is only
declared later, inside the retry loop, so it is not available here. Add the DB
id that is already in scope from preferredModel instead: call
skipModels.add(preferredModel) or skipModels.add(preferredModelModelDbId),
depending on the shape of preferredModel, rather than using route.modelDbId, so
the cooldown path doesn't throw a ReferenceError. Update the skipModels.add call
near where preferredModel is in scope and remove the reliance on the
later-declared route variable.
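
One way to read this finding: the cooldown check runs before the retry loop, so it can only use values that already exist at that point. The sketch below assumes the preferred model is available as a numeric DB id; the function and parameter names are hypothetical and the real shape of preferredModel is not shown in this thread.

```typescript
// Hypothetical helper for the pre-loop cooldown path.
// preferredModelDbId stands in for whatever id the sticky lookup already produced.
function skipPreferredOnCooldown(
  preferredModelDbId: number | undefined,
  onCooldown: boolean,
  skipModels: Set<number>,
): boolean {
  // Nothing pinned, or the pinned model is not cooling down: leave skipModels alone.
  if (preferredModelDbId === undefined || !onCooldown) return false;
  // Uses only the id that is in scope before the retry loop declares `route`.
  skipModels.add(preferredModelDbId);
  return true;
}
```

Passing the id in explicitly makes the dependency visible in the signature, so the helper cannot accidentally reach for a variable that is declared later.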

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro Plus

Run ID: 12bec45e-da83-4ebc-9de7-7b71c080604a

📥 Commits

Reviewing files that changed from the base of the PR and between c98bb7d and 73d8dda.

📒 Files selected for processing (2)
  • .roo/specs/owl-alpha-longcat-model-routing/tasks.md
  • server/src/routes/proxy.ts
✅ Files skipped from review due to trivial changes (1)
  • .roo/specs/owl-alpha-longcat-model-routing/tasks.md

Comment thread server/src/routes/proxy.ts Outdated
@kilo-code-bot

kilo-code-bot Bot commented Jun 2, 2026


Code Review Summary

Status: No Issues Found | Recommendation: Merge

Files Reviewed (5 files)
  • .roo/specs/owl-alpha-longcat-model-routing/requirements.md
  • .roo/specs/owl-alpha-longcat-model-routing/tasks.md
  • server/src/__tests__/routes/provider-session-ban.test.ts
  • server/src/routes/proxy.ts
  • server/src/services/router.ts

Reviewed by nemotron-3-super-120b-a12b-20230311:free · 495,738 tokens
