feat(proxy): generalize provider session ban to all providers via 5xx consecutive failures#4
feat(proxy): generalize provider session ban to all providers via 5xx consecutive failures#4vi70x3 wants to merge 5 commits into
Conversation
Implements the LongCat session ban feature in proxy.ts. When LongCat detects multiple API key use (auth/rate-limit errors) or returns truncated responses, the platform is banned from the sticky session. Future requests in that session route to non-LongCat models. Changes: - Extend stickySessionMap with bannedPlatforms?: Set<string> - Add isSessionBannedFromPlatform() to check session bans - Add banPlatformFromSession() to record platform bans - Add addLongcatModelsToSkipModels() to skip all LongCat models - Add isTruncatedResponse() to detect truncation keywords - Update getStickyKey() to return undefined for banned platforms - Update setStickyModel() to preserve bannedPlatforms across updates - Update pre-routing logic to check bans before routing - Update error handling to ban LongCat on auth/rate-limit/truncation - Add truncation detection after stream completes - Add truncation detection in mid-stream error handling
… consecutive failures - Extend stickySessionMap with consecutiveFailures tracking per provider - Add recordConsecutiveFailure(), resetConsecutiveFailures(), resetAllConsecutiveFailures() - Replace addLongcatModelsToSkipModels with generic addProviderModelsToSkipModels - Replace LongCat-specific auth/rate-limit ban with general 5xx consecutive failure detection (threshold: 2) - Generalize truncation detection to all providers (post-stream + mid-stream) - Update getStickyKey() to check bannedPlatforms for any platform - Update pre-routing ban check to be generic (any banned platform) - Add success path counter reset on both streaming and non-streaming paths - Remove LongCat-specific auth error ban, rate limit ban, and addLongcatModelsToSkipModels - Rename and rewrite tests from longcat-session-ban to provider-session-ban (32 test cases) - TypeScript compiles cleanly, all 150 tests pass
|
No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review info⚙️ Run configurationConfiguration used: Organization UI Review profile: CHILL Plan: Pro Plus Run ID: 📒 Files selected for processing (1)
🚧 Files skipped from review as they are similar to previous changes (1)
📝 WalkthroughWalkthroughAdds LongCat-specific design docs and generalizes them to provider-agnostic session bans: sticky-session entries now track ChangesProvider Session Ban & Fallback Flow
Sequence DiagramsequenceDiagram
participant Client
participant Proxy
participant StickySessionMap
participant ModelDB
Client->>Proxy: POST /chat/completions
Proxy->>StickySessionMap: getStickyModel(sessionId)
Proxy->>ModelDB: Lookup sticky model platform
alt Platform banned
Proxy->>Proxy: addProviderModelsToSkipModels(provider)
Proxy->>Proxy: clear preferredModel/preferredKeyId
end
Proxy->>Proxy: Route attempt (skipModels applied)
alt 5xx Error
Proxy->>StickySessionMap: recordConsecutiveFailure(provider)
alt Threshold reached (2)
Proxy->>StickySessionMap: banPlatformFromSession(provider)
Proxy->>Proxy: Clear preferred sticky selections
end
else Truncation detected
Proxy->>StickySessionMap: banPlatformFromSession(provider)
else Success
Proxy->>StickySessionMap: resetAllConsecutiveFailures(sessionId)
end
Proxy->>Client: Response
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes Possibly related PRs
Poem
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Comment |
There was a problem hiding this comment.
Code Review
This pull request generalizes the session ban and fallback mechanism from being LongCat-specific to working for all providers. It introduces a general 5xx consecutive failure ban (banning a provider after 2 consecutive 5xx errors) and generalizes truncation detection to all providers. The review identified three key issues: clearing the preferred model/key on any 5xx error instead of waiting for the ban threshold, failing to add all banned platforms to skipModels at the start of a request (which could lead to routing to banned platforms during fallback), and a bug in isTruncatedResponse where JSON.stringify fails to extract messages from Error objects due to non-enumerable properties.
Important
The consumer version of Gemini Code Assist on GitHub is being sunset. Starting June 18, 2026, new organization installations will be blocked, and all code review activity will officially cease on July 17, 2026.
For more details on the timeline and next steps, please review the Help Documentation.
| const errStatus = getErrorStatus(err); | ||
| if (errStatus && errStatus >= 500 && errStatus < 600) { | ||
| recordConsecutiveFailure(normalizedMessages, routingMode, route.platform, skipModels, route.modelDbId); | ||
| // If this provider was just banned, clear preferredModel/preferredKeyId if they point to it | ||
| if (preferredModel) { | ||
| const db = getDb(); | ||
| const prefRow = db.prepare('SELECT platform FROM models WHERE id = ?').get(preferredModel) as { platform: string } | undefined; | ||
| if (prefRow?.platform === route.platform) { | ||
| preferredModel = undefined; | ||
| preferredKeyId = undefined; | ||
| } | ||
| } | ||
| } |
There was a problem hiding this comment.
The current implementation clears preferredModel and preferredKeyId on any 5xx error (even the first one), rather than only when the provider is actually banned (after 2 consecutive failures). This prematurely clears the sticky session and prevents the retry loop from trying other keys/models of the same provider on the first failure, defeating the purpose of the consecutive failure threshold.
We should check if the provider is actually banned using isSessionBannedFromPlatform before clearing the preferred model and key.
| const errStatus = getErrorStatus(err); | |
| if (errStatus && errStatus >= 500 && errStatus < 600) { | |
| recordConsecutiveFailure(normalizedMessages, routingMode, route.platform, skipModels, route.modelDbId); | |
| // If this provider was just banned, clear preferredModel/preferredKeyId if they point to it | |
| if (preferredModel) { | |
| const db = getDb(); | |
| const prefRow = db.prepare('SELECT platform FROM models WHERE id = ?').get(preferredModel) as { platform: string } | undefined; | |
| if (prefRow?.platform === route.platform) { | |
| preferredModel = undefined; | |
| preferredKeyId = undefined; | |
| } | |
| } | |
| } | |
| const errStatus = getErrorStatus(err); | |
| if (errStatus && errStatus >= 500 && errStatus < 600) { | |
| recordConsecutiveFailure(normalizedMessages, routingMode, route.platform, skipModels, route.modelDbId); | |
| // If this provider was just banned, clear preferredModel/preferredKeyId if they point to it | |
| if (preferredModel && isSessionBannedFromPlatform(normalizedMessages, routingMode, route.platform)) { | |
| const db = getDb(); | |
| const prefRow = db.prepare('SELECT platform FROM models WHERE id = ?').get(preferredModel) as { platform: string } | undefined; | |
| if (prefRow?.platform === route.platform) { | |
| preferredModel = undefined; | |
| preferredKeyId = undefined; | |
| } | |
| } | |
| } |
| const skipModels = new Set<number>(); | ||
| if (preferredModel) { | ||
| const db = getDb(); | ||
| const prefRow = db.prepare('SELECT platform FROM models WHERE id = ?').get(preferredModel) as { platform: string } | undefined; | ||
| if (prefRow && isSessionBannedFromPlatform(normalizedMessages, routingMode, prefRow.platform)) { | ||
| addProviderModelsToSkipModels(skipModels, prefRow.platform); | ||
| console.log(`[Sticky] skipping preferredModel=${preferredModel} (${prefRow.platform} banned for session)`); | ||
| preferredModel = undefined; | ||
| preferredKeyId = undefined; | ||
| } | ||
| } |
There was a problem hiding this comment.
Currently, only the preferred model's platform is checked for bans and added to skipModels. If the preferred model is not banned, but other platforms are banned for this session, those other banned platforms are not added to skipModels at the start of the request. If the preferred model fails and fallback routing is triggered, the router could route the request to one of those previously banned platforms.
To prevent this, we should retrieve all banned platforms for the active session and add their models to skipModels at the very beginning of the request.
const skipModels = new Set<number>();
const sessionKey = getSessionKey(normalizedMessages, routingMode);
if (sessionKey) {
const entry = stickySessionMap.get(sessionKey);
if (entry) {
if (Date.now() - entry.lastUsed > STICKY_TTL_MS) {
stickySessionMap.delete(sessionKey);
} else if (entry.bannedPlatforms) {
for (const platform of entry.bannedPlatforms) {
addProviderModelsToSkipModels(skipModels, platform);
}
}
}
}
if (preferredModel) {
const db = getDb();
const prefRow = db.prepare('SELECT platform FROM models WHERE id = ?').get(preferredModel) as { platform: string } | undefined;
if (prefRow && isSessionBannedFromPlatform(normalizedMessages, routingMode, prefRow.platform)) {
console.log(`[Sticky] skipping preferredModel=${preferredModel} (${prefRow.platform} banned for session)`);
preferredModel = undefined;
preferredKeyId = undefined;
}
}| function isTruncatedResponse(errOrContent: any): boolean { | ||
| if (!errOrContent) return false; | ||
| let text: string; | ||
| if (typeof errOrContent === 'string') { | ||
| text = errOrContent; | ||
| } else if (typeof errOrContent === 'object') { | ||
| try { text = JSON.stringify(errOrContent); } catch { return false; } | ||
| } else { | ||
| return false; | ||
| } |
There was a problem hiding this comment.
In JavaScript/TypeScript, JSON.stringify(new Error('...')) returns "{}" because Error properties (like message) are non-enumerable. If an Error object is passed to isTruncatedResponse, the function will fail to detect any truncation keywords.
We should explicitly handle Error objects by checking errOrContent instanceof Error and extracting its message property.
| function isTruncatedResponse(errOrContent: any): boolean { | |
| if (!errOrContent) return false; | |
| let text: string; | |
| if (typeof errOrContent === 'string') { | |
| text = errOrContent; | |
| } else if (typeof errOrContent === 'object') { | |
| try { text = JSON.stringify(errOrContent); } catch { return false; } | |
| } else { | |
| return false; | |
| } | |
| function isTruncatedResponse(errOrContent: any): boolean { | |
| if (!errOrContent) return false; | |
| let text: string; | |
| if (typeof errOrContent === 'string') { | |
| text = errOrContent; | |
| } else if (errOrContent instanceof Error) { | |
| text = errOrContent.message; | |
| } else if (typeof errOrContent === 'object') { | |
| try { text = JSON.stringify(errOrContent); } catch { return false; } | |
| } else { | |
| return false; | |
| } |
There was a problem hiding this comment.
Actionable comments posted: 3
🧹 Nitpick comments (1)
server/src/__tests__/routes/provider-session-ban.test.ts (1)
196-205: ⚡ Quick winEnsure DB/PRAGMA cleanup always executes in this test.
If an assertion throws before
ROLLBACK/FK restore, later tests can inherit corrupted DB state.Suggested hardening
- db.prepare('PRAGMA foreign_keys = OFF').run(); - db.prepare('BEGIN').run(); - db.prepare("DELETE FROM api_keys WHERE platform = 'longcat'").run(); - db.prepare("DELETE FROM models WHERE platform = 'longcat'").run(); - const skipModels = new Set<number>(); - expect(() => addProviderModelsToSkipModels(skipModels, 'longcat')).not.toThrow(); - expect(skipModels.size).toBe(0); - db.prepare('ROLLBACK').run(); - db.prepare('PRAGMA foreign_keys = ON').run(); + db.prepare('PRAGMA foreign_keys = OFF').run(); + db.prepare('BEGIN').run(); + try { + db.prepare("DELETE FROM api_keys WHERE platform = 'longcat'").run(); + db.prepare("DELETE FROM models WHERE platform = 'longcat'").run(); + const skipModels = new Set<number>(); + expect(() => addProviderModelsToSkipModels(skipModels, 'longcat')).not.toThrow(); + expect(skipModels.size).toBe(0); + } finally { + db.prepare('ROLLBACK').run(); + db.prepare('PRAGMA foreign_keys = ON').run(); + }🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@server/src/__tests__/routes/provider-session-ban.test.ts` around lines 196 - 205, Wrap the DB PRAGMA/BEGIN/ROLLBACK cleanup in a try/finally so ROLLBACK and restoring PRAGMA foreign_keys = ON always run even if assertions throw: perform PRAGMA foreign_keys = OFF and BEGIN, run the test logic including calling addProviderModelsToSkipModels(skipModels, 'longcat') and the expects inside try, and ensure the db.prepare('ROLLBACK').run() and db.prepare('PRAGMA foreign_keys = ON').run() calls are executed in the finally block to guarantee cleanup.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@server/src/routes/proxy.ts`:
- Around line 1367-1368: The current check treats any 5xx (500–599) as
ban-eligible but the spec only wants 500, 502, 503, 504; change the predicate to
a small centralized helper (e.g., isBanEligibleStatus(status)) and replace the
inline range checks with calls to that helper where
recordConsecutiveFailure(...) is invoked (references: recordConsecutiveFailure,
normalizedMessages, routingMode, route.platform, skipModels, route.modelDbId);
ensure both occurrences (the one around the current 1367 usage and the one near
1460) use the helper so the allowed statuses exactly match {500,502,503,504}.
- Around line 1371-1374: The truncation check currently only calls
isTruncatedResponse(streamErr.message), which misses provider truncation signals
embedded elsewhere; update the logic to aggregate the full error and recent
stream content before deciding to ban: construct a combined text from
streamErr.message plus any available streamErr.response.data / streamErr.body /
streamErr.toString() and run isTruncatedResponse against that combined string,
and also run isTruncatedResponse against the most-recent stream chunk (e.g.
lastChunk or lastReceivedChunk variable if present) before calling
banPlatformFromSession(normalizedMessages, routingMode, route.platform,
route.modelDbId) so content-based truncation signals are detected even when the
error.message lacks keywords.
- Around line 1208-1219: The current logic only checks bans for preferredModel's
platform, allowing other already-banned platforms to be retried; update the
routing pre-check to also detect and skip any platform that the session is
banned from (not just the preferredModel platform) by calling
isSessionBannedFromPlatform for each relevant platform and adding those
platforms' model IDs into skipModels via addProviderModelsToSkipModels; use
normalizedMessages and routingMode as inputs to isSessionBannedFromPlatform,
keep the existing preferredModel/preferredKeyId clearing behavior when the
preferred platform is banned, and log each skipped platform for debugging.
---
Nitpick comments:
In `@server/src/__tests__/routes/provider-session-ban.test.ts`:
- Around line 196-205: Wrap the DB PRAGMA/BEGIN/ROLLBACK cleanup in a
try/finally so ROLLBACK and restoring PRAGMA foreign_keys = ON always run even
if assertions throw: perform PRAGMA foreign_keys = OFF and BEGIN, run the test
logic including calling addProviderModelsToSkipModels(skipModels, 'longcat') and
the expects inside try, and ensure the db.prepare('ROLLBACK').run() and
db.prepare('PRAGMA foreign_keys = ON').run() calls are executed in the finally
block to guarantee cleanup.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro Plus
Run ID: b8e7d3db-5a6b-4039-bba1-32015d65eb55
📒 Files selected for processing (8)
.roo/specs/longcat-session-ban/design.md.roo/specs/longcat-session-ban/requirements.md.roo/specs/longcat-session-ban/tasks.md.roo/specs/provider-5xx-session-ban/design.md.roo/specs/provider-5xx-session-ban/requirements.md.roo/specs/provider-5xx-session-ban/tasks.mdserver/src/__tests__/routes/provider-session-ban.test.tsserver/src/routes/proxy.ts
- Add isBanEligibleStatus() helper restricting to {500,502,503,504}
- Improve mid-stream truncation detection with aggregated error sources
- Pre-routing ban check now skips ALL banned platforms, not just preferredModel's
- Only clear preferredModel when provider is actually banned (not on first 5xx)
- Handle Error objects in isTruncatedResponse (instanceof check before JSON.stringify)
Summary
Generalizes the LongCat-specific session ban to all providers. When a sticky session receives 2 consecutive 5xx errors (500, 502, 503, 504) from the same provider, that provider is banned for the session. The retry loop falls back to the next best model. Ban lasts 30-min TTL.
Two Independent Ban Triggers
What Changed
server/src/routes/proxy.tsstickySessionMapwithconsecutiveFailures?: Map<string, number>recordConsecutiveFailure()— increments counter, bans at threshold 2resetConsecutiveFailures()/resetAllConsecutiveFailures()— reset on successaddLongcatModelsToSkipModels()with genericaddProviderModelsToSkipModels(skipModels, provider)getStickyKey()to checkbannedPlatformsfor any platformresetAllConsecutiveFailures()in both streaming and non-streaming success pathssetStickyModel()to preserveconsecutiveFailuresTests
longcat-session-ban.test.ts→provider-session-ban.test.tsValidation
npx tsc --noEmit— 0 errorsSummary by CodeRabbit
New Features
Documentation
Tests