
feat(proxy): generalize provider session ban to all providers via 5xx consecutive failures#4

Closed
vi70x3 wants to merge 5 commits into main from spec/longcat-session-ban


@vi70x3 vi70x3 commented Jun 1, 2026

Collaborator

Summary

Generalizes the LongCat-specific session ban to all providers. When a sticky session receives 2 consecutive 5xx errors (500, 502, 503, 504) from the same provider, that provider is banned for the session and the retry loop falls back to the next-best model. The ban lasts for a 30-minute TTL.

Two Independent Ban Triggers

  1. 5xx consecutive failure ban — 2 consecutive 5xx errors from the same provider → ban
  2. Truncation detection ban — truncated response from any provider → ban (generalized from LongCat-only)
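
To make the two triggers concrete, here is a minimal sketch of the bookkeeping they imply. It is not the code from proxy.ts: the names (`SessionState`, `recordFailure`, `recordTruncation`, `recordSuccess`, `isBanned`) and the string session key are hypothetical, and it assumes the 30-minute TTL is measured from the session's last use.

```typescript
// Hypothetical sketch; the real helpers in server/src/routes/proxy.ts take different arguments.
const BAN_THRESHOLD = 2;            // consecutive 5xx errors before a ban (from the PR summary)
const BAN_TTL_MS = 30 * 60 * 1000;  // 30-minute TTL (from the PR summary)

interface SessionState {
  lastUsed: number;
  bannedPlatforms: Set<string>;
  consecutiveFailures: Map<string, number>;
}

const sessions = new Map<string, SessionState>();

// Returns the live session entry, replacing it if the TTL has elapsed (assumption).
function getSession(key: string, now: number): SessionState {
  const existing = sessions.get(key);
  if (existing && now - existing.lastUsed <= BAN_TTL_MS) {
    existing.lastUsed = now;
    return existing;
  }
  const fresh: SessionState = {
    lastUsed: now,
    bannedPlatforms: new Set<string>(),
    consecutiveFailures: new Map<string, number>(),
  };
  sessions.set(key, fresh);
  return fresh;
}

// Trigger 1: count a 5xx failure; returns true when this call caused a ban.
function recordFailure(key: string, provider: string, now: number = Date.now()): boolean {
  const session = getSession(key, now);
  const count = (session.consecutiveFailures.get(provider) ?? 0) + 1;
  session.consecutiveFailures.set(provider, count);
  if (count >= BAN_THRESHOLD) {
    session.bannedPlatforms.add(provider);
    return true;
  }
  return false;
}

// Trigger 2: a truncated response bans the provider immediately.
function recordTruncation(key: string, provider: string, now: number = Date.now()): void {
  getSession(key, now).bannedPlatforms.add(provider);
}

// Any successful response resets every counter for the session.
function recordSuccess(key: string, now: number = Date.now()): void {
  getSession(key, now).consecutiveFailures.clear();
}

function isBanned(key: string, provider: string, now: number = Date.now()): boolean {
  const session = sessions.get(key);
  if (!session) return false;
  if (now - session.lastUsed > BAN_TTL_MS) {
    sessions.delete(key); // expired entries drop their bans
    return false;
  }
  return session.bannedPlatforms.has(provider);
}
```

In this sketch a success between two failures resets the count, so only back-to-back failures reach the threshold, which matches the "consecutive" wording above.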

What Changed

server/src/routes/proxy.ts

  • Extended stickySessionMap with consecutiveFailures?: Map<string, number>
  • Added recordConsecutiveFailure() — increments counter, bans at threshold 2
  • Added resetConsecutiveFailures() / resetAllConsecutiveFailures() — reset on success
  • Replaced addLongcatModelsToSkipModels() with generic addProviderModelsToSkipModels(skipModels, provider)
  • Replaced LongCat-specific auth/rate-limit error ban with general 5xx consecutive failure detection
  • Generalized truncation detection to all providers (post-stream + mid-stream)
  • Updated getStickyKey() to check bannedPlatforms for any platform
  • Updated pre-routing ban check to be generic
  • Added resetAllConsecutiveFailures() in both streaming and non-streaming success paths
  • Updated setStickyModel() to preserve consecutiveFailures
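
As an illustration of the last bullet and of the generic skip helper, the sketch below shows one way a sticky entry can be rewritten without losing ban state. The entry shape, `setSticky`, and `addProviderModels` are hypothetical stand-ins; in particular, the real `addProviderModelsToSkipModels(skipModels, provider)` is described as DB-driven, whereas this version takes the model list as a parameter so it can run on its own.

```typescript
// Hypothetical entry shape; field names other than bannedPlatforms and
// consecutiveFailures are assumptions for illustration.
interface StickyEntry {
  modelDbId: number;
  keyId?: number;
  lastUsed: number;
  bannedPlatforms?: Set<string>;
  consecutiveFailures?: Map<string, number>;
}

const stickyMap = new Map<string, StickyEntry>();

// Overwrites the sticky model but carries ban and counter state forward,
// so switching models does not silently lift a ban.
function setSticky(sessionKey: string, modelDbId: number, keyId?: number, now: number = Date.now()): void {
  const previous = stickyMap.get(sessionKey);
  stickyMap.set(sessionKey, {
    modelDbId,
    keyId,
    lastUsed: now,
    bannedPlatforms: previous?.bannedPlatforms,
    consecutiveFailures: previous?.consecutiveFailures,
  });
}

interface ModelRow {
  id: number;
  platform: string;
}

// Adds every model of one provider to the skip set. The model list is passed in
// here; the PR's helper reads it from the models table instead.
function addProviderModels(skipModels: Set<number>, provider: string, models: readonly ModelRow[]): void {
  for (const model of models) {
    if (model.platform === provider) skipModels.add(model.id);
  }
}
```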

Tests

  • Renamed longcat-session-ban.test.ts → provider-session-ban.test.ts
  • 32 test cases covering all new functions and integration scenarios

Validation

  • ✅ TypeScript: npx tsc --noEmit — 0 errors
  • ✅ Tests: 150/150 pass across 15 test files

Summary by CodeRabbit

  • New Features

    • Session-level provider bans: after repeated 5xx errors, truncation, or similar failures, a failing provider is skipped for the session and requests fall back to other models; streaming truncation is detected and handled gracefully.
  • Documentation

    • Added detailed design, requirements, and task specs describing ban/fallback behavior and lifecycle.
  • Tests

    • Added unit and integration tests for ban triggers, TTL/expiry, skip-model selection, consecutive-failure tracking, and truncation detection.

vi70x3 added 4 commits June 1, 2026 22:00
Implements the LongCat session ban feature in proxy.ts. When LongCat
detects multiple API key use (auth/rate-limit errors) or returns
truncated responses, the platform is banned from the sticky session.
Future requests in that session route to non-LongCat models.

Changes:
- Extend stickySessionMap with bannedPlatforms?: Set<string>
- Add isSessionBannedFromPlatform() to check session bans
- Add banPlatformFromSession() to record platform bans
- Add addLongcatModelsToSkipModels() to skip all LongCat models
- Add isTruncatedResponse() to detect truncation keywords
- Update getStickyKey() to return undefined for banned platforms
- Update setStickyModel() to preserve bannedPlatforms across updates
- Update pre-routing logic to check bans before routing
- Update error handling to ban LongCat on auth/rate-limit/truncation
- Add truncation detection after stream completes
- Add truncation detection in mid-stream error handling
… consecutive failures

- Extend stickySessionMap with consecutiveFailures tracking per provider
- Add recordConsecutiveFailure(), resetConsecutiveFailures(), resetAllConsecutiveFailures()
- Replace addLongcatModelsToSkipModels with generic addProviderModelsToSkipModels
- Replace LongCat-specific auth/rate-limit ban with general 5xx consecutive failure detection (threshold: 2)
- Generalize truncation detection to all providers (post-stream + mid-stream)
- Update getStickyKey() to check bannedPlatforms for any platform
- Update pre-routing ban check to be generic (any banned platform)
- Add success path counter reset on both streaming and non-streaming paths
- Remove LongCat-specific auth error ban, rate limit ban, and addLongcatModelsToSkipModels
- Rename and rewrite tests from longcat-session-ban to provider-session-ban (32 test cases)
- TypeScript compiles cleanly, all 150 tests pass
@coderabbitai

coderabbitai Bot commented Jun 1, 2026


No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro Plus

Run ID: 98de692d-0b25-450d-a988-3eaa220af4ab

📥 Commits

Reviewing files that changed from the base of the PR and between 7f5e0b2 and f212b00.

📒 Files selected for processing (1)
  • server/src/routes/proxy.ts
🚧 Files skipped from review as they are similar to previous changes (1)
  • server/src/routes/proxy.ts

📝 Walkthrough

Walkthrough

Adds LongCat-specific design docs and generalizes them to provider-agnostic session bans: sticky-session entries now track bannedPlatforms and per-provider consecutiveFailures; proxy adds helpers to ban/reset/skip provider models, detects truncation mid/post-stream, and updates retry logic to fallback via skipModels; tests cover helpers and lifecycle.

Changes

Provider Session Ban & Fallback Flow

Layer: LongCat session-ban specs
File(s): .roo/specs/longcat-session-ban/design.md, .roo/specs/longcat-session-ban/requirements.md, .roo/specs/longcat-session-ban/tasks.md
Summary: Design, requirements, and tasks for LongCat-specific sticky-session banning, ban helpers, truncation detection, pre-routing suppression, retry/error handling, post-/mid-stream truncation handling, and TTL/edge-case notes.

Layer: Provider 5xx session-ban specs
File(s): .roo/specs/provider-5xx-session-ban/design.md, .roo/specs/provider-5xx-session-ban/requirements.md, .roo/specs/provider-5xx-session-ban/tasks.md
Summary: Generalizes LongCat behavior to all providers: adds consecutiveFailures per-provider counters, defines record/reset helpers, provider-agnostic truncation detection, DB-driven provider model skipping, and success-path counter resets.

Layer: Proxy implementation
File(s): server/src/routes/proxy.ts
Summary: Extends stickySessionMap entries with bannedPlatforms?: Set<string> and consecutiveFailures?: Map<string, number>; adds and exports helpers (isSessionBannedFromPlatform, banPlatformFromSession, addProviderModelsToSkipModels, recordConsecutiveFailure, resetAllConsecutiveFailures, isTruncatedResponse); preserves ban/counter state on setStickyModel; expands skipModels and clears preferredModel/preferredKeyId when platform banned; records/resets counters on errors/success; detects truncation mid- and post-stream and bans platform accordingly.

Layer: Tests (unit + integration)
File(s): server/src/__tests__/routes/provider-session-ban.test.ts
Summary: Comprehensive Vitest coverage for all helpers and lifecycle scenarios: ban checks/creation/expiry, skip-model population, consecutive-5xx counting and ban threshold, per-provider/global resets, truncation detection formats, and integration flows validating fallback and expiry behavior.

Sequence Diagram

sequenceDiagram
  participant Client
  participant Proxy
  participant StickySessionMap
  participant ModelDB
  Client->>Proxy: POST /chat/completions
  Proxy->>StickySessionMap: getStickyModel(sessionId)
  Proxy->>ModelDB: Lookup sticky model platform
  alt Platform banned
    Proxy->>Proxy: addProviderModelsToSkipModels(provider)
    Proxy->>Proxy: clear preferredModel/preferredKeyId
  end
  Proxy->>Proxy: Route attempt (skipModels applied)
  alt 5xx Error
    Proxy->>StickySessionMap: recordConsecutiveFailure(provider)
    alt Threshold reached (2)
      Proxy->>StickySessionMap: banPlatformFromSession(provider)
      Proxy->>Proxy: Clear preferred sticky selections
    end
  else Truncation detected
    Proxy->>StickySessionMap: banPlatformFromSession(provider)
  else Success
    Proxy->>StickySessionMap: resetAllConsecutiveFailures(sessionId)
  end
  Proxy->>Client: Response
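
The flow in the sequence diagram can be approximated in a short retry loop. This is a synchronous sketch under stated assumptions, not the proxy's actual loop: `Candidate`, `statusOf`, and `routeWithFallback` are hypothetical names, the candidate list is assumed to be ordered best-first, and the handling of non-5xx errors (skip only the failing model) is a guess made to keep the example complete.

```typescript
interface Candidate {
  modelId: number;
  platform: string;
}

const BAN_ELIGIBLE_STATUSES = new Set<number>([500, 502, 503, 504]);
const FAILURE_THRESHOLD = 2;

// Reads a numeric `status` field from an unknown error value, if present.
function statusOf(err: unknown): number | undefined {
  if (typeof err === "object" && err !== null) {
    const status = (err as { status?: unknown }).status;
    if (typeof status === "number") return status;
  }
  return undefined;
}

function routeWithFallback(
  candidates: readonly Candidate[],
  banned: Set<string>,
  failures: Map<string, number>,
  call: (candidate: Candidate) => string,
  maxAttempts: number = 5,
): string {
  const skipModels = new Set<number>();
  const skipPlatform = (platform: string): void => {
    for (const c of candidates) {
      if (c.platform === platform) skipModels.add(c.modelId);
    }
  };
  // Pre-routing: skip every platform already banned for this session.
  banned.forEach((platform) => skipPlatform(platform));

  let lastErr: unknown = new Error("no candidate models available");
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    const next = candidates.find((c) => !skipModels.has(c.modelId));
    if (!next) break;
    try {
      const result = call(next);
      failures.clear(); // success resets all counters
      return result;
    } catch (err) {
      lastErr = err;
      const status = statusOf(err);
      if (status !== undefined && BAN_ELIGIBLE_STATUSES.has(status)) {
        const count = (failures.get(next.platform) ?? 0) + 1;
        failures.set(next.platform, count);
        if (count >= FAILURE_THRESHOLD) {
          banned.add(next.platform);
          skipPlatform(next.platform); // fall back to other providers
        }
      } else {
        skipModels.add(next.modelId); // assumption: other errors skip only this model
      }
    }
  }
  throw lastErr;
}
```

With candidates on platforms a, a, b and a provider a that always answers 503, this sketch tries the first model twice, bans a, and then succeeds on b.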

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

  • vi70x3/freellmapi#2: Touches sticky key/preferredKeyId selection in proxy.ts; related to sticky routing logic extended here with ban tracking.

Poem

🐰 I hopped through session maps today,

Marked platforms that went astray.
Two five-double-zeros, or truncated ends,
I skip those vendors and make new friends.
A carrot fallback, sorted in play.

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

  • Docstring Coverage: ⚠️ Warning. Explanation: Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. Resolution: Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (4 passed)

  • Description Check: ✅ Passed. Check skipped - CodeRabbit’s high-level summary is enabled.
  • Title check: ✅ Passed. The title accurately and concisely describes the main change: generalizing provider session bans to all providers using 5xx consecutive failures. It reflects the core feature change in the changeset.
  • Linked Issues check: ✅ Passed. Check skipped because no linked issues were found for this pull request.
  • Out of Scope Changes check: ✅ Passed. Check skipped because no linked issues were found for this pull request.


@gemini-code-assist gemini-code-assist Bot left a comment


Code Review

This pull request generalizes the session ban and fallback mechanism from being LongCat-specific to working for all providers. It introduces a general 5xx consecutive failure ban (banning a provider after 2 consecutive 5xx errors) and generalizes truncation detection to all providers. The review identified three key issues: clearing the preferred model/key on any 5xx error instead of waiting for the ban threshold, failing to add all banned platforms to skipModels at the start of a request (which could lead to routing to banned platforms during fallback), and a bug in isTruncatedResponse where JSON.stringify fails to extract messages from Error objects due to non-enumerable properties.

Important

The consumer version of Gemini Code Assist on GitHub is being sunset. Starting June 18, 2026, new organization installations will be blocked, and all code review activity will officially cease on July 17, 2026.
For more details on the timeline and next steps, please review the Help Documentation.

Comment on lines +1459 to +1471
const errStatus = getErrorStatus(err);
if (errStatus && errStatus >= 500 && errStatus < 600) {
  recordConsecutiveFailure(normalizedMessages, routingMode, route.platform, skipModels, route.modelDbId);
  // If this provider was just banned, clear preferredModel/preferredKeyId if they point to it
  if (preferredModel) {
    const db = getDb();
    const prefRow = db.prepare('SELECT platform FROM models WHERE id = ?').get(preferredModel) as { platform: string } | undefined;
    if (prefRow?.platform === route.platform) {
      preferredModel = undefined;
      preferredKeyId = undefined;
    }
  }
}


high

The current implementation clears preferredModel and preferredKeyId on any 5xx error (even the first one), rather than only when the provider is actually banned (after 2 consecutive failures). This prematurely clears the sticky session and prevents the retry loop from trying other keys/models of the same provider on the first failure, defeating the purpose of the consecutive failure threshold.

We should check if the provider is actually banned using isSessionBannedFromPlatform before clearing the preferred model and key.

Suggested change
 const errStatus = getErrorStatus(err);
 if (errStatus && errStatus >= 500 && errStatus < 600) {
   recordConsecutiveFailure(normalizedMessages, routingMode, route.platform, skipModels, route.modelDbId);
   // If this provider was just banned, clear preferredModel/preferredKeyId if they point to it
-  if (preferredModel) {
+  if (preferredModel && isSessionBannedFromPlatform(normalizedMessages, routingMode, route.platform)) {
     const db = getDb();
     const prefRow = db.prepare('SELECT platform FROM models WHERE id = ?').get(preferredModel) as { platform: string } | undefined;
     if (prefRow?.platform === route.platform) {
       preferredModel = undefined;
       preferredKeyId = undefined;
     }
   }
 }

Comment on lines +1210 to 1220
const skipModels = new Set<number>();
if (preferredModel) {
  const db = getDb();
  const prefRow = db.prepare('SELECT platform FROM models WHERE id = ?').get(preferredModel) as { platform: string } | undefined;
  if (prefRow && isSessionBannedFromPlatform(normalizedMessages, routingMode, prefRow.platform)) {
    addProviderModelsToSkipModels(skipModels, prefRow.platform);
    console.log(`[Sticky] skipping preferredModel=${preferredModel} (${prefRow.platform} banned for session)`);
    preferredModel = undefined;
    preferredKeyId = undefined;
  }
}


high

Currently, only the preferred model's platform is checked for bans and added to skipModels. If the preferred model is not banned, but other platforms are banned for this session, those other banned platforms are not added to skipModels at the start of the request. If the preferred model fails and fallback routing is triggered, the router could route the request to one of those previously banned platforms.

To prevent this, we should retrieve all banned platforms for the active session and add their models to skipModels at the very beginning of the request.

  const skipModels = new Set<number>();
  const sessionKey = getSessionKey(normalizedMessages, routingMode);
  if (sessionKey) {
    const entry = stickySessionMap.get(sessionKey);
    if (entry) {
      if (Date.now() - entry.lastUsed > STICKY_TTL_MS) {
        stickySessionMap.delete(sessionKey);
      } else if (entry.bannedPlatforms) {
        for (const platform of entry.bannedPlatforms) {
          addProviderModelsToSkipModels(skipModels, platform);
        }
      }
    }
  }

  if (preferredModel) {
    const db = getDb();
    const prefRow = db.prepare('SELECT platform FROM models WHERE id = ?').get(preferredModel) as { platform: string } | undefined;
    if (prefRow && isSessionBannedFromPlatform(normalizedMessages, routingMode, prefRow.platform)) {
      console.log(`[Sticky] skipping preferredModel=${preferredModel} (${prefRow.platform} banned for session)`);
      preferredModel = undefined;
      preferredKeyId = undefined;
    }
  }

Comment on lines +198 to +207
function isTruncatedResponse(errOrContent: any): boolean {
  if (!errOrContent) return false;
  let text: string;
  if (typeof errOrContent === 'string') {
    text = errOrContent;
  } else if (typeof errOrContent === 'object') {
    try { text = JSON.stringify(errOrContent); } catch { return false; }
  } else {
    return false;
  }


medium

In JavaScript/TypeScript, JSON.stringify(new Error('...')) returns "{}" because Error properties (like message) are non-enumerable. If an Error object is passed to isTruncatedResponse, the function will fail to detect any truncation keywords.

We should explicitly handle Error objects by checking errOrContent instanceof Error and extracting its message property.

Suggested change
 function isTruncatedResponse(errOrContent: any): boolean {
   if (!errOrContent) return false;
   let text: string;
   if (typeof errOrContent === 'string') {
     text = errOrContent;
+  } else if (errOrContent instanceof Error) {
+    text = errOrContent.message;
   } else if (typeof errOrContent === 'object') {
     try { text = JSON.stringify(errOrContent); } catch { return false; }
   } else {
     return false;
   }
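
The behavior behind this comment is easy to confirm in isolation. The snippet below is a standalone sketch: `extractText` is a hypothetical helper that mirrors only the text-extraction step of the suggestion, not the full isTruncatedResponse function.

```typescript
// Hypothetical helper mirroring the text-extraction step discussed above.
function extractText(value: unknown): string | undefined {
  if (typeof value === "string") return value;
  // Error fields such as `message` are non-enumerable, so read them explicitly.
  if (value instanceof Error) return value.message;
  if (typeof value === "object" && value !== null) {
    try {
      return JSON.stringify(value);
    } catch {
      return undefined; // e.g. circular structures
    }
  }
  return undefined;
}

const sampleError = new Error("response truncated");

// JSON.stringify only serializes enumerable own properties, so the message is lost.
const naive = JSON.stringify(sampleError); // "{}"

// Checking instanceof Error first recovers the message.
const extracted = extractText(sampleError); // "response truncated"
```

Ordering matters here: the `instanceof Error` branch has to come before the generic object branch, because an Error also satisfies `typeof value === "object"`.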

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 3

🧹 Nitpick comments (1)
server/src/__tests__/routes/provider-session-ban.test.ts (1)

196-205: ⚡ Quick win

Ensure DB/PRAGMA cleanup always executes in this test.

If an assertion throws before ROLLBACK/FK restore, later tests can inherit corrupted DB state.

Suggested hardening
-      db.prepare('PRAGMA foreign_keys = OFF').run();
-      db.prepare('BEGIN').run();
-      db.prepare("DELETE FROM api_keys WHERE platform = 'longcat'").run();
-      db.prepare("DELETE FROM models WHERE platform = 'longcat'").run();
-      const skipModels = new Set<number>();
-      expect(() => addProviderModelsToSkipModels(skipModels, 'longcat')).not.toThrow();
-      expect(skipModels.size).toBe(0);
-      db.prepare('ROLLBACK').run();
-      db.prepare('PRAGMA foreign_keys = ON').run();
+      db.prepare('PRAGMA foreign_keys = OFF').run();
+      db.prepare('BEGIN').run();
+      try {
+        db.prepare("DELETE FROM api_keys WHERE platform = 'longcat'").run();
+        db.prepare("DELETE FROM models WHERE platform = 'longcat'").run();
+        const skipModels = new Set<number>();
+        expect(() => addProviderModelsToSkipModels(skipModels, 'longcat')).not.toThrow();
+        expect(skipModels.size).toBe(0);
+      } finally {
+        db.prepare('ROLLBACK').run();
+        db.prepare('PRAGMA foreign_keys = ON').run();
+      }
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@server/src/__tests__/routes/provider-session-ban.test.ts` around lines 196 -
205, Wrap the DB PRAGMA/BEGIN/ROLLBACK cleanup in a try/finally so ROLLBACK and
restoring PRAGMA foreign_keys = ON always run even if assertions throw: perform
PRAGMA foreign_keys = OFF and BEGIN, run the test logic including calling
addProviderModelsToSkipModels(skipModels, 'longcat') and the expects inside try,
and ensure the db.prepare('ROLLBACK').run() and db.prepare('PRAGMA foreign_keys
= ON').run() calls are executed in the finally block to guarantee cleanup.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@server/src/routes/proxy.ts`:
- Around line 1367-1368: The current check treats any 5xx (500–599) as
ban-eligible but the spec only wants 500, 502, 503, 504; change the predicate to
a small centralized helper (e.g., isBanEligibleStatus(status)) and replace the
inline range checks with calls to that helper where
recordConsecutiveFailure(...) is invoked (references: recordConsecutiveFailure,
normalizedMessages, routingMode, route.platform, skipModels, route.modelDbId);
ensure both occurrences (the one around the current 1367 usage and the one near
1460) use the helper so the allowed statuses exactly match {500,502,503,504}.
- Around line 1371-1374: The truncation check currently only calls
isTruncatedResponse(streamErr.message), which misses provider truncation signals
embedded elsewhere; update the logic to aggregate the full error and recent
stream content before deciding to ban: construct a combined text from
streamErr.message plus any available streamErr.response.data / streamErr.body /
streamErr.toString() and run isTruncatedResponse against that combined string,
and also run isTruncatedResponse against the most-recent stream chunk (e.g.
lastChunk or lastReceivedChunk variable if present) before calling
banPlatformFromSession(normalizedMessages, routingMode, route.platform,
route.modelDbId) so content-based truncation signals are detected even when the
error.message lacks keywords.
- Around line 1208-1219: The current logic only checks bans for preferredModel's
platform, allowing other already-banned platforms to be retried; update the
routing pre-check to also detect and skip any platform that the session is
banned from (not just the preferredModel platform) by calling
isSessionBannedFromPlatform for each relevant platform and adding those
platforms' model IDs into skipModels via addProviderModelsToSkipModels; use
normalizedMessages and routingMode as inputs to isSessionBannedFromPlatform,
keep the existing preferredModel/preferredKeyId clearing behavior when the
preferred platform is banned, and log each skipped platform for debugging.

---

Nitpick comments:
In `@server/src/__tests__/routes/provider-session-ban.test.ts`:
- Around line 196-205: Wrap the DB PRAGMA/BEGIN/ROLLBACK cleanup in a
try/finally so ROLLBACK and restoring PRAGMA foreign_keys = ON always run even
if assertions throw: perform PRAGMA foreign_keys = OFF and BEGIN, run the test
logic including calling addProviderModelsToSkipModels(skipModels, 'longcat') and
the expects inside try, and ensure the db.prepare('ROLLBACK').run() and
db.prepare('PRAGMA foreign_keys = ON').run() calls are executed in the finally
block to guarantee cleanup.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro Plus

Run ID: b8e7d3db-5a6b-4039-bba1-32015d65eb55

📥 Commits

Reviewing files that changed from the base of the PR and between d9c5a73 and 7f5e0b2.

📒 Files selected for processing (8)
  • .roo/specs/longcat-session-ban/design.md
  • .roo/specs/longcat-session-ban/requirements.md
  • .roo/specs/longcat-session-ban/tasks.md
  • .roo/specs/provider-5xx-session-ban/design.md
  • .roo/specs/provider-5xx-session-ban/requirements.md
  • .roo/specs/provider-5xx-session-ban/tasks.md
  • server/src/__tests__/routes/provider-session-ban.test.ts
  • server/src/routes/proxy.ts

Comment thread server/src/routes/proxy.ts Outdated
Comment thread server/src/routes/proxy.ts Outdated
Comment thread server/src/routes/proxy.ts
- Add isBanEligibleStatus() helper restricting to {500,502,503,504}
- Improve mid-stream truncation detection with aggregated error sources
- Pre-routing ban check now skips ALL banned platforms, not just preferredModel's
- Only clear preferredModel when provider is actually banned (not on first 5xx)
- Handle Error objects in isTruncatedResponse (instanceof check before JSON.stringify)
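
For readers without access to the diff, the first two items in this commit message might look roughly like the sketch below. Only the name isBanEligibleStatus and the status set {500,502,503,504} come from the thread; `collectErrorText`, `looksTruncated`, the `body` field on the error, and the keyword list are hypothetical and may differ from what the commit does.

```typescript
// The allowed statuses are taken from the review comment; everything else is illustrative.
const BAN_ELIGIBLE = new Set<number>([500, 502, 503, 504]);

function isBanEligibleStatus(status: number | undefined): boolean {
  return status !== undefined && BAN_ELIGIBLE.has(status);
}

// Gathers every text source that might carry a truncation signal:
// the error message, an optional response body (assumed field name), and the last stream chunk.
function collectErrorText(err: unknown, lastChunk?: string): string {
  const parts: string[] = [];
  if (typeof err === "string") {
    parts.push(err);
  } else if (err instanceof Error) {
    parts.push(err.message);
  }
  if (typeof err === "object" && err !== null) {
    const body = (err as { body?: unknown }).body;
    if (typeof body === "string") {
      parts.push(body);
    } else if (body !== undefined) {
      try {
        const serialized = JSON.stringify(body);
        if (typeof serialized === "string") parts.push(serialized);
      } catch {
        // Unserializable bodies are ignored in this sketch.
      }
    }
  }
  if (lastChunk) parts.push(lastChunk);
  return parts.join("\n");
}

// Hypothetical keyword list; the PR's real truncation keywords are not shown in this thread.
const TRUNCATION_KEYWORDS = ["truncated"];

function looksTruncated(text: string): boolean {
  const lower = text.toLowerCase();
  return TRUNCATION_KEYWORDS.some((keyword) => lower.includes(keyword));
}
```

Centralizing the status check in one helper keeps the streaming and non-streaming error paths from drifting apart, which is the concern the review raised about the inline 500 to 599 range checks.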
@vi70x3 vi70x3 closed this Jun 1, 2026
@vi70x4 vi70x4 deleted the spec/longcat-session-ban branch June 7, 2026 00:30