feat: generalized thread protection scanner by vi70x3 · Pull Request #10 · animaios/api-llm-localhost

vi70x3 · 2026-06-02T12:45:10Z

Implements the generalized-thread-protection spec. Replaces hardcoded LongCat and Owl Alpha cooldown blocks with a dynamic thread protection scanner.

Summary by CodeRabbit

Release Notes

New Features
- Added automatic heartbeat and stall detection for streaming connections to prevent timeouts
- Enabled detection and handling of wrapped errors in HTTP 200 responses
- Implemented dynamic model exclusion during provider cooldown windows
Tests
- Added comprehensive test coverage for stream heartbeat and stall protection mechanisms
Documentation
- Added design specifications for thread protection and wrapped error handling

…alized thread protection scanner

coderabbitai · 2026-06-02T12:45:22Z

📝 Walkthrough

Walkthrough

This PR implements two independent resilience features and establishes thread-protection infrastructure: (1) wrapped-error detection for upstream payloads returned with HTTP 200 status codes, (2) SSE stream keepalive heartbeats and stall timeouts to prevent indefinite stream hangs, and (3) generalized thread-protection types and sticky-cooldown refactoring to replace hardcoded provider-specific branching.

Changes

Error Handling and Resilience

Layer / File(s)	Summary
Wrapped Error Interception: Contract and Base Provider `server/src/providers/base.ts`, `.roo/specs/wrapped-error-interception/*`	Design and requirements documented for detecting upstream error payloads in HTTP 200 responses. `BaseProvider` gains protected helpers `isWrappedError()` and `throwWrappedError()`, and `extractErrorMessage()` visibility changed from `private` to `protected` for code reuse.
Wrapped Error Interception: Provider Implementation `server/src/providers/openai-compat.ts`, `server/src/providers/cohere.ts`, `server/src/providers/cloudflare.ts`, `server/src/providers/google.ts`	Both non-streaming `chatCompletion()` and streaming `streamChatCompletion()` paths in all four providers add wrapped-error detection immediately after JSON parsing, throwing `ProviderApiError` to terminate streams on error detection.
Stream Keepalive and Stall Protection `server/src/routes/proxy.ts`, `server/src/__tests__/routes/stream-heartbeat-stall.test.ts`	Exports `streamKeepaliveConfig` and extends streaming retry loop with heartbeat intervals and stall detection: tracks last chunk timestamp, emits SSE keep-alive comments, aborts streams exceeding `MAX_STREAM_STALL_MS`, responds with error frames (in-stream) or HTTP 504 (pre-stream), and cleans up on completion or disconnect. Comprehensive test suite validates all scenarios.
Thread Protection Framework: Types and Integration `server/src/services/threadProtection.ts`, `server/src/routes/proxy.ts`, `.roo/specs/generalized-thread-protection/*`, `server/src/__tests__/routes/proxy-tools.test.ts`	Thread-protection service module introduces `ProtectionLevel`, `ErrorContext`, and `ThreadProtectionAction` types, with placeholder `evaluateThreadProtection()`. Proxy imports the function and replaces LongCat-specific sticky-cooldown logic with generic thread-protection mechanism. Test setup updated to clear transient model cooldowns.
Tooling: Streaming Proxy Script `fix_streaming.py`	Python script to locate and modify the streaming conditional block in `proxy.ts` by scanning for the `if (stream) {` marker.

Sequence Diagram(s)

sequenceDiagram
  participant Client
  participant ProxyRoute
  participant StreamSource
  ProxyRoute->>StreamSource: start streaming
  StreamSource-->>ProxyRoute: first chunk (after delay)
  ProxyRoute->>ProxyRoute: track lastChunkTimestamp
  activate ProxyRoute
  ProxyRoute->>ProxyRoute: start keepalive interval
  alt chunk arrives before stall threshold
    StreamSource-->>ProxyRoute: chunk
    ProxyRoute->>ProxyRoute: update lastChunkTimestamp
    ProxyRoute-->>Client: SSE data
  else stall exceeds MAX_STREAM_STALL_MS
    ProxyRoute->>StreamSource: abort
    alt headers already sent
      ProxyRoute-->>Client: SSE error frame + [DONE]
    else pre-stream stall
      ProxyRoute-->>Client: HTTP 504 + error payload
    end
  end
  deactivate ProxyRoute
  ProxyRoute->>ProxyRoute: clear keepalive interval

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~50 minutes

Possibly related PRs

vi70x3/freellmapi#9: Both PRs modify server/src/routes/proxy.ts to replace provider-level hardcoded branching with generalized logic for model exclusion and retry handling.
vi70x3/freellmapi#8: Directly related in removing hardcoded longcat branching from proxy.ts sticky-cooldown logic in favor of generalized thread-protection decision engine.
vi70x3/freellmapi#2: Overlaps with sticky-session retry and pinning behavior changes that the generalized thread-protection logic is now replacing.

Poem

🐰 Stream keepalive and wrapped errors caught,
Thread protection rises, provider branching fought,
No more hardcoded paths for just the one,
Resilience blooms—the heartbeat's begun! 💚

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (4 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The title 'feat: generalized thread protection scanner' directly and accurately describes the main feature being implemented—replacing hardcoded platform-specific cooldown logic with a generalized, provider-agnostic thread protection mechanism.
Linked Issues check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check	✅ Passed	Check skipped because no linked issues were found for this pull request.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

📝 Generate docstrings

Create stacked PR
Commit on current branch

🧪 Generate unit tests (beta)

Create PR with unit tests
Commit unit tests in branch spec/generalized-thread-protection

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

gemini-code-assist

Code Review

This pull request introduces a generalized thread protection scanner, wrapped error payload detection on HTTP 200 responses, and stream heartbeat/stall protection. While the wrapped error handling and heartbeat features are well-structured, several critical issues must be addressed. Specifically, the thread protection scanner is incomplete as the main evaluation function is a placeholder and the proxy integration is missing. Additionally, there is a test suite crash due to an invalid import in the proxy tests, a potential logic bug in the exhaustion protection check, a potential NaN status code issue in the base provider, and both the helper Python script and requirements document are truncated.

Important

The consumer version of Gemini Code Assist on GitHub is being sunset. Starting June 18, 2026, new organization installations will be blocked, and all code review activity will officially cease on July 17, 2026.
For more details on the timeline and next steps, please review the Help Documentation.

gemini-code-assist · 2026-06-02T12:46:54Z

+    // Dynamic import to get the same module instance used by the running app
+    const { transientModelCooldowns: cooldowns } = await import('../../routes/proxy.js');
+    (cooldowns as Map<any, any>).clear();


The import of transientModelCooldowns from ../../routes/proxy.js will fail because transientModelCooldowns is not defined or exported in server/src/routes/proxy.ts. This will throw a TypeError when attempting to call .clear() on it, causing the test suite to crash.

Suggested change

// Dynamic import to get the same module instance used by the running app

const { transientModelCooldowns: cooldowns } = await import('../../routes/proxy.js');

(cooldowns as Map<any, any>).clear();

(stickySessionMap as Map<any, any>).clear();

const db = getDb();

gemini-code-assist · 2026-06-02T12:46:54Z

+      // Check if cooldown would exhaust all models
+      let wouldExhaustAll = true;
+      for (const id of allEnabledIds) {
+        if (!activeCooldownModels.has(id)) {
+          wouldExhaustAll = false;
+          break;
+        }
+      }


The exhaustion protection check wouldExhaustAll only checks if activeCooldownModels contains all enabled models. It does not take into account the models that are already present in skipModels (e.g., models from banned platforms or previously failed attempts).

If skipModels already contains some models, adding activeCooldownModels to skipModels can exhaust all enabled models, causing the router to fail and return a 503/502 error to the user.

The check should verify if the union of activeCooldownModels and skipModels would exhaust all enabled models.

Suggested change

// Check if cooldown would exhaust all models

let wouldExhaustAll = true;

for (const id of allEnabledIds) {

if (!activeCooldownModels.has(id)) {

wouldExhaustAll = false;

break;

}

}

// Check if cooldown would exhaust all models

let wouldExhaustAll = true;

for (const id of allEnabledIds) {

if (!activeCooldownModels.has(id) && !skipModels.has(id)) {

wouldExhaustAll = false;

break;

}

}

gemini-code-assist · 2026-06-02T12:46:54Z

 import { Router } from 'express';
 import type { Request, Response } from 'express';
 import { z } from 'zod';
+import { evaluateThreadProtection } from '../services/threadProtection.js';


The evaluateThreadProtection function is imported here but never used in this file. The 6 hardcoded longcat blocks described in the design spec (.roo/specs/generalized-thread-protection/design.md) have not been replaced with calls to evaluateThreadProtection.

gemini-code-assist · 2026-06-02T12:46:54Z

+export function evaluateThreadProtection(_ctx: ErrorContext): ThreadProtectionAction {
+  // Placeholder implementation: no protection
+  return { banProvider: false, skipModel: false, clearStickyIfPinned: false, reason: 'off' };
+}


The evaluateThreadProtection function is currently just a placeholder returning a static action with no protection. The actual scanner logic and decision matrix described in the design spec have not been implemented.

gemini-code-assist · 2026-06-02T12:46:54Z

+# Find the if(stream) line
+stream_start = None
+for i, line in{


The Python script is truncated and contains a syntax error at line 15 (for i, line in{). This will cause the script to fail with a SyntaxError if executed.

gemini-code-assist · 2026-06-02T12:46:54Z

+    error.status =
+      typeof errPayload === 'object' && errPayload !== null && 'code' in (errPayload as Record<string, unknown>)
+        ? Number((errPayload as Record<string, unknown>).code)
+        : 200;


If errPayload.code is a non-numeric string (e.g., "rate_limit_error"), calling Number(...) on it will return NaN. This can cause unexpected behavior in downstream error helpers like isRateLimitError or isRetryableError which expect a valid numeric status code.

Consider checking if the parsed number is a valid number, and default to 200 if it is NaN.

Suggested change

error.status =

typeof errPayload === 'object' && errPayload !== null && 'code' in (errPayload as Record<string, unknown>)

? Number((errPayload as Record<string, unknown>).code)

: 200;

const parsedCode = typeof errPayload === 'object' && errPayload !== null && 'code' in (errPayload as Record<string, unknown>)

? Number((errPayload as Record<string, unknown>).code)

: 200;

error.status = isNaN(parsedCode) ? 200 : parsedCode;

gemini-code-assist · 2026-06-02T12:46:54Z

+
+## Problem Statement
+
+The proxy route handler (`server/src/routes/proxy.ts`) contains 6+ hardcoded branches that special-case the `longcat`{


The markdown file is truncated and ends abruptly with longcat{`. Please complete the requirements specification.

kilo-code-bot · 2026-06-02T12:49:44Z

+                  });
+                } else {
+                  res.write(`data: ${JSON.stringify(payload)}\n\n`);
+                  res.write('data: [DONE]\n\n');;


SUGGESTION: Double semicolon on this line - res.write('data: [DONE]\n\n');; should be res.write('data: [DONE]\n\n');

kilo-code-bot · 2026-06-02T12:51:16Z

Code Review Summary

Status: 2 Issues Found | Recommendation: Address before merge

Overview

Severity	Count
CRITICAL	0
WARNING	0
SUGGESTION	1

Issue Details (click to expand)

SUGGESTION

File	Line	Issue
`server/src/routes/proxy.ts`	1347	Double semicolon typo - `;;` should be `;`

Other Observations (not in diff)

Issues found in unchanged code that cannot receive inline comments:

File	Line	Issue
`server/src/routes/proxy.ts`	5	Imported `evaluateThreadProtection` from `threadProtection.js` is never used in this file
`server/src/services/threadProtection.ts`	20-22	Placeholder function `evaluateThreadProtection` exists but is never called; consider removing if unused

Files Reviewed (15 files)

server/src/providers/base.ts - No issues
server/src/providers/openai-compat.ts - No issues
server/src/providers/cohere.ts - No issues
server/src/providers/cloudflare.ts - No issues
server/src/providers/google.ts - No issues
server/src/routes/proxy.ts - 2 issues
server/src/services/threadProtection.ts - No issues (placeholder)
server/src/__tests__/routes/proxy-tools.test.ts - No issues
server/src/__tests__/routes/stream-heartbeat-stall.test.ts - No issues
.roo/specs/wrapped-error-interception/design.md - No issues (spec)
.roo/specs/wrapped-error-interception/requirements.md - No issues (spec)
.roo/specs/wrapped-error-interception/tasks.md - No issues (spec)
.roo/specs/generalized-thread-protection/design.md - No issues (spec)
.roo/specs/generalized-thread-protection/requirements.md - No issues (spec)
fix_streaming.py - No issues (incomplete script)

_{Reviewed by laguna-m.1-20260312:free · 1,266,531 tokens}

coderabbitai

Actionable comments posted: 13

🧹 Nitpick comments (1)

.roo/specs/generalized-thread-protection/design.md (1)
43-45: 💤 Low value

Add language identifier to code fence.

The code fence at line 43 is missing a language identifier, which improves syntax highlighting and accessibility.
📝 Proposed fix
-```
+```bash
 THREAD_PROTECTION_PLATFORMS="longcat:provider-ban,groq:model-skip"
</details>

As per coding guidelines: markdownlint-cli2 flagged missing language specifier.

<details>
<summary>🤖 Prompt for AI Agents</summary>
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In @.roo/specs/generalized-thread-protection/design.md around lines 43 - 45, The
code fence containing
THREAD_PROTECTION_PLATFORMS="longcat:provider-ban,groq:model-skip" lacks a
language identifier; update the fenced block that contains the
THREAD_PROTECTION_PLATFORMS assignment to include a language tag (e.g., bash) so
the fence becomes bash ... , ensuring the block around the
THREAD_PROTECTION_PLATFORMS string is modified accordingly to satisfy
markdownlint-cli2.
</details>

</blockquote></details>

</blockquote></details>

<details>
<summary>🤖 Prompt for all review comments with AI agents</summary>
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In @.roo/specs/generalized-thread-protection/requirements.md:

Around line 1-5: The requirements document ends mid-sentence ("special-case
the longcat{") and must be completed or the fragment removed; update the
Requirements: Generalized Thread Protection Scanner doc to finish the Problem
Statement by describing that the proxy route handler has 6+ hardcoded branches
that special-case longcat, then add clear objectives, scope, and acceptance
criteria for the generalized scanner (use the design.md as the source to draft
the missing text), or delete the unfinished sentence if unintentional—ensure
mention of the proxy route handler and the longcat special-case so reviewers
can trace the requirement to the implementation.

In @.roo/specs/wrapped-error-interception/design.md:

Around line 108-120: The try/catch around JSON.parse currently swallows any
synchronous throws inside it (including ProviderApiError thrown by
throwWrappedError), so in openai-compat.ts, cohere.ts and cloudflare.ts move the
wrapped-error detection and throw out of the parse try block: only JSON.parse
and parsing-related validations remain inside try/catch (to skip malformed
chunks), then after successful parse call this.isWrappedError(parsed) and, if
true, call this.throwWrappedError(parsed) so ProviderApiError propagates; mirror
the safe pattern used in google.ts and ensure generator yield only occurs after
the wrapped-error check.

In @fix_streaming.py:

Around line 13-15: The file fix_streaming.py currently contains an incomplete
for loop ("for i, line in{") causing a SyntaxError; either finish the script or
remove it. To fix, complete the iteration over the source lines (replace the
malformed "for i, line in{" with a proper iterable, e.g., enumerate(lines) or
similar), implement the logic that searches for the "if(stream)" marker and
sets/uses the stream_start variable, and ensure the loop and file end with valid
Python syntax (close blocks and handle cases where "if(stream)" is not found).
Alternatively, if this file is not needed, delete it to avoid introducing a
broken file into the repo.

In @server/src/__tests__/routes/proxy-tools.test.ts:

Around line 813-817: The test fails because transientModelCooldowns is not
exported from the proxy router; either export the actual cooldown Map from the
router module (e.g., add and export a Map named transientModelCooldowns
alongside stickySessionMap and the set/clear helpers in proxy.ts) so the test
can import and call .clear(), or add a dedicated exported helper (e.g.,
clearTransientModelCooldowns) in proxy.ts and update the test to call that
helper instead; ensure the exported symbol is the same runtime instance the
router uses so clearing it affects the router state.

In @server/src/providers/base.ts:

Around line 135-149: The wrapped-error path in throwWrappedError uses
extractErrorMessage which currently ignores string-valued error payloads (Format
2), causing the fallback message to be used; update extractErrorMessage to
detect when body.error (or the extracted err.error) is a string and return that
string as the message (instead of attempting to read .message), then leave
throwWrappedError as-is so ProviderApiError contains the actual string message
for formats like {"error": "Rate limit exceeded"}; reference the
extractErrorMessage function and the throwWrappedError method to locate and
change the logic.

Around line 142-145: The wrapped-error handling should guard against
non-numeric codes and string error payloads: in the block that sets
error.status (the code that currently does Number((errPayload as
Record<string, unknown>).code)), validate that errPayload.code exists and is a
finite numeric value before using it, otherwise fall back to 200 (or undefined)
so proxy.ts’s getErrorStatus() and isRateLimitError() work correctly; also
update extractErrorMessage() to handle body.error being a string (use the
string directly) as well as the object form error.message so message-based
rate-limit detection sees the real text. Ensure you reference and fix the error
assignment logic in the provider (error.status setting) and the
extractErrorMessage routine to cover both string and object error shapes.

In @server/src/providers/cloudflare.ts:

Around line 123-131: The parsing try/catch currently wraps both JSON.parse and
the wrapped-error check so throwWrappedError(parsed) can be swallowed; refactor
so JSON.parse(data) is the only operation inside the try/catch (catch continues
to skip malformed chunks), then after a successful parse check
this.isWrappedError(parsed) and call this.throwWrappedError(parsed) outside the
catch block so the thrown stream error is not suppressed.

In @server/src/providers/cohere.ts:

Around line 114-122: The JSON.parse try/catch is currently swallowing
exceptions thrown by throwWrappedError; separate parsing from the wrapped-error
check by limiting the try/catch to only JSON.parse (in the block that
returns/continues on parse failure), then after a successful parse call
this.isWrappedError(parsed) and, if true, call this.throwWrappedError(parsed)
outside the catch so the thrown stream error propagates; only yield parsed when
it passed the wrapped-error check. Ensure you update the logic around
ChatCompletionChunk parsing, isWrappedError, throwWrappedError, and the
generator yield accordingly.

In @server/src/providers/openai-compat.ts:

Around line 130-138: The current try/catch around JSON.parse swallows
exceptions thrown by throwWrappedError(), so wrapped ProviderApiError never
propagates; refactor the block in openai-compat.ts that parses
ChatCompletionChunk so JSON parsing and wrapped-error handling are separated:
first attempt JSON.parse(data) inside a try that only catches/parses malformed
JSON and yields/continues on parse failure, but after a successful parse call
this.isWrappedError(parsed) and, if true, call this.throwWrappedError(parsed)
outside the parse catch so the error bubbles; reference the parsed variable,
isWrappedError, and throwWrappedError to locate the code to change.

In @server/src/routes/proxy.ts:

Line 5: The file imports evaluateThreadProtection but still uses manual
longcat/cooldown branches; replace the thread protection block (and the other 6
hardcoded longcat branches) by calling evaluateThreadProtection({ platform:
prefRow?.platform ?? '', kind: 'retryable', midStream: false, modelDbId:
preferredModel }) and then apply its directives: if action.banProvider &&
prefRow?.platform call addProviderModelsToSkipModels(skipModels,
prefRow.platform) and log the reason, and if action.skipModel do
skipModels.add(preferredModel); update the block around the existing
skipModels/preferredModel/prefRow logic to use this pattern and remove the
manual cooldown/longcat branching.

Around line 1207-1260: Replace the in-place cooldown loop with a call to the
shared scanner: remove the manual logic that iterates stickySessionMap and
computes activeCooldownModels (the entire Thread protection block) and instead
call evaluateThreadProtection(...) to populate skipModels; pass in
normalizedMessages and routingMode (for getSessionKey behavior), the
stickySessionMap and any TTL/COOLDOWN constants (STICKY_TTL_MS,
THREAD_COOLDOWN_MS), and the current preferredModel so the scanner can apply
exhaustion protection; ensure evaluateThreadProtection returns or mutates
skipModels (or adapt its return to be merged into skipModels) and remove the
console logs in favor of the scanner’s logging.

Line 1347: Remove the stray extra semicolon after the response write call:
locate the res.write('data: [DONE]\n\n');; statement (the res.write call) and
delete the duplicate semicolon so it ends with a single semicolon; ensure no
other trailing punctuation was accidentally removed.

In @server/src/services/threadProtection.ts:

Around line 20-23: The evaluateThreadProtection function is a placeholder;
implement the scanner to read THREAD_PROTECTION_PLATFORMS (or fall back to a
default map) and determine the ProtectionLevel for ctx.platform, then return
actions per the decision matrix: 'off' => {banProvider:false, skipModel:false,
clearStickyIfPinned:false, reason:'off'}, 'provider-ban' => {banProvider:true,
skipModel:false, clearStickyIfPinned:true, reason:${ctx.platform} ${ctx.kind} error — provider-ban level}, and 'model-skip' => {banProvider:false,
skipModel:true, clearStickyIfPinned:false, reason:${ctx.platform} ${ctx.kind} error — model-skip level}; add a helper getProtectionLevel(platform: string) to
parse the env var (e.g. JSON or comma map) and return the configured level or
defaults (default: longcat → provider-ban, others → model-skip).

Nitpick comments:
In @.roo/specs/generalized-thread-protection/design.md:

Around line 43-45: The code fence containing
THREAD_PROTECTION_PLATFORMS="longcat:provider-ban,groq:model-skip" lacks a
language identifier; update the fenced block that contains the
THREAD_PROTECTION_PLATFORMS assignment to include a language tag (e.g., bash) so
the fence becomes bash ... , ensuring the block around the
THREAD_PROTECTION_PLATFORMS string is modified accordingly to satisfy
markdownlint-cli2.
</details>

<details>
<summary>🪄 Autofix (Beta)</summary>

Fix all unresolved CodeRabbit comments on this PR:

- [ ]  Push a commit to this branch (recommended)
- [ ]  Create a new PR with the fixes

</details>

---

<details>
<summary>ℹ️ Review info</summary>

<details>
<summary>⚙️ Run configuration</summary>

**Configuration used**: Organization UI

**Review profile**: CHILL

**Plan**: Pro Plus

**Run ID**: `15fc3aaa-cc25-4f42-b85e-2b2da7d3a8c0`

</details>

<details>
<summary>📥 Commits</summary>

Reviewing files that changed from the base of the PR and between 233e03142895cd8c18d441461d5c9568186c7123 and 09bffb08a0cfa590a4de3513bf06af00e58be60b.

</details>

<details>
<summary>📒 Files selected for processing (15)</summary>

* `.roo/specs/generalized-thread-protection/design.md`
* `.roo/specs/generalized-thread-protection/requirements.md`
* `.roo/specs/wrapped-error-interception/design.md`
* `.roo/specs/wrapped-error-interception/requirements.md`
* `.roo/specs/wrapped-error-interception/tasks.md`
* `fix_streaming.py`
* `server/src/__tests__/routes/proxy-tools.test.ts`
* `server/src/__tests__/routes/stream-heartbeat-stall.test.ts`
* `server/src/providers/base.ts`
* `server/src/providers/cloudflare.ts`
* `server/src/providers/cohere.ts`
* `server/src/providers/google.ts`
* `server/src/providers/openai-compat.ts`
* `server/src/routes/proxy.ts`
* `server/src/services/threadProtection.ts`

</details>

</details>

coderabbitai · 2026-06-02T12:54:13Z

+# Requirements: Generalized Thread Protection Scanner
+
+## Problem Statement
+
+The proxy route handler (`server/src/routes/proxy.ts`) contains 6+ hardcoded branches that special-case the `longcat`{


⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Requirements document appears incomplete.

The file ends mid-sentence at line 5 with "special-case the longcat{". This should be completed or removed if it's unintentional.

Would you like me to help complete this requirements document based on the design.md file?

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In @.roo/specs/generalized-thread-protection/requirements.md around lines 1 - 5, The requirements document ends mid-sentence ("special-case the `longcat`{") and must be completed or the fragment removed; update the Requirements: Generalized Thread Protection Scanner doc to finish the Problem Statement by describing that the proxy route handler has 6+ hardcoded branches that special-case `longcat`, then add clear objectives, scope, and acceptance criteria for the generalized scanner (use the design.md as the source to draft the missing text), or delete the unfinished sentence if unintentional—ensure mention of the proxy route handler and the `longcat` special-case so reviewers can trace the requirement to the implementation.

coderabbitai · 2026-06-02T12:54:13Z

+```typescript
+try {
+  const parsed = JSON.parse(data) as ChatCompletionChunk;
+  if (this.isWrappedError(parsed)) {
+    this.throwWrappedError(parsed);
+  }
+  yield parsed;
+} catch {
+  // Skip malformed chunks
+}
+```
+
+**Note**: The `catch` block already skips malformed chunks. The `throwWrappedError()` call throws before `yield`, so the generator terminates immediately. The `try/catch` around `JSON.parse` does NOT catch the `ProviderApiError` thrown by `throwWrappedError()` because that throw happens after successful parsing — it propagates out of the generator to the consumer in `proxy.ts`.


⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Incorrect claim: the catch block does catch throwWrappedError().

The note states the try/catch around JSON.parse does NOT catch the ProviderApiError because the throw happens after successful parsing. That is wrong — a try/catch catches every synchronous throw inside the try body regardless of where it occurs. As written in openai-compat.ts, cohere.ts, and cloudflare.ts, throwWrappedError() is called inside the try, so the empty catch {} swallows it and the wrapped error is silently skipped. The check must be moved outside the parse try/catch (as already done in google.ts). See the corresponding code comments.

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In @.roo/specs/wrapped-error-interception/design.md around lines 108 - 120, The try/catch around JSON.parse currently swallows any synchronous throws inside it (including ProviderApiError thrown by throwWrappedError), so in openai-compat.ts, cohere.ts and cloudflare.ts move the wrapped-error detection and throw out of the parse try block: only JSON.parse and parsing-related validations remain inside try/catch (to skip malformed chunks), then after successful parse call this.isWrappedError(parsed) and, if true, call this.throwWrappedError(parsed) so ProviderApiError propagates; mirror the safe pattern used in google.ts and ensure generator yield only occurs after the wrapped-error check.

coderabbitai · 2026-06-02T12:54:13Z

+# Find the if(stream) line
+stream_start = None
+for i, line in{


⚠️ Potential issue | 🔴 Critical | ⚡ Quick win

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash python - <<'PY' from pathlib import Path import ast path = Path("fix_streaming.py") source = path.read_text() try: ast.parse(source, filename=str(path)) print("OK: syntax is valid") except SyntaxError as e: print(f"SyntaxError: {e.msg} at line {e.lineno}, column {e.offset}") print(source.splitlines()[e.lineno - 1]) PY

Repository: vi70x3/freellmapi

Length of output: 133

🏁 Script executed:

#!/bin/bash set -euo pipefail echo "---- wc -l ----" wc -l fix_streaming.py echo "---- sed -n '8,20p' with numbers ----" nl -ba fix_streaming.py | sed -n '8,20p' echo "---- tail (last 5 lines) ----" tail -n 5 fix_streaming.py | nl -ba

Repository: vi70x3/freellmapi

Length of output: 196

🏁 Script executed:

#!/bin/bash set -euo pipefail echo "---- cat -n (entire file) ----" cat -n fix_streaming.py echo "---- python: total lines + last line repr + newline presence ----" python - <<'PY' from pathlib import Path p = Path("fix_streaming.py") s = p.read_text() print("len(text) =", len(s)) print("endswith_newline =", s.endswith("\n")) lines = s.splitlines(True) # keepends print("line_count =", len(lines)) for i, line in enumerate(lines, start=1): if i >= len(lines)-3: print(f"line {i} repr: {line!r}") PY

Repository: vi70x3/freellmapi

Length of output: 859

🏁 Script executed:

python - <<'PY' from pathlib import Path import ast p = Path("fix_streaming.py") ast.parse(p.read_text(), filename=str(p)) print("OK") PY

Repository: vi70x3/freellmapi

Length of output: 398

Critical: fix_streaming.py is incomplete/invalid Python (ends mid-for loop).

fix_streaming.py stops at line 15 (for i, line in{), causing SyntaxError: '{' was never closed; finish the script so it fully parses (or remove the partial file) before merging.

🧰 Tools

🪛 Ruff (0.15.15)

[warning] 15-15: unexpected EOF while parsing

(invalid-syntax)

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@fix_streaming.py` around lines 13 - 15, The file fix_streaming.py currently contains an incomplete for loop ("for i, line in{") causing a SyntaxError; either finish the script or remove it. To fix, complete the iteration over the source lines (replace the malformed "for i, line in{" with a proper iterable, e.g., enumerate(lines) or similar), implement the logic that searches for the "if(stream)" marker and sets/uses the stream_start variable, and ensure the loop and file end with valid Python syntax (close blocks and handle cases where "if(stream)" is not found). Alternatively, if this file is not needed, delete it to avoid introducing a broken file into the repo.

coderabbitai · 2026-06-02T12:54:13Z

+  beforeEach(async () => {
    (stickySessionMap as Map<any, any>).clear();
+    // Dynamic import to get the same module instance used by the running app
+    const { transientModelCooldowns: cooldowns } = await import('../../routes/proxy.js');
+    (cooldowns as Map<any, any>).clear();


⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash # Search for transientModelCooldowns export in proxy.ts rg -n 'export.*transientModelCooldowns|transientModelCooldowns.*=' server/src/routes/proxy.ts

Repository: vi70x3/freellmapi

Length of output: 43

🏁 Script executed:

#!/bin/bash set -euo pipefail # 1) Confirm whether the symbol exists anywhere (definition/usage) rg -n "transientModelCooldowns" server/src/routes/proxy.ts server/src/routes || true # 2) Broader search in repo for any export or re-export patterns rg -n "transientModelCooldowns" server/src || true # 3) Check for export-block patterns that might not match the previous regex rg -n "export\s*\{[^}]*transientModelCooldowns|export\s+default\s+.*transientModelCooldowns|transientModelCooldowns\s*:" server/src/routes/proxy.ts || true # 4) Print the most relevant sections around where it is defined/exported (if found) # If rg above finds nothing, this will help locate any similarly named cooldown maps. if rg -n "transientModelCooldowns" server/src/routes/proxy.ts >/dev/null 2>&1; then python3 - <<'PY' import re, subprocess, os, textwrap, sys, pathlib, json path="server/src/routes/proxy.ts" out=subprocess.check_output(["rg","-n","transientModelCooldowns",path]).decode() lines=out.strip().splitlines() # Collect line numbers lns=[] for l in lines: m=re.match(r"(\d+):", l) if m: lns.append(int(m.group(1))) # Expand contexts for ln in sorted(set(lns)): start=max(1, ln-20) end=ln+20 subprocess.run(["sed","-n",f"{start},{end}p",path],check=False) PY fi

Repository: vi70x3/freellmapi

Length of output: 203

🏁 Script executed:

#!/bin/bash set -euo pipefail rg -n "cooldown" server/src/routes/proxy.ts || true rg -n "^export\s" server/src/routes/proxy.ts || true rg -n "export\s*\{" server/src/routes/proxy.ts || true rg -n "transient" server/src/routes/proxy.ts || true

Repository: vi70x3/freellmapi

Length of output: 684

🏁 Script executed:

#!/bin/bash set -euo pipefail echo "== occurrences of transientModelCooldowns in server/src ==" rg -n "transientModelCooldowns" server/src || true echo "== cooldown-related identifiers in proxy.ts ==" rg -n "(Cooldowns|cooldowns|cooldownExclusions|cooldownWindow|cooldown.*Map|ModelCooldown)" server/src/routes/proxy.ts || true echo "== export statements in proxy.ts ==" rg -n "^export (const|function|class|type)" server/src/routes/proxy.ts || true rg -n "^export \{" server/src/routes/proxy.ts || true rg -n "^export default" server/src/routes/proxy.ts || true echo "== show around top-level export block (around line 1-260) ==" sed -n '1,260p' server/src/routes/proxy.ts echo "== show around cooldown-related logic (around 1180-1320) ==" sed -n '1160,1320p' server/src/routes/proxy.ts

Repository: vi70x3/freellmapi

Length of output: 17075

🏁 Script executed:

#!/bin/bash set -euo pipefail echo "== list server/src/routes ==" ls -la server/src/routes || true echo "== list any proxy.js/proxy.ts in server/src/routes ==" fd -a "proxy\.js$|proxy\.ts$" server/src/routes || true echo "== search entire repo for transientModelCooldowns ==" rg -n "transientModelCooldowns" . || true echo "== locate ratelimit service files ==" fd -a "ratelimit\.ts$|ratelimit\.js$" server/src/services || true echo "== search ratelimit for transientModelCooldowns ==" rg -n "transientModelCooldowns" server/src/services || true echo "== show ratelimit exports around suspected cooldown maps ==" # Print a small window around any exports from ratelimit rg -n "^export" server/src/services/ratelimit.* 2>/dev/null || true

Repository: vi70x3/freellmapi

Length of output: 1619

🏁 Script executed:

#!/bin/bash set -euo pipefail echo "== proxy-tools.test.ts around the transientModelCooldowns import ==" sed -n '760,860p' server/src/__tests__/routes/proxy-tools.test.ts echo "== ratelimit.ts contents (focused on cooldown implementation) ==" sed -n '1,220p' server/src/services/ratelimit.ts echo "== find any exported cooldown Maps/objects (broader) ==" rg -n "cooldown|cooldowns|Cooldown|transient" server/src/services/ratelimit.ts server/src/routes/proxy.ts server/src/__tests__/routes/proxy-tools.test.ts || true

Repository: vi70x3/freellmapi

Length of output: 17048

🏁 Script executed:

#!/bin/bash set -euo pipefail sed -n '150,220p' server/src/routes/proxy.ts

Repository: vi70x3/freellmapi

Length of output: 2784

Fix: transientModelCooldowns is not exported from server/src/routes/proxy.ts.

The test imports { transientModelCooldowns } and immediately calls (cooldowns as Map<any, any>).clear(), but server/src/routes/proxy.ts only exports the sticky/session helpers (stickySessionMap, setStickyModel, clearStickyModel, etc.)—there’s no transientModelCooldowns export—so cooldowns would be undefined and the test will fail at runtime. Update proxy.ts to export/reset the intended cooldown Map, or change the test to clear the actual cooldown state used by the router.

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@server/src/__tests__/routes/proxy-tools.test.ts` around lines 813 - 817, The test fails because transientModelCooldowns is not exported from the proxy router; either export the actual cooldown Map from the router module (e.g., add and export a Map named transientModelCooldowns alongside stickySessionMap and the set/clear helpers in proxy.ts) so the test can import and call .clear(), or add a dedicated exported helper (e.g., clearTransientModelCooldowns) in proxy.ts and update the test to call that helper instead; ensure the exported symbol is the same runtime instance the router uses so clearing it affects the router state.

coderabbitai · 2026-06-02T12:54:13Z

+  protected throwWrappedError(body: unknown): void {
+    const obj = body as Record<string, unknown>;
+    const errPayload = obj.error;
+    const message = this.extractErrorMessage(body, 'Unknown wrapped error');
+    const error = new Error(
+      `${this.name} API error (wrapped in 200): ${message}`,
+    ) as ProviderApiError;
+    error.status =
+      typeof errPayload === 'object' && errPayload !== null && 'code' in (errPayload as Record<string, unknown>)
+        ? Number((errPayload as Record<string, unknown>).code)
+        : 200;
+    error.provider = this.name;
+    error.responseBody = body;
+    throw error;
+  }


⚠️ Potential issue | 🟠 Major | ⚡ Quick win

String-form wrapped error loses its message.

For the documented Format 2 ({"error": "Rate limit exceeded"}), extractErrorMessage(body, ...) returns the fallback: err.error is a string, so err.error?.message is undefined, falling through to 'Unknown wrapped error'. This violates FR-4 ("when the value is a string, the string itself must be used") and contradicts design Format 2. Handle string-valued error in extractErrorMessage:

🐛 Extract string-valued error messages

protected extractErrorMessage(body: unknown, fallback: string): string { if (typeof body === 'string') return body || fallback; if (!body || typeof body !== 'object') return fallback; - const err = body as { error?: { message?: string }; errors?: Array<{ message?: string }>; message?: string }; + const err = body as { error?: string | { message?: string }; errors?: Array<{ message?: string }>; message?: string }; + if (typeof err.error === 'string') return err.error || fallback; return err.error?.message ?? err.errors?.[0]?.message ?? err.message ?? fallback; }

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@server/src/providers/base.ts` around lines 135 - 149, The wrapped-error path in throwWrappedError uses extractErrorMessage which currently ignores string-valued error payloads (Format 2), causing the fallback message to be used; update extractErrorMessage to detect when body.error (or the extracted err.error) is a string and return that string as the message (instead of attempting to read .message), then leave throwWrappedError as-is so ProviderApiError contains the actual string message for formats like {"error": "Rate limit exceeded"}; reference the extractErrorMessage function and the throwWrappedError method to locate and change the logic.

coderabbitai · 2026-06-02T12:54:14Z

        try {
-          yield JSON.parse(data) as ChatCompletionChunk;
+          const parsed = JSON.parse(data) as ChatCompletionChunk;
+          if (this.isWrappedError(parsed)) {
+            this.throwWrappedError(parsed);
+          }
+          yield parsed;
        } catch {
          // Skip malformed chunks
        }


⚠️ Potential issue | 🔴 Critical | ⚡ Quick win

Critical: wrapped stream errors are silently swallowed.

throwWrappedError() throws inside the try, and the catch {} (intended only for malformed JSON) catches and discards it. The ProviderApiError never propagates, so wrapped errors in streaming are skipped instead of aborting the stream (violates FR-6). Move the check outside the parse try/catch, mirroring google.ts.

🐛 Separate parsing from the wrapped-error check

- try { - const parsed = JSON.parse(data) as ChatCompletionChunk; - if (this.isWrappedError(parsed)) { - this.throwWrappedError(parsed); - } - yield parsed; - } catch { - // Skip malformed chunks - } + let parsed: ChatCompletionChunk; + try { + parsed = JSON.parse(data) as ChatCompletionChunk; + } catch { + continue; // Skip malformed chunks + } + if (this.isWrappedError(parsed)) { + this.throwWrappedError(parsed); + } + yield parsed;

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

try {

yield JSON.parse(data) as ChatCompletionChunk;

const parsed = JSON.parse(data) as ChatCompletionChunk;

if (this.isWrappedError(parsed)) {

this.throwWrappedError(parsed);

}

yield parsed;

} catch {

// Skip malformed chunks

}

let parsed: ChatCompletionChunk;

try {

parsed = JSON.parse(data) as ChatCompletionChunk;

} catch {

continue; // Skip malformed chunks

}

if (this.isWrappedError(parsed)) {

this.throwWrappedError(parsed);

}

yield parsed;

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@server/src/providers/openai-compat.ts` around lines 130 - 138, The current try/catch around JSON.parse swallows exceptions thrown by throwWrappedError(), so wrapped ProviderApiError never propagates; refactor the block in openai-compat.ts that parses ChatCompletionChunk so JSON parsing and wrapped-error handling are separated: first attempt JSON.parse(data) inside a try that only catches/parses malformed JSON and yields/continues on parse failure, but after a successful parse call this.isWrappedError(parsed) and, if true, call this.throwWrappedError(parsed) outside the parse catch so the error bubbles; reference the parsed variable, isWrappedError, and throwWrappedError to locate the code to change.

coderabbitai · 2026-06-02T12:54:14Z

 import { Router } from 'express';
 import type { Request, Response } from 'express';
 import { z } from 'zod';
+import { evaluateThreadProtection } from '../services/threadProtection.js';


⚠️ Potential issue | 🔴 Critical | 🏗️ Heavy lift

Imported function is never used.

evaluateThreadProtection is imported but never called. According to the design spec (.roo/specs/generalized-thread-protection/design.md), this function should replace the 6+ hardcoded longcat branches in this file, including the thread protection block at lines 1207-1260. The current implementation still uses manual cooldown logic instead of calling this function.

🔧 Integration guidance

Based on the design doc, replace the manual thread protection block (lines 1207-1260) with:

const action = evaluateThreadProtection({ platform: prefRow?.platform ?? '', kind: 'retryable', // or appropriate kind based on context midStream: false, modelDbId: preferredModel, }); if (action.banProvider && prefRow?.platform) { addProviderModelsToSkipModels(skipModels, prefRow.platform); console.log(`[ThreadProtection] ${action.reason}`); } if (action.skipModel) { skipModels.add(preferredModel); }

This pattern should also replace the 6 other hardcoded longcat branches mentioned in the design doc.

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@server/src/routes/proxy.ts` at line 5, The file imports evaluateThreadProtection but still uses manual longcat/cooldown branches; replace the thread protection block (and the other 6 hardcoded longcat branches) by calling evaluateThreadProtection({ platform: prefRow?.platform ?? '', kind: 'retryable', midStream: false, modelDbId: preferredModel }) and then apply its directives: if action.banProvider && prefRow?.platform call addProviderModelsToSkipModels(skipModels, prefRow.platform) and log the reason, and if action.skipModel do skipModels.add(preferredModel); update the block around the existing skipModels/preferredModel/prefRow logic to use this pattern and remove the manual cooldown/longcat branching.

coderabbitai · 2026-06-02T12:54:14Z

+  // ── Thread protection: dynamically exclude models actively used by other sessions ──
+  // If another sticky session has used a model within THREAD_COOLDOWN_MS, that model
+  // becomes a soft-exclusion candidate. Exhaustion protection ensures we never block
+  // all available models — preferring shared access over outright failure.
+  {
+    const currentSessionKey = getSessionKey(normalizedMessages, routingMode);
+    const activeCooldownModels = new Set<number>();
+    const threadNow = Date.now();
+
+    for (const [key, entry] of stickySessionMap) {
+      // Self-preservation: never exclude the current session's own pinned model
+      if (currentSessionKey && key === currentSessionKey) continue;
+
+      // Expired entries are irrelevant — the session has gone quiet
+      if (threadNow - entry.lastUsed > STICKY_TTL_MS) continue;
+
+      // Only consider entries within the cooldown window
+      if (threadNow - entry.lastUsed < THREAD_COOLDOWN_MS) {
+        activeCooldownModels.add(entry.modelDbId);
+      }
+    }
+
+    // Exhaustion protection: if cooldown would block ALL available models,
+    // clear the set and let the request through rather than failing outright.
+    if (activeCooldownModels.size > 0) {
+      const db = getDb();
+      const allEnabled = db.prepare('SELECT id FROM models WHERE enabled = 1').all() as Array<{ id: number }>;
+      const allEnabledIds = new Set(allEnabled.map(m => m.id));
+
+      // Remove the current preferred model from cooldown consideration
+      if (preferredModel !== undefined) {
+        activeCooldownModels.delete(preferredModel);
+      }
+
+      // Check if cooldown would exhaust all models
+      let wouldExhaustAll = true;
+      for (const id of allEnabledIds) {
+        if (!activeCooldownModels.has(id)) {
+          wouldExhaustAll = false;
+          break;
+        }
+      }
+
+      if (wouldExhaustAll) {
+        console.log(`[ThreadProtection] cooldown would exhaust all ${allEnabled.length} models — clearing cooldown exclusions`);
+        activeCooldownModels.clear();
+      } else {
+        for (const modelDbId of activeCooldownModels) {
+          skipModels.add(modelDbId);
+        }
+        console.log(`[ThreadProtection] excluding ${activeCooldownModels.size} model(s) from routing: [${[...activeCooldownModels].join(',')}]`);
      }
    }
  }


🛠️ Refactor suggestion | 🟠 Major | 🏗️ Heavy lift

Thread protection implementation should use the imported scanner.

This manual cooldown logic should be replaced with the generalized evaluateThreadProtection() scanner as specified in the design doc. The current implementation hardcodes the cooldown behavior instead of using the configurable protection-level system.

Refer to the integration guidance in the earlier comment on line 5.

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@server/src/routes/proxy.ts` around lines 1207 - 1260, Replace the in-place cooldown loop with a call to the shared scanner: remove the manual logic that iterates stickySessionMap and computes activeCooldownModels (the entire Thread protection block) and instead call evaluateThreadProtection(...) to populate skipModels; pass in normalizedMessages and routingMode (for getSessionKey behavior), the stickySessionMap and any TTL/COOLDOWN constants (STICKY_TTL_MS, THREAD_COOLDOWN_MS), and the current preferredModel so the scanner can apply exhaustion protection; ensure evaluateThreadProtection returns or mutates skipModels (or adapt its return to be merged into skipModels) and remove the console logs in favor of the scanner’s logging.

coderabbitai · 2026-06-02T12:54:14Z

+                  });
+                } else {
+                  res.write(`data: ${JSON.stringify(payload)}\n\n`);
+                  res.write('data: [DONE]\n\n');;


⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Remove double semicolon.

Line 1347 has a double semicolon at the end of the statement.

🔧 Proposed fix

- res.write('data: [DONE]\n\n');; + res.write('data: [DONE]\n\n');

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

res.write('data: [DONE]\n\n');;

res.write('data: [DONE]\n\n');

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@server/src/routes/proxy.ts` at line 1347, Remove the stray extra semicolon after the response write call: locate the res.write('data: [DONE]\n\n');; statement (the res.write call) and delete the duplicate semicolon so it ends with a single semicolon; ensure no other trailing punctuation was accidentally removed.

coderabbitai · 2026-06-02T12:54:14Z

+export function evaluateThreadProtection(_ctx: ErrorContext): ThreadProtectionAction {
+  // Placeholder implementation: no protection
+  return { banProvider: false, skipModel: false, clearStickyIfPinned: false, reason: 'off' };
+}


⚠️ Potential issue | 🔴 Critical | 🏗️ Heavy lift

Implement the thread protection scanner logic.

The evaluateThreadProtection function is currently a placeholder that returns 'off' for all inputs. According to the design doc (.roo/specs/generalized-thread-protection/design.md), this function should:

Load protection configuration from THREAD_PROTECTION_PLATFORMS env var (or use default map)

Lookup the platform's protection level

Return the appropriate action based on the decision matrix in the design doc

Without this implementation, the thread protection refactoring in proxy.ts cannot proceed.

💡 Implementation guidance

Based on the design doc's decision matrix:

export function evaluateThreadProtection(ctx: ErrorContext): ThreadProtectionAction { const level = getProtectionLevel(ctx.platform); if (level === 'off') { return { banProvider: false, skipModel: false, clearStickyIfPinned: false, reason: 'off' }; } if (level === 'provider-ban') { return { banProvider: true, skipModel: false, clearStickyIfPinned: true, reason: `${ctx.platform} ${ctx.kind} error — provider-ban level` }; } // level === 'model-skip' return { banProvider: false, skipModel: true, clearStickyIfPinned: false, reason: `${ctx.platform} ${ctx.kind} error — model-skip level` }; } function getProtectionLevel(platform: string): ProtectionLevel { // Parse THREAD_PROTECTION_PLATFORMS env var or use default map // Default: longcat → provider-ban, others → model-skip }

Would you like me to generate the full implementation based on the design spec?

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@server/src/services/threadProtection.ts` around lines 20 - 23, The evaluateThreadProtection function is a placeholder; implement the scanner to read THREAD_PROTECTION_PLATFORMS (or fall back to a default map) and determine the ProtectionLevel for ctx.platform, then return actions per the decision matrix: 'off' => {banProvider:false, skipModel:false, clearStickyIfPinned:false, reason:'off'}, 'provider-ban' => {banProvider:true, skipModel:false, clearStickyIfPinned:true, reason:`${ctx.platform} ${ctx.kind} error — provider-ban level`}, and 'model-skip' => {banProvider:false, skipModel:true, clearStickyIfPinned:false, reason:`${ctx.platform} ${ctx.kind} error — model-skip level`}; add a helper getProtectionLevel(platform: string) to parse the env var (e.g. JSON or comma map) and return the configured level or defaults (default: longcat → provider-ban, others → model-skip).

vi70x3 added 2 commits June 2, 2026 15:12

feat(providers): intercept wrapped error payloads on HTTP 200 responses

2fd34ba

feat(proxy): replace hardcoded LongCat/Owl Alpha cooldowns with gener…

09bffb0

…alized thread protection scanner

gemini-code-assist Bot reviewed Jun 2, 2026

View reviewed changes

kilo-code-bot Bot reviewed Jun 2, 2026

View reviewed changes

coderabbitai Bot suggested changes Jun 2, 2026

View reviewed changes

vi70x3 closed this Jun 2, 2026


		## Problem Statement

		The proxy route handler (`server/src/routes/proxy.ts`) contains 6+ hardcoded branches that special-case the `longcat`{ No newline at end of file

	res.write('data: [DONE]\n\n');;
	res.write('data: [DONE]\n\n');

Conversation

vi70x3 commented Jun 2, 2026 • edited by coderabbitai Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary by CodeRabbit

Release Notes

Uh oh!

coderabbitai Bot commented Jun 2, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Changes

Sequence Diagram(s)

Estimated code review effort

Possibly related PRs

Poem

❌ Failed checks (1 warning)

Uh oh!

gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

Code Review

Uh oh!

gemini-code-assist Bot Jun 2, 2026

Choose a reason for hiding this comment

Uh oh!

gemini-code-assist Bot Jun 2, 2026

Choose a reason for hiding this comment

Uh oh!

gemini-code-assist Bot Jun 2, 2026

Choose a reason for hiding this comment

Uh oh!

gemini-code-assist Bot Jun 2, 2026

Choose a reason for hiding this comment

Uh oh!

gemini-code-assist Bot Jun 2, 2026

Choose a reason for hiding this comment

Uh oh!

gemini-code-assist Bot Jun 2, 2026

Choose a reason for hiding this comment

Uh oh!

gemini-code-assist Bot Jun 2, 2026

Choose a reason for hiding this comment

Uh oh!

kilo-code-bot Bot Jun 2, 2026

Choose a reason for hiding this comment

Uh oh!

kilo-code-bot Bot commented Jun 2, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Code Review Summary

Overview

SUGGESTION

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Jun 2, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Jun 2, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Jun 2, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Jun 2, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Jun 2, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Jun 2, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Jun 2, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Jun 2, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Jun 2, 2026

Choose a reason for hiding this comment

Uh oh!

vi70x3 commented Jun 2, 2026 •

edited by coderabbitai Bot

Loading

coderabbitai Bot commented Jun 2, 2026 •

edited

Loading

kilo-code-bot Bot commented Jun 2, 2026 •

edited

Loading