Feat/realtime sticky#15
Conversation
…alized thread protection scanner
… longcat branches
…tracking for LongCat and Owl Alpha
- Change activeRequests from Map to Set to allow concurrent requests from same session - Add stale active request cleanup with 10-minute TTL - Cache owl-alpha model ID to avoid repeated DB lookups - Fix active request iteration to use Set-compatible syntax
Reviewer's GuideImplements generalized realtime sticky / thread protection: disables sticky sessions in balanced mode, adds shared transient model cooldowns and active-request-based protection for LongCat and Owl Alpha, extends router scoring with recency-biased stats and balanced-pool exclusions, introduces wrapped-error interception in providers, adds SSE heartbeat/stall protection for streams, and updates fallback UI to surface pools, along with comprehensive tests and design docs. Sequence diagram for active-request-based LongCat/Owl Alpha protectionsequenceDiagram
actor ClientA
actor ClientB
participant Proxy as handleChatCompletion
participant Router as routeRequest
participant Provider as provider.streamChatCompletion
ClientA->>Proxy: POST /chat (sessionKey A, LongCat)
Proxy->>Router: routeRequest(..., routingMode)
Router-->>Proxy: route (platform=longcat)
Proxy->>Proxy: activeRequests.add({sessionKey A, platform longcat, modelId})
Proxy->>Provider: streamChatCompletion(...)
ClientB->>Proxy: POST /chat (sessionKey B, auto-smart)
Proxy->>Router: routeRequest(..., routingMode)
Router-->>Proxy: candidate routes incl. LongCat
Proxy->>Proxy: scan activeRequests
alt otherSessionUsingLongCat
Proxy->>Proxy: addProviderModelsToSkipModels(skipModels, longcat)
Proxy->>Router: routeRequest retries with skipModels
end
Provider-->>Proxy: stream ends for ClientA
Proxy->>Proxy: delete activeRequests entry for sessionKey A
Proxy-->>ClientA: SSE complete
Flow diagram for updated routing with pools and recency biasflowchart TD
A[routeRequest] --> B[load enabled models chain]
B --> C{routingMode == balanced?}
C -- yes --> D[filter chain using EXCLUDED_FROM_BALANCED and EXCLUDED_MODELS_FROM_BALANCED]
C -- no --> E[use full chain]
D --> F[compute intelligenceRanks]
E --> F
F --> G[refreshStatsCache with recency-weighted total/successes]
G --> H[build ModelStats with rawTotal, total, successes]
H --> I[compute effectiveScore per entry]
I --> J[sort entries by effectiveScore]
J --> K{routingMode == smart?}
K -- no --> L[iterate sorted routes normally]
K -- yes --> M[LongCat preference: hasValidKeys for longcat]
M -->|true| N[move longcat entries to front]
M -->|false| O[leave order]
N --> P[Owl Alpha preference: hasValidKeys for owl-alpha]
O --> P
P -->|true| Q[insert owl-alpha just after any longcat entries]
P -->|false| R[keep order]
Q --> L
R --> L
L --> S[return best route to proxy]
File-Level Changes
Tips and commandsInteracting with Sourcery
Customizing Your ExperienceAccess your dashboard to:
Getting Help
|
📝 WalkthroughWalkthroughLarge cross-functional PR implementing sticky-session disabling for balanced routing, wrapped-error detection across providers, model-level banning for LongCat/Owl Alpha, recency-weighted analytics, transient model cooldown tracking, SSE stream stall protection, client pool grouping, and comprehensive design specifications for planned features. ChangesSpecification & Design Documentation
Core Implementation
Test Coverage
Helper Scripts & Utilities
Estimated code review effort🎯 4 (Complex) | ⏱️ ~60 minutes Possibly related PRs
Poem
✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
|
Review Summary by QodoRealtime Sticky Sessions with Active-Request Tracking, Transient Cooldowns, and Wrapped Error Interception
WalkthroughsDescription**Core Features:** • Replaced LongCat cooldown mechanism with real-time active-request tracking for concurrent session protection, enabling model-level bans instead of provider-level bans • Implemented identical active-request pattern for Owl Alpha model protection • Added transient model cooldown system with 15-second window to share failure state across concurrent requests, preventing redundant retries during outages • Introduced recency-weighted analytics scoring using 7-day decay function for fresher provider statistics • Excluded LongCat and Owl Alpha from balanced routing mode entirely, with smart-mode preference when valid keys exist • Disabled sticky sessions for balanced/auto endpoint by returning empty string from getSessionKey() in that mode **Error Handling & Resilience:** • Added wrapped error detection helpers (isWrappedError(), throwWrappedError()) to BaseProvider for HTTP 200 responses containing error payloads • Implemented wrapped error interception across all provider implementations (OpenAI-compat, Cohere, Cloudflare, Google) • Added SSE stream heartbeat keep-alive comments (15-second interval) to prevent intermediate proxy idle timeouts • Implemented stall detection (45-second threshold) with graceful termination of hung upstream connections • Added sticky session clearing logic when LongCat or Owl Alpha encounters 5xx errors or retryable failures **Configuration & Routing:** • Introduced generalized thread protection configuration module with configurable protection levels (provider-ban, model-skip, off) per platform • Added hasValidKeys() helper to check key capacity for platform/model combinations • Updated router to compute weighted success/total counts while tracking raw unweighted totals for dashboard transparency **UI Improvements:** • Added pool-based grouping to Fallback Configuration Page, organizing models by pool (fast, balanced, smart) • Added pool and speedRank properties to fallback API response with validation **Testing:** • Comprehensive test suite for transient model cooldown lifecycle, injection, and auto-recovery • New test suite for SSE stream heartbeat and stall detection with client disconnect cleanup • Updated provider-session-ban tests to validate balanced mode sticky session disabling • Updated proxy-tools tests to clear transient cooldowns in cleanup • Fixed router test infrastructure with proper imports and SQL syntax Diagramflowchart LR
A["HTTP 200 Response"] -->|"isWrappedError()"| B["Wrapped Error Detected"]
B -->|"throwWrappedError()"| C["ProviderApiError Thrown"]
C -->|"Retry Loop"| D["Request Retried"]
E["Concurrent Requests"] -->|"5xx/Connection Error"| F["transientModelCooldowns Map"]
F -->|"15s Cooldown"| G["Model Skipped Globally"]
G -->|"Sticky Override"| H["Preferred Model Bypassed"]
I["Balanced Mode"] -->|"getSessionKey()"| J["Empty String"]
J -->|"Cascades"| K["All Sticky Ops Disabled"]
L["LongCat/Owl Alpha"] -->|"Active Requests"| M["Concurrent Protection"]
M -->|"Model-Level Ban"| N["Per-Model Tracking"]
O["SSE Stream"] -->|"15s Interval"| P["Heartbeat Comment"]
O -->|"45s Stall"| Q["Stream Terminated"]
File Changes1. server/src/routes/proxy.ts
|
Code Review by Qodo
Context used✅ Compliance rules (platform):
6 rules 1. Extra paren breaks SQL
|
There was a problem hiding this comment.
Hey - I've found 3 issues, and left some high level feedback:
- There are several scratch/automation scripts and temp files checked in (e.g. write_test.py, fix_streaming.py, fix.py, do_fix.py, server/write_tests.py, new_streaming_block.txt); these should either be removed from the repo or moved under an appropriate tooling directory and excluded from the production code path.
- The changes to server/src/tests/services/router.test.ts appear incomplete/corrupted (e.g. the malformed INSERT INTO api_keys statement with duplicate VALUES clause and the truncated test body ending in
const groqKey = encrypt); please restore this file to a syntactically valid state and ensure the intended test modifications are fully applied.
Prompt for AI Agents
Please address the comments from this code review:
## Overall Comments
- There are several scratch/automation scripts and temp files checked in (e.g. write_test.py, fix_streaming.py, fix.py, do_fix.py, server/write_tests.py, new_streaming_block.txt); these should either be removed from the repo or moved under an appropriate tooling directory and excluded from the production code path.
- The changes to server/src/__tests__/services/router.test.ts appear incomplete/corrupted (e.g. the malformed INSERT INTO api_keys statement with duplicate VALUES clause and the truncated test body ending in `const groqKey = encrypt`); please restore this file to a syntactically valid state and ensure the intended test modifications are fully applied.
## Individual Comments
### Comment 1
<location path="fix_streaming.py" line_range="8-17" />
<code_context>
+with open('/tmp/before.ts', 'r') as f:
+ before = f.read()
+
+with open('/tmp/after.ts', 'r') as f:
+ after = f.read() # starts with "} else {"
+
+# The new streaming block
+new_streaming = r''' if (stream) {
+ // SSE headers set immediately so keep-alive works during TTFB.
+ // Pre-stream errors stay retryable; mid-stream errors emit an SSE error frame.
+ let totalOutputTokens = 0;
+ let streamedText = '';
+ let sawToolCalls = false;
+ let streamStarted = false;
+ let ttfbMs: number | null = null;
+ let lastChunkTimestamp = Date.now();
+ let heartbeatInterval: ReturnType<typeof setInterval>{
\ No newline at end of file
</code_context>
<issue_to_address>
**issue:** Helper script appears incomplete/truncated and may not be runnable as-is.
This script only reads `before.ts`/`after.ts` and defines `new_streaming`; it never performs a replacement or writes output. It also ends mid-line at `let heartbeatInterval: ReturnType<typeof setInterval>{`, so it isn’t syntactically valid.
If it’s just a local helper, either remove it from the repo or finish it so it compiles and has a clear usage, to avoid confusion for future maintainers and tooling.
</issue_to_address>
### Comment 2
<location path="fix.py" line_range="4" />
<code_context>
+#!/usr/bin/env python3
+"""Replace the streaming block in proxy.ts with Promise.race-based stall detection."""
+
+with open('server/src/routes/proxy.ts', 'r') as f:
+ content = f.read()
+
</code_context>
<issue_to_address>
**issue:** This script is effectively a no-op and looks like an unfinished helper.
`fix.py` only reads `proxy.ts` and doesn’t implement the described replacement. If this script isn’t used, consider removing it to avoid confusion; if you intend to keep it, either add the replacement logic or a clear TODO indicating it’s incomplete and how it should be used.
</issue_to_address>
### Comment 3
<location path="do_fix.py" line_range="4-7" />
<code_context>
+#!/usr/bin/env python3
+"""Replace the streaming block in proxy.ts with Promise.race-based stall detection."""
+
+with open('server/src/routes/proxy.ts', 'r') as f:
+ content = f.read()
+
+{
\ No newline at end of file
</code_context>
<issue_to_address>
**issue (bug_risk):** do_fix.py ends with a bare `{` and is syntactically invalid Python.
The trailing `{` makes this file invalid Python and will raise a syntax error if run. If this was just a scratch file, please either delete it from the repo or update it to be valid so it doesn't break tools that scan or import Python modules.
</issue_to_address>Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.
| with open('/tmp/after.ts', 'r') as f: | ||
| after = f.read() # starts with "} else {" | ||
|
|
||
| # The new streaming block | ||
| new_streaming = r''' if (stream) { | ||
| // SSE headers set immediately so keep-alive works during TTFB. | ||
| // Pre-stream errors stay retryable; mid-stream errors emit an SSE error frame. | ||
| let totalOutputTokens = 0; | ||
| let streamedText = ''; | ||
| let sawToolCalls = false; |
There was a problem hiding this comment.
issue: Helper script appears incomplete/truncated and may not be runnable as-is.
This script only reads before.ts/after.ts and defines new_streaming; it never performs a replacement or writes output. It also ends mid-line at let heartbeatInterval: ReturnType<typeof setInterval>{, so it isn’t syntactically valid.
If it’s just a local helper, either remove it from the repo or finish it so it compiles and has a clear usage, to avoid confusion for future maintainers and tooling.
| #!/usr/bin/env python3 | ||
| """Replace the streaming block in proxy.ts with Promise.race-based stall detection.""" | ||
|
|
||
| with open('server/src/routes/proxy.ts', 'r') as f: |
There was a problem hiding this comment.
issue: This script is effectively a no-op and looks like an unfinished helper.
fix.py only reads proxy.ts and doesn’t implement the described replacement. If this script isn’t used, consider removing it to avoid confusion; if you intend to keep it, either add the replacement logic or a clear TODO indicating it’s incomplete and how it should be used.
| with open('server/src/routes/proxy.ts', 'r') as f: | ||
| content = f.read() | ||
|
|
||
| { No newline at end of file |
There was a problem hiding this comment.
issue (bug_risk): do_fix.py ends with a bare { and is syntactically invalid Python.
The trailing { makes this file invalid Python and will raise a syntax error if run. If this was just a scratch file, please either delete it from the repo or update it to be valid so it doesn't break tools that scan or import Python modules.
There was a problem hiding this comment.
Code Review
This pull request introduces several enhancements, including disabling sticky sessions on the balanced endpoint, implementing model-level routing and smart preferences for Owl Alpha and LongCat, adding recency-biased Thompson Sampling, and handling wrapped error payloads on HTTP 200 responses. However, several critical issues and missing implementations were identified. There are SQL syntax errors in the time-decay query in router.ts, and the router.test.ts file is syntactically invalid due to a duplicated VALUES clause and truncation. Additionally, the SSE stream heartbeat/stall protection and shared temporary cooldown features are completely unimplemented in proxy.ts, causing test compilation failures, while the new threadProtection.ts module remains unused. Finally, several optimizations are recommended in proxy.ts to eliminate redundant database queries, simplify duplicate truncation-handling branches, and remove global mutable cache variables that could cause test flakiness.
Important
The consumer version of Gemini Code Assist on GitHub is being sunset. Starting June 18, 2026, new organization installations will be blocked, and all code review activity will officially cease on July 17, 2026.
For more details on the timeline and next steps, please review the Help Documentation.
| const rows = db.prepare(` | ||
| SELECT platform, model_id, | ||
| COUNT(*) as total, | ||
| SUM(CASE WHEN status = 'success' THEN 1 ELSE 0 END) as successes, | ||
| COUNT(*) as raw_total, | ||
| SUM(MAX(0, MIN(1.0, 1.0 - (julianday('now') - julianday(created_at)) / 7.0)))) as total, | ||
| SUM(CASE WHEN status = 'success' | ||
| THEN MAX(0, MIN(1.0, 1.0 - (julianday('now') - julianday(created_at)) / 7.0))) | ||
| ELSE 0 END) as successes, |
There was a problem hiding this comment.
There are SQL syntax errors in the query due to mismatched parentheses on lines 184 and 186, which will cause a runtime crash when db.prepare() is executed.
- On line 184, there is an extra closing parenthesis:
SUM(MAX(0, MIN(1.0, 1.0 - (julianday('now') - julianday(created_at)) / 7.0))))(4 closing parentheses instead of 3). - On line 186, there is also an extra closing parenthesis:
THEN MAX(0, MIN(1.0, 1.0 - (julianday('now') - julianday(created_at)) / 7.0)))(3 closing parentheses instead of 2).
| const rows = db.prepare(` | |
| SELECT platform, model_id, | |
| COUNT(*) as total, | |
| SUM(CASE WHEN status = 'success' THEN 1 ELSE 0 END) as successes, | |
| COUNT(*) as raw_total, | |
| SUM(MAX(0, MIN(1.0, 1.0 - (julianday('now') - julianday(created_at)) / 7.0)))) as total, | |
| SUM(CASE WHEN status = 'success' | |
| THEN MAX(0, MIN(1.0, 1.0 - (julianday('now') - julianday(created_at)) / 7.0))) | |
| ELSE 0 END) as successes, | |
| const rows = db.prepare(` | |
| SELECT platform, model_id, | |
| COUNT(*) as raw_total, | |
| SUM(MAX(0, MIN(1.0, 1.0 - (julianday('now') - julianday(created_at)) / 7.0))) as total, | |
| SUM(CASE WHEN status = 'success' | |
| THEN MAX(0, MIN(1.0, 1.0 - (julianday('now') - julianday(created_at)) / 7.0)) | |
| ELSE 0 END) as successes, |
| const { encrypted, iv, authTag } = encrypt('test-groq-key'); | ||
| db.prepare(` | ||
| INSERT INTO api_keys (platform, label, encrypted_key, iv, auth_tag, status, enabled) | ||
| VALUES (?, |
| expect(result.platform).toBe('groq'); | ||
| }); | ||
| }); | ||
| const groqKey = encrypt |
There was a problem hiding this comment.
The test file is truncated at the end, leaving the file syntactically invalid and missing several tests and closing braces. Restoring the rest of the file fixes the issue.
const groqKey = encrypt('test-groq-key');
db.prepare(`
INSERT INTO api_keys (platform, label, encrypted_key, iv, auth_tag, status, enabled)
VALUES (?, ?, ?, ?, ?, ?, ?)
`).run('groq', 'test', groqKey.encrypted, groqKey.iv, groqKey.authTag, 'healthy', 1);
const result = routeRequest();
expect(result.platform).toBe('groq');
});
it('should skip invalid keys', () => {
const db = getDb();
const invalidKey = encrypt('invalid-key');
db.prepare(`
INSERT INTO api_keys (platform, label, encrypted_key, iv, auth_tag, status, enabled)
VALUES (?, ?, ?, ?, ?, ?, ?)
`).run('google', 'invalid', invalidKey.encrypted, invalidKey.iv, invalidKey.authTag, 'invalid', 1);
const groqKey = encrypt('test-groq-key');
db.prepare(`
INSERT INTO api_keys (platform, label, encrypted_key, iv, auth_tag, status, enabled)
VALUES (?, ?, ?, ?, ?, ?, ?)
`).run('groq', 'test', groqKey.encrypted, groqKey.iv, groqKey.authTag, 'healthy', 1);
const result = routeRequest();
expect(result.platform).toBe('groq');
});
});| import type { Express } from 'express'; | ||
| import { createApp } from '../../app.js'; | ||
| import { initDb, getDb, getUnifiedApiKey } from '../../db/index.js'; | ||
| import { streamKeepaliveConfig } from '../../routes/proxy.js'; |
There was a problem hiding this comment.
This test file imports streamKeepaliveConfig from ../../routes/proxy.js, but streamKeepaliveConfig is neither defined nor exported in server/src/routes/proxy.ts. The "SSE Stream Heartbeats and Stall Protection" feature is completely unimplemented in proxy.ts, making this test file fail to compile. Please ensure the implementation of this feature is committed and exported.
| import { | ||
| transientModelCooldowns, | ||
| TRANSIENT_COOLDOWN_MS, | ||
| stickySessionMap, | ||
| addProviderModelsToSkipModels, | ||
| } from '../../routes/proxy.js'; |
There was a problem hiding this comment.
This test file imports transientModelCooldowns and TRANSIENT_COOLDOWN_MS from ../../routes/proxy.js, but they are neither defined nor exported in server/src/routes/proxy.ts. The "Shared Temporary Cooldowns" feature is completely unimplemented in proxy.ts, making this test file fail to compile. Please ensure the implementation of this feature is committed and exported.
| // Cached owl-alpha model ID: undefined = not yet looked up, null = not found, number = found | ||
| let cachedOwlAlphaModelId: number | null | undefined = undefined; |
| // Pure Active-Request Owl Alpha Safeguard: Exclude openrouter/owl-alpha | ||
| // from bandit routing for this request ONLY if another session is actively using it right now. | ||
| let otherSessionUsingOwl = false; | ||
|
|
||
| for (const active of activeRequests) { | ||
| if (active.sessionKey !== sessionKey && active.platform === 'openrouter' && active.modelId === 'owl-alpha') { | ||
| otherSessionUsingOwl = true; | ||
| break; | ||
| } | ||
| } | ||
|
|
||
| if (otherSessionUsingOwl) { | ||
| if (cachedOwlAlphaModelId === undefined) { | ||
| const db = getDb(); | ||
| const owlRow = db.prepare("SELECT id FROM models WHERE platform = 'openrouter' AND model_id = 'owl-alpha'").get() as { id: number } | undefined; | ||
| cachedOwlAlphaModelId = owlRow ? owlRow.id : null; | ||
| } | ||
| if (cachedOwlAlphaModelId !== null) { | ||
| skipModels.add(cachedOwlAlphaModelId); | ||
| console.log(`[Sticky] Owl Alpha protection active — excluding openrouter/owl-alpha from bandit routing because another session is actively using it`); | ||
| } | ||
| } |
There was a problem hiding this comment.
Caching cachedOwlAlphaModelId in a global mutable variable without resetting it can cause test flakiness or bugs when the database is re-initialized (e.g., in-memory DBs recreated between tests). Since querying the model ID by platform and model ID is extremely fast, it is safer and cleaner to query it directly when needed.
if (otherSessionUsingOwl) {
const db = getDb();
const owlRow = db.prepare("SELECT id FROM models WHERE platform = 'openrouter' AND model_id = 'owl-alpha'").get() as { id: number } | undefined;
if (owlRow) {
skipModels.add(owlRow.id);
console.log(`[Sticky] Owl Alpha protection active — excluding openrouter/owl-alpha from bandit routing because another session is actively using it`);
}
}| if (route.platform === 'longcat') { | ||
| // LongCat: exclude entire provider immediately on truncation | ||
| console.warn(`[Proxy] Truncated stream content detected from LongCat — banning LongCat provider for session`); | ||
| banPlatformFromSession(normalizedMessages, routingMode, 'longcat', route.modelDbId); | ||
| addProviderModelsToSkipModels(skipModels, 'longcat'); | ||
| // LongCat: model-level ban | ||
| console.warn(`[Proxy] Truncated stream content detected from LongCat — skipping model ${route.modelId} for session`); | ||
| skipModels.add(route.modelDbId); | ||
| } else if (route.platform === 'openrouter' && route.modelId === 'owl-alpha') { | ||
| // Owl Alpha: model-level ban | ||
| console.warn(`[Proxy] Truncated stream content detected from Owl Alpha — skipping model ${route.modelId} for session`); | ||
| skipModels.add(route.modelDbId); | ||
| } else { | ||
| // Non-LongCat: skip only this specific model, other models from same provider remain available | ||
| // Other providers: model-level ban | ||
| console.warn(`[Proxy] Truncated stream content detected from ${route.platform} — skipping model ${route.modelId} for session`); | ||
| skipModels.add(route.modelDbId); | ||
| } |
There was a problem hiding this comment.
The conditional branches for longcat, owl-alpha, and other providers on truncation detection perform the exact same action (skipModels.add(route.modelDbId)). We can simplify this block significantly by combining them into a single dynamic log and action.
if (isTruncatedResponse(streamTextToCheck)) {
console.warn(`[Proxy] Truncated stream content detected from ${route.displayName} — skipping model ${route.modelId} for session`);
skipModels.add(route.modelDbId);
}| // Clear sticky if pinned to LongCat | ||
| if (preferredModel) { | ||
| const db = getDb(); | ||
| const prefRow = db.prepare('SELECT model_id FROM models WHERE id = ?').get(preferredModel) as { model_id: string } | undefined; | ||
| if (prefRow?.model_id === 'LongCat-2.0-Preview') { | ||
| preferredModel = undefined; | ||
| preferredKeyId = undefined; | ||
| } | ||
| } | ||
| } else if (route.platform === 'openrouter' && route.modelId === 'owl-alpha') { | ||
| console.warn(`[Proxy] Mid-stream 5xx from Owl Alpha — skipping model ${route.modelId} for session`); | ||
| skipModels.add(route.modelDbId); | ||
| // Clear sticky if pinned to Owl Alpha | ||
| if (preferredModel) { | ||
| const db = getDb(); | ||
| const prefRow = db.prepare('SELECT model_id FROM models WHERE id = ?').get(preferredModel) as { model_id: string } | undefined; | ||
| if (prefRow?.model_id === 'owl-alpha') { | ||
| preferredModel = undefined; | ||
| preferredKeyId = undefined; | ||
| } | ||
| } | ||
| } else { |
There was a problem hiding this comment.
Inside the error handling blocks, the code queries the database to retrieve the model_id of preferredModel to check if it matches 'LongCat-2.0-Preview' or 'owl-alpha'. This is redundant because the active route that just failed (route) already contains the modelDbId and modelId. We can simply check if preferredModel === route.modelDbId to clear the sticky preference, avoiding any database queries. This optimization applies to all 6 occurrences of this pattern in the file.
if (route.platform === 'longcat') {
console.warn(`[Proxy] Mid-stream 5xx from LongCat — skipping model ${route.modelId} for session`);
skipModels.add(route.modelDbId);
// Clear sticky if pinned to LongCat
if (preferredModel === route.modelDbId) {
preferredModel = undefined;
preferredKeyId = undefined;
}
} else if (route.platform === 'openrouter' && route.modelId === 'owl-alpha') {
console.warn(`[Proxy] Mid-stream 5xx from Owl Alpha — skipping model ${route.modelId} for session`);
skipModels.add(route.modelDbId);
// Clear sticky if pinned to Owl Alpha
if (preferredModel === route.modelDbId) {
preferredModel = undefined;
preferredKeyId = undefined;
}
}| if (route.platform === 'longcat') { | ||
| console.warn(`[Proxy] Retryable error from LongCat — excluding entire LongCat provider for session`); | ||
| banPlatformFromSession(normalizedMessages, routingMode, 'longcat', route.modelDbId); | ||
| addProviderModelsToSkipModels(skipModels, 'longcat'); | ||
| console.warn(`[Proxy] Retryable error from LongCat — skipping model ${route.modelId} for session`); | ||
| skipModels.add(route.modelDbId); | ||
| // Clear sticky if pinned to LongCat | ||
| if (preferredModel) { | ||
| const db = getDb(); | ||
| const prefRow = db.prepare('SELECT model_id FROM models WHERE id = ?').get(preferredModel) as { model_id: string } | undefined; | ||
| if (prefRow?.model_id === 'LongCat-2.0-Preview') { | ||
| preferredModel = undefined; | ||
| preferredKeyId = undefined; | ||
| } | ||
| } | ||
| } else if (route.platform === 'openrouter' && route.modelId === 'owl-alpha') { | ||
| console.warn(`[Proxy] Retryable error from Owl Alpha — skipping model ${route.modelId} for session`); | ||
| skipModels.add(route.modelDbId); | ||
| // Clear sticky if pinned to Owl Alpha | ||
| if (preferredModel) { | ||
| const db = getDb(); | ||
| const prefRow = db.prepare('SELECT platform FROM models WHERE id = ?').get(preferredModel) as { platform: string } | undefined; | ||
| if (prefRow?.platform === 'longcat') { | ||
| const prefRow = db.prepare('SELECT model_id FROM models WHERE id = ?').get(preferredModel) as { model_id: string } | undefined; | ||
| if (prefRow?.model_id === 'owl-alpha') { | ||
| preferredModel = undefined; | ||
| preferredKeyId = undefined; | ||
| } | ||
| } | ||
| } else { |
There was a problem hiding this comment.
We can optimize the sticky preference clearing here by directly comparing preferredModel === route.modelDbId instead of querying the database. This avoids redundant queries on every retryable error.
if (route.platform === 'longcat') {
console.warn(`[Proxy] Retryable error from LongCat — skipping model ${route.modelId} for session`);
skipModels.add(route.modelDbId);
// Clear sticky if pinned to LongCat
if (preferredModel === route.modelDbId) {
preferredModel = undefined;
preferredKeyId = undefined;
}
} else if (route.platform === 'openrouter' && route.modelId === 'owl-alpha') {
console.warn(`[Proxy] Retryable error from Owl Alpha — skipping model ${route.modelId} for session`);
skipModels.add(route.modelDbId);
// Clear sticky if pinned to Owl Alpha
if (preferredModel === route.modelDbId) {
preferredModel = undefined;
preferredKeyId = undefined;
}
}
Code Review SummaryStatus: 2 Issues Found | Recommendation: Address before merge Overview
Issue Details (click to expand)WARNING
Other Observations (not in diff)
Files Reviewed (6 files)
Reviewed by step-3.7-flash-20260528 · 2,262,052 tokens |
| COUNT(*) as total, | ||
| SUM(CASE WHEN status = 'success' THEN 1 ELSE 0 END) as successes, | ||
| COUNT(*) as raw_total, | ||
| SUM(MAX(0, MIN(1.0, 1.0 - (julianday('now') - julianday(created_at)) / 7.0)))) as total, |
There was a problem hiding this comment.
1. Extra paren breaks sql 🐞 Bug ≡ Correctness
refreshStatsCache() contains an unbalanced weighted SUM expression (... / 7.0))))) that will fail SQLite statement preparation. Since routeRequest() calls refreshStatsCache() unconditionally, routing can throw before selecting any model.
Agent Prompt
## Issue description
`refreshStatsCache()` builds an invalid SQLite query due to an extra closing parenthesis in the recency-weighted `SUM(MAX(0, MIN(...))))` expression.
## Issue Context
`routeRequest()` calls `refreshStatsCache(db)` on every routing decision, so this SQL prepare error will break request routing globally.
## Fix Focus Areas
- server/src/services/router.ts[178-188]
- server/src/services/router.ts[503-506]
ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools
| import { PoolSection } from '@/components/pool-section' | ||
| import type { PoolType } from '@/components/pool-badge' |
There was a problem hiding this comment.
2. Missing pool components 🐞 Bug ≡ Correctness
FallbackPage imports @/components/pool-section and @/components/pool-badge, but no such modules exist in the client codebase. This causes immediate client build failures due to unresolved module imports.
Agent Prompt
## Issue description
`client/src/pages/FallbackPage.tsx` imports `PoolSection` and `PoolType` from modules that are not present in the repository, which will fail module resolution during TS/build.
## Issue Context
The page renders `<PoolSection>` and types entries with `PoolType`, so the imports cannot be trivially removed without adjusting rendering/typing.
## Fix Focus Areas
- client/src/pages/FallbackPage.tsx[6-7]
- client/src/pages/FallbackPage.tsx[67-72]
- client/src/pages/FallbackPage.tsx[335-371]
ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools
| const poolGroups = poolOrder | ||
| .map(pool => ({ pool, entries: displayEntries.filter(e => e.pool === pool) })) | ||
| .filter(group => group.entries.length > 0) |
There was a problem hiding this comment.
3. Fallback api lacks pool 🐞 Bug ≡ Correctness
FallbackPage groups/filters entries by entry.pool and the new fallback API test asserts a pool field, but the /api/fallback response objects do not include pool. This will render no entries in the UI and fail the new test.
Agent Prompt
## Issue description
The client and tests now require a `pool` field on `/api/fallback` entries, but the server route currently omits it.
## Issue Context
- Client: builds `poolGroups` by filtering `displayEntries` on `e.pool`.
- Test: asserts each entry has `pool` and that it is in the expected enum.
## Fix Focus Areas
- server/src/routes/fallback.ts[37-78]
- client/src/pages/FallbackPage.tsx[288-301]
- server/src/__tests__/routes/fallback.test.ts[43-62]
ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools
| @@ -1,4 +1,5 @@ | |||
| import { describe, it, expect, beforeAll } from 'vitest'; | |||
| import { ModelPool } from '@freellmapi/shared/types.js'; | |||
There was a problem hiding this comment.
4. Shared types missing modelpool 🐞 Bug ≡ Correctness
fallback.test.ts imports ModelPool from @freellmapi/shared/types.js, but the shared workspace package entrypoint is shared/types.ts, which does not export ModelPool. This breaks compilation of the new fallback test (and any code trying to use ModelPool).
Agent Prompt
## Issue description
`server/src/__tests__/routes/fallback.test.ts` imports `ModelPool` from `@freellmapi/shared/types.js`, but the workspace `@freellmapi/shared` package points to `shared/types.ts` and that file does not define/export `ModelPool`.
## Issue Context
Because `shared/package.json` uses `main`/`types` = `./types.ts`, any symbol imported from `@freellmapi/shared/types.js` must be exported from `shared/types.ts`.
## Fix Focus Areas
- server/src/__tests__/routes/fallback.test.ts[1-62]
- shared/package.json[1-7]
- shared/types.ts[1-80]
ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools
| VALUES (?, | ||
| VALUES (?, ?, ?, ?, ?, ?, ?) |
There was a problem hiding this comment.
5. Router.test.ts sql malformed 🐞 Bug ≡ Correctness
server/src/__tests__/services/router.test.ts contains a broken INSERT template string with a duplicated VALUES line, producing invalid SQL and invalid test setup. This will fail the router test (and may block the overall test suite depending on runner settings).
Agent Prompt
## Issue description
The router test inserts an API key using a template string that includes `VALUES` twice, making the SQL invalid.
## Issue Context
The first `it('should route to highest priority model...')` test depends on this INSERT to seed the DB.
## Fix Focus Areas
- server/src/__tests__/services/router.test.ts[26-34]
ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools
| error.status = | ||
| typeof errPayload === 'object' && errPayload !== null && 'code' in (errPayload as Record<string, unknown>) | ||
| ? Number((errPayload as Record<string, unknown>).code) | ||
| : 200; |
There was a problem hiding this comment.
6. Nan wrapped error status 🐞 Bug ☼ Reliability
BaseProvider.throwWrappedError() sets error.status using Number(errPayload.code), which becomes NaN for non-numeric codes. proxy.ts treats NaN as a number in getErrorStatus(), but NaN never matches the numeric status checks, misclassifying wrapped provider errors as non-retryable/non-ban-eligible.
Agent Prompt
## Issue description
`throwWrappedError()` can assign `status: NaN` when the upstream `error.code` is non-numeric, causing downstream status-based logic to fail to recognize retryable/ban-eligible cases.
## Issue Context
`proxy.ts` uses strict numeric comparisons for retry/ban decisions, and `getErrorStatus()` currently returns any `number`, including NaN.
## Fix Focus Areas
- server/src/providers/base.ts[135-148]
- server/src/routes/proxy.ts[496-545]
ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools
| #!/usr/bin/env python3 | ||
| """Replace the streaming block in proxy.ts with Promise.race-based stall detection.""" | ||
|
|
||
| with open('server/src/routes/proxy.ts', 'r') as f: | ||
| content = f.read() | ||
|
|
||
| { No newline at end of file |
There was a problem hiding this comment.
7. Broken helper scripts committed 🐞 Bug ⚙ Maintainability
Multiple newly added Python helper scripts are committed in a truncated/syntax-error state and include hard-coded absolute paths. These files add repo noise and can break if accidentally executed by developers or CI tooling.
Agent Prompt
## Issue description
Helper scripts appear accidentally committed: they are incomplete (syntax errors) and include machine-specific absolute paths.
## Issue Context
These files are not referenced by the app but remain in the repo root and `server/`, increasing maintenance burden and risk of accidental execution.
## Fix Focus Areas
- do_fix.py[1-7]
- fix.py[1-8]
- fix_streaming.py[1-21]
- server/write_test.py[1-29]
- server/write_tests.py[1-45]
ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools
There was a problem hiding this comment.
Actionable comments posted: 18
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
server/src/routes/proxy.ts (1)
1505-1537:⚠️ Potential issue | 🔴 Critical | ⚡ Quick winCritical: Mid-stream retryable error handling still uses provider-level banning for LongCat and is missing for Owl Alpha.
This code path (lines 1506-1518) contradicts the completed tasks T2.8 and T2.9, and violates requirements REQ-4 and REQ-5:
- LongCat: Still uses
banPlatformFromSession('longcat')+addProviderModelsToSkipModels(skipModels, 'longcat')(provider-level) instead ofskipModels.add(route.modelDbId)(model-level)- Owl Alpha: No mid-stream retryable error handling at all—the code only checks for LongCat
This means:
- LongCat retryable errors ban the entire platform instead of just the specific model (inconsistent with other error paths)
- Owl Alpha retryable errors fall through to generic handling without model-level skipping
🔧 Proposed fix for model-level banning
- // Mid-stream retryable error handling for LongCat - if (route.platform === 'longcat' && isRetryableStreamError(streamErr)) { - console.warn(`[Proxy] Mid-stream retryable error from LongCat — excluding entire LongCat provider for session`); - banPlatformFromSession(normalizedMessages, routingMode, 'longcat', route.modelDbId); - addProviderModelsToSkipModels(skipModels, 'longcat'); - // Clear sticky preference if pinned to LongCat - if (preferredModel) { - const db = getDb(); - const prefRow = db.prepare('SELECT platform FROM models WHERE id = ?').get(preferredModel) as { platform: string } | undefined; - if (prefRow?.platform === 'longcat') { - preferredModel = undefined; - preferredKeyId = undefined; - } + // Mid-stream retryable error handling + if (isRetryableStreamError(streamErr)) { + if (route.platform === 'longcat') { + console.warn(`[Proxy] Mid-stream retryable error from LongCat — skipping model ${route.modelId} for session`); + skipModels.add(route.modelDbId); + // Clear sticky if pinned to this specific model + if (preferredModel) { + const db = getDb(); + const prefRow = db.prepare('SELECT model_id FROM models WHERE id = ?').get(preferredModel) as { model_id: string } | undefined; + if (prefRow?.model_id === 'LongCat-2.0-Preview') { + preferredModel = undefined; + preferredKeyId = undefined; + } + } + } else if (route.platform === 'openrouter' && route.modelId === 'owl-alpha') { + console.warn(`[Proxy] Mid-stream retryable error from Owl Alpha — skipping model ${route.modelId} for session`); + skipModels.add(route.modelDbId); + // Clear sticky if pinned to Owl Alpha + if (preferredModel) { + const db = getDb(); + const prefRow = db.prepare('SELECT model_id FROM models WHERE id = ?').get(preferredModel) as { model_id: string } | undefined; + if (prefRow?.model_id === 'owl-alpha') { + preferredModel = undefined; + preferredKeyId = undefined; + } + } } - try { + // ... rest of error response handling🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@server/src/routes/proxy.ts` around lines 1505 - 1537, The mid-stream retryable error branch currently bans entire providers via banPlatformFromSession and addProviderModelsToSkipModels for LongCat and omits Owl Alpha; change it to perform model-level skipping instead: when route.platform === 'longcat' and isRetryableStreamError(streamErr) replace banPlatformFromSession('longcat') and addProviderModelsToSkipModels(skipModels, 'longcat') with skipModels.add(route.modelDbId) (so only the failing model is skipped), and ensure the sticky-preference clear logic still checks prefRow?.platform === 'longcat' before clearing preferredModel/preferredKeyId; additionally add an identical mid-stream branch for route.platform === 'owl-alpha' that uses skipModels.add(route.modelDbId) and the same sticky-preference clearing for 'owl-alpha' so Owl Alpha retryable stream errors are handled at model level rather than falling through.server/src/__tests__/services/router.test.ts (1)
63-64:⚠️ Potential issue | 🔴 Critical | ⚡ Quick winFix truncated/incomplete test file:
router.test.tswon’t parse
server/src/__tests__/services/router.test.tsends at line 63 withconst groqKey = encrypt, leaving the expression unfinished and the file syntactically invalid. Complete theencrypt(...)call and restore the rest of the test block.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@server/src/__tests__/services/router.test.ts` around lines 63 - 64, The test file is truncated at "const groqKey = encrypt" so complete the encrypt(...) call and restore the test block; specifically finish the expression for groqKey by calling encrypt with the intended plaintext or test mock (e.g., encrypt('test-groq-key') or the project’s GROQ key/mocked value), add the missing semicolon, and re-add the remaining assertions and closing braces of the test/describe block so the file parses; look for the encrypt identifier and the surrounding it/describe that reference groqKey to ensure you restore the original assertions and closures.
🧹 Nitpick comments (3)
server/src/services/router.ts (1)
184-187: ⚡ Quick winAvoid hardcoding the decay window in SQL.
The recency divisor is duplicated as
7.0; this can silently diverge fromANALYTICS_WINDOW_MS. Derive and bind window days once, then reuse in both weighted expressions.Proposed diff
const ANALYTICS_WINDOW_MS = 7 * 24 * 60 * 60 * 1000; +const ANALYTICS_WINDOW_DAYS = ANALYTICS_WINDOW_MS / (24 * 60 * 60 * 1000); const ANALYTICS_CACHE_TTL_MS = 60 * 1000; ... - SUM(MAX(0, MIN(1.0, 1.0 - (julianday('now') - julianday(created_at)) / 7.0)))) as total, + SUM(MAX(0, MIN(1.0, 1.0 - (julianday('now') - julianday(created_at)) / ?)))) as total, SUM(CASE WHEN status = 'success' - THEN MAX(0, MIN(1.0, 1.0 - (julianday('now') - julianday(created_at)) / 7.0))) + THEN MAX(0, MIN(1.0, 1.0 - (julianday('now') - julianday(created_at)) / ?))) ELSE 0 END) as successes, ... - `).all(since) as Array<{ + `).all(ANALYTICS_WINDOW_DAYS, ANALYTICS_WINDOW_DAYS, since) as Array<{🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@server/src/services/router.ts` around lines 184 - 187, Replace the duplicated literal 7.0 recency divisor with a single derived-and-bound value calculated from ANALYTICS_WINDOW_MS (e.g., const windowDays = ANALYTICS_WINDOW_MS / (1000*60*60*24)) and reuse that variable in both weighted expressions (the MIN(1.0, 1.0 - (julianday('now') - julianday(created_at)) / <window>)) for total and successes; bind it into the SQL once (use a named parameter like :windowDays or a positional placeholder) so the same value is used in SUM(MAX(...)) and SUM(CASE WHEN ... THEN MAX(...)) and cannot diverge.server/write_test.py (1)
4-4: ⚡ Quick winUse a repository-relative output path to keep this script portable.
Hardcoding
/home/vi/...makes the generator machine-specific.Proposed fix
+from pathlib import Path -path = '/home/vi/freellmapi/server/src/__tests__/services/router.test.ts' +path = Path(__file__).resolve().parent / 'src/__tests__/services/router.test.ts'🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@server/write_test.py` at line 4, Replace the hardcoded absolute string assigned to the variable path in write_test.py with a repository-relative construction: build the path from the script's location (e.g., using Path(__file__) / os.path.dirname(__file__) and os.path.join or pathlib.Path with .resolve()/parents to reach the repository root) and then append src/__tests__/services/router.test.ts so the path variable is portable across machines; update the assignment to use that computed path instead of '/home/vi/...'.server/write_tests.py (1)
4-4: ⚡ Quick winSwitch to a relative path instead of a user-specific absolute path.
This keeps the script usable in CI and on other developer machines.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@server/write_tests.py` at line 4, Replace the hard-coded user-specific absolute path in the variable path with a machine-independent relative or programmatically-resolved path; update the assignment to compute the test file location using pathlib or os.path (e.g., Path(__file__).resolve().parents[...] / "src" / "__tests__" / "services" / "router.test.ts" or os.path.join relative to the repository/script location) so the variable path in server/write_tests.py is portable across CI and other developer machines.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In @.roo/specs/generalized-thread-protection/design.md:
- Around line 43-45: The fenced code block containing
THREAD_PROTECTION_PLATFORMS="longcat:provider-ban,groq:model-skip" is missing a
language tag; update that fenced block (the block surrounding the
THREAD_PROTECTION_PLATFORMS line) to include a language identifier such as bash
so the fence becomes ```bash and resolves markdownlint MD040 while keeping the
same content.
In @.roo/specs/generalized-thread-protection/requirements.md:
- Line 5: Fix the malformed requirement sentence on line 5 by removing the stray
"{", closing the inline code delimiter around `longcat`, and rewriting the
truncated fragment into a complete sentence that clearly states the issue (e.g.,
that the proxy route handler’s exported function in proxy.ts contains 6+
hardcoded branches that special-case the `longcat` route). Ensure the sentence
is grammatical, the backticks are balanced, and the requirement precisely
describes the special-casing problem so it’s unambiguous.
In @.roo/specs/sse-stream-heartbeat-stall-protection/requirements.md:
- Around line 47-48: The requirement is ambiguous about retry semantics for
stalls: update the document to explicitly distinguish "pre-stream" stalls
(before any response bytes are sent) from "mid-/post-stream" stalls (after
partial delivery). Amend the bullet "5. Return from the handler (no retry on
stall — the stream is already partially delivered)" and the paragraph that marks
stalled-stream retry logic out of scope to state that 504/timeouts occurring
before any data is flushed should allow a retry per the companion tests, whereas
any stall detected after the first byte is sent must return without retry. Also
mirror this clarification where the doc references stalled-stream retry logic
(the section referencing out-of-scope behavior) so the pre-stream 504/retry
behavior and mid-stream no-retry behavior are consistent throughout.
In @.roo/specs/wrapped-error-interception/design.md:
- Line 120: The note is incorrect about exception flow: with the current
structure throwWrappedError() and ProviderApiError can be caught by the
surrounding try/catch, so update the text to require a parse-only try/catch and
to perform wrapped-error checks outside that block; specifically describe that
JSON.parse should be inside a minimal try/catch that only handles malformed
chunk parsing, while throwWrappedError() (and any ProviderApiError checks) must
occur after the parse block (outside the try) so they propagate to the consumer
(e.g., proxy.ts) rather than being swallowed.
In @.roo/specs/wrapped-error-interception/tasks.md:
- Around line 28-31: The current task instructs throwing wrapped errors inside
the parse try/catch which causes them to be swallowed; update the instructions
for OpenAICompatProvider.streamChatCompletion() so the try/catch only surrounds
JSON.parse: assign the result of JSON.parse(data) to a variable inside the try,
exit the try/catch, then immediately call this.isWrappedError(parsed) and if
true call this.throwWrappedError(parsed) (outside the parse try/catch), and only
after that yield the parsed value; apply the same change to the other listed
task steps.
In `@do_fix.py`:
- Around line 4-7: The script do_fix.py is syntactically invalid due to a stray
"{" after reading the file; locate the block containing "with
open('server/src/routes/proxy.ts', 'r') as f: content = f.read()" and remove the
extra "{" (or replace it with the intended Python logic/statement) so the file
is valid Python and will compile with python -m py_compile do_fix.py.
In `@fix_streaming.py`:
- Around line 12-21: The variable new_streaming is assigned with an unterminated
raw triple-quoted string (new_streaming = r''' ...) causing a SyntaxError; fix
by closing that raw triple-quoted string with a matching r''' terminator and
ensure any internal triple quotes are escaped or changed so the literal ends
properly; locate the new_streaming assignment in fix_streaming.py and add the
missing closing triple quotes (or adjust surrounding quotes/escaping) so the
Python file parses.
In `@new_streaming_block.txt`:
- Around line 22-28: The snippet for stallTimeout is truncated leaving the
setTimeout callback and Promise executor unclosed; fix the function stallTimeout
by properly closing the setTimeout callback and the new Promise executor and
providing a timeout duration (e.g. setTimeout(() => {
reject(Object.assign(...)); }, timeoutMs)); ensure the timer variable is
declared/used consistently (const timer = setTimeout(...)) and close with the
matching braces and parentheses so stallTimeout returns a well-formed
Promise<never>.
In `@server/src/__tests__/routes/fallback.test.ts`:
- Around line 56-61: The test exercising '/api/fallback' currently only asserts
pool values; extend it to cover auth boundaries: add an assertion that calling
request(app, 'GET', '/api/fallback') without the admin key returns 401, that
calling it with a valid admin key header returns 200 and the existing body
checks (the existing request(...) call can be reused or duplicated), and that
calling '/api/fallback' with the unified API key header (the key used elsewhere
for '/v1/*' tests) returns 401. Reference the existing test title "GET
/api/fallback pool values are valid ModelPool enum values", the request(app,
'GET', '/api/fallback') invocation, and the ModelPool checks when adding these
auth assertions.
In `@server/src/__tests__/routes/stream-heartbeat-stall.test.ts`:
- Around line 7-23: The test helper function request(...) can leak the server if
fetch() or res.text() throws; modify the request function to start the server
the same way but wrap the fetch and response handling in a try/finally, and in
finally await server closure by converting server.close into a Promise (e.g.,
await new Promise(resolve => server.close(resolve))) so the server is always
closed and awaited even on errors; update the function named request to ensure
deterministic shutdown.
- Around line 14-17: The helper request() currently injects only the unified API
key for paths starting with '/v1/' (uses getUnifiedApiKey()), causing setup
calls to '/api/*' (e.g., POST to '/api/keys') to run without admin credentials;
update the headers logic to apply the admin credential for '/api/' routes by
adding a branch that sets Authorization: `Bearer ${getAdminApiKey()}` when
path.startsWith('/api/'), keep the existing unified-key branch for '/v1/', and
ensure the Content-Type behavior remains unchanged so admin-gated setup calls
succeed.
In `@server/src/__tests__/services/router.test.ts`:
- Around line 31-33: The INSERT in the "highest priority model" test is
malformed: replace the duplicated/broken VALUES clauses in the db.prepare call
with a single VALUES (?, ?, ?, ?, ?, ?, ?) placeholder list and call .run(...)
with the parameters (e.g., 'groq', 'test', encrypted, iv, authTag, 'healthy',
1). Also finish the truncated test by completing the groqKey assignment (e.g.,
const groqKey = encrypt(...)), ensuring you pass the resulting encrypted, iv,
authTag into the prepared statement, execute the insert, and add the necessary
assertions/cleanup so router.test.ts compiles and the test executes.
In `@server/src/providers/base.ts`:
- Around line 138-145: throwWrappedError builds the Error message and status
incorrectly for some wrapped payloads; update the logic in throwWrappedError to
derive the message from the actual wrapped payload (use this.extractErrorMessage
on errPayload as well as body so string-form payloads like { error: "..." } are
captured) and set error.status only after validating the numeric code (parse
Number((errPayload as Record<string, unknown>).code) and check isFinite or
Number.isFinite before assigning, otherwise default to 200). Ensure you
reference the same symbols: throwWrappedError, this.extractErrorMessage,
errPayload, body, and error.status when applying these fixes.
In `@server/src/providers/cloudflare.ts`:
- Around line 123-130: The local catch in the Cloudflare stream parser is
swallowing wrapped 200 errors thrown by throwWrappedError(parsed), so they never
reach retry/cooldown handling. Update the logic in the stream parsing path
around isWrappedError and throwWrappedError so wrapped errors are detected and
rethrown outside the malformed-chunk catch, while keeping only JSON parse
failures or truly invalid chunks suppressed. Use the existing parsing flow in
the provider method to separate wrapped-error handling from the generic catch.
In `@server/src/providers/cohere.ts`:
- Around line 114-121: The current try/catch around JSON.parse in the stream
loop swallows errors thrown by throwWrappedError(parsed), preventing
retry/fallback logic from seeing wrapped errors; update the logic in the loop
that uses isWrappedError and throwWrappedError so that JSON.parse remains inside
a narrow try for parse-only failures but any detected wrapped error is thrown
outside that catch (e.g., parse into a local variable in try, then after the try
check isWrappedError(parsed) and call throwWrappedError(parsed) so it escapes
the parse-catch), ensuring throwWrappedError is not caught by the parse error
handler.
In `@server/src/providers/openai-compat.ts`:
- Around line 130-137: The catch is swallowing ProviderApiError because
throwWrappedError(parsed) is inside the try that catches all errors; change the
control flow so JSON.parse errors are handled but ProviderApiError thrown by
throwWrappedError is propagated: parse the chunk inside a small try/catch that
only handles JSON.parse failures (skip malformed), then after successful parse
call this.isWrappedError(parsed) and, if true, call
this.throwWrappedError(parsed) outside the parse-only catch (or rethrow if the
caught error is a ProviderApiError). This involves updating the parsing block
around ChatCompletionChunk so throwWrappedError and its ProviderApiError are not
swallowed.
In `@server/write_test.py`:
- Line 29: The test contains an unterminated/invalid string literal "
expect(() => route{ which looks like leftover JS—fix by replacing that line with
a valid Python assertion or properly terminated string/statement (e.g., use
assert <condition> or close the quotes and parentheses) so server/write_test.py
parses; also remove the hardcoded absolute path '/home/vi/...' and build the
output path portably using pathlib or os.path (use Path(__file__).parent /
"relative_output_dir" or os.path.join(os.path.dirname(__file__),
"relative_output_dir")) and replace the hardcoded variable (the literal
'/home/vi/...') with that portable path.
In `@server/write_tests.py`:
- Around line 7-45: The variable part1 (a triple-quoted Python string) is
unterminated because the JS template literal stops mid-line ("const groqKey{");
close the Python triple-quoted string and restore the remainder of the JS test
template so part1 contains a valid complete test file. Locate the part1
assignment in server/write_tests.py, add the terminating triple quotes (""") and
ensure the included JS snippet completes the broken line (finish the "const
groqKey..." insertion and any missing test blocks) so the generated string is
syntactically valid JavaScript when written out.
---
Outside diff comments:
In `@server/src/__tests__/services/router.test.ts`:
- Around line 63-64: The test file is truncated at "const groqKey = encrypt" so
complete the encrypt(...) call and restore the test block; specifically finish
the expression for groqKey by calling encrypt with the intended plaintext or
test mock (e.g., encrypt('test-groq-key') or the project’s GROQ key/mocked
value), add the missing semicolon, and re-add the remaining assertions and
closing braces of the test/describe block so the file parses; look for the
encrypt identifier and the surrounding it/describe that reference groqKey to
ensure you restore the original assertions and closures.
In `@server/src/routes/proxy.ts`:
- Around line 1505-1537: The mid-stream retryable error branch currently bans
entire providers via banPlatformFromSession and addProviderModelsToSkipModels
for LongCat and omits Owl Alpha; change it to perform model-level skipping
instead: when route.platform === 'longcat' and isRetryableStreamError(streamErr)
replace banPlatformFromSession('longcat') and
addProviderModelsToSkipModels(skipModels, 'longcat') with
skipModels.add(route.modelDbId) (so only the failing model is skipped), and
ensure the sticky-preference clear logic still checks prefRow?.platform ===
'longcat' before clearing preferredModel/preferredKeyId; additionally add an
identical mid-stream branch for route.platform === 'owl-alpha' that uses
skipModels.add(route.modelDbId) and the same sticky-preference clearing for
'owl-alpha' so Owl Alpha retryable stream errors are handled at model level
rather than falling through.
---
Nitpick comments:
In `@server/src/services/router.ts`:
- Around line 184-187: Replace the duplicated literal 7.0 recency divisor with a
single derived-and-bound value calculated from ANALYTICS_WINDOW_MS (e.g., const
windowDays = ANALYTICS_WINDOW_MS / (1000*60*60*24)) and reuse that variable in
both weighted expressions (the MIN(1.0, 1.0 - (julianday('now') -
julianday(created_at)) / <window>)) for total and successes; bind it into the
SQL once (use a named parameter like :windowDays or a positional placeholder) so
the same value is used in SUM(MAX(...)) and SUM(CASE WHEN ... THEN MAX(...)) and
cannot diverge.
In `@server/write_test.py`:
- Line 4: Replace the hardcoded absolute string assigned to the variable path in
write_test.py with a repository-relative construction: build the path from the
script's location (e.g., using Path(__file__) / os.path.dirname(__file__) and
os.path.join or pathlib.Path with .resolve()/parents to reach the repository
root) and then append src/__tests__/services/router.test.ts so the path variable
is portable across machines; update the assignment to use that computed path
instead of '/home/vi/...'.
In `@server/write_tests.py`:
- Line 4: Replace the hard-coded user-specific absolute path in the variable
path with a machine-independent relative or programmatically-resolved path;
update the assignment to compute the test file location using pathlib or os.path
(e.g., Path(__file__).resolve().parents[...] / "src" / "__tests__" / "services"
/ "router.test.ts" or os.path.join relative to the repository/script location)
so the variable path in server/write_tests.py is portable across CI and other
developer machines.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro Plus
Run ID: b69a8c89-5b04-46c0-a2cc-f4d866a07ebe
📒 Files selected for processing (43)
.roo/specs/disable-sticky-on-auto/design.md.roo/specs/disable-sticky-on-auto/requirements.md.roo/specs/disable-sticky-on-auto/tasks.md.roo/specs/generalized-thread-protection/design.md.roo/specs/generalized-thread-protection/requirements.md.roo/specs/generalized-thread-protection/tasks.md.roo/specs/owl-alpha-longcat-model-routing/design.md.roo/specs/owl-alpha-longcat-model-routing/requirements.md.roo/specs/owl-alpha-longcat-model-routing/tasks.md.roo/specs/recency-biased-thompson-sampling/design.md.roo/specs/recency-biased-thompson-sampling/requirements.md.roo/specs/recency-biased-thompson-sampling/tasks.md.roo/specs/sse-stream-heartbeat-stall-protection/design.md.roo/specs/sse-stream-heartbeat-stall-protection/requirements.md.roo/specs/sse-stream-heartbeat-stall-protection/tasks.md.roo/specs/transient-model-cooldown/design.md.roo/specs/transient-model-cooldown/requirements.md.roo/specs/transient-model-cooldown/tasks.md.roo/specs/wrapped-error-interception/design.md.roo/specs/wrapped-error-interception/requirements.md.roo/specs/wrapped-error-interception/tasks.mdclient/src/pages/FallbackPage.tsxdo_fix.pyfix.pyfix_streaming.pyfix{new_streaming_block.txtserver/src/__tests__/routes/fallback.test.tsserver/src/__tests__/routes/provider-session-ban.test.tsserver/src/__tests__/routes/proxy-tools.test.tsserver/src/__tests__/routes/stream-heartbeat-stall.test.tsserver/src/__tests__/routes/transient-cooldown.test.tsserver/src/__tests__/services/router.test.tsserver/src/providers/base.tsserver/src/providers/cloudflare.tsserver/src/providers/cohere.tsserver/src/providers/google.tsserver/src/providers/openai-compat.tsserver/src/routes/proxy.tsserver/src/services/router.tsserver/src/services/threadProtection.tsserver/write_test.pyserver/write_tests.py
| ``` | ||
| THREAD_PROTECTION_PLATFORMS="longcat:provider-ban,groq:model-skip" | ||
| ``` |
There was a problem hiding this comment.
Add a language tag to the fenced code block.
The fence at Line 43 is missing a language identifier, which triggers markdownlint MD040.
Proposed fix
-```
+```bash
THREAD_PROTECTION_PLATFORMS="longcat:provider-ban,groq:model-skip"</details>
<!-- suggestion_start -->
<details>
<summary>📝 Committable suggestion</summary>
> ‼️ **IMPORTANT**
> Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```suggestion
🧰 Tools
🪛 markdownlint-cli2 (0.22.1)
[warning] 43-43: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In @.roo/specs/generalized-thread-protection/design.md around lines 43 - 45, The
fenced code block containing
THREAD_PROTECTION_PLATFORMS="longcat:provider-ban,groq:model-skip" is missing a
language tag; update that fenced block (the block surrounding the
THREAD_PROTECTION_PLATFORMS line) to include a language identifier such as bash
so the fence becomes ```bash and resolves markdownlint MD040 while keeping the
same content.
|
|
||
| ## Problem Statement | ||
|
|
||
| The proxy route handler (`server/src/routes/proxy.ts`) contains 6+ hardcoded branches that special-case the `longcat`{ No newline at end of file |
There was a problem hiding this comment.
Fix the truncated requirement sentence.
Line 5 is malformed (longcat{) and the problem statement is incomplete, so the requirement is currently ambiguous.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In @.roo/specs/generalized-thread-protection/requirements.md at line 5, Fix the
malformed requirement sentence on line 5 by removing the stray "{", closing the
inline code delimiter around `longcat`, and rewriting the truncated fragment
into a complete sentence that clearly states the issue (e.g., that the proxy
route handler’s exported function in proxy.ts contains 6+ hardcoded branches
that special-case the `longcat` route). Ensure the sentence is grammatical, the
backticks are balanced, and the requirement precisely describes the
special-casing problem so it’s unambiguous.
| 5. Return from the handler (no retry on stall — the stream is already partially delivered) | ||
|
|
There was a problem hiding this comment.
Clarify pre-stream stall retry semantics.
Line 47 says stall handling should not retry, and Line 132 marks stalled-stream retry logic out of scope. That conflicts with the pre-stream 504/retry behavior described in companion tasks/tests. Please make pre-stream vs mid-stream stall behavior explicit and consistent here.
Also applies to: 128-132
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In @.roo/specs/sse-stream-heartbeat-stall-protection/requirements.md around
lines 47 - 48, The requirement is ambiguous about retry semantics for stalls:
update the document to explicitly distinguish "pre-stream" stalls (before any
response bytes are sent) from "mid-/post-stream" stalls (after partial
delivery). Amend the bullet "5. Return from the handler (no retry on stall — the
stream is already partially delivered)" and the paragraph that marks
stalled-stream retry logic out of scope to state that 504/timeouts occurring
before any data is flushed should allow a retry per the companion tests, whereas
any stall detected after the first byte is sent must return without retry. Also
mirror this clarification where the doc references stalled-stream retry logic
(the section referencing out-of-scope behavior) so the pre-stream 504/retry
behavior and mid-stream no-retry behavior are consistent throughout.
| } | ||
| ``` | ||
|
|
||
| **Note**: The `catch` block already skips malformed chunks. The `throwWrappedError()` call throws before `yield`, so the generator terminates immediately. The `try/catch` around `JSON.parse` does NOT catch the `ProviderApiError` thrown by `throwWrappedError()` because that throw happens after successful parsing — it propagates out of the generator to the consumer in `proxy.ts`. |
There was a problem hiding this comment.
Fix the streaming propagation note; it currently states incorrect exception behavior.
Line 120 says the surrounding try/catch will not catch throwWrappedError(), but with the shown structure it will be caught and swallowed. Please update this note to require parse-only try/catch, with wrapped-error checks outside it.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In @.roo/specs/wrapped-error-interception/design.md at line 120, The note is
incorrect about exception flow: with the current structure throwWrappedError()
and ProviderApiError can be caught by the surrounding try/catch, so update the
text to require a parse-only try/catch and to perform wrapped-error checks
outside that block; specifically describe that JSON.parse should be inside a
minimal try/catch that only handles malformed chunk parsing, while
throwWrappedError() (and any ProviderApiError checks) must occur after the parse
block (outside the try) so they propagate to the consumer (e.g., proxy.ts)
rather than being swallowed.
| - [x] 5. Add wrapped-error check in `OpenAICompatProvider.streamChatCompletion()` in `server/src/providers/openai-compat.ts` | ||
| - Inside the `try` block at line 126, after `JSON.parse(data)` succeeds: | ||
| - Insert: `if (this.isWrappedError(parsed)) { this.throwWrappedError(parsed); }` | ||
| - Note: assign the result of `JSON.parse` to a variable first, then check, then yield |
There was a problem hiding this comment.
Update streaming task steps to avoid swallowing wrapped errors.
These steps currently instruct placing throwWrappedError() inside the parse try block. That causes wrapped errors to be consumed by the generic catch. The tasks should require parse-only try/catch, then wrapped-error check/throw outside it.
Also applies to: 37-40, 46-49
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In @.roo/specs/wrapped-error-interception/tasks.md around lines 28 - 31, The
current task instructs throwing wrapped errors inside the parse try/catch which
causes them to be swallowed; update the instructions for
OpenAICompatProvider.streamChatCompletion() so the try/catch only surrounds
JSON.parse: assign the result of JSON.parse(data) to a variable inside the try,
exit the try/catch, then immediately call this.isWrappedError(parsed) and if
true call this.throwWrappedError(parsed) (outside the parse try/catch), and only
after that yield the parsed value; apply the same change to the other listed
task steps.
| try { | ||
| yield JSON.parse(data) as ChatCompletionChunk; | ||
| const parsed = JSON.parse(data) as ChatCompletionChunk; | ||
| if (this.isWrappedError(parsed)) { | ||
| this.throwWrappedError(parsed); | ||
| } | ||
| yield parsed; | ||
| } catch { | ||
| // Skip malformed chunks |
There was a problem hiding this comment.
Wrapped stream errors are currently swallowed by the local catch.
Because throwWrappedError(parsed) is inside this try, the catch treats it as a malformed chunk and suppresses it. This prevents wrapped 200 errors from propagating to retry/cooldown handling.
Suggested fix
- try {
- const parsed = JSON.parse(data) as ChatCompletionChunk;
- if (this.isWrappedError(parsed)) {
- this.throwWrappedError(parsed);
- }
- yield parsed;
- } catch {
- // Skip malformed chunks
- }
+ let parsed: ChatCompletionChunk;
+ try {
+ parsed = JSON.parse(data) as ChatCompletionChunk;
+ } catch {
+ // Skip malformed chunks
+ continue;
+ }
+ if (this.isWrappedError(parsed)) {
+ this.throwWrappedError(parsed);
+ }
+ yield parsed;🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@server/src/providers/cloudflare.ts` around lines 123 - 130, The local catch
in the Cloudflare stream parser is swallowing wrapped 200 errors thrown by
throwWrappedError(parsed), so they never reach retry/cooldown handling. Update
the logic in the stream parsing path around isWrappedError and throwWrappedError
so wrapped errors are detected and rethrown outside the malformed-chunk catch,
while keeping only JSON parse failures or truly invalid chunks suppressed. Use
the existing parsing flow in the provider method to separate wrapped-error
handling from the generic catch.
| try { | ||
| yield JSON.parse(data) as ChatCompletionChunk; | ||
| const parsed = JSON.parse(data) as ChatCompletionChunk; | ||
| if (this.isWrappedError(parsed)) { | ||
| this.throwWrappedError(parsed); | ||
| } | ||
| yield parsed; | ||
| } catch { | ||
| // Skip malformed chunks |
There was a problem hiding this comment.
Streaming wrapped errors won’t escape this block due to catch-all handling.
throwWrappedError(parsed) is executed inside the same try and gets swallowed by catch, so retry/fallback logic never sees it.
Suggested fix
- try {
- const parsed = JSON.parse(data) as ChatCompletionChunk;
- if (this.isWrappedError(parsed)) {
- this.throwWrappedError(parsed);
- }
- yield parsed;
- } catch {
- // Skip malformed chunks
- }
+ let parsed: ChatCompletionChunk;
+ try {
+ parsed = JSON.parse(data) as ChatCompletionChunk;
+ } catch {
+ // Skip malformed chunks
+ continue;
+ }
+ if (this.isWrappedError(parsed)) {
+ this.throwWrappedError(parsed);
+ }
+ yield parsed;🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@server/src/providers/cohere.ts` around lines 114 - 121, The current try/catch
around JSON.parse in the stream loop swallows errors thrown by
throwWrappedError(parsed), preventing retry/fallback logic from seeing wrapped
errors; update the logic in the loop that uses isWrappedError and
throwWrappedError so that JSON.parse remains inside a narrow try for parse-only
failures but any detected wrapped error is thrown outside that catch (e.g.,
parse into a local variable in try, then after the try check
isWrappedError(parsed) and call throwWrappedError(parsed) so it escapes the
parse-catch), ensuring throwWrappedError is not caught by the parse error
handler.
| try { | ||
| yield JSON.parse(data) as ChatCompletionChunk; | ||
| const parsed = JSON.parse(data) as ChatCompletionChunk; | ||
| if (this.isWrappedError(parsed)) { | ||
| this.throwWrappedError(parsed); | ||
| } | ||
| yield parsed; | ||
| } catch { | ||
| // Skip malformed chunks |
There was a problem hiding this comment.
Streaming wrapped errors are suppressed instead of propagated.
throwWrappedError(parsed) runs inside the try, so the local catch absorbs the ProviderApiError and the retry loop never sees it.
Suggested fix
- try {
- const parsed = JSON.parse(data) as ChatCompletionChunk;
- if (this.isWrappedError(parsed)) {
- this.throwWrappedError(parsed);
- }
- yield parsed;
- } catch {
- // Skip malformed chunks
- }
+ let parsed: ChatCompletionChunk;
+ try {
+ parsed = JSON.parse(data) as ChatCompletionChunk;
+ } catch {
+ // Skip malformed chunks
+ continue;
+ }
+ if (this.isWrappedError(parsed)) {
+ this.throwWrappedError(parsed);
+ }
+ yield parsed;📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| try { | |
| yield JSON.parse(data) as ChatCompletionChunk; | |
| const parsed = JSON.parse(data) as ChatCompletionChunk; | |
| if (this.isWrappedError(parsed)) { | |
| this.throwWrappedError(parsed); | |
| } | |
| yield parsed; | |
| } catch { | |
| // Skip malformed chunks | |
| let parsed: ChatCompletionChunk; | |
| try { | |
| parsed = JSON.parse(data) as ChatCompletionChunk; | |
| } catch { | |
| // Skip malformed chunks | |
| continue; | |
| } | |
| if (this.isWrappedError(parsed)) { | |
| this.throwWrappedError(parsed); | |
| } | |
| yield parsed; |
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@server/src/providers/openai-compat.ts` around lines 130 - 137, The catch is
swallowing ProviderApiError because throwWrappedError(parsed) is inside the try
that catches all errors; change the control flow so JSON.parse errors are
handled but ProviderApiError thrown by throwWrappedError is propagated: parse
the chunk inside a small try/catch that only handles JSON.parse failures (skip
malformed), then after successful parse call this.isWrappedError(parsed) and, if
true, call this.throwWrappedError(parsed) outside the parse-only catch (or
rethrow if the caught error is a ProviderApiError). This involves updating the
parsing block around ChatCompletionChunk so throwWrappedError and its
ProviderApiError are not swallowed.
| " });", | ||
| "", | ||
| " it('should throw when no keys are configured', () => {", | ||
| " expect(() => route{ No newline at end of file |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
set -euo pipefail
FILE="server/write_test.py"
if [ ! -f "$FILE" ]; then
echo "Missing file: $FILE"
exit 1
fi
echo "== $FILE =="
nl -ba "$FILE" | sed -n '1,120p'Repository: vi70x3/freellmapi
Length of output: 150
🏁 Script executed:
#!/bin/bash
set -euo pipefail
FILE="server/write_test.py"
if [ ! -f "$FILE" ]; then
echo "Missing file: $FILE"
exit 1
fi
echo "== $FILE (lines 1-120) =="
cat -n "$FILE" | sed -n '1,120p'
echo
echo "== $FILE (lines 15-45) =="
cat -n "$FILE" | sed -n '15,45p'Repository: vi70x3/freellmapi
Length of output: 2266
🏁 Script executed:
#!/bin/bash
set -euo pipefail
FILE="server/write_test.py"
echo "== $FILE =="
wc -l "$FILE"
echo
echo "== $FILE (lines 20-end) =="
cat -n "$FILE" | sed -n '20,$p'Repository: vi70x3/freellmapi
Length of output: 734
🏁 Script executed:
#!/bin/bash
set -euo pipefail
FILE="server/write_test.py"
echo "== py_compile check =="
python3 -m py_compile "$FILE" 2>&1 || true
echo
echo "== tail -n 10 =="
tail -n 10 "$FILE" | cat -n -w2Repository: vi70x3/freellmapi
Length of output: 323
Critical: fix server/write_test.py syntax error and hardcoded absolute output path
server/write_test.pyline 29 has an unterminated string literal (" expect(() => route{), causing aSyntaxErrorand preventing the generator from running.server/write_test.pyline 4 hardcodes/home/vi/...as the output path, hurting portability.
🧰 Tools
🪛 Ruff (0.15.15)
[warning] 29-29: missing closing quote in string literal
(invalid-syntax)
[warning] 29-29: unexpected EOF while parsing
(invalid-syntax)
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@server/write_test.py` at line 29, The test contains an unterminated/invalid
string literal " expect(() => route{ which looks like leftover JS—fix by
replacing that line with a valid Python assertion or properly terminated
string/statement (e.g., use assert <condition> or close the quotes and
parentheses) so server/write_test.py parses; also remove the hardcoded absolute
path '/home/vi/...' and build the output path portably using pathlib or os.path
(use Path(__file__).parent / "relative_output_dir" or
os.path.join(os.path.dirname(__file__), "relative_output_dir")) and replace the
hardcoded variable (the literal '/home/vi/...') with that portable path.
| part1 = """import { describe, it, expect, beforeAll, beforeEach } from 'vitest'; | ||
| import { initDb, getDb } from '../../db/index.js'; | ||
| import { encrypt } from '../../lib/crypto.js'; | ||
| import { routeRequest, refreshStatsCache, getAnalyticsScores } from '../../services/router.js'; | ||
|
|
||
| describe('Router', () => { | ||
| beforeAll(() => { | ||
| process.env.ENCRYPTION_KEY = '0'.repeat(64); | ||
| initDb(':memory:'); | ||
| }); | ||
|
|
||
| beforeEach(() => { | ||
| const db = getDb(); | ||
| db.prepare('DELETE FROM api_keys').run(); | ||
| const models = db.prepare('SELECT id, intelligence_rank FROM models ORDER BY intelligence_rank ASC').all() as any[]; | ||
| const update = db.prepare('UPDATE fallback_config SET priority = ? WHERE model_db_id = ?'); | ||
| for (let i = 0; i < models.length; i++) { | ||
| update.run(i + 1, models[i].id); | ||
| } | ||
| }); | ||
|
|
||
| it('should throw when no keys are configured', () => { | ||
| expect(() => routeRequest()).toThrow(/exhausted/i); | ||
| }); | ||
|
|
||
| it('should route to highest priority model with available key', () => { | ||
| const db = getDb(); | ||
| const { encrypted, iv, authTag } = encrypt('test-groq-key'); | ||
| db.prepare('INSERT INTO api_keys (platform, label, encrypted_key, iv, auth_tag, status, enabled) VALUES (?, ?, ?, ?, ?, ?, ?)').run('groq', 'test', encrypted, iv, authTag, 'healthy', 1); | ||
| const result = routeRequest(); | ||
| expect(result.platform).toBe('groq'); | ||
| expect(result.apiKey).toBe('test-groq-key'); | ||
| }); | ||
|
|
||
| it('should route to an available model when keys exist for multiple platforms', () => { | ||
| const db = getDb(); | ||
| const googleKey = encrypt('test-google-key'); | ||
| db.prepare('INSERT INTO api_keys (platform, label, encrypted_key, iv, auth_tag, status, enabled) VALUES (?, ?, ?, ?, ?, ?, ?)').run('google', 'test', googleKey.encrypted, googleKey.iv, googleKey.authTag, 'healthy', 1); | ||
| const groqKey{ No newline at end of file |
There was a problem hiding this comment.
Fix SyntaxError: part1 triple-quoted string is unterminated in server/write_tests.py
File: server/write_tests.py (lines 7-45)
part1 = """ ... is cut off before the closing """ (the snippet ends mid-statement like const groqKey{), which would make the generator script fail to parse. Add the missing closing triple quotes and ensure the full template string content is included.
🧰 Tools
🪛 Ruff (0.15.15)
[warning] 7-45: missing closing quote in string literal
(invalid-syntax)
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@server/write_tests.py` around lines 7 - 45, The variable part1 (a
triple-quoted Python string) is unterminated because the JS template literal
stops mid-line ("const groqKey{"); close the Python triple-quoted string and
restore the remainder of the JS test template so part1 contains a valid complete
test file. Locate the part1 assignment in server/write_tests.py, add the
terminating triple quotes (""") and ensure the included JS snippet completes the
broken line (finish the "const groqKey..." insertion and any missing test
blocks) so the generated string is syntactically valid JavaScript when written
out.
Summary by Sourcery
Refine routing, streaming, and error-handling behavior across the proxy, router, and providers to improve reliability, isolation between sessions, and observability for specific models and pools.
New Features:
Bug Fixes:
Enhancements:
Documentation:
Tests:
Summary by CodeRabbit
New Features
Bug Fixes
Tests