feat(router): add cache TTL gate, fix auto-refresh order, fix indentation, deduplicate RoutingMode type#12
feat(router): add cache TTL gate, fix auto-refresh order, fix indentation, deduplicate RoutingMode type#12vi70x3 wants to merge 10 commits into
Conversation
…alized thread protection scanner
- Add ANALYTICS_WINDOW_DAYS constant for SQL decay divisor - Extend ModelStats with rawSuccesses/rawTotal fields - Rewrite SQL query with CTE-based time-decay weighting (julianday) - Update statsCache population to use weighted columns - Add Math.max(0.1, ...) guards in thompsonSampleScore, smartSampleScore, getAnalyticsScore, and getSmartAnalyticsScore to handle fractional weights - Update getAnalyticsScores dashboard to display rawTotal counts - Add modelId field to getAnalyticsScores return type for fallback route compat
- Add 'fast' to RoutingMode type and ModelPool enum (Fast/Balanced/Smart) - Implement classifyModel() to categorize models by speed/intelligence ranks - Implement fastSampleScore() with 3x speed weight, 1.5x TTFB factor, minimal intelligence - Add pool-based routing in routeRequest: fast pool first, balanced pool borrowing fallback - Add freellmapi/auto-fast model ID support in proxy routes - Extend /v1/models endpoint with primaryPool field for all models - Add unit tests for pool classification (T13-T16) and fast routing (T17-T18) - Add integration tests for /v1/models primaryPool fields (T13-T17) - Update README.md and CLAUDE.md documentation
- extractErrorMessage: handle string error payloads and object errors with message field - throwWrappedError: guard against non-numeric error codes (NaN prevention) - cloudflare/cohere/openai-compat: move isWrappedError check outside JSON.parse try/catch in SSE loops
…tion, deduplicate RoutingMode type
📝 WalkthroughWalkthroughAdds a third routing mode ("fast") and model-pool classification, implements recency-weighted analytics and fast-mode scoring, disables sticky sessions in balanced mode, introduces transient per-model cooldowns and a thread-protection scaffold, rewrites SSE streaming to add heartbeat/stall detection, detects wrapped provider errors on HTTP 200 responses, adds frontend pool UI components, and expands test coverage and specs. ChangesRouter, Proxy, and Provider Core Implementation
Specifications and Tasks
Public Documentation
Estimated code review effort: Possibly related PRs
✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
⚔️ Resolve merge conflicts
|
There was a problem hiding this comment.
Code Review
This pull request introduces several major routing and stability enhancements, including a new fast routing mode (freellmapi/auto-fast) with model pool classification, dynamic thread protection to prevent concurrent session conflicts, SSE stream heartbeats with stall protection, and a shared transient model cooldown mechanism. It also excludes LongCat and Owl Alpha from balanced routing while adding smart preference for Owl Alpha. The review feedback is highly constructive, pointing out a critical 'Headers already sent' crash during pre-stream stalls, an unused import, a database query performance bottleneck under thread protection, a double semicolon syntax error, and minor indentation inconsistencies.
Important
The consumer version of Gemini Code Assist on GitHub is being sunset. Starting June 18, 2026, new organization installations will be blocked, and all code review activity will officially cease on July 17, 2026.
For more details on the timeline and next steps, please review the Help Documentation.
| } catch (streamErr: any) { | ||
| cleanupStream(); | ||
| if (streamStarted) { |
There was a problem hiding this comment.
If a pre-stream stall is detected, the stall handler sends a 504 response and ends the connection. However, if the generator subsequently throws an error, it will be caught by this catch block. Since streamStarted is false, the error will be rethrown and propagate to the outer retry loop, which will attempt to fallback to another provider and write to the already-ended response, causing a 'Headers already sent' crash. Checking streamAborted here and returning early prevents this crash.
} catch (streamErr: any) {\n cleanupStream();\n if (streamAborted) {\n logRequest(route.platform, route.modelId, 'error', estimatedInputTokens, totalOutputTokens, Date.now() - start, ttfbMs, 'stream_stalled');\n return;\n }\n if (streamStarted) {| import { z } from 'zod'; | ||
| import { evaluateThreadProtection } from '../services/threadProtection.js'; | ||
| import type { ChatCompletionChunk, ChatCompletionResponse, ChatMessage, ChatToolCall, ChatToolDefinition } from '@freellmapi/shared/types.js'; |
There was a problem hiding this comment.
The import of evaluateThreadProtection is unused in this file. The design spec for the 'Generalized Thread Protection Scanner' seems to have been implemented inline or left as a placeholder in threadProtection.ts. Consider removing this unused import to keep the codebase clean.
import { z } from 'zod';\nimport type { ChatCompletionChunk, ChatCompletionResponse, ChatMessage, ChatToolCall, ChatToolDefinition } from '@freellmapi/shared/types.js';| // clear the set and let the request through rather than failing outright. | ||
| if (activeCooldownModels.size > 0) { | ||
| const db = getDb(); | ||
| const allEnabled = db.prepare('SELECT id FROM models WHERE enabled = 1').all() as Array<{ id: number }>; |
There was a problem hiding this comment.
| }); | ||
| } else { | ||
| res.write(`data: ${JSON.stringify(payload)}\n\n`); | ||
| res.write('data: [DONE]\n\n');; |
| console.warn(`[Proxy] Mid-stream 5xx from ${route.platform} — skipping model ${route.modelId} only`); | ||
| skipModels.add(route.modelDbId); | ||
| } | ||
| } | ||
|
|
||
| // Generalized truncation detection for any provider (not just LongCat) | ||
| // Register global transient cooldown for any 5xx mid-stream error |
There was a problem hiding this comment.
The indentation of these lines is inconsistent with the surrounding block. Adjust the indentation to match the 14-space statement indentation and 12-space closing brace indentation of the parent block.
| console.warn(`[Proxy] Mid-stream 5xx from ${route.platform} — skipping model ${route.modelId} only`); | |
| skipModels.add(route.modelDbId); | |
| } | |
| } | |
| // Generalized truncation detection for any provider (not just LongCat) | |
| // Register global transient cooldown for any 5xx mid-stream error | |
| // Register global transient cooldown for any 5xx mid-stream error\n transientModelCooldowns.set(route.modelDbId, Date.now() + TRANSIENT_COOLDOWN_MS);\n console.log(\`[TransientCooldown] registered global cooldown for modelDbId=\${route.modelDbId} (\${TRANSIENT_COOLDOWN_MS / 1000}s)\`);\n } |
| } | ||
| } | ||
|
|
||
| if (isRetryableError(err)) { | ||
| if (isRetryableError(err)) { |
Code Review SummaryStatus: 2 Issues Found (existing) | Recommendation: Address before merge Overview
Issue Details (click to expand)
Other Observations (not in diff)
Files Reviewed (3 files)
Fix these issues in Kilo Cloud Reviewed by nemotron-3-super-120b-a12b-20230311:free · 924,387 tokens |
…haustAll, remove double semicolons
There was a problem hiding this comment.
Actionable comments posted: 20
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
server/src/routes/proxy.ts (1)
1479-1484:⚠️ Potential issue | 🟠 Major | ⚡ Quick winLongCat is still being banned at provider scope in the new error paths.
These branches still ban the whole
longcatplatform instead of skipping onlyroute.modelDbId, so one failing LongCat request will also block future LongCat models for the session. That breaks the model-level contract described in this PR.Also applies to: 1551-1554, 1579-1583, 1610-1613, 1711-1714, 1735-1738
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@server/src/routes/proxy.ts` around lines 1479 - 1484, The code currently calls banPlatformFromSession(...) and addProviderModelsToSkipModels(...) for LongCat branches, which bans the entire platform; instead, remove those two calls and add the failing modelDbId to the session-level skip list so only route.modelDbId is skipped. Locate the LongCat branches (the block checking route.platform === 'longcat') and replace banPlatformFromSession(normalizedMessages, routingMode, 'longcat', route.modelDbId) and addProviderModelsToSkipModels(skipModels, 'longcat') with logic that records route.modelDbId under the LongCat provider in skipModels (e.g., push or set skipModels['longcat'] to include route.modelDbId or call an existing addModel-level helper), ensuring you use the existing skipModels variable and route.modelDbId so only that model is skipped for the session.
🧹 Nitpick comments (4)
.roo/specs/disable-sticky-on-auto/design.md (1)
11-17: ⚡ Quick winAdd language identifier to fenced code block.
The fenced code block should specify a language for proper syntax highlighting.
📝 Proposed fix
-``` +```typescript function stickyOp(messages, routingMode, ...) { const key = getSessionKey(messages, routingMode); if (!key) return <no-op value>; // undefined, false, or early return ...operate on stickySessionMap using key... }Based on learnings from static analysis: markdownlint-cli2 flagged this as missing a language specifier.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In @.roo/specs/disable-sticky-on-auto/design.md around lines 11 - 17, The fenced code block containing the example for function stickyOp is missing a language identifier; update the block to include a language (e.g., "typescript" or "js") so syntax highlighters handle getSessionKey, stickyOp, and stickySessionMap correctly—locate the block showing function stickyOp(messages, routingMode, ...) and add the language specifier immediately after the opening backticks (for example ```typescript)..roo/specs/generalized-thread-protection/design.md (1)
43-45: ⚡ Quick winAdd language identifier to fenced code block.
The environment variable example block should specify a language (e.g.,
bashorshell) for proper syntax highlighting and markdown compliance.📝 Proposed fix
-``` +```bash THREAD_PROTECTION_PLATFORMS="longcat:provider-ban,groq:model-skip"</details> <details> <summary>🤖 Prompt for AI Agents</summary>Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.In @.roo/specs/generalized-thread-protection/design.md around lines 43 - 45,
Update the fenced code block that shows the environment variable
THREAD_PROTECTION_PLATFORMS so it includes a language identifier (e.g., use
bash orshell) for proper markdown highlighting; locate the block
containing THREAD_PROTECTION_PLATFORMS="longcat:provider-ban,groq:model-skip"
and change the opening fence to include the language token.</details> </blockquote></details> <details> <summary>fix.py (1)</summary><blockquote> `4-8`: _⚡ Quick win_ **Remove or complete this helper script.** Right now it only reads `server/src/routes/proxy.ts` and stops, so running it does nothing and leaves a misleading artifact in the repo. <details> <summary>🤖 Prompt for AI Agents</summary> ``` Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@fix.py` around lines 4 - 8, This helper script is incomplete and should be removed or finished; either delete fix.py entirely, or implement its intended behavior: open('server/src/routes/proxy.ts', 'r') should read into lines, locate the streaming block start by searching for the string "for await (const chunk" and then find the matching block end (matching braces) and then perform the intended action (e.g., print the start/end indices, extract the block, or modify and write back) before closing—use the existing lines variable and proper file write handling to persist changes if any. ``` </details> </blockquote></details> <details> <summary>client/src/components/pool-badge.tsx (1)</summary><blockquote> `7-7`: _⚡ Quick win_ **Inconsistent color naming in pool configurations.** Each pool configuration mixes different color families in text and background: - Fast: `bg-emerald-100` with `text-green-800` (emerald vs green) - Balanced: `bg-slate-100` with `text-gray-700` (slate vs gray) - Smart: `bg-purple-100` with `text-indigo-700` (purple vs indigo) While these may be intentional design choices, using consistent color families improves predictability and maintainability. <details> <summary>🎨 Proposed fix for color consistency</summary> ```diff fast: { icon: "⚡", label: "Fast", - className: "bg-emerald-100 text-green-800 dark:bg-emerald-900/30 dark:text-emerald-400", + className: "bg-emerald-100 text-emerald-800 dark:bg-emerald-900/30 dark:text-emerald-400", }, balanced: { icon: "⚖️", label: "Balanced", - className: "bg-slate-100 text-gray-700 dark:bg-slate-800 dark:text-slate-300", + className: "bg-slate-100 text-slate-700 dark:bg-slate-800 dark:text-slate-300", }, smart: { icon: "🧠", label: "Smart", - className: "bg-purple-100 text-indigo-700 dark:bg-purple-900/30 dark:text-purple-400", + className: "bg-purple-100 text-purple-700 dark:bg-purple-900/30 dark:text-purple-400", }, ``` </details> Also applies to: 12-12, 17-17 <details> <summary>🤖 Prompt for AI Agents</summary> ``` Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@client/src/components/pool-badge.tsx` at line 7, The pool badge color classes mix different Tailwind color families; update the className entries in the PoolBadge/pool configuration for Fast, Balanced, and Smart so the text color uses the same family as the background (e.g., bg-emerald-100 -> text-emerald-800 for Fast, bg-slate-100 -> text-slate-700 for Balanced, bg-purple-100 -> text-purple-700 for Smart). Locate the className fields in client/src/components/pool-badge.tsx (the PoolBadge or pool config entries referenced on the three lines) and replace the mismatched text color tokens to match each background family for consistency. ``` </details> </blockquote></details> </blockquote></details> <details> <summary>🤖 Prompt for all review comments with AI agents</summary>Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.Inline comments:
In @.roo/specs/generalized-thread-protection/requirements.md:
- Line 5: The requirement line is truncated and contains a stray '{' so rewrite
it into a single complete sentence that clearly states the problem and expected
outcome: describe that the proxy route handler (the exported router/handler in
proxy.ts) currently contains 6+ hardcoded branches that special-case the
"longcat" route/host/prefix and specify what should be done instead (e.g.,
remove hardcoded branches and replace with a configurable/generalized routing or
policy mechanism), ensuring no stray punctuation and including the scope (which
branches/behavior to change) and the acceptance criteria for the fix.In @.roo/specs/model-pools-display/design.md:
- Around line 109-132: The fenced ASCII diagrams (the triple-backtick blocks
that start before the "Bandit routing" diagram and the similar blocks for
"Balanced (5 models)" and "Smart (2 models)") must include a markdown language
identifier; update each opening fence fromtotext (or ```plain) so
linters/renderers recognize them as plain text and the diagrams render
consistently.In @.roo/specs/model-pools-display/requirements.md:
- Line 90: The document ends with a truncated heading "## Out of" — change that
heading to "## Out of Scope" and populate it with explicit exclusions for this
feature (e.g., what model-pool UI interactions, permissions, API endpoints, or
integrations are not covered), ensuring the section clarifies boundaries for
implementation and QA; update any table of contents or cross-references if
present to reflect the completed heading.In @.roo/specs/owl-alpha-longcat-model-routing/design.md:
- Around line 5-46: The markdown diagrams use triple-fenced blocks without a
language which triggers MD040; update each ASCII diagram/code block (the ones
showing Balanced Router / Smart Router, the routeRequest() flow, and the
handleChatCompletion() flow — e.g., the block containing "Balanced Router",
"Smart Router", "routeRequest() in router.ts", and "handleChatCompletion() in
proxy.ts") to include a fence language of text (use ```text) so markdownlint no
longer flags them; apply the same change to the other listed blocks (the ones
noted around lines 52-78, 82-92, 96-105).In @.roo/specs/owl-alpha-longcat-model-routing/requirements.md:
- Around line 122-126: The Markdown links in the requirements list are using
../server/... which resolves relative to
.roo/specs/owl-alpha-longcat-model-routing/ instead of the repo root; update
each link target to go up three levels to reference the true server files (e.g.,
change ../server/src/services/router.ts to
../../../server/src/services/router.ts and similarly update
../server/src/routes/proxy.ts and ../server/src/db/index.ts references used for
migrateModelsV15/migrateModelsV16 and the LongCat smart/sticky/ban logic).In @.roo/specs/recency-biased-thompson-sampling/design.md:
- Line 5: The relative source links in
.roo/specs/recency-biased-thompson-sampling/design.md point to server files
using paths like server/src/... which resolve under the spec folder; update each
such link to use the repository-root relative prefix "../../../server/..." so
they open correctly from the spec location (apply the same fix to the other
occurrences noted around lines 31, 85, 115, 132, 138, and 229-230); search for
markdown links referencing "server/src/..." in this file and replace them with
"../../../server/src/..." to correct all references.In @.roo/specs/wrapped-error-interception/tasks.md:
- Around line 28-31: In OpenAICompatProvider.streamChatCompletion(), constrain
the wrapped-error check to only the first parsed SSE payload: after parsing each
chunk (assign JSON.parse(data) to a variable), check this.isWrappedError(parsed)
and call this.throwWrappedError(parsed) only if no previous payload has been
yielded yet (e.g., guard with a local flag like sawFirstPayload or check a
stream-start state) so later parsed chunks do not abort an already-started
downstream stream; then proceed to yield the parsed payload and set the flag.In
@client/src/components/pool-section.tsx:
- Around line 5-30: The PoolSection component currently renders a static
container but needs collapse/expand behavior; update PoolSection to maintain
internal open state (e.g., useState) or accept a prop like defaultOpen, render a
clickable header (wrap the PoolBadge + title in a button/div with onClick) that
toggles that state, show/hide the children based on the state, add an accessible
aria-expanded attribute on the header and visually indicate state with a
chevron/icon, and preserve the existing className and styling when collapsed or
expanded; reference the PoolSection component and PoolBadge for where to attach
the toggle and children visibility logic.In
@do_fix.py:
- Around line 1-7: The file do_fix.py contains an incomplete helper script
(orphaned '{' at the top and a partial read into the variable content) and
should be either removed or completed; to fix, either delete do_fix.py from the
commit if it was temporary, or complete the script by removing the stray '{',
add necessary imports and a proper main() implementation that opens
server/src/routes/proxy.ts, reads content, performs the replacement of the
streaming block with the Promise.race-based stall-detection snippet, writes the
modified content back safely (use a temp file and atomic replace), handle
exceptions and return non-zero on failure, and ensure the shebang and file
permissions are correct so do_fix.py is a runnable, syntactically valid helper;
locate the incomplete references to content and the
open('server/src/routes/proxy.ts', 'r') call to implement or remove them.In
@fix_streaming.py:
- Around line 12-21: The string assigned to the variable new_streaming in
fix_streaming.py is left unterminated (starts with r''' but no closing '''),
causing a SyntaxError; fix it by adding the missing closing triple-quote to
terminate the raw triple-quoted string (or remove the stray snippet entirely) so
new_streaming is a valid string literal and the module can be imported/executed.In
@new_streaming_block.txt:
- Around line 21-28: The snippet for stallTimeout was truncated and left
unclosed; restore the full implementation of stallTimeout so it returns a
Promise that sets a timer via setTimeout to reject with an
Error('Upstream stream stalled') augmented with { status: 504, type:
'stream_timeout' after MAX_STREAM_STALL_MS, and close all opened
braces/parentheses; ensure the timer variable is declared and the promise body
is properly terminated so stallTimeout and its inner setTimeout callback are
syntactically complete (refer to stallTimeout and MAX_STREAM_STALL_MS to locate
where to fix).In
@server/src/__tests__/routes/stream-heartbeat-stall.test.ts:
- Around line 74-119: The test asserts a keep-alive comment during the 300ms
gap, but the proxy implementation (streamStarted flag in
server/src/routes/proxy.ts) only emits ": keep-alive\n\n" once streamStarted ===
true; update the test in
server/src/tests/routes/stream-heartbeat-stall.test.ts to stop expecting a
keep-alive before the first chunk (remove or change the expect(raw).toContain(':
keep-alive') assertion) or adjust it to assert keep-alives only after the first
chunk/after streamStarted is true so the test matches the current behavior of
streamStarted-controlled heartbeats.- Around line 12-18: The test helper currently injects only the unified API key
via getUnifiedApiKey() when path.startsWith('/v1/'), which incorrectly allows
unauthenticated access to /api/* routes; update the helper (the fetch wrapper
used in these tests) to accept an explicit admin key or auth override and use
getAdminApiKey() when the request path startsWith('/api/'), while continuing to
use getUnifiedApiKey() for '/v1/'; alternatively add a parameter to the helper
to pass custom headers so tests can seed the DB or set Authorization explicitly
instead of implicitly baking unauthenticated /api/* access.In
@server/src/providers/openai-compat.ts:
- Around line 131-143: Only check for and throw wrapped SSE errors before any
chunk has been yielded: add a boolean flag (e.g., hasYielded) that is false
until the first successful yield of parsed, and only call
isWrappedError/throwWrappedError when hasYielded is false; if you detect a
wrapped error before the first yield, cancel the underlying reader (call
reader.cancel()) before calling throwWrappedError(parsed) so the upstream stream
is aborted and the proxy can retry/fallback cleanly. Update the code around
parsed, isWrappedError, throwWrappedError, and the reader to implement this
gating and cancellation.In
@server/src/routes/proxy.ts:
- Around line 1422-1425: The close listener is being added inside the retry loop
which causes listener leaks; register it once outside the retry loop (or use
req.once('close', ...) or remove any existing listener before adding) so
cleanupStream is not re-registered on each attempt—move the req.on('close', ...)
call (or switch to req.once or removeListener) so it executes a single time for
the request and still calls cleanupStream on client disconnect.- Around line 1279-1307: The exhaustion check currently compares
activeCooldownModels against all enabled models (using allEnabledIds) which is
incorrect because routing mode and existing skipModels further restrict the
actual routable candidates; change the check to compute the same candidate set
that routeRequest() uses (apply enabled filter, current routing mode filters,
existing skipModels and remove preferredModel) and then test whether
activeCooldownModels would exhaust that specific candidate set; use the same
variables/functions involved (activeCooldownModels, preferredModel, skipModels,
routeRequest or the helper that builds candidate IDs) so the logic mirrors
routing and avoids false negatives that lead to 429s.In
@server/src/services/router.ts:
- Around line 345-370: The classifyModel function uses normalized thresholds
which yield incorrect top-40% cutoffs for small catalogs; change it to compute
pool membership from ordinal positions and total count instead: compute
totalCount = maxSpeedRank - minSpeedRank + 1 (and similarly for intelligence),
compute ordinalPosition = speedRank - minSpeedRank + 1 (and for intelligence),
compute cutoff = Math.max(1, Math.ceil(totalCount * 0.4)), then classify as Fast
if ordinalPosition <= cutoff, Smart if intelligence ordinal <= cutoff, else
Balanced; preserve the early-return when there is no variation and handle edge
cases (totalCount <= 0) in classifyModel and reference the ModelPool enum.- Around line 407-408: Call refreshStatsCache(getDb(), true) unconditionally
before returning cached stats so the TTL-gated refresh runs even when statsCache
is already set; specifically, invoke refreshStatsCache(getDb(), true)
unconditionally (rather than only when !statsCache) and then keep the existing
guard if (!statsCache) return []; this ensures statsCache and the TTL logic are
honored and prevents serving stale scores while preserving the cheap TTL check
inside refreshStatsCache.In
@server/write_test.py:
- Around line 4-29: The file defines a JavaScript test payload in Python but
currently contains an unterminated string and a machine-specific absolute path;
fix the top-level variables 'path' and 'content' so the Python file parses:
replace the hard-coded '/home/vi/freellmapi/...' assigned to the variable path
with a repo-relative computed path (use file / pathlib to resolve the
project root and join the relative test path), and ensure the 'content' value is
a properly closed Python list or a single properly quoted multi-line string so
there are no unterminated string literals (verify brackets/quotes around the
array are balanced and the last line completes the JS snippet such as the
truncated "expect(() => route..." line). Ensure no other syntax errors remain so
the script runs.In
@server/write_tests.py:
- Around line 4-45: The generator file server/write_tests.py currently contains
an unterminated triple-quoted string in the part1 variable (the test text stops
mid-token like "const groqKey{"), causing a SyntaxError; close or properly
terminate part1 (complete the test contents or split into smaller, properly
quoted strings and concatenate) and ensure all embedded JS quotes/braces are
correctly escaped if needed; also replace the hard-coded absolute output path
stored in path with a repo-relative construction (use
pathlib.Path(file).parent / 'src' / 'tests' / 'services' /
'router.test.ts' or os.path.join with repo root resolution) so the generator
writes to a portable location.
Outside diff comments:
In@server/src/routes/proxy.ts:
- Around line 1479-1484: The code currently calls banPlatformFromSession(...)
and addProviderModelsToSkipModels(...) for LongCat branches, which bans the
entire platform; instead, remove those two calls and add the failing modelDbId
to the session-level skip list so only route.modelDbId is skipped. Locate the
LongCat branches (the block checking route.platform === 'longcat') and replace
banPlatformFromSession(normalizedMessages, routingMode, 'longcat',
route.modelDbId) and addProviderModelsToSkipModels(skipModels, 'longcat') with
logic that records route.modelDbId under the LongCat provider in skipModels
(e.g., push or set skipModels['longcat'] to include route.modelDbId or call an
existing addModel-level helper), ensuring you use the existing skipModels
variable and route.modelDbId so only that model is skipped for the session.
Nitpick comments:
In @.roo/specs/disable-sticky-on-auto/design.md:
- Around line 11-17: The fenced code block containing the example for function
stickyOp is missing a language identifier; update the block to include a
language (e.g., "typescript" or "js") so syntax highlighters handle
getSessionKey, stickyOp, and stickySessionMap correctly—locate the block showing
function stickyOp(messages, routingMode, ...) and add the language specifier
immediately after the opening backticks (for example ```typescript).In @.roo/specs/generalized-thread-protection/design.md:
- Around line 43-45: Update the fenced code block that shows the environment
variable THREAD_PROTECTION_PLATFORMS so it includes a language identifier (e.g.,
usebash orshell) for proper markdown highlighting; locate the block
containing THREAD_PROTECTION_PLATFORMS="longcat:provider-ban,groq:model-skip"
and change the opening fence to include the language token.In
@client/src/components/pool-badge.tsx:
- Line 7: The pool badge color classes mix different Tailwind color families;
update the className entries in the PoolBadge/pool configuration for Fast,
Balanced, and Smart so the text color uses the same family as the background
(e.g., bg-emerald-100 -> text-emerald-800 for Fast, bg-slate-100 ->
text-slate-700 for Balanced, bg-purple-100 -> text-purple-700 for Smart). Locate
the className fields in client/src/components/pool-badge.tsx (the PoolBadge or
pool config entries referenced on the three lines) and replace the mismatched
text color tokens to match each background family for consistency.In
@fix.py:
- Around line 4-8: This helper script is incomplete and should be removed or
finished; either delete fix.py entirely, or implement its intended behavior:
open('server/src/routes/proxy.ts', 'r') should read into lines, locate the
streaming block start by searching for the string "for await (const chunk" and
then find the matching block end (matching braces) and then perform the intended
action (e.g., print the start/end indices, extract the block, or modify and
write back) before closing—use the existing lines variable and proper file write
handling to persist changes if any.</details> <details> <summary>🪄 Autofix (Beta)</summary> Fix all unresolved CodeRabbit comments on this PR: - [ ] <!-- {"checkboxId": "4b0d0e0a-96d7-4f10-b296-3a18ea78f0b9"} --> Push a commit to this branch (recommended) - [ ] <!-- {"checkboxId": "ff5b1114-7d8c-49e6-8ac1-43f82af23a33"} --> Create a new PR with the fixes </details> --- <details> <summary>ℹ️ Review info</summary> <details> <summary>⚙️ Run configuration</summary> **Configuration used**: Organization UI **Review profile**: CHILL **Plan**: Pro Plus **Run ID**: `ebc2d52b-36c4-4d90-9c6a-9b952a33334b` </details> <details> <summary>📥 Commits</summary> Reviewing files that changed from the base of the PR and between 233e03142895cd8c18d441461d5c9568186c7123 and 40abf0b8c67d25c3d5526a9345e88d01db0cca9e. </details> <details> <summary>📒 Files selected for processing (51)</summary> * `.roo/specs/disable-sticky-on-auto/design.md` * `.roo/specs/disable-sticky-on-auto/requirements.md` * `.roo/specs/disable-sticky-on-auto/tasks.md` * `.roo/specs/generalized-thread-protection/design.md` * `.roo/specs/generalized-thread-protection/requirements.md` * `.roo/specs/generalized-thread-protection/tasks.md` * `.roo/specs/model-pools-display/design.md` * `.roo/specs/model-pools-display/requirements.md` * `.roo/specs/model-pools-display/tasks.md` * `.roo/specs/owl-alpha-longcat-model-routing/design.md` * `.roo/specs/owl-alpha-longcat-model-routing/requirements.md` * `.roo/specs/owl-alpha-longcat-model-routing/tasks.md` * `.roo/specs/recency-biased-thompson-sampling/design.md` * `.roo/specs/recency-biased-thompson-sampling/requirements.md` * `.roo/specs/recency-biased-thompson-sampling/tasks.md` * `.roo/specs/sse-stream-heartbeat-stall-protection/design.md` * `.roo/specs/sse-stream-heartbeat-stall-protection/requirements.md` * `.roo/specs/sse-stream-heartbeat-stall-protection/tasks.md` * `.roo/specs/transient-model-cooldown/design.md` * `.roo/specs/transient-model-cooldown/requirements.md` * `.roo/specs/transient-model-cooldown/tasks.md` * `.roo/specs/wrapped-error-interception/design.md` * `.roo/specs/wrapped-error-interception/requirements.md` * `.roo/specs/wrapped-error-interception/tasks.md` * `CLAUDE.md` * `README.md` * `client/src/components/pool-badge.tsx` * `client/src/components/pool-section.tsx` * `do_fix.py` * `fix.py` * `fix_streaming.py` * `fix{` * `new_streaming_block.txt` * `server/src/__tests__/integration/full-flow.test.ts` * `server/src/__tests__/routes/provider-session-ban.test.ts` * `server/src/__tests__/routes/proxy-tools.test.ts` * `server/src/__tests__/routes/stream-heartbeat-stall.test.ts` * `server/src/__tests__/routes/transient-cooldown.test.ts` * `server/src/__tests__/services/router.test.ts` * `server/src/providers/base.ts` * `server/src/providers/cloudflare.ts` * `server/src/providers/cohere.ts` * `server/src/providers/google.ts` * `server/src/providers/openai-compat.ts` * `server/src/routes/fallback.ts` * `server/src/routes/proxy.ts` * `server/src/services/router.ts` * `server/src/services/threadProtection.ts` * `server/write_test.py` * `server/write_tests.py` * `shared/types.ts` </details> </details> <!-- This is an auto-generated comment by CodeRabbit for review status -->
|
|
||
| ## Problem Statement | ||
|
|
||
| The proxy route handler (`server/src/routes/proxy.ts`) contains 6+ hardcoded branches that special-case the `longcat`{ No newline at end of file |
There was a problem hiding this comment.
Finish the truncated problem statement.
This sentence ends mid-thought and includes a stray {, so the requirement is not readable enough to implement from.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In @.roo/specs/generalized-thread-protection/requirements.md at line 5, The
requirement line is truncated and contains a stray '{' so rewrite it into a
single complete sentence that clearly states the problem and expected outcome:
describe that the proxy route handler (the exported router/handler in proxy.ts)
currently contains 6+ hardcoded branches that special-case the "longcat"
route/host/prefix and specify what should be done instead (e.g., remove
hardcoded branches and replace with a configurable/generalized routing or policy
mechanism), ensuring no stray punctuation and including the scope (which
branches/behavior to change) and the acceptance criteria for the fix.
| ``` | ||
| ┌─────────────────────────────────────────────────────────┐ | ||
| │ Bandit routing │ | ||
| ├─────────────────────────────────────────────────────────┤ | ||
| │ ⚡ Fast (3 models) │ | ||
| │ ┌─────────────────────────────────────────────────────┐ │ | ||
| │ │ Model 1 ... [ON] │ │ | ||
| │ │ Model 2 ... [ON] │ │ | ||
| │ │ Model 3 ... [ON] │ │ | ||
| │ └─────────────────────────────────────────────────────┘ │ | ||
| │ │ | ||
| │ ⚖️ Balanced (5 models) │ | ||
| │ ┌─────────────────────────────────────────────────────┐ │ | ||
| │ │ Model 4 ... [ON] │ │ | ||
| │ │ ... │ │ | ||
| │ └─────────────────────────────────────────────────────┘ │ | ||
| │ │ | ||
| │ 🧠 Smart (2 models) │ | ||
| │ ┌─────────────────────────────────────────────────────┐ │ | ||
| │ │ Model 9 ... [ON] │ │ | ||
| │ │ Model 10 ... [ON] │ │ | ||
| │ └─────────────────────────────────────────────────────┘ │ | ||
| └─────────────────────────────────────────────────────────┘ | ||
| ``` |
There was a problem hiding this comment.
Add language identifiers to fenced code blocks for ASCII diagrams.
The ASCII layout diagrams should have a language identifier (e.g., text or plain) to satisfy markdown linting requirements and improve rendering consistency across different viewers.
📝 Proposed fix
-```
+```text
┌─────────────────────────────────────────────────────────┐
│ Bandit routing │Apply the same change to the other two ASCII diagrams at lines 142 and 168.
Also applies to: 142-158, 168-177
🧰 Tools
🪛 markdownlint-cli2 (0.22.1)
[warning] 109-109: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In @.roo/specs/model-pools-display/design.md around lines 109 - 132, The fenced
ASCII diagrams (the triple-backtick blocks that start before the "Bandit
routing" diagram and the similar blocks for "Balanced (5 models)" and "Smart (2
models)") must include a markdown language identifier; update each opening fence
from ``` to ```text (or ```plain) so linters/renderers recognize them as plain
text and the diagrams render consistently.
| - Pool classification logic should match backend routing logic exactly | ||
| - No changes to the pool classification algorithm (only display changes) | ||
|
|
||
| ## Out of No newline at end of file |
There was a problem hiding this comment.
File appears truncated at "Out of" heading.
The requirements document ends abruptly with an incomplete "## Out of" section heading. This likely should be "## Out of Scope" with content defining what is explicitly excluded from this feature.
✅ Suggested completion
-## Out of
+## Out of Scope
+
+- Changes to the pool classification algorithm itself
+- Backend routing logic modifications
+- Analytics calculation changes
+- Model catalog structure changes📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| ## Out of | |
| ## Out of Scope | |
| - Changes to the pool classification algorithm itself | |
| - Backend routing logic modifications | |
| - Analytics calculation changes | |
| - Model catalog structure changes |
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In @.roo/specs/model-pools-display/requirements.md at line 90, The document ends
with a truncated heading "## Out of" — change that heading to "## Out of Scope"
and populate it with explicit exclusions for this feature (e.g., what model-pool
UI interactions, permissions, API endpoints, or integrations are not covered),
ensuring the section clarifies boundaries for implementation and QA; update any
table of contents or cross-references if present to reflect the completed
heading.
| ``` | ||
| ┌─────────────────────────────────────────────────────────────┐ | ||
| │ Client Request │ | ||
| │ model: "freellmapi/auto" | "freellmapi/auto-smart" │ | ||
| └─────────────┬───────────────────────────────┬───────────────┘ | ||
| │ │ | ||
| ▼ ▼ | ||
| ┌─────────────────────────┐ ┌─────────────────────────┐ | ||
| │ Balanced Router │ │ Smart Router │ | ||
| │ (auto) │ │ (auto-smart) │ | ||
| │ │ │ │ | ||
| │ - Excludes longcat/* │ │ - Prefers longcat/* │ | ||
| │ - Excludes OR/owl-alpha│ │ and OR/owl-alpha │ | ||
| │ - Normal bandit for │ │ when valid keys exist│ | ||
| │ everything else │ │ - Applies sticky │ | ||
| │ │ │ cooldown for both │ | ||
| │ │ │ - Model-level banning │ | ||
| │ │ │ on errors │ | ||
| └─────────────┬───────────┘ └───────────┬─────────────┘ | ||
| │ │ | ||
| ▼ ▼ | ||
| ┌─────────────────────────────────────────────────────────────┐ | ||
| │ routeRequest() in router.ts │ | ||
| │ │ | ||
| │ 1. Build chain from fallback_config + models │ | ||
| │ 2. Score via Thompson sampling │ | ||
| │ 3. Apply balanced exclusions (REQ-1) │ | ||
| │ 4. Apply smart preferences (REQ-2) │ | ||
| │ 5. Apply sticky session pin │ | ||
| │ 6. Iterate chain, find first model with valid key │ | ||
| └─────────────────────────────────────────────────────────────┘ | ||
| │ | ||
| ▼ | ||
| ┌─────────────────────────────────────────────────────────────┐ | ||
| │ handleChatCompletion() in proxy.ts │ | ||
| │ │ | ||
| │ - Sticky cooldown check for longcat + owl-alpha (REQ-3) │ | ||
| │ - Model-level skipModels on 5xx/retryable (REQ-4, REQ-5) │ | ||
| │ - Model-level skipModels on truncation (REQ-4, REQ-5) │ | ||
| │ - Model-level skipModels on mid-stream errors (REQ-4, REQ-5)│ | ||
| └─────────────────────────────────────────────────────────────┘ | ||
| ``` |
There was a problem hiding this comment.
Add fence languages to these new code blocks.
These fences will trip markdownlint (MD040). Using text for the ASCII diagrams/flows is enough.
Also applies to: 52-78, 82-92, 96-105
🧰 Tools
🪛 markdownlint-cli2 (0.22.1)
[warning] 5-5: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In @.roo/specs/owl-alpha-longcat-model-routing/design.md around lines 5 - 46,
The markdown diagrams use triple-fenced blocks without a language which triggers
MD040; update each ASCII diagram/code block (the ones showing Balanced Router /
Smart Router, the routeRequest() flow, and the handleChatCompletion() flow —
e.g., the block containing "Balanced Router", "Smart Router", "routeRequest() in
router.ts", and "handleChatCompletion() in proxy.ts") to include a fence
language of text (use ```text) so markdownlint no longer flags them; apply the
same change to the other listed blocks (the ones noted around lines 52-78,
82-92, 96-105).
| - Existing LongCat smart auto preference logic in [`server/src/services/router.ts`](../server/src/services/router.ts) | ||
| - Existing LongCat sticky cooldown logic in [`server/src/routes/proxy.ts`](../server/src/routes/proxy.ts) | ||
| - Existing provider-level ban logic in [`server/src/routes/proxy.ts`](../server/src/routes/proxy.ts) | ||
| - Owl Alpha model seeded in [`server/src/db/index.ts`](../server/src/db/index.ts) via `migrateModelsV15` | ||
| - LongCat model seeded in [`server/src/db/index.ts`](../server/src/db/index.ts) via `migrateModelsV16` |
There was a problem hiding this comment.
Fix the dependency links; they currently point to the wrong directory depth.
From .roo/specs/owl-alpha-longcat-model-routing/, ../server/... resolves to .roo/specs/server/..., not the repo root. These should go up to the root first (for example, ../../../server/...).
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In @.roo/specs/owl-alpha-longcat-model-routing/requirements.md around lines 122
- 126, The Markdown links in the requirements list are using ../server/... which
resolves relative to .roo/specs/owl-alpha-longcat-model-routing/ instead of the
repo root; update each link target to go up three levels to reference the true
server files (e.g., change ../server/src/services/router.ts to
../../../server/src/services/router.ts and similarly update
../server/src/routes/proxy.ts and ../server/src/db/index.ts references used for
migrateModelsV15/migrateModelsV16 and the LongCat smart/sticky/ban logic).
| // Attach client-disconnect listener | ||
| req.on('close', () => { | ||
| cleanupStream(); | ||
| }); |
There was a problem hiding this comment.
Don't register a new close listener on every retry attempt.
This code sits inside the retry loop, so repeated pre-stream failures will stack listeners on the same req object. Once a request crosses ~10 attempts, Node will start warning about listener leaks, and every disconnect will fan out through all prior handlers.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@server/src/routes/proxy.ts` around lines 1422 - 1425, The close listener is
being added inside the retry loop which causes listener leaks; register it once
outside the retry loop (or use req.once('close', ...) or remove any existing
listener before adding) so cleanupStream is not re-registered on each
attempt—move the req.on('close', ...) call (or switch to req.once or
removeListener) so it executes a single time for the request and still calls
cleanupStream on client disconnect.
| export function classifyModel( | ||
| speedRank: number, | ||
| intelligenceRank: number, | ||
| minSpeedRank: number, | ||
| maxSpeedRank: number, | ||
| minIntelligenceRank: number, | ||
| maxIntelligenceRank: number, | ||
| ): ModelPool { | ||
| const speedRange = maxSpeedRank - minSpeedRank; | ||
| const intelRange = maxIntelligenceRank - minIntelligenceRank; | ||
|
|
||
| // If no variation at all, default to Balanced | ||
| if (speedRange <= 0 && intelRange <= 0) return ModelPool.Balanced; | ||
|
|
||
| const normalizedSpeed = speedRange > 0 | ||
| ? (speedRank - minSpeedRank) / speedRange | ||
| : 0.5; | ||
| const normalizedIntel = intelRange > 0 | ||
| ? (intelligenceRank - minIntelligenceRank) / intelRange | ||
| : 0.5; | ||
|
|
||
| // Fast pool: top 40% fastest models (lowest normalized speed_rank) | ||
| if (normalizedSpeed <= 0.4) return ModelPool.Fast; | ||
| // Smart pool: top 40% smartest models (lowest normalized intelligence_rank) | ||
| if (normalizedIntel <= 0.4) return ModelPool.Smart; | ||
| return ModelPool.Balanced; |
There was a problem hiding this comment.
classifyModel() does not implement a true top-40% cutoff.
This normalized-rank check over-selects for common catalog sizes. With 6 models, normalized <= 0.4 marks ranks 1-3 as Fast/Smart, which is 50%, not 40%. That changes both pool exposure and fast-mode routing decisions. Please compute pool membership from ordinal position/count (or precomputed cutoff ranks) instead of a fixed normalized threshold.
As per coding guidelines, **/*router*.{ts,tsx,js,jsx}: Models are classified into Fast (top 40% speed), Smart (top 40% intelligence), or Balanced pools via the classifyModel() function in router.ts. Fast mode routes to Fast pool first, then borrows from Balanced pool
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@server/src/services/router.ts` around lines 345 - 370, The classifyModel
function uses normalized thresholds which yield incorrect top-40% cutoffs for
small catalogs; change it to compute pool membership from ordinal positions and
total count instead: compute totalCount = maxSpeedRank - minSpeedRank + 1 (and
similarly for intelligence), compute ordinalPosition = speedRank - minSpeedRank
+ 1 (and for intelligence), compute cutoff = Math.max(1, Math.ceil(totalCount *
0.4)), then classify as Fast if ordinalPosition <= cutoff, Smart if intelligence
ordinal <= cutoff, else Balanced; preserve the early-return when there is no
variation and handle edge cases (totalCount <= 0) in classifyModel and reference
the ModelPool enum.
| if (!statsCache) { refreshStatsCache(getDb(), true); } | ||
| if (!statsCache) return []; |
There was a problem hiding this comment.
Call the TTL-gated refresh unconditionally here.
Once statsCache exists, this path never refreshes it again, so the dashboard can keep serving stale scores until some other route happens to touch refreshStatsCache(). The new TTL gate already makes the unconditional call cheap.
Suggested fix
-export function getAnalyticsScores(): Array<{
+export function getAnalyticsScores(): Array<{
platform: string;
modelName: string;
modelId: string;
@@
- if (!statsCache) { refreshStatsCache(getDb(), true); }
+ refreshStatsCache(getDb());
if (!statsCache) return [];🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@server/src/services/router.ts` around lines 407 - 408, Call
refreshStatsCache(getDb(), true) unconditionally before returning cached stats
so the TTL-gated refresh runs even when statsCache is already set; specifically,
invoke refreshStatsCache(getDb(), true) unconditionally (rather than only when
!statsCache) and then keep the existing guard if (!statsCache) return []; this
ensures statsCache and the TTL logic are honored and prevents serving stale
scores while preserving the cheap TTL check inside refreshStatsCache.
| path = '/home/vi/freellmapi/server/src/__tests__/services/router.test.ts' | ||
|
|
||
| content = [ | ||
| "import { describe, it, expect, beforeAll, beforeEach } from 'vitest';", | ||
| "import { initDb, getDb } from '../../db/index.js';", | ||
| "import { encrypt } from '../../lib/crypto.js';", | ||
| "import { routeRequest, refreshStatsCache, getAnalyticsScores } from '../../services/router.js';", | ||
| "", | ||
| "describe('Router', () => {", | ||
| " beforeAll(() => {", | ||
| " process.env.ENCRYPTION_KEY = '0'.repeat(64);", | ||
| " initDb(':memory:');", | ||
| " });", | ||
| "", | ||
| " beforeEach(() => {", | ||
| " const db = getDb();", | ||
| " db.prepare('DELETE FROM api_keys').run();", | ||
| " const models = db.prepare('SELECT id, intelligence_rank FROM models ORDER BY intelligence_rank ASC').all() as any[];", | ||
| " const update = db.prepare('UPDATE fallback_config SET priority = ? WHERE model_db_id = ?');", | ||
| " for (let i = 0; i < models.length; i++) {", | ||
| " update.run(i + 1, models[i].id);", | ||
| " }", | ||
| " });", | ||
| "", | ||
| " it('should throw when no keys are configured', () => {", | ||
| " expect(() => route{ No newline at end of file |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
python - <<'PY'
from pathlib import Path
import ast
path = Path("server/write_test.py")
src = path.read_text()
try:
ast.parse(src, filename=str(path))
print(f"{path}: syntax OK")
except SyntaxError as e:
print(f"{path}:{e.lineno}:{e.offset}: {e.msg}")
raise
PYRepository: vi70x3/freellmapi
Length of output: 521
🏁 Script executed:
#!/bin/bash
python3 - <<'PY'
from pathlib import Path
import ast
path = Path("server/write_test.py")
src = path.read_text(encoding="utf-8")
try:
ast.parse(src, filename=str(path))
print(f"{path}: syntax OK")
except SyntaxError as e:
print(f"{path}:{e.lineno}:{e.offset}: {e.msg}")
raise
PYRepository: vi70x3/freellmapi
Length of output: 521
🏁 Script executed:
#!/bin/bash
python3 - <<'PY'
from pathlib import Path
import ast
path = Path("server/write_test.py")
src = path.read_text(encoding="utf-8")
try:
ast.parse(src, filename=str(path))
print(f"{path}: syntax OK")
except SyntaxError as e:
print(f"{path}:{e.lineno}:{e.offset}: {e.msg}")
raise
PYRepository: vi70x3/freellmapi
Length of output: 521
🏁 Script executed:
#!/bin/bash
python3 - <<'PY'
from pathlib import Path
import ast
path = Path("server/write_test.py")
src = path.read_text(encoding="utf-8")
try:
ast.parse(src, filename=str(path))
print(f"{path}: syntax OK")
except SyntaxError as e:
print(f"{path}:{e.lineno}:{e.offset}: {e.msg}")
raise
PYRepository: vi70x3/freellmapi
Length of output: 521
🏁 Script executed:
#!/bin/bash
python3 - <<'PY'
from pathlib import Path
import ast
path = Path("server/write_test.py")
src = path.read_text(encoding="utf-8")
ast.parse(src, filename=str(path))
print("syntax OK")
PYRepository: vi70x3/freellmapi
Length of output: 426
🏁 Script executed:
#!/bin/bash
python3 - <<'PY'
from pathlib import Path
import ast
path = Path("server/write_test.py")
src = path.read_text(encoding="utf-8")
try:
ast.parse(src, filename=str(path))
print(f"{path}: syntax OK")
except SyntaxError as e:
print(f"{path}:{e.lineno}:{e.offset}: {e.msg}")
raise
PYRepository: vi70x3/freellmapi
Length of output: 521
Critical: Fix server/write_test.py so it’s valid Python and remove the hard-coded absolute path.
- Fails to parse with
SyntaxError: unterminated string literalatserver/write_test.py:29, so the script can’t run as committed. - Replace the machine-specific absolute path (
/home/vi/freellmapi/...) with a repo-relative path (e.g., computed from__file__/ project root).
🧰 Tools
🪛 Ruff (0.15.15)
[warning] 29-29: missing closing quote in string literal
(invalid-syntax)
[warning] 29-29: unexpected EOF while parsing
(invalid-syntax)
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@server/write_test.py` around lines 4 - 29, The file defines a JavaScript test
payload in Python but currently contains an unterminated string and a
machine-specific absolute path; fix the top-level variables 'path' and 'content'
so the Python file parses: replace the hard-coded '/home/vi/freellmapi/...'
assigned to the variable path with a repo-relative computed path (use __file__ /
pathlib to resolve the project root and join the relative test path), and ensure
the 'content' value is a properly closed Python list or a single properly quoted
multi-line string so there are no unterminated string literals (verify
brackets/quotes around the array are balanced and the last line completes the JS
snippet such as the truncated "expect(() => route..." line). Ensure no other
syntax errors remain so the script runs.
| path = '/home/vi/freellmapi/server/src/__tests__/services/router.test.ts' | ||
|
|
||
| # Part 1: existing tests | ||
| part1 = """import { describe, it, expect, beforeAll, beforeEach } from 'vitest'; | ||
| import { initDb, getDb } from '../../db/index.js'; | ||
| import { encrypt } from '../../lib/crypto.js'; | ||
| import { routeRequest, refreshStatsCache, getAnalyticsScores } from '../../services/router.js'; | ||
|
|
||
| describe('Router', () => { | ||
| beforeAll(() => { | ||
| process.env.ENCRYPTION_KEY = '0'.repeat(64); | ||
| initDb(':memory:'); | ||
| }); | ||
|
|
||
| beforeEach(() => { | ||
| const db = getDb(); | ||
| db.prepare('DELETE FROM api_keys').run(); | ||
| const models = db.prepare('SELECT id, intelligence_rank FROM models ORDER BY intelligence_rank ASC').all() as any[]; | ||
| const update = db.prepare('UPDATE fallback_config SET priority = ? WHERE model_db_id = ?'); | ||
| for (let i = 0; i < models.length; i++) { | ||
| update.run(i + 1, models[i].id); | ||
| } | ||
| }); | ||
|
|
||
| it('should throw when no keys are configured', () => { | ||
| expect(() => routeRequest()).toThrow(/exhausted/i); | ||
| }); | ||
|
|
||
| it('should route to highest priority model with available key', () => { | ||
| const db = getDb(); | ||
| const { encrypted, iv, authTag } = encrypt('test-groq-key'); | ||
| db.prepare('INSERT INTO api_keys (platform, label, encrypted_key, iv, auth_tag, status, enabled) VALUES (?, ?, ?, ?, ?, ?, ?)').run('groq', 'test', encrypted, iv, authTag, 'healthy', 1); | ||
| const result = routeRequest(); | ||
| expect(result.platform).toBe('groq'); | ||
| expect(result.apiKey).toBe('test-groq-key'); | ||
| }); | ||
|
|
||
| it('should route to an available model when keys exist for multiple platforms', () => { | ||
| const db = getDb(); | ||
| const googleKey = encrypt('test-google-key'); | ||
| db.prepare('INSERT INTO api_keys (platform, label, encrypted_key, iv, auth_tag, status, enabled) VALUES (?, ?, ?, ?, ?, ?, ?)').run('google', 'test', googleKey.encrypted, googleKey.iv, googleKey.authTag, 'healthy', 1); | ||
| const groqKey{ No newline at end of file |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
python - <<'PY'
from pathlib import Path
import ast
path = Path("server/write_tests.py")
src = path.read_text()
try:
ast.parse(src, filename=str(path))
print(f"{path}: syntax OK")
except SyntaxError as e:
print(f"{path}:{e.lineno}:{e.offset}: {e.msg}")
raise
PYRepository: vi70x3/freellmapi
Length of output: 613
🏁 Script executed:
#!/bin/bash
python - <<'PY'
from pathlib import Path
import ast
path = Path("server/write_tests.py")
src = path.read_text()
try:
ast.parse(src, filename=str(path))
print(f"{path}: syntax OK")
except SyntaxError as e:
print(f"{path}:{e.lineno}:{e.offset}: {e.msg}")
raise
PYRepository: vi70x3/freellmapi
Length of output: 613
🏁 Script executed:
#!/bin/bash
python - <<'PY'
from pathlib import Path
import ast
path = Path("server/write_tests.py")
src = path.read_text()
try:
ast.parse(src, filename=str(path))
print(f"{path}: syntax OK")
except SyntaxError as e:
print(f"{path}:{e.lineno}:{e.offset}: {e.msg}")
raise
PYRepository: vi70x3/freellmapi
Length of output: 613
🏁 Script executed:
#!/bin/bash
python - <<'PY'
from pathlib import Path
import ast
path = Path("server/write_tests.py")
src = path.read_text()
try:
ast.parse(src, filename=str(path))
print(f"{path}: syntax OK")
except SyntaxError as e:
print(f"{path}:{e.lineno}:{e.offset}: {e.msg}")
raise
PYRepository: vi70x3/freellmapi
Length of output: 613
🏁 Script executed:
#!/bin/bash
python - <<'PY'
from pathlib import Path
import ast
path = Path("server/write_tests.py")
src = path.read_text()
try:
ast.parse(src, filename=str(path))
print(f"{path}: syntax OK")
except SyntaxError as e:
print(f"{path}:{e.lineno}:{e.offset}: {e.msg}")
raise
PYRepository: vi70x3/freellmapi
Length of output: 613
🏁 Script executed:
#!/bin/bash
python - <<'PY'
from pathlib import Path
import ast
path = Path("server/write_tests.py")
src = path.read_text()
ast.parse(src, filename=str(path))
print(f"{path}: syntax OK")
PYRepository: vi70x3/freellmapi
Length of output: 504
🏁 Script executed:
#!/bin/bash
python - <<'PY'
from pathlib import Path
import ast
path = Path("server/write_tests.py")
src = path.read_text()
ast.parse(src, filename=str(path))
print(f"{path}: syntax OK")
PYRepository: vi70x3/freellmapi
Length of output: 504
🏁 Script executed:
#!/bin/bash
python - <<'PY'
from pathlib import Path
import ast
path = Path("server/write_tests.py")
src = path.read_text()
try:
ast.parse(src, filename=str(path))
print(f"{path}: syntax OK")
except SyntaxError as e:
print(f"{path}:{e.lineno}:{e.offset}: {e.msg}")
raise
PYRepository: vi70x3/freellmapi
Length of output: 613
Critical: Fix server/write_tests.py generator (invalid Python + non-portable absolute path).
server/write_tests.pydoesn’t parse:SyntaxError: unterminated triple-quoted string literalinpart1(part1 = """...), line 7 (detected at line 45).- Generator output path is hard-coded to
/home/vi/freellmapi/server/src/__tests__/services/router.test.ts; switch to a repo-relative path or remove the generator.
🧰 Tools
🪛 Ruff (0.15.15)
[warning] 7-45: missing closing quote in string literal
(invalid-syntax)
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@server/write_tests.py` around lines 4 - 45, The generator file
server/write_tests.py currently contains an unterminated triple-quoted string in
the part1 variable (the test text stops mid-token like "const groqKey{"),
causing a SyntaxError; close or properly terminate part1 (complete the test
contents or split into smaller, properly quoted strings and concatenate) and
ensure all embedded JS quotes/braces are correctly escaped if needed; also
replace the hard-coded absolute output path stored in path with a repo-relative
construction (use pathlib.Path(__file__).parent / 'src' / '__tests__' /
'services' / 'router.test.ts' or os.path.join with repo root resolution) so the
generator writes to a portable location.
There was a problem hiding this comment.
🧹 Nitpick comments (1)
client/src/pages/PlaygroundPage.tsx (1)
274-285: 💤 Low valueConsider adding ARIA attributes for accessibility.
The custom dropdown works but lacks semantic hints for screen readers. Adding
aria-expandedto the trigger androle="menu"to the dropdown would improve accessibility without significant effort.♿ Suggested accessibility improvements
<Button variant="outline" className="w-[260px] justify-between font-normal" onClick={() => setModelPickerOpen(v => !v)} + aria-expanded={modelPickerOpen} + aria-haspopup="menu" >-<div className="absolute right-0 top-full mt-2 z-50 max-h-[70vh] overflow-y-auto w-[360px] space-y-3 rounded-lg border bg-background p-3 shadow-lg"> +<div role="menu" className="absolute right-0 top-full mt-2 z-50 max-h-[70vh] overflow-y-auto w-[360px] space-y-3 rounded-lg border bg-background p-3 shadow-lg">🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@client/src/pages/PlaygroundPage.tsx` around lines 274 - 285, The dropdown trigger Button (onClick={() => setModelPickerOpen(v => !v)}) and the conditional dropdown div (rendered when modelPickerOpen is true) lack ARIA semantics; update the Button to include aria-expanded={modelPickerOpen} and aria-controls referencing the dropdown's id, and add role="menu" (and an id) to the dropdown container div so screen readers can detect its expanded state and treat the container as a menu; ensure any interactive items inside the dropdown use appropriate roles (e.g., role="menuitem") if present.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Nitpick comments:
In `@client/src/pages/PlaygroundPage.tsx`:
- Around line 274-285: The dropdown trigger Button (onClick={() =>
setModelPickerOpen(v => !v)}) and the conditional dropdown div (rendered when
modelPickerOpen is true) lack ARIA semantics; update the Button to include
aria-expanded={modelPickerOpen} and aria-controls referencing the dropdown's id,
and add role="menu" (and an id) to the dropdown container div so screen readers
can detect its expanded state and treat the container as a menu; ensure any
interactive items inside the dropdown use appropriate roles (e.g.,
role="menuitem") if present.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro Plus
Run ID: 56c2d9ac-e7b6-4dbc-90ed-56abb2c18f97
📒 Files selected for processing (2)
client/src/pages/PlaygroundPage.tsxserver/src/routes/proxy.ts
Implements the
recency-biased-thompson-samplingspec.Phase 2 changes to
server/src/services/router.ts:refreshStatsCache()to skip redundant refreshes within the TTL windowgetAnalyticsScores()so cache refresh happens before the early-return guardmodelIdindentation in return type annotation andresult.push()object literalRoutingModetype by importing/re-exporting from@freellmapi/shared/types.jsSummary by CodeRabbit
New Features
Improvements