Feat/realtime sticky by vi70x3 · Pull Request #15 · animaios/api-llm-localhost

vi70x3 · 2026-06-04T21:57:57Z

Summary by Sourcery

Refine routing, streaming, and error-handling behavior across the proxy, router, and providers to improve reliability, isolation between sessions, and observability for specific models and pools.

New Features:

Introduce active-request tracking and safeguards so LongCat and Owl Alpha are excluded from routing when another session is currently using them.
Add transient model cooldowns shared across requests to temporarily skip recently failing models.
Add SSE heartbeat and stall protection for streaming responses to detect stalled upstream providers and signal timeouts to clients.
Expose recency-biased analytics and model pool metadata to the fallback admin UI, grouping models into fast, balanced, and smart pools.

Bug Fixes:

Ensure sticky sessions and provider bans are disabled for balanced routing mode so auto routing remains free of sticky state.
Handle wrapped error payloads returned with HTTP 200 from multiple providers to avoid misinterpreting error bodies as successful completions.
Tighten truncation detection patterns so cut-off responses are classified correctly across providers.

Enhancements:

Bias Thompson-sampling analytics toward recent traffic using time-decayed success statistics while preserving raw counts for display.
Refine LongCat and Owl Alpha routing to use model-level bans instead of provider-wide bans and adjust their participation in balanced vs smart routing.
Generalize thread protection and error-handling paths, including mid-stream failures, to consistently clear sticky preferences and skip only affected models.
Refactor router tests and utilities to better cover key selection, routing behavior, and new analytics fields.

Documentation:

Add internal design documents describing wrapped-error interception, SSE stall protection, recency-biased Thompson sampling, transient model cooldowns, generalized thread protection, and Owl Alpha/LongCat routing behavior.

Tests:

Add extensive unit and integration tests for provider session bans, transient cooldowns, SSE heartbeat and stall handling, wrapped-error interception, and recency-weighted analytics.
Extend fallback and router test suites to validate pool enums, routing invariants, and that balanced mode bypasses sticky session logic.

Summary by CodeRabbit

New Features
- Streaming heartbeat and stall protection to keep connections alive and detect upstream timeouts
- Transient model cooldowns to mitigate concurrent provider failures
- Recency-biased model scoring favoring recent request performance
- Preferential routing for Owl Alpha and LongCat models in smart mode
- Balanced routing mode excludes premium models for fair distribution
- Fallback page now groups models by speed tier
- Improved error handling for wrapped error responses from providers
Bug Fixes
- Fixed sticky session routing for balanced mode
Tests
- Added comprehensive test coverage for new streaming, cooldown, and routing features

…alized thread protection scanner

… longcat branches

…tracking for LongCat and Owl Alpha

- Change activeRequests from Map to Set to allow concurrent requests from same session
- Add stale active request cleanup with 10-minute TTL
- Cache owl-alpha model ID to avoid repeated DB lookups
- Fix active request iteration to use Set-compatible syntax

sourcery-ai · 2026-06-04T21:58:03Z

Reviewer's Guide

Implements generalized realtime sticky / thread protection: disables sticky sessions in balanced mode, adds shared transient model cooldowns and active-request-based protection for LongCat and Owl Alpha, extends router scoring with recency-biased stats and balanced-pool exclusions, introduces wrapped-error interception in providers, adds SSE heartbeat/stall protection for streams, and updates fallback UI to surface pools, along with comprehensive tests and design docs.

Sequence diagram for active-request-based LongCat/Owl Alpha protection

sequenceDiagram
  actor ClientA
  actor ClientB
  participant Proxy as handleChatCompletion
  participant Router as routeRequest
  participant Provider as provider.streamChatCompletion

  ClientA->>Proxy: POST /chat (sessionKey A, LongCat)
  Proxy->>Router: routeRequest(..., routingMode)
  Router-->>Proxy: route (platform=longcat)
  Proxy->>Proxy: activeRequests.add({sessionKey A, platform longcat, modelId})
  Proxy->>Provider: streamChatCompletion(...)

  ClientB->>Proxy: POST /chat (sessionKey B, auto-smart)
  Proxy->>Router: routeRequest(..., routingMode)
  Router-->>Proxy: candidate routes incl. LongCat
  Proxy->>Proxy: scan activeRequests
  alt otherSessionUsingLongCat
    Proxy->>Proxy: addProviderModelsToSkipModels(skipModels, longcat)
    Proxy->>Router: routeRequest retries with skipModels
  end

  Provider-->>Proxy: stream ends for ClientA
  Proxy->>Proxy: delete activeRequests entry for sessionKey A
  Proxy-->>ClientA: SSE complete
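
The sequence above could be sketched roughly as follows. This is a minimal illustration, not the code from `proxy.ts`: the `ActiveRequest` shape and the helpers `pruneStale` and `isUsedByOtherSession` are hypothetical names, and only the `activeRequests` Set and the 10-minute TTL come from the PR description.

```typescript
// Hypothetical entry shape; the PR only says entries carry session, platform, and model.
interface ActiveRequest {
  sessionKey: string;
  platform: string;
  modelId: string;
  startedAt: number; // epoch ms, used for stale-entry cleanup
}

const STALE_TTL_MS = 10 * 60 * 1000; // 10-minute TTL from the commit message

// A Set of entry objects lets one session hold several concurrent requests.
const activeRequests = new Set<ActiveRequest>();

// Drop entries that were never deregistered (for example after a crash mid-request).
function pruneStale(now: number): void {
  activeRequests.forEach((entry) => {
    if (now - entry.startedAt > STALE_TTL_MS) activeRequests.delete(entry);
  });
}

// True when a different session currently holds a request on the given platform.
function isUsedByOtherSession(platform: string, sessionKey: string, now: number): boolean {
  pruneStale(now);
  return Array.from(activeRequests).some(
    (entry) => entry.platform === platform && entry.sessionKey !== sessionKey,
  );
}
```

In this sketch the caller would add an entry before invoking the provider and delete it in a `finally` block, so the entry disappears whether the stream completes or fails.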

Flow diagram for updated routing with pools and recency bias

flowchart TD
  A[routeRequest] --> B[load enabled models chain]
  B --> C{routingMode == balanced?}
  C -- yes --> D[filter chain using EXCLUDED_FROM_BALANCED and EXCLUDED_MODELS_FROM_BALANCED]
  C -- no --> E[use full chain]
  D --> F[compute intelligenceRanks]
  E --> F

  F --> G[refreshStatsCache with recency-weighted total/successes]
  G --> H[build ModelStats with rawTotal, total, successes]
  H --> I[compute effectiveScore per entry]
  I --> J[sort entries by effectiveScore]

  J --> K{routingMode == smart?}
  K -- no --> L[iterate sorted routes normally]

  K -- yes --> M[LongCat preference: hasValidKeys for longcat]
  M -->|true| N[move longcat entries to front]
  M -->|false| O[leave order]
  N --> P[Owl Alpha preference: hasValidKeys for owl-alpha]
  O --> P
  P -->|true| Q[insert owl-alpha just after any longcat entries]
  P -->|false| R[keep order]

  Q --> L
  R --> L

  L --> S[return best route to proxy]
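
The smart-mode branch of the flowchart (nodes M through R) amounts to a stable reordering of the already sorted entries. The sketch below assumes a hypothetical `RouteEntry` shape and that Owl Alpha is identified by the model id `openrouter/owl-alpha`; the two boolean arguments stand in for the results of `hasValidKeys`.

```typescript
// Hypothetical entry shape used only for this illustration.
interface RouteEntry {
  platform: string;
  modelId: string;
}

// Stable partition: matching entries move to the front, relative order is preserved.
function moveToFront<T>(entries: T[], pick: (e: T) => boolean): T[] {
  return [...entries.filter(pick), ...entries.filter((e) => !pick(e))];
}

function applySmartPreference(
  sorted: RouteEntry[],
  longcatHasKeys: boolean,
  owlAlphaHasKeys: boolean,
): RouteEntry[] {
  const isLongcat = (e: RouteEntry) => e.platform === "longcat";
  const isOwlAlpha = (e: RouteEntry) => e.modelId === "openrouter/owl-alpha";
  let out = sorted;
  // Promote Owl Alpha first, then LongCat, so Owl Alpha ends up just after
  // any promoted LongCat entries.
  if (owlAlphaHasKeys) out = moveToFront(out, isOwlAlpha);
  if (longcatHasKeys) out = moveToFront(out, isLongcat);
  return out;
}
```

Applying the two promotions in reverse priority order keeps the function short and avoids index arithmetic for the "insert after LongCat" step.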

File-Level Changes

Change	Details	Files
Disable sticky sessions for balanced routing and tighten provider-session-ban semantics.	Change getSessionKey to return an empty string for balanced mode so all sticky operations become no-ops for freellmapi/auto. Update provider-session-ban tests to use smart routing where stickiness applies and add explicit tests that balanced mode does not create or read sticky state. Adjust truncation detection sample phrase to better match provider messages ("cut off").	`server/src/routes/proxy.ts` `server/src/__tests__/routes/provider-session-ban.test.ts`
Introduce active-request-based protection and transient cooldowns for LongCat and Owl Alpha models in the proxy.	Track currently active requests per session/platform/model via an activeRequests Set and periodically prune stale entries. Before routing, skip LongCat if another session is actively using any LongCat model, and skip Owl Alpha if another session is actively using openrouter/owl-alpha (with DB lookup and caching of the model id). Replace prior LongCat provider-wide bans with model-level skipModels updates for LongCat-2.0-Preview and Owl Alpha, and clear sticky preferences when their pinned model fails with ban-eligible or retryable errors. Export and test a transientModelCooldowns Map and TRANSIENT_COOLDOWN_MS; integrate it into skipModels injection and sticky override, and add tests covering lifecycle, error classification, and interaction with session bans.	`server/src/routes/proxy.ts` `server/src/__tests__/routes/proxy-tools.test.ts` `server/src/__tests__/routes/transient-cooldown.test.ts`
Harden streaming path with SSE keepalive / stall protection and add configuration for testing.	Introduce a streamKeepaliveConfig object (KEEPALIVE_INTERVAL_MS, MAX_STREAM_STALL_MS) and use it in the streaming branch of handleChatCompletion to detect stalls and send keepalive comments. On stall, emit a structured stream_timeout error frame (or response.failed event in Responses API mode), end the response, and avoid double-writes via a streamAborted flag and cleanup handler. Add tests that shrink the intervals, mock provider streaming behavior to exercise heartbeat emission, mid-stream stall, pre-stream stall (504), client disconnect cleanup, and normal fast-streaming behavior.	`server/src/routes/proxy.ts` `server/src/__tests__/routes/stream-heartbeat-stall.test.ts`
Make router analytics and routing more robust with recency-biased stats, balanced-pool exclusions, and smarter key checks.	Change stats aggregation SQL to compute recency-weighted successes and totals over the analytics window, while also tracking rawTotal separately. Extend ModelStats to include rawTotal and adjust getAnalyticsScores to expose rawTotal while using weighted successRate. Introduce EXCLUDED_FROM_BALANCED and EXCLUDED_MODELS_FROM_BALANCED so balanced routing excludes the LongCat platform and openrouter/owl-alpha, and only smart routing may pick them automatically. Factor out hasValidKeys helper that checks enabled, non-invalid keys for cooldown and capacity, then reuse it to prefer LongCat and Owl Alpha in smart mode when capacity exists (with Owl Alpha inserted after any preferred LongCat entries). Fix router tests to reset fallback_config ordering and correct an INSERT statement typo; simplify key-skip tests.	`server/src/services/router.ts` `server/src/__tests__/services/router.test.ts`
Improve provider error handling by intercepting wrapped error payloads returned with HTTP 200.	Make BaseProvider.extractErrorMessage protected and add isWrappedError/throwWrappedError helpers to detect root-level error fields and throw ProviderApiError with appropriate status and message. In Cloudflare, Cohere, OpenAI-compat, and Google providers, call isWrappedError/throwWrappedError after parsing non-stream responses and for each parsed streaming chunk before yielding. Add design documentation for wrapped-error interception describing architecture and formats handled.	`server/src/providers/base.ts` `server/src/providers/cloudflare.ts` `server/src/providers/cohere.ts` `server/src/providers/openai-compat.ts` `server/src/providers/google.ts` `.roo/specs/wrapped-error-interception/design.md` `.roo/specs/wrapped-error-interception/requirements.md` `.roo/specs/wrapped-error-interception/tasks.md`
Refine thread protection and model pool semantics for LongCat/Owl Alpha and pool-based fallback UI.	Exclude LongCat and Owl Alpha from balanced mode routing chain while allowing explicit targeting, and add smart-mode preference for Owl Alpha alongside LongCat using hasValidKeys. Introduce a generalized thread protection design (and partially implemented threadProtection service) to move away from hardcoded LongCat-only logic, though most enforcement is still in proxy via activeRequests and cooldowns. Expose model pool (fast/balanced/smart) in fallback API responses, validate values against ModelPool in tests, and update FallbackPage UI to group models by pool using PoolSection and poolTitles for better operator visibility.	`server/src/services/router.ts` `server/src/services/threadProtection.ts` `server/src/__tests__/routes/fallback.test.ts` `client/src/pages/FallbackPage.tsx` `.roo/specs/owl-alpha-longcat-model-routing/design.md` `.roo/specs/owl-alpha-longcat-model-routing/requirements.md` `.roo/specs/owl-alpha-longcat-model-routing/tasks.md` `.roo/specs/generalized-thread-protection/design.md` `.roo/specs/generalized-thread-protection/requirements.md` `.roo/specs/generalized-thread-protection/tasks.md`
Add and adjust internal design/specification and helper scripts for router and streaming behavior.	Document SSE heartbeat/stall-protection, recency-biased Thompson sampling, disabling sticky on auto, and transient model cooldowns via .roo/specs design, requirements, and task files. Add small helper scripts in the repo (Python and text files) apparently used by the author to patch proxy.ts and router.test.ts; these are not used at runtime but may influence development workflows.	`.roo/specs/sse-stream-heartbeat-stall-protection/design.md` `.roo/specs/sse-stream-heartbeat-stall-protection/requirements.md` `.roo/specs/sse-stream-heartbeat-stall-protection/tasks.md` `.roo/specs/recency-biased-thompson-sampling/design.md` `.roo/specs/recency-biased-thompson-sampling/requirements.md` `.roo/specs/recency-biased-thompson-sampling/tasks.md` `.roo/specs/disable-sticky-on-auto/design.md` `.roo/specs/disable-sticky-on-auto/requirements.md` `.roo/specs/disable-sticky-on-auto/tasks.md` `.roo/specs/transient-model-cooldown/design.md` `.roo/specs/transient-model-cooldown/requirements.md` `.roo/specs/transient-model-cooldown/tasks.md` `fix_streaming.py` `fix.py` `do_fix.py` `new_streaming_block.txt` `server/write_test.py` `server/write_tests.py`

coderabbitai · 2026-06-04T21:58:08Z

📝 Walkthrough

Walkthrough

Large cross-functional PR implementing sticky-session disabling for balanced routing, wrapped-error detection across providers, model-level banning for LongCat/Owl Alpha, recency-weighted analytics, transient model cooldown tracking, SSE stream stall protection, client pool grouping, and comprehensive design specifications for planned features.

Changes

Specification & Design Documentation

Layer / File(s)	Summary
Sticky Session Disabling `.roo/specs/disable-sticky-on-auto/*`	Design, requirements, and tasks for disabling sticky behavior on balanced/auto routing via empty session key from `getSessionKey()`.
Generalized Thread Protection Scanner `.roo/specs/generalized-thread-protection/*`	Design and requirements for replacing hardcoded LongCat logic with dynamic, configurable per-platform thread protection levels via `THREAD_PROTECTION_PLATFORMS` env var.
Owl Alpha + LongCat Model-Level Routing `.roo/specs/owl-alpha-longcat-model-routing/*`	Comprehensive design, requirements, and three-phase task plan for balanced-mode exclusion and smart-mode preference ordering of both models with model-level banning.
Recency-Biased Thompson Sampling `.roo/specs/recency-biased-thompson-sampling/*`	Design and requirements for linear time-decay weighting in analytics aggregation with raw/weighted count separation and Beta parameter guards.
SSE Stream Heartbeat & Stall Protection `.roo/specs/sse-stream-heartbeat-stall-protection/*`	Design and requirements for keep-alive SSE comments and upstream stall detection with configurable timeouts.
Transient Model Cooldown `.roo/specs/transient-model-cooldown/*`	Design and requirements for shared global 15-second cooldown state to mitigate concurrent transient failures.
Wrapped Error Interception `.roo/specs/wrapped-error-interception/*`	Design, requirements, and tasks for detecting HTTP 200 responses with root-level `error` payloads across all providers.
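
To make the recency-biased row above concrete, here is an in-memory sketch of linear time-decay weighting with separate raw and weighted counts. The PR performs this aggregation in SQL, so this is not its implementation; `RequestRecord`, `aggregate`, and `betaParams` are hypothetical names, and the 7-day window is taken from the Qodo summary of this PR.

```typescript
const WINDOW_MS = 7 * 24 * 60 * 60 * 1000; // 7-day analytics window (per the review summary)

interface RequestRecord {
  timestamp: number; // epoch ms
  success: boolean;
}

interface WeightedStats {
  rawTotal: number;  // unweighted count, kept for display
  total: number;     // recency-weighted count
  successes: number; // recency-weighted successes
}

// Linear decay: weight 1 for a request made now, falling to 0 at the window edge.
function linearWeight(ageMs: number): number {
  return Math.min(1, Math.max(0, 1 - ageMs / WINDOW_MS));
}

function aggregate(records: RequestRecord[], now: number): WeightedStats {
  const stats: WeightedStats = { rawTotal: 0, total: 0, successes: 0 };
  for (const r of records) {
    const w = linearWeight(now - r.timestamp);
    stats.rawTotal += 1;
    stats.total += w;
    if (r.success) stats.successes += w;
  }
  return stats;
}

// Beta(alpha, beta) parameters with a uniform prior. The clamps guard against
// rounding that could push weighted failures slightly below zero.
function betaParams(stats: WeightedStats): { alpha: number; beta: number } {
  const failures = Math.max(0, stats.total - stats.successes);
  return { alpha: Math.max(0, stats.successes) + 1, beta: failures + 1 };
}
```

Thompson sampling would then draw from Beta(alpha, beta) per model; the sampling step is omitted here because it needs a gamma or beta sampler that is outside the scope of this sketch.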

Core Implementation

Layer / File(s)	Summary
Wrapped Error Detection `server/src/providers/base.ts`, `server/src/providers/{openai-compat,cohere,cloudflare,google}.ts`	`BaseProvider.extractErrorMessage` visibility changed to `protected`; new `isWrappedError()` and `throwWrappedError()` methods detect and throw on root-level `error` fields in HTTP 200 responses; all providers insert checks after JSON parsing in both non-streaming and streaming paths.
Proxy Route: Sticky & Active Requests `server/src/routes/proxy.ts`	`getSessionKey()` returns `''` for balanced mode; new `activeRequests` Set tracks concurrent sessions; model-level banning replaces provider-level bans for LongCat/Owl Alpha across truncation/5xx/retryable errors; sticky preferences are cleared only when the pinned model matches the failing model; active request registration/deregistration in `finally` blocks.
Router: Analytics & Mode Filtering `server/src/services/router.ts`	`refreshStatsCache` computes recency-weighted successes/total with raw counts; `ModelStats` extended with `rawTotal`; balanced mode filters candidate chain to exclude longcat and owl-alpha; smart mode uses new `hasValidKeys()` helper to conditionally promote LongCat and Owl Alpha to front.
Thread Protection Service `server/src/services/threadProtection.ts`	New module defining `ProtectionLevel`, `ErrorContextKind`, `ErrorContext`, `ThreadProtectionAction` types; parses `THREAD_PROTECTION_PLATFORMS` env var with backward-compatible defaults (`longcat -> provider-ban`, others -> `model-skip`); exports `getProtectionLevel()` and `evaluateThreadProtection()`.
Client Pool Grouping `client/src/pages/FallbackPage.tsx`	Import `PoolSection` and `PoolType`; type `pool` field as `PoolType`; compute `poolGroups` by filtering `displayEntries` per pool; render grouped `PoolSection` blocks with per-pool `SortHeader` and `ModelRow` lists.
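
The pool grouping described in the last row can be illustrated without any React code. The sketch assumes `PoolType` is the union of the three pool names mentioned in this PR (fast, balanced, smart); `DisplayEntry` and `groupByPool` are hypothetical names, not identifiers from `FallbackPage.tsx`.

```typescript
// Assumed definition; the PR imports PoolType but does not show it.
type PoolType = "fast" | "balanced" | "smart";

interface DisplayEntry {
  modelId: string;
  pool: PoolType;
}

// Render order for the grouped sections.
const POOL_ORDER: PoolType[] = ["fast", "balanced", "smart"];

// Bucket entries by pool while preserving their existing order inside each pool.
function groupByPool(displayEntries: DisplayEntry[]): Record<PoolType, DisplayEntry[]> {
  const groups: Record<PoolType, DisplayEntry[]> = { fast: [], balanced: [], smart: [] };
  for (const entry of displayEntries) groups[entry.pool].push(entry);
  return groups;
}
```

A page component could then iterate `POOL_ORDER` and render one section per non-empty group.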

Test Coverage

Layer / File(s)	Summary
Session Ban & Sticky Tests `server/src/__tests__/routes/provider-session-ban.test.ts`	Shift sticky tests from balanced to smart mode; add ban lifecycle/TTL/interaction assertions; new suite verifying balanced mode disables all sticky operations (empty session key, no sticky creation, false ban checks).
Transient Cooldown Tests `server/src/__tests__/routes/transient-cooldown.test.ts`	Test cooldown map basics, injection/pruning, auto-recovery, sticky override interactions, error classification eligibility, and integration with session-ban additions.
SSE Heartbeat & Stall Tests `server/src/__tests__/routes/stream-heartbeat-stall.test.ts`	New test suite with local `request()` helper; five test cases covering keep-alive emission, stall termination, pre-stream stalls, client disconnect cleanup, and normal fast streaming.
Additional Updates `server/src/__tests__/routes/{proxy-tools,fallback}.test.ts`, `server/src/__tests__/services/router.test.ts`	proxy-tools adds cooldown cleanup; fallback adds pool field validation; router test refactors key insertion and imports additional helpers.

Helper Scripts & Utilities

Layer / File(s)	Summary
Streaming Block Helpers `fix.py`, `do_fix.py`, `fix_streaming.py`, `new_streaming_block.txt`	Python scripts and reference implementation for Promise.race-based stall detection in proxy streaming path; `new_streaming_block.txt` shows stream state initialization and `cleanupStream()` helper.
Test Generation `server/write_test.py`, `server/write_tests.py`	Python scripts generating `router.test.ts` with database initialization, API key setup, and initial routing behavior assertions.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related PRs

vi70x3/freellmapi#9: Shares the same getSessionKey() sticky-skipping logic for balanced mode and complementary test expectations in provider-session-ban.
vi70x3/freellmapi#2: Both PRs modify sticky-session machinery—main PR disables sticky for balanced by making getSessionKey() return '', while this PR adds sticky-key persistence, so balanced-mode change directly affects when new sticky-key behavior can activate.
vi70x3/freellmapi#8: Main PR disables sticky for balanced, which prevents LongCat sticky-cooldown safeguard (dependent on sticky session and skipModels/preferred-model handling) from running in that mode.

Poem

🐰 Sticky sessions untangled, errors caught mid-stream,
Models grouped by pool, analytics weighed with time's gleam,
Longcat and Owl Alpha now rule with finesse,
From balanced to smart, the routing's blessed!

qodo-code-review · 2026-06-04T21:58:44Z

Review Summary by Qodo

Realtime Sticky Sessions with Active-Request Tracking, Transient Cooldowns, and Wrapped Error Interception

✨ Enhancement 🧪 Tests 📝 Documentation

Walkthroughs

Description

  **Core Features:**
• Replaced LongCat cooldown mechanism with real-time active-request tracking for concurrent session
  protection, enabling model-level bans instead of provider-level bans
• Implemented identical active-request pattern for Owl Alpha model protection
• Added transient model cooldown system with 15-second window to share failure state across
  concurrent requests, preventing redundant retries during outages
• Introduced recency-weighted analytics scoring using 7-day decay function for fresher provider
  statistics
• Excluded LongCat and Owl Alpha from balanced routing mode entirely, with smart-mode preference
  when valid keys exist
• Disabled sticky sessions for balanced/auto endpoint by returning empty string from
  getSessionKey() in that mode
  **Error Handling & Resilience:**
• Added wrapped error detection helpers (isWrappedError(), throwWrappedError()) to
  BaseProvider for HTTP 200 responses containing error payloads
• Implemented wrapped error interception across all provider implementations (OpenAI-compat, Cohere,
  Cloudflare, Google)
• Added SSE stream heartbeat keep-alive comments (15-second interval) to prevent intermediate proxy
  idle timeouts
• Implemented stall detection (45-second threshold) with graceful termination of hung upstream
  connections
• Added sticky session clearing logic when LongCat or Owl Alpha encounters 5xx errors or retryable
  failures
  **Configuration & Routing:**
• Introduced generalized thread protection configuration module with configurable protection levels
  (provider-ban, model-skip, off) per platform
• Added hasValidKeys() helper to check key capacity for platform/model combinations
• Updated router to compute weighted success/total counts while tracking raw unweighted totals for
  dashboard transparency
  **UI Improvements:**
• Added pool-based grouping to Fallback Configuration Page, organizing models by pool (fast,
  balanced, smart)
• Added pool and speedRank properties to fallback API response with validation
  **Testing:**
• Comprehensive test suite for transient model cooldown lifecycle, injection, and auto-recovery
• New test suite for SSE stream heartbeat and stall detection with client disconnect cleanup
• Updated provider-session-ban tests to validate balanced mode sticky session disabling
• Updated proxy-tools tests to clear transient cooldowns in cleanup
• Fixed router test infrastructure with proper imports and SQL syntax

Diagram

flowchart LR
  A["HTTP 200 Response"] -->|"isWrappedError()"| B["Wrapped Error Detected"]
  B -->|"throwWrappedError()"| C["ProviderApiError Thrown"]
  C -->|"Retry Loop"| D["Request Retried"]
  
  E["Concurrent Requests"] -->|"5xx/Connection Error"| F["transientModelCooldowns Map"]
  F -->|"15s Cooldown"| G["Model Skipped Globally"]
  G -->|"Sticky Override"| H["Preferred Model Bypassed"]
  
  I["Balanced Mode"] -->|"getSessionKey()"| J["Empty String"]
  J -->|"Cascades"| K["All Sticky Ops Disabled"]
  
  L["LongCat/Owl Alpha"] -->|"Active Requests"| M["Concurrent Protection"]
  M -->|"Model-Level Ban"| N["Per-Model Tracking"]
  
  O["SSE Stream"] -->|"15s Interval"| P["Heartbeat Comment"]
  O -->|"45s Stall"| Q["Stream Terminated"]

File Changes

1. server/src/routes/proxy.ts ✨ Enhancement +204/-72

Realtime active-request tracking replaces cooldown for session protection

• Replaced LongCat-specific cooldown mechanism with active-request tracking for concurrent session
 protection
• Added activeRequests Set to track ongoing requests per session/platform/model, enabling
 real-time safeguards against simultaneous usage
• Implemented Owl Alpha model protection alongside LongCat using the same active-request pattern
• Changed error handling from provider-level bans to model-level bans for both LongCat and Owl Alpha
• Added sticky session clearing logic when LongCat or Owl Alpha encounters 5xx errors or retryable
 failures
• Modified getSessionKey() to return empty string for balanced routing mode, disabling sticky
 sessions in that mode

server/src/routes/proxy.ts

2. server/src/__tests__/routes/provider-session-ban.test.ts 🧪 Tests +92/-48

Tests updated for balanced mode sticky session disabling

• Updated all test cases to use smart routing mode instead of balanced for sticky session
 testing
• Added new test suite validating that balanced mode disables sticky sessions entirely
• Changed truncation detection test pattern from 'conflict in response' to 'cut off'
• Verified that getSessionKey() returns empty string for balanced mode
• Confirmed that sticky operations (ban, set, get) are no-ops in balanced mode

server/src/__tests__/routes/provider-session-ban.test.ts
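
The "empty session key makes sticky operations no-ops" behavior these tests verify can be sketched as below. The real `getSessionKey()` signature is not shown in this PR page, so the parameters here are assumptions, as are the helpers `setSticky` and `getSticky`; only the empty-string return for balanced mode and the `stickySessionMap` name come from the review text.

```typescript
type RoutingMode = "balanced" | "smart";

// sessionKey -> preferred model id
const stickySessionMap = new Map<string, string>();

// Hypothetical signature: balanced mode yields an empty key.
function getSessionKey(routingMode: RoutingMode, clientId: string): string {
  return routingMode === "balanced" ? "" : `${clientId}:${routingMode}`;
}

// Every sticky operation bails out on an empty key, so balanced mode never
// creates or reads sticky state.
function setSticky(sessionKey: string, modelId: string): void {
  if (!sessionKey) return;
  stickySessionMap.set(sessionKey, modelId);
}

function getSticky(sessionKey: string): string | undefined {
  return sessionKey ? stickySessionMap.get(sessionKey) : undefined;
}
```

Gating on the key rather than on the routing mode keeps the mode check in one place; callers do not need to know why a key is empty.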

3. server/src/__tests__/routes/transient-cooldown.test.ts 🧪 Tests +415/-0

New test suite for transient model cooldown system

• New comprehensive test suite for transient model cooldown functionality
• Tests cooldown map lifecycle: creation, retrieval, deletion, expiry pruning
• Validates cooldown injection into skipModels set and auto-recovery after expiry
• Tests sticky session override when preferred model is on global cooldown
• Verifies only 5xx and connection failures (undefined status) trigger cooldowns, not
 auth/rate-limit errors
• Integration tests confirm cooldown and session-ban mechanisms work together

server/src/__tests__/routes/transient-cooldown.test.ts
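
The lifecycle these tests cover (record, inject into `skipModels`, prune on expiry) might look roughly like this. `transientModelCooldowns` and `TRANSIENT_COOLDOWN_MS` are exported names mentioned in the review; the helper functions `isTransientFailure`, `recordFailure`, and `injectCooldowns` are hypothetical, and the map value type is an assumption.

```typescript
const TRANSIENT_COOLDOWN_MS = 15000; // 15-second window from the spec summary

// model id -> expiry time (epoch ms); shared across all requests
const transientModelCooldowns = new Map<string, number>();

// Only 5xx responses and connection failures (no status at all) count as transient.
function isTransientFailure(httpStatus: number | undefined): boolean {
  return httpStatus === undefined || httpStatus >= 500;
}

function recordFailure(modelId: string, httpStatus: number | undefined, now: number): void {
  if (isTransientFailure(httpStatus)) {
    transientModelCooldowns.set(modelId, now + TRANSIENT_COOLDOWN_MS);
  }
}

// Adds cooling models to skipModels and prunes expired entries (auto-recovery).
function injectCooldowns(skipModels: Set<string>, now: number): void {
  transientModelCooldowns.forEach((expiresAt, modelId) => {
    if (expiresAt <= now) transientModelCooldowns.delete(modelId);
    else skipModels.add(modelId);
  });
}
```

Because expiry is checked lazily at injection time, no background timer is needed for recovery in this sketch.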

4. server/src/__tests__/routes/stream-heartbeat-stall.test.ts 🧪 Tests +324/-0

New test suite for stream heartbeat and stall protection

• New test suite for SSE stream heartbeat and stall detection protection
• Tests heartbeat keep-alive comments emitted during idle periods (>100ms without chunks)
• Validates stream termination with stream_timeout error when stalled >500ms
• Tests pre-stream stall detection returning 504 before headers sent
• Verifies client disconnect cleanup prevents interval leaks
• Confirms normal fast streams complete successfully with heartbeat enabled

server/src/__tests__/routes/stream-heartbeat-stall.test.ts
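
The heartbeat and stall decisions exercised by this suite can be isolated into a pure function, which is also what makes them easy to test with shrunken intervals. The sketch below is an assumption-laden illustration: `streamKeepaliveConfig`, its two keys, and the `stream_timeout` error name come from the review, while `onTick`, `KEEPALIVE_FRAME`, `stallFrame`, and the payload shape are hypothetical.

```typescript
// Mutable on purpose so tests can shrink the intervals.
const streamKeepaliveConfig = {
  KEEPALIVE_INTERVAL_MS: 15000, // heartbeat interval cited in the review
  MAX_STREAM_STALL_MS: 45000,   // stall threshold cited in the review
};

type TickAction = "none" | "heartbeat" | "stall";

// Decide what a periodic timer tick should do, given when the last upstream
// chunk arrived and when anything was last written to the client.
function onTick(now: number, lastChunkAt: number, lastWriteAt: number): TickAction {
  if (now - lastChunkAt >= streamKeepaliveConfig.MAX_STREAM_STALL_MS) return "stall";
  if (now - lastWriteAt >= streamKeepaliveConfig.KEEPALIVE_INTERVAL_MS) return "heartbeat";
  return "none";
}

// An SSE line starting with ':' is a comment and is ignored by EventSource clients.
const KEEPALIVE_FRAME = ": keepalive\n\n";

// Hypothetical payload shape; only the `stream_timeout` name comes from the PR.
function stallFrame(): string {
  const payload = { error: { type: "stream_timeout", message: "Upstream provider stalled" } };
  return `data: ${JSON.stringify(payload)}\n\n`;
}
```

A streaming handler would call `onTick` from a `setInterval`, write `KEEPALIVE_FRAME` on "heartbeat", and write `stallFrame()` then end the response on "stall", clearing the interval in its cleanup path.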

5. server/src/services/router.ts ✨ Enhancement +86/-32

Recency-weighted analytics and Owl Alpha smart preference

• Added recency-weighted analytics scoring using 7-day decay function for stats freshness
• Introduced EXCLUDED_FROM_BALANCED and EXCLUDED_MODELS_FROM_BALANCED sets to exclude LongCat
 and Owl Alpha from balanced routing
• Implemented hasValidKeys() helper to check key capacity for a platform/model combination
• Added Owl Alpha smart-mode preference logic alongside LongCat preference
• Updated refreshStatsCache() to compute weighted success/total counts and track raw unweighted
 totals
• Modified getAnalyticsScores() to return raw unweighted totals instead of weighted totals

server/src/services/router.ts
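
The balanced-mode exclusion listed above reduces to a filter over the candidate chain. In this sketch the two Set names come from the review, but their contents (`longcat` as a platform and `openrouter/owl-alpha` as a model id) are inferred from the descriptions, and `ChainEntry` and `filterChain` are hypothetical names.

```typescript
const EXCLUDED_FROM_BALANCED = new Set<string>(["longcat"]); // platforms (assumed contents)
const EXCLUDED_MODELS_FROM_BALANCED = new Set<string>(["openrouter/owl-alpha"]); // model ids (assumed contents)

// Hypothetical entry shape used only for this illustration.
interface ChainEntry {
  platform: string;
  modelId: string;
}

// Balanced mode drops excluded platforms and models; other modes see the full chain.
function filterChain(chain: ChainEntry[], routingMode: string): ChainEntry[] {
  if (routingMode !== "balanced") return chain;
  return chain.filter(
    (e) => !EXCLUDED_FROM_BALANCED.has(e.platform) && !EXCLUDED_MODELS_FROM_BALANCED.has(e.modelId),
  );
}
```

Requests that explicitly target an excluded model would bypass this filter, since the exclusion only applies to automatic routing.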

6. server/src/__tests__/services/router.test.ts 🧪 Tests +3/-36

Router tests updated with import changes

• Updated test imports to include refreshStatsCache and getAnalyticsScores
• Removed redundant test cases for invalid/disabled keys
• Fixed SQL INSERT statement syntax error in test setup

server/src/__tests__/services/router.test.ts

7. server/src/services/threadProtection.ts ✨ Enhancement +119/-0

New thread protection configuration and decision module

• New module defining thread protection configuration and decision logic
• Exports ProtectionLevel type with three levels: provider-ban, model-skip, off
• Implements parseProtectionConfig() to parse THREAD_PROTECTION_PLATFORMS environment variable
• Provides getProtectionLevel() to look up protection level for a platform
• Implements evaluateThreadProtection() decision matrix for error context evaluation
• Default behavior preserves LongCat provider-ban while applying model-skip to other platforms

server/src/services/threadProtection.ts
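
As a rough illustration of the configuration lookup described above: the review does not show the syntax of `THREAD_PROTECTION_PLATFORMS`, so the `platform:level` comma-separated format below is purely an assumption, and the function signatures are simplified and hypothetical even though the names match those in the summary. The defaults (LongCat as provider-ban, everything else as model-skip) do come from the review.

```typescript
type ProtectionLevel = "provider-ban" | "model-skip" | "off";

const LEVELS: string[] = ["provider-ban", "model-skip", "off"];

// Assumed format: "platform:level,platform:level". Invalid pairs are ignored.
function parseProtectionConfig(raw: string | undefined): Map<string, ProtectionLevel> {
  // Backward-compatible default stated in the review: longcat -> provider-ban.
  const config = new Map<string, ProtectionLevel>([["longcat", "provider-ban"]]);
  for (const pair of (raw ?? "").split(",")) {
    const [platform, level] = pair.split(":").map((s) => s.trim().toLowerCase());
    if (platform && LEVELS.indexOf(level) !== -1) {
      config.set(platform, level as ProtectionLevel);
    }
  }
  return config;
}

// Platforms without an explicit entry fall back to model-skip.
function getProtectionLevel(config: Map<string, ProtectionLevel>, platform: string): ProtectionLevel {
  return config.get(platform.toLowerCase()) ?? "model-skip";
}
```

For example, under the assumed syntax, `THREAD_PROTECTION_PLATFORMS="longcat:off"` would disable the LongCat default while leaving other platforms on model-skip.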

8. server/src/providers/base.ts ✨ Enhancement +28/-1

Wrapped error detection helpers for HTTP 200 error responses

• Changed extractErrorMessage() visibility from private to protected for reuse
• Added isWrappedError() predicate to detect root-level error field in JSON responses
• Added throwWrappedError() helper to construct and throw ProviderApiError from wrapped error
 payloads
• Handles both string and object error formats with optional error codes

server/src/providers/base.ts
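
A standalone sketch of the two helpers described above, written as free functions rather than `BaseProvider` methods. The `ProviderApiError` constructor shape, the 502 fallback status, and the use of a numeric `code` as the HTTP status are assumptions made for illustration; the review only states that string and object error formats with optional codes are handled.

```typescript
// Stand-in for the project's error class; the real constructor may differ.
class ProviderApiError extends Error {
  readonly httpStatus: number;
  constructor(message: string, httpStatus: number) {
    super(message);
    this.name = "ProviderApiError";
    this.httpStatus = httpStatus;
  }
}

// True when a parsed JSON body carries a non-null root-level `error` field.
function isWrappedError(body: unknown): body is { error: unknown } {
  if (typeof body !== "object" || body === null) return false;
  const err = (body as { error?: unknown }).error;
  return err !== undefined && err !== null;
}

// Handles string errors and object errors with an optional numeric `code`.
// Falls back to 502 when no usable status is present (an assumption).
function throwWrappedError(body: { error: unknown }): never {
  const err = body.error;
  if (typeof err === "string") throw new ProviderApiError(err, 502);
  const obj = (typeof err === "object" && err !== null ? err : {}) as {
    message?: unknown;
    code?: unknown;
  };
  const message = typeof obj.message === "string" ? obj.message : JSON.stringify(err);
  const httpStatus = typeof obj.code === "number" && obj.code >= 400 ? obj.code : 502;
  throw new ProviderApiError(message, httpStatus);
}
```

A provider would call `if (isWrappedError(json)) throwWrappedError(json);` right after parsing a non-stream body, and again for each parsed streaming chunk before yielding it, so the existing retry loop sees a thrown error instead of a seemingly successful response.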

9. server/src/__tests__/routes/fallback.test.ts 🧪 Tests +11/-0

Fallback API tests for pool property validation

• Added import for ModelPool type from shared types
• Added test validating fallback API response includes speedRank and pool properties
• Added test verifying pool values are valid ModelPool enum values

server/src/__tests__/routes/fallback.test.ts

10. server/src/__tests__/routes/proxy-tools.test.ts 🧪 Tests +3/-2

Proxy tools tests updated for cooldown cleanup

• Added import for transientModelCooldowns from proxy module
• Updated beforeEach hook to clear both stickySessionMap and transientModelCooldowns

server/src/__tests__/routes/proxy-tools.test.ts

11. server/src/providers/openai-compat.ts ✨ Enhancement +10/-1

Wrapped error detection for OpenAI-compatible providers

• Added wrapped error detection in chatCompletion() method after JSON parsing
• Added wrapped error detection in streamChatCompletion() method for each SSE chunk
• Calls isWrappedError() and throwWrappedError() before processing response data

server/src/providers/openai-compat.ts

12. server/src/providers/cohere.ts ✨ Enhancement +10/-1

Wrapped error detection for Cohere provider

• Added wrapped error detection in chatCompletion() method after JSON parsing
• Added wrapped error detection in streamChatCompletion() method for each SSE chunk
• Calls isWrappedError() and throwWrappedError() before processing response data

server/src/providers/cohere.ts

13. server/src/providers/cloudflare.ts ✨ Enhancement +10/-1

Wrapped error detection for Cloudflare provider

• Added wrapped error detection in chatCompletion() method after JSON parsing
• Added wrapped error detection in streamChatCompletion() method for each SSE chunk
• Calls isWrappedError() and throwWrappedError() before processing response data

server/src/providers/cloudflare.ts

14. server/src/providers/google.ts ✨ Enhancement +10/-0

Wrapped error detection for Google Gemini provider

• Added wrapped error detection in chatCompletion() method after JSON parsing
• Added wrapped error detection in streamChatCompletion() method after parsing each chunk
• Calls isWrappedError() and throwWrappedError() before accessing response candidates

server/src/providers/google.ts

15. .roo/specs/sse-stream-heartbeat-stall-protection/design.md 📝 Documentation +330/-0

Design spec for stream heartbeat and stall protection

• New design specification for SSE stream heartbeat and stall protection feature
• Documents architecture with heartbeat interval (15s) and stall timeout (45s)
• Provides detailed implementation guidance for stream state variables and cleanup routines
• Includes mermaid flowchart showing stream lifecycle with heartbeat and stall detection
• Covers edge cases and interaction with existing error handling paths
• Lists all files requiring modification with specific line numbers

.roo/specs/sse-stream-heartbeat-stall-protection/design.md

16. .roo/specs/wrapped-error-interception/design.md 📝 Documentation +337/-0

Design spec for wrapped error payload interception

• New design specification for wrapped error payload detection on HTTP 200 responses
• Documents architecture with isWrappedError() predicate and throwWrappedError() helper
• Provides implementation details for all provider types (OpenAI, Cohere, Cloudflare, Google)
• Includes mermaid flowchart showing error detection and retry loop integration
• Covers wrapped error formats and edge cases (null errors, array errors, streaming errors)
• Lists all files to modify with specific method locations

.roo/specs/wrapped-error-interception/design.md
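
The predicate described in this spec can be sketched as follows. This is a minimal illustration, not the PR's code: the `WrappedError` type, `wrappedErrorMessage()` and `assertNotWrapped()` are hypothetical names, and treating `error: null` as "no error" and array-valued errors as errors are assumptions based on the edge cases the spec lists.

```typescript
// Shape of a parsed body that carries a root-level error (assumed for illustration).
type WrappedError = { error: unknown };

// True when a parsed JSON body or SSE chunk has a non-null root-level `error`.
// Assumption: `error: null` and a missing field both mean "no error".
function isWrappedError(body: unknown): body is WrappedError {
  if (typeof body !== "object" || body === null || Array.isArray(body)) return false;
  const err = (body as Record<string, unknown>).error;
  return err !== undefined && err !== null;
}

// Hypothetical helper: pull a readable message out of the wrapped payload.
function wrappedErrorMessage(body: WrappedError): string {
  const err = body.error;
  if (typeof err === "string") return err;
  if (typeof err === "object" && err !== null && !Array.isArray(err)) {
    const msg = (err as Record<string, unknown>).message;
    if (typeof msg === "string") return msg;
  }
  return JSON.stringify(err); // arrays and other shapes are reported verbatim
}

// Hypothetical guard: call on every parsed body or chunk before reading completion fields.
function assertNotWrapped(body: unknown): void {
  if (isWrappedError(body)) {
    throw new Error(`Provider returned an error with HTTP 200: ${wrappedErrorMessage(body)}`);
  }
}
```

Running the guard per SSE chunk, as well as on the non-streaming body, matches the spec's note that both response paths are covered.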

17. .roo/specs/transient-model-cooldown/requirements.md 📝 Documentation +38/-0

Requirements for transient model cooldown system

• New requirements document for shared temporary cooldowns for concurrent failure mitigation
• Defines problem: multiple concurrent requests independently retry failing models during outages
• Specifies cross-request transient failure state with 15-second cooldown window
• Requires integration with existing routing logic via skipModels set
• Mandates sticky session precedence: global cooldown overrides preferred model
• Includes auto-recovery via expiry and acceptance criteria

.roo/specs/transient-model-cooldown/requirements.md

18. .roo/specs/transient-model-cooldown/design.md Design documentation +197/-0

Shared Temporary Cooldowns for Concurrent Failure Mitigation

• Introduces a module-level transientModelCooldowns Map to share transient failure state across
 concurrent requests
• Implements lazy pruning of expired cooldown entries on every request
• Registers global cooldowns for 5xx and connection errors, excluding rate-limit and auth errors
• Overrides sticky session preferences when the preferred model is on global cooldown

.roo/specs/transient-model-cooldown/design.md
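
A rough sketch of the shared cooldown map with lazy pruning, assuming a 15-second window as stated in the requirements. `transientModelCooldowns` and `TRANSIENT_COOLDOWN_MS` are named in the PR; the function names, the `platform/model` key format, and the use of `undefined` status for connection errors are assumptions made for illustration.

```typescript
const TRANSIENT_COOLDOWN_MS = 15_000;

// Module-level state shared by all concurrent requests: model key -> expiry time (ms).
const transientModelCooldowns = new Map<string, number>();

// Register a cooldown only for 5xx and connection errors.
// Assumption: a connection error is represented by an undefined status.
function registerCooldown(modelKey: string, status: number | undefined, now: number): void {
  const transient = status === undefined || (status >= 500 && status <= 599);
  if (transient) transientModelCooldowns.set(modelKey, now + TRANSIENT_COOLDOWN_MS);
}

// Lazy pruning: expired entries are dropped while building the skip set for a request.
function activeCooldowns(now: number): Set<string> {
  const skipModels = new Set<string>();
  transientModelCooldowns.forEach((expiresAt, key) => {
    if (expiresAt <= now) transientModelCooldowns.delete(key);
    else skipModels.add(key);
  });
  return skipModels;
}

// A sticky preference is honoured only if the model is not cooling down.
function stickyChoice(preferred: string | undefined, skipModels: Set<string>): string | undefined {
  return preferred !== undefined && !skipModels.has(preferred) ? preferred : undefined;
}
```

Because rate-limit (429) and auth (401/403) statuses are not 5xx, they never enter the map, which is consistent with the exclusion the design describes.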

19. .roo/specs/recency-biased-thompson-sampling/design.md Design documentation +238/-0

Recency-Biased Thompson Sampling Time-Decay Aggregation

• Replaces flat request aggregation with linear time-decay weighting in SQL queries
• Adds rawSuccesses and rawTotal fields to ModelStats for dashboard transparency
• Implements Math.max(0.1, ...) guards on Beta parameters to prevent mathematical errors
• Updates dashboard display to show actual request counts while using weighted success rates

.roo/specs/recency-biased-thompson-sampling/design.md
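
The time-decay aggregation can be mirrored in TypeScript to make the arithmetic concrete. This is a sketch under assumptions: `RequestRow`, `aggregate()` and `betaParams()` are hypothetical names, and the exact Beta parameterization used by the scoring functions is not shown in the PR, so only the `Math.max(0.1, ...)` guard is taken from the source.

```typescript
const ANALYTICS_WINDOW_DAYS = 7;

// Same shape as the SQL expression MAX(0, MIN(1.0, 1.0 - age / 7.0)):
// a request from right now weighs 1, one from 7 or more days ago weighs 0.
function decayWeight(ageDays: number): number {
  return Math.max(0, Math.min(1, 1 - ageDays / ANALYTICS_WINDOW_DAYS));
}

interface RequestRow {
  ageDays: number;
  success: boolean;
}

// Weighted totals drive sampling; raw counts are kept for dashboard display.
function aggregate(rows: RequestRow[]) {
  let total = 0;
  let successes = 0;
  let rawSuccesses = 0;
  for (const row of rows) {
    const w = decayWeight(row.ageDays);
    total += w;
    if (row.success) {
      successes += w;
      rawSuccesses += 1;
    }
  }
  return { total, successes, rawTotal: rows.length, rawSuccesses };
}

// Fractional weights can reach 0, and Beta parameters must stay positive,
// hence the floor of 0.1 on both. (Assumed parameterization, no prior added.)
function betaParams(successes: number, total: number): { alpha: number; beta: number } {
  return {
    alpha: Math.max(0.1, successes),
    beta: Math.max(0.1, total - successes),
  };
}
```

With this weighting, a success from 7 days ago contributes nothing to the weighted success count while still appearing in `rawSuccesses`, which is why the dashboard shows raw counts separately.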

20. .roo/specs/sse-stream-heartbeat-stall-protection/requirements.md Requirements documentation +132/-0

SSE Stream Heartbeats and Stall Protection Requirements

• Adds periodic SSE comment heartbeats (15-second interval) to prevent intermediate proxy idle
 timeouts
• Implements stall detection (45-second threshold) to gracefully terminate hung upstream connections
• Defines cleanup routine for client disconnects and error conditions
• Specifies error frame formats for both Responses API and Chat Completion streams

.roo/specs/sse-stream-heartbeat-stall-protection/requirements.md
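
A minimal sketch of the heartbeat tick, assuming the 15-second interval and 45-second stall threshold stated in the spec. The constant names appear in the PR's task list; `decideTick()`, `frameFor()`, `startHeartbeat()` and the JSON shape of the error frame are hypothetical, since the spec's actual frame formats are not reproduced on this page.

```typescript
const KEEPALIVE_INTERVAL_MS = 15_000;
const MAX_STREAM_STALL_MS = 45_000;

type TickAction = "heartbeat" | "stalled";

// Decide what a timer tick should do, given when the last upstream chunk arrived.
function decideTick(now: number, lastChunkTimestamp: number): TickAction {
  return now - lastChunkTimestamp >= MAX_STREAM_STALL_MS ? "stalled" : "heartbeat";
}

// SSE lines starting with ":" are comments, so clients ignore the heartbeat.
// The error payload below is an assumed shape, not the spec's frame format.
function frameFor(action: TickAction): string {
  if (action === "heartbeat") return ": keep-alive\n\n";
  const payload = { error: { message: "Upstream stream stalled", type: "timeout" } };
  return `data: ${JSON.stringify(payload)}\n\n`;
}

// Wire the decision into a timer. Returns a stop function for cleanup paths.
function startHeartbeat(
  write: (frame: string) => void,
  lastChunkAt: () => number,
  onStall: () => void,
): () => void {
  const timer = setInterval(() => {
    const action = decideTick(Date.now(), lastChunkAt());
    write(frameFor(action));
    if (action === "stalled") {
      clearInterval(timer);
      onStall();
    }
  }, KEEPALIVE_INTERVAL_MS);
  return () => clearInterval(timer);
}
```

With these values, a silent upstream produces heartbeats on the ticks 15 and 30 seconds after the last chunk, and the tick at 45 seconds reports the stall.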

21. .roo/specs/owl-alpha-longcat-model-routing/design.md Design documentation +184/-0

Owl Alpha and LongCat Model-Level Routing Design

• Excludes LongCat and Owl Alpha from balanced auto routing entirely
• Implements smart mode preference for both models when valid keys exist
• Migrates LongCat from provider-level to model-level banning
• Applies identical model-level banning to Owl Alpha across all error types

.roo/specs/owl-alpha-longcat-model-routing/design.md
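
The difference between model-level and provider-level banning can be illustrated with a small filter. Everything named here is hypothetical: the `Candidate` shape, the `platform/model` keys, and the identifiers used for LongCat and Owl Alpha are placeholders, not values from the PR.

```typescript
type RoutingMode = "balanced" | "smart";

interface Candidate {
  platform: string;
  modelId: string;
}

// Placeholder identifiers for the two models excluded from balanced routing.
const BALANCED_EXCLUDED_MODELS = new Set(["longcat/longcat-model", "owl/owl-alpha"]);

const modelKey = (c: Candidate): string => `${c.platform}/${c.modelId}`;

// Bans are keyed by model, so other models on the same platform stay routable.
function eligibleCandidates(
  mode: RoutingMode,
  candidates: Candidate[],
  bannedModels: Set<string>,
): Candidate[] {
  return candidates.filter((c) => {
    const key = modelKey(c);
    if (bannedModels.has(key)) return false;
    if (mode === "balanced" && BALANCED_EXCLUDED_MODELS.has(key)) return false;
    return true;
  });
}
```

A provider-level ban would instead compare `c.platform`, which removes every model on that platform after a single model fails.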

22. .roo/specs/owl-alpha-longcat-model-routing/tasks.md Tasks documentation +116/-0

Owl Alpha and LongCat Model-Level Routing Implementation Tasks

• Phase 1: Add balanced mode exclusion constants and filter logic in router
• Phase 2: Implement sticky cooldown checks and model-level error handling in proxy
• Phase 3: Define test cases for exclusion, preference, cooldown, and banning behavior

.roo/specs/owl-alpha-longcat-model-routing/tasks.md

23. client/src/pages/FallbackPage.tsx ✨ Enhancement +49/-30

Add Pool-Based Grouping to Fallback Configuration Page

• Adds pool field to FallbackEntry interface to support pool categorization
• Imports PoolSection component and PoolType type for UI organization
• Groups model entries by pool (fast, balanced, smart) with descriptive section titles
• Wraps table rendering in pool-grouped sections instead of flat list

client/src/pages/FallbackPage.tsx

24. .roo/specs/owl-alpha-longcat-model-routing/requirements.md Requirements documentation +126/-0

Owl Alpha and LongCat Model-Level Routing Requirements

• Specifies exclusion of LongCat and Owl Alpha from balanced auto routing
• Defines smart auto preference logic for both models with capacity validation
• Requires sticky session cooldown protection for both platforms
• Mandates model-level (not provider-level) banning for both LongCat and Owl Alpha

.roo/specs/owl-alpha-longcat-model-routing/requirements.md

25. .roo/specs/generalized-thread-protection/design.md Design documentation +152/-0

Generalized Thread Protection Scanner Architecture

• Introduces threadProtection.ts scanner module to replace hardcoded platform-specific branches
• Defines configurable protection levels (provider-ban, model-skip, off) per platform
• Implements decision matrix for 5xx, truncation, and retryable error handling
• Generalizes sticky cooldown logic to work with any platform configured for provider-ban protection

.roo/specs/generalized-thread-protection/design.md

26. .roo/specs/disable-sticky-on-auto/design.md Design documentation +97/-0

Disable Sticky Sessions on Balanced Auto Endpoint

• Modifies getSessionKey() to return empty string for balanced routing mode
• Cascades through all sticky session functions via early-return pattern
• Preserves sticky sessions for smart routing mode unchanged
• Eliminates need for changes across multiple functions via single-point guard

.roo/specs/disable-sticky-on-auto/design.md
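
The single-point guard can be sketched as below. `getSessionKey()` and `stickySessionMap` are named in the PR, but the signature, the key format, and the two helper functions are assumptions made for illustration.

```typescript
type RoutingMode = "balanced" | "smart";

const stickySessionMap = new Map<string, string>();

// Balanced mode yields an empty key, which every sticky helper treats as "no session".
function getSessionKey(mode: RoutingMode, clientId: string): string {
  return mode === "balanced" ? "" : `${mode}:${clientId}`;
}

// Hypothetical sticky helpers using the early-return pattern.
function rememberModel(sessionKey: string, model: string): void {
  if (!sessionKey) return;
  stickySessionMap.set(sessionKey, model);
}

function preferredModel(sessionKey: string): string | undefined {
  if (!sessionKey) return undefined;
  return stickySessionMap.get(sessionKey);
}
```

Because the helpers already bail out on an empty key, only `getSessionKey()` needs to know about routing modes.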

27. .roo/specs/wrapped-error-interception/requirements.md Requirements documentation +53/-0

Wrapped Error Payloads on HTTP 200 Responses Detection

• Detects error payloads wrapped in HTTP 200 responses across all provider implementations
• Throws properly typed ProviderApiError to enable existing retry loop handling
• Validates root-level error field in both JSON and streaming responses
• Extracts error messages and status codes from wrapped error objects

.roo/specs/wrapped-error-interception/requirements.md

28. .roo/specs/wrapped-error-interception/tasks.md Tasks documentation +65/-0

Wrapped Error Interception Implementation Tasks

• Adds isWrappedError() and throwWrappedError() helper methods to BaseProvider
• Changes extractErrorMessage() visibility from private to protected
• Inserts wrapped-error checks in all four provider implementations (OpenAI-compat, Cohere,
 Cloudflare, Google)
• Covers both non-streaming and streaming response paths

.roo/specs/wrapped-error-interception/tasks.md

29. .roo/specs/recency-biased-thompson-sampling/tasks.md Tasks documentation +17/-0

Recency-Biased Thompson Sampling Implementation Tasks

• Adds ANALYTICS_WINDOW_DAYS constant derivation from existing window constant
• Extends ModelStats interface with rawSuccesses and rawTotal fields
• Rewrites SQL query with CTE-based linear time-decay weighting
• Adds Math.max(0.1, ...) guards to Beta parameters in four scoring functions

.roo/specs/recency-biased-thompson-sampling/tasks.md

30. .roo/specs/recency-biased-thompson-sampling/requirements.md Requirements documentation +76/-0

Recency-Biased Thompson Sampling Requirements and Test Cases

• Specifies linear time-decay formula for request weighting over 7-day window
• Requires Math.max(0.1, ...) guards for Beta parameter safety
• Uses standard SQLite julianday() function for portability
• Defines test cases for outage sensitivity and fractional evaluation safety

.roo/specs/recency-biased-thompson-sampling/requirements.md

31. .roo/specs/disable-sticky-on-auto/requirements.md Requirements documentation +44/-0

Disable Sticky Sessions on Balanced Auto Endpoint Requirements

• Disables sticky model and key pinning for balanced/auto endpoint
• Removes session-level platform bans for balanced mode
• Preserves all sticky functionality for smart/auto-smart endpoint
• Maintains per-request retry skip logic for both modes

.roo/specs/disable-sticky-on-auto/requirements.md

32. .roo/specs/sse-stream-heartbeat-stall-protection/tasks.md Tasks documentation +20/-0

SSE Stream Heartbeats and Stall Protection Implementation Tasks

• Adds constants KEEPALIVE_INTERVAL_MS and MAX_STREAM_STALL_MS to proxy.ts
• Implements heartbeat interval with stall detection and cleanup routine
• Adds client-disconnect listener and error handling for heartbeat write failures
• Defines unit tests for heartbeat emission, stall detection, and cleanup scenarios

.roo/specs/sse-stream-heartbeat-stall-protection/tasks.md

33. .roo/specs/transient-model-cooldown/tasks.md Tasks documentation +16/-0

Transient Model Cooldown Implementation Tasks

• Declares module-level transientModelCooldowns Map and TRANSIENT_COOLDOWN_MS constant
• Implements pre-routing cooldown injection with lazy pruning
• Registers global cooldowns on 5xx and connection failures
• Defines unit tests for injection, registration, sticky override, and auto-recovery

.roo/specs/transient-model-cooldown/tasks.md

34. server/write_tests.py Miscellaneous +45/-0

Router Test File Generation Script

• Python script to write router test file in parts to avoid truncation
• Sets up test infrastructure with database initialization and encryption key
• Defines basic test cases for key configuration and model routing

server/write_tests.py

35. .roo/specs/generalized-thread-protection/tasks.md Tasks documentation +12/-0

Generalized Thread Protection Scanner Implementation Tasks

• Renames LONGCAT_STICKY_COOLDOWN_MS to THREAD_COOLDOWN_MS throughout proxy.ts
• Removes hardcoded LongCat and Owl Alpha cooldown blocks
• Inserts generalized thread protection scanner with exhaustion protection
• Defines unit tests for dynamic exclusivity, exhaustion bypass, and self-preservation

.roo/specs/generalized-thread-protection/tasks.md

36. server/write_test.py Miscellaneous +29/-0

Router Test File Generation Script

• Python script to write complete router.test.ts file with test infrastructure
• Initializes database and encryption key for test environment
• Defines basic test cases for routing behavior

server/write_test.py

37. .roo/specs/disable-sticky-on-auto/tasks.md Tasks documentation +16/-0

Disable Sticky Sessions on Balanced Auto Endpoint Tasks

• Modifies getSessionKey() to return empty string for balanced routing mode
• Adds test cases verifying sticky operations are skipped in balanced mode
• Runs existing test suite to verify backward compatibility
• Includes manual smoke test for free routing behavior

.roo/specs/disable-sticky-on-auto/tasks.md

38. new_streaming_block.txt Miscellaneous +28/-0

New Streaming Block Implementation Snippet

• Defines new streaming block structure with heartbeat and stall detection state variables
• Implements cleanupStream() idempotent cleanup routine
• Introduces stallTimeout() helper for Promise.race-based stall detection

new_streaming_block.txt
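
For orientation, a Promise.race-based stall guard with idempotent cleanup might look like the sketch below. The names `stallTimeout` and `cleanupStream` come from the summary above; their signatures here, and the `nextWithStallGuard()` and `makeCleanupStream()` wrappers, are assumptions, since the committed snippet is truncated.

```typescript
// Rejects after `ms` unless cancelled; cancel() clears the timer so it cannot fire later.
function stallTimeout(ms: number): { promise: Promise<never>; cancel: () => void } {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const promise = new Promise<never>((_resolve, reject) => {
    timer = setTimeout(() => reject(new Error(`stream stalled for ${ms}ms`)), ms);
  });
  return {
    promise,
    cancel: () => {
      if (timer !== undefined) clearTimeout(timer);
    },
  };
}

// Waits for the next chunk, but rejects if the upstream stays silent for `ms`.
async function nextWithStallGuard<T>(
  iter: AsyncIterator<T>,
  ms: number,
): Promise<IteratorResult<T>> {
  const guard = stallTimeout(ms);
  try {
    return await Promise.race([iter.next(), guard.promise]);
  } finally {
    guard.cancel(); // always release the timer, whichever side won the race
  }
}

// Idempotent cleanup: safe to call from stall, error, and client-disconnect paths.
function makeCleanupStream(release: () => void): () => void {
  let cleaned = false;
  return function cleanupStream(): void {
    if (cleaned) return;
    cleaned = true;
    release();
  };
}
```

Cancelling the guard in `finally` matters: without it, each received chunk would leave a pending timer behind.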

39. fix_streaming.py Miscellaneous +21/-0

Streaming Block Replacement Script

• Python script to replace streaming block in proxy.ts with Promise.race-based stall detection
• Reads before/after parts and applies new streaming block structure

fix_streaming.py

40. fix.py Miscellaneous +8/-0

Streaming Block Replacement Script Stub

• Python script stub for replacing streaming block in proxy.ts
• Locates streaming block boundaries for replacement

fix.py

41. .roo/specs/generalized-thread-protection/requirements.md Miscellaneous +5/-0

Generalized Thread Protection Requirements Stub

• Incomplete requirements file with problem statement header
• Indicates need for generalized thread protection scanner to replace hardcoded branches

.roo/specs/generalized-thread-protection/requirements.md

42. do_fix.py Miscellaneous +7/-0

Streaming Block Replacement Script Stub

• Python script stub for streaming block replacement
• Minimal implementation for file content reading

do_fix.py

43. fix{ Additional files +0/-0

...

fix{

qodo-code-review · 2026-06-04T21:58:44Z

Code Review by Qodo

🐞 Bugs (7) 📘 Rule violations (0)

Context used

✅ Compliance rules (platform): 6 rules

1. Extra paren breaks SQL 🐞 Bug ≡ Correctness

Description

refreshStatsCache() contains an unbalanced weighted SUM expression: it ends in "/ 7.0))))", one
closing parenthesis more than it opens, so SQLite statement preparation will fail. Since
routeRequest() calls refreshStatsCache() unconditionally, routing can throw before selecting any model.

Code

server/src/services/router.ts[184]

+      SUM(MAX(0, MIN(1.0, 1.0 - (julianday('now') - julianday(created_at)) / 7.0)))) as total,

Evidence

The SQL string used by refreshStatsCache ends the weighted SUM(...) expression with four closing
parentheses, and routeRequest always invokes refreshStatsCache before routing, so the prepare-time
SQL error will occur in the hot path.

server/src/services/router.ts[175-201]
server/src/services/router.ts[493-506]

Agent prompt

The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
`refreshStatsCache()` builds an invalid SQLite query due to an extra closing parenthesis in the recency-weighted `SUM(MAX(0, MIN(...))))` expression.

## Issue Context
`routeRequest()` calls `refreshStatsCache(db)` on every routing decision, so this SQL prepare error will break request routing globally.

## Fix Focus Areas
- server/src/services/router.ts[178-188]
- server/src/services/router.ts[503-506]

2. Missing pool components 🐞 Bug ≡ Correctness

Description

FallbackPage imports @/components/pool-section and @/components/pool-badge, but no such modules
exist in the client codebase. This causes immediate client build failures due to unresolved module
imports.

Code

client/src/pages/FallbackPage.tsx[R6-7]

+import { PoolSection } from '@/components/pool-section'
+import type { PoolType } from '@/components/pool-badge'

Evidence
FallbackPage now depends on PoolSection/PoolType via new imports and JSX usage; without
corresponding modules in the repo, the client build will fail at import resolution.
client/src/pages/FallbackPage.tsx[1-10]
client/src/pages/FallbackPage.tsx[288-371]

Agent prompt

The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
`client/src/pages/FallbackPage.tsx` imports `PoolSection` and `PoolType` from modules that are not present in the repository, which will fail module resolution during TS/build.

## Issue Context
The page renders `<PoolSection>` and types entries with `PoolType`, so the imports cannot be trivially removed without adjusting rendering/typing.

## Fix Focus Areas
- client/src/pages/FallbackPage.tsx[6-7]
- client/src/pages/FallbackPage.tsx[67-72]
- client/src/pages/FallbackPage.tsx[335-371]

3. Fallback API lacks pool 🐞 Bug ≡ Correctness

Description

FallbackPage groups/filters entries by entry.pool and the new fallback API test asserts a pool
field, but the /api/fallback response objects do not include pool. This will render no entries
in the UI and fail the new test.

Code

client/src/pages/FallbackPage.tsx[R298-300]

+  const poolGroups = poolOrder
+    .map(pool => ({ pool, entries: displayEntries.filter(e => e.pool === pool) }))
+    .filter(group => group.entries.length > 0)

Evidence
The client filters on e.pool and the test expects a pool property, but the server’s fallback
route response mapping never assigns pool, so the field will be undefined/missing.
client/src/pages/FallbackPage.tsx[288-301]
server/src/routes/fallback.ts[37-78]
server/src/__tests__/routes/fallback.test.ts[43-62]
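
The effect is easy to reproduce in isolation. The sketch below reuses the grouping expression from the diff; the `FallbackEntry` fields and the sample entries are hypothetical, and `pool` is marked optional only to model a server response that omits it.

```typescript
type PoolType = "fast" | "balanced" | "smart";

// Assumed entry shape; `pool` is optional here to model the current server response.
interface FallbackEntry {
  modelId: string;
  pool?: PoolType;
}

const poolOrder: PoolType[] = ["fast", "balanced", "smart"];

// Same grouping logic as the client code in the diff.
function groupByPool(displayEntries: FallbackEntry[]) {
  return poolOrder
    .map((pool) => ({ pool, entries: displayEntries.filter((e) => e.pool === pool) }))
    .filter((group) => group.entries.length > 0);
}

// Entries as the server sends them today (no pool) versus with the field populated.
const withoutPool: FallbackEntry[] = [{ modelId: "model-a" }, { modelId: "model-b" }];
const withPool: FallbackEntry[] = [
  { modelId: "model-a", pool: "fast" },
  { modelId: "model-b", pool: "smart" },
];

console.log(groupByPool(withoutPool).length); // no group matches an undefined pool
console.log(groupByPool(withPool).map((g) => g.pool));
```

Since `undefined === pool` is never true, every entry is filtered out of every group, and the page renders no sections.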

Agent prompt

The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
The client and tests now require a `pool` field on `/api/fallback` entries, but the server route currently omits it.

## Issue Context
- Client: builds `poolGroups` by filtering `displayEntries` on `e.pool`.
- Test: asserts each entry has `pool` and that it is in the expected enum.

## Fix Focus Areas
- server/src/routes/fallback.ts[37-78]
- client/src/pages/FallbackPage.tsx[288-301]
- server/src/__tests__/routes/fallback.test.ts[43-62]

4. Shared types missing ModelPool 🐞 Bug ≡ Correctness

Description

fallback.test.ts imports ModelPool from @freellmapi/shared/types.js, but the shared workspace
package entrypoint is shared/types.ts, which does not export ModelPool. This breaks compilation
of the new fallback test (and any code trying to use ModelPool).

Code

server/src/__tests__/routes/fallback.test.ts[2]
+import { ModelPool } from '@freellmapi/shared/types.js';

Evidence
The test imports and uses ModelPool.*, but the shared package resolves to shared/types.ts (per
package.json) and that file currently exports types like Platform/Model without any ModelPool
definition.
server/src/__tests__/routes/fallback.test.ts[1-62]
shared/package.json[1-7]
shared/types.ts[1-73]

Agent prompt

The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
`server/src/__tests__/routes/fallback.test.ts` imports `ModelPool` from `@freellmapi/shared/types.js`, but the workspace `@freellmapi/shared` package points to `shared/types.ts` and that file does not define/export `ModelPool`.

## Issue Context
Because `shared/package.json` uses `main`/`types` = `./types.ts`, any symbol imported from `@freellmapi/shared/types.js` must be exported from `shared/types.ts`.

## Fix Focus Areas
- server/src/__tests__/routes/fallback.test.ts[1-62]
- shared/package.json[1-7]
- shared/types.ts[1-80]

5. router.test.ts SQL malformed 🐞 Bug ≡ Correctness

Description

server/src/__tests__/services/router.test.ts contains a broken INSERT template string with a
duplicated VALUES line, producing invalid SQL and invalid test setup. This will fail the router
test (and may block the overall test suite depending on runner settings).

Code

server/src/__tests__/services/router.test.ts[R31-32]
+      VALUES (?,
      VALUES (?, ?, ?, ?, ?, ?, ?)

Evidence

The test’s SQL string includes both VALUES (?, and a second VALUES (?, ?, ...) line inside the
same INSERT statement, which is invalid SQL and will fail when executed.

server/src/__tests__/services/router.test.ts[22-38]

Agent prompt

The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
The router test inserts an API key using a template string that includes `VALUES` twice, making the SQL invalid.

## Issue Context
The first `it('should route to highest priority model...')` test depends on this INSERT to seed the DB.

## Fix Focus Areas
- server/src/__tests__/services/router.test.ts[26-34]

6. NaN wrapped error status 🐞 Bug ☼ Reliability

Description

BaseProvider.throwWrappedError() sets error.status using Number(errPayload.code), which becomes
NaN for non-numeric codes. proxy.ts treats NaN as a number in getErrorStatus(), but NaN never
matches the numeric status checks, misclassifying wrapped provider errors as
non-retryable/non-ban-eligible.

Code

server/src/providers/base.ts[R142-145]

+    error.status =
+      typeof errPayload === 'object' && errPayload !== null && 'code' in (errPayload as Record<string, unknown>)
+        ? Number((errPayload as Record<string, unknown>).code)
+        : 200;

Evidence
The base provider converts error.code with Number(...) (NaN possible) and proxy routing logic
relies on numeric status comparisons; NaN will bypass these checks.
server/src/providers/base.ts[124-149]
server/src/routes/proxy.ts[496-545]

Agent prompt

The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
`throwWrappedError()` can assign `status: NaN` when the upstream `error.code` is non-numeric, causing downstream status-based logic to fail to recognize retryable/ban-eligible cases.

## Issue Context
`proxy.ts` uses strict numeric comparisons for retry/ban decisions, and `getErrorStatus()` currently returns any `number`, including NaN.

## Fix Focus Areas
- server/src/providers/base.ts[135-148]
- server/src/routes/proxy.ts[496-545]
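
One possible shape for the fix, sketched under assumptions: `normalizeWrappedStatus()` is a hypothetical helper, and the 502 fallback is an illustrative choice (the current code falls back to 200 when no `code` is present).

```typescript
// Map an upstream `error.code` to a finite HTTP status, never NaN.
function normalizeWrappedStatus(code: unknown, fallback: number = 502): number {
  let n = NaN;
  if (typeof code === "number") n = code;
  else if (typeof code === "string" && code.trim() !== "") n = Number(code);
  // Accept only integers in the HTTP status range; everything else uses the fallback.
  return Number.isInteger(n) && n >= 100 && n <= 599 ? n : fallback;
}

// Why the guard is needed: a symbolic code converts to NaN, which is still a "number".
const symbolic = Number("rate_limit_exceeded");
console.log(Number.isNaN(symbolic), typeof symbolic === "number");

console.log(normalizeWrappedStatus(429));
console.log(normalizeWrappedStatus("503"));
console.log(normalizeWrappedStatus("rate_limit_exceeded"));
```

A complementary guard in `getErrorStatus()`, such as checking `Number.isFinite()` before returning a status, would cover errors that arrive with a NaN status from other paths.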

7. Broken helper scripts committed 🐞 Bug ⚙ Maintainability

Description

Multiple newly added Python helper scripts are committed in a truncated/syntax-error state and
include hard-coded absolute paths. These files add repo noise and can break if accidentally executed
by developers or CI tooling.

Code

do_fix.py[R1-7]

+#!/usr/bin/env python3
+"""Replace the streaming block in proxy.ts with Promise.race-based stall detection."""
+
+with open('server/src/routes/proxy.ts', 'r') as f:
+    content = f.read()
+
+{

Evidence
The scripts are clearly truncated (e.g., a bare { at EOF) and contain hard-coded absolute paths,
indicating they are not production-ready artifacts.
do_fix.py[1-7]
server/write_test.py[1-29]
server/write_tests.py[1-45]

Agent prompt

The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
Helper scripts appear accidentally committed: they are incomplete (syntax errors) and include machine-specific absolute paths.

## Issue Context
These files are not referenced by the app but remain in the repo root and `server/`, increasing maintenance burden and risk of accidental execution.

## Fix Focus Areas
- do_fix.py[1-7]
- fix.py[1-8]
- fix_streaming.py[1-21]
- server/write_test.py[1-29]
- server/write_tests.py[1-45]

sourcery-ai

Hey - I've found 3 issues, and left some high level feedback:

There are several scratch/automation scripts and temp files checked in (e.g. write_test.py, fix_streaming.py, fix.py, do_fix.py, server/write_tests.py, new_streaming_block.txt); these should either be removed from the repo or moved under an appropriate tooling directory and excluded from the production code path.
The changes to server/src/__tests__/services/router.test.ts appear incomplete/corrupted (e.g. the malformed INSERT INTO api_keys statement with duplicate VALUES clause and the truncated test body ending in const groqKey = encrypt); please restore this file to a syntactically valid state and ensure the intended test modifications are fully applied.

Prompt for AI Agents

Please address the comments from this code review:

## Overall Comments
- There are several scratch/automation scripts and temp files checked in (e.g. write_test.py, fix_streaming.py, fix.py, do_fix.py, server/write_tests.py, new_streaming_block.txt); these should either be removed from the repo or moved under an appropriate tooling directory and excluded from the production code path.
- The changes to server/src/__tests__/services/router.test.ts appear incomplete/corrupted (e.g. the malformed INSERT INTO api_keys statement with duplicate VALUES clause and the truncated test body ending in `const groqKey = encrypt`); please restore this file to a syntactically valid state and ensure the intended test modifications are fully applied.

## Individual Comments

### Comment 1
<location path="fix_streaming.py" line_range="8-17" />
<code_context>
+with open('/tmp/before.ts', 'r') as f:
+    before = f.read()
+
+with open('/tmp/after.ts', 'r') as f:
+    after = f.read()  # starts with "} else {"
+
+# The new streaming block
+new_streaming = r'''      if (stream) {
+        // SSE headers set immediately so keep-alive works during TTFB.
+        // Pre-stream errors stay retryable; mid-stream errors emit an SSE error frame.
+        let totalOutputTokens = 0;
+        let streamedText = '';
+        let sawToolCalls = false;
+        let streamStarted = false;
+        let ttfbMs: number | null = null;
+        let lastChunkTimestamp = Date.now();
+        let heartbeatInterval: ReturnType<typeof setInterval>{
\ No newline at end of file
</code_context>
<issue_to_address>
**issue:** Helper script appears incomplete/truncated and may not be runnable as-is.

This script only reads `before.ts`/`after.ts` and defines `new_streaming`; it never performs a replacement or writes output. It also ends mid-line at `let heartbeatInterval: ReturnType<typeof setInterval>{`, so it isn’t syntactically valid.

If it’s just a local helper, either remove it from the repo or finish it so it compiles and has a clear usage, to avoid confusion for future maintainers and tooling.
</issue_to_address>

### Comment 2
<location path="fix.py" line_range="4" />
<code_context>
+#!/usr/bin/env python3
+"""Replace the streaming block in proxy.ts with Promise.race-based stall detection."""
+
+with open('server/src/routes/proxy.ts', 'r') as f:
+    content = f.read()
+
</code_context>
<issue_to_address>
**issue:** This script is effectively a no-op and looks like an unfinished helper.

`fix.py` only reads `proxy.ts` and doesn’t implement the described replacement. If this script isn’t used, consider removing it to avoid confusion; if you intend to keep it, either add the replacement logic or a clear TODO indicating it’s incomplete and how it should be used.
</issue_to_address>

### Comment 3
<location path="do_fix.py" line_range="4-7" />
<code_context>
+#!/usr/bin/env python3
+"""Replace the streaming block in proxy.ts with Promise.race-based stall detection."""
+
+with open('server/src/routes/proxy.ts', 'r') as f:
+    content = f.read()
+
+{
\ No newline at end of file
</code_context>
<issue_to_address>
**issue (bug_risk):** do_fix.py ends with a bare `{` and is syntactically invalid Python.

The trailing `{` makes this file invalid Python and will raise a syntax error if run. If this was just a scratch file, please either delete it from the repo or update it to be valid so it doesn't break tools that scan or import Python modules.
</issue_to_address>

sourcery-ai · 2026-06-04T21:59:52Z

+with open('/tmp/after.ts', 'r') as f:
+    after = f.read()  # starts with "} else {"
+
+# The new streaming block
+new_streaming = r'''      if (stream) {
+        // SSE headers set immediately so keep-alive works during TTFB.
+        // Pre-stream errors stay retryable; mid-stream errors emit an SSE error frame.
+        let totalOutputTokens = 0;
+        let streamedText = '';
+        let sawToolCalls = false;


issue: Helper script appears incomplete/truncated and may not be runnable as-is.

This script only reads before.ts/after.ts and defines new_streaming; it never performs a replacement or writes output. It also ends mid-line at let heartbeatInterval: ReturnType<typeof setInterval>{, so it isn’t syntactically valid.

If it’s just a local helper, either remove it from the repo or finish it so it compiles and has a clear usage, to avoid confusion for future maintainers and tooling.

sourcery-ai · 2026-06-04T21:59:52Z

+#!/usr/bin/env python3
+"""Replace the streaming block in proxy.ts with Promise.race-based stall detection."""
+
+with open('server/src/routes/proxy.ts', 'r') as f:


issue: This script is effectively a no-op and looks like an unfinished helper.

fix.py only reads proxy.ts and doesn’t implement the described replacement. If this script isn’t used, consider removing it to avoid confusion; if you intend to keep it, either add the replacement logic or a clear TODO indicating it’s incomplete and how it should be used.

sourcery-ai · 2026-06-04T21:59:52Z

+with open('server/src/routes/proxy.ts', 'r') as f:
+    content = f.read()
+
+{


issue (bug_risk): do_fix.py ends with a bare { and is syntactically invalid Python.

The trailing { makes this file invalid Python and will raise a syntax error if run. If this was just a scratch file, please either delete it from the repo or update it to be valid so it doesn't break tools that scan or import Python modules.

gemini-code-assist

Code Review

This pull request introduces several enhancements, including disabling sticky sessions on the balanced endpoint, implementing model-level routing and smart preferences for Owl Alpha and LongCat, adding recency-biased Thompson Sampling, and handling wrapped error payloads on HTTP 200 responses. However, several critical issues and missing implementations were identified. There are SQL syntax errors in the time-decay query in router.ts, and the router.test.ts file is syntactically invalid due to a duplicated VALUES clause and truncation. Additionally, the SSE stream heartbeat/stall protection and shared temporary cooldown features are completely unimplemented in proxy.ts, causing test compilation failures, while the new threadProtection.ts module remains unused. Finally, several optimizations are recommended in proxy.ts to eliminate redundant database queries, simplify duplicate truncation-handling branches, and remove global mutable cache variables that could cause test flakiness.

gemini-code-assist · 2026-06-04T22:02:40Z

  const rows = db.prepare(`
    SELECT platform, model_id,
-      COUNT(*) as total,
-      SUM(CASE WHEN status = 'success' THEN 1 ELSE 0 END) as successes,
+      COUNT(*) as raw_total,
+      SUM(MAX(0, MIN(1.0, 1.0 - (julianday('now') - julianday(created_at)) / 7.0)))) as total,
+      SUM(CASE WHEN status = 'success'
+        THEN MAX(0, MIN(1.0, 1.0 - (julianday('now') - julianday(created_at)) / 7.0)))
+        ELSE 0 END) as successes,


There are SQL syntax errors in the query due to mismatched parentheses on lines 184 and 186, which will cause a runtime crash when db.prepare() is executed.

On line 184, there is an extra closing parenthesis: SUM(MAX(0, MIN(1.0, 1.0 - (julianday('now') - julianday(created_at)) / 7.0)))) (4 closing parentheses instead of 3).

On line 186, there is also an extra closing parenthesis: THEN MAX(0, MIN(1.0, 1.0 - (julianday('now') - julianday(created_at)) / 7.0))) (3 closing parentheses instead of 2).

Suggested change

  const rows = db.prepare(`
    SELECT platform, model_id,
      COUNT(*) as raw_total,
      SUM(MAX(0, MIN(1.0, 1.0 - (julianday('now') - julianday(created_at)) / 7.0))) as total,
      SUM(CASE WHEN status = 'success'
        THEN MAX(0, MIN(1.0, 1.0 - (julianday('now') - julianday(created_at)) / 7.0))
        ELSE 0 END) as successes,
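
For readers checking the arithmetic, here is a minimal TypeScript sketch of the linear decay that the corrected SQL expresses: each row weighs 1.0 when fresh and falls to 0 at seven days old. The names `recencyWeight`, `weightedStats`, and `RequestRow` are hypothetical and do not come from router.ts; timestamps are assumed to be epoch milliseconds.

```typescript
// Assumption: the 7-day window mirrors the 7.0 divisor in the query above.
const WINDOW_DAYS = 7;
const MS_PER_DAY = 24 * 60 * 60 * 1000;

// Hypothetical row shape, for illustration only.
interface RequestRow {
  status: 'success' | 'error';
  createdAtMs: number;
}

// Same clamp as MAX(0, MIN(1.0, 1.0 - ageDays / 7.0)) in the SQL.
function recencyWeight(createdAtMs: number, nowMs: number, windowDays: number = WINDOW_DAYS): number {
  const ageDays = (nowMs - createdAtMs) / MS_PER_DAY;
  return Math.max(0, Math.min(1, 1 - ageDays / windowDays));
}

// Produces the raw count alongside the decayed total and decayed successes.
function weightedStats(rows: RequestRow[], nowMs: number): { rawTotal: number; total: number; successes: number } {
  let total = 0;
  let successes = 0;
  for (const row of rows) {
    const weight = recencyWeight(row.createdAtMs, nowMs);
    total += weight;
    if (row.status === 'success') successes += weight;
  }
  return { rawTotal: rows.length, total, successes };
}
```

Keeping the raw count separate from the decayed totals matches the PR's stated goal of preserving raw counts for display.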

gemini-code-assist · 2026-06-04T22:02:40Z

    const { encrypted, iv, authTag } = encrypt('test-groq-key');
    db.prepare(`
      INSERT INTO api_keys (platform, label, encrypted_key, iv, auth_tag, status, enabled)
+      VALUES (?,


There is a syntax error in the SQL query on lines 31-32. The VALUES clause is duplicated and malformed, which will cause the test suite to crash. Removing the duplicate line fixes the query.

gemini-code-assist · 2026-06-04T22:02:40Z

-    expect(result.platform).toBe('groq');
-  });
-});
+    const groqKey = encrypt


The test file is truncated at the end, leaving the file syntactically invalid and missing several tests and closing braces. Restoring the rest of the file fixes the issue.

    const groqKey = encrypt('test-groq-key');
    db.prepare(`
      INSERT INTO api_keys (platform, label, encrypted_key, iv, auth_tag, status, enabled)
      VALUES (?, ?, ?, ?, ?, ?, ?)
    `).run('groq', 'test', groqKey.encrypted, groqKey.iv, groqKey.authTag, 'healthy', 1);

    const result = routeRequest();
    expect(result.platform).toBe('groq');
  });

  it('should skip invalid keys', () => {
    const db = getDb();
    const invalidKey = encrypt('invalid-key');
    db.prepare(`
      INSERT INTO api_keys (platform, label, encrypted_key, iv, auth_tag, status, enabled)
      VALUES (?, ?, ?, ?, ?, ?, ?)
    `).run('google', 'invalid', invalidKey.encrypted, invalidKey.iv, invalidKey.authTag, 'invalid', 1);

    const groqKey = encrypt('test-groq-key');
    db.prepare(`
      INSERT INTO api_keys (platform, label, encrypted_key, iv, auth_tag, status, enabled)
      VALUES (?, ?, ?, ?, ?, ?, ?)
    `).run('groq', 'test', groqKey.encrypted, groqKey.iv, groqKey.authTag, 'healthy', 1);

    const result = routeRequest();
    expect(result.platform).toBe('groq');
  });
});

gemini-code-assist · 2026-06-04T22:02:40Z

+import type { Express } from 'express';
+import { createApp } from '../../app.js';
+import { initDb, getDb, getUnifiedApiKey } from '../../db/index.js';
+import { streamKeepaliveConfig } from '../../routes/proxy.js';


This test file imports streamKeepaliveConfig from ../../routes/proxy.js, but streamKeepaliveConfig is neither defined nor exported in server/src/routes/proxy.ts. The "SSE Stream Heartbeats and Stall Protection" feature is completely unimplemented in proxy.ts, making this test file fail to compile. Please ensure the implementation of this feature is committed and exported.
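
As a rough illustration of what the missing feature would need, the sketch below shows the two building blocks of stall protection: a heartbeat frame and a timer check. It is not the PR's implementation; `StallTracker` and `SSE_HEARTBEAT_FRAME` are hypothetical names, the shape of `streamKeepaliveConfig` is unknown, and the clock is passed in explicitly so the logic can be tested without real timers. The only fact relied on is that SSE lines beginning with a colon are comments that clients ignore.

```typescript
// An SSE comment frame: a line starting with ':' followed by a blank line.
const SSE_HEARTBEAT_FRAME = ': keepalive\n\n';

// Hypothetical helper: tracks the last upstream activity and reports a stall
// once no chunk has arrived for stallTimeoutMs.
class StallTracker {
  private lastActivityMs: number;
  private readonly stallTimeoutMs: number;

  constructor(stallTimeoutMs: number, nowMs: number) {
    this.stallTimeoutMs = stallTimeoutMs;
    this.lastActivityMs = nowMs;
  }

  // Call whenever the upstream provider yields a chunk.
  touch(nowMs: number): void {
    this.lastActivityMs = nowMs;
  }

  // True once the quiet period has reached the configured timeout.
  isStalled(nowMs: number): boolean {
    return nowMs - this.lastActivityMs >= this.stallTimeoutMs;
  }
}
```

A handler built on this would write `SSE_HEARTBEAT_FRAME` on an interval while waiting, and end the stream with a timeout signal once `isStalled` returns true.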

gemini-code-assist · 2026-06-04T22:02:40Z

+import {
+  transientModelCooldowns,
+  TRANSIENT_COOLDOWN_MS,
+  stickySessionMap,
+  addProviderModelsToSkipModels,
+} from '../../routes/proxy.js';


This test file imports transientModelCooldowns and TRANSIENT_COOLDOWN_MS from ../../routes/proxy.js, but they are neither defined nor exported in server/src/routes/proxy.ts. The "Shared Temporary Cooldowns" feature is completely unimplemented in proxy.ts, making this test file fail to compile. Please ensure the implementation of this feature is committed and exported.
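
For context, a shared transient cooldown could look roughly like the sketch below. The identifiers `transientModelCooldowns` and `TRANSIENT_COOLDOWN_MS` come from the test import, but their real types and values are not shown anywhere in this PR, so the `Map` shape, the 30-second value, and the helpers `markCooldown` and `isCoolingDown` are all assumptions.

```typescript
// Assumption: 30 seconds; the real value is not visible in the diff.
const TRANSIENT_COOLDOWN_MS = 30000;

// Assumption: model DB id -> epoch ms at which the cooldown expires.
const transientModelCooldowns = new Map<number, number>();

// Hypothetical helper: record that a model just failed.
function markCooldown(modelDbId: number, nowMs: number): void {
  transientModelCooldowns.set(modelDbId, nowMs + TRANSIENT_COOLDOWN_MS);
}

// Hypothetical helper: check whether a model should be skipped right now.
// Expired entries are pruned on read so the map does not grow without bound.
function isCoolingDown(modelDbId: number, nowMs: number): boolean {
  const expiresAt = transientModelCooldowns.get(modelDbId);
  if (expiresAt === undefined) return false;
  if (nowMs >= expiresAt) {
    transientModelCooldowns.delete(modelDbId);
    return false;
  }
  return true;
}
```

Because the map is module-level state, tests that rely on it would need to clear it between cases, which is the same flakiness concern raised elsewhere in this review about global caches.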

gemini-code-assist · 2026-06-04T22:02:41Z

+// Cached owl-alpha model ID: undefined = not yet looked up, null = not found, number = found
+let cachedOwlAlphaModelId: number | null | undefined = undefined;


This global mutable variable is used to cache the Owl Alpha model ID, but since we are querying it directly in the request handler to avoid test flakiness and stale cache issues, this variable is no longer needed.

gemini-code-assist · 2026-06-04T22:02:41Z

+  // Pure Active-Request Owl Alpha Safeguard: Exclude openrouter/owl-alpha
+  // from bandit routing for this request ONLY if another session is actively using it right now.
+  let otherSessionUsingOwl = false;
+
+  for (const active of activeRequests) {
+    if (active.sessionKey !== sessionKey && active.platform === 'openrouter' && active.modelId === 'owl-alpha') {
+      otherSessionUsingOwl = true;
+      break;
+    }
+  }
+
+  if (otherSessionUsingOwl) {
+    if (cachedOwlAlphaModelId === undefined) {
+      const db = getDb();
+      const owlRow = db.prepare("SELECT id FROM models WHERE platform = 'openrouter' AND model_id = 'owl-alpha'").get() as { id: number } | undefined;
+      cachedOwlAlphaModelId = owlRow ? owlRow.id : null;
+    }
+    if (cachedOwlAlphaModelId !== null) {
+      skipModels.add(cachedOwlAlphaModelId);
+      console.log(`[Sticky] Owl Alpha protection active — excluding openrouter/owl-alpha from bandit routing because another session is actively using it`);
    }
  }


Caching cachedOwlAlphaModelId in a global mutable variable without resetting it can cause test flakiness or bugs when the database is re-initialized (e.g., in-memory DBs recreated between tests). Since querying the model ID by platform and model ID is extremely fast, it is safer and cleaner to query it directly when needed.

  if (otherSessionUsingOwl) {
    const db = getDb();
    const owlRow = db.prepare("SELECT id FROM models WHERE platform = 'openrouter' AND model_id = 'owl-alpha'").get() as { id: number } | undefined;
    if (owlRow) {
      skipModels.add(owlRow.id);
      console.log(`[Sticky] Owl Alpha protection active — excluding openrouter/owl-alpha from bandit routing because another session is actively using it`);
    }
  }
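
The active-request scan that sets `otherSessionUsingOwl` could also be factored into a small pure function, which would let the same check serve LongCat. This is only a sketch: `isModelBusyElsewhere` and the `ActiveRequest` interface are hypothetical, with field names taken from the loop in the diff.

```typescript
// Field names follow the loop in the diff; the interface itself is an assumption.
interface ActiveRequest {
  sessionKey: string;
  platform: string;
  modelId: string;
}

// Returns true when a different session currently has a request in flight
// on the given platform/model pair.
function isModelBusyElsewhere(
  active: Iterable<ActiveRequest>,
  sessionKey: string,
  platform: string,
  modelId: string,
): boolean {
  for (const req of active) {
    if (req.sessionKey !== sessionKey && req.platform === platform && req.modelId === modelId) {
      return true;
    }
  }
  return false;
}
```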

gemini-code-assist · 2026-06-04T22:02:41Z

              if (route.platform === 'longcat') {
-                // LongCat: exclude entire provider immediately on truncation
-                console.warn(`[Proxy] Truncated stream content detected from LongCat — banning LongCat provider for session`);
-                banPlatformFromSession(normalizedMessages, routingMode, 'longcat', route.modelDbId);
-                addProviderModelsToSkipModels(skipModels, 'longcat');
+                // LongCat: model-level ban
+                console.warn(`[Proxy] Truncated stream content detected from LongCat — skipping model ${route.modelId} for session`);
+                skipModels.add(route.modelDbId);
+              } else if (route.platform === 'openrouter' && route.modelId === 'owl-alpha') {
+                // Owl Alpha: model-level ban
+                console.warn(`[Proxy] Truncated stream content detected from Owl Alpha — skipping model ${route.modelId} for session`);
+                skipModels.add(route.modelDbId);
              } else {
-                // Non-LongCat: skip only this specific model, other models from same provider remain available
+                // Other providers: model-level ban
                console.warn(`[Proxy] Truncated stream content detected from ${route.platform} — skipping model ${route.modelId} for session`);
                skipModels.add(route.modelDbId);
              }


The conditional branches for longcat, owl-alpha, and other providers on truncation detection perform the exact same action (skipModels.add(route.modelDbId)). We can simplify this block significantly by combining them into a single dynamic log and action.

  if (isTruncatedResponse(streamTextToCheck)) {
    console.warn(`[Proxy] Truncated stream content detected from ${route.displayName} — skipping model ${route.modelId} for session`);
    skipModels.add(route.modelDbId);
  }

gemini-code-assist · 2026-06-04T22:02:41Z

+                // Clear sticky if pinned to LongCat
+                if (preferredModel) {
+                  const db = getDb();
+                  const prefRow = db.prepare('SELECT model_id FROM models WHERE id = ?').get(preferredModel) as { model_id: string } | undefined;
+                  if (prefRow?.model_id === 'LongCat-2.0-Preview') {
+                    preferredModel = undefined;
+                    preferredKeyId = undefined;
+                  }
+                }
+              } else if (route.platform === 'openrouter' && route.modelId === 'owl-alpha') {
+                console.warn(`[Proxy] Mid-stream 5xx from Owl Alpha — skipping model ${route.modelId} for session`);
+                skipModels.add(route.modelDbId);
+                // Clear sticky if pinned to Owl Alpha
+                if (preferredModel) {
+                  const db = getDb();
+                  const prefRow = db.prepare('SELECT model_id FROM models WHERE id = ?').get(preferredModel) as { model_id: string } | undefined;
+                  if (prefRow?.model_id === 'owl-alpha') {
+                    preferredModel = undefined;
+                    preferredKeyId = undefined;
+                  }
+                }
              } else {


Inside the error handling blocks, the code queries the database to retrieve the model_id of preferredModel to check if it matches 'LongCat-2.0-Preview' or 'owl-alpha'. This is redundant because the active route that just failed (route) already contains the modelDbId and modelId. We can simply check if preferredModel === route.modelDbId to clear the sticky preference, avoiding any database queries. This optimization applies to all 6 occurrences of this pattern in the file.

  if (route.platform === 'longcat') {
    console.warn(`[Proxy] Mid-stream 5xx from LongCat — skipping model ${route.modelId} for session`);
    skipModels.add(route.modelDbId);
    // Clear sticky if pinned to LongCat
    if (preferredModel === route.modelDbId) {
      preferredModel = undefined;
      preferredKeyId = undefined;
    }
  } else if (route.platform === 'openrouter' && route.modelId === 'owl-alpha') {
    console.warn(`[Proxy] Mid-stream 5xx from Owl Alpha — skipping model ${route.modelId} for session`);
    skipModels.add(route.modelDbId);
    // Clear sticky if pinned to Owl Alpha
    if (preferredModel === route.modelDbId) {
      preferredModel = undefined;
      preferredKeyId = undefined;
    }
  }
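
Since the same skip-and-clear pattern reportedly appears six times, one option is to pull it into a helper. The sketch below assumes the sticky preference can be held in a small object instead of two local `let` variables, and that `preferredKeyId` is numeric; `skipFailedModel`, `StickyState`, and `FailedRoute` are hypothetical names. It also assumes that comparing database ids is equivalent to the current `model_id` string comparison, which holds only if each model has a single row in `models`.

```typescript
// Assumption: sticky preference bundled into one object for illustration.
interface StickyState {
  preferredModel?: number;
  preferredKeyId?: number;
}

// Subset of the route fields used in the diff.
interface FailedRoute {
  platform: string;
  modelId: string;
  modelDbId: number;
}

// Skip the failing model for this request and drop the sticky preference
// only when it points at that same model. No database lookup is needed.
function skipFailedModel(route: FailedRoute, skipModels: Set<number>, sticky: StickyState): void {
  skipModels.add(route.modelDbId);
  if (sticky.preferredModel === route.modelDbId) {
    sticky.preferredModel = undefined;
    sticky.preferredKeyId = undefined;
  }
}
```

With a helper like this, the LongCat, Owl Alpha, and generic branches would differ only in their log message.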

gemini-code-assist · 2026-06-04T22:02:41Z

        if (route.platform === 'longcat') {
-          console.warn(`[Proxy] Retryable error from LongCat — excluding entire LongCat provider for session`);
-          banPlatformFromSession(normalizedMessages, routingMode, 'longcat', route.modelDbId);
-          addProviderModelsToSkipModels(skipModels, 'longcat');
+          console.warn(`[Proxy] Retryable error from LongCat — skipping model ${route.modelId} for session`);
+          skipModels.add(route.modelDbId);
+          // Clear sticky if pinned to LongCat
+          if (preferredModel) {
+            const db = getDb();
+            const prefRow = db.prepare('SELECT model_id FROM models WHERE id = ?').get(preferredModel) as { model_id: string } | undefined;
+            if (prefRow?.model_id === 'LongCat-2.0-Preview') {
+              preferredModel = undefined;
+              preferredKeyId = undefined;
+            }
+          }
+        } else if (route.platform === 'openrouter' && route.modelId === 'owl-alpha') {
+          console.warn(`[Proxy] Retryable error from Owl Alpha — skipping model ${route.modelId} for session`);
+          skipModels.add(route.modelDbId);
+          // Clear sticky if pinned to Owl Alpha
          if (preferredModel) {
            const db = getDb();
-            const prefRow = db.prepare('SELECT platform FROM models WHERE id = ?').get(preferredModel) as { platform: string } | undefined;
-            if (prefRow?.platform === 'longcat') {
+            const prefRow = db.prepare('SELECT model_id FROM models WHERE id = ?').get(preferredModel) as { model_id: string } | undefined;
+            if (prefRow?.model_id === 'owl-alpha') {
              preferredModel = undefined;
              preferredKeyId = undefined;
            }
          }
        } else {


We can optimize the sticky preference clearing here by directly comparing preferredModel === route.modelDbId instead of querying the database. This avoids redundant queries on every retryable error.

  if (route.platform === 'longcat') {
    console.warn(`[Proxy] Retryable error from LongCat — skipping model ${route.modelId} for session`);
    skipModels.add(route.modelDbId);
    // Clear sticky if pinned to LongCat
    if (preferredModel === route.modelDbId) {
      preferredModel = undefined;
      preferredKeyId = undefined;
    }
  } else if (route.platform === 'openrouter' && route.modelId === 'owl-alpha') {
    console.warn(`[Proxy] Retryable error from Owl Alpha — skipping model ${route.modelId} for session`);
    skipModels.add(route.modelDbId);
    // Clear sticky if pinned to Owl Alpha
    if (preferredModel === route.modelDbId) {
      preferredModel = undefined;
      preferredKeyId = undefined;
    }
  }

kilo-code-bot · 2026-06-04T22:02:57Z

Code Review Summary

Status: 2 Issues Found | Recommendation: Address before merge

Overview

| Severity | Count |
|---|---|
| WARNING | 2 |

Issue Details

WARNING

| File | Line | Issue |
|---|---|---|
| `server/src/routes/proxy.ts` | 1506 | Unused `isRetryableStreamError` check: branch always evaluates to false and has no effect |
| `server/src/services/threadProtection.ts` | 1 | Dead module: never imported or called anywhere in the codebase |

Other Observations (not in diff)

| File | Line | Issue |
|---|---|---|
| `server/src/routes/proxy.ts` | 1335 | Unguarded `chunk.choices[0]` access: if a provider yields an empty `choices` array, this throws `TypeError` and breaks the stream |
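
One way to guard the `chunk.choices[0]` access is optional chaining with an empty-string fallback. The sketch below assumes an OpenAI-compatible chunk shape (`choices[].delta.content`); the `StreamChunk` interface and `extractDeltaText` are hypothetical and not taken from proxy.ts.

```typescript
// Assumption: OpenAI-compatible streaming chunk; only the fields used here are typed.
interface StreamChunk {
  choices?: Array<{ delta?: { content?: string } }>;
}

// Returns the text delta, or '' when choices is missing, empty, or has no content.
function extractDeltaText(chunk: StreamChunk): string {
  return chunk.choices?.[0]?.delta?.content ?? '';
}
```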

Files Reviewed (8 files)

server/src/providers/base.ts — wrapped-error helpers (OK)
server/src/providers/cloudflare.ts — wrapped-error check added (OK)
server/src/providers/cohere.ts — wrapped-error check added (OK)
server/src/providers/google.ts — wrapped-error check added (OK)
server/src/providers/openai-compat.ts — wrapped-error check added (OK)
server/src/routes/proxy.ts — 2 issues
server/src/services/router.ts — recency weighting + owl-alpha preference (OK)
server/src/services/threadProtection.ts — dead module

Reviewed by step-3.7-flash-20260528 · 2,262,052 tokens

qodo-code-review · 2026-06-04T22:09:35Z

-      COUNT(*) as total,
-      SUM(CASE WHEN status = 'success' THEN 1 ELSE 0 END) as successes,
+      COUNT(*) as raw_total,
+      SUM(MAX(0, MIN(1.0, 1.0 - (julianday('now') - julianday(created_at)) / 7.0)))) as total,


1. Extra paren breaks sql 🐞 Bug ≡ Correctness

refreshStatsCache() contains an unbalanced weighted SUM expression (... / 7.0))))) that will fail SQLite statement preparation. Since routeRequest() calls refreshStatsCache() unconditionally, routing can throw before selecting any model.

Agent Prompt

## Issue description
`refreshStatsCache()` builds an invalid SQLite query due to an extra closing parenthesis in the recency-weighted `SUM(MAX(0, MIN(...))))` expression.

## Issue Context
`routeRequest()` calls `refreshStatsCache(db)` on every routing decision, so this SQL prepare error will break request routing globally.

## Fix Focus Areas
- server/src/services/router.ts[178-188]
- server/src/services/router.ts[503-506]

qodo-code-review · 2026-06-04T22:09:35Z

+import { PoolSection } from '@/components/pool-section'
+import type { PoolType } from '@/components/pool-badge'


2. Missing pool components 🐞 Bug ≡ Correctness

FallbackPage imports @/components/pool-section and @/components/pool-badge, but no such modules exist in the client codebase. This causes immediate client build failures due to unresolved module imports.

Agent Prompt

## Issue description
`client/src/pages/FallbackPage.tsx` imports `PoolSection` and `PoolType` from modules that are not present in the repository, which will fail module resolution during TS/build.

## Issue Context
The page renders `<PoolSection>` and types entries with `PoolType`, so the imports cannot be trivially removed without adjusting rendering/typing.

## Fix Focus Areas
- client/src/pages/FallbackPage.tsx[6-7]
- client/src/pages/FallbackPage.tsx[67-72]
- client/src/pages/FallbackPage.tsx[335-371]

qodo-code-review · 2026-06-04T22:09:35Z

+  const poolGroups = poolOrder
+    .map(pool => ({ pool, entries: displayEntries.filter(e => e.pool === pool) }))
+    .filter(group => group.entries.length > 0)


3. Fallback api lacks pool 🐞 Bug ≡ Correctness

FallbackPage groups/filters entries by entry.pool and the new fallback API test asserts a pool field, but the /api/fallback response objects do not include pool. This will render no entries in the UI and fail the new test.

Agent Prompt

## Issue description
The client and tests now require a `pool` field on `/api/fallback` entries, but the server route currently omits it.

## Issue Context
- Client: builds `poolGroups` by filtering `displayEntries` on `e.pool`.
- Test: asserts each entry has `pool` and that it is in the expected enum.

## Fix Focus Areas
- server/src/routes/fallback.ts[37-78]
- client/src/pages/FallbackPage.tsx[288-301]
- server/src/__tests__/routes/fallback.test.ts[43-62]
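
To make the failure mode concrete, the sketch below reproduces the client-side grouping in isolation. The pool names come from the PR summary (fast, balanced, smart); the `ModelPool` union, the `FallbackEntry` shape, the ordering in `poolOrder`, and the function `groupByPool` are assumptions for illustration.

```typescript
// Assumption: pool names from the PR summary; the real ModelPool type is not shown.
type ModelPool = 'fast' | 'balanced' | 'smart';

// Hypothetical entry shape with only the fields needed here.
interface FallbackEntry {
  modelId: string;
  pool: ModelPool;
}

// Assumption: display order; the real poolOrder is not visible in the diff.
const poolOrder: ModelPool[] = ['fast', 'balanced', 'smart'];

// Mirrors the poolGroups expression from FallbackPage: one group per pool,
// with empty groups dropped.
function groupByPool(entries: FallbackEntry[]): Array<{ pool: ModelPool; entries: FallbackEntry[] }> {
  return poolOrder
    .map(pool => ({ pool, entries: entries.filter(e => e.pool === pool) }))
    .filter(group => group.entries.length > 0);
}
```

If the API response omits `pool`, every `e.pool === pool` comparison is false, so all groups are filtered out and the page renders nothing, which is the behavior described in this comment.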

qodo-code-review · 2026-06-04T22:09:35Z

@@ -1,4 +1,5 @@
 import { describe, it, expect, beforeAll } from 'vitest';
+import { ModelPool } from '@freellmapi/shared/types.js';


4. Shared types missing modelpool 🐞 Bug ≡ Correctness

fallback.test.ts imports ModelPool from @freellmapi/shared/types.js, but the shared workspace package entrypoint is shared/types.ts, which does not export ModelPool. This breaks compilation of the new fallback test (and any code trying to use ModelPool).

Agent Prompt

## Issue description
`server/src/__tests__/routes/fallback.test.ts` imports `ModelPool` from `@freellmapi/shared/types.js`, but the workspace `@freellmapi/shared` package points to `shared/types.ts` and that file does not define/export `ModelPool`.

## Issue Context
Because `shared/package.json` uses `main`/`types` = `./types.ts`, any symbol imported from `@freellmapi/shared/types.js` must be exported from `shared/types.ts`.

## Fix Focus Areas
- server/src/__tests__/routes/fallback.test.ts[1-62]
- shared/package.json[1-7]
- shared/types.ts[1-80]

qodo-code-review · 2026-06-04T22:09:35Z

+      VALUES (?,
      VALUES (?, ?, ?, ?, ?, ?, ?)


5. Router.test.ts sql malformed 🐞 Bug ≡ Correctness

server/src/__tests__/services/router.test.ts contains a broken INSERT template string with a duplicated VALUES line, producing invalid SQL and invalid test setup. This will fail the router test (and may block the overall test suite depending on runner settings).

Agent Prompt

## Issue description
The router test inserts an API key using a template string that includes `VALUES` twice, making the SQL invalid.

## Issue Context
The first `it('should route to highest priority model...')` test depends on this INSERT to seed the DB.

## Fix Focus Areas
- server/src/__tests__/services/router.test.ts[26-34]

qodo-code-review · 2026-06-04T22:09:35Z

+    error.status =
+      typeof errPayload === 'object' && errPayload !== null && 'code' in (errPayload as Record<string, unknown>)
+        ? Number((errPayload as Record<string, unknown>).code)
+        : 200;


6. Nan wrapped error status 🐞 Bug ☼ Reliability

BaseProvider.throwWrappedError() sets error.status using Number(errPayload.code), which becomes NaN for non-numeric codes. proxy.ts treats NaN as a number in getErrorStatus(), but NaN never matches the numeric status checks, misclassifying wrapped provider errors as non-retryable/non-ban-eligible.

Agent Prompt

## Issue description
`throwWrappedError()` can assign `status: NaN` when the upstream `error.code` is non-numeric, causing downstream status-based logic to fail to recognize retryable/ban-eligible cases.

## Issue Context
`proxy.ts` uses strict numeric comparisons for retry/ban decisions, and `getErrorStatus()` currently returns any `number`, including NaN.

## Fix Focus Areas
- server/src/providers/base.ts[135-148]
- server/src/routes/proxy.ts[496-545]
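
A possible shape for the fix is to validate the parsed code before using it as a status. In the sketch below, `wrappedErrorStatus` is a hypothetical helper; falling back to 200 follows the default in the diff, and accepting only integers in the 100-599 range is an assumption about what should count as an HTTP status.

```typescript
// Hypothetical helper: derive an HTTP-like status from a wrapped error payload.
// Non-numeric or out-of-range codes fall back instead of becoming NaN.
function wrappedErrorStatus(errPayload: unknown, fallback: number = 200): number {
  if (typeof errPayload === 'object' && errPayload !== null && 'code' in errPayload) {
    const code = Number((errPayload as Record<string, unknown>).code);
    // Assumption: only integers in the HTTP status range are meaningful here.
    if (Number.isInteger(code) && code >= 100 && code <= 599) {
      return code;
    }
  }
  return fallback;
}
```

Whether a string code such as `rate_limit_exceeded` should instead be mapped to a retryable status (for example 429) is a separate design decision that this sketch does not attempt to make.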

qodo-code-review · 2026-06-04T22:09:35Z

+#!/usr/bin/env python3
+"""Replace the streaming block in proxy.ts with Promise.race-based stall detection."""
+
+with open('server/src/routes/proxy.ts', 'r') as f:
+    content = f.read()
+
+{


7. Broken helper scripts committed 🐞 Bug ⚙ Maintainability

Multiple newly added Python helper scripts are committed in a truncated/syntax-error state and include hard-coded absolute paths. These files add repo noise and can break if accidentally executed by developers or CI tooling.

Agent Prompt

## Issue description
Helper scripts appear accidentally committed: they are incomplete (syntax errors) and include machine-specific absolute paths.

## Issue Context
These files are not referenced by the app but remain in the repo root and `server/`, increasing maintenance burden and risk of accidental execution.

## Fix Focus Areas
- do_fix.py[1-7]
- fix.py[1-8]
- fix_streaming.py[1-21]
- server/write_test.py[1-29]
- server/write_tests.py[1-45]

coderabbitai

Actionable comments posted: 18

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)

server/src/routes/proxy.ts (1)

1505-1537: ⚠️ Potential issue | 🔴 Critical | ⚡ Quick win

Critical: Mid-stream retryable error handling still uses provider-level banning for LongCat and is missing for Owl Alpha.

This code path (lines 1506-1518) contradicts the completed tasks T2.8 and T2.9, and violates requirements REQ-4 and REQ-5:

- LongCat: Still uses banPlatformFromSession('longcat') + addProviderModelsToSkipModels(skipModels, 'longcat') (provider-level) instead of skipModels.add(route.modelDbId) (model-level)
- Owl Alpha: No mid-stream retryable error handling at all; the code only checks for LongCat

This means:

- LongCat retryable errors ban the entire platform instead of just the specific model (inconsistent with other error paths)
- Owl Alpha retryable errors fall through to generic handling without model-level skipping

🔧 Proposed fix for model-level banning

-          // Mid-stream retryable error handling for LongCat
-          if (route.platform === 'longcat' && isRetryableStreamError(streamErr)) {
-            console.warn(`[Proxy] Mid-stream retryable error from LongCat — excluding entire LongCat provider for session`);
-            banPlatformFromSession(normalizedMessages, routingMode, 'longcat', route.modelDbId);
-            addProviderModelsToSkipModels(skipModels, 'longcat');
-            // Clear sticky preference if pinned to LongCat
-            if (preferredModel) {
-              const db = getDb();
-              const prefRow = db.prepare('SELECT platform FROM models WHERE id = ?').get(preferredModel) as { platform: string } | undefined;
-              if (prefRow?.platform === 'longcat') {
-                preferredModel = undefined;
-                preferredKeyId = undefined;
-              }
+          // Mid-stream retryable error handling
+          if (isRetryableStreamError(streamErr)) {
+            if (route.platform === 'longcat') {
+              console.warn(`[Proxy] Mid-stream retryable error from LongCat — skipping model ${route.modelId} for session`);
+              skipModels.add(route.modelDbId);
+              // Clear sticky if pinned to this specific model
+              if (preferredModel) {
+                const db = getDb();
+                const prefRow = db.prepare('SELECT model_id FROM models WHERE id = ?').get(preferredModel) as { model_id: string } | undefined;
+                if (prefRow?.model_id === 'LongCat-2.0-Preview') {
+                  preferredModel = undefined;
+                  preferredKeyId = undefined;
+                }
+              }
+            } else if (route.platform === 'openrouter' && route.modelId === 'owl-alpha') {
+              console.warn(`[Proxy] Mid-stream retryable error from Owl Alpha — skipping model ${route.modelId} for session`);
+              skipModels.add(route.modelDbId);
+              // Clear sticky if pinned to Owl Alpha
+              if (preferredModel) {
+                const db = getDb();
+                const prefRow = db.prepare('SELECT model_id FROM models WHERE id = ?').get(preferredModel) as { model_id: string } | undefined;
+                if (prefRow?.model_id === 'owl-alpha') {
+                  preferredModel = undefined;
+                  preferredKeyId = undefined;
+                }
+              }
             }
-            try {
+            // ... rest of error response handling

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@server/src/routes/proxy.ts` around lines 1505 - 1537, The mid-stream
retryable error branch currently bans entire providers via
banPlatformFromSession and addProviderModelsToSkipModels for LongCat and omits
Owl Alpha; change it to perform model-level skipping instead: when
route.platform === 'longcat' and isRetryableStreamError(streamErr) replace
banPlatformFromSession('longcat') and addProviderModelsToSkipModels(skipModels,
'longcat') with skipModels.add(route.modelDbId) (so only the failing model is
skipped), and ensure the sticky-preference clear logic still checks
prefRow?.platform === 'longcat' before clearing preferredModel/preferredKeyId;
additionally add an equivalent mid-stream branch for route.platform ===
'openrouter' && route.modelId === 'owl-alpha' that uses
skipModels.add(route.modelDbId) and the same
sticky-preference clearing for 'owl-alpha' so Owl Alpha retryable stream errors
are handled at model level rather than falling through.

server/src/__tests__/services/router.test.ts (1)

63-64: ⚠️ Potential issue | 🔴 Critical | ⚡ Quick win

Fix truncated/incomplete test file: router.test.ts won’t parse

server/src/__tests__/services/router.test.ts ends at line 63 with const groqKey = encrypt, leaving the expression unfinished and the file syntactically invalid. Complete the encrypt(...) call and restore the rest of the test block.

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@server/src/__tests__/services/router.test.ts` around lines 63 - 64, The test
file is truncated at "const groqKey = encrypt" so complete the encrypt(...) call
and restore the test block; specifically finish the expression for groqKey by
calling encrypt with the intended plaintext or test mock (e.g.,
encrypt('test-groq-key') or the project’s GROQ key/mocked value), add the
missing semicolon, and re-add the remaining assertions and closing braces of the
test/describe block so the file parses; look for the encrypt identifier and the
surrounding it/describe that reference groqKey to ensure you restore the
original assertions and closures.

🧹 Nitpick comments (3)

server/src/services/router.ts (1)

184-187: ⚡ Quick win

Avoid hardcoding the decay window in SQL.

The recency divisor is duplicated as 7.0; this can silently diverge from ANALYTICS_WINDOW_MS. Derive and bind window days once, then reuse in both weighted expressions.

Proposed diff

 const ANALYTICS_WINDOW_MS = 7 * 24 * 60 * 60 * 1000;
+const ANALYTICS_WINDOW_DAYS = ANALYTICS_WINDOW_MS / (24 * 60 * 60 * 1000);
 const ANALYTICS_CACHE_TTL_MS = 60 * 1000;
...
-      SUM(MAX(0, MIN(1.0, 1.0 - (julianday('now') - julianday(created_at)) / 7.0)))) as total,
+      SUM(MAX(0, MIN(1.0, 1.0 - (julianday('now') - julianday(created_at)) / ?))) as total,
       SUM(CASE WHEN status = 'success'
-        THEN MAX(0, MIN(1.0, 1.0 - (julianday('now') - julianday(created_at)) / 7.0)))
+        THEN MAX(0, MIN(1.0, 1.0 - (julianday('now') - julianday(created_at)) / ?))
         ELSE 0 END) as successes,
...
-  `).all(since) as Array<{
+  `).all(ANALYTICS_WINDOW_DAYS, ANALYTICS_WINDOW_DAYS, since) as Array<{

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@server/src/services/router.ts` around lines 184 - 187, Replace the duplicated
literal 7.0 recency divisor with a single derived-and-bound value calculated
from ANALYTICS_WINDOW_MS (e.g., const windowDays = ANALYTICS_WINDOW_MS /
(1000*60*60*24)) and reuse that variable in both weighted expressions (the
MIN(1.0, 1.0 - (julianday('now') - julianday(created_at)) / <window>)) for total
and successes; bind it into the SQL once (use a named parameter like :windowDays
or a positional placeholder) so the same value is used in SUM(MAX(...)) and
SUM(CASE WHEN ... THEN MAX(...)) and cannot diverge.

server/write_test.py (1)

4-4: ⚡ Quick win

Use a repository-relative output path to keep this script portable.

Hardcoding /home/vi/... makes the generator machine-specific.

Proposed fix

+from pathlib import Path
-path = '/home/vi/freellmapi/server/src/__tests__/services/router.test.ts'
+path = Path(__file__).resolve().parent / 'src/__tests__/services/router.test.ts'

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@server/write_test.py` at line 4, Replace the hardcoded absolute string
assigned to the variable path in write_test.py with a repository-relative
construction: build the path from the script's location (e.g., using
Path(__file__) / os.path.dirname(__file__) and os.path.join or pathlib.Path with
.resolve()/parents to reach the repository root) and then append
src/__tests__/services/router.test.ts so the path variable is portable across
machines; update the assignment to use that computed path instead of
'/home/vi/...'.

server/write_tests.py (1)

4-4: ⚡ Quick win

Switch to a relative path instead of a user-specific absolute path.

This keeps the script usable in CI and on other developer machines.

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@server/write_tests.py` at line 4, Replace the hard-coded user-specific
absolute path in the variable path with a machine-independent relative or
programmatically-resolved path; update the assignment to compute the test file
location using pathlib or os.path (e.g., Path(__file__).resolve().parents[...] /
"src" / "__tests__" / "services" / "router.test.ts" or os.path.join relative to
the repository/script location) so the variable path in server/write_tests.py is
portable across CI and other developer machines.

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In @.roo/specs/generalized-thread-protection/design.md:
- Around line 43-45: The fenced code block containing
THREAD_PROTECTION_PLATFORMS="longcat:provider-ban,groq:model-skip" is missing a
language tag; update that fenced block (the block surrounding the
THREAD_PROTECTION_PLATFORMS line) to include a language identifier such as bash
so the fence becomes ```bash and resolves markdownlint MD040 while keeping the
same content.

In @.roo/specs/generalized-thread-protection/requirements.md:
- Line 5: Fix the malformed requirement sentence on line 5 by removing the stray
"{", closing the inline code delimiter around `longcat`, and rewriting the
truncated fragment into a complete sentence that clearly states the issue (e.g.,
that the proxy route handler’s exported function in proxy.ts contains 6+
hardcoded branches that special-case the `longcat` route). Ensure the sentence
is grammatical, the backticks are balanced, and the requirement precisely
describes the special-casing problem so it’s unambiguous.

In @.roo/specs/sse-stream-heartbeat-stall-protection/requirements.md:
- Around line 47-48: The requirement is ambiguous about retry semantics for
stalls: update the document to explicitly distinguish "pre-stream" stalls
(before any response bytes are sent) from "mid-/post-stream" stalls (after
partial delivery). Amend the bullet "5. Return from the handler (no retry on
stall — the stream is already partially delivered)" and the paragraph that marks
stalled-stream retry logic out of scope to state that 504/timeouts occurring
before any data is flushed should allow a retry per the companion tests, whereas
any stall detected after the first byte is sent must return without retry. Also
mirror this clarification where the doc references stalled-stream retry logic
(the section referencing out-of-scope behavior) so the pre-stream 504/retry
behavior and mid-stream no-retry behavior are consistent throughout.

In @.roo/specs/wrapped-error-interception/design.md:
- Line 120: The note is incorrect about exception flow: with the current
structure throwWrappedError() and ProviderApiError can be caught by the
surrounding try/catch, so update the text to require a parse-only try/catch and
to perform wrapped-error checks outside that block; specifically describe that
JSON.parse should be inside a minimal try/catch that only handles malformed
chunk parsing, while throwWrappedError() (and any ProviderApiError checks) must
occur after the parse block (outside the try) so they propagate to the consumer
(e.g., proxy.ts) rather than being swallowed.

In @.roo/specs/wrapped-error-interception/tasks.md:
- Around line 28-31: The current task instructs throwing wrapped errors inside
the parse try/catch which causes them to be swallowed; update the instructions
for OpenAICompatProvider.streamChatCompletion() so the try/catch only surrounds
JSON.parse: assign the result of JSON.parse(data) to a variable inside the try,
exit the try/catch, then immediately call this.isWrappedError(parsed) and if
true call this.throwWrappedError(parsed) (outside the parse try/catch), and only
after that yield the parsed value; apply the same change to the other listed
task steps.

In `@do_fix.py`:
- Around line 4-7: The script do_fix.py is syntactically invalid due to a stray
"{" after reading the file; locate the block containing "with
open('server/src/routes/proxy.ts', 'r') as f: content = f.read()" and remove the
extra "{" (or replace it with the intended Python logic/statement) so the file
is valid Python and will compile with python -m py_compile do_fix.py.

In `@fix_streaming.py`:
- Around line 12-21: The variable new_streaming is assigned with an unterminated
raw triple-quoted string (new_streaming = r''' ...) causing a SyntaxError; fix
by closing that raw triple-quoted string with a matching r''' terminator and
ensure any internal triple quotes are escaped or changed so the literal ends
properly; locate the new_streaming assignment in fix_streaming.py and add the
missing closing triple quotes (or adjust surrounding quotes/escaping) so the
Python file parses.

In `@new_streaming_block.txt`:
- Around line 22-28: The snippet for stallTimeout is truncated leaving the
setTimeout callback and Promise executor unclosed; fix the function stallTimeout
by properly closing the setTimeout callback and the new Promise executor and
providing a timeout duration (e.g. setTimeout(() => {
reject(Object.assign(...)); }, timeoutMs)); ensure the timer variable is
declared/used consistently (const timer = setTimeout(...)) and close with the
matching braces and parentheses so stallTimeout returns a well-formed
Promise<never>.
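
For orientation, a minimal sketch of what a well-formed `stallTimeout` could look like. The `timeoutMs` parameter, the `STREAM_STALL` error code, and the `onTimer` hook are assumptions made for illustration; none of them is confirmed by the PR.

```typescript
// Hypothetical sketch: error shape and parameter names are assumptions.
type Timer = ReturnType<typeof setTimeout>;

function stallTimeout(
  timeoutMs: number,
  onTimer?: (timer: Timer) => void,
): Promise<never> {
  return new Promise<never>((_resolve, reject) => {
    // Both the setTimeout callback and the Promise executor are closed here.
    const timer = setTimeout(() => {
      reject(
        Object.assign(new Error(`stream stalled for ${timeoutMs}ms`), {
          code: "STREAM_STALL",
        }),
      );
    }, timeoutMs);
    // Expose the timer so the caller can clear it once a chunk arrives.
    if (onTimer) onTimer(timer);
  });
}
```

A caller would typically race this promise against the next upstream chunk and call `clearTimeout` on the exposed timer as soon as data arrives.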

In `@server/src/__tests__/routes/fallback.test.ts`:
- Around line 56-61: The test exercising '/api/fallback' currently only asserts
pool values; extend it to cover auth boundaries: add an assertion that calling
request(app, 'GET', '/api/fallback') without the admin key returns 401, that
calling it with a valid admin key header returns 200 and the existing body
checks (the existing request(...) call can be reused or duplicated), and that
calling '/api/fallback' with the unified API key header (the key used elsewhere
for '/v1/*' tests) returns 401. Reference the existing test title "GET
/api/fallback pool values are valid ModelPool enum values", the request(app,
'GET', '/api/fallback') invocation, and the ModelPool checks when adding these
auth assertions.

In `@server/src/__tests__/routes/stream-heartbeat-stall.test.ts`:
- Around line 7-23: The test helper function request(...) can leak the server if
fetch() or res.text() throws; modify the request function to start the server
the same way but wrap the fetch and response handling in a try/finally, and in
finally await server closure by converting server.close into a Promise (e.g.,
await new Promise(resolve => server.close(resolve))) so the server is always
closed and awaited even on errors; update the function named request to ensure
deterministic shutdown.
- Around line 14-17: The helper request() currently injects only the unified API
key for paths starting with '/v1/' (uses getUnifiedApiKey()), causing setup
calls to '/api/*' (e.g., POST to '/api/keys') to run without admin credentials;
update the headers logic to apply the admin credential for '/api/' routes by
adding a branch that sets Authorization: `Bearer ${getAdminApiKey()}` when
path.startsWith('/api/'), keep the existing unified-key branch for '/v1/', and
ensure the Content-Type behavior remains unchanged so admin-gated setup calls
succeed.

In `@server/src/__tests__/services/router.test.ts`:
- Around line 31-33: The INSERT in the "highest priority model" test is
malformed: replace the duplicated/broken VALUES clauses in the db.prepare call
with a single VALUES (?, ?, ?, ?, ?, ?, ?) placeholder list and call .run(...)
with the parameters (e.g., 'groq', 'test', encrypted, iv, authTag, 'healthy',
1). Also finish the truncated test by completing the groqKey assignment (e.g.,
const groqKey = encrypt(...)), ensuring you pass the resulting encrypted, iv,
authTag into the prepared statement, execute the insert, and add the necessary
assertions/cleanup so router.test.ts compiles and the test executes.

In `@server/src/providers/base.ts`:
- Around line 138-145: throwWrappedError builds the Error message and status
incorrectly for some wrapped payloads; update the logic in throwWrappedError to
derive the message from the actual wrapped payload (use this.extractErrorMessage
on errPayload as well as body so string-form payloads like { error: "..." } are
captured) and set error.status only after validating the numeric code (parse
Number((errPayload as Record<string, unknown>).code) and check isFinite or
Number.isFinite before assigning, otherwise default to 200). Ensure you
reference the same symbols: throwWrappedError, this.extractErrorMessage,
errPayload, body, and error.status when applying these fixes.
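
A rough sketch of the message and status derivation described in this item, written as a pure function so it can be checked in isolation. `describeWrappedError` and the payload shapes are hypothetical; the real `throwWrappedError` and `extractErrorMessage` in `base.ts` may differ.

```typescript
// Hypothetical sketch; payload shapes are assumed, not taken from base.ts.
function extractErrorMessage(payload: unknown): string | undefined {
  if (typeof payload === "string") return payload;
  if (payload !== null && typeof payload === "object") {
    const message = (payload as Record<string, unknown>).message;
    if (typeof message === "string") return message;
  }
  return undefined;
}

function describeWrappedError(
  body: Record<string, unknown>,
): { message: string; status: number } {
  const errPayload = body.error;
  // Try the wrapped payload first (covers { error: "..." }), then the body.
  const message =
    extractErrorMessage(errPayload) ??
    extractErrorMessage(body) ??
    "Unknown provider error";
  // Only trust `code` when it parses to a finite, positive number.
  const rawCode =
    errPayload !== null && typeof errPayload === "object"
      ? (errPayload as Record<string, unknown>).code
      : undefined;
  const numeric =
    typeof rawCode === "number" || typeof rawCode === "string"
      ? Number(rawCode)
      : NaN;
  const status = Number.isFinite(numeric) && numeric > 0 ? numeric : 200;
  return { message, status };
}
```

Defaulting to 200 keeps the "error body with HTTP 200" case distinguishable from genuine upstream status codes.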

In `@server/src/providers/cloudflare.ts`:
- Around line 123-130: The local catch in the Cloudflare stream parser is
swallowing wrapped 200 errors thrown by throwWrappedError(parsed), so they never
reach retry/cooldown handling. Update the logic in the stream parsing path
around isWrappedError and throwWrappedError so wrapped errors are detected and
rethrown outside the malformed-chunk catch, while keeping only JSON parse
failures or truly invalid chunks suppressed. Use the existing parsing flow in
the provider method to separate wrapped-error handling from the generic catch.

In `@server/src/providers/cohere.ts`:
- Around line 114-121: The current try/catch around JSON.parse in the stream
loop swallows errors thrown by throwWrappedError(parsed), preventing
retry/fallback logic from seeing wrapped errors; update the logic in the loop
that uses isWrappedError and throwWrappedError so that JSON.parse remains inside
a narrow try for parse-only failures but any detected wrapped error is thrown
outside that catch (e.g., parse into a local variable in try, then after the try
check isWrappedError(parsed) and call throwWrappedError(parsed) so it escapes
the parse-catch), ensuring throwWrappedError is not caught by the parse error
handler.

In `@server/src/providers/openai-compat.ts`:
- Around line 130-137: The catch is swallowing ProviderApiError because
throwWrappedError(parsed) is inside the try that catches all errors; change the
control flow so JSON.parse errors are handled but ProviderApiError thrown by
throwWrappedError is propagated: parse the chunk inside a small try/catch that
only handles JSON.parse failures (skip malformed), then after successful parse
call this.isWrappedError(parsed) and, if true, call
this.throwWrappedError(parsed) outside the parse-only catch (or rethrow if the
caught error is a ProviderApiError). This involves updating the parsing block
around ChatCompletionChunk so throwWrappedError and its ProviderApiError are not
swallowed.

In `@server/write_test.py`:
- Line 29: The test contains an unterminated/invalid string literal "   
expect(() => route{ which looks like leftover JS—fix by replacing that line with
a valid Python assertion or properly terminated string/statement (e.g., use
assert <condition> or close the quotes and parentheses) so server/write_test.py
parses; also remove the hardcoded absolute path '/home/vi/...' and build the
output path portably using pathlib or os.path (use Path(__file__).parent /
"relative_output_dir" or os.path.join(os.path.dirname(__file__),
"relative_output_dir")) and replace the hardcoded variable (the literal
'/home/vi/...') with that portable path.

In `@server/write_tests.py`:
- Around line 7-45: The variable part1 (a triple-quoted Python string) is
unterminated because the JS template literal stops mid-line ("const groqKey{");
close the Python triple-quoted string and restore the remainder of the JS test
template so part1 contains a valid complete test file. Locate the part1
assignment in server/write_tests.py, add the terminating triple quotes (""") and
ensure the included JS snippet completes the broken line (finish the "const
groqKey..." insertion and any missing test blocks) so the generated string is
syntactically valid JavaScript when written out.

---

Outside diff comments:
In `@server/src/__tests__/services/router.test.ts`:
- Around line 63-64: The test file is truncated at "const groqKey = encrypt" so
complete the encrypt(...) call and restore the test block; specifically finish
the expression for groqKey by calling encrypt with the intended plaintext or
test mock (e.g., encrypt('test-groq-key') or the project’s GROQ key/mocked
value), add the missing semicolon, and re-add the remaining assertions and
closing braces of the test/describe block so the file parses; look for the
encrypt identifier and the surrounding it/describe that reference groqKey to
ensure you restore the original assertions and closures.

In `@server/src/routes/proxy.ts`:
- Around line 1505-1537: The mid-stream retryable error branch currently bans
entire providers via banPlatformFromSession and addProviderModelsToSkipModels
for LongCat and omits Owl Alpha; change it to perform model-level skipping
instead: when route.platform === 'longcat' and isRetryableStreamError(streamErr)
replace banPlatformFromSession('longcat') and
addProviderModelsToSkipModels(skipModels, 'longcat') with
skipModels.add(route.modelDbId) (so only the failing model is skipped), and
ensure the sticky-preference clear logic still checks prefRow?.platform ===
'longcat' before clearing preferredModel/preferredKeyId; additionally add an
identical mid-stream branch for route.platform === 'owl-alpha' that uses
skipModels.add(route.modelDbId) and the same sticky-preference clearing for
'owl-alpha' so Owl Alpha retryable stream errors are handled at model level
rather than falling through.
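
The model-level handling requested here can be sketched as follows. The `Route` and `StickyPref` shapes and the `handleMidStreamFailure` helper are hypothetical; only the field names `platform`, `modelDbId`, `preferredModel`, and `preferredKeyId` come from the comment, and `modelDbId` is assumed to be numeric.

```typescript
// Hypothetical shapes; only the field names come from the review comment.
interface Route { platform: string; modelDbId: number; }
interface StickyPref {
  platform?: string;
  preferredModel?: string;
  preferredKeyId?: number;
}

const MODEL_LEVEL_PLATFORMS = new Set(["longcat", "owl-alpha"]);

function handleMidStreamFailure(
  route: Route,
  retryable: boolean,
  skipModels: Set<number>,
  pref: StickyPref,
): boolean {
  if (!retryable || !MODEL_LEVEL_PLATFORMS.has(route.platform)) return false;
  // Skip only the failing model instead of banning the whole provider.
  skipModels.add(route.modelDbId);
  // Clear sticky preferences only when they point at the failing platform.
  if (pref.platform === route.platform) {
    delete pref.preferredModel;
    delete pref.preferredKeyId;
  }
  return true;
}
```

Sharing one branch for both platforms avoids the fall-through that the comment describes for Owl Alpha.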

---

Nitpick comments:
In `@server/src/services/router.ts`:
- Around line 184-187: Replace the duplicated literal 7.0 recency divisor with a
single derived-and-bound value calculated from ANALYTICS_WINDOW_MS (e.g., const
windowDays = ANALYTICS_WINDOW_MS / (1000*60*60*24)) and reuse that variable in
both weighted expressions (the MIN(1.0, 1.0 - (julianday('now') -
julianday(created_at)) / <window>)) for total and successes; bind it into the
SQL once (use a named parameter like :windowDays or a positional placeholder) so
the same value is used in SUM(MAX(...)) and SUM(CASE WHEN ... THEN MAX(...)) and
cannot diverge.
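
As an illustration of the single-source window, here is a TypeScript mirror of the SQL weight. The seven-day value of `ANALYTICS_WINDOW_MS` is inferred from the literal 7.0 and is an assumption, as is the lower clamp at 0; `recencyWeight` and `weightedStats` are hypothetical names.

```typescript
// Assumed value: the hard-coded 7.0 divisor suggests a seven-day window.
const ANALYTICS_WINDOW_MS = 7 * 24 * 60 * 60 * 1000;
const MS_PER_DAY = 1000 * 60 * 60 * 24;
const windowDays = ANALYTICS_WINDOW_MS / MS_PER_DAY;

// Mirror of the SQL weight: 1 - ageDays / windowDays, clamped to [0, 1].
function recencyWeight(ageDays: number): number {
  return Math.max(0, Math.min(1, 1 - ageDays / windowDays));
}

// Both sums use the same weight function, so total and successes stay aligned.
function weightedStats(
  rows: { ageDays: number; success: boolean }[],
): { total: number; successes: number } {
  let total = 0;
  let successes = 0;
  for (const row of rows) {
    const weight = recencyWeight(row.ageDays);
    total += weight;
    if (row.success) successes += weight;
  }
  return { total, successes };
}
```

In the SQL itself the same effect comes from binding `windowDays` once as a parameter and referencing it in both expressions.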

In `@server/write_test.py`:
- Line 4: Replace the hardcoded absolute string assigned to the variable path in
write_test.py with a repository-relative construction: build the path from the
script's location (e.g., using Path(__file__) / os.path.dirname(__file__) and
os.path.join or pathlib.Path with .resolve()/parents to reach the repository
root) and then append src/__tests__/services/router.test.ts so the path variable
is portable across machines; update the assignment to use that computed path
instead of '/home/vi/...'.

In `@server/write_tests.py`:
- Line 4: Replace the hard-coded user-specific absolute path in the variable
path with a machine-independent relative or programmatically-resolved path;
update the assignment to compute the test file location using pathlib or os.path
(e.g., Path(__file__).resolve().parents[...] / "src" / "__tests__" / "services"
/ "router.test.ts" or os.path.join relative to the repository/script location)
so the variable path in server/write_tests.py is portable across CI and other
developer machines.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro Plus

Run ID: b69a8c89-5b04-46c0-a2cc-f4d866a07ebe

📥 Commits

Reviewing files that changed from the base of the PR and between 233e031 and 2563017.

📒 Files selected for processing (43)

.roo/specs/disable-sticky-on-auto/design.md
.roo/specs/disable-sticky-on-auto/requirements.md
.roo/specs/disable-sticky-on-auto/tasks.md
.roo/specs/generalized-thread-protection/design.md
.roo/specs/generalized-thread-protection/requirements.md
.roo/specs/generalized-thread-protection/tasks.md
.roo/specs/owl-alpha-longcat-model-routing/design.md
.roo/specs/owl-alpha-longcat-model-routing/requirements.md
.roo/specs/owl-alpha-longcat-model-routing/tasks.md
.roo/specs/recency-biased-thompson-sampling/design.md
.roo/specs/recency-biased-thompson-sampling/requirements.md
.roo/specs/recency-biased-thompson-sampling/tasks.md
.roo/specs/sse-stream-heartbeat-stall-protection/design.md
.roo/specs/sse-stream-heartbeat-stall-protection/requirements.md
.roo/specs/sse-stream-heartbeat-stall-protection/tasks.md
.roo/specs/transient-model-cooldown/design.md
.roo/specs/transient-model-cooldown/requirements.md
.roo/specs/transient-model-cooldown/tasks.md
.roo/specs/wrapped-error-interception/design.md
.roo/specs/wrapped-error-interception/requirements.md
.roo/specs/wrapped-error-interception/tasks.md
client/src/pages/FallbackPage.tsx
do_fix.py
fix.py
fix_streaming.py
fix{
new_streaming_block.txt
server/src/__tests__/routes/fallback.test.ts
server/src/__tests__/routes/provider-session-ban.test.ts
server/src/__tests__/routes/proxy-tools.test.ts
server/src/__tests__/routes/stream-heartbeat-stall.test.ts
server/src/__tests__/routes/transient-cooldown.test.ts
server/src/__tests__/services/router.test.ts
server/src/providers/base.ts
server/src/providers/cloudflare.ts
server/src/providers/cohere.ts
server/src/providers/google.ts
server/src/providers/openai-compat.ts
server/src/routes/proxy.ts
server/src/services/router.ts
server/src/services/threadProtection.ts
server/write_test.py
server/write_tests.py

coderabbitai · 2026-06-04T22:21:15Z

+```
+THREAD_PROTECTION_PLATFORMS="longcat:provider-ban,groq:model-skip"
+```


⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Add a language tag to the fenced code block.

The fence at Line 43 is missing a language identifier, which triggers markdownlint MD040.

Proposed fix

-```
+```bash
 THREAD_PROTECTION_PLATFORMS="longcat:provider-ban,groq:model-skip"


🧰 Tools

🪛 markdownlint-cli2 (0.22.1)

[warning] 43-43: Fenced code blocks should have a language specified

(MD040, fenced-code-language)

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate.

In @.roo/specs/generalized-thread-protection/design.md around lines 43 - 45, The fenced code block containing THREAD_PROTECTION_PLATFORMS="longcat:provider-ban,groq:model-skip" is missing a language tag; update that fenced block (the block surrounding the THREAD_PROTECTION_PLATFORMS line) to include a language identifier such as bash so the fence becomes ```bash and resolves markdownlint MD040 while keeping the same content.

coderabbitai · 2026-06-04T22:21:16Z

+
+## Problem Statement
+
+The proxy route handler (`server/src/routes/proxy.ts`) contains 6+ hardcoded branches that special-case the `longcat`{


⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Fix the truncated requirement sentence.

Line 5 is malformed (longcat{) and the problem statement is incomplete, so the requirement is currently ambiguous.

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate.

In @.roo/specs/generalized-thread-protection/requirements.md at line 5, Fix the malformed requirement sentence on line 5 by removing the stray "{", closing the inline code delimiter around `longcat`, and rewriting the truncated fragment into a complete sentence that clearly states the issue (e.g., that the proxy route handler’s exported function in proxy.ts contains 6+ hardcoded branches that special-case the `longcat` route). Ensure the sentence is grammatical, the backticks are balanced, and the requirement precisely describes the special-casing problem so it’s unambiguous.

coderabbitai · 2026-06-04T22:21:16Z

+  5. Return from the handler (no retry on stall — the stream is already partially delivered)
+


⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Clarify pre-stream stall retry semantics.

Line 47 says stall handling should not retry, and Line 132 marks stalled-stream retry logic out of scope. That conflicts with the pre-stream 504/retry behavior described in companion tasks/tests. Please make pre-stream vs mid-stream stall behavior explicit and consistent here.

Also applies to: 128-132

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate.

In @.roo/specs/sse-stream-heartbeat-stall-protection/requirements.md around lines 47 - 48, The requirement is ambiguous about retry semantics for stalls: update the document to explicitly distinguish "pre-stream" stalls (before any response bytes are sent) from "mid-/post-stream" stalls (after partial delivery). Amend the bullet "5. Return from the handler (no retry on stall — the stream is already partially delivered)" and the paragraph that marks stalled-stream retry logic out of scope to state that 504/timeouts occurring before any data is flushed should allow a retry per the companion tests, whereas any stall detected after the first byte is sent must return without retry. Also mirror this clarification where the doc references stalled-stream retry logic (the section referencing out-of-scope behavior) so the pre-stream 504/retry behavior and mid-stream no-retry behavior are consistent throughout.
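
The pre-stream versus mid-stream distinction requested above can be stated as a small decision function. This is a sketch under assumptions: `decideStallAction`, the `StallAction` type, and the `bytesFlushed` counter are hypothetical names, and the 504 status is taken from the comment's description of the companion tests.

```typescript
type StallAction =
  | { kind: "retry"; status: 504 }
  | { kind: "terminate" };

// Hypothetical helper: `bytesFlushed` counts response bytes already written
// to the client when the stall is detected.
function decideStallAction(bytesFlushed: number): StallAction {
  if (bytesFlushed === 0) {
    // Pre-stream stall: nothing was delivered, so a 504 plus retry is safe.
    return { kind: "retry", status: 504 };
  }
  // Mid-stream stall: partial output is already visible, so return
  // from the handler without retrying.
  return { kind: "terminate" };
}
```

Keying the decision on whether any byte was flushed gives the requirements document and the tests one shared rule to reference.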

coderabbitai · 2026-06-04T22:21:16Z

+}
+```
+
+**Note**: The `catch` block already skips malformed chunks. The `throwWrappedError()` call throws before `yield`, so the generator terminates immediately. The `try/catch` around `JSON.parse` does NOT catch the `ProviderApiError` thrown by `throwWrappedError()` because that throw happens after successful parsing — it propagates out of the generator to the consumer in `proxy.ts`.


⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Fix the streaming propagation note; it currently states incorrect exception behavior.

Line 120 says the surrounding try/catch will not catch throwWrappedError(), but with the shown structure it will be caught and swallowed. Please update this note to require parse-only try/catch, with wrapped-error checks outside it.

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate.

In @.roo/specs/wrapped-error-interception/design.md at line 120, The note is incorrect about exception flow: with the current structure throwWrappedError() and ProviderApiError can be caught by the surrounding try/catch, so update the text to require a parse-only try/catch and to perform wrapped-error checks outside that block; specifically describe that JSON.parse should be inside a minimal try/catch that only handles malformed chunk parsing, while throwWrappedError() (and any ProviderApiError checks) must occur after the parse block (outside the try) so they propagate to the consumer (e.g., proxy.ts) rather than being swallowed.
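
The exception flow at issue can be reproduced with a small synchronous generator. This is a simplified sketch, not the provider code: `ProviderApiError` and `isWrappedError` are stand-ins, and the wrapped-error test (an object with an `error` key) is an assumption.

```typescript
class ProviderApiError extends Error {} // stand-in for the project's class

const isWrappedError = (chunk: unknown): boolean =>
  typeof chunk === "object" && chunk !== null && "error" in chunk;

// Structure from the design note: the throw sits inside the catch-all try.
function* swallowing(lines: string[]): Generator<unknown> {
  for (const data of lines) {
    try {
      const parsed: unknown = JSON.parse(data);
      if (isWrappedError(parsed)) throw new ProviderApiError("wrapped error");
      yield parsed;
    } catch {
      // Meant for malformed chunks, but it also absorbs ProviderApiError.
    }
  }
}

// Parse-only try/catch: the wrapped-error check runs after the block.
function* propagating(lines: string[]): Generator<unknown> {
  for (const data of lines) {
    let parsed: unknown;
    try {
      parsed = JSON.parse(data);
    } catch {
      continue; // skip malformed chunks only
    }
    if (isWrappedError(parsed)) throw new ProviderApiError("wrapped error");
    yield parsed;
  }
}
```

With the first variant the consumer never sees the wrapped error and the stream simply continues; with the second the error reaches the consumer's own `try`/`catch`.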

coderabbitai · 2026-06-04T22:21:16Z

+- [x] 5. Add wrapped-error check in `OpenAICompatProvider.streamChatCompletion()` in `server/src/providers/openai-compat.ts`
+  - Inside the `try` block at line 126, after `JSON.parse(data)` succeeds:
+  - Insert: `if (this.isWrappedError(parsed)) { this.throwWrappedError(parsed); }`
+  - Note: assign the result of `JSON.parse` to a variable first, then check, then yield


⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Update streaming task steps to avoid swallowing wrapped errors.

These steps currently instruct placing throwWrappedError() inside the parse try block. That causes wrapped errors to be consumed by the generic catch. The tasks should require parse-only try/catch, then wrapped-error check/throw outside it.

Also applies to: 37-40, 46-49

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate.

In @.roo/specs/wrapped-error-interception/tasks.md around lines 28 - 31, The current task instructs throwing wrapped errors inside the parse try/catch which causes them to be swallowed; update the instructions for OpenAICompatProvider.streamChatCompletion() so the try/catch only surrounds JSON.parse: assign the result of JSON.parse(data) to a variable inside the try, exit the try/catch, then immediately call this.isWrappedError(parsed) and if true call this.throwWrappedError(parsed) (outside the parse try/catch), and only after that yield the parsed value; apply the same change to the other listed task steps.

coderabbitai · 2026-06-04T22:21:16Z

        try {
-          yield JSON.parse(data) as ChatCompletionChunk;
+          const parsed = JSON.parse(data) as ChatCompletionChunk;
+          if (this.isWrappedError(parsed)) {
+            this.throwWrappedError(parsed);
+          }
+          yield parsed;
        } catch {
          // Skip malformed chunks


⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Wrapped stream errors are currently swallowed by the local catch.

Because throwWrappedError(parsed) is inside this try, the catch treats it as a malformed chunk and suppresses it. This prevents wrapped 200 errors from propagating to retry/cooldown handling.

Suggested fix

-        try {
-          const parsed = JSON.parse(data) as ChatCompletionChunk;
-          if (this.isWrappedError(parsed)) {
-            this.throwWrappedError(parsed);
-          }
-          yield parsed;
-        } catch {
-          // Skip malformed chunks
-        }
+        let parsed: ChatCompletionChunk;
+        try {
+          parsed = JSON.parse(data) as ChatCompletionChunk;
+        } catch {
+          // Skip malformed chunks
+          continue;
+        }
+        if (this.isWrappedError(parsed)) {
+          this.throwWrappedError(parsed);
+        }
+        yield parsed;

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate.

In `@server/src/providers/cloudflare.ts` around lines 123 - 130, The local catch in the Cloudflare stream parser is swallowing wrapped 200 errors thrown by throwWrappedError(parsed), so they never reach retry/cooldown handling. Update the logic in the stream parsing path around isWrappedError and throwWrappedError so wrapped errors are detected and rethrown outside the malformed-chunk catch, while keeping only JSON parse failures or truly invalid chunks suppressed. Use the existing parsing flow in the provider method to separate wrapped-error handling from the generic catch.

coderabbitai · 2026-06-04T22:21:16Z

        try {
-          yield JSON.parse(data) as ChatCompletionChunk;
+          const parsed = JSON.parse(data) as ChatCompletionChunk;
+          if (this.isWrappedError(parsed)) {
+            this.throwWrappedError(parsed);
+          }
+          yield parsed;
        } catch {
          // Skip malformed chunks


⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Streaming wrapped errors won’t escape this block due to catch-all handling.

throwWrappedError(parsed) is executed inside the same try and gets swallowed by catch, so retry/fallback logic never sees it.

Suggested fix

-        try {
-          const parsed = JSON.parse(data) as ChatCompletionChunk;
-          if (this.isWrappedError(parsed)) {
-            this.throwWrappedError(parsed);
-          }
-          yield parsed;
-        } catch {
-          // Skip malformed chunks
-        }
+        let parsed: ChatCompletionChunk;
+        try {
+          parsed = JSON.parse(data) as ChatCompletionChunk;
+        } catch {
+          // Skip malformed chunks
+          continue;
+        }
+        if (this.isWrappedError(parsed)) {
+          this.throwWrappedError(parsed);
+        }
+        yield parsed;

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate.

In `@server/src/providers/cohere.ts` around lines 114 - 121, The current try/catch around JSON.parse in the stream loop swallows errors thrown by throwWrappedError(parsed), preventing retry/fallback logic from seeing wrapped errors; update the logic in the loop that uses isWrappedError and throwWrappedError so that JSON.parse remains inside a narrow try for parse-only failures but any detected wrapped error is thrown outside that catch (e.g., parse into a local variable in try, then after the try check isWrappedError(parsed) and call throwWrappedError(parsed) so it escapes the parse-catch), ensuring throwWrappedError is not caught by the parse error handler.

coderabbitai · 2026-06-04T22:21:16Z

        try {
-          yield JSON.parse(data) as ChatCompletionChunk;
+          const parsed = JSON.parse(data) as ChatCompletionChunk;
+          if (this.isWrappedError(parsed)) {
+            this.throwWrappedError(parsed);
+          }
+          yield parsed;
        } catch {
          // Skip malformed chunks


⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Streaming wrapped errors are suppressed instead of propagated.

throwWrappedError(parsed) runs inside the try, so the local catch absorbs the ProviderApiError and the retry loop never sees it.

Suggested fix

-        try {
-          const parsed = JSON.parse(data) as ChatCompletionChunk;
-          if (this.isWrappedError(parsed)) {
-            this.throwWrappedError(parsed);
-          }
-          yield parsed;
-        } catch {
-          // Skip malformed chunks
-        }
+        let parsed: ChatCompletionChunk;
+        try {
+          parsed = JSON.parse(data) as ChatCompletionChunk;
+        } catch {
+          // Skip malformed chunks
+          continue;
+        }
+        if (this.isWrappedError(parsed)) {
+          this.throwWrappedError(parsed);
+        }
+        yield parsed;

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

        try {
-          yield JSON.parse(data) as ChatCompletionChunk;
+          const parsed = JSON.parse(data) as ChatCompletionChunk;
+          if (this.isWrappedError(parsed)) {
+            this.throwWrappedError(parsed);
+          }
+          yield parsed;
        } catch {
          // Skip malformed chunks

        let parsed: ChatCompletionChunk;
        try {
          parsed = JSON.parse(data) as ChatCompletionChunk;
        } catch {
          // Skip malformed chunks
          continue;
        }
        if (this.isWrappedError(parsed)) {
          this.throwWrappedError(parsed);
        }
        yield parsed;

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate.

In `@server/src/providers/openai-compat.ts` around lines 130 - 137, The catch is swallowing ProviderApiError because throwWrappedError(parsed) is inside the try that catches all errors; change the control flow so JSON.parse errors are handled but ProviderApiError thrown by throwWrappedError is propagated: parse the chunk inside a small try/catch that only handles JSON.parse failures (skip malformed), then after successful parse call this.isWrappedError(parsed) and, if true, call this.throwWrappedError(parsed) outside the parse-only catch (or rethrow if the caught error is a ProviderApiError). This involves updating the parsing block around ChatCompletionChunk so throwWrappedError and its ProviderApiError are not swallowed.
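
The parenthetical alternative in this prompt, rethrowing from the catch when the error is a `ProviderApiError`, could look roughly like this. The class below is a stand-in with an assumed `status` field, and `parseChunks` is a hypothetical simplification of the stream loop.

```typescript
// Stand-in for the project's error class; the `status` field is assumed.
class ProviderApiError extends Error {
  status: number;
  constructor(message: string, status: number) {
    super(message);
    this.name = "ProviderApiError";
    this.status = status;
    // Keeps instanceof reliable if the code is compiled down to ES5.
    Object.setPrototypeOf(this, ProviderApiError.prototype);
  }
}

function* parseChunks(lines: string[]): Generator<unknown> {
  for (const data of lines) {
    try {
      const parsed: unknown = JSON.parse(data);
      if (typeof parsed === "object" && parsed !== null && "error" in parsed) {
        throw new ProviderApiError("wrapped error in 200 response", 200);
      }
      yield parsed;
    } catch (err) {
      // Let provider errors escape; only malformed chunks are skipped.
      if (err instanceof ProviderApiError) throw err;
    }
  }
}
```

This keeps the existing block shape at the cost of an `instanceof` check in the catch; the parse-only `try`/`catch` in the suggested fix avoids that check entirely.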

coderabbitai · 2026-06-04T22:21:16Z

+    "  });",
+    "",
+    "  it('should throw when no keys are configured', () => {",
+    "    expect(() => route{


⚠️ Potential issue | 🔴 Critical | ⚡ Quick win

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
set -euo pipefail
FILE="server/write_test.py"
if [ ! -f "$FILE" ]; then
  echo "Missing file: $FILE"
  exit 1
fi
echo "== $FILE =="
nl -ba "$FILE" | sed -n '1,120p'

Repository: vi70x3/freellmapi

Length of output: 150

🏁 Script executed:

#!/bin/bash
set -euo pipefail
FILE="server/write_test.py"
if [ ! -f "$FILE" ]; then
  echo "Missing file: $FILE"
  exit 1
fi
echo "== $FILE (lines 1-120) =="
cat -n "$FILE" | sed -n '1,120p'
echo
echo "== $FILE (lines 15-45) =="
cat -n "$FILE" | sed -n '15,45p'

Repository: vi70x3/freellmapi

Length of output: 2266

🏁 Script executed:

#!/bin/bash
set -euo pipefail
FILE="server/write_test.py"
echo "== $FILE =="
wc -l "$FILE"
echo
echo "== $FILE (lines 20-end) =="
cat -n "$FILE" | sed -n '20,$p'

Repository: vi70x3/freellmapi

Length of output: 734

🏁 Script executed:

#!/bin/bash
set -euo pipefail
FILE="server/write_test.py"
echo "== py_compile check =="
python3 -m py_compile "$FILE" 2>&1 || true
echo
echo "== tail -n 10 =="
tail -n 10 "$FILE" | cat -n -w2

Repository: vi70x3/freellmapi

Length of output: 323

Critical: fix server/write_test.py syntax error and hardcoded absolute output path

server/write_test.py line 29 has an unterminated string literal (" expect(() => route{), causing a SyntaxError and preventing the generator from running.

server/write_test.py line 4 hardcodes /home/vi/... as the output path, hurting portability.

🧰 Tools

🪛 Ruff (0.15.15)

[warning] 29-29: missing closing quote in string literal

(invalid-syntax)

[warning] 29-29: unexpected EOF while parsing

(invalid-syntax)

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate.

In `@server/write_test.py` at line 29, The test contains an unterminated/invalid string literal " expect(() => route{ which looks like leftover JS—fix by replacing that line with a valid Python assertion or properly terminated string/statement (e.g., use assert <condition> or close the quotes and parentheses) so server/write_test.py parses; also remove the hardcoded absolute path '/home/vi/...' and build the output path portably using pathlib or os.path (use Path(__file__).parent / "relative_output_dir" or os.path.join(os.path.dirname(__file__), "relative_output_dir")) and replace the hardcoded variable (the literal '/home/vi/...') with that portable path.

coderabbitai · 2026-06-04T22:21:16Z

+part1 = """import { describe, it, expect, beforeAll, beforeEach } from 'vitest';
+import { initDb, getDb } from '../../db/index.js';
+import { encrypt } from '../../lib/crypto.js';
+import { routeRequest, refreshStatsCache, getAnalyticsScores } from '../../services/router.js';
+
+describe('Router', () => {
+  beforeAll(() => {
+    process.env.ENCRYPTION_KEY = '0'.repeat(64);
+    initDb(':memory:');
+  });
+
+  beforeEach(() => {
+    const db = getDb();
+    db.prepare('DELETE FROM api_keys').run();
+    const models = db.prepare('SELECT id, intelligence_rank FROM models ORDER BY intelligence_rank ASC').all() as any[];
+    const update = db.prepare('UPDATE fallback_config SET priority = ? WHERE model_db_id = ?');
+    for (let i = 0; i < models.length; i++) {
+      update.run(i + 1, models[i].id);
+    }
+  });
+
+  it('should throw when no keys are configured', () => {
+    expect(() => routeRequest()).toThrow(/exhausted/i);
+  });
+
+  it('should route to highest priority model with available key', () => {
+    const db = getDb();
+    const { encrypted, iv, authTag } = encrypt('test-groq-key');
+    db.prepare('INSERT INTO api_keys (platform, label, encrypted_key, iv, auth_tag, status, enabled) VALUES (?, ?, ?, ?, ?, ?, ?)').run('groq', 'test', encrypted, iv, authTag, 'healthy', 1);
+    const result = routeRequest();
+    expect(result.platform).toBe('groq');
+    expect(result.apiKey).toBe('test-groq-key');
+  });
+
+  it('should route to an available model when keys exist for multiple platforms', () => {
+    const db = getDb();
+    const googleKey = encrypt('test-google-key');
+    db.prepare('INSERT INTO api_keys (platform, label, encrypted_key, iv, auth_tag, status, enabled) VALUES (?, ?, ?, ?, ?, ?, ?)').run('google', 'test', googleKey.encrypted, googleKey.iv, googleKey.authTag, 'healthy', 1);
+    const groqKey{


⚠️ Potential issue | 🔴 Critical | ⚡ Quick win

Fix SyntaxError: part1 triple-quoted string is unterminated in server/write_tests.py

File: server/write_tests.py (lines 7-45)
part1 = """ ... is cut off before the closing """ (the snippet ends mid-statement like const groqKey{), which would make the generator script fail to parse. Add the missing closing triple quotes and ensure the full template string content is included.

🧰 Tools

🪛 Ruff (0.15.15)

[warning] 7-45: missing closing quote in string literal

(invalid-syntax)

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@server/write_tests.py` around lines 7 - 45, The variable part1 (a triple-quoted Python string) is unterminated because the JS template literal stops mid-line ("const groqKey{"); close the Python triple-quoted string and restore the remainder of the JS test template so part1 contains a valid complete test file. Locate the part1 assignment in server/write_tests.py, add the terminating triple quotes (""") and ensure the included JS snippet completes the broken line (finish the "const groqKey..." insertion and any missing test blocks) so the generated string is syntactically valid JavaScript when written out.
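One way to catch this class of failure before committing is to compile the generator source without executing it. The sketch below relies only on Python's built-in `compile()`; the function name `syntax_error_of` and the shortened template text are illustrative and not taken from the repository.

```python
from typing import Optional


def syntax_error_of(source: str, filename: str = "<generator>") -> Optional[str]:
    """Return a short description of the first syntax error, or None if the source parses."""
    try:
        compile(source, filename, "exec")
    except SyntaxError as exc:
        return f"{filename}:{exc.lineno}: {exc.msg}"
    return None


# An unterminated triple-quoted string, similar in shape to the truncated part1 template.
broken = 'part1 = """import { describe } from \'vitest\';\n'
# The same assignment with the closing triple quotes restored.
fixed = broken + '"""\n'
```

Calling `syntax_error_of(broken)` reports an error, while `syntax_error_of(fixed)` returns `None`. This is the same check that `python3 -m py_compile` performs on a file, applied to an in-memory string.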

vi70x3 added 6 commits June 2, 2026 15:12

feat(providers): intercept wrapped error payloads on HTTP 200 responses

2fd34ba

feat(proxy): replace hardcoded LongCat/Owl Alpha cooldowns with generalized thread protection scanner

09bffb0

chore: temporary commit before switching branch

d4ea579

feat(thread-protection): implement rules engine and replace hardcoded longcat branches

24c1c80

refactor(proxy): replace flat cooldown with real-time active request tracking for LongCat and Owl Alpha

7d4b220

sourcery-ai Bot reviewed Jun 4, 2026

gemini-code-assist Bot reviewed Jun 4, 2026

qodo-code-review Bot reviewed Jun 4, 2026

coderabbitai Bot suggested changes Jun 4, 2026

vi70x3 closed this Jun 5, 2026

		// Cached owl-alpha model ID: undefined = not yet looked up, null = not found, number = found
		let cachedOwlAlphaModelId: number | null | undefined = undefined;

		import { PoolSection } from '@/components/pool-section'
		import type { PoolType } from '@/components/pool-badge'

		@@ -1,4 +1,5 @@
		import { describe, it, expect, beforeAll } from 'vitest';
		import { ModelPool } from '@freellmapi/shared/types.js';


		## Problem Statement

		The proxy route handler (`server/src/routes/proxy.ts`) contains 6+ hardcoded branches that special-case the `longcat`{

		5. Return from the handler (no retry on stall — the stream is already partially delivered)

Conversation

vi70x3 commented Jun 4, 2026 • edited by coderabbitai Bot

Summary by Sourcery

Summary by CodeRabbit

sourcery-ai Bot commented Jun 4, 2026 • edited

Reviewer's Guide

Sequence diagram for active-request-based LongCat/Owl Alpha protection

Flow diagram for updated routing with pools and recency bias

File-Level Changes

coderabbitai Bot commented Jun 4, 2026 • edited

Walkthrough

Changes

Estimated code review effort

Possibly related PRs

qodo-code-review Bot commented Jun 4, 2026

Review Summary by Qodo

Walkthroughs

File Changes

qodo-code-review Bot commented Jun 4, 2026 • edited

Code Review by Qodo

sourcery-ai Bot left a comment

gemini-code-assist Bot left a comment

Code Review

kilo-code-bot Bot commented Jun 4, 2026 • edited
