perf: reuse OpenAI client and add undici keep-alive Agent with connection warmup#100

Merged: qorzj merged 9 commits into lessweb:main from Lellansin:dev/optimize-keepalive-client-reuse on May 21, 2026

Lellansin commented May 20, 2026

Overview

Optimize HTTP connection reuse in the OpenAI API client to reduce TLS handshake overhead during the pauses between user inputs.

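Background

  • The default undici global fetch keeps an idle TCP connection alive for only 4 seconds
  • In a CLI, users typically spend 10–30 seconds reading the LLM output, far longer than the 4-second threshold
  • Every new prompt (including chat and tool use) therefore repeats the TCP+TLS handshake (measured at ~127ms), and cold-start time to first byte exceeds 200ms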
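
The mismatch between the idle timeout and the reading pause can be illustrated with a small model. This is only a sketch: `needsNewHandshake` and the sample pauses are hypothetical, and the model treats keep-alive as a single client-side idle timer, ignoring any server-side limits.

```typescript
// Illustrative model only: a pooled connection is reused when the idle gap
// between two requests is shorter than the client's keep-alive timeout.
// The function name and the sample gaps are assumptions made for this sketch.
function needsNewHandshake(idleMs: number, keepAliveTimeoutMs: number): boolean {
  return idleMs >= keepAliveTimeoutMs;
}

const DEFAULT_KEEP_ALIVE_MS = 4_000; // default cited in the description
const TUNED_KEEP_ALIVE_MS = 180_000; // value proposed in this PR

// Typical reading pauses between prompts, per the description: 10-30 s.
const readingPausesMs = [10_000, 20_000, 30_000];

const handshakesWithDefault = readingPausesMs.filter((gap) =>
  needsNewHandshake(gap, DEFAULT_KEEP_ALIVE_MS),
).length;
const handshakesWithTuned = readingPausesMs.filter((gap) =>
  needsNewHandshake(gap, TUNED_KEEP_ALIVE_MS),
).length;

console.log({ handshakesWithDefault, handshakesWithTuned });
// prints: { handshakesWithDefault: 3, handshakesWithTuned: 0 }
```

Under this model every prompt pays for a handshake with the 4-second default, while none does with the longer timeout.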

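Performance gains

Each LLM call (chat, tool use, etc.) is 80–100ms faster; cold-start time to first byte (TTFB) drops from about 210ms to about 130ms, a reduction of roughly 38%–43%.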

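| Metric | Before | After | Improvement |
| --- | --- | --- | --- |
| TCP+TLS handshake | On every request | Once, during the initial warmup | |
| Cold-start TTFB | ~210ms | ~130ms | ↓ 38% |
| Connection keep-alive | 4s (default) | 180s | 45x |
| First request | User waits | Warmed up in the background at startup | Imperceptible to the user |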

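Data source: scripts/benchmark-optimization.mjs, run against api.deepseek.com with the deepseek-chat model over 3 consecutive conversation turns.

For comparison, consider games such as LOL or Honor of Kings: dropping latency from 200+ms to 130ms is a difference players can feel, and a similar reduction here makes the tool feel more responsive.

In absolute terms, 100 chat/tool_use calls save roughly 10s, which is also a modest gain for long-running tasks.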
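
The headline figures can be re-derived with a few lines of arithmetic. The inputs are the numbers quoted in this description; the variable and function names are illustrative only.

```typescript
// Figures taken from the description; names are illustrative.
const ttfbBeforeMs = 210;
const ttfbAfterMs = 130;
const reductionPct = ((ttfbBeforeMs - ttfbAfterMs) / ttfbBeforeMs) * 100;

// Keep-alive window: 180 s versus the 4 s default.
const keepAliveRatio = 180 / 4;

// Cumulative saving over a number of calls, converted from ms to s.
function totalSavedSeconds(calls: number, savedPerCallMs: number): number {
  return (calls * savedPerCallMs) / 1000;
}

console.log(reductionPct.toFixed(1)); // prints: 38.1
console.log(keepAliveRatio); // prints: 45
console.log(totalSavedSeconds(100, 80), totalSavedSeconds(100, 100)); // prints: 8 10
```

With a per-call saving of 80–100ms, 100 calls save between 8s and 10s, which matches the "roughly 10s" estimate at the upper end.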

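Changes

  • Replace the default fetch with a custom undici Agent (keepAliveTimeout: 180s)
  • Cache and reuse the OpenAI client instance (matched by apiKey + baseURL)
  • Connection warmup: on App mount, pre-establish the TCP+TLS connection in fire-and-forget fashion (guarded by a 3s timeout)
  • Extract the logic above into src/common/openai-client.ts to keep App.tsx lean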
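
The first two items in the list above can be sketched without any dependencies. This is not the code from src/common/openai-client.ts: `withDispatcher`, `getClient`, `ApiClient`, `RequestOptions`, and the example URL are hypothetical stand-ins. A real implementation would presumably pass undici's `fetch` together with an undici `Agent` created with a `keepAliveTimeout`, and cache an OpenAI SDK client instead of the plain object used here.

```typescript
// Hypothetical shapes for this sketch only.
interface RequestOptions {
  method?: string;
  dispatcher?: unknown;
}
type FetchLike = (url: string, init?: RequestOptions) => Promise<unknown>;
interface ApiClient {
  apiKey: string;
  baseURL: string;
  fetch: FetchLike;
}

// Bind one long-lived dispatcher (standing in for a keep-alive Agent) to
// every request made through `baseFetch`.
function withDispatcher(baseFetch: FetchLike, dispatcher: unknown): FetchLike {
  return (url, init) => baseFetch(url, { ...init, dispatcher });
}

// Module-level cache: one client per (apiKey, baseURL) pair.
const clientCache = new Map<string, ApiClient>();

function getClient(apiKey: string, baseURL: string, fetchImpl: FetchLike): ApiClient {
  // Serializing the tuple gives an unambiguous composite key.
  const key = JSON.stringify([apiKey, baseURL]);
  let client = clientCache.get(key);
  if (!client) {
    client = { apiKey, baseURL, fetch: fetchImpl };
    clientCache.set(key, client);
  }
  return client;
}

// Usage with stand-ins: a fake fetch that records the options it receives.
const seen: RequestOptions[] = [];
const fakeFetch: FetchLike = async (_url, init) => {
  seen.push(init ?? {});
  return "ok";
};
const fakeAgent = { keepAliveTimeout: 180_000 };
const keepAliveFetch = withDispatcher(fakeFetch, fakeAgent);

const first = getClient("key-1", "https://api.example.com/v1", keepAliveFetch);
const second = getClient("key-1", "https://api.example.com/v1", keepAliveFetch);
const other = getClient("key-2", "https://api.example.com/v1", keepAliveFetch);

void first.fetch("https://api.example.com/v1/models", { method: "GET" });
console.log(first === second, first === other, seen[0]?.dispatcher === fakeAgent);
// prints: true false true
```

Keying the cache on both apiKey and baseURL means a changed credential or endpoint yields a fresh client, while repeated calls with the same pair share one connection pool.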

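Files changed

| File | Description |
| --- | --- |
| package.json | Adds undici: ^7.25.0 |
| src/common/openai-client.ts | New file: undici Agent, client cache, connection warmup, getMachineId |
| src/ui/App.tsx | Removes ~90 lines of code; now imports from the common module |
| src/ui/index.ts | Updates the re-export path |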

Checks

  • ✅ TypeScript type check
  • ✅ ESLint
  • ✅ Prettier formatting
  • ✅ All 311 tests pass

Lellansin added 9 commits May 19, 2026 16:24
Cache the OpenAI client at module level keyed by (apiKey, baseURL)
to avoid creating a fresh HTTP connection pool on every LLM turn.
The client is a stateless fetch wrapper so sharing across calls is
safe.  Model, thinking-mode and other settings are still read fresh
from config files each time.

Also add a mount-time warmup effect that eagerly creates the client
so the TCP+TLS connection is established while the user composes
their first prompt.

The default undici-based global fetch only keeps connections alive for
4 seconds, which is too short for a CLI where the user may spend
10–30 seconds reading output before typing the next prompt.

Add a custom fetch implementation backed by node:https.Agent with
keepAlive: true and a 60-second idle timeout.  The custom fetch is
passed to the OpenAI SDK constructor so every LLM API request
benefits from persistent connections across conversational turns.

Also handles streaming request bodies (ReadableStream) for SDK
features like file uploads.

Use npm undici's Agent with keepAliveTimeout: 60s instead of the
90-line custom https.Agent-based fetch wrapper.  The approach is the
same but much simpler — just pass undiciFetch with a configured
Agent dispatcher to the OpenAI SDK.

Required by the custom fetch wrapper that replaces the default
4s keepAlive undici global dispatcher with a custom Agent (60s).

undici is imported at runtime in App.tsx for the custom keepAlive
Agent.  When bundled with --packages=external, end users need the
package installed — it cannot be a devDependency.

undici v8 requires Node >=22, but the CI matrix includes Node 20
which the project intentionally supports.  v7 works on >=20.18.1.

Codex review found that the fire-and-forget warmup models.list()
had no timeout.  The OpenAI client defaults to a 10-minute timeout,
so an unreachable API could keep the Node process alive long after
the user exits.

…tion warmup

Extract OpenAI client creation logic into src/common/openai-client.ts:
- Custom undici Agent with 60s keepAlive timeout (default is 4s)
- Module-level client instance cache (reuse across calls)
- Fire-and-forget connection warmup on first creation (3s timeout)
- getMachineId() helper

The App.tsx now simply imports and re-exports createOpenAIClient from
the new common module, keeping UI concerns separate from HTTP/client
lifecycle management.
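
The bounded fire-and-forget warmup listed above could look roughly like the following. It is a sketch under assumptions: `warmUp`, `ping`, and `hangingPing` are hypothetical names, and the stand-in ping replaces the models.list() request that the commits describe.

```typescript
// Run a warmup request with an upper bound on how long it may stay pending.
// Errors are swallowed on purpose: a failed warmup must never affect the UI.
function warmUp(
  ping: (signal: AbortSignal) => Promise<unknown>,
  timeoutMs = 3_000,
): Promise<"ok" | "failed"> {
  const controller = new AbortController();
  const timer = setTimeout(() => controller.abort(), timeoutMs);
  const finish = (result: "ok" | "failed") => {
    clearTimeout(timer); // do not leave a timer behind once the ping settles
    return result;
  };
  return ping(controller.signal).then(
    () => finish("ok"),
    () => finish("failed"),
  );
}

// Stand-in for an unreachable API: it never settles on its own and only
// rejects once the abort signal fires.
const hangingPing = (signal: AbortSignal) =>
  new Promise<never>((_resolve, reject) => {
    signal.addEventListener("abort", () => reject(new Error("aborted")));
  });

// Fire-and-forget: the caller does not await the result.
void warmUp(hangingPing, 50).then((result) => console.log(result));
// prints: failed
```

Passing the abort signal into the request is what keeps a stalled warmup from holding the process open for the client's full default timeout.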
qorzj merged commit a385f5d into lessweb:main May 21, 2026
9 checks passed