perf: reuse OpenAI client and add undici keep-alive Agent with connection warmup by Lellansin · Pull Request #100 · lessweb/deepcode-cli

Lellansin · 2026-05-20T11:21:19Z

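Overview

Improve HTTP connection reuse in the OpenAI API client to cut the TLS handshake overhead incurred between user inputs.

Background

The default undici global fetch keeps idle TCP connections alive for only 4 seconds
In a CLI, users typically spend 10–30 seconds reading LLM output, far longer than the 4-second threshold
Each new prompt (including chat and tool use) therefore has to redo the TCP+TLS handshake (measured at ~127ms), giving a cold-start time to first byte (TTFB) of 200ms+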

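Performance gains

Each LLM call (chat, tool use, etc.) becomes 80ms~100ms faster; cold-start TTFB drops from about 210ms to about 130ms, a reduction of roughly 38%~43%.

Metric	Before	After	Improvement
TCP+TLS handshake	Every request	Once, during the initial warmup	n/a
Cold-start TTFB	~210ms	~130ms	↓ 38%
Connection keep-alive	4s (default)	180s	45x
First request	User waits	Warmed up in the background at startup	Not noticeable to the user

Data source: scripts/benchmark-optimization.mjs, run against api.deepseek.com with the deepseek-chat model over 3 consecutive conversation turns.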
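As an illustration of how a TTFB figure like the ones above could be collected, here is a minimal sketch. It is not the PR's scripts/benchmark-optimization.mjs, whose contents are not shown on this page; `measureTtfbMs` and `FetchLike` are hypothetical names, and the fetch implementation is injected so the helper can be pointed at any fetch-compatible function.

```typescript
// Hypothetical helper for illustration only; not taken from the PR.
type FetchLike = (url: string, init?: RequestInit) => Promise<Response>;

/** Milliseconds from issuing the request until the first body chunk arrives. */
export async function measureTtfbMs(
  url: string,
  init: RequestInit = {},
  fetchImpl: FetchLike = fetch,
): Promise<number> {
  const start = performance.now();
  const response = await fetchImpl(url, init);
  const reader = response.body?.getReader();
  if (reader) {
    await reader.read(); // resolves once the first chunk (or end of stream) arrives
    await reader.cancel(); // the remaining body is irrelevant for timing
  }
  return performance.now() - start;
}
```

On a cold connection the measured time includes the TCP+TLS handshake; a repeat call made while the connection is still kept alive should not, which is the difference the table reports.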

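For comparison, think of network latency in games such as LOL or Honor of Kings dropping from 200+ms to 130ms: a reduction of that size makes the tool feel noticeably more responsive.

In absolute terms, 100 chat/tool_use calls save roughly 10s, which is also a modest gain for long-running tasks.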

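Changes

Replace the default fetch with a custom undici Agent (keepAliveTimeout: 180s)
Cache and reuse the OpenAI client instance (matched by apiKey + baseURL)
Connection warmup: on App mount, pre-establish the TCP+TLS connection in fire-and-forget fashion (guarded by a 3s timeout)
Extract the logic above into src/common/openai-client.ts to keep App.tsx lean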

File changes

File	Description
`package.json`	Adds `undici: ^7.25.0`
`src/common/openai-client.ts`	New file: undici Agent, client cache, connection warmup, getMachineId
`src/ui/App.tsx`	Removes ~90 lines of code; now imports from the common module
`src/ui/index.ts`	Updates the re-export path

Checks

✅ TypeScript type check
✅ ESLint
✅ Prettier formatting
✅ All 311 tests pass

Cache the OpenAI client at module level keyed by (apiKey, baseURL) to avoid creating a fresh HTTP connection pool on every LLM turn. The client is a stateless fetch wrapper so sharing across calls is safe. Model, thinking-mode and other settings are still read fresh from config files each time. Also add a mount-time warmup effect that eagerly creates the client so the TCP+TLS connection is established while the user composes their first prompt.
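The caching described in this commit message can be sketched roughly as follows. This is an assumption about the shape of the code, not the PR's actual implementation: `memoizeByCredentials` is a hypothetical name, and the sketch uses a `Map` with a composite string key, whereas the real module may simply keep one cached instance and compare credentials.

```typescript
// Hypothetical sketch; the PR's src/common/openai-client.ts is not reproduced here.
export function memoizeByCredentials<T>(
  build: (apiKey: string, baseURL: string) => T,
): (apiKey: string, baseURL: string) => T {
  const cache = new Map<string, T>();
  return (apiKey, baseURL) => {
    // JSON-encode the pair so ("a", "bc") and ("ab", "c") cannot collide.
    const key = JSON.stringify([apiKey, baseURL]);
    let client = cache.get(key);
    if (client === undefined) {
      client = build(apiKey, baseURL); // first use: create the client (and its connection pool)
      cache.set(key, client);
    }
    return client;
  };
}

// Assumed usage with the OpenAI SDK (requires the `openai` package):
//   const getClient = memoizeByCredentials(
//     (apiKey, baseURL) => new OpenAI({ apiKey, baseURL }),
//   );
```

Only the credentials take part in the key, so settings such as the model name can still be read fresh on every call, as the commit message notes.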

The default undici-based global fetch only keeps connections alive for 4 seconds, which is too short for a CLI where the user may spend 10–30 seconds reading output before typing the next prompt. Add a custom fetch implementation backed by node:https.Agent with keepAlive: true and a 60-second idle timeout. The custom fetch is passed to the OpenAI SDK constructor so every LLM API request benefits from persistent connections across conversational turns. Also handles streaming request bodies (ReadableStream) for SDK features like file uploads.

Use npm undici's Agent with keepAliveTimeout: 60s instead of the 90-line custom https.Agent-based fetch wrapper. The approach is the same but much simpler — just pass undiciFetch with a configured Agent dispatcher to the OpenAI SDK.
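A minimal sketch of the wiring this commit describes, under stated assumptions: `bindDispatcher` is a hypothetical helper, and the undici and OpenAI SDK calls appear only in comments so the block has no external dependencies. With the real packages, some type casts may be needed between undici's fetch types and the ones the SDK expects.

```typescript
// `dispatcher` is an undici-specific fetch option; it is not part of the standard RequestInit.
type DispatcherInit<D> = RequestInit & { dispatcher?: D };
type DispatcherFetch<D> = (
  input: string | URL,
  init?: DispatcherInit<D>,
) => Promise<Response>;

/** Returns a fetch-like function that sends every request through `dispatcher`. */
export function bindDispatcher<D>(
  fetchImpl: DispatcherFetch<D>,
  dispatcher: D,
): DispatcherFetch<D> {
  return (input, init) => fetchImpl(input, { ...init, dispatcher });
}

// Assumed wiring (requires the `undici` and `openai` packages):
//   import { Agent, fetch as undiciFetch } from "undici";
//   import OpenAI from "openai";
//   const agent = new Agent({ keepAliveTimeout: 60_000 }); // milliseconds; a later commit raises this to 180_000
//   const client = new OpenAI({ apiKey, baseURL, fetch: bindDispatcher(undiciFetch, agent) });
```

Because the Agent owns the connection pool, keeping one Agent (and one client) alive for the whole session is what lets an idle connection survive between prompts.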

…tion warmup

Extract OpenAI client creation logic into src/common/openai-client.ts:
- Custom undici Agent with 60s keepAlive timeout (default is 4s)
- Module-level client instance cache (reuse across calls)
- Fire-and-forget connection warmup on first creation (3s timeout)
- getMachineId() helper

The App.tsx now simply imports and re-exports createOpenAIClient from the new common module, keeping UI concerns separate from HTTP/client lifecycle management.

Lellansin added 9 commits May 19, 2026 16:24

chore: add undici devDependency for custom keepAlive Agent

255226a

Required by the custom fetch wrapper that replaces the default 4s keepAlive undici global dispatcher with a custom Agent (60s).

fix: move undici from devDependencies to dependencies

5b74c00

undici is imported at runtime in App.tsx for the custom keepAlive Agent. When bundled with --packages=external, end users need the package installed — it cannot be a devDependency.

fix: downgrade undici to v7 for Node 20 compatibility

db78e2b

undici v8 requires Node >=22, but the CI matrix includes Node 20 which the project intentionally supports. v7 works on >=20.18.1.

fix: add 3s timeout to warmup request to prevent exit hang

87d52ad

Codex review found that the fire-and-forget warmup models.list() had no timeout. The OpenAI client defaults to a 10-minute timeout, so an unreachable API could keep the Node process alive long after the user exits.
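One way to bound a fire-and-forget warmup like this is an `AbortController` with a timer, sketched below. This is an assumption about the approach rather than the PR's code: `warmUp` is a hypothetical name, and the PR may instead rely on the SDK's own per-request timeout option.

```typescript
/**
 * Runs a best-effort warmup request and aborts it after `timeoutMs`.
 * Never rejects; resolves to true on success and false on failure or timeout.
 */
export async function warmUp(
  ping: (signal: AbortSignal) => Promise<unknown>,
  timeoutMs = 3_000,
): Promise<boolean> {
  const controller = new AbortController();
  const timer = setTimeout(() => controller.abort(), timeoutMs);
  try {
    await ping(controller.signal);
    return true;
  } catch {
    return false; // a failed or aborted warmup is ignored; the real request will retry the connection
  } finally {
    clearTimeout(timer); // do not leave the timer pending after the request settles
  }
}

// Assumed call site (requires the `openai` package; `signal` is an SDK request option):
//   void warmUp((signal) => client.models.list({ signal }));
```

With the abort in place, an unreachable API holds the process open for at most the timeout instead of the SDK's 10-minute default.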

perf: extend keepAlive timeout from 60s to 180s

61dbcc8

qorzj merged commit a385f5d into lessweb:main May 21, 2026
9 checks passed

qorzj merged 9 commits into lessweb:main from Lellansin:dev/optimize-keepalive-client-reuse