112 changes: 112 additions & 0 deletions .agents/skills/helmor-bump-vendors/SKILL.md
@@ -0,0 +1,112 @@
---
name: helmor-bump-vendors
description: Bump or upgrade the pinned versions of Helmor's bundled agent CLIs, SDKs, and supporting binaries — Claude Code + claude-agent-sdk (lockstep), Codex, Cursor SDK, OpenCode, Kimi, Pi, and gh / glab / cloudflared / llama.cpp / Node. Encodes exactly which files to edit (`sidecar/package.json`, `sidecar/scripts/vendor-platform.ts`), how to source each version and compute its SHA256, the Claude SDK↔CLI lockstep rule, npm dist-tags caveats (latest vs next vs stable), the cross-arch (arm64+x64) SHA requirement, and the mandatory verification gates. Use whenever the user wants to upgrade / bump / update / refresh a bundled agent CLI or SDK version, check whether a vendor is behind latest, or run a dependency version sweep in the Helmor repo.
---

# Helmor Bump Vendors

Standardized procedure for upgrading the third-party agent CLIs, SDKs, and helper binaries
that Helmor pins and bundles. Goal: a correct, verified bump with no guesswork about *where*
versions live, *how* to source each SHA256, or *what* to run before declaring it done.

## The pin sites

Every bundled version is pinned in one (or both) of the first two files below; the third holds staging logic and rarely needs editing:

- **`sidecar/package.json`** — npm dependencies. Covers SDKs (imported in TS) and the
npm-distributed CLIs whose native binary is staged from `node_modules`
(`@anthropic-ai/claude-code`, `@openai/codex`, `opencode-ai`).
- **`sidecar/scripts/vendor-platform.ts`** — version constants + per-version **SHA256 tables**
for every *staged binary*. Source of truth for what gets bundled into the release.
- `sidecar/scripts/stage-vendor.ts` — staging *logic*. Only edit it when a vendor's archive
**layout** changes (rare; see codex/cursor notes in `references/vendors.md`).
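
A quick way to see the current pins before editing is to grep for the constant and table names. This is a minimal sketch that assumes the `<NAME>_VERSION` / `<NAME>_SHA256` naming described in this skill; `list_pins` is a hypothetical helper, not something in the repo.

```bash
# list_pins FILE: print (with line numbers) every line that mentions a
# *_VERSION constant or a *_SHA256 table. Assumes upper-case constant names.
list_pins() {
  grep -nE '[A-Z][A-Z_]*_(VERSION|SHA256)' "$1"
}

# Example (path taken from the list above):
#   list_pins sidecar/scripts/vendor-platform.ts
```

Grepping by name rather than by line number keeps the lookup stable as the file grows.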

## Vendor classes (determine the change-set)

| Class | Vendors | What to edit | SHA256? |
|---|---|---|---|
| **A. npm SDK only** | `@anthropic-ai/claude-agent-sdk`, `@cursor/sdk`, `@opencode-ai/sdk`, `@earendil-works/pi-*` | `package.json` line | No — plain npm dep |
| **B. npm-distributed staged binary** | claude-code, codex, opencode | `package.json` line **+** SHA256 table key in `vendor-platform.ts` | Yes — from npm tarball |
| **C. GitHub-release staged binary** | kimi, gh, glab, cloudflared, llama.cpp, node | `<NAME>_VERSION` const **+** SHA256 table in `vendor-platform.ts` (NOT in `package.json`) | Yes — source varies |

Per-vendor exact pin location, SHA256 source, and gotchas live in **`references/vendors.md`** —
read the relevant section before editing.

## Workflow

1. **Scope.** Confirm which vendors to bump. For each, open `references/vendors.md` for its class,
pin location, SHA source, and gotchas.
2. **Find the target version. Check LIVE — never trust memory; dist-tags flip within hours.**
- npm: `bun -e 'console.log((await (await fetch("https://registry.npmjs.org/<pkg>")).json())["dist-tags"])'`
Target `latest` (the stable channel). `next` is a prerelease — do **not** pin it unless the
user explicitly asks. claude-code also publishes a conservative `stable` tag that *lags*
(e.g. `2.1.179`); Helmor tracks `latest`, not `stable`.
- GitHub-release vendors: check the repo's Releases (or `https://api.github.com/repos/<owner>/<repo>/releases`).
3. **Edit the pins** (`package.json` and/or the `_VERSION` const). Apply the **Claude lockstep rule**
and any per-vendor gotcha from the reference.
4. **`cd sidecar && bun install`** — pulls the new versions. Sanity-check: resolved versions are
correct, any *removed* deps dropped from `bun.lock`, transitive deps you rely on are still present.
5. **Compute + fill SHA256** for class B/C. Use `scripts/npm_vendor_sha.sh` for B; see the reference
for C. **Both `arm64` and `x64` are mandatory** (see Critical rules).
6. **Run the verification gates** (below) — all must pass.
7. **Create release metadata.** Once the gates pass, invoke the **`/helmor-release`** skill to draft
the changeset (and an in-app announcement if the bump warrants one). Don't skip this — a vendor
bump is a user-visible change and needs a changeset. A routine bundled-agent refresh is typically
a `patch` changeset with **no** announcement; the body should name the user-visible change (which
agents moved to latest), not the internal cleanup (Pi removal, pin tidy-ups, doc fixes).
8. **Report**: current → target per vendor, breaking-change assessment, gate results, exact files
touched, and the changeset created. Leave commit / PR to the user unless asked.
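
The "is this vendor behind `latest`?" question in step 2 comes down to a dotted-version comparison. Below is a minimal sketch in plain bash, assuming numeric `MAJOR.MINOR.PATCH` versions with no prerelease suffix; `version_lt` is a hypothetical helper name.

```bash
# version_lt A B: succeed (exit 0) when version A is strictly older than B.
# Assumes numeric MAJOR.MINOR.PATCH only; prerelease suffixes are not handled.
version_lt() {
  local IFS=.
  local -a a=($1) b=($2)   # split both versions on "."
  local i x y
  for i in 0 1 2; do
    x=${a[i]:-0}
    y=${b[i]:-0}
    if (( 10#$x < 10#$y )); then return 0; fi
    if (( 10#$x > 10#$y )); then return 1; fi
  done
  return 1                 # equal versions are not "behind"
}

# Example with the two claude-code versions mentioned in this skill:
version_lt 2.1.179 2.1.191 && echo "pin is behind latest"
```

The comparison is numeric per component, so `0.99.9` correctly sorts before `0.142.0`, which a plain string comparison would get wrong.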

## Critical rules (the non-obvious parts that cause bad bumps)

- **Claude lockstep.** `@anthropic-ai/claude-agent-sdk@0.3.X` and `@anthropic-ai/claude-code@2.1.X`
share patch `X` and ship together — **always bump both to the same X**. Verify: the SDK's
`node_modules/@anthropic-ai/claude-agent-sdk/package.json` carries `claudeCodeVersion: "2.1.X"`.
Only **claude-code** (the staged binary) needs a SHA256 entry; the agent-sdk is a plain npm dep.
- **Cross-arch SHA is mandatory.** Every class B/C SHA table needs **both `arm64` and `x64`**.
CI cross-builds the x86_64 bundle on an arm64 runner. On a native-arch host the build uses
`node_modules` directly and does **not** verify the SHA — so a wrong/missing `x64` entry passes
locally but **breaks CI**. Always compute both from the tarballs.
- **dist-tags drift.** Re-check `latest` at bump time even if you "just looked" — a newer patch can
be promoted from `next` to `latest` within hours.
- **SHA table = rolling history.** The tables keep a few recent version keys (cache is
version-keyed, so old keys coexist harmlessly). Add the new key; keep the prior one. If you are
*superseding an uncommitted entry you added this session*, replace it (don't stack) for a clean diff.
- **Layout-change watch.** Codex ships a self-describing `codex-package.json` descriptor; after a
bump, diff it — a `layoutVersion` change or new field means `stage-vendor.ts` needs review. See
`references/vendors.md` for codex, cursor (Node engines floor + phantom dep), and kimi (ACP
protocol version) specifics.
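
The Claude lockstep rule above can be checked mechanically by comparing the patch component of the two pins. This is a sketch only; `claude_lockstep_ok` is a hypothetical helper, and it deliberately compares just the last version component so it keeps working if the `0.3` / `2.1` prefixes ever move.

```bash
# claude_lockstep_ok SDK_VERSION CLI_VERSION
# Succeeds when both versions end in the same patch number X
# (e.g. 0.3.X for the agent SDK and 2.1.X for claude-code).
claude_lockstep_ok() {
  local sdk_patch="${1##*.}"   # text after the last "."
  local cli_patch="${2##*.}"
  [[ -n "$sdk_patch" && "$sdk_patch" == "$cli_patch" ]]
}

# Example:
#   claude_lockstep_ok 0.3.191 2.1.191 || echo "lockstep violated" >&2
```

The authoritative check remains the `claudeCodeVersion` field in the installed SDK's `package.json`; this only guards against an obviously mismatched edit.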

## Verification gates (run in order; all must pass)

```bash
cd sidecar && bun install # 1. installs targets; confirm versions + dropped deps in bun.lock
cd sidecar && bun run typecheck # 2. catches SDK API breaks (removed/renamed exports) — main breaking-change detector
cd sidecar && bun test # 3. sidecar unit tests
# 4. MANDATORY after ANY agent CLI/SDK bump — validates the stdout event-shape contract the Rust pipeline depends on:
cd src-tauri && cargo test --test pipeline_scenarios --test pipeline_fixtures --test pipeline_streams
cd sidecar && bun run build # 5. full staging + compile; a wrong SHA256 hard-fails here (downloads + verifies kimi / cross-arch)
```

What each gate proves:
- **typecheck** is the real breaking-change detector for SDK bumps (removed/renamed exports, changed types).
- **cargo pipeline tests** replay *stored* fixtures, so they catch pipeline-code regressions — **not**
new event shapes from a newer binary. For the latter, read the upstream changelog (focus on the
stdout event JSON: codex `item/`,`turn/`,`thread/` methods; claude `SDKMessage`/stream blocks;
opencode `message.part`; kimi ACP `session/update`) and capture fresh fixtures if the shape moved.
- **build** is the only gate that exercises SHA256 verification and the staging layout.

## Breaking-change diligence

Before pinning, read the upstream changelog/release notes across the current→target window. Most
agent-CLI patch bumps are additive; the risks that matter for Helmor are (a) SDK export/type changes
(typecheck catches these) and (b) stdout event-shape changes (the Rust pipeline contract). Tag each
notable change *affects Helmor* or *no impact* with reasoning, and surface it before bumping.

## Tools in this skill

- **`scripts/npm_vendor_sha.sh <claude-code|codex|opencode> <version>`** — downloads the darwin
`arm64` + `x64` npm tarballs and prints their SHA256, ready to paste into the `vendor-platform.ts`
table. (Class B only. Class A SDKs need no SHA; class C sources differ — see the reference.)
- **`references/vendors.md`** — exhaustive per-vendor map: integration mechanism, exact pin
location, SHA256 source/recipe, gotchas, and post-bump steps.
178 changes: 178 additions & 0 deletions .agents/skills/helmor-bump-vendors/references/vendors.md
@@ -0,0 +1,178 @@
# Per-vendor bump reference

Exact pin location, SHA256 source, gotchas, and post-bump steps for every bundled vendor.
All paths are relative to the repo root. Line numbers drift — grep the named constant/key instead.

## Contents

- [Claude (claude-agent-sdk + claude-code)](#claude) — class A+B, **lockstep**
- [Codex (@openai/codex)](#codex) — class B, layout descriptor
- [Cursor (@cursor/sdk)](#cursor) — class A, Node worker + phantom dep
- [OpenCode (@opencode-ai/sdk + opencode-ai)](#opencode) — class A+B, SDK/CLI lockstep
- [Kimi](#kimi) — class C, GitHub release, ACP protocol
- [Pi (@earendil-works/pi-*)](#pi) — class A, **dead code → prefer delete**
- [gh / glab / cloudflared / llama.cpp / node](#supporting-tools) — class C, supporting binaries

---

## Claude

**Integration:** `@anthropic-ai/claude-agent-sdk` is imported in `sidecar/src/claude/`; the SDK
spawns the bundled `claude` binary (`@anthropic-ai/claude-code`, staged from `node_modules`).

**LOCKSTEP — bump both to the same patch X:**
- `sidecar/package.json`: `@anthropic-ai/claude-agent-sdk` = `0.3.X`, `@anthropic-ai/claude-code` = `2.1.X`.
- Verify the pairing: `node_modules/@anthropic-ai/claude-agent-sdk/package.json` has `claudeCodeVersion: "2.1.X"`.

**SHA256 (claude-code only — the agent-sdk is a plain npm dep, no SHA):**
- Table: `CLAUDE_CODE_SHA256["2.1.X"] = { arm64, x64 }` in `sidecar/scripts/vendor-platform.ts`.
- Compute: `scripts/npm_vendor_sha.sh claude-code 2.1.X` (downloads
`registry.npmjs.org/@anthropic-ai/claude-code-darwin-{arm64,x64}/-/claude-code-darwin-{arm64,x64}-2.1.X.tgz`).

**Gotchas:**
- dist-tags: target `latest`. claude-code also has a `stable` tag that LAGS — ignore it, Helmor tracks `latest`.
- The Rust pipeline depends on the SDK stdout event shape (`SDKMessage`, stream blocks, tool_use/tool_result,
thinking). The `cargo` pipeline gate is mandatory after every claude-code bump.

---

## Codex

**Integration:** spawns the bundled `codex app-server` binary over JSON-RPC (NOT an npm SDK — there
is no `@openai/codex-sdk` dependency despite older doc wording). Code in `sidecar/src/codex/`.

**Pins:**
- `sidecar/package.json`: `@openai/codex` = `X` (e.g. `0.142.0`).
- SHA256 table: `CODEX_SHA256["X"] = { arm64, x64 }` in `vendor-platform.ts`.
- Compute: `scripts/npm_vendor_sha.sh codex X` (downloads `registry.npmjs.org/@openai/codex/-/codex-X-darwin-{arm64,x64}.tgz`).

**Gotchas:**
- **Layout descriptor.** Codex ≥0.134 ships `node_modules/@openai/codex-darwin-arm64/vendor/aarch64-apple-darwin/codex-package.json`
(`layoutVersion`, `entrypoint`, `pathDir`, `resourcesDir`). `stage-vendor.ts` reads it and is
forward-compatible for field renames. **After a bump, diff this descriptor** — if `layoutVersion`
bumps past 1 or new top-level keys appear, review `stageCodexFromVendorRoot` in `stage-vendor.ts`.
- Rust pipeline consumes `item/`, `turn/`, `thread/` slash-form methods (see `pipeline/accumulator/codex.rs`
`normalize_item_type`). New item types or renamed methods require Rust changes — the cargo gate catches drift.
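
For the descriptor diff, a small helper that pulls out `layoutVersion` makes the before/after comparison a one-liner. This is a rough sketch that assumes the field holds a bare integer; a real JSON parser would be more robust, and `layout_version` is a hypothetical name.

```bash
# layout_version FILE: print the integer value of "layoutVersion" in a
# codex-package.json descriptor. Assumes a numeric value such as: "layoutVersion": 1
layout_version() {
  sed -nE 's/.*"layoutVersion"[[:space:]]*:[[:space:]]*([0-9]+).*/\1/p' "$1" | head -n 1
}

# Example: save a copy of the descriptor before `bun install`, then compare.
#   [ "$(layout_version old.json)" = "$(layout_version new.json)" ] \
#     || echo "layoutVersion changed: review stage-vendor.ts" >&2
```

This only detects a `layoutVersion` change; new top-level keys still need a full `diff` of the two files.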

---

## Cursor

**Integration:** `@cursor/sdk` runs in a separate **Node worker** (`sidecar/src/cursor/worker/`), NOT
Bun (its HTTP/2 client drops tool traffic under Bun). Class A — npm SDK, **no SHA256 table**.

**Pins:**
- `sidecar/package.json`: `@cursor/sdk` = `X`.
- The cursor-worker bundle version is read **dynamically** by `stageCursorWorkerDeps` in `stage-vendor.ts`
(`readCursorSdkVersion()`), which runs a live `npm install` for the bundle target — **no version literal
or SHA table to edit** in `vendor-platform.ts`.

**Gotchas:**
- **Node engines floor.** `@cursor/sdk` requires Node `>=22.13`. The bundled `NODE_VERSION` (see node
section) must satisfy it. If a cursor bump raises the floor, bump Node too.
- **Phantom `@connectrpc/connect-node`.** Pre-1.0.21 the SDK imported it at runtime without declaring
it, so Helmor injected an explicit pin (in `package.json` AND in `stageCursorWorkerDeps`). 1.0.21+
declares it as a real dependency, so those explicit pins were removed. **After any cursor bump,
verify it still resolves:** `ls sidecar/node_modules/@connectrpc/connect-node` and, after a build,
`ls sidecar/dist/vendor/cursor-worker/node_modules/@connectrpc/`. If absent, re-add the pin.
- `sidecar/src/session-manager.ts` mirrors `ModelParameterDefinition` from the SDK by hand — if that
shape changes, the mirror drifts silently. Spot-check it.
- Cursor ships **no per-patch SDK changelog** → smoke-test the worker after bumping (Agent.create/
resume/prompt, `Cursor.models.list`, raw event names `status`/`tool_call`/`assistant`/`thinking`
which `pipeline/accumulator/cursor.rs` namespaces).
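
The Node engines floor can be sanity-checked against the pinned `NODE_VERSION` with a few lines of bash. This sketch hard-codes the `>=22.13` floor quoted above, so it needs updating if a Cursor bump raises it; `node_meets_floor` is a hypothetical helper.

```bash
# node_meets_floor VERSION: succeed when VERSION >= 22.13.
# The 22.13 floor is copied from the gotcha above and must be kept in sync by hand.
node_meets_floor() {
  local v="${1#v}"          # tolerate a leading "v"
  local major="${v%%.*}"
  local rest="${v#*.}"
  local minor="${rest%%.*}"
  (( major > 22 || (major == 22 && minor >= 13) ))
}

# Example:
#   node_meets_floor "$pinned_node_version" || echo "Node too old for @cursor/sdk" >&2
```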

---

## OpenCode

**Integration:** SDK client (`@opencode-ai/sdk/v2`, `createOpencodeClient`) in
`sidecar/src/opencode-protocol/`; the `opencode-ai` native binary is staged and spawned as a server.

**Pins (SDK + CLI release in LOCKSTEP — same version):**
- `sidecar/package.json`: `@opencode-ai/sdk` = `X` and `opencode-ai` = `X`.
- SHA256 table: `OPENCODE_SHA256["X"] = { arm64, x64 }` in `vendor-platform.ts`.
- Compute: `scripts/npm_vendor_sha.sh opencode X` (downloads `registry.npmjs.org/opencode-darwin-{arm64,x64}/-/opencode-darwin-{arm64,x64}-X.tgz`).

**Gotchas:**
- The registry is flooded with `0.0.0-*` snapshot tags — ignore them; the real channel is `latest`.
- `opencode-ai`'s postinstall is blocked (not a trusted dep); harmless — Helmor stages the platform
sub-package (`node_modules/opencode-darwin-<arch>/bin/opencode`) directly, not via that postinstall.
- Rust pipeline consumes `message.updated` / `message.part.*` shapes (`pipeline/accumulator/opencode.rs`).

---

## Kimi

**Integration:** the bundled `kimi` binary speaks ACP (`kimi acp`) over a hand-rolled protocol in
`sidecar/src/kimi/`. Class C — GitHub-release binary, **not** an npm dep, **not** in `package.json`.

**Pins (both in `vendor-platform.ts`):**
- `KIMI_VERSION = "X"`.
- `KIMI_SHA256["X"]` with **four** platform keys: `darwin-arm64`, `darwin-x64`, `win32-arm64`, `win32-x64`.

**Version + SHA source:** repo `MoonshotAI/kimi-code`. Release tag is the scoped npm tag, url-encoded
(including the `/`): `%40moonshot-ai%2Fkimi-code%40X`. Assets: `kimi-code-<platform>.zip`. Get each SHA
from the asset's `digest` field via the GitHub API (or the `.zip.sha256` sidecar):
```bash
curl -s "https://api.github.com/repos/MoonshotAI/kimi-code/releases/tags/%40moonshot-ai%2Fkimi-code%40X" \
| python3 -c 'import sys,json; r=json.load(sys.stdin); [print(a["name"], a.get("digest")) for a in r["assets"] if a["name"].endswith(".zip")]'
```
The `digest` is `sha256:<hex>` — pin the hex. (Verify by downloading + `shasum -a 256` if unsure; the
build hard-fails on mismatch anyway.)
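
Turning the API's `digest` value into the hex string to pin is a prefix strip plus a format check. A minimal sketch follows; `digest_to_hex` is a hypothetical helper, and it also accepts a bare 64-character hex string with no prefix.

```bash
# digest_to_hex DIGEST: strip a leading "sha256:" and print the hex,
# failing if what remains is not 64 lowercase hex characters.
digest_to_hex() {
  local hex="${1#sha256:}"
  if [[ "$hex" =~ ^[0-9a-f]{64}$ ]]; then
    printf '%s\n' "$hex"
  else
    printf 'not a sha256 digest: %s\n' "$1" >&2
    return 1
  fi
}
```

Validating the length catches a truncated copy-paste before it reaches the `KIMI_SHA256` table.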

**Gotchas:**
- **ACP protocol version.** Helmor hard-enforces `ACP_PROTOCOL_VERSION` (`sidecar/src/kimi/acp-types.ts`)
at the handshake and throws on mismatch. Kimi patch releases have not changed it, but if a release
negotiates a different version, the connection breaks — smoke-test `kimi acp`'s `initialize` response.
- Most kimi releases are TUI/web-only (no ACP changes) → often low-value bumps. Check release notes.
- A changed SHA automatically forces a re-download instead of reusing the `.bundle-cache` copy; manually
wiping `sidecar/.bundle-cache` is belt-and-suspenders, not required.

---

## Pi

**Status: DEAD CODE.** `@earendil-works/pi-agent-core` + `@earendil-works/pi-ai` are declared in
`sidecar/package.json` but **imported nowhere** (only referenced in `src-tauri/src/agents/provider_capabilities.rs`
comments as a hypothetical future provider). They drag a heavy transitive tree (anthropic sdk, aws
bedrock, google genai, mistral, openai) into the compiled sidecar.

**Recommendation: DELETE rather than bump.** Remove both lines from `package.json`, `bun install`,
then `rm -rf sidecar/node_modules/@earendil-works` (bun may leave stale orphan dirs after removal;
confirm `grep -c earendil sidecar/bun.lock` is 0).

If a future integration revives it: `^0.75.x` (caret on a 0.x package) floats only within `0.75.x`, so
a real upgrade needs editing the range. Note 0.80.0 has a breaking API rewrite (`AgentHarnessOptions.models`
required, `getApiKeyAndHeaders` removed).

---

## Supporting tools

All class C, all in `vendor-platform.ts`, none in `package.json`. macOS SHA is strict; Windows is
soft-verified (empty `""` SHA tolerated). Arch naming for gh/glab/cloudflared is `arm64`/`amd64` (not `x64`).

### gh (`GH_VERSION` + `GH_SHA256{arm64,amd64}`)
Repo `cli/cli`. SHA from `gh_<ver>_checksums.txt` at the release — pick the macOS zip rows
(`gh_<ver>_macOS_{arm64,amd64}.zip`).

### glab (`GLAB_VERSION` + `GLAB_SHA256{arm64,amd64}`)
GitLab `gitlab-org/cli`. SHA from `checksums.txt` at the release — the
`glab_<ver>_darwin_{arm64,amd64}.tar.gz` rows.

### cloudflared (`CLOUDFLARED_VERSION` + `CLOUDFLARED_SHA256{arm64,amd64}`)
Repo `cloudflare/cloudflared`. SHA = `shasum -a 256` of the release asset
`cloudflared-darwin-{arm64,amd64}.tgz` (no upstream checksums file):
```bash
curl -fsSL "https://github.com/cloudflare/cloudflared/releases/download/<ver>/cloudflared-darwin-arm64.tgz" | shasum -a 256
```

### llama.cpp (`LLAMA_VERSION` + `LLAMA_SHA256{arm64,x64}`)
Repo `ggml-org/llama.cpp`, version is a build tag (e.g. `b9763`). Asset
`llama-<ver>-bin-macos-{arm64,x64}.tar.gz`. SHA is soft-verified (the table may hold `""` for dev);
compute with `curl … | shasum -a 256` to pin for release.

### node (`NODE_VERSION` + `NODE_SHA256{darwin:{arm64,x64}, windows:{arm64,x64}}`)
The runtime that runs the cursor worker. SHA from `https://nodejs.org/dist/v<ver>/SHASUMS256.txt`
(rows `node-v<ver>-darwin-{arm64,x64}.tar.gz`, `node-v<ver>-win-{arm64,x64}.zip`). **Pin to the Node
24 line** to satisfy `@cursor/sdk`'s `>=22.13` engines floor and match Conductor's bundled runtime.
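
For the vendors whose SHA comes from an upstream checksums file (gh, glab, node), picking the right row can be scripted. This sketch assumes the usual `<sha256>  <filename>` layout produced by `shasum`/`sha256sum`; `sha_for_asset` is a hypothetical helper.

```bash
# sha_for_asset ASSET: read a checksums file on stdin and print the SHA256
# whose filename column equals ASSET. Exits non-zero if no row matches.
# Assumes whitespace-separated "<sha256>  <filename>" rows.
sha_for_asset() {
  awk -v asset="$1" '$2 == asset { print $1; found = 1 } END { exit !found }'
}

# Example (substitute the real version for <ver>):
#   curl -fsSL "https://nodejs.org/dist/v<ver>/SHASUMS256.txt" \
#     | sha_for_asset "node-v<ver>-darwin-arm64.tar.gz"
```

Matching on the exact filename avoids accidentally picking a similarly named row such as a `.tar.xz` variant.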
34 changes: 34 additions & 0 deletions .agents/skills/helmor-bump-vendors/scripts/npm_vendor_sha.sh
@@ -0,0 +1,34 @@
#!/usr/bin/env bash
# Compute the darwin arm64 + x64 npm-tarball SHA256s for a Helmor class-B staged binary,
# ready to paste into the matching table in sidecar/scripts/vendor-platform.ts.
#
# Usage: npm_vendor_sha.sh <claude-code|codex|opencode> <version>
# Example: npm_vendor_sha.sh opencode 1.17.10
#
# Prints two lines: arm64: <sha256> / x64: <sha256>
# These are SHA256 of the *.tgz tarballs (what downloadAndVerify compares), NOT the npm
# registry's sha1/sha512 dist metadata — so they must be computed from the tarball itself.
set -euo pipefail

vendor="${1:?usage: npm_vendor_sha.sh <claude-code|codex|opencode> <version>}"
version="${2:?missing version (e.g. 2.1.191)}"

tarball_url() { # $1 = arm64|x64
  local arch="$1"
  case "$vendor" in
    claude-code) printf 'https://registry.npmjs.org/@anthropic-ai/claude-code-darwin-%s/-/claude-code-darwin-%s-%s.tgz' "$arch" "$arch" "$version" ;;
    codex) printf 'https://registry.npmjs.org/@openai/codex/-/codex-%s-darwin-%s.tgz' "$version" "$arch" ;;
    opencode) printf 'https://registry.npmjs.org/opencode-darwin-%s/-/opencode-darwin-%s-%s.tgz' "$arch" "$arch" "$version" ;;
    *) printf 'unknown vendor: %s (expected claude-code|codex|opencode)\n' "$vendor" >&2; exit 2 ;;
  esac
}

for arch in arm64 x64; do
  url="$(tarball_url "$arch")"
  # A failed download (e.g. a 404) fails the pipeline under pipefail; catch it here so
  # `set -e` does not abort before the explanatory error below is printed.
  sha="$(curl -fsSL "$url" | shasum -a 256 | cut -d' ' -f1)" || sha=""
  # e3b0c442... is the SHA256 of empty input, i.e. a zero-byte download.
  if [ -z "$sha" ] || [ "$sha" = "e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855" ]; then
    printf 'ERROR: failed or empty download for %s; check the version exists: %s\n' "$arch" "$url" >&2
    exit 1
  fi
  printf '%s: %s\n' "$arch" "$sha"
done
5 changes: 5 additions & 0 deletions .changeset/bump-bundled-agents.md
@@ -0,0 +1,5 @@
---
"helmor": patch
---

Update the bundled Claude Code, Cursor, OpenCode, and Kimi coding agents to their latest versions.