Releases: VladoIvankovic/Codeep
v2.13.0
Three new providers: Kimi (Moonshot), Grok (xAI), and Qwen (Alibaba), covering the major coding models. Kimi and Qwen include their flat-fee coding-plan subscriptions alongside pay-per-use; Grok adds graded /thinking effort.
Added
- Kimi (Moonshot AI). kimi drives the Kimi Code subscription
(api.kimi.com/coding, model alias kimi-for-coding); kimi-api is
pay-per-use (api.moonshot.ai, default kimi-k2.7-code); kimi-cn is for
mainland China. Keys: KIMI_CODE_API_KEY / MOONSHOT_API_KEY.
- Qwen (Alibaba Model Studio). qwen drives the Coding Plan
subscription (coding-intl.dashscope…, sk-sp- key); qwen-api is
pay-per-use (DashScope, default qwen3-coder-plus); plus qwen-cn /
qwen-cn-api and a free ModelScope tier (modelscope). Keys:
BAILIAN_CODING_PLAN_API_KEY / DASHSCOPE_API_KEY / MODELSCOPE_API_KEY.
- Grok (xAI). grok is pay-per-use (api.x.ai), default grok-build-0.1,
plus grok-4.3 and the fast/reasoning variants. Key: XAI_API_KEY.
Changed
- /thinking now covers Grok (reasoning_effort: low/medium/high). Kimi
and the Qwen coder models have no graded knob, so they stay out of the picker.
- Qwen tool turns are sent non-streamed. DashScope rejects tools with
stream:true, so agent turns that carry tools buffer the reply (handled
transparently); other providers keep streaming.
- Kimi K2.x code models fix temperature internally, so Codeep withholds the
sampling params to keep requests from returning a 400.
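The Qwen streaming rule above can be sketched roughly as follows. This is an illustration only, not Codeep's actual code: the function name `build_request`, the `NO_STREAM_WITH_TOOLS` set, and the request shape are assumptions made for the example.

```python
# Assumption: these provider ids all route through DashScope. The release
# notes only say "Qwen tool turns", so the exact membership is a guess.
NO_STREAM_WITH_TOOLS = {"qwen", "qwen-api", "qwen-cn", "qwen-cn-api"}


def build_request(provider, model, messages, tools=None):
    """Build a chat request body; buffer only when tools would break streaming."""
    body = {"model": model, "messages": messages, "stream": True}
    if tools:
        body["tools"] = tools
        if provider in NO_STREAM_WITH_TOOLS:
            # Tools plus stream:true is rejected here, so ask for a buffered reply.
            body["stream"] = False
    return body
```

Turns without tools keep streaming on every provider, so only agent turns that carry tools pay the buffering cost.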
v2.12.0
New /thinking (alias /effort) reasoning-effort control: auto · low · medium · high · max, shown beside the model in the status bar and clamped per provider+model so it never sends a value the API rejects. Plus a Codeep agent identity and iOS-testing MCP servers.
Added
- /thinking (alias /effort): thinking / reasoning-effort tiers. A single
control with five tiers (auto · low · medium · high · max) for how hard the
model reasons. auto (default) sends nothing, so each model uses its own default.
The other tiers are clamped to the nearest level the active provider+model
actually accepts, so an unsupported value is never sent: Anthropic Opus/Sonnet
→ output_config.effort, OpenAI GPT‑5.x → reasoning_effort (Max → xhigh),
Google Gemini 3 → low/high, DeepSeek V4 & Z.AI GLM‑5.2 → high/max,
OpenRouter → unified reasoning.effort. The active tier shows next to the
model in the status bar; models without a graded knob (Haiku, GLM‑Turbo,
Ollama, custom) hide it.
- About‑Codeep persona. The agent system prompt now states what Codeep is and
points you at the right slash‑command, backed by a curated command index.
- MCP marketplace: iOS‑testing servers. Added iOS Simulator and Mobile
(iOS + Android) servers for device/UI automation.
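The tier clamping described for /thinking might look something like the sketch below. The family labels, the `SUPPORTED` and `WIRE_NAME` tables, and the tie-breaking rule (ties go to the higher tier) are assumptions for illustration; the notes only state which levels each family accepts and that Max maps to xhigh on OpenAI.

```python
TIERS = ["low", "medium", "high", "max"]  # "auto" is handled separately

# Levels each model family accepts. Keys are illustrative labels, and the
# full gpt-5.x range is an assumption (the notes only mention Max -> xhigh).
SUPPORTED = {
    "gemini-3": ["low", "high"],
    "deepseek-v4": ["high", "max"],
    "glm-5.2": ["high", "max"],
    "gpt-5.x": ["low", "medium", "high", "max"],
}

# Per-family renames applied after clamping.
WIRE_NAME = {"gpt-5.x": {"max": "xhigh"}}


def clamp_tier(tier, family):
    """Return the value to send, or None when nothing should be sent."""
    if tier == "auto":
        return None  # let the model use its own default
    allowed = SUPPORTED.get(family)
    if not allowed:
        return None  # no graded knob for this family
    want = TIERS.index(tier)
    # Nearest supported tier; on a tie prefer the higher one (assumption).
    chosen = min(allowed, key=lambda t: (abs(TIERS.index(t) - want), -TIERS.index(t)))
    return WIRE_NAME.get(family, {}).get(chosen, chosen)
```

With these tables, `clamp_tier("low", "deepseek-v4")` resolves to `"high"`, the closest level that family accepts.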
Changed
- MCP browser server is now Playwright (supersedes Puppeteer) — the de‑facto
browser‑automation MCP.
v2.11.2
Trimmed the model pickers (Claude Fable 5 is de-listed — unavailable under the US export ban — and a few older variants drop off), and editor clients (VS Code, Zed) now see API retry/backoff instead of an endless "Thinking…" spinner.
Changed
- Model picker cleanup across every provider. Claude Fable 5 is removed
from the Anthropic picker (unavailable under the US export ban; Opus 4.8 stays
the default). Z.AI drops glm-5.1 and glm-5 (keeps glm-5.2 + glm-5-turbo);
OpenAI drops gpt-5.4-nano (keeps 5.5 / 5.4 / 5.4-mini). DeepSeek, Google, and
MiniMax are unchanged. All ids remain valid if set by hand — they're just no
longer offered. Context/cost tables updated to match.
Fixed
- ACP retry visibility. When a request hit a transient API error and the
agent retried with backoff, the ACP path dropped the notice (only the bare
iteration counter was meant to be suppressed, but the retry message went with
it), so editor clients showed an indefinite "Thinking…" while the CLI was actually
retrying. Retry/backoff notices ("API 429 … retrying in 10s (1/3)") and
context warnings (⚠) are now forwarded as agent thoughts; the plain
iteration counter stays internal.
v2.11.1
Hotfix: the Z.AI default was glm-5.2[1m], but the API rejects that id ("Unknown Model", code 1211), so a fresh Z.AI session failed on its first request. The default is now plain glm-5.2 (which works), and the non-working [1m] variant is removed from the picker.
Fixed
- Z.AI default model glm-5.2[1m] returned "Unknown Model". The 1M-context
[1m] suffix from the devpack docs isn't accepted by the Z.AI chat API
endpoints Codeep uses, so it 400'd on every request. The default (and
cold-start default) is now glm-5.2, and glm-5.2[1m] is dropped from all
four Z.AI providers' model lists. If your config still points at
glm-5.2[1m], switch with /model glm-5.2.
v2.11.0
New default model GLM-5.2 (1M-context glm-5.2[1m]) across every Z.AI provider, plus TUI polish: ↑ recalls history, diffs render green/red, full / autocomplete, and /settings values now stick.
Added
- GLM-5.2 — the new default Z.AI model. Added across all four Z.AI
providers (international + China, subscription + pay-per-use): glm-5.2[1m]
(1M context; the [1m] suffix selects the million-token window) is now the
default, with plain glm-5.2 also offered; GLM-5.1, GLM-5 Turbo, and GLM-5
stay available. Context-window and cost tables include both new ids. (GLM-5.2
per-token pricing isn't published yet, so /cost mirrors GLM-5.1 for now,
and on the GLM Coding Plan billing is a flat subscription anyway.) Editor
clients pick this up automatically over ACP.
Fixed
- /settings values stick now. A block of startup "migrations" ran on
every launch and silently forced user-chosen values back up (maxTokens
below 32768, agentMaxDuration below 480 min, API timeout and rate limits),
so the affected settings were effectively lies. They now run exactly once
per config (recorded via migrationVersion); after that, what you set is
what you get.
- ↑ recalls prompt history on an empty input. The status bar has always
advertised "↑↓ history", but a scroll handler intercepted the arrows first,
so history recall was unreachable. Arrows now do history (like every shell);
scrolling lives on PgUp/PgDn and the mouse wheel.
- New messages no longer yank you to the bottom. While you're scrolled up
reading, incoming messages (every agent action, mid-run) used to reset the
view to the bottom. The view now stays put and the status bar shows a
"↓ N new · PgDn" badge until you return.
Changed
- Diff blocks render as diffs. ```diff fences, which the agent emits on
every edit confirmation, now highlight + added lines green, - removed red,
and @@ hunks cyan. Previously they fell through to JS keyword colors.
- Every command in / autocomplete has a description. 48 of 123 rows were
blank (the whole scaffold/git/devops family: /component, /pr, /docker, …)
and 9 bare single-letter aliases cluttered the list. The dropdown is now
derived from a single command registry, so a command can't ship without a
description again; the single-letter shortcuts (/c, /t, …) still work,
they're just not listed.
- ~800 lines of dead UI code removed (an unused parallel chat renderer and
two unreachable fullscreen screens). No behavior change, but edits can no
longer land in the wrong renderer by mistake.
- ACP session/new now returns the prior transcript on resume. When an
editor client reconnects with fresh: false (e.g. a VS Code window reload),
the response carries the workspace session's history (user/assistant only,
mirroring session/load) so the client can repaint the chat instead of
showing blank while the agent still holds the context. Empty on a fresh
session; older clients ignore the extra field. Powers Codeep VS Code 2.6.0's
reload restore.
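As a rough illustration of the diff highlighting in this release, the sketch below colors a diff by line prefix using standard ANSI SGR codes. The function name `colorize_diff` is hypothetical, and the real TUI renderer may well work differently (for example, this version also colors the `+++` / `---` file headers, which a fuller implementation might treat separately).

```python
GREEN, RED, CYAN, RESET = "\x1b[32m", "\x1b[31m", "\x1b[36m", "\x1b[0m"


def colorize_diff(text):
    """Color added lines green, removed lines red, and hunk headers cyan."""
    out = []
    for line in text.splitlines():
        if line.startswith("@@"):
            out.append(CYAN + line + RESET)
        elif line.startswith("+"):
            out.append(GREEN + line + RESET)
        elif line.startswith("-"):
            out.append(RED + line + RESET)
        else:
            out.append(line)  # context lines stay uncolored
    return "\n".join(out)
```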
v2.10.0
/tasks add now matches the dashboard: tag a task as a bug or feature and give it a description inline (--bug / --feature / --desc), and the list tags each row with its project when global.
Added
- Task types in /tasks add. Append --bug or --feature (or --task,
the default) to file the task under the right type on the codeep.dev
dashboard, e.g. /tasks add login button misaligned --bug. The flag can sit
anywhere in the arguments and is stripped from the title; the dashboard and
the macOS app already render the type with its own icon and color, so this
brings all three surfaces to parity (the dashboard and macOS both let you pick
a type; the CLI previously hardcoded task).
- Task descriptions in /tasks add. --desc (or --description) captures
the following words, up to the next flag, as the task's description, e.g.
/tasks add Fix login --bug --desc NPE when the email is empty. It's the same
field the dashboard and macOS app set; the /tasks list already prints it and
it's injected into the agent's task-context prompt, so a CLI-set description
immediately enriches what the agent sees. Omitted from the request when absent.
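The flag handling described for /tasks add could be approximated like this. It is a sketch under assumptions: the function name `parse_task_args` and the returned field names are invented for the example, and only the flags named in the notes are recognized.

```python
TYPE_FLAGS = {"--bug": "bug", "--feature": "feature", "--task": "task"}
DESC_FLAGS = {"--desc", "--description"}


def parse_task_args(args):
    """Split '/tasks add' arguments into title, type and optional description."""
    title_words, desc_words = [], []
    task_type = "task"  # default type when no flag is given
    in_desc = False
    for word in args.split():
        if word in TYPE_FLAGS:
            task_type = TYPE_FLAGS[word]
            in_desc = False  # a flag ends the description
        elif word in DESC_FLAGS:
            in_desc = True
        elif in_desc:
            desc_words.append(word)
        else:
            title_words.append(word)
    task = {"title": " ".join(title_words), "type": task_type}
    if desc_words:
        # The description is omitted entirely when absent.
        task["description"] = " ".join(desc_words)
    return task
```

Because flags are matched word by word, a type flag can appear before, inside, or after the title and is never left in it.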
Changed
- /tasks list tags each row with its project when listed globally. Running
/tasks outside a project lists pending tasks across all projects; each row
now shows its project name (matching the macOS and dashboard task rows) so a
mixed list is legible. Inside a project the header already names it, so rows
stay uncluttered.
- /tasks autocomplete description now reflects the full command. It
covered only "show pending tasks" and hid the add / done / delete
subcommands and the type flags from / autocomplete.
Fixed
- /stats now shows the prompt-caching summary, and a dead duplicate cost
case is gone. The session-cost view had two switch branches sharing a
case 'cost': /cost always rendered the full formatCostReport() (the
cross-surface report the editor clients use, with the prompt-caching section),
while the second branch, the detailed /stats view, was unreachable for
/cost yet was the only one missing that caching section. /stats now
reports cache reads/writes and estimated savings too (parity with /cost and
the 2.0.2 caching work), and the dead cost label was removed so the dispatch
is honest. What /cost displays is unchanged.
- /keysync now appears in / autocomplete. The command shipped in 2.8.0
with a description and an ACP entry, but was missing from the TUI command
list, so terminal users never saw it offered. (It always worked when typed.)
v2.9.0
Claude Fable 5 — Anthropic's most powerful model, a new tier above Opus — is now in the model picker ($10/$50 per MTok, 1M context). Opus 4.7 and 4.6 leave the picker (Opus 4.8 stays the default). Plus a real compatibility fix: temperature is no longer sent to models that reject it (Fable 5 / Opus 4.7+), which previously surfaced as an opaque 400.
Added
- Claude Fable 5 (claude-fable-5) in the Anthropic provider: the most
powerful Claude model, a new tier above Opus. $10 input / $50 output per
MTok, 1M context window. Pick it with /model claude-fable-5.
Changed
- Opus 4.7 and 4.6 removed from the model picker now that Opus 4.8 and
Fable 5 cover both tiers. The ids remain valid — if your config still points
at one, it keeps working; it just isn't offered for new selection.
Fixed
- temperature is no longer sent to Anthropic models that reject it.
Fable 5 and Opus 4.7+ return HTTP 400 when the request includes
temperature, and that 400 was previously masked by the tools-fallback
retry, surfacing as a generic "API error". A model-aware guard
(modelRejectsSamplingParams) now omits the parameter on those models
across all three Anthropic request paths (agent, fallback, plain chat);
omission means the API default, so behavior on other models is unchanged.
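A guard in the spirit of modelRejectsSamplingParams might be sketched as below. Only the id claude-fable-5 comes from these notes; the `claude-opus-<major>-<minor>` id format, the helper `build_body`, and the request shape are assumptions made for illustration.

```python
import re


def model_rejects_sampling_params(model):
    """True for models described as rejecting temperature (Fable 5, Opus 4.7+)."""
    if model.startswith("claude-fable-"):
        return True
    # Assumed id format: claude-opus-<major>-<minor>
    m = re.match(r"claude-opus-(\d+)-(\d+)", model)
    if m:
        return (int(m.group(1)), int(m.group(2))) >= (4, 7)
    return False


def build_body(model, messages, temperature=None):
    """Build a request body, leaving temperature out where it would cause a 400."""
    body = {"model": model, "messages": messages}
    if temperature is not None and not model_rejects_sampling_params(model):
        body["temperature"] = temperature
    return body
```

Routing every request path through one helper like `build_body` is what keeps the agent, fallback, and plain-chat paths from drifting apart.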
v2.8.0
API keys are now keychain-first and stay local by default. Syncing them to codeep.dev is an explicit opt-in (/keysync on), and codeep account purge-keys wipes any keys already on the server.
Added
- /keysync on|off|status: opt in (or out) of syncing API keys to
codeep.dev. OFF by default: your keys live only in the OS keychain unless
you enable this. When on, codeep account push / sync upload/download keys;
the command warns that synced keys are stored server-readable. Also available
in /settings, and forced off by the CODEEP_NO_KEY_SYNC env var (org policy).
- codeep account purge-keys: delete every API key stored on codeep.dev in
one shot (cloud-only; your local OS keychain is untouched). A clean exit if you
synced keys before and want them off the server.
Changed
- codeep account push / account sync no longer move API keys unless cloud
key sync is enabled (/keysync on). They still push/pull personalities,
custom commands, and your profile as before; only the secret half is gated.
Existing users who relied on key sync just run /keysync on once.
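The gating in this release can be pictured with a small sketch. The config key `keySync`, the payload field names, and the rule that any non-empty CODEEP_NO_KEY_SYNC value counts as set are all assumptions; only the env var name and the off-by-default behavior come from the notes.

```python
def key_sync_enabled(config, env):
    """Keys sync only on explicit opt-in, and never when org policy forbids it."""
    if env.get("CODEEP_NO_KEY_SYNC"):
        return False  # the env var wins over the user's setting
    return bool(config.get("keySync", False))  # off by default


def build_push_payload(config, env, profile, api_keys):
    """Always push the non-secret half; add keys only when sync is enabled."""
    payload = {"profile": profile}
    if key_sync_enabled(config, env):
        payload["apiKeys"] = dict(api_keys)
    return payload
```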
v2.7.0
A batch of review tooling: YAML review config, a codeep hook install pre-commit reviewer, codeep review --rules to list rule ids, and an opt-in codeep review --ai second opinion. Plus fixes: compiled binaries report the real version (no more "vunknown"), ACP editor sessions no longer mutate the global confirmation setting, and keychain-fallback keys get swept into the keychain once it's available.
Added
- YAML review config. .codeep/review.yml / .codeep/review.yaml are now
supported alongside .codeep/review.json (YAML preferred when present).
Single-quoted YAML keeps regex backslashes literal (pattern: '\bfoo\('),
avoiding JSON's double-escaping. Same schema; format is auto-detected.
- codeep hook install: installs a git pre-commit (or --pre-push) hook
that runs codeep review --fail-on <level> on your changes, blocking the
commit when issues at/above the threshold are found (honors .codeep/review.*,
no API key). codeep hook uninstall removes it; Codeep never overwrites a hook
it didn't create.
- codeep review --rules: lists the built-in rule ids (the values you can
put in disable in .codeep/review.*) and exits.
- codeep review --ai: opt-in. After the offline pass, asks your configured
provider for a contextual second opinion, merged into the report as a clearly
tagged advisory section. Needs an API key (degrades to deterministic-only
without one) and never affects the exit code; the deterministic review stays
authoritative, so CI (the GitHub Action) is unchanged.
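To see why single-quoted YAML is convenient for the regex in the YAML review config item, the sketch below shows the JSON side of the comparison using only the standard library. It does not parse YAML; the YAML line appears only as a comment, and the sample strings are illustrative.

```python
import json
import re

# JSON form: every regex backslash has to be doubled inside the string.
json_text = r'{"pattern": "\\bfoo\\("}'
pattern = json.loads(json_text)["pattern"]

# The decoded value is the regex exactly as single-quoted YAML would spell it:
#   pattern: '\bfoo\('
regex = re.compile(pattern)


def single_backslash_json_fails():
    """Un-doubled backslashes are not valid JSON (backslash-paren is no JSON escape)."""
    try:
        json.loads(r'{"pattern": "\bfoo\("}')
    except json.JSONDecodeError:
        return True
    return False
```

Here `regex.search("call foo(1)")` finds a match, while the un-doubled JSON text is rejected by the parser before any rule is loaded.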
Fixed
- Keychain fallback sweep. If the OS keychain was unavailable on a prior run,
API keys fell back to plaintext config. They're now swept into the keychain
automatically once it becomes available (completes the 2.5.2 key-storage work).
- Compiled binary version. The standalone binaries printed "Codeep
vunknown" because they read the version from package.json, which isn't on
disk in a compiled binary. The version is now baked in at build time, so
--version is correct everywhere (npm, Homebrew, and the standalone binaries).
- ACP confirmation setting no longer leaks/races. Manual-mode editor
sessions used to flip the global agentConfirmWriteFile config and restore it
non-atomically around each prompt, which could leak the session's mode into
the terminal app and race when prompts overlapped. Write/edit confirmation is
now scoped to the run via a per-call option, with no global config mutation.
v2.6.0
New: configurable code-review rules. Drop a .codeep/review.json into a repo to add your own deterministic review rules, disable built-in ones, and scope which files are reviewed, enforced the same way by codeep review (CLI) and the Codeep GitHub Action, with zero LLM cost.
Added
- .codeep/review.json: review rules as config. The deterministic
reviewer (codeep review, /review --static, and the GitHub Action) now
reads a per-project config:
  - rules: your own checks, each with id, pattern (regex), message (required)
  plus optional flags, category, severity, suggestion, extensions.
  - disable: turn off built-in rules by id (each built-in now has a stable
  id, e.g. eval-usage, todo-comment, any-type, long-file).
  - include / exclude: glob scoping (**, *, ?).
A missing, malformed, or partially-invalid config never breaks a review: bad
entries are skipped with a warning and valid ones still apply.
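The "bad entries are skipped, valid ones still apply" behavior could be sketched as follows. The function name `load_rules` and the warning texts are hypothetical, and the sketch compiles patterns with Python's `re`, whose syntax differs in places from the regex dialect Codeep itself uses.

```python
import re

REQUIRED = ("id", "pattern", "message")


def load_rules(config):
    """Return (valid_rules, warnings); never raises on a bad config."""
    rules, warnings = [], []
    if not isinstance(config, dict):
        return rules, ["config is not an object; ignoring it"]
    entries = config.get("rules", [])
    if not isinstance(entries, list):
        return rules, ["rules is not a list; ignoring it"]
    for i, entry in enumerate(entries):
        ok = isinstance(entry, dict) and all(
            isinstance(entry.get(key), str) for key in REQUIRED
        )
        if not ok:
            warnings.append(f"rule #{i}: needs id, pattern and message; skipped")
            continue
        try:
            re.compile(entry["pattern"])
        except re.error as exc:
            warnings.append(f"rule {entry['id']}: bad regex ({exc}); skipped")
            continue
        rules.append(entry)
    return rules, warnings
```

A config with one good rule and two broken ones therefore yields one active rule and two warnings rather than a failed review.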
Security
- Hardened the reviewer against untrusted custom rules. Since a PR's
.codeep/review.json runs in CI via the Action, custom regexes are screened
at load (length cap + a catastrophic-backtracking/ReDoS heuristic), the match
loop guards zero-width patterns (no infinite loop) and caps matches per rule,
and the GitHub Action bounds each review's wall-clock at 180s.
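The screening and match-loop guards described above might look roughly like this. The caps (`MAX_PATTERN_LEN`, `MAX_MATCHES_PER_RULE`) and the nested-quantifier heuristic are placeholders chosen for the example, not Codeep's actual limits or detection logic.

```python
import re

MAX_PATTERN_LEN = 200        # hypothetical length cap
MAX_MATCHES_PER_RULE = 100   # hypothetical per-rule match cap
# Crude ReDoS heuristic: a quantified group that itself contains a
# quantifier, e.g. (a+)+ or (x*)*
NESTED_QUANTIFIER = re.compile(r"\([^()]*[+*][^()]*\)[+*{]")


def screen_pattern(pattern):
    """Return True when a custom pattern passes the load-time checks."""
    if len(pattern) > MAX_PATTERN_LEN:
        return False
    if NESTED_QUANTIFIER.search(pattern):
        return False
    return True


def find_matches(regex, text):
    """Collect match spans, guarding zero-width matches and capping the count."""
    spans, pos = [], 0
    while pos <= len(text) and len(spans) < MAX_MATCHES_PER_RULE:
        m = regex.search(text, pos)
        if m is None:
            break
        spans.append(m.span())
        # A zero-width match would otherwise repeat at the same position
        # forever, so step forward by one character in that case.
        pos = m.end() if m.end() > m.start() else m.end() + 1
    return spans
```

A heuristic like this will miss some pathological patterns and flag some harmless ones, which is presumably why a wall-clock bound on the whole review is kept as a backstop.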