From 246933e9a16f931108de907fdb8ea0077cf204b0 Mon Sep 17 00:00:00 2001
From: Nika Siradze <nikushasiradzee@gmail.com>
Date: Thu, 11 Jun 2026 11:46:32 +0400
Subject: [PATCH 01/41] Add parity harness to lock feature correctness through
 native-CLI migration

Two-layer correctness model around the existing transcript JSON contract:
- Layer 1 (committed, CI-ready): synthetic neutral transcript + caption goldens
  for all 4 styles, word-text normalization + zero-duration-floor assertions.
- Layer 2 (local/gitignored): capture_baseline.py freezes current openai-whisper
  output; compare.py gates a candidate engine on WER + word-timestamp drift.

Also adds plans/native-cli.md (full migration plan).
---
 plans/native-cli.md                      | 137 +++++++++++++++++++++++
 tests/parity/.gitignore                  |   6 +
 tests/parity/README.md                   |  84 ++++++++++++++
 tests/parity/capture_baseline.py         | 110 ++++++++++++++++++
 tests/parity/compare.py                  | 129 +++++++++++++++++++++
 tests/parity/golden/branded.ass.expected |  52 +++++++++
 tests/parity/golden/hormozi.ass.expected |  18 +++
 tests/parity/golden/karaoke.ass.expected |  16 +++
 tests/parity/golden/subtle.ass.expected  |  16 +++
 tests/parity/test_caption_parity.py      | 110 ++++++++++++++++++
 tests/parity/transcript_synthetic.json   |  22 ++++
 11 files changed, 700 insertions(+)
 create mode 100644 plans/native-cli.md
 create mode 100644 tests/parity/.gitignore
 create mode 100644 tests/parity/README.md
 create mode 100644 tests/parity/capture_baseline.py
 create mode 100644 tests/parity/compare.py
 create mode 100644 tests/parity/golden/branded.ass.expected
 create mode 100644 tests/parity/golden/hormozi.ass.expected
 create mode 100644 tests/parity/golden/karaoke.ass.expected
 create mode 100644 tests/parity/golden/subtle.ass.expected
 create mode 100644 tests/parity/test_caption_parity.py
 create mode 100644 tests/parity/transcript_synthetic.json

diff --git a/plans/native-cli.md b/plans/native-cli.md
new file mode 100644
index 0000000..32b9d0e
--- /dev/null
+++ b/plans/native-cli.md
@@ -0,0 +1,137 @@
+# podcli → Native CLI (codex-style)
+
+> Goal: turn podcli from a git-clone + `setup.sh` + venv/npm hybrid into a **native CLI you install once and that auto-updates** — like `openai/codex`. Users run `npm i -g podcli` (or `bun add -g podcli`) and `podcli process video.mp4` just works, everywhere, with no Python/Node/FFmpeg setup.
+
+## North star
+
+```
+npm i -g podcli            # or: bun add -g podcli
+podcli process pod.mp4 --top 5
+  → first run: silently provisions a hermetic runtime (one time)
+  → 9:16 clips with burned captions
+podcli                     # auto-updates itself on launch
+```
+
+No `setup.sh`. No venv. No `pip`. No `npm install` of the engine. No "is the right Python/FFmpeg on PATH?" The system environment becomes irrelevant.
+
+---
+
+## Why this is hard (the core tension)
+
+codex is a single static Rust binary with **zero** runtime deps. podcli is the opposite — a **three-runtime hybrid**:
+
+- **Python engine** (`backend/cli.py`, ~188KB): Whisper (→ PyTorch ~2GB), OpenCV face-crop, Pillow, FFmpeg, Google API.
+- **Node/TS**: MCP server (`src/server.ts`), React web UI, Remotion → headless **Chromium** (studio bookends).
+- **Bash launcher** routing PodStack AI commands to Claude/Codex, everything else to Python.
+
+You can't fold PyTorch + Chromium + FFmpeg into one static binary. So we **package the hybrid**: a tiny native launcher that provisions and drives hermetic runtimes, and we **kill the single worst dependency (PyTorch) by swapping Whisper → whisper.cpp.**
+
+---
+
+## Locked decisions
+
+| Area | Decision |
+|---|---|
+| **Target artifact** | Package the hybrid. Thin Go launcher provisions + drives hermetic runtimes; self-updates. Not a rewrite. |
+| **Launcher language** | **Go.** One codebase cross-compiles (`go build` per `GOOS`/`GOARCH`) to 5 static binaries. Replaces both bash `podcli` and `install.cmd`. |
+| **Runtimes** | **Fully hermetic.** Launcher downloads pinned standalone CPython, static FFmpeg, whisper.cpp (+ Node/Chromium later). System python/node/ffmpeg ignored. |
+| **Transcription** | **whisper.cpp** replaces `openai-whisper`/PyTorch. GGML-format models (`ggml-*.bin`). Metal on Apple Silicon, CUDA/CPU elsewhere. ~145MB vs ~2GB. |
+| **Bundle model** | Tiny launcher; **first run provisions the full core stack** (download-once, like today's `setup.sh` but automatic + cross-platform). |
+| **Storage** | **Global** managed dir for runtimes + model cache (`%LOCALAPPDATA%\podcli` / `~/Library/Application Support/podcli` / `~/.local/share/podcli`). **Per-project** `.podcli/` (knowledge, output, history) stays in cwd — podcli stays project-scoped like git. |
+| **Distribution** | **npm + bun only.** Thin wrapper package fetches the platform binary on install (codex-style). **No code signing, no brew/winget/curl\|sh/.exe.** |
+| **Auto-update** | On launch: fast (~250ms, short-timeout) check against GitHub Releases. Newer → update then load. Offline/slow → proceed on current version (never blocks). Self-replace the managed binary in `~/.podcli/bin/`; if that's impossible, print the matching upgrade command (`npm i -g podcli` / `bun add -g podcli`). |
+| **Update opt-out** | Persistent off switch: `podcli config set update.auto off` + `PODCLI_NO_UPDATE=1`. When off: no checks, runs installed version. `podcli update` still works on demand. |
+| **AI features** | API key preferred → AI-CLI fallback → core works without. If a key is set, call the Claude/OpenAI API directly (self-contained); else shell to installed Claude/Codex CLI (today's behavior); else the video pipeline still works and AI features print how to enable them. |
+| **Platforms** | macOS arm64, macOS x64, Linux x64, Linux arm64, Windows x64. |
+| **First milestone** | **Thin vertical slice** — `process` pipeline only, fully hermetic, whisper.cpp, npm/bun, self-update, all 5 platforms. Studio / AI-API / MCP come after. |
+
+---
+
+## Target architecture
+
+```
+┌─ podcli (Go launcher, ~8MB, per-platform) ──────────────────────────┐
+│  • on-launch self-update (GitHub Releases, throttle-free fast check) │
+│  • first-run provisioning → global managed dir                       │
+│  • subcommand routing: process/studio/thumbnails… → hermetic python  │
+│                         studio render            → hermetic node      │
+│  • config, version pinning, rollback                                 │
+└──────────────────────────────────────────────────────────────────────┘
+                         │ provisions (pinned versions)
+                         ▼
+   Global managed dir (~/.local/share/podcli, etc.)
+     bin/        podcli-<version>            (the real engine binary, self-updatable)
+     runtime/    cpython-standalone/  ffmpeg  whisper.cpp  (+ node/ later)
+     models/     ggml-base-q5_1.bin  …        (fetched/cached)
+     venv/       hermetic pip env for backend/ deps (opencv, Pillow, …)
+
+   Per-project (cwd)/.podcli/
+     knowledge/  output/  history/  presets/  cache/      (unchanged)
+```
+
+**Subcommand routing (MVP):** `process` and friends → `runtime/cpython/python backend/cli.py …` with all paths pointing at the hermetic runtime. The Go launcher sets `PYTHON`, `FFMPEG`, model paths, and env so `cli.py` never touches the system.
+
+---
+
+## Transcription swap (the keystone engine change)
+
+Clean seam: `backend/services/transcription.py::transcribe_file()` returns a fixed dict (`segments`, word-level `words`, `duration`, `language`, speaker fields). Only the engine behind it changes.
+
+- Replace `import whisper; model.transcribe(..., word_timestamps=True)` with a subprocess call to the vendored `whisper-cli` (whisper.cpp) emitting JSON, then map its output → the existing dict shape.
+- **Validation risk to prove early:** whisper.cpp word-level timestamps must be good enough for the karaoke/word-highlight captions. Build a parity test (same clip, compare word timings old vs new) before committing.
+- Diarization is already optional/off by default (Claude handles speakers; paste-transcript supports `Speaker (MM:SS)`), so it's not a blocker.
+- Models: ship/fetch `ggml-base-q5_1` (~57MB) by default; allow `--model small/medium/large` to lazily fetch bigger GGML models into `models/`.
+
+---
+
+## Roadmap
+
+### Phase 0 — Foundation spike (de-risk everything)
+- Go launcher skeleton: arg parse, subcommand passthrough to a hand-placed python.
+- Managed-dir layout + OS-appropriate paths.
+- Hermetic provisioning: download pinned **CPython standalone**, **static FFmpeg**, **whisper.cpp** binary + base-q5 model for the **current** platform; create hermetic venv; `pip install` backend deps into it.
+- Prove `go run . process sample.mp4` produces a clip using **only** hermetic components (rename/hide system python+ffmpeg to verify).
+
+### Phase 1 — whisper.cpp engine swap
+- Reimplement `transcribe_file()` on whisper.cpp behind the existing dict contract.
+- Word-timestamp parity test vs `openai-whisper` on a fixture; tune `--max-len`/token-timestamp flags.
+- Remove `openai-whisper` from `requirements.txt`; confirm captions (karaoke/Hormozi/subtle) still render correctly.
+
+### Phase 2 — Distribution + self-update (→ first installable release)
+- CI matrix builds the Go launcher for all **5 targets**.
+- **npm + bun wrapper package**: `postinstall` downloads the platform binary into `~/.podcli/bin/`; `bin` shim execs it. Publish to npm (bun consumes the same registry).
+- GitHub Release per version carries: the 5 binaries + a **version manifest** pinning required runtime versions (so an update knows what to re-provision).
+- Self-update: fast on-launch check, atomic self-replace of the managed binary, npm/bun fallback message, `PODCLI_NO_UPDATE` + `config set update.auto off`, `podcli update`, keep-previous-binary for `podcli rollback`.
+- **Ship.** `npm i -g podcli` → `podcli process` works hermetic on all 5 platforms and auto-updates. ← *this is the MVP gate.*
+
+### Phase 3 — Lazy tiers + studio
+- Demote OpenCV face-crop to lazy (center-crop default offline; fetch opencv on first smart crop — the center-crop fallback already exists at `cli.py:621`).
+- Lazy bigger Whisper models.
+- **Studio**: provision hermetic **Node + Remotion + Chromium** on first `studio` use; route `studio` render through it.
+
+### Phase 4 — AI goes native
+- Port PodStack prompt files (`.claude/commands/*.md`) into the engine as direct **Claude API** calls.
+- `podcli config set api-key …`; precedence: API key → installed Claude/Codex CLI → "enable AI" hint.
+- Keep core pipeline fully functional without any AI.
+
+### Phase 5 — MCP / web UI (or deprecate)
+- Decide whether the MCP server still matters once AI is native via API, or it's only for "use podcli from inside Claude/Codex."
+- If kept: `podcli serve` (MCP stdio) + `podcli ui` (web dashboard) provisioned on demand via hermetic Node.
+
+---
+
+## Risks / open questions
+
+- **whisper.cpp timestamp quality** for word-highlight captions — *prove in Phase 1 before deleting PyTorch path.*
+- **Hermetic Node + Chromium on Windows** for studio (Phase 3) — Remotion's Chromium download + headless render is the heaviest non-ML surface; expect platform-specific pain.
+- **First-run download size/time** — set expectations with a clear progress UI; cache aggressively in the global dir.
+- **npm self-update vs package-manager ownership** — managed-binary-in-`~/.podcli/bin` sidesteps it; fallback message covers the rest.
+- **GPU acceleration** — whisper.cpp Metal (mac) is automatic; CUDA (linux/win) needs the right prebuilt — decide CPU-only baseline + optional CUDA fetch.
+- **Versioning** — single version for launcher + manifest of pinned runtime versions; SemVer; changelog drives the "update available" line.
+- **MCP/web UI fate** — genuinely open; resolve at Phase 5.
+
+---
+
+## Immediate next step
+
+Start **Phase 0** on a branch: stand up the Go launcher + hermetic provisioning for the current platform (darwin/arm64) and get `process` running end-to-end against only hermetic components. That single spike validates the launcher, the managed-dir model, hermetic provisioning, and the whisper.cpp integration surface in one shot.
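Note on the **Storage** row of plans/native-cli.md: the per-OS global managed directory could be resolved roughly as sketched below. This is an illustration in Python only; the plan places this logic in the Go launcher. `managed_dir` and its parameters are hypothetical names, and the fallback to `~/AppData/Local` when `LOCALAPPDATA` is unset is an assumption, not something the plan specifies.

```python
import os
import sys
from pathlib import Path
from typing import Mapping, Optional


def managed_dir(
    platform: str = sys.platform,
    env: Optional[Mapping[str, str]] = None,
    home: Optional[Path] = None,
) -> Path:
    """Return the global podcli managed dir (hypothetical helper, not in the patch)."""
    env = os.environ if env is None else env
    home = Path.home() if home is None else home
    if platform.startswith("win"):
        # Plan: %LOCALAPPDATA%\podcli. The fallback below is an assumption.
        base = env.get("LOCALAPPDATA")
        root = Path(base) if base else home / "AppData" / "Local"
        return root / "podcli"
    if platform == "darwin":
        # Plan: ~/Library/Application Support/podcli
        return home / "Library" / "Application Support" / "podcli"
    # Plan: ~/.local/share/podcli (Linux and anything else)
    return home / ".local" / "share" / "podcli"
```

Passing `platform`, `env`, and `home` explicitly keeps the function testable on any host; the per-project `.podcli/` directory is unaffected and stays in the current working directory.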
diff --git a/tests/parity/.gitignore b/tests/parity/.gitignore
new file mode 100644
index 0000000..7ea24b7
--- /dev/null
+++ b/tests/parity/.gitignore
@@ -0,0 +1,6 @@
+# Real-audio fixtures and captured baselines may contain podcast content —
+# never commit them. The committable parity surface is the synthetic transcript,
+# the goldens (*.ass.expected), and the harness code.
+local/
+baseline/
+candidate/
diff --git a/tests/parity/README.md b/tests/parity/README.md
new file mode 100644
index 0000000..87ea258
--- /dev/null
+++ b/tests/parity/README.md
@@ -0,0 +1,84 @@
+# Parity harness — keeping every feature correct through the native-CLI migration
+
+This harness is the safety net for `plans/native-cli.md` (Go launcher, hermetic
+runtimes, **whisper.cpp** replacing openai-whisper/PyTorch). Its job: prove that
+swapping the transcription engine and relocating the runtime does **not** change
+what podcli produces.
+
+## The correctness model: two layers split by a contract that already exists
+
+The transcript JSON `{words, segments, ...}` is already a stable, multi-producer
+contract (produced by `transcribe_file`, `parse_speaker_transcript`, raw JSON
+import, **and** the on-disk cache; consumed by corrections, cropping, moment
+selection, and captions). That seam lets correctness decompose into two layers
+that are verified independently.
+
+### Layer 1 — everything *downstream* of the transcript JSON
+Moments → crop → captions → normalize → encode. **This code does not change** in
+the migration; only the runtime relocates (system → hermetic). So the rule is
+absolute: a fixed transcript must yield identical output. Any difference is a
+runtime *pinning* bug, never a logic change.
+
+- `transcript_synthetic.json` — neutral, no podcast content; packs the word-text
+  edge cases (leading-space token, number+symbol, punctuation, apostrophe,
+  whitespace-only token, zero-duration token, speaker change).
+- `test_caption_parity.py` — renders all four caption styles from that transcript
+  and diffs against committed goldens (`golden/*.ass.expected`). Also pins the
+  **word-text normalization** the whisper.cpp boundary must reproduce exactly
+  (the single highest-risk integration detail) and the 50ms zero-duration floor.
+
+Run it (fast, no media, CI-friendly):
+
+```
+venv/bin/python3 -m pytest tests/parity/test_caption_parity.py -q
+```
+
+Intentionally update goldens after a deliberate change:
+
+```
+UPDATE_GOLDENS=1 venv/bin/python3 -m pytest tests/parity/test_caption_parity.py -q
+```
+
+### Layer 2 — the engine (`audio → transcript JSON`)
+The only real change. Verified by comparing the new engine's output to a frozen
+openai-whisper baseline, with **forgiving** tolerances — because the caption
+pipeline already runs in production on evenly-spaced *synthetic* word timings
+(`transcript_parser.py:306`), so absolute timestamp fidelity is a quality nicety,
+not a correctness requirement.
+
+1. Capture the baseline **now**, while openai-whisper still works. Drop a few
+   short representative clips into `tests/parity/local/` (single speaker, two
+   speakers, music-heavy, fast speech), then:
+
+   ```
+   venv/bin/python3 tests/parity/capture_baseline.py
+   ```
+
+   Writes `baseline/<stem>/` = transcript.json + metrics.json + captions per style.
+
+2. Later, run whisper.cpp into `candidate/<stem>/transcript.json` and gate:
+
+   ```
+   venv/bin/python3 tests/parity/compare.py <stem>
+   ```
+
+   Checks WER and word-timestamp drift (median/p95) against thresholds
+   (`PARITY_MAX_WER`, `PARITY_MAX_MEDIAN_DRIFT`, `PARITY_MAX_P95_DRIFT`). Nonzero
+   exit = regression; wire it into CI as the swap gate.
+
+## Why this makes "everything still works" tractable
+
+- **Layer 1 is identical by construction** — pinned runtime + frozen-transcript
+  goldens. The boring 80% can't drift silently.
+- **Layer 2 is bounded against a floor the app already ships** — whisper.cpp only
+  has to beat evenly-spaced timings, which it is expected to do comfortably.
+- **The cache protects existing work** — already-transcribed videos reuse their
+  openai-whisper JSON, so they produce byte-identical output under the new binary.
+- **Dual-engine release** (`--engine whisper-py`, planned) gives an instant
+  real-world fallback while whisper.cpp proves itself.
+
+## What is committed vs local
+
+Committed: this README, `transcript_synthetic.json`, `golden/*.ass.expected`,
+the harness scripts. **Never committed** (`.gitignore`): `local/` fixtures,
+`baseline/`, `candidate/` — they can contain podcast content.
diff --git a/tests/parity/capture_baseline.py b/tests/parity/capture_baseline.py
new file mode 100644
index 0000000..2fe593e
--- /dev/null
+++ b/tests/parity/capture_baseline.py
@@ -0,0 +1,110 @@
+"""Layer-2 engine baseline capture.
+
+Runs the CURRENT transcription engine (openai-whisper) on real-audio fixtures
+and freezes its output as ground truth. When the engine is later swapped to
+whisper.cpp, `compare.py` measures the candidate against this baseline with
+explicit tolerances.
+
+The baseline is the transcript JSON contract (`words` + `segments`) plus a
+metrics summary and the captions rendered from those real words — i.e. exactly
+the things whisper.cpp must reproduce.
+
+Usage:
+    venv/bin/python3 tests/parity/capture_baseline.py FIXTURE [FIXTURE ...]
+    venv/bin/python3 tests/parity/capture_baseline.py            # scans tests/parity/local/
+
+Outputs (gitignored) under tests/parity/baseline/<stem>/:
+    transcript.json      full {words, segments, duration, language}
+    metrics.json         n_words, n_segments, duration, language, engine, model
+    captions_<style>.ass captions rendered from the real transcript words
+
+Capture this WHILE the current system still works — it is the frozen reference
+everything else is measured against.
+"""
+
+import json
+import os
+import sys
+
+HERE = os.path.dirname(os.path.abspath(__file__))
+ROOT = os.path.abspath(os.path.join(HERE, "..", ".."))
+BACKEND_ROOT = os.path.join(ROOT, "backend")
+if BACKEND_ROOT not in sys.path:
+    sys.path.insert(0, BACKEND_ROOT)
+
+LOCAL_DIR = os.path.join(HERE, "local")
+BASELINE_DIR = os.path.join(HERE, "baseline")
+MEDIA_EXTS = (".mp4", ".mov", ".mkv", ".wav", ".mp3", ".m4a", ".aac")
+STYLES = ["hormozi", "karaoke", "subtle", "branded"]
+MODEL = os.environ.get("PARITY_MODEL", "base")
+
+
+def _discover():
+    if not os.path.isdir(LOCAL_DIR):
+        return []
+    return [
+        os.path.join(LOCAL_DIR, f)
+        for f in sorted(os.listdir(LOCAL_DIR))
+        if f.lower().endswith(MEDIA_EXTS)
+    ]
+
+
+def capture(media_path: str):
+    from services.transcription import transcribe_file
+    from services.caption_renderer import render_captions
+
+    stem = os.path.splitext(os.path.basename(media_path))[0]
+    out_dir = os.path.join(BASELINE_DIR, stem)
+    os.makedirs(out_dir, exist_ok=True)
+
+    print(f"  transcribing {os.path.basename(media_path)} (model={MODEL}) ...")
+    result = transcribe_file(
+        file_path=media_path,
+        model_size=MODEL,
+        enable_diarization=False,  # deterministic; diarization is optional/off by default
+    )
+    words = result.get("words", [])
+    segments = result.get("segments", [])
+
+    with open(os.path.join(out_dir, "transcript.json"), "w", encoding="utf-8") as f:
+        json.dump(result, f, indent=2, ensure_ascii=False)
+
+    metrics = {
+        "engine": "openai-whisper",
+        "model": MODEL,
+        "n_words": len(words),
+        "n_segments": len(segments),
+        "duration": result.get("duration"),
+        "language": result.get("language"),
+    }
+    with open(os.path.join(out_dir, "metrics.json"), "w", encoding="utf-8") as f:
+        json.dump(metrics, f, indent=2)
+
+    for style in STYLES:
+        render_captions(words, style, os.path.join(out_dir, f"captions_{style}.ass"))
+
+    print(f"  -> {out_dir}  ({len(words)} words, {len(segments)} segments)")
+    return metrics
+
+
+def main(argv):
+    fixtures = argv[1:] or _discover()
+    if not fixtures:
+        print(
+            "No fixtures. Pass media paths, or drop clips into tests/parity/local/.\n"
+            "Use short representative clips (single speaker, two speakers, music-heavy, "
+            "fast speech). They stay local — never committed.",
+            file=sys.stderr,
+        )
+        return 1
+    for media in fixtures:
+        if not os.path.exists(media):
+            print(f"  skip (not found): {media}", file=sys.stderr)
+            continue
+        capture(media)
+    print("\nBaseline captured. Compare a new engine later with tests/parity/compare.py")
+    return 0
+
+
+if __name__ == "__main__":
+    raise SystemExit(main(sys.argv))
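A candidate engine's `transcript.json` needs the same shape as the baseline frozen by `capture_baseline.py` before a comparison means anything. A minimal shape check might look like the sketch below. `validate_transcript` is a hypothetical helper; `word` and `start` are the keys `compare.py` reads, while the `end` key and the `start <= end` rule are assumptions about the contract.

```python
from typing import Any, Dict, List


def validate_transcript(data: Dict[str, Any]) -> List[str]:
    """Return a list of problems; an empty list means the shape looks sane."""
    problems: List[str] = []
    for key in ("words", "segments"):
        if not isinstance(data.get(key), list):
            problems.append(f"missing or non-list '{key}'")
    words = data.get("words")
    if not isinstance(words, list):
        words = []
    for i, w in enumerate(words):
        if not isinstance(w, dict) or not isinstance(w.get("word"), str):
            problems.append(f"words[{i}]: missing string 'word'")
            continue
        try:
            start, end = float(w["start"]), float(w["end"])
        except (KeyError, TypeError, ValueError):
            problems.append(f"words[{i}]: 'start'/'end' must be numbers")
            continue
        # Zero-duration tokens (start == end) are allowed; reversed ones are not.
        if end < start:
            problems.append(f"words[{i}]: end {end} precedes start {start}")
    return problems
```

Returning a list of problems instead of raising on the first one makes it easier to see every contract violation from a new engine in a single run.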
diff --git a/tests/parity/compare.py b/tests/parity/compare.py
new file mode 100644
index 0000000..063a9f3
--- /dev/null
+++ b/tests/parity/compare.py
@@ -0,0 +1,129 @@
+"""Compare a candidate transcription engine against the captured baseline.
+
+Layer-2 gate: bounds whisper.cpp (or any new engine) against openai-whisper
+ground truth on text accuracy and word-timestamp drift. The acceptance bar is
+deliberately forgiving because the caption pipeline already runs in production
+on evenly-spaced synthetic word timings (see transcript_parser.py) — absolute
+timestamp fidelity is a quality nicety, not a correctness requirement.
+
+Usage:
+    venv/bin/python3 tests/parity/compare.py <stem>
+        compares baseline/<stem>/transcript.json vs candidate/<stem>/transcript.json
+    venv/bin/python3 tests/parity/compare.py <baseline.json> <candidate.json>
+
+Exit code is nonzero if any threshold is exceeded — wire it into CI as the gate
+for the whisper.cpp swap.
+"""
+
+import json
+import os
+import sys
+
+HERE = os.path.dirname(os.path.abspath(__file__))
+
+# Thresholds — tune against real fixtures, then lock in CI.
+MAX_WER = float(os.environ.get("PARITY_MAX_WER", "0.08"))           # 8% word error
+MAX_MEDIAN_DRIFT = float(os.environ.get("PARITY_MAX_MEDIAN_DRIFT", "0.10"))  # 100ms
+MAX_P95_DRIFT = float(os.environ.get("PARITY_MAX_P95_DRIFT", "0.30"))        # 300ms
+
+
+def _norm(w: str) -> str:
+    return (w or "").strip().lower().strip(".,!?;:\"'")
+
+
+def _words(path: str):
+    with open(path, encoding="utf-8") as f:
+        data = json.load(f)
+    return data.get("words", []) if isinstance(data, dict) else data
+
+
+def _wer(ref_tokens, hyp_tokens) -> float:
+    """Word error rate via Levenshtein distance over token sequences."""
+    n, m = len(ref_tokens), len(hyp_tokens)
+    if n == 0:
+        return 0.0 if m == 0 else 1.0
+    prev = list(range(m + 1))
+    for i in range(1, n + 1):
+        cur = [i] + [0] * m
+        for j in range(1, m + 1):
+            cost = 0 if ref_tokens[i - 1] == hyp_tokens[j - 1] else 1
+            cur[j] = min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + cost)
+        prev = cur
+    return prev[m] / n
+
+
+def _drift(ref, hyp):
+    """Median / p95 abs start-time diff over the order-preserving common
+    subsequence of matching word tokens. Words present in only one transcript
+    are ignored for drift (they're counted by WER)."""
+    rt = [_norm(w.get("word", "")) for w in ref]
+    ht = [_norm(w.get("word", "")) for w in hyp]
+    n, m = len(rt), len(ht)
+    # LCS backtrace to align matching words by position.
+    dp = [[0] * (m + 1) for _ in range(n + 1)]
+    for i in range(1, n + 1):
+        for j in range(1, m + 1):
+            dp[i][j] = dp[i - 1][j - 1] + 1 if rt[i - 1] == ht[j - 1] else max(dp[i - 1][j], dp[i][j - 1])
+    i, j, diffs = n, m, []
+    while i > 0 and j > 0:
+        if rt[i - 1] == ht[j - 1]:
+            try:
+                diffs.append(abs(float(ref[i - 1]["start"]) - float(hyp[j - 1]["start"])))
+            except (KeyError, TypeError, ValueError):
+                pass
+            i, j = i - 1, j - 1
+        elif dp[i - 1][j] >= dp[i][j - 1]:
+            i -= 1
+        else:
+            j -= 1
+    diffs.sort()
+    if not diffs:
+        return None, None, 0
+    median = diffs[len(diffs) // 2]
+    p95 = diffs[min(len(diffs) - 1, int(len(diffs) * 0.95))]
+    return median, p95, len(diffs)
+
+
+def compare(baseline_path: str, candidate_path: str) -> bool:
+    ref, hyp = _words(baseline_path), _words(candidate_path)
+    wer = _wer([_norm(w.get("word", "")) for w in ref], [_norm(w.get("word", "")) for w in hyp])
+    median, p95, matched = _drift(ref, hyp)
+
+    print(f"  baseline words: {len(ref)}   candidate words: {len(hyp)}")
+    print(f"  WER:            {wer:.3f}   (max {MAX_WER})")
+    if median is None:
+        print("  drift:          n/a (no aligned words)")
+    else:
+        print(f"  drift median:   {median:.3f}s (max {MAX_MEDIAN_DRIFT})   p95: {p95:.3f}s (max {MAX_P95_DRIFT})   aligned: {matched}")
+
+    ok = wer <= MAX_WER
+    if median is not None:
+        ok = ok and median <= MAX_MEDIAN_DRIFT and p95 <= MAX_P95_DRIFT
+    print("  RESULT:        ", "PASS" if ok else "FAIL")
+    return ok
+
+
+def _resolve(arg):
+    cand = os.path.join(HERE, "candidate", arg, "transcript.json")
+    base = os.path.join(HERE, "baseline", arg, "transcript.json")
+    if os.path.exists(base) and os.path.exists(cand):
+        return base, cand
+    return arg, None
+
+
+def main(argv):
+    if len(argv) == 2:
+        base, cand = _resolve(argv[1])
+        if cand is None:
+            print("Need baseline/<stem> and candidate/<stem>, or two explicit paths.", file=sys.stderr)
+            return 2
+    elif len(argv) == 3:
+        base, cand = argv[1], argv[2]
+    else:
+        print(__doc__)
+        return 2
+    return 0 if compare(base, cand) else 1
+
+
+if __name__ == "__main__":
+    raise SystemExit(main(sys.argv))
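To get a feel for the default `PARITY_MAX_WER` of 0.08, it helps to run the metric by hand. The sketch below restates the same Levenshtein-based WER in standalone form and applies it to a five-word reference. The function name `wer` and the sample sentences are illustrative and not part of the patch.

```python
from typing import Sequence


def wer(ref: Sequence[str], hyp: Sequence[str]) -> float:
    """Word error rate: token-level edit distance divided by reference length."""
    n, m = len(ref), len(hyp)
    if n == 0:
        return 0.0 if m == 0 else 1.0
    prev = list(range(m + 1))  # distances from the empty reference prefix
    for i in range(1, n + 1):
        cur = [i] + [0] * m
        for j in range(1, m + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            # deletion, insertion, substitution/match
            cur[j] = min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + cost)
        prev = cur
    return prev[m] / n


ref = "the james webb telescope cost".split()
print(wer(ref, ref))                                     # 0.0
print(wer(ref, "the james web telescope cost".split()))  # 0.2 (one substitution)
print(wer(ref, "the james telescope cost".split()))      # 0.2 (one deletion)
```

Because the denominator is the reference length, a single word-level difference only fits under the 0.08 default once the reference has at least 13 words (1/13 ≈ 0.077, while 1/12 ≈ 0.083), which is worth keeping in mind when choosing very short fixtures.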
diff --git a/tests/parity/golden/branded.ass.expected b/tests/parity/golden/branded.ass.expected
new file mode 100644
index 0000000..e6e5650
--- /dev/null
+++ b/tests/parity/golden/branded.ass.expected
@@ -0,0 +1,52 @@
+[Script Info]
+Title: Podcast Clip Captions (Branded)
+ScriptType: v4.00+
+PlayResX: 1080
+PlayResY: 1920
+WrapStyle: 0
+ScaledBorderAndShadow: yes
+
+[V4+ Styles]
+Format: Name, Fontname, Fontsize, PrimaryColour, SecondaryColour, OutlineColour, BackColour, Bold, Italic, Underline, StrikeOut, ScaleX, ScaleY, Spacing, Angle, BorderStyle, Outline, Shadow, Alignment, MarginL, MarginR, MarginV, Encoding
+Style: BrandedNormal,Arial,80,&H00FFFFFF,&H00FFFFFF,&H90000000,&H00000000,-1,0,0,0,100,100,2,0,1,1,1,2,60,60,500,1
+
+[Events]
+Format: Layer, Start, End, Style, Name, MarginL, MarginR, MarginV, Effect, Text
+Dialogue: 0,0:00:00.00,0:00:00.34,BrandedNormal,,0,0,0,,{\an7\pos(208,1330)\p1\c&H00000000&\bord0\shad0\1a&HFF&\t(0,80,\1a&H30&)}m 15 0 l 168 0 b 176 0 183 7 183 15 l 183 85 b 183 93 176 100 168 100 l 15 100 b 7 100 0 93 0 85 l 0 15 b 0 7 7 0 15 0{\p0}
+Dialogue: 1,0:00:00.00,0:00:00.34,BrandedNormal,,0,0,0,,{\an5\pos(299,1380)}The
+Dialogue: 1,0:00:00.00,0:00:00.34,BrandedNormal,,0,0,0,,{\an5\pos(509,1380)}james
+Dialogue: 1,0:00:00.00,0:00:00.34,BrandedNormal,,0,0,0,,{\an5\pos(749,1380)}webb
+Dialogue: 0,0:00:00.34,0:00:00.72,BrandedNormal,,0,0,0,,{\an7\pos(373,1330)\p1\c&H00000000&\bord0\shad0\1a&HFF&\t(0,80,\1a&H30&)}m 15 0 l 257 0 b 265 0 272 7 272 15 l 272 85 b 272 93 265 100 257 100 l 15 100 b 7 100 0 93 0 85 l 0 15 b 0 7 7 0 15 0{\p0}
+Dialogue: 1,0:00:00.34,0:00:00.72,BrandedNormal,,0,0,0,,{\an5\pos(299,1380)}The
+Dialogue: 1,0:00:00.34,0:00:00.72,BrandedNormal,,0,0,0,,{\an5\pos(509,1380)}james
+Dialogue: 1,0:00:00.34,0:00:00.72,BrandedNormal,,0,0,0,,{\an5\pos(749,1380)}webb
+Dialogue: 0,0:00:00.72,0:00:01.05,BrandedNormal,,0,0,0,,{\an7\pos(627,1330)\p1\c&H00000000&\bord0\shad0\1a&HFF&\t(0,80,\1a&H30&)}m 15 0 l 230 0 b 238 0 245 7 245 15 l 245 85 b 245 93 238 100 230 100 l 15 100 b 7 100 0 93 0 85 l 0 15 b 0 7 7 0 15 0{\p0}
+Dialogue: 1,0:00:00.72,0:00:01.05,BrandedNormal,,0,0,0,,{\an5\pos(299,1380)}The
+Dialogue: 1,0:00:00.72,0:00:01.05,BrandedNormal,,0,0,0,,{\an5\pos(509,1380)}james
+Dialogue: 1,0:00:00.72,0:00:01.05,BrandedNormal,,0,0,0,,{\an5\pos(749,1380)}webb
+Dialogue: 0,0:00:01.05,0:00:01.62,BrandedNormal,,0,0,0,,{\an7\pos(150,1330)\p1\c&H00000000&\bord0\shad0\1a&HFF&\t(0,80,\1a&H30&)}m 15 0 l 419 0 b 427 0 434 7 434 15 l 434 85 b 434 93 427 100 419 100 l 15 100 b 7 100 0 93 0 85 l 0 15 b 0 7 7 0 15 0{\p0}
+Dialogue: 1,0:00:01.05,0:00:01.62,BrandedNormal,,0,0,0,,{\an5\pos(367,1380)}Telescope
+Dialogue: 1,0:00:01.05,0:00:01.62,BrandedNormal,,0,0,0,,{\an5\pos(669,1380)}cost
+Dialogue: 1,0:00:01.05,0:00:01.62,BrandedNormal,,0,0,0,,{\an5\pos(841,1380)}$10
+Dialogue: 0,0:00:01.62,0:00:01.95,BrandedNormal,,0,0,0,,{\an7\pos(566,1330)\p1\c&H00000000&\bord0\shad0\1a&HFF&\t(0,80,\1a&H30&)}m 15 0 l 191 0 b 199 0 206 7 206 15 l 206 85 b 206 93 199 100 191 100 l 15 100 b 7 100 0 93 0 85 l 0 15 b 0 7 7 0 15 0{\p0}
+Dialogue: 1,0:00:01.62,0:00:01.95,BrandedNormal,,0,0,0,,{\an5\pos(367,1380)}Telescope
+Dialogue: 1,0:00:01.62,0:00:01.95,BrandedNormal,,0,0,0,,{\an5\pos(669,1380)}cost
+Dialogue: 1,0:00:01.62,0:00:01.95,BrandedNormal,,0,0,0,,{\an5\pos(841,1380)}$10
+Dialogue: 0,0:00:01.95,0:00:02.40,BrandedNormal,,0,0,0,,{\an7\pos(754,1330)\p1\c&H00000000&\bord0\shad0\1a&HFF&\t(0,80,\1a&H30&)}m 15 0 l 160 0 b 168 0 175 7 175 15 l 175 85 b 175 93 168 100 160 100 l 15 100 b 7 100 0 93 0 85 l 0 15 b 0 7 7 0 15 0{\p0}
+Dialogue: 1,0:00:01.95,0:00:02.40,BrandedNormal,,0,0,0,,{\an5\pos(367,1380)}Telescope
+Dialogue: 1,0:00:01.95,0:00:02.40,BrandedNormal,,0,0,0,,{\an5\pos(669,1380)}cost
+Dialogue: 1,0:00:01.95,0:00:02.40,BrandedNormal,,0,0,0,,{\an5\pos(841,1380)}$10
+Dialogue: 0,0:00:02.40,0:00:03.10,BrandedNormal,,0,0,0,,{\an7\pos(167,1330)\p1\c&H00000000&\bord0\shad0\1a&HFF&\t(0,80,\1a&H30&)}m 15 0 l 291 0 b 299 0 306 7 306 15 l 306 85 b 306 93 299 100 291 100 l 15 100 b 7 100 0 93 0 85 l 0 15 b 0 7 7 0 15 0{\p0}
+Dialogue: 1,0:00:02.40,0:00:03.10,BrandedNormal,,0,0,0,,{\an5\pos(320,1380)}Billion.
+Dialogue: 1,0:00:02.40,0:00:03.10,BrandedNormal,,0,0,0,,{\an5\pos(598,1380)}wasn't
+Dialogue: 1,0:00:02.40,0:00:03.10,BrandedNormal,,0,0,0,,{\an5\pos(818,1380)}that
+Dialogue: 0,0:00:03.10,0:00:03.55,BrandedNormal,,0,0,0,,{\an7\pos(455,1330)\p1\c&H00000000&\bord0\shad0\1a&HFF&\t(0,80,\1a&H30&)}m 15 0 l 272 0 b 280 0 287 7 287 15 l 287 85 b 287 93 280 100 272 100 l 15 100 b 7 100 0 93 0 85 l 0 15 b 0 7 7 0 15 0{\p0}
+Dialogue: 1,0:00:03.10,0:00:03.55,BrandedNormal,,0,0,0,,{\an5\pos(320,1380)}Billion.
+Dialogue: 1,0:00:03.10,0:00:03.55,BrandedNormal,,0,0,0,,{\an5\pos(598,1380)}wasn't
+Dialogue: 1,0:00:03.10,0:00:03.55,BrandedNormal,,0,0,0,,{\an5\pos(818,1380)}that
+Dialogue: 0,0:00:03.55,0:00:03.60,BrandedNormal,,0,0,0,,{\an7\pos(724,1330)\p1\c&H00000000&\bord0\shad0\1a&HFF&\t(0,80,\1a&H30&)}m 15 0 l 173 0 b 181 0 188 7 188 15 l 188 85 b 188 93 181 100 173 100 l 15 100 b 7 100 0 93 0 85 l 0 15 b 0 7 7 0 15 0{\p0}
+Dialogue: 1,0:00:03.55,0:00:03.60,BrandedNormal,,0,0,0,,{\an5\pos(320,1380)}Billion.
+Dialogue: 1,0:00:03.55,0:00:03.60,BrandedNormal,,0,0,0,,{\an5\pos(598,1380)}wasn't
+Dialogue: 1,0:00:03.55,0:00:03.60,BrandedNormal,,0,0,0,,{\an5\pos(818,1380)}that
+Dialogue: 0,0:00:03.55,0:00:04.20,BrandedNormal,,0,0,0,,{\an7\pos(296,1330)\p1\c&H00000000&\bord0\shad0\1a&HFF&\t(0,80,\1a&H30&)}m 15 0 l 472 0 b 480 0 487 7 487 15 l 487 85 b 487 93 480 100 472 100 l 15 100 b 7 100 0 93 0 85 l 0 15 b 0 7 7 0 15 0{\p0}
+Dialogue: 1,0:00:03.55,0:00:04.20,BrandedNormal,,0,0,0,,{\an5\pos(539,1380)}Expensive?
diff --git a/tests/parity/golden/hormozi.ass.expected b/tests/parity/golden/hormozi.ass.expected
new file mode 100644
index 0000000..2092e44
--- /dev/null
+++ b/tests/parity/golden/hormozi.ass.expected
@@ -0,0 +1,18 @@
+[Script Info]
+Title: Podcast Clip Captions
+ScriptType: v4.00+
+PlayResX: 1080
+PlayResY: 1920
+WrapStyle: 0
+ScaledBorderAndShadow: yes
+
+[V4+ Styles]
+Format: Name, Fontname, Fontsize, PrimaryColour, SecondaryColour, OutlineColour, BackColour, Bold, Italic, Underline, StrikeOut, ScaleX, ScaleY, Spacing, Angle, BorderStyle, Outline, Shadow, Alignment, MarginL, MarginR, MarginV, Encoding
+Style: Default,Arial,80,&H00FFFFFF,&H00FFFFFF,&H00000000,&H80000000,-1,0,0,0,100,100,0,0,3,0,0,2,40,40,180,1
+
+[Events]
+Format: Layer, Start, End, Style, Name, MarginL, MarginR, MarginV, Effect, Text
+Dialogue: 0,0:00:00.00,0:00:01.05,Default,,0,0,0,,{\c&H0000FFFF\2c&H00FFFFFF}{\kf34}THE {\kf37}JAMES {\kf33}WEBB
+Dialogue: 0,0:00:01.05,0:00:02.40,Default,,0,0,0,,{\c&H0000FFFF\2c&H00FFFFFF}{\kf57}TELESCOPE {\kf32}COST {\kf44}$10
+Dialogue: 0,0:00:02.40,0:00:03.60,Default,,0,0,0,,{\c&H0000FFFF\2c&H00FFFFFF}{\kf55}BILLION. {\kf44}WASN'T {\kf4}THAT
+Dialogue: 0,0:00:03.55,0:00:04.20,Default,,0,0,0,,{\c&H0000FFFF\2c&H00FFFFFF}{\kf65}EXPENSIVE?
diff --git a/tests/parity/golden/karaoke.ass.expected b/tests/parity/golden/karaoke.ass.expected
new file mode 100644
index 0000000..c7d5337
--- /dev/null
+++ b/tests/parity/golden/karaoke.ass.expected
@@ -0,0 +1,16 @@
+[Script Info]
+Title: Podcast Clip Captions
+ScriptType: v4.00+
+PlayResX: 1080
+PlayResY: 1920
+WrapStyle: 0
+ScaledBorderAndShadow: yes
+
+[V4+ Styles]
+Format: Name, Fontname, Fontsize, PrimaryColour, SecondaryColour, OutlineColour, BackColour, Bold, Italic, Underline, StrikeOut, ScaleX, ScaleY, Spacing, Angle, BorderStyle, Outline, Shadow, Alignment, MarginL, MarginR, MarginV, Encoding
+Style: Default,Arial,60,&H00808080,&H00808080,&H00000000,&H80000000,0,0,0,0,100,100,0,0,1,3,1,2,40,40,160,1
+
+[Events]
+Format: Layer, Start, End, Style, Name, MarginL, MarginR, MarginV, Effect, Text
+Dialogue: 0,0:00:00.00,0:00:01.95,Default,,0,0,0,,{\c&H00FFFFFF}{\2c&H00808080}{\kf34}The {\kf37}James {\kf33}Webb {\kf57}telescope {\kf32}cost
+Dialogue: 0,0:00:01.95,0:00:04.20,Default,,0,0,0,,{\c&H00FFFFFF}{\2c&H00808080}{\kf44}$10 {\kf55}billion. {\kf44}Wasn't {\kf4}that {\kf65}expensive?
diff --git a/tests/parity/golden/subtle.ass.expected b/tests/parity/golden/subtle.ass.expected
new file mode 100644
index 0000000..85a143d
--- /dev/null
+++ b/tests/parity/golden/subtle.ass.expected
@@ -0,0 +1,16 @@
+[Script Info]
+Title: Podcast Clip Captions
+ScriptType: v4.00+
+PlayResX: 1080
+PlayResY: 1920
+WrapStyle: 0
+ScaledBorderAndShadow: yes
+
+[V4+ Styles]
+Format: Name, Fontname, Fontsize, PrimaryColour, SecondaryColour, OutlineColour, BackColour, Bold, Italic, Underline, StrikeOut, ScaleX, ScaleY, Spacing, Angle, BorderStyle, Outline, Shadow, Alignment, MarginL, MarginR, MarginV, Encoding
+Style: Default,Arial,52,&H00FFFFFF,&H00FFFFFF,&H00000000,&H80000000,0,0,0,0,100,100,0,0,1,2,2,2,40,40,100,1
+
+[Events]
+Format: Layer, Start, End, Style, Name, MarginL, MarginR, MarginV, Effect, Text
+Dialogue: 0,0:00:00.00,0:00:01.95,Default,,0,0,0,,The James Webb telescope cost
+Dialogue: 0,0:00:01.95,0:00:04.20,Default,,0,0,0,,$10 billion. Wasn't that expensive?
diff --git a/tests/parity/test_caption_parity.py b/tests/parity/test_caption_parity.py
new file mode 100644
index 0000000..d5cc250
--- /dev/null
+++ b/tests/parity/test_caption_parity.py
@@ -0,0 +1,110 @@
+"""Layer-1 caption parity: lock the deterministic, engine-independent surface.
+
+Everything downstream of the transcript JSON contract (the `{words, segments}`
+dict produced by transcribe_file / parse_speaker_transcript / JSON import /
+cache) is pure code that does NOT change when we swap the transcription engine
+or relocate the runtime. This test pins the most timestamp-sensitive part of
+that surface — caption (ASS) generation — against committed goldens.
+
+If a future change (whisper.cpp swap, hermetic runtime, a refactor) alters
+caption output for a fixed transcript, this test fails. That is the whole point:
+the engine may change, but the contract's consumers must not change silently.
+
+Regenerate goldens intentionally with:  UPDATE_GOLDENS=1 pytest tests/parity/test_caption_parity.py
+Goldens use the .ass.expected extension because *.ass is gitignored.
+"""
+
+import json
+import os
+import sys
+import tempfile
+
+import pytest
+
+ROOT = os.path.abspath(os.path.join(os.path.dirname(__file__), "..", ".."))
+BACKEND_ROOT = os.path.join(ROOT, "backend")
+if BACKEND_ROOT not in sys.path:
+    sys.path.insert(0, BACKEND_ROOT)
+
+from services.caption_renderer import render_captions, _sanitize_words  # noqa: E402
+
+HERE = os.path.dirname(__file__)
+GOLDEN_DIR = os.path.join(HERE, "golden")
+TRANSCRIPT = os.path.join(HERE, "transcript_synthetic.json")
+STYLES = ["hormozi", "karaoke", "subtle", "branded"]
+
+
+def _load_words():
+    with open(TRANSCRIPT, encoding="utf-8") as f:
+        return json.load(f)["words"]
+
+
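+def _render(style: str) -> str:
+    """Render the synthetic transcript in `style` and return the ASS text."""
+    words = _load_words()
+    # A private temp dir avoids the name race inherent in tempfile.mktemp()
+    # and is removed on exit even if rendering raises.
+    with tempfile.TemporaryDirectory() as tmp:
+        out = os.path.join(tmp, "captions.ass")
+        render_captions(words, style, out, time_offset=0.0)
+        with open(out, encoding="utf-8") as f:
+            return f.read()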
+
+
+@pytest.mark.parametrize("style", STYLES)
+def test_caption_output_matches_golden(style):
+    produced = _render(style)
+    golden_path = os.path.join(GOLDEN_DIR, f"{style}.ass.expected")
+
+    if os.environ.get("UPDATE_GOLDENS") == "1":
+        os.makedirs(GOLDEN_DIR, exist_ok=True)
+        with open(golden_path, "w", encoding="utf-8") as f:
+            f.write(produced)
+        pytest.skip(f"golden updated: {style}")
+
+    assert os.path.exists(golden_path), (
+        f"missing golden for {style}. Generate with UPDATE_GOLDENS=1."
+    )
+    with open(golden_path, encoding="utf-8") as f:
+        expected = f.read()
+    assert produced == expected, (
+        f"caption output for '{style}' diverged from golden. "
+        f"If this is an intended change, regenerate with UPDATE_GOLDENS=1."
+    )
+
+
+def test_word_text_normalization():
+    """The exact normalization the whisper.cpp boundary must reproduce.
+
+    whisper.cpp emits leading-space token markers and can emit empty/whitespace
+    tokens; corrections + caption spacing match on stripped word text. If the
+    new engine's word text isn't normalized identically here, captions and
+    apply_corrections() silently diverge. This is the single highest-risk
+    integration detail in the engine swap.
+    """
+    words = _load_words()
+    cleaned = _sanitize_words(words)
+    texts = [w["word"] for w in cleaned]
+
+    # leading space stripped
+    assert texts[0] == "The"
+    # whitespace-only token dropped entirely
+    assert "" not in texts
+    assert all(t.strip() == t and t for t in texts)
+    # punctuation / apostrophe / number+symbol survive verbatim
+    assert "billion." in texts
+    assert "Wasn't" in texts
+    assert "$10" in texts
+    assert "expensive?" in texts
+    # exactly one token (the whitespace-only one) was dropped
+    assert len(cleaned) == len(words) - 1
+
+
+def test_zero_duration_word_gets_floor():
+    """A token with end <= start must be widened to the 50ms floor, never
+    crash or render a zero/negative-length event (a real whisper.cpp quirk)."""
+    words = _load_words()
+    cleaned = _sanitize_words(words)
+    for w in cleaned:
+        assert w["end"] > w["start"]
+        assert (w["end"] - w["start"]) >= 0.05 - 1e-9
diff --git a/tests/parity/transcript_synthetic.json b/tests/parity/transcript_synthetic.json
new file mode 100644
index 0000000..e7377fc
--- /dev/null
+++ b/tests/parity/transcript_synthetic.json
@@ -0,0 +1,22 @@
+{
+  "_comment": "Neutral synthetic transcript for Layer-1 caption parity. No personal/podcast content. Deliberately exercises the word-text edge cases most likely to diverge across transcription engines: a leading-space token marker, a number+symbol token, end-of-sentence punctuation, an apostrophe, a whitespace-only token (must be dropped), a zero-duration token (must get the 50ms floor), and a speaker change.",
+  "segments": [
+    {"id": 0, "start": 0.0, "end": 2.95, "text": "The James Webb telescope cost $10 billion.", "speaker": "SPEAKER_00"},
+    {"id": 1, "start": 3.1, "end": 4.2, "text": "Wasn't that expensive?", "speaker": "SPEAKER_01"}
+  ],
+  "words": [
+    {"word": " The", "start": 0.0, "end": 0.34, "speaker": "SPEAKER_00"},
+    {"word": "James", "start": 0.34, "end": 0.72, "speaker": "SPEAKER_00"},
+    {"word": "Webb", "start": 0.72, "end": 1.05, "speaker": "SPEAKER_00"},
+    {"word": "telescope", "start": 1.05, "end": 1.62, "speaker": "SPEAKER_00"},
+    {"word": "cost", "start": 1.62, "end": 1.95, "speaker": "SPEAKER_00"},
+    {"word": "$10", "start": 1.95, "end": 2.4, "speaker": "SPEAKER_00"},
+    {"word": "billion.", "start": 2.4, "end": 2.95, "speaker": "SPEAKER_00"},
+    {"word": "   ", "start": 2.95, "end": 2.95, "speaker": "SPEAKER_00"},
+    {"word": "Wasn't", "start": 3.1, "end": 3.55, "speaker": "SPEAKER_01"},
+    {"word": "that", "start": 3.55, "end": 3.55, "speaker": "SPEAKER_01"},
+    {"word": "expensive?", "start": 3.55, "end": 4.2, "speaker": "SPEAKER_01"}
+  ],
+  "duration": 4.2,
+  "language": "en"
+}

From 3b843d3b3624941091edb4ed77dd155109a8d705 Mon Sep 17 00:00:00 2001
From: Nika Siradze <nikushasiradzee@gmail.com>
Date: Thu, 11 Jun 2026 12:22:48 +0400
Subject: [PATCH 02/41] Add whisper.cpp transcription adapter behind the
 transcribe_file contract

Shells to whisper-cli, merges subword tokens into words on the leading-space
boundary, and applies the same word-text normalization (strip) the caption
pipeline and apply_corrections() expect. Emits the same {words, segments,
duration, language} dict as the openai-whisper path, so it is a drop-in engine.

Validated against an openai-whisper baseline via tests/parity: near-identical
text (WER 0.059), matching segmentation, and benign bulk word-timing drift
(~0.18s median). Known gap: the base model can stretch trailing words across
silence; --vad is the fix.
---
 backend/services/transcription_whispercpp.py | 140 +++++++++++++++++++
 1 file changed, 140 insertions(+)
 create mode 100644 backend/services/transcription_whispercpp.py
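Note for reviewers (not part of the commit): the leading-space merge rule
described above can be sketched in isolation as below. This is a minimal
illustration, not the adapter's code: `merge_tokens` and `SAMPLE_TOKENS` are
hypothetical names, and the token shape (a `text` field plus millisecond
`offsets`) is assumed from what the adapter in this patch reads.

```python
def merge_tokens(tokens):
    """Merge subword tokens into words; a leading space opens a new word."""
    groups = []
    for tok in tokens:
        # A token that starts with a space begins a new word; any other
        # token (continuation, punctuation) attaches to the open word.
        if tok["text"].startswith(" ") or not groups:
            groups.append([])
        groups[-1].append(tok)

    words = []
    for group in groups:
        text = "".join(t["text"] for t in group).strip()
        if not text:
            continue  # whitespace-only groups are dropped entirely
        words.append({
            "word": text,
            "start": group[0]["offsets"]["from"] / 1000.0,  # ms -> s
            "end": group[-1]["offsets"]["to"] / 1000.0,
        })
    return words


# Hypothetical tokens for illustration only.
SAMPLE_TOKENS = [
    {"text": " Was", "offsets": {"from": 3100, "to": 3300}},
    {"text": "n", "offsets": {"from": 3300, "to": 3400}},
    {"text": "'t", "offsets": {"from": 3400, "to": 3550}},
    {"text": " that", "offsets": {"from": 3550, "to": 4000}},
    {"text": "?", "offsets": {"from": 4000, "to": 4200}},
]

if __name__ == "__main__":
    for word in merge_tokens(SAMPLE_TOKENS):
        print(word)
```

Continuation tokens such as `n` and `'t` attach to the open word, so the
apostrophe survives verbatim and the word text carries no leading space.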

diff --git a/backend/services/transcription_whispercpp.py b/backend/services/transcription_whispercpp.py
new file mode 100644
index 0000000..135b2d8
--- /dev/null
+++ b/backend/services/transcription_whispercpp.py
@@ -0,0 +1,140 @@
+"""whisper.cpp transcription adapter — emits the same contract dict as
+services.transcription.transcribe_file (segments + word-level timestamps), so it
+is a drop-in engine behind that seam.
+
+whisper.cpp emits subword *tokens* with a literal leading-space convention
+(" and", " just", continuation/punctuation tokens have no leading space). We
+merge tokens into words on that boundary and apply the exact same word-text
+normalization the rest of the pipeline expects (strip) — this is the single
+highest-risk integration detail: apply_corrections() and caption spacing match
+on stripped word text, so the new engine's words must normalize identically.
+
+Requires a whisper-cli binary and a ggml model. In production these come from
+the hermetic provisioner; here they are parameters/env so the parity harness can
+point at a local install.
+"""
+
+import json
+import os
+import re
+import subprocess
+import sys
+import tempfile
+from typing import Optional
+
+_SPECIAL = re.compile(r"^\[.*\]$")  # [_BEG_], [_TT_...], etc.
+
+
+def _extract_wav(media_path: str, wav_path: str, ffmpeg: str = "ffmpeg") -> None:
+    subprocess.run(
+        [ffmpeg, "-y", "-loglevel", "error", "-i", media_path,
+         "-ar", "16000", "-ac", "1", wav_path],
+        check=True,
+    )
+
+
+def _tokens_to_words(tokens: list[dict]) -> list[dict]:
+    """Merge whisper.cpp subword tokens into words with start/end seconds."""
+    words: list[dict] = []
+    cur_text = ""
+    cur_start: Optional[float] = None
+    cur_end: Optional[float] = None
+
+    def flush():
+        nonlocal cur_text, cur_start, cur_end
+        text = cur_text.strip()
+        if text and cur_start is not None:
+            words.append({
+                "word": text,
+                "start": round(cur_start / 1000.0, 3),
+                "end": round((cur_start if cur_end is None else cur_end) / 1000.0, 3),
+                "speaker": None,
+            })
+        cur_text, cur_start, cur_end = "", None, None
+
+    for tok in tokens:
+        raw = tok.get("text", "")
+        if _SPECIAL.match(raw.strip()):
+            continue
+        off = tok.get("offsets") or {}
+        t0, t1 = off.get("from"), off.get("to")
+        starts_word = raw.startswith(" ") or raw.startswith("▁")
+        if starts_word and cur_text:
+            flush()
+        if cur_start is None and t0 is not None:
+            cur_start = t0
+        cur_text += raw
+        if t1 is not None:
+            cur_end = t1
+    flush()
+    return words
+
+
+def transcribe_file(
+    file_path: str,
+    model_path: str,
+    whisper_cli: str = "whisper-cli",
+    ffmpeg: str = "ffmpeg",
+    language: Optional[str] = "en",
+    dtw_model: str = "base",
+    threads: int = 4,
+    **_ignored,
+) -> dict:
+    if not os.path.exists(file_path):
+        raise FileNotFoundError(file_path)
+    if not os.path.exists(model_path):
+        raise FileNotFoundError(f"ggml model not found: {model_path}")
+
+    tmpdir = tempfile.mkdtemp(prefix="wcpp_")
+    wav = os.path.join(tmpdir, "audio.wav")
+    out_base = os.path.join(tmpdir, "out")
+    _extract_wav(file_path, wav, ffmpeg)
+
+    cmd = [whisper_cli, "-m", model_path, "-f", wav, "-ojf",
+           "-of", out_base, "-t", str(threads)]
+    if dtw_model:
+        cmd += ["-dtw", dtw_model]
+    if language:
+        cmd += ["-l", language]
+    subprocess.run(cmd, check=True, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
+
+    with open(out_base + ".json", encoding="utf-8") as f:
+        data = json.load(f)
+
+    transcription = data.get("transcription", [])
+    segments, words = [], []
+    for i, seg in enumerate(transcription):
+        off = seg.get("offsets") or {}
+        seg_start = round((off.get("from") or 0) / 1000.0, 3)
+        seg_end = round((off.get("to") or 0) / 1000.0, 3)
+        segments.append({
+            "id": i,
+            "start": seg_start,
+            "end": seg_end,
+            "text": (seg.get("text") or "").strip(),
+            "speaker": None,
+        })
+        words.extend(_tokens_to_words(seg.get("tokens", [])))
+
+    duration = segments[-1]["end"] if segments else 0.0
+    return {
+        "transcript": " ".join(s["text"] for s in segments).strip(),
+        "segments": segments,
+        "words": words,
+        "duration": duration,
+        "language": (data.get("params") or {}).get("language") or language or "en",
+    }
+
+
+if __name__ == "__main__":
+    # Quick CLI: transcribe_whispercpp <media> <model> [out.json]
+    media, model = sys.argv[1], sys.argv[2]
+    out = sys.argv[3] if len(sys.argv) > 3 else None
+    result = transcribe_file(media, model)
+    payload = json.dumps(result, indent=2, ensure_ascii=False)
+    if out:
+        with open(out, "w", encoding="utf-8") as f:
+            f.write(payload)
+        print(f"{len(result['words'])} words, {len(result['segments'])} segments -> {out}")
+    else:
+        print(payload)

From 59322734dad4404de219d8d8c7d7c653151870fe Mon Sep 17 00:00:00 2001
From: Nika Siradze <nikushasiradzee@gmail.com>
Date: Thu, 11 Jun 2026 12:22:48 +0400
Subject: [PATCH 03/41] Add Go native launcher (Phase 0)

cli/ Go module that replaces the bash podcli launcher and install.cmd:
- paths: per-OS global managed dir (runtime/models/bin/config)
- engine: resolve interpreter + backend/cli.py (hermetic > dev venv > system),
  exec with UTF-8 + ffmpeg env, propagate exit code
- main: version/doctor commands, update/setup stubs, pass-through routing

Drives the existing Python engine end-to-end (e.g. podcli clips list). Hermetic
provisioning and self-update land in later phases.
---
 cli/go.mod                    |   3 +
 cli/internal/engine/engine.go | 119 ++++++++++++++++++++++++++++++++++
 cli/internal/paths/paths.go   |  52 +++++++++++++++
 cli/main.go                   |  89 +++++++++++++++++++++++++
 4 files changed, 263 insertions(+)
 create mode 100644 cli/go.mod
 create mode 100644 cli/internal/engine/engine.go
 create mode 100644 cli/internal/paths/paths.go
 create mode 100644 cli/main.go
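Note for reviewers (not part of the commit): the interpreter precedence listed
above (explicit override, then hermetic, then dev venv, then system) can be
sketched as a small pure function. The sketch is in Python purely for
illustration; the launcher itself is Go. `resolve_python` and its parameters
are hypothetical names, and the `exists` hook is an assumption added so the
order can be exercised without touching the filesystem.

```python
import os


def resolve_python(env, hermetic_candidates, venv_python, exists=os.path.exists):
    """Return the interpreter to run, using first-match-wins precedence."""
    # 1. An explicit override always wins and is not checked for existence.
    override = env.get("PODCLI_PYTHON")
    if override:
        return override
    # 2. A hermetic runtime under the managed dir, if one is provisioned.
    for candidate in hermetic_candidates:
        if exists(candidate):
            return candidate
    # 3. A dev venv next to the backend, if present.
    if venv_python and exists(venv_python):
        return venv_python
    # 4. Otherwise rely on whatever python3 is on PATH.
    return "python3"


if __name__ == "__main__":
    print(resolve_python(dict(os.environ), [], None))
```

Keeping the override unconditional means a mistyped PODCLI_PYTHON fails loudly
at exec time instead of silently falling through to another interpreter.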

diff --git a/cli/go.mod b/cli/go.mod
new file mode 100644
index 0000000..848df68
--- /dev/null
+++ b/cli/go.mod
@@ -0,0 +1,3 @@
+module podcli
+
+go 1.23
diff --git a/cli/internal/engine/engine.go b/cli/internal/engine/engine.go
new file mode 100644
index 0000000..c8c0e18
--- /dev/null
+++ b/cli/internal/engine/engine.go
@@ -0,0 +1,119 @@
+// Package engine routes podcli subcommands to the Python backend. In Phase 0 it
+// resolves an interpreter and backend/cli.py and execs them; later phases swap
+// the resolved interpreter/ffmpeg to hermetically provisioned ones without
+// changing this routing.
+package engine
+
+import (
+	"fmt"
+	"os"
+	"os/exec"
+	"path/filepath"
+	"runtime"
+
+	"podcli/internal/paths"
+)
+
+func exists(p string) bool {
+	_, err := os.Stat(p)
+	return err == nil
+}
+
+// BackendRoot locates the directory containing cli.py: explicit override, then
+// the dev repo (walk up for backend/cli.py), then the provisioned location.
+func BackendRoot() (string, bool) {
+	if b := os.Getenv("PODCLI_BACKEND"); b != "" && exists(filepath.Join(b, "cli.py")) {
+		return b, true
+	}
+	if dir, err := os.Getwd(); err == nil {
+		for {
+			cand := filepath.Join(dir, "backend")
+			if exists(filepath.Join(cand, "cli.py")) {
+				return cand, true
+			}
+			parent := filepath.Dir(dir)
+			if parent == dir {
+				break
+			}
+			dir = parent
+		}
+	}
+	cand := filepath.Join(paths.RuntimeDir(), "backend")
+	if exists(filepath.Join(cand, "cli.py")) {
+		return cand, true
+	}
+	return "", false
+}
+
+// Python resolves the interpreter: explicit override, hermetic runtime, dev
+// venv next to the backend, then system python3.
+func Python() string {
+	if p := os.Getenv("PODCLI_PYTHON"); p != "" {
+		return p
+	}
+	hermetic := []string{
+		filepath.Join(paths.RuntimeDir(), "python", "bin", "python3"),
+		filepath.Join(paths.RuntimeDir(), "python", "python.exe"),
+	}
+	for _, p := range hermetic {
+		if exists(p) {
+			return p
+		}
+	}
+	if root, ok := BackendRoot(); ok {
+		venv := filepath.Join(filepath.Dir(root), "venv", "bin", "python3")
+		if runtime.GOOS == "windows" {
+			venv = filepath.Join(filepath.Dir(root), "venv", "Scripts", "python.exe")
+		}
+		if exists(venv) {
+			return venv
+		}
+	}
+	return "python3"
+}
+
+// FFmpeg resolves a hermetic ffmpeg if present, else lets the backend fall back
+// to PATH (current behavior).
+func FFmpeg() string {
+	cands := []string{
+		filepath.Join(paths.RuntimeDir(), "ffmpeg", "ffmpeg"),
+		filepath.Join(paths.RuntimeDir(), "ffmpeg", "ffmpeg.exe"),
+	}
+	for _, p := range cands {
+		if exists(p) {
+			return p
+		}
+	}
+	return ""
+}
+
+// Run execs the Python backend with args, inheriting stdio. Returns the child's
+// exit code.
+func Run(args []string) (int, error) {
+	root, ok := BackendRoot()
+	if !ok {
+		return 1, fmt.Errorf("python backend not found — set PODCLI_BACKEND or run inside the repo")
+	}
+	cli := filepath.Join(root, "cli.py")
+	full := append([]string{"-W", "ignore::UserWarning", cli}, args...)
+
+	cmd := exec.Command(Python(), full...)
+	cmd.Stdin, cmd.Stdout, cmd.Stderr = os.Stdin, os.Stdout, os.Stderr
+	env := append(os.Environ(),
+		"OBJC_DISABLE_INITIALIZE_FORK_SAFETY=YES",
+		"PYTHONIOENCODING=utf-8",
+		"PYTHONUTF8=1",
+	)
+	if ff := FFmpeg(); ff != "" {
+		env = append(env, "PODCLI_FFMPEG="+ff)
+	}
+	cmd.Env = env
+
+	if err := cmd.Run(); err != nil {
+		if ee, ok := err.(*exec.ExitError); ok {
+			return ee.ExitCode(), nil
+		}
+		return 1, err
+	}
+	return 0, nil
+}
diff --git a/cli/internal/paths/paths.go b/cli/internal/paths/paths.go
new file mode 100644
index 0000000..345cd32
--- /dev/null
+++ b/cli/internal/paths/paths.go
@@ -0,0 +1,52 @@
+// Package paths resolves the global managed directory where podcli keeps its
+// hermetic runtimes, models, binaries, and config. Per-project working data
+// (.podcli/) stays in the user's current directory and is not handled here.
+package paths
+
+import (
+	"os"
+	"path/filepath"
+	"runtime"
+)
+
+// Home is the global managed dir. Override with PODCLI_HOME (used by tests and
+// power users). Layout matches plans/native-cli.md:
+//
+//	darwin  ~/Library/Application Support/podcli
+//	windows %LOCALAPPDATA%\podcli
+//	linux   $XDG_DATA_HOME/podcli  (or ~/.local/share/podcli)
+func Home() string {
+	if h := os.Getenv("PODCLI_HOME"); h != "" {
+		return h
+	}
+	home, err := os.UserHomeDir()
+	if err != nil {
+		home = "."
+	}
+	switch runtime.GOOS {
+	case "darwin":
+		return filepath.Join(home, "Library", "Application Support", "podcli")
+	case "windows":
+		if d := os.Getenv("LOCALAPPDATA"); d != "" {
+			return filepath.Join(d, "podcli")
+		}
+		return filepath.Join(home, "AppData", "Local", "podcli")
+	default:
+		if d := os.Getenv("XDG_DATA_HOME"); d != "" {
+			return filepath.Join(d, "podcli")
+		}
+		return filepath.Join(home, ".local", "share", "podcli")
+	}
+}
+
+// RuntimeDir holds hermetic CPython, ffmpeg, whisper.cpp, etc.
+func RuntimeDir() string { return filepath.Join(Home(), "runtime") }
+
+// ModelsDir holds fetched ggml models.
+func ModelsDir() string { return filepath.Join(Home(), "models") }
+
+// BinDir holds the self-updatable engine binary.
+func BinDir() string { return filepath.Join(Home(), "bin") }
+
+// ConfigPath is the launcher config (update opt-out, pins).
+func ConfigPath() string { return filepath.Join(Home(), "config.json") }
diff --git a/cli/main.go b/cli/main.go
new file mode 100644
index 0000000..dbdb1e9
--- /dev/null
+++ b/cli/main.go
@@ -0,0 +1,89 @@
+// podcli — native launcher.
+//
+// Phase 0: resolves the Python backend + interpreter and routes subcommands to
+// it (replacing the bash `podcli` and install.cmd). Reserved launcher verbs
+// (version, doctor, update, setup) are handled here; everything else is passed
+// through to the engine. update/setup are stubs until Phases 0+/2.
+package main
+
+import (
+	"fmt"
+	"os"
+
+	"podcli/internal/engine"
+	"podcli/internal/paths"
+)
+
+// Version is set at build time via -ldflags "-X main.Version=...".
+var Version = "2.0.0-dev"
+
+func main() {
+	args := os.Args[1:]
+	if len(args) == 0 {
+		printHelp()
+		return
+	}
+
+	switch args[0] {
+	case "version", "--version", "-v":
+		fmt.Printf("podcli %s\n", Version)
+	case "doctor":
+		doctor()
+	case "update":
+		fmt.Println("self-update: not yet implemented (Phase 2 — GitHub Releases + atomic swap)")
+	case "setup":
+		fmt.Println("hermetic provisioning: not yet implemented (Phase 0+ — fetch pinned python/ffmpeg/whisper.cpp)")
+	case "help", "--help", "-h":
+		printHelp()
+	default:
+		code, err := engine.Run(args)
+		if err != nil {
+			fmt.Fprintln(os.Stderr, "podcli:", err)
+			os.Exit(1)
+		}
+		os.Exit(code)
+	}
+}
+
+func doctor() {
+	fmt.Printf("podcli %s\n\n", Version)
+	fmt.Println("Paths")
+	fmt.Printf("  home:     %s\n", paths.Home())
+	fmt.Printf("  runtime:  %s\n", paths.RuntimeDir())
+	fmt.Printf("  models:   %s\n", paths.ModelsDir())
+	fmt.Println("\nEngine resolution")
+	if root, ok := engine.BackendRoot(); ok {
+		fmt.Printf("  backend:  %s\n", root)
+	} else {
+		fmt.Printf("  backend:  NOT FOUND (set PODCLI_BACKEND or run inside the repo)\n")
+	}
+	fmt.Printf("  python:   %s\n", engine.Python())
+	if ff := engine.FFmpeg(); ff != "" {
+		fmt.Printf("  ffmpeg:   %s (hermetic)\n", ff)
+	} else {
+		fmt.Printf("  ffmpeg:   PATH fallback (not yet hermetic)\n")
+	}
+}
+
+func printHelp() {
+	fmt.Printf(`podcli %s — AI podcast clip generator
+
+Usage:
+  podcli <command> [args]
+
+Engine commands (routed to the processing backend):
+  process <video>      Transcribe a video and export short-form clips
+  studio <video>       Cut a fragment + intro/outro bookends
+  clips                Browse and edit saved clips
+  thumbnails           Generate thumbnails
+  knowledge | presets | assets | youtube | config | cache | info
+
+Launcher commands:
+  doctor               Show resolved paths, interpreter, backend, ffmpeg
+  version              Print version
+  update               Self-update (coming in Phase 2)
+  setup                Provision hermetic runtimes (coming soon)
+
+Run a command with --help for its options.
+`, Version)
+}

From 040f3326c2fd6f0dfedb0b4fff2cdb1c17afe772 Mon Sep 17 00:00:00 2001
From: Nika Siradze <nikushasiradzee@gmail.com>
Date: Thu, 11 Jun 2026 12:28:41 +0400
Subject: [PATCH 04/41] Wire transcription engine dispatch (whisper-py |
 whispercpp)

transcribe_file() now dispatches on PODCLI_ENGINE: the default whisper-py path is
untouched; whispercpp routes to the whisper.cpp adapter, resolving whisper-cli
and the ggml model from the managed dir (env-overridable). Adds a first-class
--engine flag to the process command and optional VAD support to the adapter
(off by default). This is the dual-engine escape hatch from the migration plan:
users can pick the native PyTorch-free path or fall back instantly.

Verified: --engine routes through the Go launcher; engine dispatch returns the
contract dict; Layer-1 parity (6) and unit tests (27) green.
---
 backend/cli.py                               |  3 ++
 backend/services/transcription.py            | 46 ++++++++++++++++++++
 backend/services/transcription_whispercpp.py |  7 +++
 3 files changed, 56 insertions(+)
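Note for reviewers (not part of the commit): the PODCLI_ENGINE dispatch can be
sketched as a pure function, shown below under the assumption that the alias
set matches the one in this patch. `select_engine` and `_CPP_ALIASES` are
hypothetical names used for illustration; the patch itself inlines the check
in transcribe_file().

```python
_CPP_ALIASES = {"whispercpp", "whisper-cpp", "whisper.cpp", "cpp"}


def select_engine(env):
    """Map a PODCLI_ENGINE-style setting to 'whispercpp' or 'whisper-py'."""
    raw = env.get("PODCLI_ENGINE", "whisper-py").strip().lower()
    # Anything that is not a recognized whisper.cpp alias falls through to
    # the default path, so an unknown value never selects the new engine.
    return "whispercpp" if raw in _CPP_ALIASES else "whisper-py"


if __name__ == "__main__":
    for value in (None, "cpp", " Whisper.CPP ", "bogus"):
        env = {} if value is None else {"PODCLI_ENGINE": value}
        print(repr(value), "->", select_engine(env))
```

Falling back to whisper-py on unrecognized values keeps the default path
untouched, at the cost of not reporting a mistyped engine name.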

diff --git a/backend/cli.py b/backend/cli.py
index 9bf6a98..a5fb82e 100644
--- a/backend/cli.py
+++ b/backend/cli.py
@@ -416,6 +416,8 @@ def cmd_process(args):
             save_corrections(merged)
 
     # CLI overrides
+    if getattr(args, "engine", None):
+        os.environ["PODCLI_ENGINE"] = args.engine
     if args.caption_style:
         config["caption_style"] = args.caption_style
     if args.crop:
@@ -3235,6 +3237,7 @@ def main():
     proc.add_argument("-n", "--top", type=int, help="Number of top clips to export (default: 5)")
     proc.add_argument("-o", "--output", help="Output directory (default: ./clips)")
     proc.add_argument("-p", "--preset", help="Load a saved preset")
+    proc.add_argument("--engine", choices=["whisper-py", "whispercpp"], help="Transcription engine (default: whisper-py; whispercpp is the native, PyTorch-free path)")
     proc.add_argument("--fast", action="store_true", help="Draft mode: tiny Whisper, heuristic selection, center crop, low quality")
     proc.add_argument("--caption-style", choices=["branded", "hormozi", "karaoke", "subtle"])
     proc.add_argument("--crop", choices=["center", "face", "speaker", "speaker-hardcut"])
diff --git a/backend/services/transcription.py b/backend/services/transcription.py
index 69f7493..8cdac5e 100644
--- a/backend/services/transcription.py
+++ b/backend/services/transcription.py
@@ -14,6 +14,48 @@
 from typing import Optional, Callable
 
 
+def _managed_home() -> str:
+    h = os.environ.get("PODCLI_HOME")
+    if h:
+        return h
+    home = os.path.expanduser("~")
+    if sys.platform == "darwin":
+        return os.path.join(home, "Library", "Application Support", "podcli")
+    if sys.platform == "win32":
+        return os.path.join(os.environ.get("LOCALAPPDATA") or os.path.join(home, "AppData", "Local"), "podcli")
+    return os.path.join(os.environ.get("XDG_DATA_HOME") or os.path.join(home, ".local", "share"), "podcli")
+
+
+def _transcribe_with_whispercpp(file_path, model_size, language, progress_callback):
+    from services import transcription_whispercpp as wcpp
+
+    if progress_callback:
+        progress_callback(10, "Transcribing with whisper.cpp...")
+
+    cli = os.environ.get("PODCLI_WHISPER_CLI", "whisper-cli")
+    model = os.environ.get("PODCLI_WHISPERCPP_MODEL") or os.path.join(
+        _managed_home(), "models", f"ggml-{model_size}.bin"
+    )
+    if not os.path.exists(model):
+        raise FileNotFoundError(
+            f"whisper.cpp model not found: {model}. "
+            "Set PODCLI_WHISPERCPP_MODEL or run provisioning."
+        )
+    vad = os.environ.get("PODCLI_WHISPERCPP_VAD", "").strip().lower() in ("1", "true", "yes", "on")
+    result = wcpp.transcribe_file(
+        file_path,
+        model_path=model,
+        whisper_cli=cli,
+        ffmpeg=os.environ.get("PODCLI_FFMPEG", "ffmpeg"),
+        language=language,
+        vad=vad,
+        vad_model=os.environ.get("PODCLI_WHISPERCPP_VAD_MODEL") or None,
+    )
+    if progress_callback:
+        progress_callback(100, "Transcription complete")
+    return result
+
+
 def transcribe_file(
     file_path: str,
     model_size: str = "base",
@@ -39,6 +81,10 @@ def transcribe_file(
     if not os.path.exists(file_path):
         raise FileNotFoundError(f"File not found: {file_path}")
 
+    engine = os.environ.get("PODCLI_ENGINE", "whisper-py").strip().lower()
+    if engine in ("whispercpp", "whisper-cpp", "whisper.cpp", "cpp"):
+        return _transcribe_with_whispercpp(file_path, model_size, language, progress_callback)
+
     # ================================================================
     # Step 1: Whisper transcription
     # ================================================================
diff --git a/backend/services/transcription_whispercpp.py b/backend/services/transcription_whispercpp.py
index 135b2d8..113de55 100644
--- a/backend/services/transcription_whispercpp.py
+++ b/backend/services/transcription_whispercpp.py
@@ -78,6 +78,8 @@ def transcribe_file(
     language: Optional[str] = "en",
     dtw_model: str = "base",
     threads: int = 4,
+    vad: bool = False,
+    vad_model: Optional[str] = None,
     **_ignored,
 ) -> dict:
     if not os.path.exists(file_path):
@@ -94,6 +96,11 @@ def transcribe_file(
            "-of", out_base, "-t", str(threads)]
     if dtw_model:
         cmd += ["-dtw", dtw_model]
+    if vad and vad_model and os.path.exists(vad_model):
+        # VAD removes the trailing-words-into-silence failure mode but currently
+        # adds a small systematic early bias (silence-removal remapping). Off by
+        # default; opt in via PODCLI_WHISPERCPP_VAD.
+        cmd += ["--vad", "--vad-model", vad_model]
     if language:
         cmd += ["-l", language]
     subprocess.run(cmd, check=True, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)

From 4a4eca1ce950e9539075ada9fe295993fa86b73a Mon Sep 17 00:00:00 2001
From: Nika Siradze <nikushasiradzee@gmail.com>
Date: Thu, 11 Jun 2026 13:04:18 +0400
Subject: [PATCH 05/41] Add model provisioner to the Go launcher

podcli setup downloads pinned ggml models into the managed dir with resumable
(HTTP Range) transfers, checksum verification, and atomic writes; first use of
--engine whispercpp auto-provisions the base model. doctor reports model state.
The Python whispercpp path resolves the model from the same managed dir, so no
env wiring is needed once provisioned.

Also trims narrating comments from the launcher down to WHY-only.
---
 backend/services/transcription_whispercpp.py |  19 +-
 cli/internal/engine/engine.go                |  13 +-
 cli/internal/paths/paths.go                  |  11 +-
 cli/internal/provision/provision.go          | 211 +++++++++++++++++++
 cli/main.go                                  |  84 +++++++-
 5 files changed, 295 insertions(+), 43 deletions(-)
 create mode 100644 cli/internal/provision/provision.go
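Note for reviewers (not part of the commit): the three guarantees named above
(resumable transfer, checksum verification, atomic write) can be sketched as
follows. This is a Python illustration of the general technique under stated
assumptions, not a transcription of provision.go: the `.part` file convention
and the names `range_header`, `sha256_of`, and `promote` are hypothetical.

```python
import hashlib
import os
import tempfile


def range_header(part_path):
    """Headers to resume a partial download; empty when starting fresh."""
    size = os.path.getsize(part_path) if os.path.exists(part_path) else 0
    return {"Range": f"bytes={size}-"} if size else {}


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large models are never read into memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def promote(part_path, final_path, expected_sha256):
    """Verify the finished download, then move it into place in one step."""
    actual = sha256_of(part_path)
    if actual != expected_sha256.lower():
        os.remove(part_path)  # discard the corrupt file so a retry starts clean
        raise ValueError(f"checksum mismatch: got {actual}")
    # os.replace is atomic when both paths are on the same filesystem, so
    # readers see either no model or a complete, verified one.
    os.replace(part_path, final_path)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        part = os.path.join(d, "model.bin.part")
        with open(part, "wb") as f:
            f.write(b"demo bytes")
        print(range_header(part))
        promote(part, os.path.join(d, "model.bin"),
                hashlib.sha256(b"demo bytes").hexdigest())
        print(sorted(os.listdir(d)))
```

Writing to a sibling `.part` file keeps the final path on the same filesystem,
which is what makes the rename atomic.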

diff --git a/backend/services/transcription_whispercpp.py b/backend/services/transcription_whispercpp.py
index 113de55..752fc8c 100644
--- a/backend/services/transcription_whispercpp.py
+++ b/backend/services/transcription_whispercpp.py
@@ -1,17 +1,8 @@
-"""whisper.cpp transcription adapter — emits the same contract dict as
-services.transcription.transcribe_file (segments + word-level timestamps), so it
-is a drop-in engine behind that seam.
-
-whisper.cpp emits subword *tokens* with a literal leading-space convention
-(" and", " just", continuation/punctuation tokens have no leading space). We
-merge tokens into words on that boundary and apply the exact same word-text
-normalization the rest of the pipeline expects (strip) — this is the single
-highest-risk integration detail: apply_corrections() and caption spacing match
-on stripped word text, so the new engine's words must normalize identically.
-
-Requires a whisper-cli binary and a ggml model. In production these come from
-the hermetic provisioner; here they are parameters/env so the parity harness can
-point at a local install.
+"""whisper.cpp adapter behind the transcribe_file contract.
+
+Tokens carry a leading-space convention (" and", continuations have none); we
+merge on that boundary and strip word text. The strip must match the rest of the
+pipeline exactly — apply_corrections() and caption spacing key on stripped text.
 """
 
 import json
diff --git a/cli/internal/engine/engine.go b/cli/internal/engine/engine.go
index c8c0e18..07d45a7 100644
--- a/cli/internal/engine/engine.go
+++ b/cli/internal/engine/engine.go
@@ -1,7 +1,4 @@
-// Package engine routes podcli subcommands to the Python backend. In Phase 0 it
-// resolves an interpreter and backend/cli.py and execs them; later phases swap
-// the resolved interpreter/ffmpeg to hermetically provisioned ones without
-// changing this routing.
+// Package engine routes podcli subcommands to the Python backend.
 package engine
 
 import (
@@ -19,8 +16,6 @@ func exists(p string) bool {
 	return err == nil
 }
 
-// BackendRoot locates the directory containing cli.py: explicit override, then
-// the dev repo (walk up for backend/cli.py), then the provisioned location.
 func BackendRoot() (string, bool) {
 	if b := os.Getenv("PODCLI_BACKEND"); b != "" && exists(filepath.Join(b, "cli.py")) {
 		return b, true
@@ -45,8 +40,6 @@ func BackendRoot() (string, bool) {
 	return "", false
 }
 
-// Python resolves the interpreter: explicit override, hermetic runtime, dev
-// venv next to the backend, then system python3.
 func Python() string {
 	if p := os.Getenv("PODCLI_PYTHON"); p != "" {
 		return p
@@ -72,8 +65,6 @@ func Python() string {
 	return "python3"
 }
 
-// FFmpeg resolves a hermetic ffmpeg if present, else lets the backend fall back
-// to PATH (current behavior).
 func FFmpeg() string {
 	cands := []string{
 		filepath.Join(paths.RuntimeDir(), "ffmpeg", "ffmpeg"),
@@ -87,8 +78,6 @@ func FFmpeg() string {
 	return ""
 }
 
-// Run execs the Python backend with args, inheriting stdio. Returns the child's
-// exit code.
 func Run(args []string) (int, error) {
 	root, ok := BackendRoot()
 	if !ok {
diff --git a/cli/internal/paths/paths.go b/cli/internal/paths/paths.go
index 345cd32..d510e3d 100644
--- a/cli/internal/paths/paths.go
+++ b/cli/internal/paths/paths.go
@@ -39,14 +39,7 @@ func Home() string {
 	}
 }
 
-// RuntimeDir holds hermetic CPython, ffmpeg, whisper.cpp, etc.
 func RuntimeDir() string { return filepath.Join(Home(), "runtime") }
-
-// ModelsDir holds fetched ggml models.
-func ModelsDir() string { return filepath.Join(Home(), "models") }
-
-// BinDir holds the self-updatable engine binary.
-func BinDir() string { return filepath.Join(Home(), "bin") }
-
-// ConfigPath is the launcher config (update opt-out, pins).
+func ModelsDir() string  { return filepath.Join(Home(), "models") }
+func BinDir() string     { return filepath.Join(Home(), "bin") }
 func ConfigPath() string { return filepath.Join(Home(), "config.json") }
diff --git a/cli/internal/provision/provision.go b/cli/internal/provision/provision.go
new file mode 100644
index 0000000..a00b2eb
--- /dev/null
+++ b/cli/internal/provision/provision.go
@@ -0,0 +1,211 @@
+// Package provision fetches pinned models into the global managed dir.
+package provision
+
+import (
+	"crypto/sha256"
+	"encoding/hex"
+	"fmt"
+	"io"
+	"net/http"
+	"os"
+	"path/filepath"
+	"time"
+
+	"podcli/internal/paths"
+)
+
+type model struct {
+	URL    string
+	SHA256 string // empty: verification skipped
+}
+
+var models = map[string]model{
+	"base": {
+		URL:    "https://huggingface.co/ggerganov/whisper.cpp/resolve/main/ggml-base.bin",
+		SHA256: "60ed5bc3dd14eea856493d334349b405782ddcaf0028d4b5df4088345fba2efe",
+	},
+	"tiny.en": {
+		URL: "https://huggingface.co/ggerganov/whisper.cpp/resolve/main/ggml-tiny.en.bin",
+	},
+	"small": {
+		URL: "https://huggingface.co/ggerganov/whisper.cpp/resolve/main/ggml-small.bin",
+	},
+}
+
+const vadURL = "https://huggingface.co/ggml-org/whisper-vad/resolve/main/ggml-silero-v5.1.2.bin"
+const vadSHA = "29940d98d42b91fbd05ce489f3ecf7c72f0a42f027e4875919a28fb4c04ea2cf"
+
+func ModelPath(size string) string {
+	return filepath.Join(paths.ModelsDir(), "ggml-"+size+".bin")
+}
+
+func VADModelPath() string {
+	return filepath.Join(paths.ModelsDir(), "ggml-silero-v5.1.2.bin")
+}
+
+func have(p string) bool {
+	if fi, err := os.Stat(p); err == nil && fi.Size() > 0 {
+		return true
+	}
+	return false
+}
+
+func EnsureModel(size string) (string, error) {
+	dest := ModelPath(size)
+	if have(dest) {
+		return dest, nil
+	}
+	m, ok := models[size]
+	if !ok {
+		return "", fmt.Errorf("unknown model size %q (known: base, tiny.en, small)", size)
+	}
+	if err := download(m.URL, dest, m.SHA256, "ggml-"+size); err != nil {
+		return "", err
+	}
+	return dest, nil
+}
+
+func EnsureVADModel() (string, error) {
+	dest := VADModelPath()
+	if have(dest) {
+		return dest, nil
+	}
+	if err := download(vadURL, dest, vadSHA, "silero-vad"); err != nil {
+		return "", err
+	}
+	return dest, nil
+}
+
+const maxAttempts = 6
+
+// download resumes via HTTP Range across transient stalls rather than
+// restarting, then verifies the pinned checksum and renames atomically.
+func download(url, dest, wantSHA, label string) error {
+	if err := os.MkdirAll(filepath.Dir(dest), 0o755); err != nil {
+		return err
+	}
+	tmp := dest + ".part"
+
+	var lastErr error
+	for attempt := 1; attempt <= maxAttempts; attempt++ {
+		done, err := downloadOnce(url, tmp, label)
+		if err == nil && done {
+			lastErr = nil
+			break
+		}
+		lastErr = err
+		fmt.Fprintf(os.Stderr, "\n  %s interrupted (attempt %d/%d): %v — resuming\n", label, attempt, maxAttempts, err)
+		time.Sleep(time.Duration(attempt) * time.Second)
+	}
+	if lastErr != nil {
+		os.Remove(tmp)
+		return lastErr
+	}
+
+	if wantSHA != "" {
+		got, err := sha256file(tmp)
+		if err != nil {
+			return err
+		}
+		if got != wantSHA {
+			os.Remove(tmp)
+			return fmt.Errorf("checksum mismatch for %s: got %s want %s", label, got, wantSHA)
+		}
+	} else {
+		fmt.Fprintf(os.Stderr, "  (no pinned checksum for %s — skipped verification)\n", label)
+	}
+	return os.Rename(tmp, dest)
+}
+
+func downloadOnce(url, tmp, label string) (bool, error) {
+	var start int64
+	if fi, err := os.Stat(tmp); err == nil {
+		start = fi.Size()
+	}
+
+	req, err := http.NewRequest(http.MethodGet, url, nil)
+	if err != nil {
+		return false, err
+	}
+	if start > 0 {
+		req.Header.Set("Range", fmt.Sprintf("bytes=%d-", start))
+	}
+	client := &http.Client{Transport: &http.Transport{ResponseHeaderTimeout: 60 * time.Second}}
+	resp, err := client.Do(req)
+	if err != nil {
+		return false, err
+	}
+	defer resp.Body.Close()
+
+	switch resp.StatusCode {
+	case http.StatusRequestedRangeNotSatisfiable:
+		return true, nil // already complete on disk
+	case http.StatusOK:
+		if start > 0 { // server ignored Range — restart cleanly
+			os.Truncate(tmp, 0)
+			start = 0
+		}
+	case http.StatusPartialContent:
+	default:
+		return false, fmt.Errorf("HTTP %d", resp.StatusCode)
+	}
+
+	out, err := os.OpenFile(tmp, os.O_CREATE|os.O_WRONLY|os.O_APPEND, 0o644)
+	if err != nil {
+		return false, err
+	}
+	defer out.Close()
+
+	pw := &progress{label: label, total: start + resp.ContentLength, written: start}
+	_, copyErr := io.Copy(io.MultiWriter(out, pw), resp.Body)
+	if copyErr != nil {
+		return false, copyErr
+	}
+	pw.done()
+	return true, nil
+}
+
+func sha256file(path string) (string, error) {
+	f, err := os.Open(path)
+	if err != nil {
+		return "", err
+	}
+	defer f.Close()
+	h := sha256.New()
+	if _, err := io.Copy(h, f); err != nil {
+		return "", err
+	}
+	return hex.EncodeToString(h.Sum(nil)), nil
+}
+
+type progress struct {
+	label    string
+	total    int64
+	written  int64
+	lastPct  int
+	lastTick time.Time
+}
+
+func (p *progress) Write(b []byte) (int, error) {
+	n := len(b)
+	p.written += int64(n)
+	now := time.Now()
+	if now.Sub(p.lastTick) < 200*time.Millisecond {
+		return n, nil
+	}
+	p.lastTick = now
+	if p.total > 0 {
+		pct := int(p.written * 100 / p.total)
+		if pct != p.lastPct {
+			p.lastPct = pct
+			fmt.Fprintf(os.Stderr, "\r  fetching %s ... %d%% (%d/%d MB)", p.label, pct, p.written>>20, p.total>>20)
+		}
+	} else {
+		fmt.Fprintf(os.Stderr, "\r  fetching %s ... %d MB", p.label, p.written>>20)
+	}
+	return n, nil
+}
+
+func (p *progress) done() {
+	fmt.Fprintf(os.Stderr, "\r  fetching %s ... done (%d MB)%s\n", p.label, p.written>>20, "          ")
+}
diff --git a/cli/main.go b/cli/main.go
index dbdb1e9..3d80285 100644
--- a/cli/main.go
+++ b/cli/main.go
@@ -1,17 +1,15 @@
-// podcli — native launcher.
-//
-// Phase 0: resolves the Python backend + interpreter and routes subcommands to
-// it (replacing the bash `podcli` and install.cmd). Reserved launcher verbs
-// (version, doctor, update, setup) are handled here; everything else is passed
-// through to the engine. update/setup are stubs until Phases 0+/2.
+// podcli — native launcher. Reserved verbs are handled here; everything else
+// routes to the Python engine.
 package main
 
 import (
 	"fmt"
 	"os"
+	"strings"
 
 	"podcli/internal/engine"
 	"podcli/internal/paths"
+	"podcli/internal/provision"
 )
 
 // Version is set at build time via -ldflags "-X main.Version=...".
@@ -32,10 +30,16 @@ func main() {
 	case "update":
 		fmt.Println("self-update: not yet implemented (Phase 2 — GitHub Releases + atomic swap)")
 	case "setup":
-		fmt.Println("hermetic provisioning: not yet implemented (Phase 0+ — fetch pinned python/ffmpeg/whisper.cpp)")
+		os.Exit(setup(args[1:]))
 	case "help", "--help", "-h":
 		printHelp()
 	default:
+		if usesWhisperCpp(args) {
+			if _, err := provision.EnsureModel("base"); err != nil {
+				fmt.Fprintln(os.Stderr, "podcli: provisioning model:", err)
+				os.Exit(1)
+			}
+		}
 		code, err := engine.Run(args)
 		if err != nil {
 			fmt.Fprintln(os.Stderr, "podcli:", err)
@@ -45,6 +49,59 @@ func main() {
 	}
 }
 
+func usesWhisperCpp(args []string) bool {
+	cmd := args[0]
+	if cmd != "process" && cmd != "studio" {
+		return false
+	}
+	engineSel := strings.ToLower(os.Getenv("PODCLI_ENGINE"))
+	for i, a := range args {
+		if a == "--engine" && i+1 < len(args) {
+			engineSel = strings.ToLower(args[i+1])
+		} else if strings.HasPrefix(a, "--engine=") {
+			engineSel = strings.ToLower(strings.TrimPrefix(a, "--engine="))
+		}
+	}
+	switch engineSel {
+	case "whispercpp", "whisper-cpp", "whisper.cpp", "cpp":
+		return true
+	}
+	return false
+}
+
+func setup(args []string) int {
+	size := "base"
+	vad := false
+	for i := 0; i < len(args); i++ {
+		switch args[i] {
+		case "--model":
+			if i+1 < len(args) {
+				size = args[i+1]
+				i++
+			}
+		case "--vad":
+			vad = true
+		}
+	}
+	fmt.Printf("Provisioning into %s\n", paths.Home())
+	p, err := provision.EnsureModel(size)
+	if err != nil {
+		fmt.Fprintln(os.Stderr, "podcli: setup:", err)
+		return 1
+	}
+	fmt.Printf("  model:  %s\n", p)
+	if vad {
+		vp, err := provision.EnsureVADModel()
+		if err != nil {
+			fmt.Fprintln(os.Stderr, "podcli: setup:", err)
+			return 1
+		}
+		fmt.Printf("  vad:    %s\n", vp)
+	}
+	fmt.Println("Done. (hermetic python/ffmpeg/whisper.cpp provisioning lands in a later phase)")
+	return 0
+}
+
 func doctor() {
 	fmt.Printf("podcli %s\n\n", Version)
 	fmt.Println("Paths")
@@ -63,6 +120,16 @@ func doctor() {
 	} else {
 		fmt.Printf("  ffmpeg:   PATH fallback (not yet hermetic)\n")
 	}
+	fmt.Println("\nModels")
+	fmt.Printf("  base:     %s\n", presence(provision.ModelPath("base")))
+	fmt.Printf("  vad:      %s\n", presence(provision.VADModelPath()))
+}
+
+func presence(p string) string {
+	if fi, err := os.Stat(p); err == nil && fi.Size() > 0 {
+		return fmt.Sprintf("%s (%d MB)", p, fi.Size()>>20)
+	}
+	return "not provisioned — run `podcli setup`"
 }
 
 func printHelp() {
@@ -82,7 +149,8 @@ Launcher commands:
   doctor               Show resolved paths, interpreter, backend, ffmpeg
   version              Print version
   update               Self-update (coming in Phase 2)
-  setup                Provision hermetic runtimes (coming soon)
+  setup [--model base] [--vad]
+                       Provision models into the managed dir (runtimes: later phase)
 
 Run a command with --help for its options.
 `, Version)

From db113b254cf423bce9d1d0dfe9b9c09a4076ccfc Mon Sep 17 00:00:00 2001
From: Nika Siradze <nikushasiradzee@gmail.com>
Date: Thu, 11 Jun 2026 13:19:00 +0400
Subject: [PATCH 06/41] Honor a hermetic ffmpeg/ffprobe across the backend

proc.run() resolves bare ffmpeg/ffprobe invocations to PODCLI_FFMPEG /
PODCLI_FFPROBE when the variable is set and the path exists, so the whole
backend uses a provisioned binary with one central change instead of touching
~15 call sites. encoder's fingerprint helper
honors the same override. The Go launcher resolves ffprobe alongside ffmpeg from
the runtime dir and exports both env vars; doctor reports ffprobe state.

Verified: override is used when set, falls back to PATH otherwise, and leaves
non-ffmpeg commands untouched (proc + encoder unit tests green).
---
 backend/services/encoder.py   |  2 +-
 backend/utils/proc.py         | 13 +++++++++++++
 cli/internal/engine/engine.go | 17 +++++++++++------
 cli/main.go                   |  3 +++
 4 files changed, 28 insertions(+), 7 deletions(-)
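
The override rule in proc._resolve_tool is small enough to restate on its own.
The sketch below is a Go rendering of that Python helper, written only to make
the rule explicit; `resolveTool`, `toolEnv`, and `fileExists` are hypothetical
names, and the env lookup and existence check are injected so the logic can be
exercised without touching the process environment:

```go
package main

import "os"

// toolEnv maps a bare tool name to the env var that may override it.
var toolEnv = map[string]string{
	"ffmpeg":  "PODCLI_FFMPEG",
	"ffprobe": "PODCLI_FFPROBE",
}

// resolveTool returns a copy of cmd with cmd[0] swapped for its override when
// the override variable is set and the path it names exists. Any other command
// is returned unchanged.
func resolveTool(cmd []string, getenv func(string) string, exists func(string) bool) []string {
	out := append([]string(nil), cmd...) // never mutate the caller's slice
	if len(out) == 0 {
		return out
	}
	key, ok := toolEnv[out[0]]
	if !ok {
		return out
	}
	if p := getenv(key); p != "" && exists(p) {
		out[0] = p
	}
	return out
}

// fileExists is a predicate suitable for the exists parameter in real use,
// alongside os.Getenv for getenv.
func fileExists(p string) bool {
	_, err := os.Stat(p)
	return err == nil
}
```

Only an exact bare name in argv[0] is rewritten, so a call site that already
passes an absolute ffmpeg path keeps that path.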

diff --git a/backend/services/encoder.py b/backend/services/encoder.py
index 346fcb5..1479ef1 100644
--- a/backend/services/encoder.py
+++ b/backend/services/encoder.py
@@ -159,7 +159,7 @@ def _encoder_cache_path() -> str:
 def _ffmpeg_fingerprint() -> str:
     """Cheap fingerprint (path + mtime) to invalidate cache when ffmpeg changes."""
     import shutil
-    ffbin = shutil.which("ffmpeg") or "ffmpeg"
+    ffbin = os.environ.get("PODCLI_FFMPEG") or shutil.which("ffmpeg") or "ffmpeg"
     try:
         st = os.stat(ffbin)
         return f"{ffbin}:{int(st.st_mtime)}:{st.st_size}"
diff --git a/backend/utils/proc.py b/backend/utils/proc.py
index d118dd9..e48b530 100644
--- a/backend/utils/proc.py
+++ b/backend/utils/proc.py
@@ -8,12 +8,24 @@
 from __future__ import annotations
 
 import logging
+import os
 import subprocess
 import time
 from typing import Sequence
 
 log = logging.getLogger("podcli.proc")
 
+_TOOL_ENV = {"ffmpeg": "PODCLI_FFMPEG", "ffprobe": "PODCLI_FFPROBE"}
+
+
+def _resolve_tool(cmd: Sequence[str]) -> list[str]:
+    if not cmd:
+        return list(cmd)
+    override = os.environ.get(_TOOL_ENV.get(cmd[0], ""))
+    if override and os.path.exists(override):
+        return [override, *cmd[1:]]
+    return list(cmd)
+
 
 class ProcError(RuntimeError):
     """Raised when a wrapped subprocess fails or times out."""
@@ -45,6 +57,7 @@ def run(
     """
     if not cmd:
         raise ValueError("proc.run: cmd must be non-empty")
+    cmd = _resolve_tool(cmd)
     tool = cmd[0]
     t0 = time.monotonic()
     log.debug("proc.start tool=%s argc=%d timeout=%.0fs", tool, len(cmd), timeout)
diff --git a/cli/internal/engine/engine.go b/cli/internal/engine/engine.go
index 07d45a7..91a35e9 100644
--- a/cli/internal/engine/engine.go
+++ b/cli/internal/engine/engine.go
@@ -65,12 +65,11 @@ func Python() string {
 	return "python3"
 }
 
-func FFmpeg() string {
-	cands := []string{
-		filepath.Join(paths.RuntimeDir(), "ffmpeg", "ffmpeg"),
-		filepath.Join(paths.RuntimeDir(), "ffmpeg", "ffmpeg.exe"),
-	}
-	for _, p := range cands {
+func runtimeBin(name string) string {
+	for _, p := range []string{
+		filepath.Join(paths.RuntimeDir(), "ffmpeg", name),
+		filepath.Join(paths.RuntimeDir(), "ffmpeg", name+".exe"),
+	} {
 		if exists(p) {
 			return p
 		}
@@ -78,6 +77,9 @@ func FFmpeg() string {
 	return ""
 }
 
+func FFmpeg() string  { return runtimeBin("ffmpeg") }
+func FFprobe() string { return runtimeBin("ffprobe") }
+
 func Run(args []string) (int, error) {
 	root, ok := BackendRoot()
 	if !ok {
@@ -96,6 +98,9 @@ func Run(args []string) (int, error) {
 	if ff := FFmpeg(); ff != "" {
 		env = append(env, "PODCLI_FFMPEG="+ff)
 	}
+	if fp := FFprobe(); fp != "" {
+		env = append(env, "PODCLI_FFPROBE="+fp)
+	}
 	cmd.Env = env
 
 	if err := cmd.Run(); err != nil {
diff --git a/cli/main.go b/cli/main.go
index 3d80285..977b01b 100644
--- a/cli/main.go
+++ b/cli/main.go
@@ -120,6 +120,9 @@ func doctor() {
 	} else {
 		fmt.Printf("  ffmpeg:   PATH fallback (not yet hermetic)\n")
 	}
+	if fp := engine.FFprobe(); fp != "" {
+		fmt.Printf("  ffprobe:  %s (hermetic)\n", fp)
+	}
 	fmt.Println("\nModels")
 	fmt.Printf("  base:     %s\n", presence(provision.ModelPath("base")))
 	fmt.Printf("  vad:      %s\n", presence(provision.VADModelPath()))

From 181bed9bfec177ec5ada5e02e82177da377aa02c Mon Sep 17 00:00:00 2001
From: Nika Siradze <nikushasiradzee@gmail.com>
Date: Thu, 11 Jun 2026 13:30:47 +0400
Subject: [PATCH 07/41] Provision a hermetic ffmpeg/ffprobe via the Go launcher
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

podcli setup now fetches static ffmpeg builds per platform (evermeet on macOS,
johnvansickle tar.xz on Linux, BtbN zip on Windows), extracting the binaries
into the managed runtime dir. Archives are downloaded with the resumable
transfer and unpacked by sniffing magic bytes: zip via archive/zip, tar.xz by
shelling out to the system tar.
ffmpeg failure is non-fatal — the backend falls back to PATH.

The launcher exports the provisioned ffmpeg/ffprobe so the backend uses them.
Static sources aren't checksum-pinned yet (upstream "latest" URLs); they get our
own pinned builds once podcli hosts releases.

Verified on macOS: both binaries provision, run (8.1.1), and the backend's
proc.run uses the hermetic ffprobe; setup is idempotent.
---
 cli/internal/provision/provision.go | 192 +++++++++++++++++++++++++---
 cli/main.go                         |   7 +-
 2 files changed, 183 insertions(+), 16 deletions(-)
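
Format sniffing in extractBins keys on the leading bytes of the archive rather
than the URL, which matters because the evermeet URLs carry no file extension.
A minimal sketch of that check as a pure function; `archiveKind` and
`sniffArchive` are hypothetical names, and unlike the patch this version works
on whatever header bytes the caller managed to read:

```go
package main

import "bytes"

// archiveKind names the container formats the provisioner knows how to unpack.
type archiveKind string

const (
	kindZip     archiveKind = "zip"
	kindTarXz   archiveKind = "tar.xz"
	kindUnknown archiveKind = "unknown"
)

var (
	// Zip local file headers start with "PK".
	zipMagic = []byte{'P', 'K'}
	// The xz stream header is FD 37 7A 58 5A 00.
	xzMagic = []byte{0xFD, '7', 'z', 'X', 'Z', 0x00}
)

// sniffArchive classifies an archive from its first bytes. A header shorter
// than the magic simply fails to match and is reported as unknown.
func sniffArchive(header []byte) archiveKind {
	switch {
	case bytes.HasPrefix(header, zipMagic):
		return kindZip
	case bytes.HasPrefix(header, xzMagic):
		return kindTarXz
	default:
		return kindUnknown
	}
}
```

The xz magic only proves xz compression; treating it as tar.xz is an assumption
that holds for the static builds listed in ffmpegSpecs.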

diff --git a/cli/internal/provision/provision.go b/cli/internal/provision/provision.go
index a00b2eb..f4d8651 100644
--- a/cli/internal/provision/provision.go
+++ b/cli/internal/provision/provision.go
@@ -2,13 +2,16 @@
 package provision
 
 import (
+	"archive/zip"
 	"crypto/sha256"
 	"encoding/hex"
 	"fmt"
 	"io"
 	"net/http"
 	"os"
+	"os/exec"
 	"path/filepath"
+	"runtime"
 	"time"
 
 	"podcli/internal/paths"
@@ -78,14 +81,13 @@ func EnsureVADModel() (string, error) {
 
 const maxAttempts = 6
 
-// download resumes via HTTP Range across transient stalls rather than
-// restarting, then verifies the pinned checksum and renames atomically.
-func download(url, dest, wantSHA, label string) error {
+// fetch resumes via HTTP Range across transient stalls rather than restarting,
+// writing to dest atomically.
+func fetch(url, dest, label string) error {
 	if err := os.MkdirAll(filepath.Dir(dest), 0o755); err != nil {
 		return err
 	}
 	tmp := dest + ".part"
-
 	var lastErr error
 	for attempt := 1; attempt <= maxAttempts; attempt++ {
 		done, err := downloadOnce(url, tmp, label)
@@ -101,20 +103,26 @@ func download(url, dest, wantSHA, label string) error {
 		os.Remove(tmp)
 		return lastErr
 	}
+	return os.Rename(tmp, dest)
+}
 
-	if wantSHA != "" {
-		got, err := sha256file(tmp)
-		if err != nil {
-			return err
-		}
-		if got != wantSHA {
-			os.Remove(tmp)
-			return fmt.Errorf("checksum mismatch for %s: got %s want %s", label, got, wantSHA)
-		}
-	} else {
+func download(url, dest, wantSHA, label string) error {
+	if err := fetch(url, dest, label); err != nil {
+		return err
+	}
+	if wantSHA == "" {
 		fmt.Fprintf(os.Stderr, "  (no pinned checksum for %s — skipped verification)\n", label)
+		return nil
 	}
-	return os.Rename(tmp, dest)
+	got, err := sha256file(dest)
+	if err != nil {
+		return err
+	}
+	if got != wantSHA {
+		os.Remove(dest)
+		return fmt.Errorf("checksum mismatch for %s: got %s want %s", label, got, wantSHA)
+	}
+	return nil
 }
 
 func downloadOnce(url, tmp, label string) (bool, error) {
@@ -178,6 +186,160 @@ func sha256file(path string) (string, error) {
 	return hex.EncodeToString(h.Sum(nil)), nil
 }
 
+type ffArchive struct {
+	URL  string
+	Bins []string
+}
+
+// Static ffmpeg sources are not yet pinned by checksum (upstream "latest" URLs);
+// they get our own pinned builds once podcli hosts releases.
+var ffmpegSpecs = map[string][]ffArchive{
+	"darwin/amd64": {
+		{URL: "https://evermeet.cx/ffmpeg/getrelease/ffmpeg/zip", Bins: []string{"ffmpeg"}},
+		{URL: "https://evermeet.cx/ffmpeg/getrelease/ffprobe/zip", Bins: []string{"ffprobe"}},
+	},
+	"darwin/arm64": {
+		{URL: "https://evermeet.cx/ffmpeg/getrelease/ffmpeg/zip", Bins: []string{"ffmpeg"}},
+		{URL: "https://evermeet.cx/ffmpeg/getrelease/ffprobe/zip", Bins: []string{"ffprobe"}},
+	},
+	"linux/amd64": {
+		{URL: "https://johnvansickle.com/ffmpeg/releases/ffmpeg-release-amd64-static.tar.xz", Bins: []string{"ffmpeg", "ffprobe"}},
+	},
+	"linux/arm64": {
+		{URL: "https://johnvansickle.com/ffmpeg/releases/ffmpeg-release-arm64-static.tar.xz", Bins: []string{"ffmpeg", "ffprobe"}},
+	},
+	"windows/amd64": {
+		{URL: "https://github.com/BtbN/FFmpeg-Builds/releases/download/latest/ffmpeg-master-latest-win64-gpl.zip", Bins: []string{"ffmpeg.exe", "ffprobe.exe"}},
+	},
+}
+
+func exeSuffix() string {
+	if runtime.GOOS == "windows" {
+		return ".exe"
+	}
+	return ""
+}
+
+func FFmpegBin() string {
+	return filepath.Join(paths.RuntimeDir(), "ffmpeg", "ffmpeg"+exeSuffix())
+}
+
+func EnsureFFmpeg() (string, error) {
+	bin := FFmpegBin()
+	if have(bin) {
+		return bin, nil
+	}
+	specs, ok := ffmpegSpecs[runtime.GOOS+"/"+runtime.GOARCH]
+	if !ok {
+		return "", fmt.Errorf("no ffmpeg build for %s/%s", runtime.GOOS, runtime.GOARCH)
+	}
+	dir := filepath.Join(paths.RuntimeDir(), "ffmpeg")
+	if err := os.MkdirAll(dir, 0o755); err != nil {
+		return "", err
+	}
+	for _, a := range specs {
+		sum := sha256.Sum256([]byte(a.URL))
+		archive := filepath.Join(os.TempDir(), "podcli-ff-"+hex.EncodeToString(sum[:8]))
+		if err := fetch(a.URL, archive, "ffmpeg-archive"); err != nil {
+			return "", err
+		}
+		err := extractBins(archive, a.Bins, dir)
+		os.Remove(archive)
+		if err != nil {
+			return "", err
+		}
+	}
+	if !have(bin) {
+		return "", fmt.Errorf("ffmpeg missing after extraction in %s", dir)
+	}
+	return bin, nil
+}
+
+func extractBins(archive string, bins []string, dest string) error {
+	f, err := os.Open(archive)
+	if err != nil {
+		return err
+	}
+	magic := make([]byte, 6)
+	io.ReadFull(f, magic)
+	f.Close()
+	switch {
+	case magic[0] == 'P' && magic[1] == 'K':
+		return extractZip(archive, bins, dest)
+	case magic[0] == 0xFD && string(magic[1:6]) == "7zXZ\x00":
+		return extractTarXz(archive, bins, dest)
+	default:
+		return fmt.Errorf("unrecognized archive format")
+	}
+}
+
+func wantSet(bins []string) map[string]bool {
+	m := make(map[string]bool, len(bins))
+	for _, b := range bins {
+		m[b] = true
+	}
+	return m
+}
+
+func extractZip(archive string, bins []string, dest string) error {
+	zr, err := zip.OpenReader(archive)
+	if err != nil {
+		return err
+	}
+	defer zr.Close()
+	want := wantSet(bins)
+	for _, zf := range zr.File {
+		if zf.FileInfo().IsDir() || !want[filepath.Base(zf.Name)] {
+			continue
+		}
+		rc, err := zf.Open()
+		if err != nil {
+			return err
+		}
+		err = writeBin(rc, filepath.Join(dest, filepath.Base(zf.Name)))
+		rc.Close()
+		if err != nil {
+			return err
+		}
+	}
+	return nil
+}
+
+func extractTarXz(archive string, bins []string, dest string) error {
+	tmp, err := os.MkdirTemp("", "podcli-ffx-")
+	if err != nil {
+		return err
+	}
+	defer os.RemoveAll(tmp)
+	cmd := exec.Command("tar", "-xf", archive, "-C", tmp)
+	cmd.Stderr = os.Stderr
+	if err := cmd.Run(); err != nil {
+		return fmt.Errorf("tar extract (is tar installed?): %w", err)
+	}
+	want := wantSet(bins)
+	return filepath.WalkDir(tmp, func(p string, d os.DirEntry, err error) error {
+		if err != nil || d.IsDir() || !want[filepath.Base(p)] {
+			return err
+		}
+		in, err := os.Open(p)
+		if err != nil {
+			return err
+		}
+		defer in.Close()
+		return writeBin(in, filepath.Join(dest, filepath.Base(p)))
+	})
+}
+
+func writeBin(r io.Reader, dest string) error {
+	out, err := os.OpenFile(dest, os.O_CREATE|os.O_TRUNC|os.O_WRONLY, 0o755)
+	if err != nil {
+		return err
+	}
+	defer out.Close()
+	_, err = io.Copy(out, r)
+	return err
+}
+
 type progress struct {
 	label    string
 	total    int64
diff --git a/cli/main.go b/cli/main.go
index 977b01b..0f5f6f8 100644
--- a/cli/main.go
+++ b/cli/main.go
@@ -98,7 +98,12 @@ func setup(args []string) int {
 		}
 		fmt.Printf("  vad:    %s\n", vp)
 	}
-	fmt.Println("Done. (hermetic python/ffmpeg/whisper.cpp provisioning lands in a later phase)")
+	if fp, err := provision.EnsureFFmpeg(); err != nil {
+		fmt.Fprintf(os.Stderr, "  ffmpeg: skipped (%v) — backend will use PATH ffmpeg\n", err)
+	} else {
+		fmt.Printf("  ffmpeg: %s\n", fp)
+	}
+	fmt.Println("Done. (hermetic python/whisper.cpp provisioning lands in a later phase)")
 	return 0
 }
 

From 827421726216d4fba111e93c6430e2c89c906239 Mon Sep 17 00:00:00 2001
From: Nika Siradze <nikushasiradzee@gmail.com>
Date: Fri, 12 Jun 2026 16:03:45 +0400
Subject: [PATCH 08/41] Provision a hermetic CPython with slim, PyTorch-free
 deps
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

podcli setup now provisions python-build-standalone (resolved via the GitHub
latest-release API so it tracks upstream, preferring CPython 3.12 and falling
back to any 3.x), extracting the install_only tarball
into the managed runtime dir, then pip-installs requirements-runtime.txt — the
backend deps minus openai-whisper/torch. That is the ~2GB the native install
saves; the shipped binary transcribes with whisper.cpp only.

Because the hermetic interpreter has no openai-whisper, the launcher defaults
process/studio to whisper.cpp when it resolves the hermetic Python (whisper-py
stays available on dev/source installs). A guarded integration test covers asset
resolution, tar.gz extraction (incl. the symlink-before-parent-dir ordering),
and that the interpreter runs.

Verified on macOS: Python 3.12.13 provisions and runs; cv2/numpy/Pillow and the
rest of the slim set import cleanly.
---
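Asset resolution in pythonAssetURL is a two-pass preference search over the
release's asset names. The sketch below isolates that search so it can be
reasoned about without the GitHub API; `pickAsset` is a hypothetical name, and
the asset names used to exercise it are made-up examples in the upstream naming
shape, not a real release listing:

```go
package main

import "strings"

// pickAsset returns the first asset name that contains the platform triple,
// ends in install_only.tar.gz, and contains one of the preferred version
// prefixes. Prefixes are tried in order, so an earlier prefix always wins over
// a later one regardless of where the asset sits in the list.
func pickAsset(names []string, triple string, prefer ...string) (string, bool) {
	for _, want := range prefer {
		for _, n := range names {
			if strings.Contains(n, triple) &&
				strings.HasSuffix(n, "install_only.tar.gz") &&
				strings.Contains(n, want) {
				return n, true
			}
		}
	}
	return "", false
}
```

Called as pickAsset(names, triple, "cpython-3.12.", "cpython-3."), this gives
the behavior described above: 3.12 when the release carries it, otherwise the
first 3.x asset for the platform.
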
 backend/requirements-runtime.txt              |  11 ++
 cli/internal/engine/engine.go                 |   7 +
 cli/internal/provision/provision.go           | 161 ++++++++++++++++++
 .../provision/provision_manual_test.go        |  47 +++++
 cli/main.go                                   |  39 +++--
 5 files changed, 253 insertions(+), 12 deletions(-)
 create mode 100644 backend/requirements-runtime.txt
 create mode 100644 cli/internal/provision/provision_manual_test.go

diff --git a/backend/requirements-runtime.txt b/backend/requirements-runtime.txt
new file mode 100644
index 0000000..8f5143d
--- /dev/null
+++ b/backend/requirements-runtime.txt
@@ -0,0 +1,11 @@
+# Hermetic native-CLI runtime deps. Transcription is whisper.cpp (a native
+# binary), so openai-whisper/torch are intentionally absent — that is the ~2GB
+# the native install saves. --engine whisper-py remains a dev/source-only option
+# via backend/requirements.txt.
+opencv-python-headless>=4.8.0
+numpy>=1.24.0
+Pillow>=10.0.0
+questionary>=2.0.0
+python-dotenv>=1.0.0
+google-api-python-client>=2.0.0
+google-auth-oauthlib>=1.0.0
diff --git a/cli/internal/engine/engine.go b/cli/internal/engine/engine.go
index 91a35e9..f98d008 100644
--- a/cli/internal/engine/engine.go
+++ b/cli/internal/engine/engine.go
@@ -7,10 +7,17 @@ import (
 	"os/exec"
 	"path/filepath"
 	"runtime"
+	"strings"
 
 	"podcli/internal/paths"
 )
 
+// IsHermeticPython reports whether the resolved interpreter is the provisioned
+// one (which lacks openai-whisper, so transcription must default to whisper.cpp).
+func IsHermeticPython() bool {
+	return strings.HasPrefix(Python(), paths.RuntimeDir())
+}
+
 func exists(p string) bool {
 	_, err := os.Stat(p)
 	return err == nil
diff --git a/cli/internal/provision/provision.go b/cli/internal/provision/provision.go
index f4d8651..f86c9ce 100644
--- a/cli/internal/provision/provision.go
+++ b/cli/internal/provision/provision.go
@@ -2,9 +2,12 @@
 package provision
 
 import (
+	"archive/tar"
 	"archive/zip"
+	"compress/gzip"
 	"crypto/sha256"
 	"encoding/hex"
+	"encoding/json"
 	"fmt"
 	"io"
 	"net/http"
@@ -12,6 +15,7 @@ import (
 	"os/exec"
 	"path/filepath"
 	"runtime"
+	"strings"
 	"time"
 
 	"podcli/internal/paths"
@@ -340,6 +344,163 @@ func writeBin(r io.Reader, dest string) error {
 	return err
 }
 
+var pyTriples = map[string]string{
+	"darwin/amd64":  "x86_64-apple-darwin",
+	"darwin/arm64":  "aarch64-apple-darwin",
+	"linux/amd64":   "x86_64-unknown-linux-gnu",
+	"linux/arm64":   "aarch64-unknown-linux-gnu",
+	"windows/amd64": "x86_64-pc-windows-msvc",
+}
+
+func PythonBin() string {
+	if runtime.GOOS == "windows" {
+		return filepath.Join(paths.RuntimeDir(), "python", "python.exe")
+	}
+	return filepath.Join(paths.RuntimeDir(), "python", "bin", "python3")
+}
+
+// pythonAssetURL resolves a python-build-standalone install_only tarball for
+// this platform via the GitHub latest-release API, so it tracks upstream
+// without a hardcoded version that rots.
+func pythonAssetURL() (string, error) {
+	triple, ok := pyTriples[runtime.GOOS+"/"+runtime.GOARCH]
+	if !ok {
+		return "", fmt.Errorf("no python build for %s/%s", runtime.GOOS, runtime.GOARCH)
+	}
+	req, _ := http.NewRequest(http.MethodGet, "https://api.github.com/repos/astral-sh/python-build-standalone/releases/latest", nil)
+	req.Header.Set("Accept", "application/vnd.github+json")
+	resp, err := http.DefaultClient.Do(req)
+	if err != nil {
+		return "", err
+	}
+	defer resp.Body.Close()
+	if resp.StatusCode != http.StatusOK {
+		return "", fmt.Errorf("github api: HTTP %d", resp.StatusCode)
+	}
+	var rel struct {
+		Assets []struct {
+			Name string `json:"name"`
+			URL  string `json:"browser_download_url"`
+		} `json:"assets"`
+	}
+	if err := json.NewDecoder(resp.Body).Decode(&rel); err != nil {
+		return "", err
+	}
+	match := func(prefer string) string {
+		for _, a := range rel.Assets {
+			if strings.Contains(a.Name, triple) && strings.HasSuffix(a.Name, "install_only.tar.gz") && strings.Contains(a.Name, prefer) {
+				return a.URL
+			}
+		}
+		return ""
+	}
+	if u := match("cpython-3.12."); u != "" {
+		return u, nil
+	}
+	if u := match("cpython-3."); u != "" {
+		return u, nil
+	}
+	return "", fmt.Errorf("no install_only python asset for %s", triple)
+}
+
+func EnsurePython(requirements string) (string, error) {
+	bin := PythonBin()
+	if !have(bin) {
+		url, err := pythonAssetURL()
+		if err != nil {
+			return "", err
+		}
+		sum := sha256.Sum256([]byte(url))
+		archive := filepath.Join(os.TempDir(), "podcli-py-"+hex.EncodeToString(sum[:8])+".tar.gz")
+		if err := fetch(url, archive, "cpython"); err != nil {
+			return "", err
+		}
+		err = extractTarGz(archive, paths.RuntimeDir())
+		os.Remove(archive)
+		if err != nil {
+			return "", err
+		}
+		if !have(bin) {
+			return "", fmt.Errorf("python missing after extraction")
+		}
+	}
+	if requirements != "" {
+		if err := pipInstall(bin, requirements); err != nil {
+			return "", err
+		}
+	}
+	return bin, nil
+}
+
+func pipInstall(pybin, requirements string) error {
+	fmt.Fprintf(os.Stderr, "  installing python deps (%s)\n", filepath.Base(requirements))
+	cmd := exec.Command(pybin, "-m", "pip", "install", "--disable-pip-version-check", "-q", "-r", requirements)
+	cmd.Stdout, cmd.Stderr = os.Stderr, os.Stderr
+	return cmd.Run()
+}
+
+func extractTarGz(archive, dest string) error {
+	f, err := os.Open(archive)
+	if err != nil {
+		return err
+	}
+	defer f.Close()
+	gz, err := gzip.NewReader(f)
+	if err != nil {
+		return err
+	}
+	defer gz.Close()
+	tr := tar.NewReader(gz)
+	root := filepath.Clean(dest) + string(os.PathSeparator)
+	for {
+		h, err := tr.Next()
+		if err == io.EOF {
+			break
+		}
+		if err != nil {
+			return err
+		}
+		target := filepath.Join(dest, h.Name)
+		if !strings.HasPrefix(target, root) {
+			return fmt.Errorf("unsafe path in archive: %s", h.Name)
+		}
+		switch h.Typeflag {
+		case tar.TypeDir:
+			if err := os.MkdirAll(target, 0o755); err != nil {
+				return err
+			}
+		case tar.TypeReg:
+			if err := os.MkdirAll(filepath.Dir(target), 0o755); err != nil {
+				return err
+			}
+			out, err := os.OpenFile(target, os.O_CREATE|os.O_TRUNC|os.O_WRONLY, os.FileMode(h.Mode))
+			if err != nil {
+				return err
+			}
+			_, err = io.Copy(out, tr)
+			out.Close()
+			if err != nil {
+				return err
+			}
+		case tar.TypeSymlink:
+			if err := os.MkdirAll(filepath.Dir(target), 0o755); err != nil {
+				return err
+			}
+			os.Remove(target)
+			if err := os.Symlink(h.Linkname, target); err != nil {
+				return err
+			}
+		case tar.TypeLink:
+			if err := os.MkdirAll(filepath.Dir(target), 0o755); err != nil {
+				return err
+			}
+			os.Remove(target)
+			os.Link(filepath.Join(dest, h.Linkname), target)
+		}
+	}
+	return nil
+}
+
 type progress struct {
 	label    string
 	total    int64
diff --git a/cli/internal/provision/provision_manual_test.go b/cli/internal/provision/provision_manual_test.go
new file mode 100644
index 0000000..d44c3a9
--- /dev/null
+++ b/cli/internal/provision/provision_manual_test.go
@@ -0,0 +1,47 @@
+package provision
+
+import (
+	"os"
+	"os/exec"
+	"strings"
+	"testing"
+
+	"podcli/internal/paths"
+)
+
+// Guarded network integration check (set PODCLI_MANUAL=1). Validates GitHub-API
+// asset resolution, tar.gz extraction (incl. symlinks), and that the provisioned
+// interpreter runs. PODCLI_PY_ARCHIVE=<file> extracts a local tarball instead of
+// downloading; PODCLI_REQS also tests pip install.
+func TestEnsurePythonManual(t *testing.T) {
+	if os.Getenv("PODCLI_MANUAL") == "" {
+		t.Skip("set PODCLI_MANUAL=1 to run")
+	}
+	if arc := os.Getenv("PODCLI_PY_ARCHIVE"); arc != "" {
+		if err := os.MkdirAll(paths.RuntimeDir(), 0o755); err != nil {
+			t.Fatal(err)
+		}
+		if err := extractTarGz(arc, paths.RuntimeDir()); err != nil {
+			t.Fatal(err)
+		}
+		assertRuns(t, PythonBin())
+		return
+	}
+	bin, err := EnsurePython(os.Getenv("PODCLI_REQS"))
+	if err != nil {
+		t.Fatal(err)
+	}
+	assertRuns(t, bin)
+}
+
+func assertRuns(t *testing.T, bin string) {
+	out, err := exec.Command(bin, "--version").CombinedOutput()
+	if err != nil {
+		t.Fatal(err)
+	}
+	v := strings.TrimSpace(string(out))
+	t.Logf("python %s -> %s", bin, v)
+	if !strings.HasPrefix(v, "Python 3.") {
+		t.Fatalf("unexpected version: %q", v)
+	}
+}
diff --git a/cli/main.go b/cli/main.go
index 0f5f6f8..84307ab 100644
--- a/cli/main.go
+++ b/cli/main.go
@@ -5,6 +5,7 @@ package main
 import (
 	"fmt"
 	"os"
+	"path/filepath"
 	"strings"
 
 	"podcli/internal/engine"
@@ -34,11 +35,12 @@ func main() {
 	case "help", "--help", "-h":
 		printHelp()
 	default:
-		if usesWhisperCpp(args) {
+		if transcribeEngine(args) == "whispercpp" {
 			if _, err := provision.EnsureModel("base"); err != nil {
 				fmt.Fprintln(os.Stderr, "podcli: provisioning model:", err)
 				os.Exit(1)
 			}
+			os.Setenv("PODCLI_ENGINE", "whispercpp")
 		}
 		code, err := engine.Run(args)
 		if err != nil {
@@ -49,24 +51,29 @@ func main() {
 	}
 }
 
-func usesWhisperCpp(args []string) bool {
-	cmd := args[0]
-	if cmd != "process" && cmd != "studio" {
-		return false
+// transcribeEngine resolves which engine a process/studio run will use, honoring
+// --engine, PODCLI_ENGINE, then defaulting to whisper.cpp on a hermetic Python
+// (which has no openai-whisper).
+func transcribeEngine(args []string) string {
+	if args[0] != "process" && args[0] != "studio" {
+		return ""
 	}
-	engineSel := strings.ToLower(os.Getenv("PODCLI_ENGINE"))
+	sel := strings.ToLower(os.Getenv("PODCLI_ENGINE"))
 	for i, a := range args {
 		if a == "--engine" && i+1 < len(args) {
-			engineSel = strings.ToLower(args[i+1])
+			sel = strings.ToLower(args[i+1])
 		} else if strings.HasPrefix(a, "--engine=") {
-			engineSel = strings.ToLower(strings.TrimPrefix(a, "--engine="))
+			sel = strings.ToLower(strings.TrimPrefix(a, "--engine="))
 		}
 	}
-	switch engineSel {
+	if sel == "" && engine.IsHermeticPython() {
+		sel = "whispercpp"
+	}
+	switch sel {
 	case "whispercpp", "whisper-cpp", "whisper.cpp", "cpp":
-		return true
+		return "whispercpp"
 	}
-	return false
+	return sel
 }
 
 func setup(args []string) int {
@@ -103,7 +110,15 @@ func setup(args []string) int {
 	} else {
 		fmt.Printf("  ffmpeg: %s\n", fp)
 	}
-	fmt.Println("Done. (hermetic python/whisper.cpp provisioning lands in a later phase)")
+	if root, ok := engine.BackendRoot(); ok {
+		reqs := filepath.Join(root, "requirements-runtime.txt")
+		if pb, err := provision.EnsurePython(reqs); err != nil {
+			fmt.Fprintf(os.Stderr, "  python: skipped (%v) — using dev venv / system python\n", err)
+		} else {
+			fmt.Printf("  python: %s\n", pb)
+		}
+	}
+	fmt.Println("Done. (whisper.cpp binary provisioning lands once podcli hosts builds)")
 	return 0
 }
 

From 918db263748d69f91ae5cf18bf9f0c8080b1c482 Mon Sep 17 00:00:00 2001
From: Nika Siradze <nikushasiradzee@gmail.com>
Date: Fri, 12 Jun 2026 16:56:11 +0400
Subject: [PATCH 09/41] Add self-update check and the auto-update off-switch

The launcher now checks GitHub Releases for a newer version: a fast, non-fatal
on-launch notice before engine commands (1.5 s timeout, silent on any error), and
`podcli update` for an explicit check. With no binary release published yet,
`podcli update` points the user at the npm/bun reinstall (the designed fallback);
the in-place binary swap lands once releases publish per-platform assets.

Adds a config package persisting the off-switch: `podcli config set update.auto
off` (or PODCLI_NO_UPDATE=1) silences all checks. `config get/set` are owned by
the launcher; other `config` subcommands (status/export/import/use) still pass
through to the Python backend.

Verified: off-switch round-trips to config.json, update degrades gracefully with
no release, unknown keys error, and Python config passthrough works.
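
The off-switch precedence described above can be sketched as a pure function.
This is a minimal illustration, not the code in cli/internal/config:
`autoUpdate` and its parameters are hypothetical names, and the environment
value and persisted setting are passed in explicitly so the rule is easy to test.

```go
package main

import (
	"fmt"
	"strings"
)

// truthy reports whether an environment value means "enabled".
func truthy(s string) bool {
	switch strings.ToLower(strings.TrimSpace(s)) {
	case "1", "true", "yes", "on":
		return true
	}
	return false
}

// autoUpdate is a hypothetical pure version of the precedence rule:
// PODCLI_NO_UPDATE wins, then the persisted update.auto value, then the
// default (on). A nil configured pointer means "never set".
func autoUpdate(noUpdateEnv string, configured *bool) bool {
	if truthy(noUpdateEnv) {
		return false
	}
	if configured != nil {
		return *configured
	}
	return true
}

func main() {
	off := false
	fmt.Println(autoUpdate("", nil))  // true: nothing set, checks stay on
	fmt.Println(autoUpdate("", &off)) // false: disabled in config.json
	fmt.Println(autoUpdate("1", nil)) // false: environment override
}
```

Using a pointer for the persisted value keeps "unset" distinct from "explicitly
on", so the default can change later without rewriting existing config files.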
---
 cli/internal/config/config.go | 83 ++++++++++++++++++++++++++++++++
 cli/internal/update/update.go | 89 +++++++++++++++++++++++++++++++++++
 cli/main.go                   | 62 +++++++++++++++++++-----
 3 files changed, 221 insertions(+), 13 deletions(-)
 create mode 100644 cli/internal/config/config.go
 create mode 100644 cli/internal/update/update.go

diff --git a/cli/internal/config/config.go b/cli/internal/config/config.go
new file mode 100644
index 0000000..ef972fe
--- /dev/null
+++ b/cli/internal/config/config.go
@@ -0,0 +1,83 @@
+// Package config persists launcher settings (currently the auto-update
+// off-switch) in the managed dir's config.json.
+package config
+
+import (
+	"encoding/json"
+	"fmt"
+	"os"
+	"strings"
+
+	"podcli/internal/paths"
+)
+
+type Config struct {
+	Update struct {
+		Auto *bool `json:"auto,omitempty"`
+	} `json:"update"`
+}
+
+func Load() Config {
+	var c Config
+	if b, err := os.ReadFile(paths.ConfigPath()); err == nil {
+		json.Unmarshal(b, &c)
+	}
+	return c
+}
+
+func (c Config) Save() error {
+	if err := os.MkdirAll(paths.Home(), 0o755); err != nil {
+		return err
+	}
+	b, _ := json.MarshalIndent(c, "", "  ")
+	return os.WriteFile(paths.ConfigPath(), b, 0o644)
+}
+
+func truthy(s string) bool {
+	switch strings.ToLower(strings.TrimSpace(s)) {
+	case "1", "true", "yes", "on":
+		return true
+	}
+	return false
+}
+
+// AutoUpdate is true unless disabled via config update.auto or PODCLI_NO_UPDATE.
+func AutoUpdate() bool {
+	if truthy(os.Getenv("PODCLI_NO_UPDATE")) {
+		return false
+	}
+	if a := Load().Update.Auto; a != nil {
+		return *a
+	}
+	return true
+}
+
+func Get(key string) (string, error) {
+	switch key {
+	case "update.auto":
+		if a := Load().Update.Auto; a != nil && !*a {
+			return "off", nil
+		}
+		return "on", nil
+	}
+	return "", fmt.Errorf("unknown config key %q (known: update.auto)", key)
+}
+
+func Set(key, val string) error {
+	switch key {
+	case "update.auto":
+		on := !offValue(val)
+		c := Load()
+		c.Update.Auto = &on
+		return c.Save()
+	}
+	return fmt.Errorf("unknown config key %q (known: update.auto)", key)
+}
+
+func offValue(s string) bool {
+	switch strings.ToLower(strings.TrimSpace(s)) {
+	case "off", "false", "no", "0", "disable", "disabled":
+		return true
+	}
+	return false
+}
diff --git a/cli/internal/update/update.go b/cli/internal/update/update.go
new file mode 100644
index 0000000..4a3ada1
--- /dev/null
+++ b/cli/internal/update/update.go
@@ -0,0 +1,89 @@
+// Package update checks GitHub Releases for a newer podcli and (once releases
+// publish per-platform binaries) applies it. For now a manual update points the
+// user at their package manager, matching the npm/bun reinstall fallback.
+package update
+
+import (
+	"encoding/json"
+	"fmt"
+	"net/http"
+	"os"
+	"strconv"
+	"strings"
+	"time"
+
+	"podcli/internal/config"
+)
+
+const repo = "nmbrthirteen/podcli"
+
+func latestTag(timeout time.Duration) (string, error) {
+	client := &http.Client{Timeout: timeout}
+	req, _ := http.NewRequest(http.MethodGet, "https://api.github.com/repos/"+repo+"/releases/latest", nil)
+	req.Header.Set("Accept", "application/vnd.github+json")
+	resp, err := client.Do(req)
+	if err != nil {
+		return "", err
+	}
+	defer resp.Body.Close()
+	if resp.StatusCode != http.StatusOK {
+		return "", fmt.Errorf("no published release (HTTP %d)", resp.StatusCode)
+	}
+	var rel struct {
+		Tag string `json:"tag_name"`
+	}
+	if err := json.NewDecoder(resp.Body).Decode(&rel); err != nil {
+		return "", err
+	}
+	return strings.TrimPrefix(rel.Tag, "v"), nil
+}
+
+func parseVer(v string) [3]int {
+	v = strings.TrimPrefix(v, "v")
+	v = strings.SplitN(v, "-", 2)[0] // drop -dev / pre-release
+	var out [3]int
+	for i, p := range strings.SplitN(v, ".", 3) {
+		out[i], _ = strconv.Atoi(p)
+	}
+	return out
+}
+
+func newer(remote, current string) bool {
+	r, c := parseVer(remote), parseVer(current)
+	for i := 0; i < 3; i++ {
+		if r[i] != c[i] {
+			return r[i] > c[i]
+		}
+	}
+	return false
+}
+
+// NotifyIfOutdated prints a one-line notice when a newer release exists. Fast,
+// silent on any error, and respects the off-switch.
+func NotifyIfOutdated(current string) {
+	if !config.AutoUpdate() {
+		return
+	}
+	tag, err := latestTag(1500 * time.Millisecond)
+	if err != nil {
+		return
+	}
+	if newer(tag, current) {
+		fmt.Fprintf(os.Stderr, "  ↑ podcli %s available (you have %s) — run `podcli update`\n", tag, current)
+	}
+}
+
+func Run(current string) int {
+	tag, err := latestTag(10 * time.Second)
+	if err != nil {
+		fmt.Fprintf(os.Stderr, "podcli: update check failed: %v\n", err)
+		return 1
+	}
+	if !newer(tag, current) {
+		fmt.Printf("podcli %s is up to date.\n", current)
+		return 0
+	}
+	fmt.Printf("podcli %s available (you have %s).\n", tag, current)
+	fmt.Println("Reinstall via your package manager:  npm i -g podcli   (or: bun add -g podcli)")
+	return 0
+}
diff --git a/cli/main.go b/cli/main.go
index 84307ab..cc9594b 100644
--- a/cli/main.go
+++ b/cli/main.go
@@ -8,9 +8,11 @@ import (
 	"path/filepath"
 	"strings"
 
+	"podcli/internal/config"
 	"podcli/internal/engine"
 	"podcli/internal/paths"
 	"podcli/internal/provision"
+	"podcli/internal/update"
 )
 
 // Version is set at build time via -ldflags "-X main.Version=...".
@@ -29,26 +31,58 @@ func main() {
 	case "doctor":
 		doctor()
 	case "update":
-		fmt.Println("self-update: not yet implemented (Phase 2 — GitHub Releases + atomic swap)")
+		os.Exit(update.Run(Version))
 	case "setup":
 		os.Exit(setup(args[1:]))
+	case "config":
+		if len(args) >= 2 && (args[1] == "get" || args[1] == "set") {
+			os.Exit(configCmd(args[1:]))
+		}
+		os.Exit(runEngine(args)) // status/export/import/use → Python
 	case "help", "--help", "-h":
 		printHelp()
 	default:
-		if transcribeEngine(args) == "whispercpp" {
-			if _, err := provision.EnsureModel("base"); err != nil {
-				fmt.Fprintln(os.Stderr, "podcli: provisioning model:", err)
-				os.Exit(1)
-			}
-			os.Setenv("PODCLI_ENGINE", "whispercpp")
+		os.Exit(runEngine(args))
+	}
+}
+
+func runEngine(args []string) int {
+	update.NotifyIfOutdated(Version)
+	if transcribeEngine(args) == "whispercpp" {
+		if _, err := provision.EnsureModel("base"); err != nil {
+			fmt.Fprintln(os.Stderr, "podcli: provisioning model:", err)
+			return 1
 		}
-		code, err := engine.Run(args)
+		os.Setenv("PODCLI_ENGINE", "whispercpp")
+	}
+	code, err := engine.Run(args)
+	if err != nil {
+		fmt.Fprintln(os.Stderr, "podcli:", err)
+		return 1
+	}
+	return code
+}
+
+func configCmd(args []string) int {
+	switch {
+	case args[0] == "get" && len(args) == 2:
+		v, err := config.Get(args[1])
 		if err != nil {
 			fmt.Fprintln(os.Stderr, "podcli:", err)
-			os.Exit(1)
+			return 1
 		}
-		os.Exit(code)
+		fmt.Println(v)
+	case args[0] == "set" && len(args) == 3:
+		if err := config.Set(args[1], args[2]); err != nil {
+			fmt.Fprintln(os.Stderr, "podcli:", err)
+			return 1
+		}
+		fmt.Printf("%s = %s\n", args[1], args[2])
+	default:
+		fmt.Fprintln(os.Stderr, "usage: podcli config get <key> | config set <key> <value>")
+		return 2
 	}
+	return 0
 }
 
 // transcribeEngine resolves which engine a process/studio run will use, honoring
@@ -169,11 +203,13 @@ Engine commands (routed to the processing backend):
   knowledge | presets | assets | youtube | config | cache | info
 
 Launcher commands:
-  doctor               Show resolved paths, interpreter, backend, ffmpeg
+  doctor               Show resolved paths, interpreter, backend, ffmpeg, models
   version              Print version
-  update               Self-update (coming in Phase 2)
+  update               Check for and apply a newer release
   setup [--model base] [--vad]
-                       Provision models into the managed dir (runtimes: later phase)
+                       Provision runtimes + models into the managed dir
+  config set update.auto off    Disable auto-update (also: PODCLI_NO_UPDATE=1)
+  config get update.auto
 
 Run a command with --help for its options.
 `, Version)

From ecd82388f4e934064d0b80e2b2e5a7c7b448d055 Mon Sep 17 00:00:00 2001
From: Nika Siradze <nikushasiradzee@gmail.com>
Date: Fri, 12 Jun 2026 16:59:23 +0400
Subject: [PATCH 10/41] Add npm/bun wrapper package for distribution

A thin Node package (name: podcli) whose postinstall fetches the platform's
native binary into the managed dir and whose bin shim execs it, passing args and
exit codes through. If the pre-fetch is blocked, the shim fetches lazily on first
run, so a failed postinstall never bricks the install. Setting PODCLI_BINARY_SRC
copies a local binary instead of downloading (for testing and the release build).

This is the npm i -g podcli / bun add -g podcli front door; it pairs with the
self-update fallback that points users back here.

Verified locally: install, shim exec with arg + exit-code passthrough, and lazy
first-run fetch all work against a local binary.
---
 npm/README.md          | 18 ++++++++
 npm/bin/podcli.js      | 24 +++++++++++
 npm/package.json       | 31 ++++++++++++++
 npm/scripts/install.js | 94 ++++++++++++++++++++++++++++++++++++++++++
 4 files changed, 167 insertions(+)
 create mode 100644 npm/README.md
 create mode 100644 npm/bin/podcli.js
 create mode 100644 npm/package.json
 create mode 100644 npm/scripts/install.js

diff --git a/npm/README.md b/npm/README.md
new file mode 100644
index 0000000..5cfa5c1
--- /dev/null
+++ b/npm/README.md
@@ -0,0 +1,18 @@
+# podcli
+
+AI podcast clip generator — transcribe, find viral moments, and export vertical
+short-form clips with burned captions.
+
+```sh
+npm i -g podcli      # or: bun add -g podcli
+podcli setup         # one-time: provisions a self-contained runtime
+podcli process episode.mp4 --top 5
+```
+
+Installing this package downloads the native `podcli` binary for your platform
+into a managed directory; the `podcli` command is a thin shim that runs it. The
+binary is self-contained — `podcli setup` provisions its own Python, ffmpeg,
+whisper.cpp, and models, so there is nothing else to install.
+
+Updates: `podcli update` checks for a newer release. Disable auto-update checks
+with `podcli config set update.auto off` (or `PODCLI_NO_UPDATE=1`).
diff --git a/npm/bin/podcli.js b/npm/bin/podcli.js
new file mode 100644
index 0000000..afc9bbd
--- /dev/null
+++ b/npm/bin/podcli.js
@@ -0,0 +1,24 @@
+#!/usr/bin/env node
+'use strict';
+
+const fs = require('fs');
+const { spawnSync } = require('child_process');
+const { binPath, ensure } = require('../scripts/install');
+
+(async () => {
+  let bin = binPath();
+  if (!fs.existsSync(bin)) {
+    try {
+      bin = await ensure();
+    } catch (e) {
+      console.error('podcli: failed to fetch native binary:', e.message);
+      process.exit(1);
+    }
+  }
+  const r = spawnSync(bin, process.argv.slice(2), { stdio: 'inherit' });
+  if (r.error) {
+    console.error('podcli:', r.error.message);
+    process.exit(1);
+  }
+  process.exit(r.status == null ? 1 : r.status);
+})();
diff --git a/npm/package.json b/npm/package.json
new file mode 100644
index 0000000..318a39c
--- /dev/null
+++ b/npm/package.json
@@ -0,0 +1,31 @@
+{
+  "name": "podcli",
+  "version": "2.0.0",
+  "description": "AI podcast clip generator — native CLI. Transcribe, find viral moments, export vertical clips with burned captions.",
+  "bin": {
+    "podcli": "bin/podcli.js"
+  },
+  "scripts": {
+    "postinstall": "node scripts/install.js"
+  },
+  "files": [
+    "bin/",
+    "scripts/"
+  ],
+  "engines": {
+    "node": ">=16"
+  },
+  "license": "AGPL-3.0-only",
+  "repository": {
+    "type": "git",
+    "url": "https://github.com/nmbrthirteen/podcli.git"
+  },
+  "keywords": [
+    "podcast",
+    "clips",
+    "shorts",
+    "whisper",
+    "ffmpeg",
+    "cli"
+  ]
+}
diff --git a/npm/scripts/install.js b/npm/scripts/install.js
new file mode 100644
index 0000000..d0ebd9b
--- /dev/null
+++ b/npm/scripts/install.js
@@ -0,0 +1,94 @@
+'use strict';
+
+// Fetches the native podcli binary for this platform into the managed dir. Runs
+// on npm/bun postinstall, and again lazily from the bin shim if the binary is
+// missing. PODCLI_BINARY_SRC=<path> copies a local binary instead of downloading
+// (used for testing and from the release build before publishing).
+
+const os = require('os');
+const fs = require('fs');
+const path = require('path');
+const https = require('https');
+const { version } = require('../package.json');
+
+const REPO = 'nmbrthirteen/podcli';
+
+const TARGETS = {
+  'darwin-x64': 'darwin-amd64',
+  'darwin-arm64': 'darwin-arm64',
+  'linux-x64': 'linux-amd64',
+  'linux-arm64': 'linux-arm64',
+  'win32-x64': 'windows-amd64',
+};
+
+function defaultHome() {
+  const h = os.homedir();
+  if (process.platform === 'darwin') return path.join(h, 'Library', 'Application Support', 'podcli');
+  if (process.platform === 'win32') return path.join(process.env.LOCALAPPDATA || path.join(h, 'AppData', 'Local'), 'podcli');
+  return process.env.XDG_DATA_HOME ? path.join(process.env.XDG_DATA_HOME, 'podcli') : path.join(h, '.local', 'share', 'podcli');
+}
+
+function binPath() {
+  const home = process.env.PODCLI_HOME || defaultHome();
+  return path.join(home, 'bin', process.platform === 'win32' ? 'podcli.exe' : 'podcli');
+}
+
+function target() {
+  const key = `${process.platform}-${process.arch}`;
+  const t = TARGETS[key];
+  if (!t) throw new Error(`unsupported platform ${key}`);
+  return t;
+}
+
+function download(url, dest, redirects) {
+  redirects = redirects || 0;
+  return new Promise((resolve, reject) => {
+    https
+      .get(url, { headers: { 'User-Agent': 'podcli-install' } }, (res) => {
+        if ([301, 302, 307, 308].includes(res.statusCode) && res.headers.location && redirects < 6) {
+          res.resume();
+          return resolve(download(res.headers.location, dest, redirects + 1));
+        }
+        if (res.statusCode !== 200) {
+          res.resume();
+          return reject(new Error(`HTTP ${res.statusCode} for ${url}`));
+        }
+        const tmp = dest + '.part';
+        const file = fs.createWriteStream(tmp);
+        res.pipe(file);
+        file.on('finish', () => file.close(() => {
+          fs.renameSync(tmp, dest);
+          resolve();
+        }));
+        file.on('error', reject);
+      })
+      .on('error', reject);
+  });
+}
+
+async function ensure() {
+  const dest = binPath();
+  fs.mkdirSync(path.dirname(dest), { recursive: true });
+  const src = process.env.PODCLI_BINARY_SRC;
+  if (src) {
+    fs.copyFileSync(src, dest);
+  } else {
+    const ext = process.platform === 'win32' ? '.exe' : '';
+    const url = `https://github.com/${REPO}/releases/download/v${version}/podcli-${target()}${ext}`;
+    await download(url, dest);
+  }
+  if (process.platform !== 'win32') fs.chmodSync(dest, 0o755);
+  return dest;
+}
+
+module.exports = { binPath, ensure };
+
+if (require.main === module) {
+  ensure()
+    .then((d) => console.log(`podcli binary ready: ${d}`))
+    .catch((e) => {
+      // Don't hard-fail the install — the bin shim fetches it on first run.
+      console.error(`podcli: could not pre-fetch binary (${e.message}); it will be fetched on first run.`);
+      process.exit(0);
+    });
+}

From db2c3496724b3bf35ea6acca6680cb372c0c2a5f Mon Sep 17 00:00:00 2001
From: Nika Siradze <nikushasiradzee@gmail.com>
Date: Fri, 12 Jun 2026 17:05:23 +0400
Subject: [PATCH 11/41] Add release workflow: build 5 launcher binaries,
 release, publish npm

On a v* tag, cross-compiles the Go launcher for darwin/linux (amd64+arm64) and
windows/amd64 with the version stamped via ldflags, publishes them as release
assets under the names the npm wrapper and self-update expect, then publishes the
npm package pinned to the tag.

whisper.cpp per-platform builds are left as a marked TODO (they need native
runners); until then transcription uses whichever whisper-cli is on PATH.

Verified locally: the build command cross-compiles (e.g. linux/arm64) and stamps
main.Version; the workflow YAML parses.
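
The asset naming contract shared by the workflow, the npm wrapper, and
self-update can be sketched as below. The helper names `assetName` and
`assetURL` and their signatures are illustrative assumptions; only the naming
pattern itself (podcli-<goos>-<goarch>, with .exe on Windows, under the v<version>
tag) comes from this series.

```go
package main

import "fmt"

// assetName is a hypothetical helper returning the release asset name for a
// GOOS/GOARCH pair, matching the OUT variable in the build job.
func assetName(goos, goarch string) string {
	name := fmt.Sprintf("podcli-%s-%s", goos, goarch)
	if goos == "windows" {
		name += ".exe"
	}
	return name
}

// assetURL builds the download URL for a bare version such as "2.0.0".
// The tag carries a "v" prefix; the version stamped into the binary does not.
func assetURL(repo, version, goos, goarch string) string {
	return fmt.Sprintf("https://github.com/%s/releases/download/v%s/%s",
		repo, version, assetName(goos, goarch))
}

func main() {
	fmt.Println(assetName("linux", "arm64"))
	// prints: podcli-linux-arm64
	fmt.Println(assetURL("nmbrthirteen/podcli", "2.0.0", "windows", "amd64"))
	// prints: https://github.com/nmbrthirteen/podcli/releases/download/v2.0.0/podcli-windows-amd64.exe
}
```

Keeping the name in one function per consumer makes a mismatch between the
workflow and the downloaders easy to catch with a unit test.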
---
 .github/workflows/release.yml | 81 +++++++++++++++++++++++++++++++++++
 1 file changed, 81 insertions(+)
 create mode 100644 .github/workflows/release.yml

diff --git a/.github/workflows/release.yml b/.github/workflows/release.yml
new file mode 100644
index 0000000..a9ad7a7
--- /dev/null
+++ b/.github/workflows/release.yml
@@ -0,0 +1,81 @@
+# Cuts a release on a v* tag: cross-compiles the Go launcher for all 5 targets,
+# publishes them as release assets (the names the npm wrapper + self-update
+# expect), then publishes the npm package pinned to the tag.
+#
+# TODO: a separate matrix job should build whisper.cpp per platform on native
+# runners and attach whisper-cli-<os>-<arch> assets, so the provisioner can fetch
+# a hermetic binary instead of relying on a PATH/brew install.
+name: release
+
+on:
+  push:
+    tags:
+      - 'v*'
+
+permissions:
+  contents: write
+
+jobs:
+  build:
+    runs-on: ubuntu-latest
+    strategy:
+      matrix:
+        include:
+          - { goos: darwin, goarch: amd64 }
+          - { goos: darwin, goarch: arm64 }
+          - { goos: linux, goarch: amd64 }
+          - { goos: linux, goarch: arm64 }
+          - { goos: windows, goarch: amd64 }
+    steps:
+      - uses: actions/checkout@v4
+      - uses: actions/setup-go@v5
+        with:
+          go-version: '1.23'
+      - name: Build launcher
+        working-directory: cli
+        env:
+          GOOS: ${{ matrix.goos }}
+          GOARCH: ${{ matrix.goarch }}
+          CGO_ENABLED: '0'
+        run: |
+          VERSION="${GITHUB_REF_NAME#v}"
+          EXT=""
+          [ "${{ matrix.goos }}" = "windows" ] && EXT=".exe"
+          OUT="podcli-${{ matrix.goos }}-${{ matrix.goarch }}${EXT}"
+          go build -trimpath -ldflags "-s -w -X main.Version=${VERSION}" -o "../${OUT}" .
+          echo "ASSET=${OUT}" >> "$GITHUB_ENV"
+      - uses: actions/upload-artifact@v4
+        with:
+          name: ${{ env.ASSET }}
+          path: ${{ env.ASSET }}
+
+  release:
+    needs: build
+    runs-on: ubuntu-latest
+    steps:
+      - uses: actions/download-artifact@v4
+        with:
+          path: dist
+          merge-multiple: true
+      - uses: softprops/action-gh-release@v2
+        with:
+          files: dist/*
+          generate_release_notes: true
+
+  publish-npm:
+    needs: release
+    runs-on: ubuntu-latest
+    steps:
+      - uses: actions/checkout@v4
+      - uses: actions/setup-node@v4
+        with:
+          node-version: '20'
+          registry-url: 'https://registry.npmjs.org'
+      - name: Publish wrapper pinned to the tag
+        working-directory: npm
+        env:
+          NODE_AUTH_TOKEN: ${{ secrets.NPM_TOKEN }}
+        run: |
+          VERSION="${GITHUB_REF_NAME#v}"
+          npm version "$VERSION" --no-git-tag-version --allow-same-version
+          npm publish --access public

From 6ded3d2e8f5c8171c80a622201097442d47db37f Mon Sep 17 00:00:00 2001
From: Nika Siradze <nikushasiradzee@gmail.com>
Date: Fri, 12 Jun 2026 17:09:12 +0400
Subject: [PATCH 12/41] Implement self-update binary download + atomic swap
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

`podcli update` now downloads the release binary for this platform and swaps the
managed binary (~/.podcli/bin/podcli) in place. That is the file both the npm
shim and direct installs exec, so it updates regardless of install method. Unix
uses an atomic rename; Windows moves the running .exe aside first. If the
download or swap fails, it prints the npm/bun reinstall message and exits
non-zero.

Unit tests cover the swap and the version comparison.
---
 cli/internal/update/update.go      | 95 +++++++++++++++++++++++++++++-
 cli/internal/update/update_test.go | 45 ++++++++++++++
 2 files changed, 138 insertions(+), 2 deletions(-)
 create mode 100644 cli/internal/update/update_test.go

diff --git a/cli/internal/update/update.go b/cli/internal/update/update.go
index 4a3ada1..b3a2d8e 100644
--- a/cli/internal/update/update.go
+++ b/cli/internal/update/update.go
@@ -6,17 +6,39 @@ package update
 import (
 	"encoding/json"
 	"fmt"
+	"io"
 	"net/http"
 	"os"
+	"path/filepath"
+	"runtime"
 	"strconv"
 	"strings"
 	"time"
 
 	"podcli/internal/config"
+	"podcli/internal/paths"
 )
 
 const repo = "nmbrthirteen/podcli"
 
+func exeExt() string {
+	if runtime.GOOS == "windows" {
+		return ".exe"
+	}
+	return ""
+}
+
+// managedBin is the binary the npm shim and direct installs both exec, so
+// replacing it updates podcli regardless of how it was installed.
+func managedBin() string {
+	return filepath.Join(paths.BinDir(), "podcli"+exeExt())
+}
+
+func assetURL(tag string) string {
+	return fmt.Sprintf("https://github.com/%s/releases/download/v%s/podcli-%s-%s%s",
+		repo, tag, runtime.GOOS, runtime.GOARCH, exeExt())
+}
+
 func latestTag(timeout time.Duration) (string, error) {
 	client := &http.Client{Timeout: timeout}
 	req, _ := http.NewRequest(http.MethodGet, "https://api.github.com/repos/"+repo+"/releases/latest", nil)
@@ -83,7 +105,76 @@ func Run(current string) int {
 		fmt.Printf("podcli %s is up to date.\n", current)
 		return 0
 	}
-	fmt.Printf("podcli %s available (you have %s).\n", tag, current)
-	fmt.Println("Reinstall via your package manager:  npm i -g podcli   (or: bun add -g podcli)")
+	fmt.Printf("Updating podcli %s → %s ...\n", current, tag)
+	if err := apply(tag); err != nil {
+		fmt.Fprintf(os.Stderr, "podcli: self-update failed (%v).\n", err)
+		fmt.Fprintln(os.Stderr, "Reinstall via your package manager:  npm i -g podcli   (or: bun add -g podcli)")
+		return 1
+	}
+	fmt.Printf("Updated to podcli %s.\n", tag)
 	return 0
 }
+
+// apply downloads the release binary for this platform and swaps it in for the
+// managed binary (an atomic rename on Unix; see swap for the Windows path).
+func apply(tag string) error {
+	dest := managedBin()
+	if err := os.MkdirAll(filepath.Dir(dest), 0o755); err != nil {
+		return err
+	}
+	staged := dest + ".new"
+	if err := downloadFile(assetURL(tag), staged); err != nil {
+		return err
+	}
+	if runtime.GOOS != "windows" {
+		if err := os.Chmod(staged, 0o755); err != nil {
+			return err
+		}
+	}
+	return swap(staged, dest)
+}
+
+// swap replaces dest with staged. On Windows a running .exe can't be overwritten,
+// so the old binary is moved aside first; on Unix the rename is atomic.
+func swap(staged, dest string) error {
+	if runtime.GOOS == "windows" {
+		old := dest + ".old"
+		os.Remove(old)
+		if _, err := os.Stat(dest); err == nil {
+			if err := os.Rename(dest, old); err != nil {
+				return err
+			}
+		}
+	}
+	return os.Rename(staged, dest)
+}
+
+func downloadFile(url, dest string, redirects ...int) error {
+	depth := 0
+	if len(redirects) > 0 {
+		depth = redirects[0]
+	}
+	resp, err := http.Get(url)
+	if err != nil {
+		return err
+	}
+	defer resp.Body.Close()
+	if loc := resp.Header.Get("Location"); loc != "" && resp.StatusCode/100 == 3 && depth < 6 {
+		return downloadFile(loc, dest, depth+1)
+	}
+	if resp.StatusCode != http.StatusOK {
+		return fmt.Errorf("HTTP %d for %s", resp.StatusCode, url)
+	}
+	tmp := dest + ".part"
+	f, err := os.Create(tmp)
+	if err != nil {
+		return err
+	}
+	_, err = io.Copy(f, resp.Body)
+	f.Close()
+	if err != nil {
+		os.Remove(tmp)
+		return err
+	}
+	return os.Rename(tmp, dest)
+}
diff --git a/cli/internal/update/update_test.go b/cli/internal/update/update_test.go
new file mode 100644
index 0000000..7be03c2
--- /dev/null
+++ b/cli/internal/update/update_test.go
@@ -0,0 +1,45 @@
+package update
+
+import (
+	"os"
+	"path/filepath"
+	"testing"
+)
+
+func TestSwapReplacesBinary(t *testing.T) {
+	dir := t.TempDir()
+	dest := filepath.Join(dir, "podcli")
+	if err := os.WriteFile(dest, []byte("OLD"), 0o755); err != nil {
+		t.Fatal(err)
+	}
+	staged := dest + ".new"
+	if err := os.WriteFile(staged, []byte("NEW"), 0o755); err != nil {
+		t.Fatal(err)
+	}
+	if err := swap(staged, dest); err != nil {
+		t.Fatal(err)
+	}
+	b, _ := os.ReadFile(dest)
+	if string(b) != "NEW" {
+		t.Fatalf("swap left %q, want NEW", b)
+	}
+}
+
+func TestNewer(t *testing.T) {
+	cases := []struct {
+		remote, current string
+		want            bool
+	}{
+		{"2.0.1", "2.0.0", true},
+		{"2.0.0", "2.0.0", false},
+		{"1.9.9", "2.0.0", false},
+		{"2.1.0", "2.0.9", true},
+		{"v2.0.1", "2.0.0", true},
+		{"3.0.0", "2.9.9", true},
+	}
+	for _, c := range cases {
+		if got := newer(c.remote, c.current); got != c.want {
+			t.Errorf("newer(%q, %q) = %v, want %v", c.remote, c.current, got, c.want)
+		}
+	}
+}

From 723892a6865b44c0cb1e0747519983b7fbfc5f87 Mon Sep 17 00:00:00 2001
From: Nika Siradze <nikushasiradzee@gmail.com>
Date: Fri, 12 Jun 2026 17:15:10 +0400
Subject: [PATCH 13/41] Provision a hermetic whisper-cli (+ CI build job)

The launcher resolves whisper-cli from the managed runtime dir and falls back to
PATH, exporting PODCLI_WHISPER_CLI so the backend uses the hermetic binary.
EnsureWhisperCpp fetches whisper-cli-<os>-<arch> from the latest release (PATH
fallback until one is published); doctor reports its state.

The release workflow gains a per-platform whisper.cpp matrix (static, Metal
embedded on macOS) producing the matching assets. That matrix is unverified
until the first tagged run.

Verified locally: resolution prefers a runtime whisper-cli over PATH; YAML parses;
Go build + tests green.
---
 .github/workflows/release.yml       | 50 ++++++++++++++++++++++----
 cli/internal/engine/engine.go       | 14 +++++---
 cli/internal/provision/provision.go | 56 +++++++++++++++++++++++++++++
 cli/main.go                         | 12 ++++++-
 4 files changed, 120 insertions(+), 12 deletions(-)

diff --git a/.github/workflows/release.yml b/.github/workflows/release.yml
index a9ad7a7..a21b15a 100644
--- a/.github/workflows/release.yml
+++ b/.github/workflows/release.yml
@@ -1,10 +1,9 @@
 # Cuts a release on a v* tag: cross-compiles the Go launcher for all 5 targets,
-# publishes them as release assets (the names the npm wrapper + self-update
-# expect), then publishes the npm package pinned to the tag.
+# builds whisper.cpp per platform, publishes both as release assets (the names
+# the npm wrapper, self-update, and provisioner expect), then publishes npm.
 #
-# TODO: a separate matrix job should build whisper.cpp per platform on native
-# runners and attach whisper-cli-<os>-<arch> assets, so the provisioner can fetch
-# a hermetic binary instead of relying on a PATH/brew install.
+# The whisper.cpp matrix below is unverified until the first tagged run — the
+# binary name / cmake flags may need adjustment for the pinned whisper.cpp ref.
 name: release
 
 on:
@@ -49,8 +48,47 @@ jobs:
           name: ${{ env.ASSET }}
           path: ${{ env.ASSET }}
 
+  whisper:
+    strategy:
+      fail-fast: false
+      matrix:
+        include:
+          - { runner: macos-14, goos: darwin, goarch: arm64 }
+          - { runner: macos-13, goos: darwin, goarch: amd64 }
+          - { runner: ubuntu-latest, goos: linux, goarch: amd64 }
+          - { runner: ubuntu-24.04-arm, goos: linux, goarch: arm64 }
+          - { runner: windows-latest, goos: windows, goarch: amd64 }
+    runs-on: ${{ matrix.runner }}
+    env:
+      WHISPER_REF: v1.7.4 # pin a known-good whisper.cpp tag
+    steps:
+      - name: Clone whisper.cpp
+        uses: actions/checkout@v4
+        with:
+          repository: ggml-org/whisper.cpp
+          ref: ${{ env.WHISPER_REF }}
+      - name: Build (static; Metal embedded on macOS)
+        shell: bash
+        run: |
+          EXTRA=""
+          if [ "${{ matrix.goos }}" = "darwin" ]; then EXTRA="-DGGML_METAL_EMBED_LIBRARY=ON"; fi
+          cmake -B build -DBUILD_SHARED_LIBS=OFF -DCMAKE_BUILD_TYPE=Release $EXTRA
+          cmake --build build --config Release --target whisper-cli -j
+      - name: Stage asset
+        shell: bash
+        run: |
+          EXT=""; [ "${{ matrix.goos }}" = "windows" ] && EXT=".exe"
+          SRC=$(find build -name "whisper-cli${EXT}" -type f | head -1)
+          OUT="whisper-cli-${{ matrix.goos }}-${{ matrix.goarch }}${EXT}"
+          cp "$SRC" "$OUT"
+          echo "ASSET=$OUT" >> "$GITHUB_ENV"
+      - uses: actions/upload-artifact@v4
+        with:
+          name: ${{ env.ASSET }}
+          path: ${{ env.ASSET }}
+
   release:
-    needs: build
+    needs: [build, whisper]
     runs-on: ubuntu-latest
     steps:
       - uses: actions/download-artifact@v4
diff --git a/cli/internal/engine/engine.go b/cli/internal/engine/engine.go
index f98d008..08f976b 100644
--- a/cli/internal/engine/engine.go
+++ b/cli/internal/engine/engine.go
@@ -72,10 +72,10 @@ func Python() string {
 	return "python3"
 }
 
-func runtimeBin(name string) string {
+func runtimeBin(sub, name string) string {
 	for _, p := range []string{
-		filepath.Join(paths.RuntimeDir(), "ffmpeg", name),
-		filepath.Join(paths.RuntimeDir(), "ffmpeg", name+".exe"),
+		filepath.Join(paths.RuntimeDir(), sub, name),
+		filepath.Join(paths.RuntimeDir(), sub, name+".exe"),
 	} {
 		if exists(p) {
 			return p
@@ -84,8 +84,9 @@ func runtimeBin(name string) string {
 	return ""
 }
 
-func FFmpeg() string  { return runtimeBin("ffmpeg") }
-func FFprobe() string { return runtimeBin("ffprobe") }
+func FFmpeg() string     { return runtimeBin("ffmpeg", "ffmpeg") }
+func FFprobe() string    { return runtimeBin("ffmpeg", "ffprobe") }
+func WhisperCLI() string { return runtimeBin("whisper", "whisper-cli") }
 
 func Run(args []string) (int, error) {
 	root, ok := BackendRoot()
@@ -108,6 +109,9 @@ func Run(args []string) (int, error) {
 	if fp := FFprobe(); fp != "" {
 		env = append(env, "PODCLI_FFPROBE="+fp)
 	}
+	if wc := WhisperCLI(); wc != "" {
+		env = append(env, "PODCLI_WHISPER_CLI="+wc)
+	}
 	cmd.Env = env
 
 	if err := cmd.Run(); err != nil {
diff --git a/cli/internal/provision/provision.go b/cli/internal/provision/provision.go
index f86c9ce..1fdb83b 100644
--- a/cli/internal/provision/provision.go
+++ b/cli/internal/provision/provision.go
@@ -224,6 +224,62 @@ func exeSuffix() string {
 	return ""
 }
 
+const podcliRepo = "nmbrthirteen/podcli"
+
+func WhisperCLIBin() string {
+	return filepath.Join(paths.RuntimeDir(), "whisper", "whisper-cli"+exeSuffix())
+}
+
+func latestReleaseAssetURL(name string) (string, error) {
+	req, _ := http.NewRequest(http.MethodGet, "https://api.github.com/repos/"+podcliRepo+"/releases/latest", nil)
+	req.Header.Set("Accept", "application/vnd.github+json")
+	resp, err := http.DefaultClient.Do(req)
+	if err != nil {
+		return "", err
+	}
+	defer resp.Body.Close()
+	if resp.StatusCode != http.StatusOK {
+		return "", fmt.Errorf("no published release (HTTP %d)", resp.StatusCode)
+	}
+	var rel struct {
+		Assets []struct {
+			Name string `json:"name"`
+			URL  string `json:"browser_download_url"`
+		} `json:"assets"`
+	}
+	if err := json.NewDecoder(resp.Body).Decode(&rel); err != nil {
+		return "", err
+	}
+	for _, a := range rel.Assets {
+		if a.Name == name {
+			return a.URL, nil
+		}
+	}
+	return "", fmt.Errorf("asset %s not in latest release", name)
+}
+
+func EnsureWhisperCpp() (string, error) {
+	bin := WhisperCLIBin()
+	if have(bin) {
+		return bin, nil
+	}
+	name := fmt.Sprintf("whisper-cli-%s-%s%s", runtime.GOOS, runtime.GOARCH, exeSuffix())
+	url, err := latestReleaseAssetURL(name)
+	if err != nil {
+		return "", err
+	}
+	if err := os.MkdirAll(filepath.Dir(bin), 0o755); err != nil {
+		return "", err
+	}
+	if err := fetch(url, bin, "whisper-cli"); err != nil {
+		return "", err
+	}
+	if runtime.GOOS != "windows" {
+		os.Chmod(bin, 0o755)
+	}
+	return bin, nil
+}
+
 func FFmpegBin() string {
 	return filepath.Join(paths.RuntimeDir(), "ffmpeg", "ffmpeg"+exeSuffix())
 }
diff --git a/cli/main.go b/cli/main.go
index cc9594b..379380a 100644
--- a/cli/main.go
+++ b/cli/main.go
@@ -152,7 +152,12 @@ func setup(args []string) int {
 			fmt.Printf("  python: %s\n", pb)
 		}
 	}
-	fmt.Println("Done. (whisper.cpp binary provisioning lands once podcli hosts builds)")
+	if wc, err := provision.EnsureWhisperCpp(); err != nil {
+		fmt.Fprintf(os.Stderr, "  whisper: skipped (%v) — backend will use PATH whisper-cli\n", err)
+	} else {
+		fmt.Printf("  whisper: %s\n", wc)
+	}
+	fmt.Println("Done.")
 	return 0
 }
 
@@ -177,6 +182,11 @@ func doctor() {
 	if fp := engine.FFprobe(); fp != "" {
 		fmt.Printf("  ffprobe:  %s (hermetic)\n", fp)
 	}
+	if wc := engine.WhisperCLI(); wc != "" {
+		fmt.Printf("  whisper:  %s (hermetic)\n", wc)
+	} else {
+		fmt.Printf("  whisper:  PATH fallback (install whisper-cli, or run `podcli setup` once a release hosts it)\n")
+	}
 	fmt.Println("\nModels")
 	fmt.Printf("  base:     %s\n", presence(provision.ModelPath("base")))
 	fmt.Printf("  vad:      %s\n", presence(provision.VADModelPath()))

From 97ef053885a6c80c6f0408806d1cca292363b8cc Mon Sep 17 00:00:00 2001
From: Nika Siradze <nikushasiradzee@gmail.com>
Date: Fri, 12 Jun 2026 17:24:27 +0400
Subject: [PATCH 14/41] Snap whisper.cpp word timings out of trailing silence

whisper.cpp can stretch the final phrase across trailing silence, desyncing
captions. After transcription the adapter clamps word timings to the voiced span
(RMS energy threshold) and re-flows them, so words stranded in silence snap back
near the last voiced frame while words over speech are untouched. On by default;
PODCLI_WHISPERCPP_NO_SNAP=1 disables it. Also fixes a leaked temp dir.

Validated on synthetic true-silence audio (voiced word untouched; a word
stranded at 1.55s pulled back to 1.15s). On a clip with an outro the trailing
audio is genuinely voiced, so the snap correctly no-ops there.
---
 backend/services/transcription_whispercpp.py | 156 ++++++++++++++-----
 tests/test_whispercpp_snap.py                |  70 +++++++++
 2 files changed, 185 insertions(+), 41 deletions(-)
 create mode 100644 tests/test_whispercpp_snap.py

diff --git a/backend/services/transcription_whispercpp.py b/backend/services/transcription_whispercpp.py
index 752fc8c..2920356 100644
--- a/backend/services/transcription_whispercpp.py
+++ b/backend/services/transcription_whispercpp.py
@@ -8,6 +8,7 @@
 import json
 import os
 import re
+import shutil
 import subprocess
 import sys
 import tempfile
@@ -61,6 +62,76 @@ def flush():
     return words
 
 
+def _voiced_intervals(wav_path: str, bridge: float = 0.3, thresh_ratio: float = 0.07):
+    import wave
+
+    import numpy as np
+
+    try:
+        w = wave.open(wav_path, "rb")
+        sr, width, n = w.getframerate(), w.getsampwidth(), w.getnframes()
+        raw = w.readframes(n)
+        w.close()
+    except Exception:
+        return []
+    if width != 2 or sr <= 0 or not raw:
+        return []
+    samples = np.frombuffer(raw, dtype=np.int16).astype(np.float32)
+    hop = max(1, int(sr * 0.010))
+    frame = max(hop, int(sr * 0.025))
+    if len(samples) < frame:
+        return []
+    nf = 1 + (len(samples) - frame) // hop
+    idx = np.arange(nf)[:, None] * hop + np.arange(frame)[None, :]
+    rms = np.sqrt((samples[idx] ** 2).mean(axis=1))
+    peak = float(rms.max())
+    if peak <= 0:
+        return []
+    voiced = rms > thresh_ratio * peak
+    intervals, start = [], None
+    for i, v in enumerate(voiced):
+        if v and start is None:
+            start = i * hop / sr
+        elif not v and start is not None:
+            intervals.append([start, i * hop / sr])
+            start = None
+    if start is not None:
+        intervals.append([start, nf * hop / sr])
+    merged = []
+    for iv in intervals:
+        if merged and iv[0] - merged[-1][1] <= bridge:
+            merged[-1][1] = iv[1]
+        else:
+            merged.append(iv)
+    return merged
+
+
+def _snap_words_to_voiced(words: list[dict], wav_path: str) -> list[dict]:
+    """Pull word timings back into the voiced span. whisper.cpp sometimes
+    stretches trailing words across trailing silence; clamping to [first voiced,
+    last voiced] (with a small pad) and re-flowing keeps captions in sync without
+    disturbing words that already overlap speech."""
+    if not words:
+        return words
+    intervals = _voiced_intervals(wav_path)
+    if not intervals:
+        return words
+    pad = 0.15
+    lo = max(0.0, intervals[0][0] - pad)
+    hi = intervals[-1][1] + pad
+    out, prev_end = [], lo
+    for w in words:
+        s = min(max(float(w.get("start", 0.0)), lo), hi)
+        e = min(max(float(w.get("end", 0.0)), lo), hi)
+        if s < prev_end:
+            s = prev_end
+        if e <= s:
+            e = min(s + 0.05, hi)
+        prev_end = e
+        out.append({**w, "start": round(s, 3), "end": round(e, 3)})
+    return out
+
+
 def transcribe_file(
     file_path: str,
     model_path: str,
@@ -81,47 +152,50 @@ def transcribe_file(
     tmpdir = tempfile.mkdtemp(prefix="wcpp_")
     wav = os.path.join(tmpdir, "audio.wav")
     out_base = os.path.join(tmpdir, "out")
-    _extract_wav(file_path, wav, ffmpeg)
-
-    cmd = [whisper_cli, "-m", model_path, "-f", wav, "-ojf",
-           "-of", out_base, "-t", str(threads)]
-    if dtw_model:
-        cmd += ["-dtw", dtw_model]
-    if vad and vad_model and os.path.exists(vad_model):
-        # VAD removes the trailing-words-into-silence failure mode but currently
-        # adds a small systematic early bias (silence-removal remapping). Off by
-        # default; opt in via PODCLI_WHISPERCPP_VAD.
-        cmd += ["--vad", "--vad-model", vad_model]
-    if language:
-        cmd += ["-l", language]
-    subprocess.run(cmd, check=True, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
-
-    with open(out_base + ".json", encoding="utf-8") as f:
-        data = json.load(f)
-
-    transcription = data.get("transcription", [])
-    segments, words = [], []
-    for i, seg in enumerate(transcription):
-        off = seg.get("offsets") or {}
-        seg_start = round((off.get("from") or 0) / 1000.0, 3)
-        seg_end = round((off.get("to") or 0) / 1000.0, 3)
-        segments.append({
-            "id": i,
-            "start": seg_start,
-            "end": seg_end,
-            "text": (seg.get("text") or "").strip(),
-            "speaker": None,
-        })
-        words.extend(_tokens_to_words(seg.get("tokens", [])))
-
-    duration = segments[-1]["end"] if segments else 0.0
-    return {
-        "transcript": " ".join(s["text"] for s in segments).strip(),
-        "segments": segments,
-        "words": words,
-        "duration": duration,
-        "language": (data.get("params") or {}).get("language") or language or "en",
-    }
+    try:
+        _extract_wav(file_path, wav, ffmpeg)
+
+        cmd = [whisper_cli, "-m", model_path, "-f", wav, "-ojf",
+               "-of", out_base, "-t", str(threads)]
+        if dtw_model:
+            cmd += ["-dtw", dtw_model]
+        if vad and vad_model and os.path.exists(vad_model):
+            # VAD removes the trailing-words-into-silence failure mode but adds a
+            # systematic early bias (silence-removal remapping). Off by default;
+            # the energy-snap below addresses the same defect without the bias.
+            cmd += ["--vad", "--vad-model", vad_model]
+        if language:
+            cmd += ["-l", language]
+        subprocess.run(cmd, check=True, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
+
+        with open(out_base + ".json", encoding="utf-8") as f:
+            data = json.load(f)
+
+        transcription = data.get("transcription", [])
+        segments, words = [], []
+        for i, seg in enumerate(transcription):
+            off = seg.get("offsets") or {}
+            segments.append({
+                "id": i,
+                "start": round((off.get("from") or 0) / 1000.0, 3),
+                "end": round((off.get("to") or 0) / 1000.0, 3),
+                "text": (seg.get("text") or "").strip(),
+                "speaker": None,
+            })
+            words.extend(_tokens_to_words(seg.get("tokens", [])))
+
+        if os.environ.get("PODCLI_WHISPERCPP_NO_SNAP", "").strip().lower() not in ("1", "true", "yes", "on"):
+            words = _snap_words_to_voiced(words, wav)
+
+        return {
+            "transcript": " ".join(s["text"] for s in segments).strip(),
+            "segments": segments,
+            "words": words,
+            "duration": segments[-1]["end"] if segments else 0.0,
+            "language": (data.get("params") or {}).get("language") or language or "en",
+        }
+    finally:
+        shutil.rmtree(tmpdir, ignore_errors=True)
 
 
 if __name__ == "__main__":
diff --git a/tests/test_whispercpp_snap.py b/tests/test_whispercpp_snap.py
new file mode 100644
index 0000000..bc5d9f2
--- /dev/null
+++ b/tests/test_whispercpp_snap.py
@@ -0,0 +1,70 @@
+"""Energy-snap for whisper.cpp word timings: trailing words stranded in true
+silence are pulled back into the voiced span, while words over speech are left
+alone. (whisper.cpp sometimes stretches the final phrase across trailing
+silence; this keeps captions in sync.)"""
+
+import math
+import os
+import struct
+import sys
+import tempfile
+import unittest
+import wave
+
+ROOT = os.path.abspath(os.path.join(os.path.dirname(__file__), ".."))
+BACKEND_ROOT = os.path.join(ROOT, "backend")
+if BACKEND_ROOT not in sys.path:
+    sys.path.insert(0, BACKEND_ROOT)
+
+from services.transcription_whispercpp import _snap_words_to_voiced, _voiced_intervals
+
+
+def _make_wav(path, voiced_s=1.0, silence_s=1.0, sr=16000):
+    w = wave.open(path, "wb")
+    w.setnchannels(1)
+    w.setsampwidth(2)
+    w.setframerate(sr)
+    buf = bytearray()
+    for i in range(int(sr * voiced_s)):
+        buf += struct.pack("<h", int(8000 * math.sin(2 * math.pi * 200 * i / sr)))
+    buf += b"\x00\x00" * int(sr * silence_s)
+    w.writeframes(bytes(buf))
+    w.close()
+
+
+class EnergySnapTests(unittest.TestCase):
+    def setUp(self):
+        self.wav = tempfile.mktemp(suffix=".wav")
+        _make_wav(self.wav)
+
+    def tearDown(self):
+        if os.path.exists(self.wav):
+            os.remove(self.wav)
+
+    def test_voiced_interval_detected(self):
+        iv = _voiced_intervals(self.wav)
+        self.assertTrue(iv)
+        self.assertAlmostEqual(iv[0][0], 0.0, delta=0.05)
+        self.assertAlmostEqual(iv[-1][1], 1.0, delta=0.1)
+
+    def test_trailing_word_pulled_out_of_silence(self):
+        words = [
+            {"word": "hello", "start": 0.1, "end": 0.5},
+            {"word": "world", "start": 1.55, "end": 1.95},
+        ]
+        out = _snap_words_to_voiced(words, self.wav)
+        self.assertEqual(out[0]["start"], 0.1)  # word over speech untouched
+        self.assertLessEqual(out[1]["start"], 1.2)  # stranded word clamped back
+
+    def test_all_silence_leaves_words_unchanged(self):
+        silent = tempfile.mktemp(suffix=".wav")
+        _make_wav(silent, voiced_s=0.0, silence_s=1.0)
+        words = [{"word": "x", "start": 0.1, "end": 0.5}]
+        try:
+            self.assertEqual(_snap_words_to_voiced(words, silent), words)
+        finally:
+            os.remove(silent)
+
+
+if __name__ == "__main__":
+    unittest.main()

From 936fe7269d30e23506471267f0a4d1a523b48de6 Mon Sep 17 00:00:00 2001
From: Nika Siradze <nikushasiradzee@gmail.com>
Date: Mon, 15 Jun 2026 12:57:13 +0400
Subject: [PATCH 15/41] Show pip progress during setup, not a silent hang

First-run dep install ran pip with -q, so the ~80MB download (about a
minute) showed nothing and read as frozen. Stream pip's progress bars (drop -q,
force --progress-bar=on) and unbuffer output so it surfaces on Windows
pipes too.
---
 cli/internal/provision/provision.go | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/cli/internal/provision/provision.go b/cli/internal/provision/provision.go
index 1fdb83b..7605b7a 100644
--- a/cli/internal/provision/provision.go
+++ b/cli/internal/provision/provision.go
@@ -489,9 +489,10 @@ func EnsurePython(requirements string) (string, error) {
 }
 
 func pipInstall(pybin, requirements string) error {
-	fmt.Fprintf(os.Stderr, "  installing python deps (%s)\n", filepath.Base(requirements))
-	cmd := exec.Command(pybin, "-m", "pip", "install", "--disable-pip-version-check", "-q", "-r", requirements)
+	fmt.Fprintf(os.Stderr, "  installing python deps (%s) — pulls ~80MB, first run takes a minute\n", filepath.Base(requirements))
+	cmd := exec.Command(pybin, "-m", "pip", "install", "--disable-pip-version-check", "--progress-bar=on", "-r", requirements)
 	cmd.Stdout, cmd.Stderr = os.Stderr, os.Stderr
+	cmd.Env = append(os.Environ(), "PYTHONUNBUFFERED=1")
 	return cmd.Run()
 }
 

From 874791202aab556d05a9135ab308a466af7f85e0 Mon Sep 17 00:00:00 2001
From: Nika Siradze <nikushasiradzee@gmail.com>
Date: Mon, 15 Jun 2026 12:57:13 +0400
Subject: [PATCH 16/41] Scale model size units in doctor (KB/MB), not floor to
 0 MB

Sub-megabyte models (the 864 KB Silero VAD) floored to "0 MB" and
looked like a failed download.
---
 cli/main.go | 13 ++++++++++++-
 1 file changed, 12 insertions(+), 1 deletion(-)
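
Reviewer note: a quick check of the unit scaling at its boundaries. The
`humanBytes` body is copied from the diff below; only the input table and the
`main` wrapper are added for illustration. 884736 bytes is the 864 KB figure from
the commit message (864 * 1024).

```go
package main

import "fmt"

// humanBytes is copied from the patch: integer-truncated B / KB / MB.
func humanBytes(n int64) string {
	switch {
	case n >= 1<<20:
		return fmt.Sprintf("%d MB", n>>20)
	case n >= 1<<10:
		return fmt.Sprintf("%d KB", n>>10)
	default:
		return fmt.Sprintf("%d B", n)
	}
}

func main() {
	for _, n := range []int64{0, 1023, 1024, 884736, 1 << 20} {
		// Old formatting always shifted by 20 bits, so anything under 1 MiB read "0 MB".
		fmt.Printf("%8d bytes  old: %d MB  new: %s\n", n, n>>20, humanBytes(n))
	}
}
```

Values are truncated, not rounded, so a file just under 2 MiB still reads
"1 MB"; that is adequate for a presence check in doctor.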

diff --git a/cli/main.go b/cli/main.go
index 379380a..8c46610 100644
--- a/cli/main.go
+++ b/cli/main.go
@@ -194,11 +194,22 @@ func doctor() {
 
 func presence(p string) string {
 	if fi, err := os.Stat(p); err == nil && fi.Size() > 0 {
-		return fmt.Sprintf("%s (%d MB)", p, fi.Size()>>20)
+		return fmt.Sprintf("%s (%s)", p, humanBytes(fi.Size()))
 	}
 	return "not provisioned — run `podcli setup`"
 }
 
+func humanBytes(n int64) string {
+	switch {
+	case n >= 1<<20:
+		return fmt.Sprintf("%d MB", n>>20)
+	case n >= 1<<10:
+		return fmt.Sprintf("%d KB", n>>10)
+	default:
+		return fmt.Sprintf("%d B", n)
+	}
+}
+
 func printHelp() {
 	fmt.Printf(`podcli %s — AI podcast clip generator
 

From f231eb1ec051dff54209bc793efcd3cda614bea5 Mon Sep 17 00:00:00 2001
From: Nika Siradze <nikushasiradzee@gmail.com>
Date: Mon, 15 Jun 2026 13:16:49 +0400
Subject: [PATCH 17/41] Embed the Python backend in the launcher binary

An installed podcli had no backend outside the source tree, so it only ran
inside the repo. Embed backend/ via go:embed (synced into files/ at build
time by `go generate`, gitignored) and extract it to runtime/backend on
setup. The CLI now resolves its backend from the managed dir and runs from
any working directory, like a real native binary.
---
 .github/workflows/release.yml |  1 +
 .gitignore                    |  1 +
 cli/internal/backend/embed.go | 48 +++++++++++++++++++++++++++++++++++
 cli/internal/backend/sync.sh  | 17 +++++++++++++
 cli/main.go                   | 12 +++++++--
 5 files changed, 77 insertions(+), 2 deletions(-)
 create mode 100644 cli/internal/backend/embed.go
 create mode 100644 cli/internal/backend/sync.sh

diff --git a/.github/workflows/release.yml b/.github/workflows/release.yml
index a21b15a..fe5b629 100644
--- a/.github/workflows/release.yml
+++ b/.github/workflows/release.yml
@@ -38,6 +38,7 @@ jobs:
           CGO_ENABLED: '0'
         run: |
           VERSION="${GITHUB_REF_NAME#v}"
+          go generate ./internal/backend/   # sync repo backend into files/ for go:embed
           EXT=""
           [ "${{ matrix.goos }}" = "windows" ] && EXT=".exe"
           OUT="podcli-${{ matrix.goos }}-${{ matrix.goarch }}${EXT}"
diff --git a/.gitignore b/.gitignore
index 977be1e..3b68a83 100644
--- a/.gitignore
+++ b/.gitignore
@@ -44,3 +44,4 @@ episodes/
 *.srt
 uploads/
 output/
+cli/internal/backend/files/
diff --git a/cli/internal/backend/embed.go b/cli/internal/backend/embed.go
new file mode 100644
index 0000000..be0a363
--- /dev/null
+++ b/cli/internal/backend/embed.go
@@ -0,0 +1,48 @@
+// Package backend ships the Python processing backend inside the launcher
+// binary so an installed podcli runs without the source repo. The files/ tree is
+// synced from the repo backend/ at build time (`go generate ./...` or CI) and is
+// gitignored — never edit files/ by hand.
+package backend
+
+import (
+	"embed"
+	"io/fs"
+	"os"
+	"path/filepath"
+)
+
+//go:generate sh sync.sh
+//go:embed all:files
+var files embed.FS
+
+// Extract replaces dest with the embedded backend tree. dest is removed first so
+// a stale dev symlink or an older extracted copy never shadows the shipped one.
+func Extract(dest string) error {
+	if err := os.RemoveAll(dest); err != nil {
+		return err
+	}
+	return fs.WalkDir(files, "files", func(p string, d fs.DirEntry, err error) error {
+		if err != nil {
+			return err
+		}
+		rel, err := filepath.Rel("files", p)
+		if err != nil {
+			return err
+		}
+		if rel == "." {
+			return nil
+		}
+		target := filepath.Join(dest, rel)
+		if d.IsDir() {
+			return os.MkdirAll(target, 0o755)
+		}
+		data, err := files.ReadFile(p)
+		if err != nil {
+			return err
+		}
+		if err := os.MkdirAll(filepath.Dir(target), 0o755); err != nil {
+			return err
+		}
+		return os.WriteFile(target, data, 0o644)
+	})
+}
diff --git a/cli/internal/backend/sync.sh b/cli/internal/backend/sync.sh
new file mode 100644
index 0000000..b84af4f
--- /dev/null
+++ b/cli/internal/backend/sync.sh
@@ -0,0 +1,17 @@
+#!/bin/sh
+# Sync the repo Python backend into files/ for go:embed. Run via `go generate
+# ./...` before building. files/ is gitignored; this is its only writer.
+set -e
+here="$(cd "$(dirname "$0")" && pwd)"
+src="$here/../../../backend"
+dest="$here/files"
+rm -rf "$dest"
+mkdir -p "$dest"
+rsync -a \
+  --exclude='__pycache__' \
+  --exclude='*.pyc' \
+  --exclude='venv' \
+  --exclude='.venv' \
+  --exclude='requirements.txt' \
+  "$src"/ "$dest"/
+echo "synced backend -> $dest"
diff --git a/cli/main.go b/cli/main.go
index 8c46610..4ab355e 100644
--- a/cli/main.go
+++ b/cli/main.go
@@ -8,6 +8,7 @@ import (
 	"path/filepath"
 	"strings"
 
+	"podcli/internal/backend"
 	"podcli/internal/config"
 	"podcli/internal/engine"
 	"podcli/internal/paths"
@@ -144,8 +145,15 @@ func setup(args []string) int {
 	} else {
 		fmt.Printf("  ffmpeg: %s\n", fp)
 	}
-	if root, ok := engine.BackendRoot(); ok {
-		reqs := filepath.Join(root, "requirements-runtime.txt")
+	backendDir := filepath.Join(paths.RuntimeDir(), "backend")
+	if err := backend.Extract(backendDir); err != nil {
+		fmt.Fprintf(os.Stderr, "  backend: skipped (%v) — falling back to repo/PODCLI_BACKEND\n", err)
+		backendDir, _ = engine.BackendRoot()
+	} else {
+		fmt.Printf("  backend: %s\n", backendDir)
+	}
+	if backendDir != "" {
+		reqs := filepath.Join(backendDir, "requirements-runtime.txt")
 		if pb, err := provision.EnsurePython(reqs); err != nil {
 			fmt.Fprintf(os.Stderr, "  python: skipped (%v) — using dev venv / system python\n", err)
 		} else {

From 2dabd7e1a987226d2a72660a083c5789fe8ab1a6 Mon Sep 17 00:00:00 2001
From: Nika Siradze <nikushasiradzee@gmail.com>
Date: Mon, 15 Jun 2026 13:27:05 +0400
Subject: [PATCH 18/41] Restore the interactive menu and PodStack commands in
 the native CLI
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

The Go launcher showed its own help on a bare invocation and dropped the
PodStack workflow commands the old bash wrapper forwarded to Claude/Codex.
Bare `podcli` now opens the backend's branded interactive menu, and
auto/generate-titles/… launch the AI agent with the matching slash command
(embedded and installed into the project on demand), honoring
--claude/--codex/--ai and PODCLI_AI with a Codex fallback.
---
 .gitignore                        |   1 +
 cli/internal/podstack/podstack.go | 173 ++++++++++++++++++++++++++++++
 cli/internal/podstack/sync.sh     |  12 +++
 cli/main.go                       |  15 ++-
 4 files changed, 198 insertions(+), 3 deletions(-)
 create mode 100644 cli/internal/podstack/podstack.go
 create mode 100644 cli/internal/podstack/sync.sh

diff --git a/.gitignore b/.gitignore
index 3b68a83..5992c77 100644
--- a/.gitignore
+++ b/.gitignore
@@ -45,3 +45,4 @@ episodes/
 uploads/
 output/
 cli/internal/backend/files/
+cli/internal/podstack/commands/
diff --git a/cli/internal/podstack/podstack.go b/cli/internal/podstack/podstack.go
new file mode 100644
index 0000000..9a5d346
--- /dev/null
+++ b/cli/internal/podstack/podstack.go
@@ -0,0 +1,173 @@
+// Package podstack forwards PodStack workflow commands (auto, generate-titles,
+// …) to an AI agent CLI (Claude Code or Codex), porting the old bash launcher.
+// The commands are Claude Code slash commands driving the podcli MCP tools, so
+// they run inside the agent, not the terminal. The slash-command files are
+// embedded and installed into the working project on demand.
+package podstack
+
+import (
+	"embed"
+	"fmt"
+	"io/fs"
+	"os"
+	"os/exec"
+	"path/filepath"
+	"sort"
+	"strings"
+	"syscall"
+)
+
+//go:generate sh sync.sh
+//go:embed all:commands
+var commands embed.FS
+
+const (
+	colAccent = "\033[38;2;212;135;74m"
+	colGreen  = "\033[38;2;74;222;128m"
+	colYellow = "\033[38;2;250;204;21m"
+	colBold   = "\033[1m"
+	colDim    = "\033[2m"
+	colReset  = "\033[0m"
+)
+
+// Names lists the supported PodStack commands, derived from the embedded files.
+func Names() []string {
+	entries, err := commands.ReadDir("commands")
+	if err != nil {
+		return nil
+	}
+	var names []string
+	for _, e := range entries {
+		if strings.HasSuffix(e.Name(), ".md") {
+			names = append(names, strings.TrimSuffix(e.Name(), ".md"))
+		}
+	}
+	sort.Strings(names)
+	return names
+}
+
+func IsCommand(name string) bool {
+	name = strings.TrimPrefix(strings.TrimPrefix(name, "--"), "/")
+	for _, n := range Names() {
+		if n == name {
+			return true
+		}
+	}
+	return false
+}
+
+// installCommands writes the embedded slash-command files into
+// <project>/.claude/commands so the agent can resolve /<cmd>. Existing files are
+// left untouched, so a user's local edits win.
+func installCommands(project string) error {
+	dest := filepath.Join(project, ".claude", "commands")
+	if err := os.MkdirAll(dest, 0o755); err != nil {
+		return err
+	}
+	return fs.WalkDir(commands, "commands", func(p string, d fs.DirEntry, err error) error {
+		if err != nil || d.IsDir() {
+			return err
+		}
+		target := filepath.Join(dest, filepath.Base(p))
+		if _, err := os.Stat(target); err == nil {
+			return nil
+		}
+		data, err := commands.ReadFile(p)
+		if err != nil {
+			return err
+		}
+		return os.WriteFile(target, data, 0o644)
+	})
+}
+
+// Run launches the agent for cmd with the remaining args as the slash-command
+// arguments. Engine selection: --claude / --codex / --ai <engine>, else
+// PODCLI_AI, else auto (Claude preferred, Codex fallback).
+func Run(cmd string, args []string) int {
+	cmd = strings.TrimPrefix(strings.TrimPrefix(cmd, "--"), "/")
+	engine := os.Getenv("PODCLI_AI")
+	if engine == "" {
+		engine = "auto"
+	}
+	var promptArgs []string
+	for i := 0; i < len(args); i++ {
+		switch a := args[i]; {
+		case a == "--codex":
+			engine = "codex"
+		case a == "--claude":
+			engine = "claude"
+		case a == "--ai":
+			if i+1 < len(args) {
+				engine = args[i+1]
+				i++
+			}
+		case strings.HasPrefix(a, "--ai="):
+			engine = strings.TrimPrefix(a, "--ai=")
+		default:
+			promptArgs = append(promptArgs, a)
+		}
+	}
+	switch engine {
+	case "auto", "claude", "codex":
+	default:
+		fmt.Fprintf(os.Stderr, "  %sInvalid AI engine:%s %s — use --claude, --codex, or --ai auto\n", colBold, colReset, engine)
+		return 1
+	}
+
+	project, err := os.Getwd()
+	if err != nil {
+		project = "."
+	}
+	if err := installCommands(project); err != nil {
+		fmt.Fprintf(os.Stderr, "  %swarning:%s could not install slash commands: %v\n", colYellow, colReset, err)
+	}
+
+	prompt := "/" + cmd
+	if len(promptArgs) > 0 {
+		prompt += " " + strings.Join(promptArgs, " ")
+	}
+	codexPrompt := fmt.Sprintf("Run the PodStack workflow from .claude/commands/%s.md with these arguments, then follow that workflow exactly: %s", cmd, prompt)
+
+	claudeBin, _ := exec.LookPath("claude")
+	codexBin, _ := exec.LookPath("codex")
+
+	if engine == "codex" && codexBin == "" {
+		fmt.Fprintf(os.Stderr, "\n  %sCodex not found in PATH.%s\n  Install it, then run:\n    %scodex --cd %q %q%s\n\n", colBold, colReset, colAccent, project, codexPrompt, colReset)
+		return 1
+	}
+	if engine == "claude" && claudeBin == "" {
+		fmt.Fprintf(os.Stderr, "\n  %sClaude Code not found in PATH.%s\n  Install it, then run:\n    %sclaude %q%s\n\n", colBold, colReset, colAccent, prompt, colReset)
+		return 1
+	}
+
+	if engine != "codex" && claudeBin != "" {
+		fmt.Fprintf(os.Stderr, "\n  %s▶%s Launching Claude Code with: %s%s%s\n  %scwd: %s%s\n\n", colGreen, colReset, colAccent, prompt, colReset, colDim, project, colReset)
+		if code := runIn(project, claudeBin, prompt); code == 0 || engine == "claude" || codexBin == "" {
+			return code
+		}
+		fmt.Fprintf(os.Stderr, "\n  %s⚠%s Claude exited nonzero; trying Codex...\n", colYellow, colReset)
+	}
+
+	if codexBin == "" {
+		fmt.Fprintf(os.Stderr, "\n  %sNo AI agent CLI found in PATH.%s\n  Install Claude Code or Codex, then run one of:\n    %sclaude %q%s\n    %scodex --cd %q %q%s\n\n", colBold, colReset, colAccent, prompt, colReset, colAccent, project, codexPrompt, colReset)
+		return 1
+	}
+	fmt.Fprintf(os.Stderr, "\n  %s▶%s Launching Codex with: %s%s%s\n  %scwd: %s%s\n\n", colGreen, colReset, colAccent, prompt, colReset, colDim, project, colReset)
+	return runIn(project, codexBin, "--cd", project, codexPrompt)
+}
+
+func runIn(dir, bin string, args ...string) int {
+	cmd := exec.Command(bin, args...)
+	cmd.Dir = dir
+	cmd.Stdin, cmd.Stdout, cmd.Stderr = os.Stdin, os.Stdout, os.Stderr
+	if err := cmd.Run(); err != nil {
+		if ee, ok := err.(*exec.ExitError); ok {
+			if ws, ok := ee.Sys().(syscall.WaitStatus); ok {
+				return ws.ExitStatus()
+			}
+			return 1
+		}
+		return 1
+	}
+	return 0
+}
diff --git a/cli/internal/podstack/sync.sh b/cli/internal/podstack/sync.sh
new file mode 100644
index 0000000..8b20678
--- /dev/null
+++ b/cli/internal/podstack/sync.sh
@@ -0,0 +1,12 @@
+#!/bin/sh
+# Sync the repo PodStack slash commands into commands/ for go:embed. Run via
+# `go generate ./...` before building. commands/ is gitignored; this is its
+# only writer.
+set -e
+here="$(cd "$(dirname "$0")" && pwd)"
+src="$here/../../../.claude/commands"
+dest="$here/commands"
+rm -rf "$dest"
+mkdir -p "$dest"
+cp "$src"/*.md "$dest"/
+echo "synced PodStack commands -> $dest"
diff --git a/cli/main.go b/cli/main.go
index 4ab355e..127da04 100644
--- a/cli/main.go
+++ b/cli/main.go
@@ -12,6 +12,7 @@ import (
 	"podcli/internal/config"
 	"podcli/internal/engine"
 	"podcli/internal/paths"
+	"podcli/internal/podstack"
 	"podcli/internal/provision"
 	"podcli/internal/update"
 )
@@ -22,8 +23,7 @@ var Version = "2.0.0-dev"
 func main() {
 	args := os.Args[1:]
 	if len(args) == 0 {
-		printHelp()
-		return
+		os.Exit(runEngine(args)) // backend's branded interactive menu
 	}
 
 	switch args[0] {
@@ -43,6 +43,9 @@ func main() {
 	case "help", "--help", "-h":
 		printHelp()
 	default:
+		if podstack.IsCommand(args[0]) {
+			os.Exit(podstack.Run(args[0], args[1:]))
+		}
 		os.Exit(runEngine(args))
 	}
 }
@@ -90,7 +93,7 @@ func configCmd(args []string) int {
 // --engine, PODCLI_ENGINE, then defaulting to whisper.cpp on a hermetic Python
 // (which has no openai-whisper).
 func transcribeEngine(args []string) string {
-	if args[0] != "process" && args[0] != "studio" {
+	if len(args) == 0 || (args[0] != "process" && args[0] != "studio") {
 		return ""
 	}
 	sel := strings.ToLower(os.Getenv("PODCLI_ENGINE"))
@@ -231,6 +234,12 @@ Engine commands (routed to the processing backend):
   thumbnails           Generate thumbnails
   knowledge | presets | assets | youtube | config | cache | info
 
+PodStack commands (run inside Claude Code / Codex):
+  auto <video>         One-verb pipeline: drop footage, get rendered clips
+  generate-titles | generate-descriptions | plan-thumbnails | plan-episode
+  process-transcript | produce-shorts | review-content | publish-checklist
+  retro-episode        Add --codex / --claude to pick the agent
+
 Launcher commands:
   doctor               Show resolved paths, interpreter, backend, ffmpeg, models
   version              Print version

From c5f9d917113934c3fd35f1b7cc571401d70e491f Mon Sep 17 00:00:00 2001
From: Nika Siradze <nikushasiradzee@gmail.com>
Date: Mon, 15 Jun 2026 13:31:32 +0400
Subject: [PATCH 19/41] Degrade Web UI launch gracefully on a native install

The studio is a Node/TypeScript app not shipped with the native binary, so
launching it ran npm against the managed runtime dir and crashed with ENOENT.
Detect the missing package.json (and missing npm) and print how to run the
studio from a source checkout instead.
---
 backend/cli.py | 33 +++++++++++++++++++++------------
 1 file changed, 21 insertions(+), 12 deletions(-)

diff --git a/backend/cli.py b/backend/cli.py
index a5fb82e..88af3c3 100644
--- a/backend/cli.py
+++ b/backend/cli.py
@@ -3561,20 +3561,29 @@ def interactive_menu():
             return
         elif choice == "webui":
             import subprocess as sp
+            import shutil as _shutil
             repo = os.path.join(os.path.dirname(os.path.abspath(__file__)), "..")
-            spa = os.path.join(repo, "dist", "ui", "public", "index.html")
             port = os.environ.get("PORT", "3847")
-            ok = True
-            # npm is npm.cmd on Windows; subprocess can't run a batch file without a shell.
-            _npm_shell = sys.platform == "win32"
-            if not os.path.exists(spa):
-                print(f"\n  {gray}Building the studio (first run)…{reset}\n")
-                ok = sp.run(["npm", "run", "build"], cwd=repo, shell=_npm_shell).returncode == 0
-                if not ok:
-                    print(f"\n  {yellow}Build failed — run 'npm install' then try again.{reset}\n")
-            if ok:
-                print(f"\n  {gray}Studio:{reset} {accent}http://localhost:{port}{reset}   {dim}(Ctrl+C to stop){reset}\n")
-                sp.run(["npm", "run", "ui:prod"], cwd=repo, shell=_npm_shell)
+            # The studio is a Node/TypeScript app that the native binary does not
+            # ship; it only runs from a source checkout with deps installed.
+            if not os.path.exists(os.path.join(repo, "package.json")):
+                print(f"\n  {yellow}The Web UI studio isn't bundled in the native install yet.{reset}")
+                print(f"  {dim}Run it from a source checkout:{reset} {accent}git clone … && npm install && npm run build && PORT={port} npm run ui:prod{reset}\n")
+            elif _shutil.which("npm") is None:
+                print(f"\n  {yellow}Node.js / npm not found on PATH — the studio needs them to build and serve.{reset}\n")
+            else:
+                spa = os.path.join(repo, "dist", "ui", "public", "index.html")
+                ok = True
+                # npm is npm.cmd on Windows; subprocess can't run a batch file without a shell.
+                _npm_shell = sys.platform == "win32"
+                if not os.path.exists(spa):
+                    print(f"\n  {gray}Building the studio (first run)…{reset}\n")
+                    ok = sp.run(["npm", "run", "build"], cwd=repo, shell=_npm_shell).returncode == 0
+                    if not ok:
+                        print(f"\n  {yellow}Build failed — run 'npm install' then try again.{reset}\n")
+                if ok:
+                    print(f"\n  {gray}Studio:{reset} {accent}http://localhost:{port}{reset}   {dim}(Ctrl+C to stop){reset}\n")
+                    sp.run(["npm", "run", "ui:prod"], cwd=repo, shell=_npm_shell)
         elif choice == "assets":
             _interactive_assets()
         elif choice == "presets":

From 53c4d9b434d3e3f387f7816bd8c07e954f779373 Mon Sep 17 00:00:00 2001
From: Nika Siradze <nikushasiradzee@gmail.com>
Date: Mon, 15 Jun 2026 13:51:01 +0400
Subject: [PATCH 20/41] Serve the Web UI studio from a hermetic Node in native
 installs

Launching the Web UI ran npm against the managed runtime and crashed on a native
install. Provision a hermetic Node and a prebuilt studio bundle (esbuild'd
server + SPA, built by scripts/build-studio.sh and shipped as a release
asset), then launch them from the menu with rendering delegated to the
hermetic Python + ffmpeg. The studio resolves its backend via a new
PODCLI_BACKEND override in paths.ts; doctor and setup report node/studio;
CI builds and uploads studio-bundle.tar.gz. Falls back to system Node, then
to the npm dev flow from a source checkout.
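
The fallback order above can be sketched as a pure decision function. This is
an illustration only: the real launch logic in this patch is Python in
backend/cli.py, and pickStudioMode, its parameters, its mode names, and the
example path are all hypothetical.

```go
package main

import "fmt"

// pickStudioMode is a hypothetical helper that mirrors the fallback order in
// the commit message; it is not part of the patch.
//
//	node:         path to a Node binary (hermetic first, else system), "" if none
//	bundleExists: whether the prebuilt web-server.mjs is present
//	sourceRepo:   whether a package.json sits next to the backend
//	npmOnPath:    whether npm was found on PATH
func pickStudioMode(node string, bundleExists, sourceRepo, npmOnPath bool) string {
	switch {
	case node != "" && bundleExists:
		return "bundled" // serve the prebuilt bundle with Node
	case sourceRepo && npmOnPath:
		return "npm-dev" // build and serve via npm from a checkout
	default:
		return "unprovisioned" // tell the user to run `podcli setup`
	}
}

func main() {
	// Example path only; the real location comes from the runtime dir.
	fmt.Println(pickStudioMode("/opt/podcli/node/bin/node", true, false, false))
	fmt.Println(pickStudioMode("", false, true, true))
	fmt.Println(pickStudioMode("", false, false, false))
}
```

Keeping the decision separate from the process launch makes each branch easy
to check without Node or npm installed.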
---
 .github/workflows/release.yml    |  19 ++-
 backend/cli.py                   |  37 +++--
 cli/internal/engine/engine.go    |  26 ++++
 cli/internal/provision/studio.go | 224 +++++++++++++++++++++++++++++++
 cli/main.go                      |  20 +++
 scripts/build-studio.sh          |  18 +++
 src/config/paths.ts              |   4 +-
 7 files changed, 335 insertions(+), 13 deletions(-)
 create mode 100644 cli/internal/provision/studio.go
 create mode 100644 scripts/build-studio.sh

diff --git a/.github/workflows/release.yml b/.github/workflows/release.yml
index fe5b629..88f982a 100644
--- a/.github/workflows/release.yml
+++ b/.github/workflows/release.yml
@@ -88,8 +88,25 @@ jobs:
           name: ${{ env.ASSET }}
           path: ${{ env.ASSET }}
 
+  studio:
+    runs-on: ubuntu-latest
+    steps:
+      - uses: actions/checkout@v4
+      - uses: actions/setup-node@v4
+        with:
+          node-version: '20'
+      - name: Build studio bundle
+        run: |
+          npm ci
+          sh scripts/build-studio.sh dist/studio
+          tar -czf studio-bundle.tar.gz -C dist/studio .
+      - uses: actions/upload-artifact@v4
+        with:
+          name: studio-bundle.tar.gz
+          path: studio-bundle.tar.gz
+
   release:
-    needs: [build, whisper]
+    needs: [build, whisper, studio]
     runs-on: ubuntu-latest
     steps:
       - uses: actions/download-artifact@v4
diff --git a/backend/cli.py b/backend/cli.py
index 88af3c3..3897293 100644
--- a/backend/cli.py
+++ b/backend/cli.py
@@ -3562,20 +3562,32 @@ def interactive_menu():
         elif choice == "webui":
             import subprocess as sp
             import shutil as _shutil
-            repo = os.path.join(os.path.dirname(os.path.abspath(__file__)), "..")
+            backend_dir = os.path.dirname(os.path.abspath(__file__))
             port = os.environ.get("PORT", "3847")
-            # The studio is a Node/TypeScript app that the native binary does not
-            # ship; it only runs from a source checkout with deps installed.
-            if not os.path.exists(os.path.join(repo, "package.json")):
-                print(f"\n  {yellow}The Web UI studio isn't bundled in the native install yet.{reset}")
-                print(f"  {dim}Run it from a source checkout:{reset} {accent}git clone … && npm install && npm run build && PORT={port} npm run ui:prod{reset}\n")
-            elif _shutil.which("npm") is None:
-                print(f"\n  {yellow}Node.js / npm not found on PATH — the studio needs them to build and serve.{reset}\n")
-            else:
+            node = os.environ.get("PODCLI_NODE") or _shutil.which("node")
+            studio = os.environ.get("PODCLI_STUDIO") or os.path.join(backend_dir, "..", "studio")
+            server = os.path.join(studio, "web-server.mjs")
+            repo = os.path.join(backend_dir, "..")
+            if node and os.path.exists(server):
+                # Bundled studio: hermetic Node serves it, rendering delegated to
+                # this same Python backend + ffmpeg via the env below.
+                env = {
+                    **os.environ,
+                    "PORT": str(port),
+                    "PODCLI_BACKEND": backend_dir,
+                    "PYTHON_PATH": sys.executable,
+                    "PODCLI_HOME": paths["home"],
+                    "PODCLI_DATA": os.path.dirname(paths["output"]),
+                    "FFMPEG_PATH": os.environ.get("PODCLI_FFMPEG", "ffmpeg"),
+                    "FFPROBE_PATH": os.environ.get("PODCLI_FFPROBE", "ffprobe"),
+                }
+                print(f"\n  {gray}Studio:{reset} {accent}http://localhost:{port}{reset}   {dim}(Ctrl+C to stop){reset}\n")
+                sp.run([node, server], env=env)
+            elif os.path.exists(os.path.join(repo, "package.json")) and _shutil.which("npm"):
+                # Source checkout (dev): build + serve via npm.
+                _npm_shell = sys.platform == "win32"
                 spa = os.path.join(repo, "dist", "ui", "public", "index.html")
                 ok = True
-                # npm is npm.cmd on Windows; subprocess can't run a batch file without a shell.
-                _npm_shell = sys.platform == "win32"
                 if not os.path.exists(spa):
                     print(f"\n  {gray}Building the studio (first run)…{reset}\n")
                     ok = sp.run(["npm", "run", "build"], cwd=repo, shell=_npm_shell).returncode == 0
@@ -3584,6 +3596,9 @@ def interactive_menu():
                 if ok:
                     print(f"\n  {gray}Studio:{reset} {accent}http://localhost:{port}{reset}   {dim}(Ctrl+C to stop){reset}\n")
                     sp.run(["npm", "run", "ui:prod"], cwd=repo, shell=_npm_shell)
+            else:
+                print(f"\n  {yellow}Studio isn't provisioned yet.{reset}")
+                print(f"  {dim}Run{reset} {accent}podcli setup{reset} {dim}to fetch the bundled studio + Node.{reset}\n")
         elif choice == "assets":
             _interactive_assets()
         elif choice == "presets":
diff --git a/cli/internal/engine/engine.go b/cli/internal/engine/engine.go
index 08f976b..ee57793 100644
--- a/cli/internal/engine/engine.go
+++ b/cli/internal/engine/engine.go
@@ -88,6 +88,26 @@ func FFmpeg() string     { return runtimeBin("ffmpeg", "ffmpeg") }
 func FFprobe() string    { return runtimeBin("ffmpeg", "ffprobe") }
 func WhisperCLI() string { return runtimeBin("whisper", "whisper-cli") }
 
+func Node() string {
+	for _, p := range []string{
+		filepath.Join(paths.RuntimeDir(), "node", "bin", "node"),
+		filepath.Join(paths.RuntimeDir(), "node", "node.exe"),
+	} {
+		if exists(p) {
+			return p
+		}
+	}
+	return ""
+}
+
+func StudioServer() string {
+	p := filepath.Join(paths.RuntimeDir(), "studio", "web-server.mjs")
+	if exists(p) {
+		return p
+	}
+	return ""
+}
+
 func Run(args []string) (int, error) {
 	root, ok := BackendRoot()
 	if !ok {
@@ -112,6 +132,12 @@ func Run(args []string) (int, error) {
 	if wc := WhisperCLI(); wc != "" {
 		env = append(env, "PODCLI_WHISPER_CLI="+wc)
 	}
+	if nd := Node(); nd != "" {
+		env = append(env, "PODCLI_NODE="+nd)
+	}
+	if ss := StudioServer(); ss != "" {
+		env = append(env, "PODCLI_STUDIO="+filepath.Dir(ss))
+	}
 	cmd.Env = env
 
 	if err := cmd.Run(); err != nil {
diff --git a/cli/internal/provision/studio.go b/cli/internal/provision/studio.go
new file mode 100644
index 0000000..726ed3b
--- /dev/null
+++ b/cli/internal/provision/studio.go
@@ -0,0 +1,224 @@
+package provision
+
+import (
+	"archive/tar"
+	"archive/zip"
+	"compress/gzip"
+	"fmt"
+	"io"
+	"os"
+	"path/filepath"
+	"runtime"
+	"strings"
+
+	"podcli/internal/paths"
+)
+
+// nodeVersion is the hermetic Node.js used to serve the studio web UI. Node only
+// runs the bundled server (rendering is delegated to the Python backend), so any
+// recent LTS works; pin for reproducibility.
+const nodeVersion = "20.18.1"
+
+var nodeTriples = map[string]string{
+	"darwin/amd64":  "darwin-x64",
+	"darwin/arm64":  "darwin-arm64",
+	"linux/amd64":   "linux-x64",
+	"linux/arm64":   "linux-arm64",
+	"windows/amd64": "win-x64",
+}
+
+func NodeDir() string { return filepath.Join(paths.RuntimeDir(), "node") }
+
+func NodeBin() string {
+	if runtime.GOOS == "windows" {
+		return filepath.Join(NodeDir(), "node.exe")
+	}
+	return filepath.Join(NodeDir(), "bin", "node")
+}
+
+func StudioDir() string    { return filepath.Join(paths.RuntimeDir(), "studio") }
+func StudioServer() string { return filepath.Join(StudioDir(), "web-server.mjs") }
+
+func EnsureNode() (string, error) {
+	bin := NodeBin()
+	if have(bin) {
+		return bin, nil
+	}
+	triple, ok := nodeTriples[runtime.GOOS+"/"+runtime.GOARCH]
+	if !ok {
+		return "", fmt.Errorf("no node build for %s/%s", runtime.GOOS, runtime.GOARCH)
+	}
+	ext := "tar.gz"
+	if runtime.GOOS == "windows" {
+		ext = "zip"
+	}
+	base := fmt.Sprintf("node-v%s-%s", nodeVersion, triple)
+	url := fmt.Sprintf("https://nodejs.org/dist/v%s/%s.%s", nodeVersion, base, ext)
+	archive := filepath.Join(os.TempDir(), "podcli-"+base+"."+ext)
+	if err := fetch(url, archive, "node"); err != nil {
+		return "", err
+	}
+	defer os.Remove(archive)
+	if err := os.RemoveAll(NodeDir()); err != nil {
+		return "", err
+	}
+	var err error
+	if ext == "zip" {
+		err = extractZipStrip1(archive, NodeDir())
+	} else {
+		err = extractTarGzStrip1(archive, NodeDir())
+	}
+	if err != nil {
+		return "", err
+	}
+	if runtime.GOOS != "windows" {
+		os.Chmod(bin, 0o755)
+	}
+	if !have(bin) {
+		return "", fmt.Errorf("node missing after extraction in %s", NodeDir())
+	}
+	return bin, nil
+}
+
+// EnsureStudio fetches the prebuilt, platform-independent studio bundle (server +
+// SPA) from the latest release into StudioDir.
+func EnsureStudio() (string, error) {
+	server := StudioServer()
+	if have(server) {
+		return StudioDir(), nil
+	}
+	url, err := latestReleaseAssetURL("studio-bundle.tar.gz")
+	if err != nil {
+		return "", err
+	}
+	archive := filepath.Join(os.TempDir(), "podcli-studio-bundle.tar.gz")
+	if err := fetch(url, archive, "studio"); err != nil {
+		return "", err
+	}
+	defer os.Remove(archive)
+	if err := os.RemoveAll(StudioDir()); err != nil {
+		return "", err
+	}
+	if err := os.MkdirAll(StudioDir(), 0o755); err != nil {
+		return "", err
+	}
+	if err := extractTarGz(archive, StudioDir()); err != nil {
+		return "", err
+	}
+	if !have(server) {
+		return "", fmt.Errorf("studio server missing after extraction in %s", StudioDir())
+	}
+	return StudioDir(), nil
+}
+
+// strip1 drops the leading path component (node tarballs/zips nest everything
+// under node-vX-os-arch/).
+func strip1(name string) string {
+	name = filepath.ToSlash(name)
+	if i := strings.IndexByte(name, '/'); i >= 0 {
+		return name[i+1:]
+	}
+	return ""
+}
+
+func extractTarGzStrip1(archive, dest string) error {
+	f, err := os.Open(archive)
+	if err != nil {
+		return err
+	}
+	defer f.Close()
+	gz, err := gzip.NewReader(f)
+	if err != nil {
+		return err
+	}
+	defer gz.Close()
+	tr := tar.NewReader(gz)
+	root := filepath.Clean(dest) + string(os.PathSeparator)
+	for {
+		h, err := tr.Next()
+		if err == io.EOF {
+			break
+		}
+		if err != nil {
+			return err
+		}
+		rel := strip1(h.Name)
+		if rel == "" {
+			continue
+		}
+		target := filepath.Join(dest, rel)
+		if !strings.HasPrefix(target, root) {
+			return fmt.Errorf("unsafe path in archive: %s", h.Name)
+		}
+		switch h.Typeflag {
+		case tar.TypeDir:
+			if err := os.MkdirAll(target, 0o755); err != nil {
+				return err
+			}
+		case tar.TypeReg:
+			if err := os.MkdirAll(filepath.Dir(target), 0o755); err != nil {
+				return err
+			}
+			out, err := os.OpenFile(target, os.O_CREATE|os.O_TRUNC|os.O_WRONLY, os.FileMode(h.Mode))
+			if err != nil {
+				return err
+			}
+			_, err = io.Copy(out, tr)
+			out.Close()
+			if err != nil {
+				return err
+			}
+		case tar.TypeSymlink:
+			if err := os.MkdirAll(filepath.Dir(target), 0o755); err != nil {
+				return err
+			}
+			os.Remove(target)
+			os.Symlink(h.Linkname, target)
+		}
+	}
+	return nil
+}
+
+func extractZipStrip1(archive, dest string) error {
+	zr, err := zip.OpenReader(archive)
+	if err != nil {
+		return err
+	}
+	defer zr.Close()
+	root := filepath.Clean(dest) + string(os.PathSeparator)
+	for _, zf := range zr.File {
+		rel := strip1(zf.Name)
+		if rel == "" {
+			continue
+		}
+		target := filepath.Join(dest, rel)
+		if !strings.HasPrefix(target, root) {
+			return fmt.Errorf("unsafe path in archive: %s", zf.Name)
+		}
+		if zf.FileInfo().IsDir() {
+			if err := os.MkdirAll(target, 0o755); err != nil {
+				return err
+			}
+			continue
+		}
+		if err := os.MkdirAll(filepath.Dir(target), 0o755); err != nil {
+			return err
+		}
+		rc, err := zf.Open()
+		if err != nil {
+			return err
+		}
+		out, err := os.OpenFile(target, os.O_CREATE|os.O_TRUNC|os.O_WRONLY, zf.Mode())
+		if err != nil {
+			rc.Close()
+			return err
+		}
+		_, err = io.Copy(out, rc)
+		out.Close()
+		rc.Close()
+		if err != nil {
+			return err
+		}
+	}
+	return nil
+}
diff --git a/cli/main.go b/cli/main.go
index 127da04..f31f17d 100644
--- a/cli/main.go
+++ b/cli/main.go
@@ -168,6 +168,16 @@ func setup(args []string) int {
 	} else {
 		fmt.Printf("  whisper: %s\n", wc)
 	}
+	if nb, err := provision.EnsureNode(); err != nil {
+		fmt.Fprintf(os.Stderr, "  node:    skipped (%v) — Web UI will use system Node if present\n", err)
+	} else {
+		fmt.Printf("  node:    %s\n", nb)
+	}
+	if sd, err := provision.EnsureStudio(); err != nil {
+		fmt.Fprintf(os.Stderr, "  studio:  skipped (%v) — Web UI needs a published release\n", err)
+	} else {
+		fmt.Printf("  studio:  %s\n", sd)
+	}
 	fmt.Println("Done.")
 	return 0
 }
@@ -198,6 +208,16 @@ func doctor() {
 	} else {
 		fmt.Printf("  whisper:  PATH fallback (install whisper-cli, or provisioned once hosted)\n")
 	}
+	if nd := engine.Node(); nd != "" {
+		fmt.Printf("  node:     %s (hermetic)\n", nd)
+	} else {
+		fmt.Printf("  node:     PATH fallback (Web UI uses system Node, or run `podcli setup`)\n")
+	}
+	if ss := engine.StudioServer(); ss != "" {
+		fmt.Printf("  studio:   %s\n", ss)
+	} else {
+		fmt.Printf("  studio:   not provisioned (Web UI needs a published release)\n")
+	}
 	fmt.Println("\nModels")
 	fmt.Printf("  base:     %s\n", presence(provision.ModelPath("base")))
 	fmt.Printf("  vad:      %s\n", presence(provision.VADModelPath()))
diff --git a/scripts/build-studio.sh b/scripts/build-studio.sh
new file mode 100644
index 0000000..655e033
--- /dev/null
+++ b/scripts/build-studio.sh
@@ -0,0 +1,18 @@
+#!/bin/sh
+# Build the studio web-UI bundle: a single self-contained server (esbuild) plus
+# the built SPA. Rendering is delegated to the Python backend at runtime, so the
+# bundle needs no node_modules — only a Node runtime to execute it.
+#
+# Usage: scripts/build-studio.sh [out-dir]   (default: dist/studio)
+set -e
+here="$(cd "$(dirname "$0")/.." && pwd)"
+out="${1:-$here/dist/studio}"
+cd "$here"
+
+npm run build   # tsc + vite -> dist/ui/web-server.js + dist/ui/public
+
+rm -rf "$out"
+mkdir -p "$out"
+node -e "require('esbuild').buildSync({entryPoints:['dist/ui/web-server.js'],bundle:true,platform:'node',format:'esm',outfile:'$out/web-server.mjs',banner:{js:\"import{createRequire as _cr}from'module';const require=_cr(import.meta.url);\"},logLevel:'error'})"
+cp -r dist/ui/public "$out/public"
+echo "studio bundle -> $out"
diff --git a/src/config/paths.ts b/src/config/paths.ts
index 1535c02..c0f4d42 100644
--- a/src/config/paths.ts
+++ b/src/config/paths.ts
@@ -60,7 +60,9 @@ export const paths = {
   corrections: join(home, "corrections.json"),
   thumbnailConfig: join(home, "thumbnail-config.json"),
   integrations: join(home, "integrations.json"),
-  pythonBackend: join(projectRoot, "backend", "main.py"),
+  pythonBackend: process.env.PODCLI_BACKEND
+    ? join(resolve(process.env.PODCLI_BACKEND), "main.py")
+    : join(projectRoot, "backend", "main.py"),
   pythonPath: detectPython(),
   ffmpegPath: process.env.FFMPEG_PATH || "ffmpeg",
   ffprobePath: process.env.FFPROBE_PATH || "ffprobe",

From 36fe400212d7de556c677168a49e9bc0ba733f57 Mon Sep 17 00:00:00 2001
From: Nika Siradze <nikushasiradzee@gmail.com>
Date: Mon, 15 Jun 2026 14:43:36 +0400
Subject: [PATCH 21/41] Resolve project data + .env from the working directory
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

With the backend relocated to the global runtime dir, the Python backend
derived .podcli/, data/, and .env from its own location, so a native install
stranded each project's episodes, presets, and secrets in the global runtime
folder. The launcher now resolves the project root from the working dir
(nearest ancestor containing .podcli or .podcli-home, else cwd) and passes
PODCLI_HOME, PODCLI_DATA, and PODCLI_ENV_FILE to the backend; explicitly set
env vars still win. Existing projects work unchanged: just run podcli in the
project dir.
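
A minimal sketch of the nearest-ancestor rule, assuming POSIX-style paths.
findProjectRoot and onDisk are hypothetical names used for illustration; the
existence check is passed in as a function so the walk can be exercised
without touching the filesystem.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// findProjectRoot walks from start toward the filesystem root and returns the
// first directory that contains a .podcli or .podcli-home entry. If no
// ancestor has one, it returns start unchanged.
func findProjectRoot(start string, exists func(string) bool) string {
	for dir := start; ; {
		if exists(filepath.Join(dir, ".podcli")) || exists(filepath.Join(dir, ".podcli-home")) {
			return dir
		}
		parent := filepath.Dir(dir)
		if parent == dir { // reached the root without finding a marker
			return start
		}
		dir = parent
	}
}

// onDisk reports whether a path exists on the real filesystem.
func onDisk(p string) bool {
	_, err := os.Stat(p)
	return err == nil
}

func main() {
	cwd, err := os.Getwd()
	if err != nil {
		fmt.Fprintln(os.Stderr, "cannot determine working directory:", err)
		os.Exit(1)
	}
	fmt.Println(findProjectRoot(cwd, onDisk))
}
```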
---
 backend/cli.py                |  2 +-
 backend/main.py               |  2 +-
 cli/internal/engine/engine.go | 39 +++++++++++++++++++++++++++++++++++
 3 files changed, 41 insertions(+), 2 deletions(-)

diff --git a/backend/cli.py b/backend/cli.py
index 3897293..2e1f0d5 100644
--- a/backend/cli.py
+++ b/backend/cli.py
@@ -26,7 +26,7 @@
         except (ValueError, OSError):
             pass
 
-_env_file = os.path.join(os.path.dirname(os.path.abspath(__file__)), "..", ".env")
+_env_file = os.environ.get("PODCLI_ENV_FILE") or os.path.join(os.path.dirname(os.path.abspath(__file__)), "..", ".env")
 try:
     from dotenv import load_dotenv
     load_dotenv(_env_file)
diff --git a/backend/main.py b/backend/main.py
index 079ff5c..ff1e27a 100644
--- a/backend/main.py
+++ b/backend/main.py
@@ -21,7 +21,7 @@
 # Load .env file (for HF_TOKEN, etc.)
 try:
     from dotenv import load_dotenv
-    load_dotenv(os.path.join(os.path.dirname(os.path.abspath(__file__)), "..", ".env"))
+    load_dotenv(os.environ.get("PODCLI_ENV_FILE") or os.path.join(os.path.dirname(os.path.abspath(__file__)), "..", ".env"))
 except ImportError:
     pass
 import traceback
diff --git a/cli/internal/engine/engine.go b/cli/internal/engine/engine.go
index ee57793..ccad2cf 100644
--- a/cli/internal/engine/engine.go
+++ b/cli/internal/engine/engine.go
@@ -108,6 +108,32 @@ func StudioServer() string {
 	return ""
 }
 
+// ProjectDir resolves the user's project root: the nearest ancestor of the
+// working directory holding a .podcli dir or .podcli-home marker, else the
+// working directory itself. This keeps episode data, presets, and .env
+// project-local — the behavior of the old in-repo launcher — now that the
+// backend lives in the global runtime dir instead of beside the data.
+func ProjectDir() string {
+	dir, err := os.Getwd()
+	if err != nil {
+		return ""
+	}
+	for {
+		if exists(filepath.Join(dir, ".podcli")) || exists(filepath.Join(dir, ".podcli-home")) {
+			return dir
+		}
+		parent := filepath.Dir(dir)
+		if parent == dir {
+			break
+		}
+		dir = parent
+	}
+	if cwd, err := os.Getwd(); err == nil {
+		return cwd
+	}
+	return ""
+}
+
 func Run(args []string) (int, error) {
 	root, ok := BackendRoot()
 	if !ok {
@@ -138,6 +164,19 @@ func Run(args []string) (int, error) {
 	if ss := StudioServer(); ss != "" {
 		env = append(env, "PODCLI_STUDIO="+filepath.Dir(ss))
 	}
+	// Pin data + .env to the user's project dir so the global runtime backend
+	// doesn't strand project-local episodes/presets. Explicit env wins.
+	if proj := ProjectDir(); proj != "" {
+		if os.Getenv("PODCLI_HOME") == "" {
+			env = append(env, "PODCLI_HOME="+filepath.Join(proj, ".podcli"))
+		}
+		if os.Getenv("PODCLI_DATA") == "" {
+			env = append(env, "PODCLI_DATA="+filepath.Join(proj, "data"))
+		}
+		if os.Getenv("PODCLI_ENV_FILE") == "" {
+			env = append(env, "PODCLI_ENV_FILE="+filepath.Join(proj, ".env"))
+		}
+	}
 	cmd.Env = env
 
 	if err := cmd.Run(); err != nil {

From f3791db0e2729a86912c174c19e91b35648d142b Mon Sep 17 00:00:00 2001
From: Nika Siradze <nikushasiradzee@gmail.com>
Date: Mon, 15 Jun 2026 15:29:41 +0400
Subject: [PATCH 22/41] Harden the native install against dropped-dependency
 regressions

Four parity fixes vs the source install: --engine whisper-py now fails with
a clear message instead of a raw ModuleNotFoundError; info/menu report
speakers as unavailable when pyannote/torch aren't present rather than
'configured' on HF_TOKEN alone; the launcher filters the objc/FP16/warnings
stderr noise the old bash wrapper stripped; and the unused 10MB res10
caffemodel is dropped from the embedded backend (YuNet onnx is the live one).
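
The stderr filtering can be sketched as a line-buffering io.Writer. This is a
simplified illustration, not the code in the patch: lineFilter, isNoise, and
filterString are hypothetical names, only two of the noise patterns are
shown, and output goes to any io.Writer so the behavior can be checked in
memory.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
	"strings"
)

// lineFilter buffers partial lines and forwards each complete line to out
// unless drop reports it as noise.
type lineFilter struct {
	out  io.Writer
	drop func(string) bool
	buf  []byte
}

func (f *lineFilter) Write(p []byte) (int, error) {
	f.buf = append(f.buf, p...)
	for {
		i := bytes.IndexByte(f.buf, '\n')
		if i < 0 {
			return len(p), nil // keep the unfinished line for the next Write
		}
		line := string(f.buf[:i])
		f.buf = f.buf[i+1:]
		if !f.drop(line) {
			io.WriteString(f.out, line+"\n")
		}
	}
}

// Flush emits a trailing line that never received its newline.
func (f *lineFilter) Flush() {
	if len(f.buf) > 0 && !f.drop(string(f.buf)) {
		f.out.Write(f.buf)
	}
	f.buf = nil
}

func isNoise(line string) bool {
	return strings.HasPrefix(line, "objc[") || strings.Contains(line, "FP16 is not supported")
}

// filterString runs the chunks through a lineFilter and returns what survives.
func filterString(chunks ...string) string {
	var sb strings.Builder
	f := &lineFilter{out: &sb, drop: isNoise}
	for _, c := range chunks {
		f.Write([]byte(c))
	}
	f.Flush()
	return sb.String()
}

func main() {
	// The second line is split across two writes to show the buffering.
	fmt.Print(filterString("objc[1]: duplicate class\nreal er", "ror\n"))
}
```

Buffering matters because a child process may deliver a single line in
several writes; matching on raw chunks could miss a pattern that straddles
two of them.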
---
 backend/cli.py                    | 21 ++++++++++++++--
 backend/services/transcription.py |  8 +++++-
 cli/internal/backend/sync.sh      |  2 ++
 cli/internal/engine/engine.go     | 42 +++++++++++++++++++++++++++++--
 4 files changed, 68 insertions(+), 5 deletions(-)

diff --git a/backend/cli.py b/backend/cli.py
index 2e1f0d5..fd4d39a 100644
--- a/backend/cli.py
+++ b/backend/cli.py
@@ -3026,8 +3026,20 @@ def cmd_info(args):
     print(f"    Platform:     {info['system']}")
     print(f"    Encoder:      {info['best']}")
     print(f"    Available:    {', '.join(info['available'])}")
+    import importlib.util
+    try:
+        diarization_available = importlib.util.find_spec("pyannote.audio") is not None
+    except (ImportError, ValueError):
+        diarization_available = False
+    if not diarization_available:
+        speakers_status = f"{yellow}✗ not available in this install (source install only)"
+    elif hf_token:
+        speakers_status = f"{green}✓ configured"
+    else:
+        speakers_status = f"{yellow}✗ set HF_TOKEN in .env"
+
     print(f"    AI CLI:       {green}{('Claude' if ai_engine == 'claude' else 'Codex') + ' (' + ai_path + ')' if ai_path else f'{yellow}not found — install Claude Code or Codex'}{reset}")
-    print(f"    Speakers:     {green + '✓ configured' if hf_token else yellow + '✗ set HF_TOKEN in .env'}{reset}")
+    print(f"    Speakers:     {speakers_status}{reset}")
     print()
 
 
@@ -3086,7 +3098,12 @@ def print_banner():
                         hf_token = line.strip().split("=", 1)[1].strip()
                         break
 
-    speakers_ok = bool(hf_token)
+    import importlib.util
+    try:
+        _diarization_ok = importlib.util.find_spec("pyannote.audio") is not None
+    except (ImportError, ValueError):
+        _diarization_ok = False
+    speakers_ok = bool(hf_token) and _diarization_ok
 
     # Check AI CLI (Claude Code or Codex)
     from services.claude_suggest import _find_ai_cli
diff --git a/backend/services/transcription.py b/backend/services/transcription.py
index 8cdac5e..4d80528 100644
--- a/backend/services/transcription.py
+++ b/backend/services/transcription.py
@@ -91,7 +91,13 @@ def transcribe_file(
     if progress_callback:
         progress_callback(5, "Loading Whisper model...")
 
-    import whisper
+    try:
+        import whisper
+    except ImportError as e:
+        raise RuntimeError(
+            "The whisper-py engine needs the full source install (openai-whisper + torch). "
+            "This native install ships whisper.cpp — rerun with --engine whispercpp."
+        ) from e
 
     model = whisper.load_model(model_size)
 
diff --git a/cli/internal/backend/sync.sh b/cli/internal/backend/sync.sh
index b84af4f..51ca2ac 100644
--- a/cli/internal/backend/sync.sh
+++ b/cli/internal/backend/sync.sh
@@ -13,5 +13,7 @@ rsync -a \
   --exclude='venv' \
   --exclude='.venv' \
   --exclude='requirements.txt' \
+  --exclude='models/res10_300x300_ssd_iter_140000.caffemodel' \
+  --exclude='models/deploy.prototxt' \
   "$src"/ "$dest"/
 echo "synced backend -> $dest"
diff --git a/cli/internal/engine/engine.go b/cli/internal/engine/engine.go
index ccad2cf..682cdb1 100644
--- a/cli/internal/engine/engine.go
+++ b/cli/internal/engine/engine.go
@@ -2,6 +2,7 @@
 package engine
 
 import (
+	"bytes"
 	"fmt"
 	"os"
 	"os/exec"
@@ -12,6 +13,40 @@ import (
 	"podcli/internal/paths"
 )
 
+// stderrFilter drops the macOS/OpenCV/Whisper startup noise the old bash
+// launcher stripped with `grep -v`, so native runs aren't louder than before.
+// It buffers partial lines and forwards everything else to real stderr.
+type stderrFilter struct{ buf []byte }
+
+func isStderrNoise(line string) bool {
+	return strings.HasPrefix(line, "objc[") ||
+		strings.Contains(line, "FP16 is not supported") ||
+		strings.Contains(line, "warnings.warn")
+}
+
+func (w *stderrFilter) Write(p []byte) (int, error) {
+	w.buf = append(w.buf, p...)
+	for {
+		i := bytes.IndexByte(w.buf, '\n')
+		if i < 0 {
+			break
+		}
+		line := w.buf[:i]
+		w.buf = w.buf[i+1:]
+		if !isStderrNoise(string(line)) {
+			os.Stderr.Write(append(line, '\n'))
+		}
+	}
+	return len(p), nil
+}
+
+func (w *stderrFilter) flush() {
+	if len(w.buf) > 0 && !isStderrNoise(string(w.buf)) {
+		os.Stderr.Write(w.buf)
+	}
+	w.buf = nil
+}
+
 // IsHermeticPython reports whether the resolved interpreter is the provisioned
 // one (which lacks openai-whisper, so transcription must default to whisper.cpp).
 func IsHermeticPython() bool {
@@ -143,7 +178,8 @@ func Run(args []string) (int, error) {
 	full := append([]string{"-W", "ignore::UserWarning", cli}, args...)
 
 	cmd := exec.Command(Python(), full...)
-	cmd.Stdin, cmd.Stdout, cmd.Stderr = os.Stdin, os.Stdout, os.Stderr
+	sf := &stderrFilter{}
+	cmd.Stdin, cmd.Stdout, cmd.Stderr = os.Stdin, os.Stdout, sf
 	env := append(os.Environ(),
 		"OBJC_DISABLE_INITIALIZE_FORK_SAFETY=YES",
 		"PYTHONIOENCODING=utf-8",
@@ -179,7 +215,9 @@ func Run(args []string) (int, error) {
 	}
 	cmd.Env = env
 
-	if err := cmd.Run(); err != nil {
+	err := cmd.Run()
+	sf.flush()
+	if err != nil {
 		if ee, ok := err.(*exec.ExitError); ok {
 			return ee.ExitCode(), nil
 		}

From bfb08673aae933cec97a610fec41375ac8181c28 Mon Sep 17 00:00:00 2001
From: Nika Siradze <nikushasiradzee@gmail.com>
Date: Mon, 15 Jun 2026 15:44:35 +0400
Subject: [PATCH 23/41] Ship the MCP server in the native CLI

The mcp__podcli__* tools (the entire Claude/Codex integration the PodStack
slash commands depend on) lived only in the TypeScript dist/index.js and were
absent from the native install. Bundle the MCP stdio server alongside the
studio (esbuild -> mcp-server.mjs), add a `podcli mcp` command that execs it
under hermetic Node with the backend/python/ffmpeg + project env wired, and a
`podcli mcp install` (run automatically by setup) that registers it with
Claude Code via `claude mcp add podcli -- <self> mcp`. doctor reports it.
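
As an illustration of the registration step, the argument list for the
`claude mcp add podcli -- <self> mcp` form quoted above could be built as
follows. mcpAddArgs is a hypothetical helper, and the sketch only prints the
command instead of running it.

```go
package main

import (
	"fmt"
	"os"
	"strings"
)

// mcpAddArgs returns the arguments for `claude mcp add podcli -- <self> mcp`,
// where self is the path of the running podcli binary.
func mcpAddArgs(self string) []string {
	return []string{"mcp", "add", "podcli", "--", self, "mcp"}
}

func main() {
	self, err := os.Executable()
	if err != nil {
		fmt.Fprintln(os.Stderr, "cannot resolve own path:", err)
		os.Exit(1)
	}
	// Print rather than execute, so the sketch has no side effects.
	fmt.Println("claude " + strings.Join(mcpAddArgs(self), " "))
}
```

Registering the launcher's own path means Claude Code starts the MCP server
through `podcli mcp`, which is where the hermetic Node and project env are
wired in.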
---
 cli/internal/engine/engine.go | 56 +++++++++++++++++++++++++++++++++++
 cli/main.go                   | 53 +++++++++++++++++++++++++++++++++
 scripts/build-studio.sh       |  8 +++--
 3 files changed, 115 insertions(+), 2 deletions(-)

diff --git a/cli/internal/engine/engine.go b/cli/internal/engine/engine.go
index 682cdb1..6e3260f 100644
--- a/cli/internal/engine/engine.go
+++ b/cli/internal/engine/engine.go
@@ -143,6 +143,62 @@ func StudioServer() string {
 	return ""
 }
 
+func MCPServer() string {
+	p := filepath.Join(paths.RuntimeDir(), "studio", "mcp-server.mjs")
+	if exists(p) {
+		return p
+	}
+	return ""
+}
+
+// nodeEnv builds the env a bundled Node server (studio/MCP) needs: the TS
+// paths.ts reads these names (note PYTHON_PATH/FFMPEG_PATH differ from the
+// PODCLI_* names the Python side uses). Project data stays cwd-local.
+func nodeEnv() []string {
+	env := os.Environ()
+	if root, ok := BackendRoot(); ok {
+		env = append(env, "PODCLI_BACKEND="+root)
+	}
+	env = append(env, "PYTHON_PATH="+Python())
+	if ff := FFmpeg(); ff != "" {
+		env = append(env, "FFMPEG_PATH="+ff)
+	}
+	if fp := FFprobe(); fp != "" {
+		env = append(env, "FFPROBE_PATH="+fp)
+	}
+	if proj := ProjectDir(); proj != "" {
+		if os.Getenv("PODCLI_HOME") == "" {
+			env = append(env, "PODCLI_HOME="+filepath.Join(proj, ".podcli"))
+		}
+		if os.Getenv("PODCLI_DATA") == "" {
+			env = append(env, "PODCLI_DATA="+filepath.Join(proj, "data"))
+		}
+		if os.Getenv("PODCLI_ENV_FILE") == "" {
+			env = append(env, "PODCLI_ENV_FILE="+filepath.Join(proj, ".env"))
+		}
+	}
+	return env
+}
+
+// RunMCP execs the bundled MCP stdio server under hermetic Node. stdout carries
+// JSON-RPC, so nothing here may write to it.
+func RunMCP() (int, error) {
+	node, server := Node(), MCPServer()
+	if node == "" || server == "" {
+		return 1, fmt.Errorf("MCP server not provisioned — run `podcli setup`")
+	}
+	cmd := exec.Command(node, server)
+	cmd.Stdin, cmd.Stdout, cmd.Stderr = os.Stdin, os.Stdout, os.Stderr
+	cmd.Env = nodeEnv()
+	if err := cmd.Run(); err != nil {
+		if ee, ok := err.(*exec.ExitError); ok {
+			return ee.ExitCode(), nil
+		}
+		return 1, err
+	}
+	return 0, nil
+}
+
 // ProjectDir resolves the user's project root: the nearest ancestor of the
 // working directory holding a .podcli dir or .podcli-home marker, else the
 // working directory itself. This keeps episode data, presets, and .env
diff --git a/cli/main.go b/cli/main.go
index f31f17d..973009e 100644
--- a/cli/main.go
+++ b/cli/main.go
@@ -5,6 +5,7 @@ package main
 import (
 	"fmt"
 	"os"
+	"os/exec"
 	"path/filepath"
 	"strings"
 
@@ -35,6 +36,15 @@ func main() {
 		os.Exit(update.Run(Version))
 	case "setup":
 		os.Exit(setup(args[1:]))
+	case "mcp":
+		if len(args) >= 2 && args[1] == "install" {
+			os.Exit(mcpInstall())
+		}
+		code, err := engine.RunMCP()
+		if err != nil {
+			fmt.Fprintln(os.Stderr, "podcli:", err)
+		}
+		os.Exit(code)
 	case "config":
 		if len(args) >= 2 && (args[1] == "get" || args[1] == "set") {
 			os.Exit(configCmd(args[1:]))
@@ -178,10 +188,46 @@ func setup(args []string) int {
 	} else {
 		fmt.Printf("  studio:  %s\n", sd)
 	}
+	if engine.MCPServer() != "" {
+		if err := registerMCPServer(); err != nil {
+			fmt.Fprintf(os.Stderr, "  mcp:     not registered (%v) — run `podcli mcp install`\n", err)
+		} else {
+			fmt.Printf("  mcp:     registered with Claude Code\n")
+		}
+	}
 	fmt.Println("Done.")
 	return 0
 }
 
+// registerMCPServer points Claude Code at this binary's `mcp` command. Remove
+// first so re-runs refresh a stale path and stay idempotent.
+func registerMCPServer() error {
+	claude, err := exec.LookPath("claude")
+	if err != nil {
+		return fmt.Errorf("Claude Code CLI not found on PATH")
+	}
+	self, err := os.Executable()
+	if err != nil {
+		return err
+	}
+	exec.Command(claude, "mcp", "remove", "podcli").Run()
+	if out, err := exec.Command(claude, "mcp", "add", "podcli", "--", self, "mcp").CombinedOutput(); err != nil {
+		return fmt.Errorf("%v: %s", err, strings.TrimSpace(string(out)))
+	}
+	return nil
+}
+
+func mcpInstall() int {
+	if err := registerMCPServer(); err != nil {
+		self, _ := os.Executable()
+		fmt.Fprintf(os.Stderr, "podcli: %v\n", err)
+		fmt.Fprintf(os.Stderr, "Register manually:  claude mcp add podcli -- %s mcp\n", self)
+		return 1
+	}
+	fmt.Println("Registered podcli MCP server with Claude Code.")
+	return 0
+}
+
 func doctor() {
 	fmt.Printf("podcli %s\n\n", Version)
 	fmt.Println("Paths")
@@ -218,6 +264,11 @@ func doctor() {
 	} else {
 		fmt.Printf("  studio:   not provisioned (Web UI needs a published release)\n")
 	}
+	if ms := engine.MCPServer(); ms != "" {
+		fmt.Printf("  mcp:      %s\n", ms)
+	} else {
+		fmt.Printf("  mcp:      not provisioned (needs a published release)\n")
+	}
 	fmt.Println("\nModels")
 	fmt.Printf("  base:     %s\n", presence(provision.ModelPath("base")))
 	fmt.Printf("  vad:      %s\n", presence(provision.VADModelPath()))
@@ -266,6 +317,8 @@ Launcher commands:
   update               Check for and apply a newer release
   setup [--model base] [--vad]
                        Provision runtimes + models into the managed dir
+  mcp                  Run the MCP server (stdio) for Claude/Codex
+  mcp install          Register the MCP server with Claude Code
   config set update.auto off    Disable auto-update (also: PODCLI_NO_UPDATE=1)
   config get update.auto
 
diff --git a/scripts/build-studio.sh b/scripts/build-studio.sh
index 655e033..cd1fa48 100644
--- a/scripts/build-studio.sh
+++ b/scripts/build-studio.sh
@@ -13,6 +13,10 @@ npm run build   # tsc + vite -> dist/ui/web-server.js + dist/ui/public
 
 rm -rf "$out"
 mkdir -p "$out"
-node -e "require('esbuild').buildSync({entryPoints:['dist/ui/web-server.js'],bundle:true,platform:'node',format:'esm',outfile:'$out/web-server.mjs',banner:{js:\"import{createRequire as _cr}from'module';const require=_cr(import.meta.url);\"},logLevel:'error'})"
+banner="import{createRequire as _cr}from'module';const require=_cr(import.meta.url);"
+# Studio web-UI server + its built SPA.
+node -e "require('esbuild').buildSync({entryPoints:['dist/ui/web-server.js'],bundle:true,platform:'node',format:'esm',outfile:'$out/web-server.mjs',banner:{js:\"$banner\"},logLevel:'error'})"
 cp -r dist/ui/public "$out/public"
-echo "studio bundle -> $out"
+# MCP stdio server (the mcp__podcli__* tools Claude/Codex drive).
+node -e "require('esbuild').buildSync({entryPoints:['dist/index.js'],bundle:true,platform:'node',format:'esm',outfile:'$out/mcp-server.mjs',banner:{js:\"$banner\"},logLevel:'error'})"
+echo "studio + mcp bundle -> $out"

From 69c8acea379f0be26e20fb474ac2faff142ab44b Mon Sep 17 00:00:00 2001
From: Nika Siradze <nikushasiradzee@gmail.com>
Date: Mon, 15 Jun 2026 17:07:35 +0400
Subject: [PATCH 24/41] Render captions + thumbnails via provisioned Remotion
 in native installs

Captions and thumbnails render through Remotion (a headless-Chromium React
pipeline) that the native install didn't ship, so both failed by default
outside a source checkout. Provision a per-platform Remotion bundle (remotion/
plus a production node_modules with the native @rspack and Remotion compositor
binaries, built by scripts/build-remotion.sh and uploaded per-platform in CI)
into the runtime dir, point clip_generator + thumbnail_html at the hermetic
Node, cache the project-independent composition bundle once in the managed
dir, and pre-warm the bundle + Chrome at setup. Verified end-to-end: a real
captioned clip and a thumbnail both render under the wired path.

Also fixes re-audit regressions: whisper-py engine fails with a clean message
(spinner stopped, no traceback); ProjectDir only pins project-local data when
a .podcli/.podcli-home marker exists, so read-only commands don't scatter
.podcli into arbitrary dirs; stderr noise filter matches warnings.warn lines
precisely instead of swallowing tracebacks; setup skips MCP re-registration
when already pointed at this binary.
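
The stderr filter change can be illustrated with a small comparison. The
sketch below copies the new rule from engine.go and keeps the old rule under
the hypothetical name isStderrNoiseOld; both sample lines are invented for
illustration and are not real backend output.

```go
package main

import (
	"fmt"
	"strings"
)

// isStderrNoise is the rule as it stands after this patch.
func isStderrNoise(line string) bool {
	return strings.HasPrefix(line, "objc[") ||
		strings.Contains(line, "FP16 is not supported") ||
		strings.HasPrefix(strings.TrimSpace(line), "warnings.warn(")
}

// isStderrNoiseOld is the pre-patch rule, kept only for comparison
// (hypothetical name; the old code used the same function name).
func isStderrNoiseOld(line string) bool {
	return strings.HasPrefix(line, "objc[") ||
		strings.Contains(line, "FP16 is not supported") ||
		strings.Contains(line, "warnings.warn")
}

func main() {
	samples := []string{
		// Source-echo line that Python prints under a warning: still noise.
		`  warnings.warn("example warning text")`,
		// Traceback-style line that merely mentions warnings.warn: must survive.
		`RuntimeError: warnings.warn hook failed`,
	}
	for _, s := range samples {
		fmt.Printf("old=%v new=%v  %q\n", isStderrNoiseOld(s), isStderrNoise(s), s)
	}
}
```

Anchoring the match to the start of the trimmed line is what keeps error
lines that only mention warnings.warn visible to the user.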
---
 .github/workflows/release.yml      | 32 ++++++++++++++-
 backend/cli.py                     | 18 ++++++---
 backend/services/clip_generator.py | 13 +++++--
 backend/services/thumbnail_html.py |  2 +-
 cli/internal/engine/engine.go      | 22 ++++++-----
 cli/internal/provision/studio.go   | 62 ++++++++++++++++++++++++++++++
 cli/main.go                        | 42 +++++++++++++++++++-
 scripts/build-remotion.sh          | 38 ++++++++++++++++++
 8 files changed, 206 insertions(+), 23 deletions(-)
 create mode 100644 scripts/build-remotion.sh

diff --git a/.github/workflows/release.yml b/.github/workflows/release.yml
index 88f982a..7d3b85a 100644
--- a/.github/workflows/release.yml
+++ b/.github/workflows/release.yml
@@ -105,8 +105,38 @@ jobs:
           name: studio-bundle.tar.gz
           path: studio-bundle.tar.gz
 
+  remotion:
+    # Per-platform: @rspack and the Remotion compositor ship native binaries, so
+    # the bundle is not portable across os/arch.
+    strategy:
+      fail-fast: false
+      matrix:
+        include:
+          - { runner: macos-14, goos: darwin, goarch: arm64 }
+          - { runner: macos-13, goos: darwin, goarch: amd64 }
+          - { runner: ubuntu-latest, goos: linux, goarch: amd64 }
+          - { runner: ubuntu-24.04-arm, goos: linux, goarch: arm64 }
+          - { runner: windows-latest, goos: windows, goarch: amd64 }
+    runs-on: ${{ matrix.runner }}
+    steps:
+      - uses: actions/checkout@v4
+      - uses: actions/setup-node@v4
+        with:
+          node-version: '20'
+      - name: Build remotion bundle
+        shell: bash
+        run: |
+          sh scripts/build-remotion.sh dist/remotion-bundle
+          OUT="remotion-bundle-${{ matrix.goos }}-${{ matrix.goarch }}.tar.gz"
+          tar -czf "$OUT" -C dist/remotion-bundle .
+          echo "ASSET=$OUT" >> "$GITHUB_ENV"
+      - uses: actions/upload-artifact@v4
+        with:
+          name: ${{ env.ASSET }}
+          path: ${{ env.ASSET }}
+
   release:
-    needs: [build, whisper, studio]
+    needs: [build, whisper, studio, remotion]
     runs-on: ubuntu-latest
     steps:
       - uses: actions/download-artifact@v4
diff --git a/backend/cli.py b/backend/cli.py
index fd4d39a..8e4b790 100644
--- a/backend/cli.py
+++ b/backend/cli.py
@@ -588,12 +588,18 @@ def _spinner():
             def _transcribe_progress(pct, msg):
                 _spin_msg[0] = f"{msg} ({pct}%)" if pct < 100 else msg
 
-            result = transcribe_file(
-                file_path=video_path,
-                model_size=config.get("whisper_model", "base"),
-                enable_diarization=not config.get("no_speakers", False),
-                progress_callback=_transcribe_progress,
-            )
+            try:
+                result = transcribe_file(
+                    file_path=video_path,
+                    model_size=config.get("whisper_model", "base"),
+                    enable_diarization=not config.get("no_speakers", False),
+                    progress_callback=_transcribe_progress,
+                )
+            except RuntimeError as e:
+                _spin_stop.set()
+                spin_thread.join(timeout=1)
+                print(f"\r{' ' * 70}\r  ✗ {e}\n", flush=True)
+                sys.exit(1)
             _spin_stop.set()
             spin_thread.join(timeout=1)
             words = result["words"]
diff --git a/backend/services/clip_generator.py b/backend/services/clip_generator.py
index 9608bf4..f5ad3d2 100644
--- a/backend/services/clip_generator.py
+++ b/backend/services/clip_generator.py
@@ -424,20 +424,25 @@ def _render_with_remotion(
         return False, None
 
     # Check node is available
-    node_path = shutil.which("node")
+    node_path = os.environ.get("PODCLI_NODE") or shutil.which("node")
     if not node_path:
         _remotion_available = False
         return False, None
 
-    remotion_env = {**os.environ, "PODCLI_CACHE_DIR": paths["cache"]}
+    # Cache the compiled bundle next to the render script. The compositions are
+    # project-independent, so a single global bundle (in the managed runtime dir
+    # for native installs) is reused across every project instead of rebuilt per
+    # data/cache.
+    bundle_cache_root = os.path.join(os.path.dirname(render_script), ".bundle-cache")
+    remotion_env = {**os.environ, "PODCLI_CACHE_DIR": bundle_cache_root}
 
-    cache_dir = os.path.join(paths["cache"], "remotion-bundle")
+    cache_dir = os.path.join(bundle_cache_root, "remotion-bundle")
     bundle_index = os.path.join(cache_dir, "index.html")
     if not os.path.exists(bundle_index):
         try:
             r = subprocess.run(
                 [node_path, render_script, "--prebundle"],
-                timeout=30,
+                timeout=180,
                 cwd=project_root,
                 env=remotion_env,
                 capture_output=True,
diff --git a/backend/services/thumbnail_html.py b/backend/services/thumbnail_html.py
index 3aaaa52..c52ea86 100644
--- a/backend/services/thumbnail_html.py
+++ b/backend/services/thumbnail_html.py
@@ -173,7 +173,7 @@ def _build_remotion_screenshot_command(
     wait_ms: int,
 ) -> Optional[list[str]]:
     """Build a Remotion-backed screenshot command if the repo can run it."""
-    node_bin = shutil.which("node")
+    node_bin = os.environ.get("PODCLI_NODE") or shutil.which("node")
     repo_root = os.path.join(os.path.dirname(os.path.abspath(__file__)), "..", "..")
     renderer_pkg = os.path.join(repo_root, "node_modules", "@remotion", "renderer", "package.json")
     if not node_bin or not os.path.exists(script_path) or not os.path.exists(renderer_pkg):
diff --git a/cli/internal/engine/engine.go b/cli/internal/engine/engine.go
index 6e3260f..cecb337 100644
--- a/cli/internal/engine/engine.go
+++ b/cli/internal/engine/engine.go
@@ -21,7 +21,7 @@ type stderrFilter struct{ buf []byte }
 func isStderrNoise(line string) bool {
 	return strings.HasPrefix(line, "objc[") ||
 		strings.Contains(line, "FP16 is not supported") ||
-		strings.Contains(line, "warnings.warn")
+		strings.HasPrefix(strings.TrimSpace(line), "warnings.warn(")
 }
 
 func (w *stderrFilter) Write(p []byte) (int, error) {
@@ -166,7 +166,7 @@ func nodeEnv() []string {
 	if fp := FFprobe(); fp != "" {
 		env = append(env, "FFPROBE_PATH="+fp)
 	}
-	if proj := ProjectDir(); proj != "" {
+	if proj, ok := ProjectDir(); ok {
 		if os.Getenv("PODCLI_HOME") == "" {
 			env = append(env, "PODCLI_HOME="+filepath.Join(proj, ".podcli"))
 		}
@@ -204,14 +204,19 @@ func RunMCP() (int, error) {
 // working directory itself. This keeps episode data, presets, and .env
 // project-local — the behavior of the old in-repo launcher — now that the
 // backend lives in the global runtime dir instead of beside the data.
-func ProjectDir() string {
+// ProjectDir returns the nearest ancestor of the working directory holding a
+// .podcli dir or .podcli-home marker, and whether one was found. Only an
+// established project pins data locally; in an unmarked dir the second return is
+// false so callers leave the backend's default home rather than scattering
+// .podcli/ into arbitrary directories.
+func ProjectDir() (string, bool) {
 	dir, err := os.Getwd()
 	if err != nil {
-		return ""
+		return "", false
 	}
 	for {
 		if exists(filepath.Join(dir, ".podcli")) || exists(filepath.Join(dir, ".podcli-home")) {
-			return dir
+			return dir, true
 		}
 		parent := filepath.Dir(dir)
 		if parent == dir {
@@ -219,10 +224,7 @@ func ProjectDir() string {
 		}
 		dir = parent
 	}
-	if cwd, err := os.Getwd(); err == nil {
-		return cwd
-	}
-	return ""
+	return "", false
 }
 
 func Run(args []string) (int, error) {
@@ -258,7 +260,7 @@ func Run(args []string) (int, error) {
 	}
 	// Pin data + .env to the user's project dir so the global runtime backend
 	// doesn't strand project-local episodes/presets. Explicit env wins.
-	if proj := ProjectDir(); proj != "" {
+	if proj, ok := ProjectDir(); ok {
 		if os.Getenv("PODCLI_HOME") == "" {
 			env = append(env, "PODCLI_HOME="+filepath.Join(proj, ".podcli"))
 		}
diff --git a/cli/internal/provision/studio.go b/cli/internal/provision/studio.go
index 726ed3b..28fdc0d 100644
--- a/cli/internal/provision/studio.go
+++ b/cli/internal/provision/studio.go
@@ -7,6 +7,7 @@ import (
 	"fmt"
 	"io"
 	"os"
+	"os/exec"
 	"path/filepath"
 	"runtime"
 	"strings"
@@ -80,6 +81,67 @@ func EnsureNode() (string, error) {
 	return bin, nil
 }
 
+func RemotionDir() string    { return filepath.Join(paths.RuntimeDir(), "remotion") }
+func RemotionScript() string { return filepath.Join(RemotionDir(), "render.mjs") }
+
+// EnsureRemotion fetches the per-platform Remotion render bundle (remotion/ +
+// a production node_modules with native bindings) and extracts it into the
+// runtime dir so remotion/ and node_modules/ sit beside backend/. Per-platform
+// because @rspack and the Remotion compositor ship native binaries.
+func EnsureRemotion() (string, error) {
+	if have(RemotionScript()) {
+		return RemotionDir(), nil
+	}
+	name := fmt.Sprintf("remotion-bundle-%s-%s.tar.gz", runtime.GOOS, runtime.GOARCH)
+	url, err := latestReleaseAssetURL(name)
+	if err != nil {
+		return "", err
+	}
+	archive := filepath.Join(os.TempDir(), "podcli-"+name)
+	if err := fetch(url, archive, "remotion"); err != nil {
+		return "", err
+	}
+	defer os.Remove(archive)
+	os.RemoveAll(RemotionDir())
+	os.RemoveAll(filepath.Join(paths.RuntimeDir(), "node_modules"))
+	if err := extractTarGz(archive, paths.RuntimeDir()); err != nil {
+		return "", err
+	}
+	if !have(RemotionScript()) {
+		return "", fmt.Errorf("remotion render.mjs missing after extraction")
+	}
+	return RemotionDir(), nil
+}
+
+// PrewarmRemotion compiles the project-independent composition bundle once into
+// the managed dir so the first caption render skips the ~20s bundling step.
+func PrewarmRemotion() error {
+	node := NodeBin()
+	if !have(node) || !have(RemotionScript()) {
+		return fmt.Errorf("node/remotion not provisioned")
+	}
+	cmd := exec.Command(node, RemotionScript(), "--prebundle")
+	cmd.Dir = RemotionDir()
+	cmd.Env = append(os.Environ(), "PODCLI_CACHE_DIR="+filepath.Join(RemotionDir(), ".bundle-cache"))
+	cmd.Stderr = os.Stderr
+	return cmd.Run()
+}
+
+// EnsureRemotionBrowser pre-downloads the Chrome Headless Shell the Remotion
+// renderer needs, so the first caption render works offline. Best-effort: if it
+// fails, the renderer downloads the browser on first use.
+func EnsureRemotionBrowser() error {
+	node := NodeBin()
+	if !have(node) || !have(RemotionScript()) {
+		return fmt.Errorf("node/remotion not provisioned")
+	}
+	cmd := exec.Command(node, "-e",
+		"import('@remotion/renderer').then(r=>r.ensureBrowser()).then(()=>process.exit(0)).catch(e=>{console.error(String(e));process.exit(1)})")
+	cmd.Dir = RemotionDir()
+	cmd.Stderr = os.Stderr
+	return cmd.Run()
+}
+
 // EnsureStudio fetches the prebuilt, platform-independent studio bundle (server +
 // SPA) from the latest release into StudioDir.
 func EnsureStudio() (string, error) {
diff --git a/cli/main.go b/cli/main.go
index 973009e..8509acd 100644
--- a/cli/main.go
+++ b/cli/main.go
@@ -188,8 +188,25 @@ func setup(args []string) int {
 	} else {
 		fmt.Printf("  studio:  %s\n", sd)
 	}
+	if rd, err := provision.EnsureRemotion(); err != nil {
+		fmt.Fprintf(os.Stderr, "  remotion: skipped (%v) — captions/thumbnails need a published release\n", err)
+	} else {
+		fmt.Printf("  remotion: %s\n", rd)
+		if err := provision.PrewarmRemotion(); err != nil {
+			fmt.Fprintf(os.Stderr, "  bundle:   deferred to first render (%v)\n", err)
+		} else {
+			fmt.Printf("  bundle:   prebuilt\n")
+		}
+		if err := provision.EnsureRemotionBrowser(); err != nil {
+			fmt.Fprintf(os.Stderr, "  browser:  deferred to first render (%v)\n", err)
+		} else {
+			fmt.Printf("  browser:  ready\n")
+		}
+	}
 	if engine.MCPServer() != "" {
-		if err := registerMCPServer(); err != nil {
+		if mcpRegisteredToSelf() {
+			fmt.Printf("  mcp:     already registered\n")
+		} else if err := registerMCPServer(); err != nil {
 			fmt.Fprintf(os.Stderr, "  mcp:     not registered (%v) — run `podcli mcp install`\n", err)
 		} else {
 			fmt.Printf("  mcp:     registered with Claude Code\n")
@@ -217,6 +234,19 @@ func registerMCPServer() error {
 	return nil
 }
 
+func mcpRegisteredToSelf() bool {
+	claude, err := exec.LookPath("claude")
+	if err != nil {
+		return false
+	}
+	self, err := os.Executable()
+	if err != nil {
+		return false
+	}
+	out, err := exec.Command(claude, "mcp", "get", "podcli").CombinedOutput()
+	return err == nil && strings.Contains(string(out), self)
+}
+
 func mcpInstall() int {
 	if err := registerMCPServer(); err != nil {
 		self, _ := os.Executable()
@@ -269,6 +299,11 @@ func doctor() {
 	} else {
 		fmt.Printf("  mcp:      not provisioned (needs a published release)\n")
 	}
+	if rs := provision.RemotionScript(); fileExists(rs) {
+		fmt.Printf("  remotion: %s\n", rs)
+	} else {
+		fmt.Printf("  remotion: not provisioned (captions/thumbnails need a published release)\n")
+	}
 	fmt.Println("\nModels")
 	fmt.Printf("  base:     %s\n", presence(provision.ModelPath("base")))
 	fmt.Printf("  vad:      %s\n", presence(provision.VADModelPath()))
@@ -281,6 +316,11 @@ func presence(p string) string {
 	return "not provisioned — run `podcli setup`"
 }
 
+func fileExists(p string) bool {
+	_, err := os.Stat(p)
+	return err == nil
+}
+
 func humanBytes(n int64) string {
 	switch {
 	case n >= 1<<20:
diff --git a/scripts/build-remotion.sh b/scripts/build-remotion.sh
new file mode 100644
index 0000000..09769d1
--- /dev/null
+++ b/scripts/build-remotion.sh
@@ -0,0 +1,38 @@
+#!/bin/sh
+# Build the Remotion render bundle: the remotion/ project plus a production
+# node_modules with only the render deps. Native bindings (@rspack, the Remotion
+# compositor) are per-platform, so this MUST run on each target platform/arch in
+# CI — the produced bundle is not portable across os/arch.
+#
+# Layout (extracted to the runtime dir): remotion/ + node_modules/ as siblings of
+# backend/, so render.mjs resolves node_modules and the Python backend resolves
+# runtime/remotion/render.mjs + runtime/node_modules.
+#
+# Usage: scripts/build-remotion.sh [out-dir]   (default: dist/remotion-bundle)
+set -e
+here="$(cd "$(dirname "$0")/.." && pwd)"
+out="${1:-$here/dist/remotion-bundle}"
+
+rm -rf "$out"
+mkdir -p "$out"
+cp -R "$here/remotion" "$out/remotion"
+cp "$here/tsconfig.json" "$out/tsconfig.json" 2>/dev/null || true
+
+cat > "$out/package.json" <<'JSON'
+{
+  "name": "podcli-remotion-bundle",
+  "private": true,
+  "type": "module",
+  "dependencies": {
+    "@remotion/bundler": "^4.0.441",
+    "@remotion/renderer": "^4.0.441",
+    "remotion": "^4.0.441",
+    "react": "^18.3.1",
+    "react-dom": "^18.3.1"
+  }
+}
+JSON
+
+cd "$out"
+npm install --omit=dev --no-audit --no-fund --no-package-lock
+echo "remotion bundle -> $out"

From ec91bd282e765a27cd5e731f9d72148d2da0dc57 Mon Sep 17 00:00:00 2001
From: Nika Siradze <nikushasiradzee@gmail.com>
Date: Mon, 15 Jun 2026 17:23:59 +0400
Subject: [PATCH 25/41] Fix native render gaps found in third audit
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

- Remotion bundle build failed on a clean install: build-remotion.sh omitted
  @fontsource/dm-sans, which the compositions import — so the prebundle could
  not resolve the font and captions/bookends broke. Added the dep.
- podcli studio was unreachable: clip_studio.py lived at repo-root scripts/ and
  was never shipped. Moved it into backend/ (so it embeds) and resolve it from
  the backend dir; it also now honors PODCLI_DATA for output.
- render-bookend.mjs hardcoded its bundle cache into the runtime dir; it now
  honors PODCLI_CACHE_DIR and shares the prewarmed bundle with render.mjs.
- New-project data no longer lands in the global runtime dir (which setup/
  self-update overwrite): ProjectDir falls back to the working dir, keeping
  data project-local. doctor now prints the resolved project dir.
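
A rough sketch of the resolution order ProjectDir follows after this change:
walk up looking for a marker, otherwise fall back to the starting directory.
resolveProjectDir and its injected exists callback are hypothetical, added so
the walk can be shown without touching the filesystem; the paths assume a
Unix-style layout.

```go
package main

import (
	"fmt"
	"path/filepath"
)

// resolveProjectDir is a hypothetical, filesystem-free variant of ProjectDir:
// it returns the nearest ancestor of start that holds a .podcli or
// .podcli-home marker, or start itself when no ancestor is marked.
func resolveProjectDir(start string, exists func(string) bool) string {
	for dir := start; ; {
		if exists(filepath.Join(dir, ".podcli")) || exists(filepath.Join(dir, ".podcli-home")) {
			return dir
		}
		parent := filepath.Dir(dir)
		if parent == dir {
			break // reached the filesystem root without finding a marker
		}
		dir = parent
	}
	return start
}

func main() {
	// Pretend only /work/show is an established project.
	markers := map[string]bool{"/work/show/.podcli": true}
	exists := func(p string) bool { return markers[p] }

	fmt.Println(resolveProjectDir("/work/show/ep1/raw", exists)) // marked ancestor
	fmt.Println(resolveProjectDir("/tmp/scratch", exists))       // falls back to start
}
```

The fallback is the design point: an unmarked directory resolves to itself
rather than to the global runtime dir, so setup or self-update cannot
overwrite the data written there.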
---
 backend/cli.py                      |  5 ++---
 {scripts => backend}/clip_studio.py |  6 +++++-
 cli/internal/engine/engine.go       | 16 ++++++++--------
 cli/main.go                         |  3 +++
 remotion/render-bookend.mjs         |  8 +++++++-
 scripts/build-remotion.sh           |  3 ++-
 6 files changed, 27 insertions(+), 14 deletions(-)
 rename {scripts => backend}/clip_studio.py (97%)

diff --git a/backend/cli.py b/backend/cli.py
index 8e4b790..b3f5c4f 100644
--- a/backend/cli.py
+++ b/backend/cli.py
@@ -313,7 +313,7 @@ def _should_enter_post_render_loop(config: dict, interrupted: bool, results: lis
 def cmd_studio(args):
     """Cut a fragment + wrap it with Remotion intro/outro bookends.
 
-    Thin wrapper around scripts/clip_studio.py (which orchestrates the
+    Thin wrapper around clip_studio.py (which orchestrates the
     fragment render, bookend renders, and the concat). Runs it as a
     subprocess with this same interpreter so the venv is reused.
     """
@@ -321,8 +321,7 @@ def cmd_studio(args):
         print("Error: provide either --start and --end, or --paragraph \"text\"", file=sys.stderr)
         sys.exit(1)
 
-    project_root = os.path.dirname(os.path.dirname(os.path.abspath(__file__)))
-    script = os.path.join(project_root, "scripts", "clip_studio.py")
+    script = os.path.join(os.path.dirname(os.path.abspath(__file__)), "clip_studio.py")
     if not os.path.exists(script):
         print(f"Error: clip_studio.py not found at {script}", file=sys.stderr)
         sys.exit(1)
diff --git a/scripts/clip_studio.py b/backend/clip_studio.py
similarity index 97%
rename from scripts/clip_studio.py
rename to backend/clip_studio.py
index cac3eb2..376bff4 100644
--- a/scripts/clip_studio.py
+++ b/backend/clip_studio.py
@@ -37,6 +37,9 @@
 
 FFMPEG = os.environ.get("PODCLI_FFMPEG", "ffmpeg")
 NODE = os.environ.get("PODCLI_NODE", "node")
+# Share the prewarmed, project-independent composition bundle with the fragment
+# caption render (clip_generator) and the bookend render.
+os.environ.setdefault("PODCLI_CACHE_DIR", os.path.join(ROOT, "remotion", ".bundle-cache"))
 
 # Brand config: remembered handle / platforms / colors / outro title so they
 # don't have to be retyped each run. CLI flags override these; these override
@@ -236,7 +239,8 @@ def main():
     video = os.path.abspath(video)
     if not os.path.exists(video):
         raise SystemExit(f"Video not found: {video}")
-    out_dir = os.path.join(ROOT, "data", "output")
+    data_dir = os.environ.get("PODCLI_DATA") or os.path.join(ROOT, "data")
+    out_dir = os.path.join(data_dir, "output")
     os.makedirs(out_dir, exist_ok=True)
 
     # Need a transcript if cutting by paragraph or if rendering captions.
diff --git a/cli/internal/engine/engine.go b/cli/internal/engine/engine.go
index cecb337..f1e49d6 100644
--- a/cli/internal/engine/engine.go
+++ b/cli/internal/engine/engine.go
@@ -204,17 +204,17 @@ func RunMCP() (int, error) {
 // working directory itself. This keeps episode data, presets, and .env
 // project-local — the behavior of the old in-repo launcher — now that the
 // backend lives in the global runtime dir instead of beside the data.
-// ProjectDir returns the nearest ancestor of the working directory holding a
-// .podcli dir or .podcli-home marker, and whether one was found. Only an
-// established project pins data locally; in an unmarked dir the second return is
-// false so callers leave the backend's default home rather than scattering
-// .podcli/ into arbitrary directories.
+// ProjectDir resolves where a run's data lives: the nearest ancestor of the
+// working directory holding a .podcli/.podcli-home marker, else the working
+// directory itself. Data is always project-local — never the global runtime dir,
+// where setup/self-update would overwrite it. The bool is false only when the
+// working directory can't be determined.
 func ProjectDir() (string, bool) {
-	dir, err := os.Getwd()
+	cwd, err := os.Getwd()
 	if err != nil {
 		return "", false
 	}
-	for {
+	for dir := cwd; ; {
 		if exists(filepath.Join(dir, ".podcli")) || exists(filepath.Join(dir, ".podcli-home")) {
 			return dir, true
 		}
@@ -224,7 +224,7 @@ func ProjectDir() (string, bool) {
 		}
 		dir = parent
 	}
-	return "", false
+	return cwd, true
 }
 
 func Run(args []string) (int, error) {
diff --git a/cli/main.go b/cli/main.go
index 8509acd..8fc7926 100644
--- a/cli/main.go
+++ b/cli/main.go
@@ -264,6 +264,9 @@ func doctor() {
 	fmt.Printf("  home:     %s\n", paths.Home())
 	fmt.Printf("  runtime:  %s\n", paths.RuntimeDir())
 	fmt.Printf("  models:   %s\n", paths.ModelsDir())
+	if proj, ok := engine.ProjectDir(); ok {
+		fmt.Printf("  project:  %s  (episodes, presets, .env resolve here)\n", proj)
+	}
 	fmt.Println("\nEngine resolution")
 	if root, ok := engine.BackendRoot(); ok {
 		fmt.Printf("  backend:  %s\n", root)
diff --git a/remotion/render-bookend.mjs b/remotion/render-bookend.mjs
index ee6c7df..2688079 100644
--- a/remotion/render-bookend.mjs
+++ b/remotion/render-bookend.mjs
@@ -25,7 +25,13 @@ import { fileURLToPath } from "url";
 
 const __dirname = path.dirname(fileURLToPath(import.meta.url));
 const PROJECT_ROOT = path.resolve(__dirname, "..");
-const CACHE_DIR = path.join(PROJECT_ROOT, ".podcli", "cache", "remotion-bundle");
+// Share the (project-independent) composition bundle with render.mjs via
+// PODCLI_CACHE_DIR so bookends reuse the prewarmed bundle instead of rebuilding
+// into the runtime dir.
+const CACHE_ROOT = process.env.PODCLI_CACHE_DIR
+  ? path.resolve(process.env.PODCLI_CACHE_DIR)
+  : path.join(PROJECT_ROOT, "data", "cache");
+const CACHE_DIR = path.join(CACHE_ROOT, "remotion-bundle");
 const ENTRY_POINT = path.join(__dirname, "src", "index.ts");
 
 function parseArgs() {
diff --git a/scripts/build-remotion.sh b/scripts/build-remotion.sh
index 09769d1..aef6c3a 100644
--- a/scripts/build-remotion.sh
+++ b/scripts/build-remotion.sh
@@ -28,7 +28,8 @@ cat > "$out/package.json" <<'JSON'
     "@remotion/renderer": "^4.0.441",
     "remotion": "^4.0.441",
     "react": "^18.3.1",
-    "react-dom": "^18.3.1"
+    "react-dom": "^18.3.1",
+    "@fontsource/dm-sans": "^5.2.8"
   }
 }
 JSON

From e3e95f17f5c23aea39ab1fa3b8fe7aa556e74bc9 Mon Sep 17 00:00:00 2001
From: Nika Siradze <nikushasiradzee@gmail.com>
Date: Mon, 15 Jun 2026 18:11:43 +0400
Subject: [PATCH 26/41] Fix native release and whisper.cpp gaps

---
 .github/workflows/release.yml                |  2 +-
 backend/cli.py                               |  2 +-
 backend/services/transcription_whispercpp.py |  4 ++--
 cli/main.go                                  | 13 ++++++++++-
 cli/main_test.go                             | 12 ++++++++++
 tests/test_whispercpp_adapter.py             | 23 ++++++++++++++++++++
 6 files changed, 51 insertions(+), 5 deletions(-)
 create mode 100644 cli/main_test.go
 create mode 100644 tests/test_whispercpp_adapter.py

diff --git a/.github/workflows/release.yml b/.github/workflows/release.yml
index 7d3b85a..3810ced 100644
--- a/.github/workflows/release.yml
+++ b/.github/workflows/release.yml
@@ -38,7 +38,7 @@ jobs:
           CGO_ENABLED: '0'
         run: |
           VERSION="${GITHUB_REF_NAME#v}"
-          go generate ./internal/backend/   # sync repo backend into files/ for go:embed
+          go generate ./...   # sync backend + PodStack commands for go:embed
           EXT=""
           [ "${{ matrix.goos }}" = "windows" ] && EXT=".exe"
           OUT="podcli-${{ matrix.goos }}-${{ matrix.goarch }}${EXT}"
diff --git a/backend/cli.py b/backend/cli.py
index b3f5c4f..7c5537f 100644
--- a/backend/cli.py
+++ b/backend/cli.py
@@ -594,7 +594,7 @@ def _transcribe_progress(pct, msg):
                     enable_diarization=not config.get("no_speakers", False),
                     progress_callback=_transcribe_progress,
                 )
-            except RuntimeError as e:
+            except (RuntimeError, FileNotFoundError, subprocess.CalledProcessError) as e:
                 _spin_stop.set()
                 spin_thread.join(timeout=1)
                 print(f"\r{' ' * 70}\r  ✗ {e}\n", flush=True)
diff --git a/backend/services/transcription_whispercpp.py b/backend/services/transcription_whispercpp.py
index 2920356..6aa56a2 100644
--- a/backend/services/transcription_whispercpp.py
+++ b/backend/services/transcription_whispercpp.py
@@ -34,7 +34,7 @@ def _tokens_to_words(tokens: list[dict]) -> list[dict]:
 
     def flush():
         nonlocal cur_text, cur_start, cur_end
-        text = cur_text.strip()
+        text = cur_text.strip().lstrip("▁").strip()
         if text and cur_start is not None:
             words.append({
                 "word": text,
@@ -166,7 +166,7 @@ def transcribe_file(
             cmd += ["--vad", "--vad-model", vad_model]
         if language:
             cmd += ["-l", language]
-        subprocess.run(cmd, check=True, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
+        subprocess.run(cmd, check=True, stdout=subprocess.DEVNULL, stderr=subprocess.PIPE, text=True)
 
         with open(out_base + ".json", encoding="utf-8") as f:
             data = json.load(f)
diff --git a/cli/main.go b/cli/main.go
index 8fc7926..4c1a21b 100644
--- a/cli/main.go
+++ b/cli/main.go
@@ -63,11 +63,13 @@ func main() {
 func runEngine(args []string) int {
 	update.NotifyIfOutdated(Version)
 	if transcribeEngine(args) == "whispercpp" {
-		if _, err := provision.EnsureModel("base"); err != nil {
+		model, err := provision.EnsureModel(transcribeModel(args))
+		if err != nil {
 			fmt.Fprintln(os.Stderr, "podcli: provisioning model:", err)
 			return 1
 		}
 		os.Setenv("PODCLI_ENGINE", "whispercpp")
+		os.Setenv("PODCLI_WHISPERCPP_MODEL", model)
 	}
 	code, err := engine.Run(args)
 	if err != nil {
@@ -77,6 +79,15 @@ func runEngine(args []string) int {
 	return code
 }
 
+func transcribeModel(args []string) string {
+	for _, arg := range args {
+		if arg == "--fast" {
+			return "tiny.en"
+		}
+	}
+	return "base"
+}
+
 func configCmd(args []string) int {
 	switch {
 	case args[0] == "get" && len(args) == 2:
diff --git a/cli/main_test.go b/cli/main_test.go
new file mode 100644
index 0000000..d3a1b74
--- /dev/null
+++ b/cli/main_test.go
@@ -0,0 +1,12 @@
+package main
+
+import "testing"
+
+func TestTranscribeModel(t *testing.T) {
+	if got := transcribeModel([]string{"process", "episode.mp4"}); got != "base" {
+		t.Fatalf("default model = %q, want base", got)
+	}
+	if got := transcribeModel([]string{"process", "episode.mp4", "--fast"}); got != "tiny.en" {
+		t.Fatalf("fast model = %q, want tiny.en", got)
+	}
+}
diff --git a/tests/test_whispercpp_adapter.py b/tests/test_whispercpp_adapter.py
new file mode 100644
index 0000000..c1bb9ac
--- /dev/null
+++ b/tests/test_whispercpp_adapter.py
@@ -0,0 +1,23 @@
+import os
+import sys
+import unittest
+
+ROOT = os.path.abspath(os.path.join(os.path.dirname(__file__), ".."))
+BACKEND_ROOT = os.path.join(ROOT, "backend")
+if BACKEND_ROOT not in sys.path:
+    sys.path.insert(0, BACKEND_ROOT)
+
+from services.transcription_whispercpp import _tokens_to_words
+
+
+class WhisperCppAdapterTests(unittest.TestCase):
+    def test_sentencepiece_marker_is_removed(self):
+        words = _tokens_to_words([
+            {"text": "▁hello", "offsets": {"from": 0, "to": 100}},
+            {"text": "▁world", "offsets": {"from": 100, "to": 200}},
+        ])
+        self.assertEqual([w["word"] for w in words], ["hello", "world"])
+
+
+if __name__ == "__main__":
+    unittest.main()

From e6aa165317fe0bae587b219c6acb142c36fcf031 Mon Sep 17 00:00:00 2001
From: Nika Siradze <nikushasiradzee@gmail.com>
Date: Mon, 15 Jun 2026 19:48:03 +0400
Subject: [PATCH 27/41] Harden native install: verify release checksums, pin
 download hosts, block symlink traversal

- Publish checksums.txt in the release workflow; verify it in self-update,
  whisper-cli/studio/remotion provisioning, and the npm installer
- Pin binary downloads/redirects to GitHub hosts (replaces manual redirect
  recursion in the self-updater)
- Reject archive symlinks/hardlinks whose targets escape the extract dir
- Namespace the transcript cache by engine so whisper.cpp doesn't reuse a
  whisper-py transcript; honor PODCLI_FFPROBE in clip_studio
- Fail fast when NPM_TOKEN is unset before publishing
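
A minimal Python sketch of the three checks listed above (host pinning,
checksum-manifest verification, symlink containment), for illustration only.
The patch implements them in Go and JavaScript; the snake_case names, the
TRUSTED_HOSTS set, and the True/False return convention of verify() are
assumptions made for this sketch, not code from the series.

```python
import hashlib
import os
from typing import Dict, Optional
from urllib.parse import urlparse

# Hosts taken from the allow-list in this patch; the constant name is hypothetical.
TRUSTED_HOSTS = {"github.com", "api.github.com",
                 "objects.githubusercontent.com", "codeload.github.com"}


def allowed_host(url: str) -> bool:
    """True if the URL's host is a pinned GitHub host or a githubusercontent subdomain."""
    host = (urlparse(url).hostname or "").lower()
    return host in TRUSTED_HOSTS or host.endswith(".githubusercontent.com")


def parse_checksums(text: str) -> Dict[str, str]:
    """Parse sha256sum-style '<hex>  <name>' lines into {basename: lowercase hex}."""
    out: Dict[str, str] = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 2:
            continue  # blank or malformed line
        out[os.path.basename(fields[-1])] = fields[0].lower()
    return out


def verify(path: str, name: str, manifest: Optional[str]) -> bool:
    """Return True when verified and False when verification was skipped.

    Fails open (False) if the manifest or the entry is absent; fails closed
    (removes the file and raises ValueError) on a real mismatch.
    """
    if manifest is None:
        return False
    want = parse_checksums(manifest).get(name)
    if want is None:
        return False
    with open(path, "rb") as f:
        got = hashlib.sha256(f.read()).hexdigest()
    if got != want:
        os.remove(path)
        raise ValueError(f"checksum mismatch for {name}: got {got} want {want}")
    return True


def symlink_target_inside(link_path: str, linkname: str, root: str) -> bool:
    """True if a symlink at link_path pointing to linkname stays under root.

    root must end with a path separator; absolute targets are always rejected.
    """
    if os.path.isabs(linkname):
        return False
    resolved = os.path.normpath(os.path.join(os.path.dirname(link_path), linkname))
    return (resolved + os.sep).startswith(root)
```

Keying the manifest by basename lets an entry written as ./dist/<name> still
match the bare asset name a caller asks for, which is the same choice the Go
and JavaScript parsers in this patch make.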
---
 .github/workflows/release.yml           |  13 +++
 backend/clip_studio.py                  |   2 +-
 backend/services/transcript_packer.py   |  12 +-
 cli/internal/provision/provision.go     | 142 ++++++++++++++++++++++--
 cli/internal/provision/security_test.go |  75 +++++++++++++
 cli/internal/provision/studio.go        |  33 +++++-
 cli/internal/update/update.go           |  85 ++++++++++++--
 npm/scripts/install.js                  |  65 ++++++++++-
 tests/test_transcript_cache_engine.py   |  48 ++++++++
 9 files changed, 447 insertions(+), 28 deletions(-)
 create mode 100644 cli/internal/provision/security_test.go
 create mode 100644 tests/test_transcript_cache_engine.py

diff --git a/.github/workflows/release.yml b/.github/workflows/release.yml
index 3810ced..7707e19 100644
--- a/.github/workflows/release.yml
+++ b/.github/workflows/release.yml
@@ -143,6 +143,11 @@ jobs:
         with:
           path: dist
           merge-multiple: true
+      - name: Generate checksums
+        working-directory: dist
+        run: |
+          sha256sum * > "$RUNNER_TEMP/checksums.txt"
+          mv "$RUNNER_TEMP/checksums.txt" checksums.txt
       - uses: softprops/action-gh-release@v2
         with:
           files: dist/*
@@ -157,6 +162,14 @@ jobs:
         with:
           node-version: '20'
           registry-url: 'https://registry.npmjs.org'
+      - name: Assert NPM_TOKEN is set
+        env:
+          NODE_AUTH_TOKEN: ${{ secrets.NPM_TOKEN }}
+        run: |
+          if [ -z "$NODE_AUTH_TOKEN" ]; then
+            echo "::error::NPM_TOKEN secret is not set — refusing to publish. The GitHub release and assets are already published; set the secret and re-run this job." >&2
+            exit 1
+          fi
       - name: Publish wrapper pinned to the tag
         working-directory: npm
         env:
diff --git a/backend/clip_studio.py b/backend/clip_studio.py
index 376bff4..bbb9d87 100644
--- a/backend/clip_studio.py
+++ b/backend/clip_studio.py
@@ -72,7 +72,7 @@ def _save_brand(data: dict):
 
 def _probe_duration(path: str) -> float:
     out = subprocess.run(
-        ["ffprobe", "-v", "error", "-show_entries", "format=duration",
+        [os.environ.get("PODCLI_FFPROBE", "ffprobe"), "-v", "error", "-show_entries", "format=duration",
          "-of", "default=nw=1:nk=1", path],
         capture_output=True, text=True,
     )
diff --git a/backend/services/transcript_packer.py b/backend/services/transcript_packer.py
index 2d7beec..82f2532 100644
--- a/backend/services/transcript_packer.py
+++ b/backend/services/transcript_packer.py
@@ -62,8 +62,18 @@ def legacy_md5_cache_path(video_path: str) -> str:
     return os.path.join(_legacy_cache_dir(), hashlib.md5(raw.encode()).hexdigest() + ".json")
 
 
+def _engine_cache_suffix() -> str:
+    """Namespace the cache by engine so a whisper.cpp run doesn't reuse a
+    whisper-py transcript (their timings/word splits differ). whisper-py keeps
+    the bare filename, which the TS transcript cache also writes."""
+    engine = os.environ.get("PODCLI_ENGINE", "whisper-py").strip().lower()
+    if engine in ("whispercpp", "whisper-cpp", "whisper.cpp", "cpp"):
+        return "-whispercpp"
+    return ""
+
+
 def transcript_json_path(cache_hash: str) -> str:
-    return os.path.join(_transcripts_cache_dir(), f"{cache_hash}.json")
+    return os.path.join(_transcripts_cache_dir(), f"{cache_hash}{_engine_cache_suffix()}.json")
 
 
 def load_cached_transcript_for_video(video_path: str) -> dict[str, Any] | None:
diff --git a/cli/internal/provision/provision.go b/cli/internal/provision/provision.go
index 7605b7a..1dbc8aa 100644
--- a/cli/internal/provision/provision.go
+++ b/cli/internal/provision/provision.go
@@ -177,6 +177,36 @@ func downloadOnce(url, tmp, label string) (bool, error) {
 	return true, nil
 }
 
+// symlinkTargetInside reports whether a symlink at linkPath pointing to linkname
+// resolves within root (root has a trailing separator). Absolute targets and
+// any path that escapes via .. are rejected — this is the symlink half of the
+// zip-slip defense, since the path guard alone only validates the link's own
+// location, not where it points.
+func symlinkTargetInside(linkPath, linkname, root string) bool {
+	if filepath.IsAbs(linkname) {
+		return false
+	}
+	resolved := filepath.Clean(filepath.Join(filepath.Dir(linkPath), linkname))
+	return strings.HasPrefix(resolved+string(os.PathSeparator), root)
+}
+
+// ParseChecksums parses sha256sum-style "<hex>  <name>" lines into a name->hash
+// map, keyed by basename so it matches the asset filenames consumers request.
+func ParseChecksums(data []byte) map[string]string {
+	out := map[string]string{}
+	for _, line := range strings.Split(string(data), "\n") {
+		fields := strings.Fields(line)
+		if len(fields) < 2 {
+			continue
+		}
+		out[filepath.Base(fields[len(fields)-1])] = strings.ToLower(fields[0])
+	}
+	return out
+}
+
+// Sha256File returns the lowercase hex SHA-256 of the file at path.
+func Sha256File(path string) (string, error) { return sha256file(path) }
+
 func sha256file(path string) (string, error) {
 	f, err := os.Open(path)
 	if err != nil {
@@ -230,16 +260,16 @@ func WhisperCLIBin() string {
 	return filepath.Join(paths.RuntimeDir(), "whisper", "whisper-cli"+exeSuffix())
 }
 
-func latestReleaseAssetURL(name string) (string, error) {
+func latestReleaseAssets() (map[string]string, error) {
 	req, _ := http.NewRequest(http.MethodGet, "https://api.github.com/repos/"+podcliRepo+"/releases/latest", nil)
 	req.Header.Set("Accept", "application/vnd.github+json")
 	resp, err := http.DefaultClient.Do(req)
 	if err != nil {
-		return "", err
+		return nil, err
 	}
 	defer resp.Body.Close()
 	if resp.StatusCode != http.StatusOK {
-		return "", fmt.Errorf("no published release (HTTP %d)", resp.StatusCode)
+		return nil, fmt.Errorf("no published release (HTTP %d)", resp.StatusCode)
 	}
 	var rel struct {
 		Assets []struct {
@@ -248,32 +278,117 @@ func latestReleaseAssetURL(name string) (string, error) {
 		} `json:"assets"`
 	}
 	if err := json.NewDecoder(resp.Body).Decode(&rel); err != nil {
-		return "", err
+		return nil, err
 	}
+	out := make(map[string]string, len(rel.Assets))
 	for _, a := range rel.Assets {
-		if a.Name == name {
-			return a.URL, nil
-		}
+		out[a.Name] = a.URL
+	}
+	return out, nil
+}
+
+func latestReleaseAssetURL(name string) (string, error) {
+	assets, err := latestReleaseAssets()
+	if err != nil {
+		return "", err
+	}
+	if u, ok := assets[name]; ok {
+		return u, nil
 	}
 	return "", fmt.Errorf("asset %s not in latest release", name)
 }
 
+// allowedReleaseHost pins binary downloads to GitHub's own hosts so a redirect
+// can't divert a download to an attacker-controlled host.
+func allowedReleaseHost(h string) bool {
+	h = strings.ToLower(h)
+	switch h {
+	case "github.com", "api.github.com", "objects.githubusercontent.com", "codeload.github.com":
+		return true
+	}
+	return strings.HasSuffix(h, ".githubusercontent.com")
+}
+
+func releaseHTTPClient() *http.Client {
+	return &http.Client{
+		CheckRedirect: func(req *http.Request, via []*http.Request) error {
+			if len(via) >= 10 {
+				return fmt.Errorf("too many redirects")
+			}
+			if !allowedReleaseHost(req.URL.Hostname()) {
+				return fmt.Errorf("refusing redirect to untrusted host %q", req.URL.Hostname())
+			}
+			return nil
+		},
+	}
+}
+
+func httpGetBytes(url string) ([]byte, error) {
+	resp, err := releaseHTTPClient().Get(url)
+	if err != nil {
+		return nil, err
+	}
+	defer resp.Body.Close()
+	if resp.StatusCode != http.StatusOK {
+		return nil, fmt.Errorf("HTTP %d for %s", resp.StatusCode, url)
+	}
+	return io.ReadAll(io.LimitReader(resp.Body, 1<<20))
+}
+
+// verifyReleaseAsset checks the file at path against the release's checksums.txt.
+// It fails closed on a real checksum mismatch, but fails open (with a warning)
+// when checksums.txt is missing, cannot be fetched, or lacks the asset's entry,
+// so a pre-manifest release or a CI hiccup never bricks an install.
+func verifyReleaseAsset(assets map[string]string, assetName, path string) error {
+	sumsURL, ok := assets["checksums.txt"]
+	if !ok {
+		fmt.Fprintf(os.Stderr, "  (no checksums.txt in release — skipped verification of %s)\n", assetName)
+		return nil
+	}
+	data, err := httpGetBytes(sumsURL)
+	if err != nil {
+		fmt.Fprintf(os.Stderr, "  (could not fetch checksums.txt: %v — skipped verification of %s)\n", err, assetName)
+		return nil
+	}
+	want, ok := ParseChecksums(data)[assetName]
+	if !ok {
+		fmt.Fprintf(os.Stderr, "  (no checksum entry for %s — skipped verification)\n", assetName)
+		return nil
+	}
+	got, err := sha256file(path)
+	if err != nil {
+		return err
+	}
+	if !strings.EqualFold(got, want) {
+		os.Remove(path)
+		return fmt.Errorf("checksum mismatch for %s: got %s want %s", assetName, got, want)
+	}
+	return nil
+}
+
 func EnsureWhisperCpp() (string, error) {
 	bin := WhisperCLIBin()
 	if have(bin) {
 		return bin, nil
 	}
 	name := fmt.Sprintf("whisper-cli-%s-%s%s", runtime.GOOS, runtime.GOARCH, exeSuffix())
-	url, err := latestReleaseAssetURL(name)
+	assets, err := latestReleaseAssets()
 	if err != nil {
 		return "", err
 	}
+	url, ok := assets[name]
+	if !ok {
+		return "", fmt.Errorf("asset %s not in latest release", name)
+	}
 	if err := os.MkdirAll(filepath.Dir(bin), 0o755); err != nil {
 		return "", err
 	}
 	if err := fetch(url, bin, "whisper-cli"); err != nil {
 		return "", err
 	}
+	if err := verifyReleaseAsset(assets, name, bin); err != nil {
+		return "", err
+	}
 	if runtime.GOOS != "windows" {
 		os.Chmod(bin, 0o755)
 	}
@@ -540,6 +655,9 @@ func extractTarGz(archive, dest string) error {
 				return err
 			}
 		case tar.TypeSymlink:
+			if !symlinkTargetInside(target, h.Linkname, root) {
+				return fmt.Errorf("unsafe symlink %s -> %s in archive", h.Name, h.Linkname)
+			}
 			if err := os.MkdirAll(filepath.Dir(target), 0o755); err != nil {
 				return err
 			}
@@ -548,11 +666,17 @@ func extractTarGz(archive, dest string) error {
 				return err
 			}
 		case tar.TypeLink:
+			linkTarget := filepath.Join(dest, h.Linkname)
+			if !strings.HasPrefix(linkTarget, root) {
+				return fmt.Errorf("unsafe hardlink %s -> %s in archive", h.Name, h.Linkname)
+			}
 			if err := os.MkdirAll(filepath.Dir(target), 0o755); err != nil {
 				return err
 			}
 			os.Remove(target)
-			os.Link(filepath.Join(dest, h.Linkname), target)
+			if err := os.Link(linkTarget, target); err != nil {
+				return err
+			}
 		}
 	}
 	return nil
diff --git a/cli/internal/provision/security_test.go b/cli/internal/provision/security_test.go
new file mode 100644
index 0000000..2a2d257
--- /dev/null
+++ b/cli/internal/provision/security_test.go
@@ -0,0 +1,75 @@
+package provision
+
+import (
+	"archive/tar"
+	"bytes"
+	"compress/gzip"
+	"os"
+	"path/filepath"
+	"runtime"
+	"testing"
+)
+
+func TestParseChecksums(t *testing.T) {
+	in := []byte("abc123  podcli-darwin-arm64\n" +
+		"DEADBEEF  ./dist/whisper-cli-linux-amd64\n" +
+		"\n" +
+		"malformed-line\n")
+	got := ParseChecksums(in)
+	if got["podcli-darwin-arm64"] != "abc123" {
+		t.Fatalf("podcli hash = %q", got["podcli-darwin-arm64"])
+	}
+	if got["whisper-cli-linux-amd64"] != "deadbeef" {
+		t.Fatalf("whisper hash = %q (want lowercased, basename-keyed)", got["whisper-cli-linux-amd64"])
+	}
+	if len(got) != 2 {
+		t.Fatalf("expected 2 entries, got %d: %v", len(got), got)
+	}
+}
+
+func TestSymlinkTargetInside(t *testing.T) {
+	root := filepath.Clean("/tmp/dest") + string(os.PathSeparator)
+	cases := []struct {
+		name     string
+		linkPath string
+		linkname string
+		ok       bool
+	}{
+		{"relative inside", "/tmp/dest/a/link", "../b", true},
+		{"to root", "/tmp/dest/link", ".", true},
+		{"escape via dotdot", "/tmp/dest/link", "../../etc/passwd", false},
+		{"absolute", "/tmp/dest/link", "/etc/passwd", false},
+		{"deep escape", "/tmp/dest/a/b/link", "../../../outside", false},
+	}
+	for _, c := range cases {
+		if got := symlinkTargetInside(c.linkPath, c.linkname, root); got != c.ok {
+			t.Errorf("%s: symlinkTargetInside(%q,%q)=%v want %v", c.name, c.linkPath, c.linkname, got, c.ok)
+		}
+	}
+}
+
+// TestExtractTarGzRejectsEscapingSymlink builds an in-memory tarball with a
+// symlink that escapes the destination and confirms extraction refuses it.
+func TestExtractTarGzRejectsEscapingSymlink(t *testing.T) {
+	if runtime.GOOS == "windows" {
+		t.Skip("symlink semantics differ on Windows")
+	}
+	var buf bytes.Buffer
+	gz := gzip.NewWriter(&buf)
+	tw := tar.NewWriter(gz)
+	if err := tw.WriteHeader(&tar.Header{Name: "evil", Typeflag: tar.TypeSymlink, Linkname: "../../../../etc", Mode: 0o777}); err != nil {
+		t.Fatal(err)
+	}
+	tw.Close()
+	gz.Close()
+
+	dir := t.TempDir()
+	archive := filepath.Join(dir, "evil.tar.gz")
+	if err := os.WriteFile(archive, buf.Bytes(), 0o644); err != nil {
+		t.Fatal(err)
+	}
+	dest := filepath.Join(dir, "out")
+	if err := extractTarGz(archive, dest); err == nil {
+		t.Fatal("expected extractTarGz to reject escaping symlink, got nil error")
+	}
+}
diff --git a/cli/internal/provision/studio.go b/cli/internal/provision/studio.go
index 28fdc0d..43a6e17 100644
--- a/cli/internal/provision/studio.go
+++ b/cli/internal/provision/studio.go
@@ -93,17 +93,28 @@ func EnsureRemotion() (string, error) {
 		return RemotionDir(), nil
 	}
 	name := fmt.Sprintf("remotion-bundle-%s-%s.tar.gz", runtime.GOOS, runtime.GOARCH)
-	url, err := latestReleaseAssetURL(name)
+	assets, err := latestReleaseAssets()
 	if err != nil {
 		return "", err
 	}
+	url, ok := assets[name]
+	if !ok {
+		return "", fmt.Errorf("asset %s not in latest release", name)
+	}
 	archive := filepath.Join(os.TempDir(), "podcli-"+name)
 	if err := fetch(url, archive, "remotion"); err != nil {
 		return "", err
 	}
 	defer os.Remove(archive)
-	os.RemoveAll(RemotionDir())
-	os.RemoveAll(filepath.Join(paths.RuntimeDir(), "node_modules"))
+	if err := verifyReleaseAsset(assets, name, archive); err != nil {
+		return "", err
+	}
+	if err := os.RemoveAll(RemotionDir()); err != nil {
+		return "", err
+	}
+	if err := os.RemoveAll(filepath.Join(paths.RuntimeDir(), "node_modules")); err != nil {
+		return "", err
+	}
 	if err := extractTarGz(archive, paths.RuntimeDir()); err != nil {
 		return "", err
 	}
@@ -149,15 +160,22 @@ func EnsureStudio() (string, error) {
 	if have(server) {
 		return StudioDir(), nil
 	}
-	url, err := latestReleaseAssetURL("studio-bundle.tar.gz")
+	assets, err := latestReleaseAssets()
 	if err != nil {
 		return "", err
 	}
+	url, ok := assets["studio-bundle.tar.gz"]
+	if !ok {
+		return "", fmt.Errorf("asset studio-bundle.tar.gz not in latest release")
+	}
 	archive := filepath.Join(os.TempDir(), "podcli-studio-bundle.tar.gz")
 	if err := fetch(url, archive, "studio"); err != nil {
 		return "", err
 	}
 	defer os.Remove(archive)
+	if err := verifyReleaseAsset(assets, "studio-bundle.tar.gz", archive); err != nil {
+		return "", err
+	}
 	if err := os.RemoveAll(StudioDir()); err != nil {
 		return "", err
 	}
@@ -231,11 +249,16 @@ func extractTarGzStrip1(archive, dest string) error {
 				return err
 			}
 		case tar.TypeSymlink:
+			if !symlinkTargetInside(target, h.Linkname, root) {
+				return fmt.Errorf("unsafe symlink %s -> %s in archive", h.Name, h.Linkname)
+			}
 			if err := os.MkdirAll(filepath.Dir(target), 0o755); err != nil {
 				return err
 			}
 			os.Remove(target)
-			os.Symlink(h.Linkname, target)
+			if err := os.Symlink(h.Linkname, target); err != nil {
+				return err
+			}
 		}
 	}
 	return nil
diff --git a/cli/internal/update/update.go b/cli/internal/update/update.go
index b3a2d8e..efe330e 100644
--- a/cli/internal/update/update.go
+++ b/cli/internal/update/update.go
@@ -17,6 +17,7 @@ import (
 
 	"podcli/internal/config"
 	"podcli/internal/paths"
+	"podcli/internal/provision"
 )
 
 const repo = "nmbrthirteen/podcli"
@@ -34,9 +35,39 @@ func managedBin() string {
 	return filepath.Join(paths.BinDir(), "podcli"+exeExt())
 }
 
+func assetName() string {
+	return fmt.Sprintf("podcli-%s-%s%s", runtime.GOOS, runtime.GOARCH, exeExt())
+}
+
 func assetURL(tag string) string {
-	return fmt.Sprintf("https://github.com/%s/releases/download/v%s/podcli-%s-%s%s",
-		repo, tag, runtime.GOOS, runtime.GOARCH, exeExt())
+	return fmt.Sprintf("https://github.com/%s/releases/download/v%s/%s", repo, tag, assetName())
+}
+
+func checksumsURL(tag string) string {
+	return fmt.Sprintf("https://github.com/%s/releases/download/v%s/checksums.txt", repo, tag)
+}
+
+func allowedHost(h string) bool {
+	h = strings.ToLower(h)
+	switch h {
+	case "github.com", "api.github.com", "objects.githubusercontent.com", "codeload.github.com":
+		return true
+	}
+	return strings.HasSuffix(h, ".githubusercontent.com")
+}
+
+func guardedClient() *http.Client {
+	return &http.Client{
+		CheckRedirect: func(req *http.Request, via []*http.Request) error {
+			if len(via) >= 10 {
+				return fmt.Errorf("too many redirects")
+			}
+			if !allowedHost(req.URL.Hostname()) {
+				return fmt.Errorf("refusing redirect to untrusted host %q", req.URL.Hostname())
+			}
+			return nil
+		},
+	}
 }
 
 func latestTag(timeout time.Duration) (string, error) {
@@ -126,6 +157,10 @@ func apply(tag string) error {
 	if err := downloadFile(assetURL(tag), staged); err != nil {
 		return err
 	}
+	if err := verifyStaged(tag, staged); err != nil {
+		os.Remove(staged)
+		return err
+	}
 	if runtime.GOOS != "windows" {
 		if err := os.Chmod(staged, 0o755); err != nil {
 			return err
@@ -134,6 +169,41 @@ func apply(tag string) error {
 	return swap(staged, dest)
 }
 
+// verifyStaged checks the downloaded binary against the release's checksums.txt.
+// Fails closed on a mismatch or fetch error; fails open (warning) when the
+// manifest or this asset's entry is absent, so an older release still updates.
+func verifyStaged(tag, staged string) error {
+	resp, err := guardedClient().Get(checksumsURL(tag))
+	if err != nil {
+		return err
+	}
+	defer resp.Body.Close()
+	if resp.StatusCode == http.StatusNotFound {
+		fmt.Fprintln(os.Stderr, "  (no checksums.txt in release — skipped verification)")
+		return nil
+	}
+	if resp.StatusCode != http.StatusOK {
+		return fmt.Errorf("HTTP %d fetching checksums.txt", resp.StatusCode)
+	}
+	data, err := io.ReadAll(io.LimitReader(resp.Body, 1<<20))
+	if err != nil {
+		return err
+	}
+	want, ok := provision.ParseChecksums(data)[assetName()]
+	if !ok {
+		fmt.Fprintf(os.Stderr, "  (no checksum entry for %s — skipped verification)\n", assetName())
+		return nil
+	}
+	got, err := provision.Sha256File(staged)
+	if err != nil {
+		return err
+	}
+	if !strings.EqualFold(got, want) {
+		return fmt.Errorf("checksum mismatch: got %s want %s", got, want)
+	}
+	return nil
+}
+
 // swap replaces dest with staged. On Windows a running .exe can't be overwritten,
 // so the old binary is moved aside first; on Unix the rename is atomic.
 func swap(staged, dest string) error {
@@ -149,19 +219,12 @@ func swap(staged, dest string) error {
 	return os.Rename(staged, dest)
 }
 
-func downloadFile(url, dest string, redirects ...int) error {
-	depth := 0
-	if len(redirects) > 0 {
-		depth = redirects[0]
-	}
-	resp, err := http.Get(url)
+func downloadFile(url, dest string) error {
+	resp, err := guardedClient().Get(url)
 	if err != nil {
 		return err
 	}
 	defer resp.Body.Close()
-	if loc := resp.Header.Get("Location"); loc != "" && resp.StatusCode/100 == 3 && depth < 6 {
-		return downloadFile(loc, dest, depth+1)
-	}
 	if resp.StatusCode != http.StatusOK {
 		return fmt.Errorf("HTTP %d for %s", resp.StatusCode, url)
 	}
diff --git a/npm/scripts/install.js b/npm/scripts/install.js
index d0ebd9b..14bd731 100644
--- a/npm/scripts/install.js
+++ b/npm/scripts/install.js
@@ -9,10 +9,17 @@ const os = require('os');
 const fs = require('fs');
 const path = require('path');
 const https = require('https');
+const crypto = require('crypto');
 const { version } = require('../package.json');
 
 const REPO = 'nmbrthirteen/podcli';
 
+function allowedHost(h) {
+  h = String(h).toLowerCase();
+  if (['github.com', 'api.github.com', 'objects.githubusercontent.com', 'codeload.github.com'].includes(h)) return true;
+  return h.endsWith('.githubusercontent.com');
+}
+
 const TARGETS = {
   'darwin-x64': 'darwin-amd64',
   'darwin-arm64': 'darwin-arm64',
@@ -43,6 +50,9 @@ function target() {
 function download(url, dest, redirects) {
   redirects = redirects || 0;
   return new Promise((resolve, reject) => {
+    let host;
+    try { host = new URL(url).hostname; } catch (e) { return reject(e); }
+    if (!allowedHost(host)) return reject(new Error(`refusing to download from untrusted host ${host}`));
     https
       .get(url, { headers: { 'User-Agent': 'podcli-install' } }, (res) => {
         if ([301, 302, 307, 308].includes(res.statusCode) && res.headers.location && redirects < 6) {
@@ -66,6 +76,57 @@ function download(url, dest, redirects) {
   });
 }
 
+function fetchText(url, redirects) {
+  redirects = redirects || 0;
+  return new Promise((resolve, reject) => {
+    let host;
+    try { host = new URL(url).hostname; } catch (e) { return reject(e); }
+    if (!allowedHost(host)) return reject(new Error(`refusing to fetch from untrusted host ${host}`));
+    https
+      .get(url, { headers: { 'User-Agent': 'podcli-install' } }, (res) => {
+        if ([301, 302, 307, 308].includes(res.statusCode) && res.headers.location && redirects < 6) {
+          res.resume();
+          return resolve(fetchText(res.headers.location, redirects + 1));
+        }
+        if (res.statusCode === 404) { res.resume(); return resolve(null); }
+        if (res.statusCode !== 200) { res.resume(); return reject(new Error(`HTTP ${res.statusCode} for ${url}`)); }
+        let body = '';
+        res.setEncoding('utf8');
+        res.on('data', (c) => { body += c; });
+        res.on('end', () => resolve(body));
+      })
+      .on('error', reject);
+  });
+}
+
+function parseChecksums(text) {
+  const out = {};
+  for (const line of text.split('\n')) {
+    const f = line.trim().split(/\s+/);
+    if (f.length < 2) continue;
+    out[path.basename(f[f.length - 1])] = f[0].toLowerCase();
+  }
+  return out;
+}
+
+function sha256(file) {
+  return crypto.createHash('sha256').update(fs.readFileSync(file)).digest('hex');
+}
+
+// verify checks dest against checksums.txt for this release. Fails closed on a
+// mismatch; fails open (warning) when the manifest or entry is absent.
+async function verify(dest, name, tag) {
+  const text = await fetchText(`https://github.com/${REPO}/releases/download/v${tag}/checksums.txt`);
+  if (!text) { console.error('podcli: no checksums.txt in release — skipped verification'); return; }
+  const want = parseChecksums(text)[name];
+  if (!want) { console.error(`podcli: no checksum entry for ${name} — skipped verification`); return; }
+  const got = sha256(dest);
+  if (got !== want) {
+    fs.rmSync(dest, { force: true });
+    throw new Error(`checksum mismatch for ${name}: got ${got} want ${want}`);
+  }
+}
+
 async function ensure() {
   const dest = binPath();
   fs.mkdirSync(path.dirname(dest), { recursive: true });
@@ -74,8 +135,10 @@ async function ensure() {
     fs.copyFileSync(src, dest);
   } else {
     const ext = process.platform === 'win32' ? '.exe' : '';
-    const url = `https://github.com/${REPO}/releases/download/v${version}/podcli-${target()}${ext}`;
+    const name = `podcli-${target()}${ext}`;
+    const url = `https://github.com/${REPO}/releases/download/v${version}/${name}`;
     await download(url, dest);
+    await verify(dest, name, version);
   }
   if (process.platform !== 'win32') fs.chmodSync(dest, 0o755);
   return dest;
diff --git a/tests/test_transcript_cache_engine.py b/tests/test_transcript_cache_engine.py
new file mode 100644
index 0000000..53362b1
--- /dev/null
+++ b/tests/test_transcript_cache_engine.py
@@ -0,0 +1,48 @@
+import os
+import sys
+import unittest
+
+ROOT = os.path.abspath(os.path.join(os.path.dirname(__file__), ".."))
+BACKEND_ROOT = os.path.join(ROOT, "backend")
+if BACKEND_ROOT not in sys.path:
+    sys.path.insert(0, BACKEND_ROOT)
+
+from services import transcript_packer as tp
+
+
+class EngineNamespacedCacheTests(unittest.TestCase):
+    def setUp(self):
+        self._saved = os.environ.get("PODCLI_ENGINE")
+
+    def tearDown(self):
+        if self._saved is None:
+            os.environ.pop("PODCLI_ENGINE", None)
+        else:
+            os.environ["PODCLI_ENGINE"] = self._saved
+
+    def test_whisper_py_keeps_bare_path(self):
+        for v in (None, "whisper-py", "WHISPER-PY"):
+            if v is None:
+                os.environ.pop("PODCLI_ENGINE", None)
+            else:
+                os.environ["PODCLI_ENGINE"] = v
+            self.assertTrue(tp.transcript_json_path("abc123").endswith("abc123.json"))
+
+    def test_whispercpp_is_namespaced(self):
+        for v in ("whispercpp", "whisper-cpp", "whisper.cpp", "cpp"):
+            os.environ["PODCLI_ENGINE"] = v
+            self.assertTrue(
+                tp.transcript_json_path("abc123").endswith("abc123-whispercpp.json"),
+                f"engine {v!r} not namespaced",
+            )
+
+    def test_engines_do_not_collide(self):
+        os.environ.pop("PODCLI_ENGINE", None)
+        py = tp.transcript_json_path("abc123")
+        os.environ["PODCLI_ENGINE"] = "whispercpp"
+        cpp = tp.transcript_json_path("abc123")
+        self.assertNotEqual(py, cpp)
+
+
+if __name__ == "__main__":
+    unittest.main()

From 4d6cd4ae7d68abbd043d50acb96407c7290d6701 Mon Sep 17 00:00:00 2001
From: Nika Siradze <nikushasiradzee@gmail.com>
Date: Mon, 15 Jun 2026 20:04:15 +0400
Subject: [PATCH 28/41] Make caption parity goldens host-independent
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

The goldens encoded two host-specific inputs, so they passed on macOS but
failed on CI's Ubuntu:
- the font name (DETECTED_FONT resolves Arial on macOS, Liberation Sans on
  Ubuntu via fc-list) written into the ASS Style line — affected all 4 styles
- per-word text widths (fc-match + freetype metrics) — affected the branded
  style's positioning and pill geometry

Pin both in the parity test fixture (font -> Arial, deterministic width model)
and regenerate the goldens. Production rendering still uses real host metrics;
only the harness is pinned, which is its whole point: lock the engine-independent
ASS pipeline, not the host's typography.
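
A minimal sketch of that pinning idea, assuming a hypothetical renderer object:
the real module, its measuring function, and the width model used by the
fixture are not reproduced here. Only the DETECTED_FONT name comes from the
description above; measure_text, fixed_width, style_line, and the 0.5 em
factor are illustrative assumptions.

```python
from types import SimpleNamespace
from unittest import mock

# Hypothetical stand-in for the caption renderer module.
renderer = SimpleNamespace(
    DETECTED_FONT="Liberation Sans",      # on a real host this would come from fc-list
    measure_text=lambda text, size: 0.0,  # on a real host this would use font metrics
)


def fixed_width(text: str, size: int) -> float:
    """Deterministic width model (assumed): every character is 0.5 em wide."""
    return len(text) * size * 0.5


def style_line() -> str:
    """Build a trimmed ASS Style line from whatever font the renderer reports."""
    return f"Style: BrandedNormal,{renderer.DETECTED_FONT},80"


# Pin both host-specific inputs for the duration of the test only.
with mock.patch.object(renderer, "DETECTED_FONT", "Arial"), \
     mock.patch.object(renderer, "measure_text", fixed_width):
    pinned_style = style_line()
    pinned_width = renderer.measure_text("Telescope", 80)
```

Because the patches are scoped to the with block, the host values are restored
afterwards, which mirrors the intent that production rendering keeps using
real metrics.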
---
 tests/parity/golden/branded.ass.expected | 76 ++++++++++++------------
 tests/parity/test_caption_parity.py      | 43 ++++++++++++++
 2 files changed, 81 insertions(+), 38 deletions(-)

diff --git a/tests/parity/golden/branded.ass.expected b/tests/parity/golden/branded.ass.expected
index e6e5650..3181376 100644
--- a/tests/parity/golden/branded.ass.expected
+++ b/tests/parity/golden/branded.ass.expected
@@ -12,41 +12,41 @@ Style: BrandedNormal,Arial,80,&H00FFFFFF,&H00FFFFFF,&H90000000,&H00000000,-1,0,0
 
 [Events]
 Format: Layer, Start, End, Style, Name, MarginL, MarginR, MarginV, Effect, Text
-Dialogue: 0,0:00:00.00,0:00:00.34,BrandedNormal,,0,0,0,,{\an7\pos(208,1330)\p1\c&H00000000&\bord0\shad0\1a&HFF&\t(0,80,\1a&H30&)}m 15 0 l 168 0 b 176 0 183 7 183 15 l 183 85 b 183 93 176 100 168 100 l 15 100 b 7 100 0 93 0 85 l 0 15 b 0 7 7 0 15 0{\p0}
-Dialogue: 1,0:00:00.00,0:00:00.34,BrandedNormal,,0,0,0,,{\an5\pos(299,1380)}The
-Dialogue: 1,0:00:00.00,0:00:00.34,BrandedNormal,,0,0,0,,{\an5\pos(509,1380)}james
-Dialogue: 1,0:00:00.00,0:00:00.34,BrandedNormal,,0,0,0,,{\an5\pos(749,1380)}webb
-Dialogue: 0,0:00:00.34,0:00:00.72,BrandedNormal,,0,0,0,,{\an7\pos(373,1330)\p1\c&H00000000&\bord0\shad0\1a&HFF&\t(0,80,\1a&H30&)}m 15 0 l 257 0 b 265 0 272 7 272 15 l 272 85 b 272 93 265 100 257 100 l 15 100 b 7 100 0 93 0 85 l 0 15 b 0 7 7 0 15 0{\p0}
-Dialogue: 1,0:00:00.34,0:00:00.72,BrandedNormal,,0,0,0,,{\an5\pos(299,1380)}The
-Dialogue: 1,0:00:00.34,0:00:00.72,BrandedNormal,,0,0,0,,{\an5\pos(509,1380)}james
-Dialogue: 1,0:00:00.34,0:00:00.72,BrandedNormal,,0,0,0,,{\an5\pos(749,1380)}webb
-Dialogue: 0,0:00:00.72,0:00:01.05,BrandedNormal,,0,0,0,,{\an7\pos(627,1330)\p1\c&H00000000&\bord0\shad0\1a&HFF&\t(0,80,\1a&H30&)}m 15 0 l 230 0 b 238 0 245 7 245 15 l 245 85 b 245 93 238 100 230 100 l 15 100 b 7 100 0 93 0 85 l 0 15 b 0 7 7 0 15 0{\p0}
-Dialogue: 1,0:00:00.72,0:00:01.05,BrandedNormal,,0,0,0,,{\an5\pos(299,1380)}The
-Dialogue: 1,0:00:00.72,0:00:01.05,BrandedNormal,,0,0,0,,{\an5\pos(509,1380)}james
-Dialogue: 1,0:00:00.72,0:00:01.05,BrandedNormal,,0,0,0,,{\an5\pos(749,1380)}webb
-Dialogue: 0,0:00:01.05,0:00:01.62,BrandedNormal,,0,0,0,,{\an7\pos(150,1330)\p1\c&H00000000&\bord0\shad0\1a&HFF&\t(0,80,\1a&H30&)}m 15 0 l 419 0 b 427 0 434 7 434 15 l 434 85 b 434 93 427 100 419 100 l 15 100 b 7 100 0 93 0 85 l 0 15 b 0 7 7 0 15 0{\p0}
-Dialogue: 1,0:00:01.05,0:00:01.62,BrandedNormal,,0,0,0,,{\an5\pos(367,1380)}Telescope
-Dialogue: 1,0:00:01.05,0:00:01.62,BrandedNormal,,0,0,0,,{\an5\pos(669,1380)}cost
-Dialogue: 1,0:00:01.05,0:00:01.62,BrandedNormal,,0,0,0,,{\an5\pos(841,1380)}$10
-Dialogue: 0,0:00:01.62,0:00:01.95,BrandedNormal,,0,0,0,,{\an7\pos(566,1330)\p1\c&H00000000&\bord0\shad0\1a&HFF&\t(0,80,\1a&H30&)}m 15 0 l 191 0 b 199 0 206 7 206 15 l 206 85 b 206 93 199 100 191 100 l 15 100 b 7 100 0 93 0 85 l 0 15 b 0 7 7 0 15 0{\p0}
-Dialogue: 1,0:00:01.62,0:00:01.95,BrandedNormal,,0,0,0,,{\an5\pos(367,1380)}Telescope
-Dialogue: 1,0:00:01.62,0:00:01.95,BrandedNormal,,0,0,0,,{\an5\pos(669,1380)}cost
-Dialogue: 1,0:00:01.62,0:00:01.95,BrandedNormal,,0,0,0,,{\an5\pos(841,1380)}$10
-Dialogue: 0,0:00:01.95,0:00:02.40,BrandedNormal,,0,0,0,,{\an7\pos(754,1330)\p1\c&H00000000&\bord0\shad0\1a&HFF&\t(0,80,\1a&H30&)}m 15 0 l 160 0 b 168 0 175 7 175 15 l 175 85 b 175 93 168 100 160 100 l 15 100 b 7 100 0 93 0 85 l 0 15 b 0 7 7 0 15 0{\p0}
-Dialogue: 1,0:00:01.95,0:00:02.40,BrandedNormal,,0,0,0,,{\an5\pos(367,1380)}Telescope
-Dialogue: 1,0:00:01.95,0:00:02.40,BrandedNormal,,0,0,0,,{\an5\pos(669,1380)}cost
-Dialogue: 1,0:00:01.95,0:00:02.40,BrandedNormal,,0,0,0,,{\an5\pos(841,1380)}$10
-Dialogue: 0,0:00:02.40,0:00:03.10,BrandedNormal,,0,0,0,,{\an7\pos(167,1330)\p1\c&H00000000&\bord0\shad0\1a&HFF&\t(0,80,\1a&H30&)}m 15 0 l 291 0 b 299 0 306 7 306 15 l 306 85 b 306 93 299 100 291 100 l 15 100 b 7 100 0 93 0 85 l 0 15 b 0 7 7 0 15 0{\p0}
-Dialogue: 1,0:00:02.40,0:00:03.10,BrandedNormal,,0,0,0,,{\an5\pos(320,1380)}Billion.
-Dialogue: 1,0:00:02.40,0:00:03.10,BrandedNormal,,0,0,0,,{\an5\pos(598,1380)}wasn't
-Dialogue: 1,0:00:02.40,0:00:03.10,BrandedNormal,,0,0,0,,{\an5\pos(818,1380)}that
-Dialogue: 0,0:00:03.10,0:00:03.55,BrandedNormal,,0,0,0,,{\an7\pos(455,1330)\p1\c&H00000000&\bord0\shad0\1a&HFF&\t(0,80,\1a&H30&)}m 15 0 l 272 0 b 280 0 287 7 287 15 l 287 85 b 287 93 280 100 272 100 l 15 100 b 7 100 0 93 0 85 l 0 15 b 0 7 7 0 15 0{\p0}
-Dialogue: 1,0:00:03.10,0:00:03.55,BrandedNormal,,0,0,0,,{\an5\pos(320,1380)}Billion.
-Dialogue: 1,0:00:03.10,0:00:03.55,BrandedNormal,,0,0,0,,{\an5\pos(598,1380)}wasn't
-Dialogue: 1,0:00:03.10,0:00:03.55,BrandedNormal,,0,0,0,,{\an5\pos(818,1380)}that
-Dialogue: 0,0:00:03.55,0:00:03.60,BrandedNormal,,0,0,0,,{\an7\pos(724,1330)\p1\c&H00000000&\bord0\shad0\1a&HFF&\t(0,80,\1a&H30&)}m 15 0 l 173 0 b 181 0 188 7 188 15 l 188 85 b 188 93 181 100 173 100 l 15 100 b 7 100 0 93 0 85 l 0 15 b 0 7 7 0 15 0{\p0}
-Dialogue: 1,0:00:03.55,0:00:03.60,BrandedNormal,,0,0,0,,{\an5\pos(320,1380)}Billion.
-Dialogue: 1,0:00:03.55,0:00:03.60,BrandedNormal,,0,0,0,,{\an5\pos(598,1380)}wasn't
-Dialogue: 1,0:00:03.55,0:00:03.60,BrandedNormal,,0,0,0,,{\an5\pos(818,1380)}that
-Dialogue: 0,0:00:03.55,0:00:04.20,BrandedNormal,,0,0,0,,{\an7\pos(296,1330)\p1\c&H00000000&\bord0\shad0\1a&HFF&\t(0,80,\1a&H30&)}m 15 0 l 472 0 b 480 0 487 7 487 15 l 487 85 b 487 93 480 100 472 100 l 15 100 b 7 100 0 93 0 85 l 0 15 b 0 7 7 0 15 0{\p0}
-Dialogue: 1,0:00:03.55,0:00:04.20,BrandedNormal,,0,0,0,,{\an5\pos(539,1380)}Expensive?
+Dialogue: 0,0:00:00.00,0:00:00.34,BrandedNormal,,0,0,0,,{\an7\pos(240,1330)\p1\c&H00000000&\bord0\shad0\1a&HFF&\t(0,80,\1a&H30&)}m 15 0 l 145 0 b 153 0 160 7 160 15 l 160 85 b 160 93 153 100 145 100 l 15 100 b 7 100 0 93 0 85 l 0 15 b 0 7 7 0 15 0{\p0}
+Dialogue: 1,0:00:00.00,0:00:00.34,BrandedNormal,,0,0,0,,{\an5\pos(320,1380)}The
+Dialogue: 1,0:00:00.00,0:00:00.34,BrandedNormal,,0,0,0,,{\an5\pos(520,1380)}james
+Dialogue: 1,0:00:00.00,0:00:00.34,BrandedNormal,,0,0,0,,{\an5\pos(740,1380)}webb
+Dialogue: 0,0:00:00.34,0:00:00.72,BrandedNormal,,0,0,0,,{\an7\pos(400,1330)\p1\c&H00000000&\bord0\shad0\1a&HFF&\t(0,80,\1a&H30&)}m 15 0 l 225 0 b 233 0 240 7 240 15 l 240 85 b 240 93 233 100 225 100 l 15 100 b 7 100 0 93 0 85 l 0 15 b 0 7 7 0 15 0{\p0}
+Dialogue: 1,0:00:00.34,0:00:00.72,BrandedNormal,,0,0,0,,{\an5\pos(320,1380)}The
+Dialogue: 1,0:00:00.34,0:00:00.72,BrandedNormal,,0,0,0,,{\an5\pos(520,1380)}james
+Dialogue: 1,0:00:00.34,0:00:00.72,BrandedNormal,,0,0,0,,{\an5\pos(740,1380)}webb
+Dialogue: 0,0:00:00.72,0:00:01.05,BrandedNormal,,0,0,0,,{\an7\pos(640,1330)\p1\c&H00000000&\bord0\shad0\1a&HFF&\t(0,80,\1a&H30&)}m 15 0 l 185 0 b 193 0 200 7 200 15 l 200 85 b 200 93 193 100 185 100 l 15 100 b 7 100 0 93 0 85 l 0 15 b 0 7 7 0 15 0{\p0}
+Dialogue: 1,0:00:00.72,0:00:01.05,BrandedNormal,,0,0,0,,{\an5\pos(320,1380)}The
+Dialogue: 1,0:00:00.72,0:00:01.05,BrandedNormal,,0,0,0,,{\an5\pos(520,1380)}james
+Dialogue: 1,0:00:00.72,0:00:01.05,BrandedNormal,,0,0,0,,{\an5\pos(740,1380)}webb
+Dialogue: 0,0:00:01.05,0:00:01.62,BrandedNormal,,0,0,0,,{\an7\pos(160,1330)\p1\c&H00000000&\bord0\shad0\1a&HFF&\t(0,80,\1a&H30&)}m 15 0 l 385 0 b 393 0 400 7 400 15 l 400 85 b 400 93 393 100 385 100 l 15 100 b 7 100 0 93 0 85 l 0 15 b 0 7 7 0 15 0{\p0}
+Dialogue: 1,0:00:01.05,0:00:01.62,BrandedNormal,,0,0,0,,{\an5\pos(360,1380)}Telescope
+Dialogue: 1,0:00:01.05,0:00:01.62,BrandedNormal,,0,0,0,,{\an5\pos(660,1380)}cost
+Dialogue: 1,0:00:01.05,0:00:01.62,BrandedNormal,,0,0,0,,{\an5\pos(840,1380)}$10
+Dialogue: 0,0:00:01.62,0:00:01.95,BrandedNormal,,0,0,0,,{\an7\pos(560,1330)\p1\c&H00000000&\bord0\shad0\1a&HFF&\t(0,80,\1a&H30&)}m 15 0 l 185 0 b 193 0 200 7 200 15 l 200 85 b 200 93 193 100 185 100 l 15 100 b 7 100 0 93 0 85 l 0 15 b 0 7 7 0 15 0{\p0}
+Dialogue: 1,0:00:01.62,0:00:01.95,BrandedNormal,,0,0,0,,{\an5\pos(360,1380)}Telescope
+Dialogue: 1,0:00:01.62,0:00:01.95,BrandedNormal,,0,0,0,,{\an5\pos(660,1380)}cost
+Dialogue: 1,0:00:01.62,0:00:01.95,BrandedNormal,,0,0,0,,{\an5\pos(840,1380)}$10
+Dialogue: 0,0:00:01.95,0:00:02.40,BrandedNormal,,0,0,0,,{\an7\pos(760,1330)\p1\c&H00000000&\bord0\shad0\1a&HFF&\t(0,80,\1a&H30&)}m 15 0 l 145 0 b 153 0 160 7 160 15 l 160 85 b 160 93 153 100 145 100 l 15 100 b 7 100 0 93 0 85 l 0 15 b 0 7 7 0 15 0{\p0}
+Dialogue: 1,0:00:01.95,0:00:02.40,BrandedNormal,,0,0,0,,{\an5\pos(360,1380)}Telescope
+Dialogue: 1,0:00:01.95,0:00:02.40,BrandedNormal,,0,0,0,,{\an5\pos(660,1380)}cost
+Dialogue: 1,0:00:01.95,0:00:02.40,BrandedNormal,,0,0,0,,{\an5\pos(840,1380)}$10
+Dialogue: 0,0:00:02.40,0:00:03.10,BrandedNormal,,0,0,0,,{\an7\pos(120,1330)\p1\c&H00000000&\bord0\shad0\1a&HFF&\t(0,80,\1a&H30&)}m 15 0 l 345 0 b 353 0 360 7 360 15 l 360 85 b 360 93 353 100 345 100 l 15 100 b 7 100 0 93 0 85 l 0 15 b 0 7 7 0 15 0{\p0}
+Dialogue: 1,0:00:02.40,0:00:03.10,BrandedNormal,,0,0,0,,{\an5\pos(300,1380)}Billion.
+Dialogue: 1,0:00:02.40,0:00:03.10,BrandedNormal,,0,0,0,,{\an5\pos(620,1380)}wasn't
+Dialogue: 1,0:00:02.40,0:00:03.10,BrandedNormal,,0,0,0,,{\an5\pos(860,1380)}that
+Dialogue: 0,0:00:03.10,0:00:03.55,BrandedNormal,,0,0,0,,{\an7\pos(480,1330)\p1\c&H00000000&\bord0\shad0\1a&HFF&\t(0,80,\1a&H30&)}m 15 0 l 265 0 b 273 0 280 7 280 15 l 280 85 b 280 93 273 100 265 100 l 15 100 b 7 100 0 93 0 85 l 0 15 b 0 7 7 0 15 0{\p0}
+Dialogue: 1,0:00:03.10,0:00:03.55,BrandedNormal,,0,0,0,,{\an5\pos(300,1380)}Billion.
+Dialogue: 1,0:00:03.10,0:00:03.55,BrandedNormal,,0,0,0,,{\an5\pos(620,1380)}wasn't
+Dialogue: 1,0:00:03.10,0:00:03.55,BrandedNormal,,0,0,0,,{\an5\pos(860,1380)}that
+Dialogue: 0,0:00:03.55,0:00:03.60,BrandedNormal,,0,0,0,,{\an7\pos(760,1330)\p1\c&H00000000&\bord0\shad0\1a&HFF&\t(0,80,\1a&H30&)}m 15 0 l 185 0 b 193 0 200 7 200 15 l 200 85 b 200 93 193 100 185 100 l 15 100 b 7 100 0 93 0 85 l 0 15 b 0 7 7 0 15 0{\p0}
+Dialogue: 1,0:00:03.55,0:00:03.60,BrandedNormal,,0,0,0,,{\an5\pos(300,1380)}Billion.
+Dialogue: 1,0:00:03.55,0:00:03.60,BrandedNormal,,0,0,0,,{\an5\pos(620,1380)}wasn't
+Dialogue: 1,0:00:03.55,0:00:03.60,BrandedNormal,,0,0,0,,{\an5\pos(860,1380)}that
+Dialogue: 0,0:00:03.55,0:00:04.20,BrandedNormal,,0,0,0,,{\an7\pos(320,1330)\p1\c&H00000000&\bord0\shad0\1a&HFF&\t(0,80,\1a&H30&)}m 15 0 l 425 0 b 433 0 440 7 440 15 l 440 85 b 440 93 433 100 425 100 l 15 100 b 7 100 0 93 0 85 l 0 15 b 0 7 7 0 15 0{\p0}
+Dialogue: 1,0:00:03.55,0:00:04.20,BrandedNormal,,0,0,0,,{\an5\pos(540,1380)}Expensive?
diff --git a/tests/parity/test_caption_parity.py b/tests/parity/test_caption_parity.py
index d5cc250..8689874 100644
--- a/tests/parity/test_caption_parity.py
+++ b/tests/parity/test_caption_parity.py
@@ -26,6 +26,7 @@
 if BACKEND_ROOT not in sys.path:
     sys.path.insert(0, BACKEND_ROOT)
 
+import services.caption_renderer as caption_renderer  # noqa: E402
 from services.caption_renderer import render_captions, _sanitize_words  # noqa: E402
 
 HERE = os.path.dirname(__file__)
@@ -34,6 +35,26 @@
 STYLES = ["hormozi", "karaoke", "subtle", "branded"]
 
 
+def _deterministic_text_widths(texts, font_name, font_size, bold, spacing=2):
+    """A host-font-independent stand-in for _measure_text_widths.
+
+    Real width measurement resolves the host's font via fc-match and freetype,
+    so positions differ between macOS (Arial) and CI (DejaVu/Liberation) and the
+    goldens can never match across machines. The parity harness exists to lock
+    the engine-independent ASS pipeline (timing, styles, karaoke, position and
+    pill geometry derived from widths) — not the host's typography — so we pin a
+    deterministic width model and the goldens become reproducible everywhere.
+    """
+    widths = []
+    for t in texts:
+        n = len(t)
+        if n == 0:
+            widths.append(0)
+            continue
+        widths.append(round(n * font_size * 0.5) + spacing * max(0, n - 1))
+    return widths
+
+
 def _load_words():
     with open(TRANSCRIPT, encoding="utf-8") as f:
         return json.load(f)["words"]
@@ -51,6 +72,28 @@ def _render(style: str) -> str:
             os.remove(out)
 
 
+@pytest.fixture(autouse=True)
+def _hermetic_render(monkeypatch):
+    """Pin the two host-dependent inputs so goldens are reproducible everywhere.
+
+    1. Font name: caption_styles.DETECTED_FONT is resolved at import via fc-list,
+       so it's "Arial" on macOS and "Liberation Sans" on CI's Ubuntu — and that
+       name is written into the ASS Style line.
+    2. Text widths: see _deterministic_text_widths (affects the branded style's
+       per-word positioning and pill geometry).
+    """
+    monkeypatch.setattr(caption_renderer, "_measure_text_widths", _deterministic_text_widths)
+
+    real_get_style = caption_renderer.get_style
+
+    def _pinned_get_style(name):
+        style = dict(real_get_style(name))
+        style["font_name"] = "Arial"
+        return style
+
+    monkeypatch.setattr(caption_renderer, "get_style", _pinned_get_style)
+
+
 @pytest.mark.parametrize("style", STYLES)
 def test_caption_output_matches_golden(style):
     produced = _render(style)

From 167f2c17ac9cc049171db4912bf49a9ff4412309 Mon Sep 17 00:00:00 2001
From: Nika Siradze <nikushasiradzee@gmail.com>
Date: Mon, 15 Jun 2026 20:15:00 +0400
Subject: [PATCH 29/41] Address PR review comments
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

- requirements: bump opencv-python-headless (CVE-2023-4863), python-dotenv
  (CVE-2026-28684), google-api-python-client (2.0.0 yanked) floors in both
  runtime and dev requirements
- cli.py: import subprocess at module scope — the native-engine except clause
  referenced subprocess.CalledProcessError without it in scope (NameError)
- install.js: Windows defaultHome now nests under LOCALAPPDATA\podcli to match
  the Go launcher's managed dir (binary was installed where the shim couldn't
  find it)
- update/provision: add ResponseHeaderTimeout to the release HTTP clients so a
  stalled server can't hang update/provisioning indefinitely
- update.go: roll back the Windows binary swap if the staged rename fails, so a
  failed self-update can't brick the CLI
- whisper.cpp snap: guarantee positive word duration when clamping to the
  voiced upper bound (was collapsing to zero)
- transcript cache: don't adopt the legacy whisper-py cache under a whisper.cpp
  namespace
- release.yml: default permissions to read; grant contents:write only to the
  release job
- tests: parity test renders to a mkstemp file instead of a race-prone mktemp
  name; add hardlink-escape and zero-duration-snap regression tests
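
A minimal sketch of the zero-duration case fixed in the whisper.cpp snap. The
two helper names are hypothetical and isolate only the end-time fallback; the
real logic lives inside _snap_words_to_voiced.

```python
def clamp_end_old(s, e, hi):
    # Previous behavior: the fallback end was capped at the voiced upper bound,
    # so a word whose start was already clamped to hi got end == start.
    if e <= s:
        e = min(s + 0.05, hi)
    return e


def clamp_end_new(s, e, hi):
    # Fixed behavior: always extend by 50 ms, even if that overhangs hi.
    # hi is kept in the signature only to make the comparison side by side.
    if e <= s:
        e = s + 0.05
    return e


s = e = hi = 1.2  # a word whose start was clamped to the voiced upper bound
print(clamp_end_old(s, e, hi) > s)  # False: duration collapsed to zero
print(clamp_end_new(s, e, hi) > s)  # True: duration stays positive
```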
---
 .github/workflows/release.yml                |  5 ++++-
 backend/cli.py                               |  1 +
 backend/requirements-runtime.txt             |  6 +++---
 backend/requirements.txt                     |  6 +++---
 backend/services/transcript_packer.py        | 16 +++++++++------
 backend/services/transcription_whispercpp.py |  5 ++++-
 cli/internal/provision/provision.go          |  1 +
 cli/internal/provision/security_test.go      | 21 ++++++++++++++++++++
 cli/internal/update/update.go                | 10 ++++++++++
 npm/scripts/install.js                       |  2 +-
 tests/parity/test_caption_parity.py          |  3 ++-
 tests/test_whispercpp_snap.py                |  3 +++
 12 files changed, 63 insertions(+), 16 deletions(-)

diff --git a/.github/workflows/release.yml b/.github/workflows/release.yml
index 7707e19..6ed6316 100644
--- a/.github/workflows/release.yml
+++ b/.github/workflows/release.yml
@@ -11,8 +11,9 @@ on:
     tags:
       - 'v*'
 
+# Default to read-only; only the release job that publishes assets gets write.
 permissions:
-  contents: write
+  contents: read
 
 jobs:
   build:
@@ -138,6 +139,8 @@ jobs:
   release:
     needs: [build, whisper, studio, remotion]
     runs-on: ubuntu-latest
+    permissions:
+      contents: write
     steps:
       - uses: actions/download-artifact@v4
         with:
diff --git a/backend/cli.py b/backend/cli.py
index 7c5537f..08d3f34 100644
--- a/backend/cli.py
+++ b/backend/cli.py
@@ -14,6 +14,7 @@
 import json
 import os
 import shutil
+import subprocess
 import sys
 import textwrap
 import time
diff --git a/backend/requirements-runtime.txt b/backend/requirements-runtime.txt
index 8f5143d..408f153 100644
--- a/backend/requirements-runtime.txt
+++ b/backend/requirements-runtime.txt
@@ -2,10 +2,10 @@
 # binary), so openai-whisper/torch are intentionally absent — that is the ~2GB
 # the native install saves. --engine whisper-py remains a dev/source-only option
 # via backend/requirements.txt.
-opencv-python-headless>=4.8.0
+opencv-python-headless>=4.8.1.78
 numpy>=1.24.0
 Pillow>=10.0.0
 questionary>=2.0.0
-python-dotenv>=1.0.0
-google-api-python-client>=2.0.0
+python-dotenv>=1.2.2
+google-api-python-client>=2.0.1
 google-auth-oauthlib>=1.0.0
diff --git a/backend/requirements.txt b/backend/requirements.txt
index ca445ef..e5fcbfc 100644
--- a/backend/requirements.txt
+++ b/backend/requirements.txt
@@ -9,7 +9,7 @@ openai-whisper>=20231117
 # speechbrain>=1.0.0
 
 # Face detection (optional, for smart cropping)
-opencv-python-headless>=4.8.0
+opencv-python-headless>=4.8.1.78
 numpy>=1.24.0
 
 # Thumbnails
@@ -19,8 +19,8 @@ Pillow>=10.0.0
 questionary>=2.0.0
 
 # Utilities
-python-dotenv>=1.0.0
+python-dotenv>=1.2.2
 
 # YouTube analytics (optional — only needed for live OAuth sync; CSV import works without these)
-google-api-python-client>=2.0.0
+google-api-python-client>=2.0.1
 google-auth-oauthlib>=1.0.0
diff --git a/backend/services/transcript_packer.py b/backend/services/transcript_packer.py
index 82f2532..f311496 100644
--- a/backend/services/transcript_packer.py
+++ b/backend/services/transcript_packer.py
@@ -82,12 +82,16 @@ def load_cached_transcript_for_video(video_path: str) -> dict[str, Any] | None:
     if os.path.exists(canonical):
         with open(canonical, encoding="utf-8") as f:
             return json.load(f)
-    legacy = legacy_md5_cache_path(video_path)
-    if os.path.exists(legacy):
-        with open(legacy, encoding="utf-8") as f:
-            data = json.load(f)
-        save_cached_transcript_for_video(video_path, data)
-        return data
+    # The legacy md5 cache predates the engine split and only ever held
+    # whisper-py output, so don't let a whisper.cpp run adopt it under its own
+    # namespace.
+    if _engine_cache_suffix() == "":
+        legacy = legacy_md5_cache_path(video_path)
+        if os.path.exists(legacy):
+            with open(legacy, encoding="utf-8") as f:
+                data = json.load(f)
+            save_cached_transcript_for_video(video_path, data)
+            return data
     return None
 
 
diff --git a/backend/services/transcription_whispercpp.py b/backend/services/transcription_whispercpp.py
index 6aa56a2..c236cfe 100644
--- a/backend/services/transcription_whispercpp.py
+++ b/backend/services/transcription_whispercpp.py
@@ -126,7 +126,10 @@ def _snap_words_to_voiced(words: list[dict], wav_path: str) -> list[dict]:
         if s < prev_end:
             s = prev_end
         if e <= s:
-            e = min(s + 0.05, hi)
+            # A word clamped to hi would otherwise collapse to zero length; a
+            # tiny overhang past the voiced pad is harmless, a zero-length
+            # caption event is not.
+            e = s + 0.05
         prev_end = e
         out.append({**w, "start": round(s, 3), "end": round(e, 3)})
     return out
diff --git a/cli/internal/provision/provision.go b/cli/internal/provision/provision.go
index 1dbc8aa..bb0fab9 100644
--- a/cli/internal/provision/provision.go
+++ b/cli/internal/provision/provision.go
@@ -311,6 +311,7 @@ func allowedReleaseHost(h string) bool {
 
 func releaseHTTPClient() *http.Client {
 	return &http.Client{
+		Transport: &http.Transport{ResponseHeaderTimeout: 30 * time.Second},
 		CheckRedirect: func(req *http.Request, via []*http.Request) error {
 			if len(via) >= 10 {
 				return fmt.Errorf("too many redirects")
diff --git a/cli/internal/provision/security_test.go b/cli/internal/provision/security_test.go
index 2a2d257..7a747c1 100644
--- a/cli/internal/provision/security_test.go
+++ b/cli/internal/provision/security_test.go
@@ -73,3 +73,24 @@ func TestExtractTarGzRejectsEscapingSymlink(t *testing.T) {
 		t.Fatal("expected extractTarGz to reject escaping symlink, got nil error")
 	}
 }
+
+func TestExtractTarGzRejectsEscapingHardlink(t *testing.T) {
+	var buf bytes.Buffer
+	gz := gzip.NewWriter(&buf)
+	tw := tar.NewWriter(gz)
+	if err := tw.WriteHeader(&tar.Header{Name: "evil", Typeflag: tar.TypeLink, Linkname: "../../../../etc/passwd", Mode: 0o644}); err != nil {
+		t.Fatal(err)
+	}
+	tw.Close()
+	gz.Close()
+
+	dir := t.TempDir()
+	archive := filepath.Join(dir, "evil.tar.gz")
+	if err := os.WriteFile(archive, buf.Bytes(), 0o644); err != nil {
+		t.Fatal(err)
+	}
+	dest := filepath.Join(dir, "out")
+	if err := extractTarGz(archive, dest); err == nil {
+		t.Fatal("expected extractTarGz to reject escaping hardlink, got nil error")
+	}
+}
diff --git a/cli/internal/update/update.go b/cli/internal/update/update.go
index efe330e..4a3fb7f 100644
--- a/cli/internal/update/update.go
+++ b/cli/internal/update/update.go
@@ -58,6 +58,7 @@ func allowedHost(h string) bool {
 
 func guardedClient() *http.Client {
 	return &http.Client{
+		Transport: &http.Transport{ResponseHeaderTimeout: 30 * time.Second},
 		CheckRedirect: func(req *http.Request, via []*http.Request) error {
 			if len(via) >= 10 {
 				return fmt.Errorf("too many redirects")
@@ -210,11 +211,20 @@ func swap(staged, dest string) error {
 	if runtime.GOOS == "windows" {
 		old := dest + ".old"
 		os.Remove(old)
+		moved := false
 		if _, err := os.Stat(dest); err == nil {
 			if err := os.Rename(dest, old); err != nil {
 				return err
 			}
+			moved = true
 		}
+		if err := os.Rename(staged, dest); err != nil {
+			if moved {
+				os.Rename(old, dest) // restore the original so the CLI isn't bricked
+			}
+			return err
+		}
+		return nil
 	}
 	return os.Rename(staged, dest)
 }
diff --git a/npm/scripts/install.js b/npm/scripts/install.js
index 14bd731..e67a0f4 100644
--- a/npm/scripts/install.js
+++ b/npm/scripts/install.js
@@ -31,7 +31,7 @@ const TARGETS = {
 function defaultHome() {
   const h = os.homedir();
   if (process.platform === 'darwin') return path.join(h, 'Library', 'Application Support', 'podcli');
-  if (process.platform === 'win32') return process.env.LOCALAPPDATA || path.join(h, 'AppData', 'Local', 'podcli');
+  if (process.platform === 'win32') return path.join(process.env.LOCALAPPDATA || path.join(h, 'AppData', 'Local'), 'podcli');
   return process.env.XDG_DATA_HOME ? path.join(process.env.XDG_DATA_HOME, 'podcli') : path.join(h, '.local', 'share', 'podcli');
 }
 
diff --git a/tests/parity/test_caption_parity.py b/tests/parity/test_caption_parity.py
index 8689874..3af7a9c 100644
--- a/tests/parity/test_caption_parity.py
+++ b/tests/parity/test_caption_parity.py
@@ -62,7 +62,8 @@ def _load_words():
 
 def _render(style: str) -> str:
     words = _load_words()
-    out = tempfile.mktemp(suffix=".ass")
+    fd, out = tempfile.mkstemp(suffix=".ass")
+    os.close(fd)
     try:
         render_captions(words, style, out, time_offset=0.0)
         with open(out, encoding="utf-8") as f:
diff --git a/tests/test_whispercpp_snap.py b/tests/test_whispercpp_snap.py
index bc5d9f2..9f09861 100644
--- a/tests/test_whispercpp_snap.py
+++ b/tests/test_whispercpp_snap.py
@@ -55,6 +55,9 @@ def test_trailing_word_pulled_out_of_silence(self):
         out = _snap_words_to_voiced(words, self.wav)
         self.assertEqual(out[0]["start"], 0.1)  # word over speech untouched
         self.assertLessEqual(out[1]["start"], 1.2)  # stranded word clamped back
+        # a word clamped to the upper bound must keep positive duration
+        for w in out:
+            self.assertGreater(w["end"], w["start"])
 
     def test_all_silence_leaves_words_unchanged(self):
         silent = tempfile.mktemp(suffix=".wav")

From 384a3d98b6d481c2fa5f03e132381f1893237624 Mon Sep 17 00:00:00 2001
From: Nika Siradze <nikushasiradzee@gmail.com>
Date: Tue, 16 Jun 2026 23:15:05 +0400
Subject: [PATCH 30/41] Make data global, render clips to the working dir,
 auto-select whisper.cpp

Path model: the brand brain (presets, knowledge, assets, history, config) and
the transcript cache now live in the global managed dir so they follow the user
across directories; only rendered clips are written to <cwd>/podcli-clips. The Go
launcher injects PODCLI_HOME/PODCLI_DATA (global), PODCLI_OUTPUT and PODCLI_CWD;
Python and the studio honor PODCLI_OUTPUT.
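
The output-directory precedence on the Python side can be sketched roughly as
follows. The function name and the env parameter are assumptions made to keep
the sketch testable; the real _resolve_output_dir also handles preset
subfolders, which are omitted here.

```python
import os


def resolve_base_output_dir(video_path, explicit=None, configured=None, env=None):
    """Pick the clip output dir: explicit flag, then configured value,
    then PODCLI_OUTPUT, then a clips/ folder next to the video."""
    env = os.environ if env is None else env
    return (
        explicit
        or configured
        or env.get("PODCLI_OUTPUT")
        or os.path.join(os.path.dirname(video_path), "clips")
    )


# With the launcher-injected variable set, clips land in the working dir.
print(resolve_base_output_dir("/videos/ep1.mp4", env={"PODCLI_OUTPUT": "/work/podcli-clips"}))
```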

Migration: running podcli in an old ./podcli folder imports that folder's brand
brain into the global home (via the bundle export/import so asset paths are
rewritten), moves the transcript cache, picks up ancient top-level presets/, and
copies .env. Non-destructive: it never overwrites a populated global home.
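
The non-destructive rule can be illustrated with a small decision function.
This is a sketch under assumptions: MANAGED_DIRS, has_content and
plan_home_migration are hypothetical names, and the real content check
(_has_managed_content) may look at different entries.

```python
import tempfile
from pathlib import Path

# Hypothetical set of managed subfolders, for illustration only.
MANAGED_DIRS = ("presets", "knowledge", "assets")


def has_content(home: Path) -> bool:
    """True if any managed subfolder exists and is non-empty."""
    return any(
        (home / name).is_dir() and any((home / name).iterdir())
        for name in MANAGED_DIRS
    )


def plan_home_migration(legacy: Path, home: Path) -> str:
    """Decide what to do; never import over a populated global home."""
    if legacy.resolve() == home.resolve() or not has_content(legacy):
        return "nothing-to-do"
    if has_content(home):
        return "skipped-existing"
    return "import"


with tempfile.TemporaryDirectory() as tmp:
    legacy = Path(tmp) / "project" / ".podcli"
    home = Path(tmp) / "global"
    (legacy / "presets").mkdir(parents=True)
    (legacy / "presets" / "default.json").write_text("{}")
    print(plan_home_migration(legacy, home))  # global home is empty, so import
```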

Studio: the library already reads the global clips.json; clips are now streamed
by history id from wherever they were rendered, with the recorded output_path as
the allowlist (symlink-resolved, regular-file + extension checked).
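
The allowlist check described above might look roughly like this. It is a
Python sketch of the idea only; the actual implementation is in
src/ui/web-server.ts and is not reproduced here. The history shape, the
resolve_clip name and the extension set are all assumptions.

```python
import os

# Hypothetical set of streamable extensions.
ALLOWED_EXTENSIONS = {".mp4", ".mov", ".webm"}


def resolve_clip(history, clip_id):
    """Return the real path for a history id, or None if it must not be served.

    Only a path recorded in history can be returned, so a caller cannot
    request arbitrary files by path.
    """
    entry = history.get(clip_id)
    if entry is None:
        return None
    real = os.path.realpath(entry["output_path"])  # follow symlinks first
    if not os.path.isfile(real):  # must be an existing regular file
        return None
    if os.path.splitext(real)[1].lower() not in ALLOWED_EXTENSIONS:
        return None
    return real
```

Checking the extension after symlink resolution means a link named clip.mp4
that points at a non-video file is rejected.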

Engine: native installs auto-use whisper.cpp instead of erroring. The launcher
selects it on a hermetic Python for the interactive menu, process, and studio;
the backend also falls back automatically when openai-whisper is absent and the
user did not explicitly request the whisper-py engine.

The old in-repo launcher prints an upgrade notice pointing at the native CLI.
---
 .gitignore                         |   2 +
 backend/cli.py                     |  30 ++++-
 backend/config/paths.py            |   3 +-
 backend/config_bundle.py           | 139 ++++++++++++++++---
 backend/services/transcription.py  |  38 +++++-
 cli/internal/engine/engine.go      |  70 ++++------
 cli/main.go                        |  20 ++-
 podcli                             |  19 +++
 src/config/paths.ts                |   3 +-
 src/ui/client/ClipDetail.tsx       |   5 +-
 src/ui/client/StudioHome.tsx       |   2 +-
 src/ui/web-server.ts               |  46 +++++++
 tests/test_config_bundle.py        | 209 ++++++++++++++++++++---------
 tests/test_transcription_engine.py |  67 +++++++++
 14 files changed, 502 insertions(+), 151 deletions(-)
 create mode 100644 tests/test_transcription_engine.py

diff --git a/.gitignore b/.gitignore
index 5992c77..2c5527a 100644
--- a/.gitignore
+++ b/.gitignore
@@ -46,3 +46,5 @@ uploads/
 output/
 cli/internal/backend/files/
 cli/internal/podstack/commands/
+cli/podcli
+podcli-clips/
diff --git a/backend/cli.py b/backend/cli.py
index 08d3f34..bb0a28c 100644
--- a/backend/cli.py
+++ b/backend/cli.py
@@ -153,17 +153,24 @@ def _auto_migrate_cli(args) -> None:
     summary = auto_migrate_legacy_if_pending(quiet=True)
     if not summary:
         return
+    home = summary.get("home_migration") or {}
+    imported_home = home.get("imported")
     moved_cache = summary.get("moved_json") or summary.get("moved_remotion_bundle")
     moved_presets = (summary.get("presets_migration") or {}).get("moved")
-    if moved_cache or moved_presets:
+    copied_env = (summary.get("env_migration") or {}).get("copied")
+    if imported_home or moved_cache or moved_presets or copied_env:
         gray = "\033[38;5;245m"
         green = "\033[38;2;74;222;128m"
         reset = "\033[0m"
+        if imported_home:
+            print(f"  {green}✓{reset} {gray}Imported your presets, knowledge & assets → {home.get('target_home')}{reset}")
         if moved_cache:
-            print(f"  {green}✓{reset} {gray}Migrated legacy cache → {summary.get('target_dir')}{reset}")
+            print(f"  {green}✓{reset} {gray}Migrated transcript cache → {summary.get('target_dir')}{reset}")
         if moved_presets:
             pm = summary.get("presets_migration") or {}
             print(f"  {green}✓{reset} {gray}Migrated presets → {pm.get('target_dir')}{reset}")
+        if copied_env:
+            print(f"  {green}✓{reset} {gray}Copied .env → {(summary.get('env_migration') or {}).get('target')}{reset}")
         print()
 
 
@@ -286,7 +293,11 @@ def _resolve_output_dir(
     if explicit_output_dir:
         return explicit_output_dir
 
-    base_output_dir = configured_output_dir or os.path.join(os.path.dirname(video_path), "clips")
+    base_output_dir = (
+        configured_output_dir
+        or os.environ.get("PODCLI_OUTPUT")
+        or os.path.join(os.path.dirname(video_path), "clips")
+    )
     if not preset_name:
         return base_output_dir
 
@@ -2619,6 +2630,15 @@ def _print_config_result(action: str, data: dict) -> None:
 
     if action == "migrate":
         print(f"\n  {bold}Legacy migration{reset}")
+        home_mig = data.get("home_migration") or {}
+        if home_mig.get("imported") or home_mig.get("skipped_existing"):
+            print(f"  {gray}brand brain (presets, knowledge, assets, history, config){reset}")
+            print(f"    {gray}from{reset}: {home_mig.get('legacy_home')}")
+            print(f"    {gray}to{reset}:   {home_mig.get('target_home')}")
+            if home_mig.get("skipped_existing"):
+                print(f"    {gray}skipped{reset}: global home already has data")
+            else:
+                print(f"    {gray}imported{reset}: yes")
         print(f"  {gray}cache{reset}")
         print(f"    {gray}from{reset}: {data.get('legacy_dir')}")
         print(f"    {gray}to{reset}:   {data.get('target_dir')}")
@@ -2637,6 +2657,10 @@ def _print_config_result(action: str, data: dict) -> None:
             print(f"    {gray}moved{reset}:    {presets.get('moved')}")
             if presets.get("skipped"):
                 print(f"    {gray}skipped{reset}:  {presets['skipped']} (already in target)")
+        env_mig = data.get("env_migration") or {}
+        if env_mig.get("copied"):
+            print(f"  {gray}.env{reset}")
+            print(f"    {gray}to{reset}:   {env_mig.get('target')}")
         if not data.get("dry_run"):
             print(f"\n  {green}✓{reset} Migration complete")
         print()
diff --git a/backend/config/paths.py b/backend/config/paths.py
index 66dd38a..2a603e3 100644
--- a/backend/config/paths.py
+++ b/backend/config/paths.py
@@ -44,6 +44,7 @@ def _build_paths() -> dict[str, str]:
     home = _resolve_home()
     project_root = _project_root()
     data_dir = Path(os.environ.get("PODCLI_DATA", str(project_root / "data"))).expanduser().resolve()
+    output_dir = Path(os.environ.get("PODCLI_OUTPUT", str(data_dir / "output"))).expanduser().resolve()
     return {
         "home": str(home),
         "project_root": str(project_root),
@@ -51,7 +52,7 @@ def _build_paths() -> dict[str, str]:
         "transcripts": str(data_dir / "cache" / "transcripts"),
         "packed": str(home / "packed"),
         "working": str(data_dir / "working"),
-        "output": str(data_dir / "output"),
+        "output": str(output_dir),
         "logs": str(data_dir / "logs"),
         "assets": str(home / "assets"),
         "assetsRegistry": str(home / "assets" / "registry.json"),
diff --git a/backend/config_bundle.py b/backend/config_bundle.py
index 612be04..a47dc63 100644
--- a/backend/config_bundle.py
+++ b/backend/config_bundle.py
@@ -4,6 +4,7 @@
 import os
 import re
 import shutil
+import tempfile
 import zipfile
 from datetime import datetime, timezone
 from pathlib import Path
@@ -60,12 +61,38 @@ def _migration_marker_path() -> Path:
     return _data_dir() / MIGRATION_MARKER_NAME
 
 
+# The old ./podcli kept everything project-local. The native CLI keeps the brand
+# brain (presets, knowledge, assets, history, config) and the transcript cache in
+# the global managed dir so they follow the user across directories; only clips
+# stay in the working dir. Migration therefore reads the *working directory* —
+# the old ./podcli folder the user is standing in — and imports it into the global
+# home/cache. PODCLI_CWD is injected by the Go launcher; getcwd() is the fallback.
+def _legacy_project_dir() -> Path:
+    return Path(os.environ.get("PODCLI_CWD") or os.getcwd()).expanduser().resolve()
+
+
+def _global_home() -> Path:
+    return Path(paths["home"]).resolve()
+
+
+def _legacy_home_dir() -> Path:
+    return _legacy_project_dir() / ".podcli"
+
+
 def _legacy_cache_dir() -> Path:
-    return Path(paths["project_root"]) / ".podcli" / "cache"
+    # Old layouts stored the transcript cache under <proj>/.podcli/cache (original)
+    # or <proj>/data/cache (interim project-local). Prefer whichever holds content.
+    proj = _legacy_project_dir()
+    for candidate in (proj / ".podcli" / "cache", proj / "data" / "cache"):
+        if candidate.is_dir() and any(candidate.iterdir()):
+            return candidate
+    return proj / ".podcli" / "cache"
 
 
 def _legacy_presets_dir() -> Path:
-    return Path(paths["project_root"]) / "presets"
+    # Ancient layout kept presets at <proj>/presets, outside .podcli. Presets
+    # inside .podcli/presets are covered by the brand-brain import instead.
+    return _legacy_project_dir() / "presets"
 
 
 def _legacy_presets_has_content() -> bool:
@@ -73,8 +100,28 @@ def _legacy_presets_has_content() -> bool:
     return legacy.is_dir() and any(legacy.glob("*.json"))
 
 
+def _legacy_env_file() -> Path:
+    return _legacy_project_dir() / ".env"
+
+
+def _legacy_env_pending() -> bool:
+    return _legacy_env_file().is_file() and not (_global_home() / ".env").exists()
+
+
+def _legacy_home_pending() -> bool:
+    legacy = _legacy_home_dir()
+    if legacy.resolve() == _global_home():
+        return False
+    return _has_managed_content(legacy) and not _has_managed_content(_global_home())
+
+
 def _legacy_migration_pending() -> bool:
-    return _legacy_cache_has_content() or _legacy_presets_has_content()
+    return (
+        _legacy_home_pending()
+        or _legacy_cache_has_content()
+        or _legacy_presets_has_content()
+        or _legacy_env_pending()
+    )
 
 
 def _read_json(path: Path) -> Any:
@@ -231,7 +278,7 @@ def migrate_legacy_cache(*, dry_run: bool = False) -> dict[str, Any]:
         "dry_run": dry_run,
     }
 
-    if not legacy.is_dir():
+    if not legacy.is_dir() or legacy.resolve() == target.resolve():
         return result
 
     target.mkdir(parents=True, exist_ok=True)
@@ -319,6 +366,54 @@ def migrate_legacy_presets(*, dry_run: bool = False) -> dict[str, Any]:
     return result
 
 
+def migrate_legacy_home(*, dry_run: bool = False) -> dict[str, Any]:
+    """Import a project-local .podcli brand brain (presets, knowledge, assets,
+    history, config) from the working dir into the global home. Only runs when the
+    global home is still empty so it never clobbers an existing global profile."""
+    legacy = _legacy_home_dir()
+    home = _global_home()
+    result: dict[str, Any] = {
+        "legacy_home": str(legacy),
+        "target_home": str(home),
+        "imported": False,
+        "dry_run": dry_run,
+    }
+    if legacy.resolve() == home or not _has_managed_content(legacy):
+        return result
+    if _has_managed_content(home):
+        result["skipped_existing"] = True
+        return result
+    if dry_run:
+        result["imported"] = True
+        return result
+    # Reuse export/import so asset files get archived and absolute path references
+    # inside presets/config are rewritten to the new global home.
+    with tempfile.TemporaryDirectory() as tmp:
+        bundle = os.path.join(tmp, "legacy-home.zip")
+        export_config(bundle, source_home=str(legacy))
+        import_config(bundle, target_home=str(home), activate=False)
+    result["imported"] = True
+    return result
+
+
+def migrate_legacy_env(*, dry_run: bool = False) -> dict[str, Any]:
+    src = _legacy_env_file()
+    dest = _global_home() / ".env"
+    result: dict[str, Any] = {
+        "source": str(src),
+        "target": str(dest),
+        "copied": False,
+        "dry_run": dry_run,
+    }
+    if not src.is_file() or dest.exists():
+        return result
+    if not dry_run:
+        dest.parent.mkdir(parents=True, exist_ok=True)
+        shutil.copy2(str(src), str(dest))
+    result["copied"] = True
+    return result
+
+
 def auto_migrate_legacy_if_pending(*, quiet: bool = True) -> dict[str, Any] | None:
     if not _legacy_migration_pending():
         return None
@@ -328,28 +423,31 @@ def auto_migrate_legacy_if_pending(*, quiet: bool = True) -> dict[str, Any] | No
 def ensure_legacy_migrated(*, quiet: bool = True) -> dict[str, Any]:
     marker = _migration_marker_path()
     had_marker = marker.exists()
+    home_summary = migrate_legacy_home(dry_run=False)
     summary = migrate_legacy_cache(dry_run=False)
     presets_summary = migrate_legacy_presets(dry_run=False)
+    env_summary = migrate_legacy_env(dry_run=False)
+    summary["home_migration"] = home_summary
     summary["presets_migration"] = presets_summary
+    summary["env_migration"] = env_summary
     try:
         from services.transcript_packer import migrate_transcript_cache_layout
 
         layout = migrate_transcript_cache_layout()
         summary["transcript_layout"] = layout
-        changed = bool(
-            summary.get("moved_json")
-            or summary.get("moved_remotion_bundle")
-            or summary.get("removed_duplicate_remotion_bundle")
-            or layout.get("moved_to_transcripts")
-            or presets_summary.get("moved")
-        )
+        layout_moved = layout.get("moved_to_transcripts")
     except Exception:
-        changed = bool(
-            summary.get("moved_json")
-            or summary.get("moved_remotion_bundle")
-            or summary.get("removed_duplicate_remotion_bundle")
-            or presets_summary.get("moved")
-        )
+        summary["transcript_layout"] = {"moved_to_transcripts": 0, "skipped": 0}
+        layout_moved = 0
+    changed = bool(
+        home_summary.get("imported")
+        or summary.get("moved_json")
+        or summary.get("moved_remotion_bundle")
+        or summary.get("removed_duplicate_remotion_bundle")
+        or layout_moved
+        or presets_summary.get("moved")
+        or env_summary.get("copied")
+    )
     if not had_marker or changed:
         marker.parent.mkdir(parents=True, exist_ok=True)
         marker.write_text(
@@ -539,8 +637,10 @@ def get_config_status() -> dict[str, Any]:
         "home": get_active_home(),
         "cache": paths["cache"],
         "profile_marker": paths["profileMarker"],
+        "legacy_home_pending": _legacy_home_pending(),
         "legacy_cache_pending": _legacy_cache_has_content(),
         "legacy_presets_pending": _legacy_presets_has_content(),
+        "legacy_env_pending": _legacy_env_pending(),
         "migration_marker": str(marker) if marker.exists() else None,
         "migration": migration_info,
     }
@@ -560,8 +660,9 @@ def run_config_action(
     if act == "migrate":
         if dry_run:
             cache = migrate_legacy_cache(dry_run=True)
-            presets = migrate_legacy_presets(dry_run=True)
-            cache["presets_migration"] = presets
+            cache["home_migration"] = migrate_legacy_home(dry_run=True)
+            cache["presets_migration"] = migrate_legacy_presets(dry_run=True)
+            cache["env_migration"] = migrate_legacy_env(dry_run=True)
             return cache
         return ensure_legacy_migrated(quiet=True)
     if act == "export":
diff --git a/backend/services/transcription.py b/backend/services/transcription.py
index 4d80528..6f35470 100644
--- a/backend/services/transcription.py
+++ b/backend/services/transcription.py
@@ -8,6 +8,7 @@
 """
 
 import os
+import shutil
 import subprocess
 import sys
 import tempfile
@@ -26,16 +27,38 @@ def _managed_home() -> str:
     return os.environ.get("XDG_DATA_HOME") or os.path.join(home, ".local", "share", "podcli")
 
 
+def _whispercpp_cli() -> Optional[str]:
+    """Resolve the whisper.cpp binary: explicit env, PATH, then the hermetic
+    runtime location the native installer provisions."""
+    cli = os.environ.get("PODCLI_WHISPER_CLI")
+    if cli and (os.path.exists(cli) or shutil.which(cli)):
+        return cli
+    found = shutil.which("whisper-cli") or shutil.which("whisper-cpp")
+    if found:
+        return found
+    exe = "whisper-cli.exe" if sys.platform == "win32" else "whisper-cli"
+    hermetic = os.path.join(_managed_home(), "runtime", "whisper", exe)
+    return hermetic if os.path.exists(hermetic) else None
+
+
+def _whispercpp_model(model_size: str) -> str:
+    return os.environ.get("PODCLI_WHISPERCPP_MODEL") or os.path.join(
+        _managed_home(), "models", f"ggml-{model_size}.bin"
+    )
+
+
+def _whispercpp_ready(model_size: str) -> bool:
+    return _whispercpp_cli() is not None and os.path.exists(_whispercpp_model(model_size))
+
+
 def _transcribe_with_whispercpp(file_path, model_size, language, progress_callback):
     from services import transcription_whispercpp as wcpp
 
     if progress_callback:
         progress_callback(10, "Transcribing with whisper.cpp...")
 
-    cli = os.environ.get("PODCLI_WHISPER_CLI", "whisper-cli")
-    model = os.environ.get("PODCLI_WHISPERCPP_MODEL") or os.path.join(
-        _managed_home(), "models", f"ggml-{model_size}.bin"
-    )
+    cli = _whispercpp_cli() or "whisper-cli"
+    model = _whispercpp_model(model_size)
     if not os.path.exists(model):
         raise FileNotFoundError(
             f"whisper.cpp model not found: {model}. "
@@ -81,7 +104,8 @@ def transcribe_file(
     if not os.path.exists(file_path):
         raise FileNotFoundError(f"File not found: {file_path}")
 
-    engine = os.environ.get("PODCLI_ENGINE", "whisper-py").strip().lower()
+    requested = os.environ.get("PODCLI_ENGINE", "").strip().lower()
+    engine = requested or "whisper-py"
     if engine in ("whispercpp", "whisper-cpp", "whisper.cpp", "cpp"):
         return _transcribe_with_whispercpp(file_path, model_size, language, progress_callback)
 
@@ -94,6 +118,10 @@ def transcribe_file(
     try:
         import whisper
     except ImportError as e:
+        # Native installs ship whisper.cpp, not openai-whisper. Fall back to it
+        # automatically unless the user explicitly asked for the whisper-py engine.
+        if not requested and _whispercpp_ready(model_size):
+            return _transcribe_with_whispercpp(file_path, model_size, language, progress_callback)
         raise RuntimeError(
             "The whisper-py engine needs the full source install (openai-whisper + torch). "
             "This native install ships whisper.cpp — rerun with --engine whispercpp."
diff --git a/cli/internal/engine/engine.go b/cli/internal/engine/engine.go
index f1e49d6..82cae60 100644
--- a/cli/internal/engine/engine.go
+++ b/cli/internal/engine/engine.go
@@ -166,17 +166,7 @@ func nodeEnv() []string {
 	if fp := FFprobe(); fp != "" {
 		env = append(env, "FFPROBE_PATH="+fp)
 	}
-	if proj, ok := ProjectDir(); ok {
-		if os.Getenv("PODCLI_HOME") == "" {
-			env = append(env, "PODCLI_HOME="+filepath.Join(proj, ".podcli"))
-		}
-		if os.Getenv("PODCLI_DATA") == "" {
-			env = append(env, "PODCLI_DATA="+filepath.Join(proj, "data"))
-		}
-		if os.Getenv("PODCLI_ENV_FILE") == "" {
-			env = append(env, "PODCLI_ENV_FILE="+filepath.Join(proj, ".env"))
-		}
-	}
+	env = append(env, DataEnv()...)
 	return env
 }
 
@@ -199,32 +189,32 @@ func RunMCP() (int, error) {
 	return 0, nil
 }
 
-// ProjectDir resolves the user's project root: the nearest ancestor of the
-// working directory holding a .podcli dir or .podcli-home marker, else the
-// working directory itself. This keeps episode data, presets, and .env
-// project-local — the behavior of the old in-repo launcher — now that the
-// backend lives in the global runtime dir instead of beside the data.
-// ProjectDir resolves where a run's data lives: the nearest ancestor of the
-// working directory holding a .podcli/.podcli-home marker, else the working
-// directory itself. Data is always project-local — never the global runtime dir,
-// where setup/self-update would overwrite it. The bool is false only when the
-// working directory can't be determined.
-func ProjectDir() (string, bool) {
-	cwd, err := os.Getwd()
-	if err != nil {
-		return "", false
+// DataEnv tells the backend where data lives. The brand brain (presets,
+// knowledge, assets, history, config) and the transcript cache live in the
+// global managed dir so they follow the user across directories; only rendered
+// clips go under the working directory (PODCLI_OUTPUT). Explicit env always wins
+// so power users and tests can override any single path.
+func DataEnv() []string {
+	var env []string
+	home := paths.Home()
+	if os.Getenv("PODCLI_HOME") == "" {
+		env = append(env, "PODCLI_HOME="+home)
+	}
+	if os.Getenv("PODCLI_DATA") == "" {
+		env = append(env, "PODCLI_DATA="+filepath.Join(home, "data"))
+	}
+	if os.Getenv("PODCLI_ENV_FILE") == "" {
+		env = append(env, "PODCLI_ENV_FILE="+filepath.Join(home, ".env"))
 	}
-	for dir := cwd; ; {
-		if exists(filepath.Join(dir, ".podcli")) || exists(filepath.Join(dir, ".podcli-home")) {
-			return dir, true
+	if cwd, err := os.Getwd(); err == nil {
+		if os.Getenv("PODCLI_CWD") == "" {
+			env = append(env, "PODCLI_CWD="+cwd)
 		}
-		parent := filepath.Dir(dir)
-		if parent == dir {
-			break
+		if os.Getenv("PODCLI_OUTPUT") == "" {
+			env = append(env, "PODCLI_OUTPUT="+filepath.Join(cwd, "podcli-clips"))
 		}
-		dir = parent
 	}
-	return cwd, true
+	return env
 }
 
 func Run(args []string) (int, error) {
@@ -258,19 +248,7 @@ func Run(args []string) (int, error) {
 	if ss := StudioServer(); ss != "" {
 		env = append(env, "PODCLI_STUDIO="+filepath.Dir(ss))
 	}
-	// Pin data + .env to the user's project dir so the global runtime backend
-	// doesn't strand project-local episodes/presets. Explicit env wins.
-	if proj, ok := ProjectDir(); ok {
-		if os.Getenv("PODCLI_HOME") == "" {
-			env = append(env, "PODCLI_HOME="+filepath.Join(proj, ".podcli"))
-		}
-		if os.Getenv("PODCLI_DATA") == "" {
-			env = append(env, "PODCLI_DATA="+filepath.Join(proj, "data"))
-		}
-		if os.Getenv("PODCLI_ENV_FILE") == "" {
-			env = append(env, "PODCLI_ENV_FILE="+filepath.Join(proj, ".env"))
-		}
-	}
+	env = append(env, DataEnv()...)
 	cmd.Env = env
 
 	err := cmd.Run()
diff --git a/cli/main.go b/cli/main.go
index 4c1a21b..9f9e048 100644
--- a/cli/main.go
+++ b/cli/main.go
@@ -110,11 +110,18 @@ func configCmd(args []string) int {
 	return 0
 }
 
-// transcribeEngine resolves which engine a process/studio run will use, honoring
-// --engine, PODCLI_ENGINE, then defaulting to whisper.cpp on a hermetic Python
-// (which has no openai-whisper).
+// transcribeEngine resolves which engine a run will use, honoring --engine,
+// PODCLI_ENGINE, then defaulting to whisper.cpp on a hermetic Python (which has
+// no openai-whisper). Covers every entry point that can transcribe: the no-arg
+// interactive menu, process, studio, and transcribe.
 func transcribeEngine(args []string) string {
-	if len(args) == 0 || (args[0] != "process" && args[0] != "studio") {
+	cmd := ""
+	if len(args) > 0 {
+		cmd = args[0]
+	}
+	switch cmd {
+	case "", "process", "studio", "transcribe":
+	default:
 		return ""
 	}
 	sel := strings.ToLower(os.Getenv("PODCLI_ENGINE"))
@@ -275,8 +282,9 @@ func doctor() {
 	fmt.Printf("  home:     %s\n", paths.Home())
 	fmt.Printf("  runtime:  %s\n", paths.RuntimeDir())
 	fmt.Printf("  models:   %s\n", paths.ModelsDir())
-	if proj, ok := engine.ProjectDir(); ok {
-		fmt.Printf("  project:  %s  (episodes, presets, .env resolve here)\n", proj)
+	fmt.Printf("  data:     %s  (presets, knowledge, assets, history, cache; global, shared across directories)\n", paths.Home())
+	if cwd, err := os.Getwd(); err == nil {
+		fmt.Printf("  clips:    %s  (rendered into your working directory)\n", filepath.Join(cwd, "podcli-clips"))
 	}
 	fmt.Println("\nEngine resolution")
 	if root, ok := engine.BackendRoot(); ok {
diff --git a/podcli b/podcli
index 7eb8b18..829336f 100755
--- a/podcli
+++ b/podcli
@@ -17,6 +17,25 @@ else
   PYTHON="python3"
 fi
 
+# ── Native CLI upgrade notice ──
+# This in-repo bash launcher is superseded by the native `podcli` binary. Running
+# the native CLI in this same directory auto-migrates this project's data on first
+# run. Printed to stderr so command stdout (JSON, pipes) stays clean.
+{
+  U_ACCENT="\033[38;2;212;135;74m"
+  U_BOLD="\033[1m"
+  U_DIM="\033[2m"
+  U_RESET="\033[0m"
+  NATIVE_BIN="$(command -v podcli 2>/dev/null)"
+  echo -e "  ${U_BOLD}podcli now ships as a native CLI.${U_RESET} ${U_DIM}You're running the old in-repo launcher.${U_RESET}" >&2
+  if [ -n "$NATIVE_BIN" ] && [ "$NATIVE_BIN" != "$0" ] && [ "$NATIVE_BIN" != "$SCRIPT_DIR/podcli" ]; then
+    echo -e "  Run ${U_ACCENT}podcli${U_RESET} in this folder instead — it auto-migrates this project's data on first run." >&2
+  else
+    echo -e "  Install it with ${U_ACCENT}npm i -g podcli${U_RESET}, then run ${U_ACCENT}podcli${U_RESET} here. It auto-migrates this project's data on first run." >&2
+  fi
+  echo "" >&2
+}
+
 # ── PodStack slash commands → launch an AI agent ──
 CMD="$1"
 # Strip leading / or -- prefix (user might try /prep-episode, --prep-episode, or prep-episode)
diff --git a/src/config/paths.ts b/src/config/paths.ts
index c0f4d42..25fac3e 100644
--- a/src/config/paths.ts
+++ b/src/config/paths.ts
@@ -7,6 +7,7 @@ const __dirname = dirname(fileURLToPath(import.meta.url));
 const projectRoot = resolve(__dirname, "..", "..");
 const homeMarker = join(projectRoot, ".podcli-home");
 const dataDir = resolve(process.env.PODCLI_DATA || join(projectRoot, "data"));
+const outputDir = resolve(process.env.PODCLI_OUTPUT || join(dataDir, "output"));
 
 function resolveHome(): string {
   if (process.env.PODCLI_HOME) {
@@ -49,7 +50,7 @@ export const paths = {
   transcripts: join(dataDir, "cache", "transcripts"),
   packed: join(home, "packed"),
   working: join(dataDir, "working"),
-  output: join(dataDir, "output"),
+  output: outputDir,
   logs: join(dataDir, "logs"),
   assets: join(home, "assets"),
   assetsRegistry: join(home, "assets", "registry.json"),
diff --git a/src/ui/client/ClipDetail.tsx b/src/ui/client/ClipDetail.tsx
index f425a50..b4ed041 100644
--- a/src/ui/client/ClipDetail.tsx
+++ b/src/ui/client/ClipDetail.tsx
@@ -83,8 +83,7 @@ export default function ClipDetail() {
 
   const tc = clip.thumbnail_config || {};
   const dirty = title !== clip.title || captionStyle !== clip.caption_style;
-  const file = basename(clip.output_path);
-  const previewUrl = `/api/preview/${file}?t=${bust}`;
+  const previewUrl = `/api/clips/${clip.id}/preview?t=${bust}`;
   const source = thumbImage ? `Image · ${basename(thumbImage)}` : thumbTimestamp != null ? `Frame @ ${fmt(thumbTimestamp)}` : "Auto";
 
   const patch = (body: any) => api(`/clips/${clip.id}`, { method: "PATCH", body: JSON.stringify(body) });
@@ -167,7 +166,7 @@ export default function ClipDetail() {
         <div style={{ display: "flex", justifyContent: "space-between", alignItems: "flex-end", gap: 16, marginTop: 8, flexWrap: "wrap" }}>
           <h1 style={{ margin: 0 }}>{clip.title}</h1>
           <div className="set-actions">
-            <a className="btn btn-ghost btn-sm" href={`/api/download/${file}`} download style={{ textDecoration: "none" }}>Download</a>
+            <a className="btn btn-ghost btn-sm" href={`/api/clips/${clip.id}/download`} download style={{ textDecoration: "none" }}>Download</a>
             <button className="btn btn-ghost btn-sm" onClick={reopen} disabled={busy !== null}>{busy === "reopen" ? <div className="spinner sm" /> : "Reopen in editor"}</button>
             {davinciOn && <button className="btn btn-ghost btn-sm" onClick={exportDavinci} disabled={busy !== null}>{busy === "davinci" ? <div className="spinner sm" /> : "Export for DaVinci"}</button>}
             <button className="btn btn-danger btn-sm" onClick={del} disabled={busy !== null}>{busy === "delete" ? <div className="spinner sm" /> : "Delete"}</button>
diff --git a/src/ui/client/StudioHome.tsx b/src/ui/client/StudioHome.tsx
index 7182ff0..b386b52 100644
--- a/src/ui/client/StudioHome.tsx
+++ b/src/ui/client/StudioHome.tsx
@@ -130,7 +130,7 @@ export default function StudioHome() {
                       {thumb ? (
                         <img className="clip-card-media" src={`/api/image?path=${encodeURIComponent(thumb)}`} alt="" />
                       ) : file ? (
-                        <video className="clip-card-media" src={`/api/preview/${file}#t=0.1`} muted preload="metadata" playsInline />
+                        <video className="clip-card-media" src={`/api/clips/${c.id}/preview#t=0.1`} muted preload="metadata" playsInline />
                       ) : (
                         <div className="clip-card-media empty">▶</div>
                       )}
diff --git a/src/ui/web-server.ts b/src/ui/web-server.ts
index 36b1212..7b77714 100644
--- a/src/ui/web-server.ts
+++ b/src/ui/web-server.ts
@@ -995,6 +995,52 @@ app.get("/api/preview/:filename", (req, res) => {
   streamVideo(req, res, filePath);
 });
 
+// Clips now render into the user's working dir, not a single output root, so the
+// library streams them by history id from wherever they live. The output_path
+// recorded in clips.json IS the allowlist: only files podcli itself logged are
+// servable, and only as regular files (symlinks resolved, extension checked).
+async function serveClipById(
+  req: Request,
+  res: Response,
+  id: string,
+  mode: "preview" | "download",
+) {
+  const entry = await clipsHistory.findById(id);
+  if (!entry || !entry.output_path) {
+    res.status(404).json({ error: "Clip not found" });
+    return;
+  }
+  let real: string;
+  try {
+    real = realpathSync(entry.output_path);
+  } catch {
+    res.status(404).json({ error: "File no longer exists" });
+    return;
+  }
+  if (!statSync(real).isFile() || !/\.(mp4|mov|mkv|webm)$/i.test(real)) {
+    res.status(400).json({ error: "Unsupported clip file" });
+    return;
+  }
+  if (mode === "download") {
+    res.download(real);
+    return;
+  }
+  const mimeTypes: Record<string, string> = {
+    ".webm": "video/webm",
+    ".mov": "video/quicktime",
+    ".mkv": "video/x-matroska",
+  };
+  streamVideo(req, res, real, mimeTypes[extname(real).toLowerCase()] || "video/mp4");
+}
+
+app.get("/api/clips/:id/preview", (req, res, next) => {
+  serveClipById(req, res, req.params.id, "preview").catch(next);
+});
+
+app.get("/api/clips/:id/download", (req, res, next) => {
+  serveClipById(req, res, req.params.id, "download").catch(next);
+});
+
 /**
  * GET /api/stream-source — Stream the source video for in-browser preview
  * Accepts ?path= query param (must be a file previously validated via /select-file or /upload)
diff --git a/tests/test_config_bundle.py b/tests/test_config_bundle.py
index 791b9e3..e4e7b7f 100644
--- a/tests/test_config_bundle.py
+++ b/tests/test_config_bundle.py
@@ -106,96 +106,173 @@ def boom(self, path):
 
         self.assertTrue(os.path.exists(keep_file))
 
-    def test_auto_migrate_skips_when_no_legacy_cache(self):
-        self.assertIsNone(auto_migrate_legacy_if_pending(quiet=True))
+    # The native CLI keeps the brand brain + cache global; migration reads the
+    # working dir (PODCLI_CWD) and imports it into the global home/cache.
+    def _enter_migration(self, proj_root, global_home):
+        import config.paths as paths_mod
 
-    def test_migrate_legacy_presets(self):
-        import shutil
+        self._paths_mod = paths_mod
+        self._saved_paths = {k: paths_mod.paths.get(k) for k in ("home", "cache", "project_root")}
+        self._saved_cwd_env = os.environ.get("PODCLI_CWD")
+        paths_mod.paths["home"] = global_home
+        paths_mod.paths["cache"] = os.path.join(global_home, "data", "cache")
+        # The backend install dir must be ignored by migration; point it elsewhere.
+        paths_mod.paths["project_root"] = self.src_home
+        os.environ["PODCLI_CWD"] = proj_root
+
+    def _exit_migration(self):
+        for key, value in self._saved_paths.items():
+            if value is None:
+                self._paths_mod.paths.pop(key, None)
+            else:
+                self._paths_mod.paths[key] = value
+        if self._saved_cwd_env is None:
+            os.environ.pop("PODCLI_CWD", None)
+        else:
+            os.environ["PODCLI_CWD"] = self._saved_cwd_env
+
+    def test_auto_migrate_skips_when_no_legacy(self):
+        empty_proj = os.path.join(os.path.dirname(self.bundle), "empty-proj")
+        os.makedirs(empty_proj, exist_ok=True)
+        self._enter_migration(empty_proj, self.dst_home)
+        try:
+            self.assertIsNone(auto_migrate_legacy_if_pending(quiet=True))
+        finally:
+            self._exit_migration()
 
-        import config.paths as paths_mod
-        from config_bundle import migrate_legacy_presets
+    def test_migrate_legacy_cache(self):
+        from config_bundle import migrate_legacy_cache as mlc
 
-        legacy_root = os.path.join(os.path.dirname(self.bundle), "legacy-presets-root")
-        legacy_presets = os.path.join(legacy_root, "presets")
+        proj = os.path.join(os.path.dirname(self.bundle), "legacy-project")
+        legacy = os.path.join(proj, ".podcli", "cache")
+        os.makedirs(legacy, exist_ok=True)
+        with open(os.path.join(legacy, "abc123.json"), "w", encoding="utf-8") as f:
+            json.dump({"words": []}, f)
+
+        self._enter_migration(proj, self.dst_home)
+        try:
+            summary = mlc(dry_run=False)
+            target = os.path.join(self.dst_home, "data", "cache")
+            self.assertEqual(summary["moved_json"], 1)
+            self.assertTrue(os.path.exists(os.path.join(target, "abc123.json")))
+            self.assertFalse(os.path.exists(os.path.join(legacy, "abc123.json")))
+        finally:
+            self._exit_migration()
+
+    def test_status_reports_legacy_pending_without_migrating(self):
+        proj = os.path.join(os.path.dirname(self.bundle), "legacy-status")
+        legacy = os.path.join(proj, ".podcli", "cache")
+        os.makedirs(legacy, exist_ok=True)
+        stay = os.path.join(legacy, "stay.json")
+        with open(stay, "w", encoding="utf-8") as f:
+            json.dump({"words": []}, f)
+
+        self._enter_migration(proj, self.dst_home)
+        try:
+            status = run_config_action("status")
+            self.assertTrue(status.get("legacy_cache_pending"))
+            self.assertTrue(os.path.exists(stay))
+        finally:
+            self._exit_migration()
+
+    def test_migrate_legacy_top_presets(self):
+        from config_bundle import migrate_legacy_presets as mlp
+
+        proj = os.path.join(os.path.dirname(self.bundle), "legacy-presets-root")
+        legacy_presets = os.path.join(proj, "presets")
         os.makedirs(legacy_presets, exist_ok=True)
         with open(os.path.join(legacy_presets, "myshow.json"), "w", encoding="utf-8") as f:
             json.dump({"caption_style": "branded"}, f)
 
-        old_root = paths_mod.paths["project_root"]
-        old_home = paths_mod.paths["home"]
+        self._enter_migration(proj, self.dst_home)
         try:
-            paths_mod.paths["project_root"] = legacy_root
-            paths_mod.paths["home"] = self.dst_home
+            summary = mlp(dry_run=False)
             target = os.path.join(self.dst_home, "presets")
-            os.makedirs(target, exist_ok=True)
-
-            summary = migrate_legacy_presets(dry_run=False)
             self.assertEqual(summary["moved"], 1)
             self.assertTrue(os.path.exists(os.path.join(target, "myshow.json")))
-            self.assertFalse(os.path.exists(os.path.join(legacy_presets, "myshow.json")))
         finally:
-            paths_mod.paths["project_root"] = old_root
-            paths_mod.paths["home"] = old_home
-            shutil.rmtree(legacy_root, ignore_errors=True)
+            self._exit_migration()
 
-    def test_status_does_not_migrate_legacy_cache(self):
-        import importlib
-        import config.paths as paths_mod
-        import config_bundle
+    def test_migrate_legacy_home_imports_brand_brain(self):
+        from config_bundle import migrate_legacy_home as mlh
 
-        legacy_root = os.path.join(os.path.dirname(self.bundle), "legacy-status")
-        legacy = os.path.join(legacy_root, ".podcli", "cache")
-        os.makedirs(legacy, exist_ok=True)
-        legacy_file = os.path.join(legacy, "stay.json")
-        with open(legacy_file, "w", encoding="utf-8") as f:
-            json.dump({"words": []}, f)
+        proj = os.path.join(os.path.dirname(self.bundle), "legacy-home")
+        os.makedirs(os.path.join(proj, ".podcli", "presets"), exist_ok=True)
+        os.makedirs(os.path.join(proj, ".podcli", "knowledge"), exist_ok=True)
+        with open(os.path.join(proj, ".podcli", "presets", "show.json"), "w", encoding="utf-8") as f:
+            json.dump({"caption_style": "branded"}, f)
+        with open(os.path.join(proj, ".podcli", "knowledge", "01-brand.md"), "w", encoding="utf-8") as f:
+            f.write("# Brand\n")
 
-        old_root = paths_mod.paths["project_root"]
+        empty_global = os.path.join(os.path.dirname(self.bundle), "empty-global")
+        os.makedirs(empty_global, exist_ok=True)
+        self._enter_migration(proj, empty_global)
         try:
-            paths_mod.paths["project_root"] = legacy_root
-            config_bundle.paths["project_root"] = legacy_root
-            status = run_config_action("status")
-            self.assertTrue(status.get("legacy_cache_pending"))
-            self.assertTrue(os.path.exists(legacy_file))
+            summary = mlh(dry_run=False)
+            self.assertTrue(summary["imported"])
+            self.assertTrue(os.path.exists(os.path.join(empty_global, "presets", "show.json")))
+            self.assertTrue(os.path.exists(os.path.join(empty_global, "knowledge", "01-brand.md")))
         finally:
-            paths_mod.paths["project_root"] = old_root
-            config_bundle.paths["project_root"] = old_root
-            import shutil
-            shutil.rmtree(legacy_root, ignore_errors=True)
+            self._exit_migration()
 
-    def test_migrate_legacy_cache(self):
-        import shutil
+    def test_migrate_legacy_home_skips_populated_global(self):
+        from config_bundle import migrate_legacy_home as mlh
 
-        import config.paths as paths_mod
-        import config_bundle
+        proj = os.path.join(os.path.dirname(self.bundle), "legacy-home-skip")
+        os.makedirs(os.path.join(proj, ".podcli", "presets"), exist_ok=True)
+        with open(os.path.join(proj, ".podcli", "presets", "show.json"), "w", encoding="utf-8") as f:
+            json.dump({"a": 1}, f)
 
-        legacy_root = os.path.join(os.path.dirname(self.bundle), "legacy-project")
-        legacy = os.path.join(legacy_root, ".podcli", "cache")
-        os.makedirs(legacy, exist_ok=True)
-        legacy_file = os.path.join(legacy, "abc123.json")
-        with open(legacy_file, "w", encoding="utf-8") as f:
+        # A global home that already holds managed content must not be clobbered.
+        os.makedirs(os.path.join(self.dst_home, "presets"), exist_ok=True)
+        with open(os.path.join(self.dst_home, "presets", "existing.json"), "w", encoding="utf-8") as f:
+            json.dump({"keep": 1}, f)
+        self._enter_migration(proj, self.dst_home)
+        try:
+            summary = mlh(dry_run=False)
+            self.assertFalse(summary.get("imported"))
+            self.assertTrue(summary.get("skipped_existing"))
+            self.assertFalse(os.path.exists(os.path.join(self.dst_home, "presets", "show.json")))
+        finally:
+            self._exit_migration()
+
+    def test_auto_migrate_from_old_project_dir(self):
+        # Running `podcli` in an old ./podcli folder imports its brand brain, cache,
+        # ancient top-level presets, and .env into the GLOBAL store. PODCLI_CWD is
+        # the folder the user runs in; paths["home"]/["cache"] are the global targets.
+        proj = os.path.join(os.path.dirname(self.bundle), "old-project")
+        os.makedirs(os.path.join(proj, ".podcli", "presets"), exist_ok=True)
+        os.makedirs(os.path.join(proj, ".podcli", "knowledge"), exist_ok=True)
+        os.makedirs(os.path.join(proj, ".podcli", "cache"), exist_ok=True)
+        os.makedirs(os.path.join(proj, "presets"), exist_ok=True)
+        with open(os.path.join(proj, ".podcli", "presets", "show.json"), "w", encoding="utf-8") as f:
+            json.dump({"caption_style": "branded"}, f)
+        with open(os.path.join(proj, ".podcli", "knowledge", "01-brand.md"), "w", encoding="utf-8") as f:
+            f.write("# Brand\n")
+        with open(os.path.join(proj, ".podcli", "cache", "ep1.json"), "w", encoding="utf-8") as f:
             json.dump({"words": []}, f)
-
-        old_root = paths_mod.paths["project_root"]
-        old_cache = paths_mod.paths["cache"]
+        with open(os.path.join(proj, "presets", "ancient.json"), "w", encoding="utf-8") as f:
+            json.dump({"old": 1}, f)
+        with open(os.path.join(proj, ".env"), "w", encoding="utf-8") as f:
+            f.write("OPENAI_API_KEY=sk-test\n")
+
+        empty_global = os.path.join(os.path.dirname(self.bundle), "empty-global2")
+        os.makedirs(empty_global, exist_ok=True)
+        self._enter_migration(proj, empty_global)
         try:
-            paths_mod.paths["project_root"] = legacy_root
-            paths_mod.paths["cache"] = os.path.join(self.dst_home, "cache")
-            config_bundle.paths["project_root"] = legacy_root
-            config_bundle.paths["cache"] = paths_mod.paths["cache"]
-            target = paths_mod.paths["cache"]
-            os.makedirs(target, exist_ok=True)
-
-            summary = migrate_legacy_cache(dry_run=False)
+            summary = auto_migrate_legacy_if_pending(quiet=True)
+            self.assertIsNotNone(summary)
+            self.assertTrue(summary["home_migration"]["imported"])
             self.assertEqual(summary["moved_json"], 1)
-            self.assertTrue(os.path.exists(os.path.join(target, "abc123.json")))
-            self.assertFalse(os.path.exists(legacy_file))
+            self.assertEqual(summary["presets_migration"]["moved"], 1)
+            self.assertTrue(summary["env_migration"]["copied"])
+            self.assertTrue(os.path.exists(os.path.join(empty_global, "presets", "show.json")))
+            self.assertTrue(os.path.exists(os.path.join(empty_global, "presets", "ancient.json")))
+            self.assertTrue(os.path.exists(os.path.join(empty_global, "knowledge", "01-brand.md")))
+            self.assertTrue(os.path.exists(os.path.join(empty_global, "data", "cache", "ep1.json")))
+            self.assertTrue(os.path.exists(os.path.join(empty_global, ".env")))
         finally:
-            paths_mod.paths["project_root"] = old_root
-            paths_mod.paths["cache"] = old_cache
-            config_bundle.paths["project_root"] = old_root
-            config_bundle.paths["cache"] = old_cache
-            shutil.rmtree(legacy_root, ignore_errors=True)
+            self._exit_migration()
 
     def test_safe_extract_rejects_zip_slip(self):
         import zipfile
diff --git a/tests/test_transcription_engine.py b/tests/test_transcription_engine.py
new file mode 100644
index 0000000..d6c03db
--- /dev/null
+++ b/tests/test_transcription_engine.py
@@ -0,0 +1,67 @@
+"""Engine selection: native installs auto-use whisper.cpp when openai-whisper
+is absent, unless the user explicitly asked for the whisper-py engine."""
+
+import os
+import sys
+import tempfile
+import unittest
+
+ROOT = os.path.abspath(os.path.join(os.path.dirname(__file__), ".."))
+BACKEND_ROOT = os.path.join(ROOT, "backend")
+if BACKEND_ROOT not in sys.path:
+    sys.path.insert(0, BACKEND_ROOT)
+
+import services.transcription as tr
+
+
+class TranscriptionEngineTests(unittest.TestCase):
+    def setUp(self):
+        self._orig_wcpp = tr._transcribe_with_whispercpp
+        self._orig_ready = tr._whispercpp_ready
+        tr._transcribe_with_whispercpp = lambda *a, **k: {"engine": "whispercpp"}
+        tr._whispercpp_ready = lambda size: True
+        # Make `import whisper` fail to simulate a native (hermetic) install.
+        self._had_whisper = sys.modules.get("whisper", "__absent__")
+        sys.modules["whisper"] = None
+        self._tmp = tempfile.NamedTemporaryFile(suffix=".wav", delete=False)
+        self._tmp.write(b"x")
+        self._tmp.close()
+        self._saved_engine = os.environ.pop("PODCLI_ENGINE", None)
+
+    def tearDown(self):
+        tr._transcribe_with_whispercpp = self._orig_wcpp
+        tr._whispercpp_ready = self._orig_ready
+        if self._had_whisper == "__absent__":
+            sys.modules.pop("whisper", None)
+        else:
+            sys.modules["whisper"] = self._had_whisper
+        os.unlink(self._tmp.name)
+        if self._saved_engine is None:
+            os.environ.pop("PODCLI_ENGINE", None)
+        else:
+            os.environ["PODCLI_ENGINE"] = self._saved_engine
+
+    def test_auto_falls_back_to_whispercpp(self):
+        os.environ.pop("PODCLI_ENGINE", None)
+        result = tr.transcribe_file(self._tmp.name, model_size="base", enable_diarization=False)
+        self.assertEqual(result["engine"], "whispercpp")
+
+    def test_explicit_whispercpp_uses_it(self):
+        os.environ["PODCLI_ENGINE"] = "whispercpp"
+        result = tr.transcribe_file(self._tmp.name, model_size="base", enable_diarization=False)
+        self.assertEqual(result["engine"], "whispercpp")
+
+    def test_explicit_whisper_py_still_errors(self):
+        os.environ["PODCLI_ENGINE"] = "whisper-py"
+        with self.assertRaises(RuntimeError):
+            tr.transcribe_file(self._tmp.name, model_size="base", enable_diarization=False)
+
+    def test_no_fallback_when_whispercpp_unavailable(self):
+        os.environ.pop("PODCLI_ENGINE", None)
+        tr._whispercpp_ready = lambda size: False
+        with self.assertRaises(RuntimeError):
+            tr.transcribe_file(self._tmp.name, model_size="base", enable_diarization=False)
+
+
+if __name__ == "__main__":
+    unittest.main()

From 1e225080749e3ac91af2436af2433fda544ffa23 Mon Sep 17 00:00:00 2001
From: Nika Siradze <nikushasiradzee@gmail.com>
Date: Tue, 16 Jun 2026 23:30:59 +0400
Subject: [PATCH 31/41] =?UTF-8?q?Add=20"find=20a=20specific=20moment"=20?=
 =?UTF-8?q?=E2=80=94=20paste=20text,=20AI=20locates=20it=20in=20the=20tran?=
 =?UTF-8?q?script?=
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Studio gets a text area to paste a quote or describe moments; the backend uses
the local AI CLI to locate them in the transcript and appends the matches to the
clip suggestions (deduped against existing ones). This mirrors the CLI's
interactive "find a specific moment" flow.

The moment-finding logic moves into claude_suggest.find_moments_from_text (the
CLI now delegates to it), exposed as a find_moment backend task and a
POST /api/find-moment route. Status goes to the progress channel and warnings go
to stderr, so the task runner's stdout JSON-RPC stays clean.
---
 backend/cli.py                     | 156 +----------------------------
 backend/main.py                    |  27 +++++
 backend/services/claude_suggest.py | 156 +++++++++++++++++++++++++++++
 src/models/index.ts                |   2 +-
 src/ui/client/EpisodeWorkspace.jsx |  48 +++++++++
 src/ui/web-server.ts               |  71 +++++++++++++
 tests/test_find_moments.py         |  73 ++++++++++++++
 7 files changed, 379 insertions(+), 154 deletions(-)
 create mode 100644 tests/test_find_moments.py
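
Reviewer note (not part of the commit): the response handling in
find_moments_from_text tolerates AI output that wraps the JSON in a code fence
or surrounds it with prose. Below is a minimal standalone sketch of that
parsing step, assuming the same fence regex and raw_decode approach as the
patch. The helper name extract_json_object and the FENCE constant are
hypothetical and exist only for this illustration.

```python
import json
import re

# Three backticks, built indirectly so this note stays a single fenced block.
FENCE = "`" * 3


def extract_json_object(response: str):
    """Return the first JSON object in an AI reply (hypothetical helper)."""
    text = response.strip()
    if FENCE in text:
        # Prefer the body of the first fenced block, with an optional "json" tag.
        match = re.search(FENCE + r"(?:json)?\s*\n?(.*?)\n?\s*" + FENCE, text, re.DOTALL)
        if match:
            text = match.group(1).strip()
    start = text.find("{")
    if start >= 0:
        # raw_decode stops at the end of the first object and ignores trailing prose.
        data, _end = json.JSONDecoder().raw_decode(text, start)
        return data
    return json.loads(text)


fenced = "Sure, here it is:\n" + FENCE + 'json\n{"clips": [{"start_second": 20.0}]}\n' + FENCE
plain = 'Result: {"clips": []} (no further matches)'
print(extract_json_object(fenced))
print(extract_json_object(plain))
```

In the patch itself a parse failure is caught and the loop moves on to the next
AI CLI candidate. The sketch lets the exception propagate so the caller decides.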

diff --git a/backend/cli.py b/backend/cli.py
index bb0a28c..93244bf 100644
--- a/backend/cli.py
+++ b/backend/cli.py
@@ -1122,160 +1122,10 @@ def _print_clips(clips: list):
 
 
 def _find_moment_with_claude(description: str, segments: list, existing_clips: list) -> list:
-    """Use Claude to find a specific moment described by the user."""
-    from services.claude_suggest import _find_ai_cli_candidates, _run_ai_command
-    from presets import MIN_CLIP_DURATION, MAX_CLIP_DURATION, TARGET_CLIP_DURATION_MIN, TARGET_CLIP_DURATION_MAX
-
-    candidates = _find_ai_cli_candidates()
-    if not candidates:
-        print("         ⚠ No AI CLI available for moment search")
-        return []
-
-    # Build transcript text
-    lines = []
-    for seg in segments:
-        speaker = seg.get("speaker", "")
-        speaker_label = f"[{speaker}] " if speaker else ""
-        start = seg.get("start", 0)
-        text = seg.get("text", "").strip()
-        if text:
-            lines.append(f"[{start:.1f}s] {speaker_label}{text}")
-    transcript_text = "\n".join(lines)
-
-    # Build list of existing clip timestamps to avoid
-    existing_desc = ""
-    if existing_clips:
-        existing_desc = "\n\nALREADY SELECTED (do not re-suggest these):\n"
-        for c in existing_clips:
-            existing_desc += f"- {c['start_second']}s-{c['end_second']}s: {c['title']}\n"
-
-    import tempfile, subprocess, json as _json
-
-    prompt = f"""Find the moment the user is describing in this podcast transcript. Return ONLY valid JSON.
-
-USER WANTS: "{description}"
-{existing_desc}
-RULES:
-- Find the EXACT moment matching the user's description
-- Return 1-3 matching moments (best match first)
-- All timestamps in SECONDS as numbers
-- Duration target: {TARGET_CLIP_DURATION_MIN}-{TARGET_CLIP_DURATION_MAX} seconds, max {MAX_CLIP_DURATION} seconds
-- Cut tight: start at the hook, end when the point lands
-- Use segments to cut filler if needed
-
-Return this JSON:
-{{
-  "clips": [
-    {{
-      "title": "First sentence of the moment",
-      "start_second": 123.4,
-      "end_second": 158.4,
-      "segments": [{{"start": 123.4, "end": 158.4}}],
-      "duration": 35,
-      "content_type": "guest_story",
-      "scores": {{"standalone": 4, "hook": 5, "relevance": 4, "quotability": 3}},
-      "total_score": 16,
-      "quote": "The key quote",
-      "why": "Why this matches what the user asked for"
-    }}
-  ]
-}}
-
-Transcript:
-{transcript_text}"""
-
-    project_dir = os.path.join(os.path.dirname(os.path.abspath(__file__)), "..")
-    with tempfile.NamedTemporaryFile(mode="w", suffix=".txt", delete=False, dir=project_dir) as f:
-        f.write(prompt)
-        prompt_file = f.name
+    """Use Claude/Codex to find a specific moment described by the user."""
+    from services.claude_suggest import find_moments_from_text
 
-    try:
-        for idx, (cli_path, engine) in enumerate(candidates):
-            if idx > 0:
-                print(f"         ⚠ Retrying moment search with {'Claude' if engine == 'claude' else 'Codex'}")
-            try:
-                result = _run_ai_command(
-                    cli_path=cli_path,
-                    engine=engine,
-                    prompt=prompt,
-                    prompt_file=prompt_file,
-                    project_dir=project_dir,
-                    timeout=300,
-                )
-            except Exception:
-                continue
-
-            if result.returncode != 0 or not result.stdout.strip():
-                continue
-
-            response = result.stdout.strip()
-            if "```" in response:
-                import re
-                fence_match = re.search(r"```(?:json)?\s*\n?(.*?)\n?\s*```", response, re.DOTALL)
-                if fence_match:
-                    response = fence_match.group(1).strip()
-
-            try:
-                json_start = response.find("{")
-                if json_start >= 0:
-                    decoder = _json.JSONDecoder()
-                    data, _ = decoder.raw_decode(response, json_start)
-                else:
-                    data = _json.loads(response)
-            except Exception:
-                continue
-
-            found = []
-            for c in data.get("clips", []):
-                scores = c.get("scores", {})
-                total = sum(scores.values()) if scores else c.get("total_score", 0)
-                raw_segments = c.get("segments", [])
-                keep_segments = []
-                for seg in raw_segments:
-                    s = round(float(seg.get("start", 0)), 1)
-                    e = round(float(seg.get("end", 0)), 1)
-                    if e > s:
-                        keep_segments.append({"start": s, "end": e})
-
-                start_sec = round(float(c.get("start_second", 0)), 1)
-                end_sec = round(float(c.get("end_second", 0)), 1)
-                if not keep_segments and end_sec > start_sec:
-                    keep_segments = [{"start": start_sec, "end": end_sec}]
-
-                kept_duration = sum(seg["end"] - seg["start"] for seg in keep_segments)
-                if kept_duration < MIN_CLIP_DURATION or kept_duration > MAX_CLIP_DURATION:
-                    continue
-
-                found.append({
-                    "title": c.get("title", "Untitled")[:55],
-                    "start_second": keep_segments[0]["start"] if keep_segments else start_sec,
-                    "end_second": keep_segments[-1]["end"] if keep_segments else end_sec,
-                    "segments": keep_segments,
-                    "duration": round(kept_duration),
-                    "score": total,
-                    "content_type": c.get("content_type", "unknown"),
-                    "reasoning": c.get("why", ""),
-                    "preview_text": c.get("quote", "")[:120],
-                    "suggested_caption_style": "hormozi",
-                    "quote": c.get("quote", ""),
-                    "why": c.get("why", ""),
-                    "reasons": [c.get("content_type", "")],
-                    "preview": c.get("quote", "")[:120],
-                })
-
-            if found:
-                return found
-
-        return []
-
-    except Exception as e:
-        print(f"         ⚠ Search error: {e}")
-        return []
-    finally:
-        try:
-            os.unlink(prompt_file)
-        except Exception:
-            pass
+    return find_moments_from_text(description, segments, existing_clips, max_results=3)
 
 
 def _review_clips(clips: list, segments: list, energy_scores: list | None, config: dict) -> list:
diff --git a/backend/main.py b/backend/main.py
index ff1e27a..5812ad5 100644
--- a/backend/main.py
+++ b/backend/main.py
@@ -377,6 +377,32 @@ def handle_suggest_clips(task_id: str, params: dict):
     emit_result(task_id, "success", data={"clips": clips})
 
 
+def handle_find_moment(task_id: str, params: dict):
+    """Locate user-pasted/described moments in the transcript via the AI CLI."""
+    from services.claude_suggest import find_moments_from_text
+
+    text = (params.get("text") or params.get("description") or "").strip()
+    segments = params.get("segments", [])
+    existing_clips = params.get("existing_clips", [])
+    max_results = params.get("max_results", 8)
+
+    if not text:
+        emit_result(task_id, "error", error="text is required")
+        return
+    if not segments:
+        emit_result(task_id, "error", error="segments is required")
+        return
+
+    clips = find_moments_from_text(
+        text,
+        segments,
+        existing_clips,
+        progress_callback=lambda pct, msg: emit_progress(task_id, "searching", pct, msg),
+        max_results=max_results,
+    )
+    emit_result(task_id, "success", data={"clips": clips})
+
+
 def handle_generate_content(task_id: str, params: dict):
     """Generate titles, descriptions, tags for a clip using PodStack knowledge base."""
     from services.content_generator import generate_clip_content
@@ -485,6 +511,7 @@ def handle_run_integration_tool(task_id: str, params: dict):
     "presets": handle_presets,
     "corrections": handle_corrections,
     "suggest_clips": handle_suggest_clips,
+    "find_moment": handle_find_moment,
     "generate_content": handle_generate_content,
     "manage_integrations": handle_manage_integrations,
     "run_integration_tool": handle_run_integration_tool,
diff --git a/backend/services/claude_suggest.py b/backend/services/claude_suggest.py
index 2f406c7..f86947d 100644
--- a/backend/services/claude_suggest.py
+++ b/backend/services/claude_suggest.py
@@ -375,6 +375,162 @@ def _dedupe_clips_by_range(clips: list[dict]) -> list[dict]:
     return deduped
 
 
+def find_moments_from_text(
+    description: str,
+    segments: list[dict],
+    existing_clips: Optional[list[dict]] = None,
+    progress_callback: Optional[Callable[[int, str], None]] = None,
+    max_results: int = 3,
+) -> list[dict]:
+    """Locate the moment(s) the user described/pasted in the transcript via an AI
+    CLI. Returns clip dicts (same shape as suggest_with_claude). Status goes to
+    progress_callback; warnings to stderr — never stdout, which is the task
+    runner's JSON-RPC channel."""
+    existing_clips = existing_clips or []
+    candidates = _find_ai_cli_candidates()
+    if not candidates:
+        print("No AI CLI available for moment search", file=sys.stderr, flush=True)
+        return []
+
+    if progress_callback:
+        progress_callback(15, "Reading transcript...")
+    transcript_text = _build_transcript_text(segments)
+
+    existing_desc = ""
+    if existing_clips:
+        existing_desc = "\n\nALREADY SELECTED (do not re-suggest these):\n"
+        for c in existing_clips:
+            existing_desc += f"- {c.get('start_second')}s-{c.get('end_second')}s: {c.get('title', '')}\n"
+
+    upper = max(1, int(max_results))
+    prompt = f"""Find the moment(s) the user is describing in this podcast transcript. Return ONLY valid JSON.
+
+USER WANTS: "{description}"
+{existing_desc}
+RULES:
+- Find the EXACT moment(s) matching what the user pasted/described
+- The user may list several moments — return one clip per distinct moment they mention
+- Return 1-{upper} matching moments (best match first)
+- All timestamps in SECONDS as numbers
+- Duration target: {TARGET_CLIP_DURATION_MIN}-{TARGET_CLIP_DURATION_MAX} seconds, max {MAX_CLIP_DURATION} seconds
+- Cut tight: start at the hook, end when the point lands
+- Use segments to cut filler if needed
+
+Return this JSON:
+{{
+  "clips": [
+    {{
+      "title": "First sentence of the moment",
+      "start_second": 123.4,
+      "end_second": 158.4,
+      "segments": [{{"start": 123.4, "end": 158.4}}],
+      "duration": 35,
+      "content_type": "guest_story",
+      "scores": {{"standalone": 4, "hook": 5, "relevance": 4, "quotability": 3}},
+      "total_score": 16,
+      "quote": "The key quote",
+      "why": "Why this matches what the user asked for"
+    }}
+  ]
+}}
+
+Transcript:
+{transcript_text}"""
+
+    project_dir = os.path.join(os.path.dirname(os.path.dirname(os.path.abspath(__file__))), "..")
+    with tempfile.NamedTemporaryFile(mode="w", suffix=".txt", delete=False, dir=project_dir) as f:
+        f.write(prompt)
+        prompt_file = f.name
+
+    try:
+        for idx, (cli_path, engine) in enumerate(candidates):
+            if progress_callback:
+                label = "Claude" if engine == "claude" else "Codex"
+                progress_callback(40, f"Searching transcript with {label}...")
+            try:
+                result = _run_ai_command(
+                    cli_path=cli_path,
+                    engine=engine,
+                    prompt=prompt,
+                    prompt_file=prompt_file,
+                    project_dir=project_dir,
+                    timeout=300,
+                )
+            except Exception:
+                continue
+
+            if result.returncode != 0 or not result.stdout.strip():
+                continue
+
+            response = result.stdout.strip()
+            if "```" in response:
+                import re
+
+                fence_match = re.search(r"```(?:json)?\s*\n?(.*?)\n?\s*```", response, re.DOTALL)
+                if fence_match:
+                    response = fence_match.group(1).strip()
+
+            try:
+                json_start = response.find("{")
+                if json_start >= 0:
+                    data, _ = json.JSONDecoder().raw_decode(response, json_start)
+                else:
+                    data = json.loads(response)
+            except Exception:
+                continue
+
+            found = []
+            for c in data.get("clips", []):
+                scores = c.get("scores", {})
+                total = sum(scores.values()) if scores else c.get("total_score", 0)
+                keep_segments = []
+                for seg in c.get("segments", []):
+                    s = round(float(seg.get("start", 0)), 1)
+                    e = round(float(seg.get("end", 0)), 1)
+                    if e > s:
+                        keep_segments.append({"start": s, "end": e})
+
+                start_sec = round(float(c.get("start_second", 0)), 1)
+                end_sec = round(float(c.get("end_second", 0)), 1)
+                if not keep_segments and end_sec > start_sec:
+                    keep_segments = [{"start": start_sec, "end": end_sec}]
+
+                kept_duration = sum(seg["end"] - seg["start"] for seg in keep_segments)
+                if kept_duration < MIN_CLIP_DURATION or kept_duration > MAX_CLIP_DURATION:
+                    continue
+
+                found.append({
+                    "title": c.get("title", "Untitled")[:55],
+                    "start_second": keep_segments[0]["start"] if keep_segments else start_sec,
+                    "end_second": keep_segments[-1]["end"] if keep_segments else end_sec,
+                    "segments": keep_segments,
+                    "duration": round(kept_duration),
+                    "score": total,
+                    "content_type": c.get("content_type", "unknown"),
+                    "reasoning": c.get("why", ""),
+                    "preview_text": c.get("quote", "")[:120],
+                    "suggested_caption_style": "hormozi",
+                    "quote": c.get("quote", ""),
+                    "why": c.get("why", ""),
+                    "reasons": [c.get("content_type", "")],
+                    "preview": c.get("quote", "")[:120],
+                })
+
+            if found:
+                return _dedupe_clips_by_range(found)
+
+        return []
+
+    except Exception as e:
+        print(f"Moment search error: {e}", file=sys.stderr, flush=True)
+        return []
+    finally:
+        try:
+            os.unlink(prompt_file)
+        except Exception:
+            pass
+
+
 def suggest_with_claude(
     segments: list[dict],
     top_n: int = 5,
diff --git a/src/models/index.ts b/src/models/index.ts
index 5819c56..2c94a02 100644
--- a/src/models/index.ts
+++ b/src/models/index.ts
@@ -2,7 +2,7 @@
 
 export interface TaskRequest {
   task_id: string;
-  task_type: "transcribe" | "parse_transcript" | "create_clip" | "batch_clips" | "analyze_energy" | "pack_transcript" | "detect_encoder" | "presets" | "ping" | "suggest_clips" | "generate_content" | "corrections" | "manage_integrations" | "run_integration_tool" | "manage_config";
+  task_type: "transcribe" | "parse_transcript" | "create_clip" | "batch_clips" | "analyze_energy" | "pack_transcript" | "detect_encoder" | "presets" | "ping" | "suggest_clips" | "find_moment" | "generate_content" | "corrections" | "manage_integrations" | "run_integration_tool" | "manage_config";
   params: Record<string, unknown>;
 }
 
diff --git a/src/ui/client/EpisodeWorkspace.jsx b/src/ui/client/EpisodeWorkspace.jsx
index e4ddf6a..edbf601 100644
--- a/src/ui/client/EpisodeWorkspace.jsx
+++ b/src/ui/client/EpisodeWorkspace.jsx
@@ -531,6 +531,9 @@ const fmt = (s) => `${Math.floor(s / 60)}:${String(Math.floor(s % 60)).padStart(
       const [results, setResults] = useState([]);
       const [error, setError] = useState(null);
       const [previewFile, setPreviewFile] = useState(null);
+      const [momentText, setMomentText] = useState('');
+      const [findingMoment, setFindingMoment] = useState(false);
+      const [momentNotice, setMomentNotice] = useState(null);
 
       const [retryIdx, setRetryIdx] = useState(null);
       const [retryJobId, setRetryJobId] = useState(null);
@@ -902,6 +905,27 @@ const fmt = (s) => `${Math.floor(s / 60)}:${String(Math.floor(s % 60)).padStart(
         finally { setBrowsing(false); }
       }, []);
 
+      const findMoment = async () => {
+        const text = momentText.trim();
+        if (!text || findingMoment) return;
+        setFindingMoment(true); setError(null); setMomentNotice(null);
+        try {
+          const d = await api('/find-moment', { method: 'POST', body: JSON.stringify({ text }) });
+          if (d.error) { setError(d.error); return; }
+          if (!d.added) {
+            setMomentNotice(d.found ? 'Those moments are already in your clips.' : "Couldn't find that moment — try different wording or a direct quote.");
+            return;
+          }
+          // suggestions refresh via the SSE state-sync broadcast
+          setMomentText('');
+          setMomentNotice(`Added ${d.added} moment${d.added !== 1 ? 's' : ''}.`);
+        } catch {
+          setError('Moment search failed.');
+        } finally {
+          setFindingMoment(false);
+        }
+      };
+
       const startExport = async () => {
         setPhase('exporting'); setResults([]);
         const sc = suggestions.filter((_, i) => !deselected.has(i));
@@ -1485,6 +1509,30 @@ const fmt = (s) => `${Math.floor(s / 60)}:${String(Math.floor(s % 60)).padStart(
                 </div>
               )}
 
+              {/* Find a specific moment — paste a quote/description, AI locates it */}
+              {(transcript || transcriptText) && phase !== 'parsing' && phase !== 'suggesting' && phase !== 'exporting' && (
+                <div className="section" style={{ marginTop: 16 }}>
+                  <div className="section-label">Find a specific moment</div>
+                  <textarea
+                    className="input"
+                    rows={3}
+                    placeholder="Paste a quote or describe the moment(s) you want — the AI searches the transcript and adds them to your clips."
+                    value={momentText}
+                    onChange={(e) => { setMomentText(e.target.value); setMomentNotice(null); }}
+                    disabled={findingMoment}
+                    style={{ width: '100%', resize: 'vertical', fontFamily: 'inherit' }}
+                  />
+                  <div style={{ display: 'flex', alignItems: 'center', gap: 10, marginTop: 8 }}>
+                    <button className="btn btn-primary btn-sm" onClick={findMoment} disabled={!momentText.trim() || findingMoment}>
+                      {findingMoment
+                        ? <span style={{ display: 'flex', alignItems: 'center', gap: 6 }}><div className="spinner sm" />Searching{'…'}</span>
+                        : 'Find moments'}
+                    </button>
+                    {momentNotice && <span style={{ color: 'var(--text2)', fontSize: 13 }}>{momentNotice}</span>}
+                  </div>
+                </div>
+              )}
+
               {/* Parsing */}
               {phase === 'parsing' && (
                 <div className="fade-in" style={{ marginTop: 20 }}>
diff --git a/src/ui/web-server.ts b/src/ui/web-server.ts
index 7b77714..86e6ad5 100644
--- a/src/ui/web-server.ts
+++ b/src/ui/web-server.ts
@@ -2122,6 +2122,77 @@ app.post("/api/claude-suggest", async (req, res) => {
   }
 });
 
+// --- Find user-pasted moments (paste a description/quotes, AI locates them) ---
+
+app.post("/api/find-moment", async (req, res) => {
+  const text = typeof req.body?.text === "string" ? req.body.text.trim() : "";
+  if (!text) {
+    res.status(400).json({ error: "Paste a moment or description to search for." });
+    return;
+  }
+  const segs = uiState.transcript?.segments;
+  if (!segs || !Array.isArray(segs) || segs.length === 0) {
+    res
+      .status(400)
+      .json({ error: "No transcript loaded. Transcribe or import one first." });
+    return;
+  }
+
+  try {
+    const existing = uiState.suggestions.map((s) => ({
+      start_second: s.start_second,
+      end_second: s.end_second,
+      title: s.title,
+    }));
+    const result = await executor.execute<{ clips?: SuggestedClip[] }>(
+      "find_moment",
+      { text, segments: segs, existing_clips: existing, max_results: 8 },
+      (event) =>
+        broadcastSSE("job-update", { progress: event.percent, message: event.message }),
+    );
+
+    const found = result.data?.clips ?? [];
+    // Append to existing suggestions, skipping anything at a range we already have.
+    const seen = new Set(
+      uiState.suggestions.map(
+        (s) => `${Math.round(s.start_second * 10)}-${Math.round(s.end_second * 10)}`,
+      ),
+    );
+    const added: SuggestedClip[] = [];
+    for (const c of found) {
+      const key = `${Math.round(c.start_second * 10)}-${Math.round(c.end_second * 10)}`;
+      if (seen.has(key)) continue;
+      seen.add(key);
+      added.push({
+        clip_id: `manual-${Date.now()}-${added.length}`,
+        title: c.title,
+        start_second: c.start_second,
+        end_second: c.end_second,
+        duration: c.duration ?? c.end_second - c.start_second,
+        segments: c.segments,
+        reasoning: c.reasoning ?? "",
+        preview_text: c.preview_text ?? "",
+        content_type: c.content_type,
+        score: c.score,
+        suggested_caption_style: c.suggested_caption_style || "hormozi",
+      });
+    }
+
+    if (added.length > 0) {
+      uiState.suggestions = [...uiState.suggestions, ...added];
+      uiState.phase = "review";
+      uiState.lastUpdated = Date.now();
+      persistState();
+      broadcastSSE("state-sync", uiState);
+    }
+
+    res.json({ clips: added, found: found.length, added: added.length });
+  } catch (err: unknown) {
+    const msg = errMsg(err);
+    res.status(500).json({ error: `Moment search failed: ${msg.substring(0, 200)}` });
+  }
+});
+
 // --- Per-clip content generation (titles, descriptions, tags) ---
 
 app.post("/api/generate-content", async (req, res) => {
diff --git a/tests/test_find_moments.py b/tests/test_find_moments.py
new file mode 100644
index 0000000..b1d1fb2
--- /dev/null
+++ b/tests/test_find_moments.py
@@ -0,0 +1,73 @@
+"""find_moments_from_text: parse AI output into clip dicts, with the AI CLI mocked."""
+
+import json
+import os
+import sys
+import types
+import unittest
+
+ROOT = os.path.abspath(os.path.join(os.path.dirname(__file__), ".."))
+BACKEND_ROOT = os.path.join(ROOT, "backend")
+if BACKEND_ROOT not in sys.path:
+    sys.path.insert(0, BACKEND_ROOT)
+
+from services import claude_suggest as cs
+
+SEGMENTS = [
+    {"start": 10.0, "end": 20.0, "text": "We talked about discipline and habits.", "speaker": "A"},
+    {"start": 20.0, "end": 60.0, "text": "The moment I realized failure was the turning point.", "speaker": "B"},
+]
+
+
+def _fake_run(**kwargs):
+    payload = {
+        "clips": [
+            {
+                "title": "The turning point",
+                "start_second": 20.0,
+                "end_second": 58.0,
+                "segments": [{"start": 20.0, "end": 58.0}],
+                "content_type": "guest_story",
+                "scores": {"standalone": 5, "hook": 5, "relevance": 4, "quotability": 4},
+                "quote": "Failure was the turning point",
+                "why": "Directly matches the pasted moment",
+            }
+        ]
+    }
+    return types.SimpleNamespace(returncode=0, stdout=json.dumps(payload), stderr="")
+
+
+class FindMomentsTests(unittest.TestCase):
+    def setUp(self):
+        self._orig_candidates = cs._find_ai_cli_candidates
+        self._orig_run = cs._run_ai_command
+        cs._find_ai_cli_candidates = lambda: [("/usr/bin/claude", "claude")]
+        cs._run_ai_command = lambda **kw: _fake_run(**kw)
+
+    def tearDown(self):
+        cs._find_ai_cli_candidates = self._orig_candidates
+        cs._run_ai_command = self._orig_run
+
+    def test_finds_and_shapes_moment(self):
+        clips = cs.find_moments_from_text("the turning point", SEGMENTS, [])
+        self.assertEqual(len(clips), 1)
+        c = clips[0]
+        self.assertEqual(c["start_second"], 20.0)
+        self.assertEqual(c["end_second"], 58.0)
+        self.assertEqual(c["title"], "The turning point")
+        self.assertEqual(c["segments"], [{"start": 20.0, "end": 58.0}])
+        self.assertTrue(c["suggested_caption_style"])
+        self.assertGreater(c["score"], 0)
+
+    def test_no_ai_cli_returns_empty(self):
+        cs._find_ai_cli_candidates = lambda: []
+        self.assertEqual(cs.find_moments_from_text("x", SEGMENTS, []), [])
+
+    def test_progress_callback_invoked(self):
+        seen = []
+        cs.find_moments_from_text("x", SEGMENTS, [], progress_callback=lambda p, m: seen.append((p, m)))
+        self.assertTrue(seen)
+
+
+if __name__ == "__main__":
+    unittest.main()

From 835f20df8bf2afdcab1ade55f7548eee6a06f4a5 Mon Sep 17 00:00:00 2001
From: Nika Siradze <nikushasiradzee@gmail.com>
Date: Tue, 16 Jun 2026 23:48:09 +0400
Subject: [PATCH 32/41] Fix thumbnail crash + audit follow-ups
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

- Interactive thumbnails passed output=None (the bare namespace has no argparse
  default), crashing in os.makedirs. Default to ./thumbnails at the point of use
  and in the menu namespace.
- Studio launch derived PODCLI_DATA from PODCLI_OUTPUT, which now points at the
  cwd clip dir — scattering cache/working/logs into the working directory. Derive
  data dir from the cache path and propagate PODCLI_OUTPUT to the child instead.
- Migration: resolve the .env target from PODCLI_ENV_FILE (what the loader reads)
  so migrated secrets land where they're loaded and "pending" clears; never treat
  the global managed dir itself as a legacy project; guard presets move against
  source==target.
- whisper.cpp auto-fallback now also engages when a broken openai-whisper install
  fails to load/run, not only when it's missing.
- Studio binds to 127.0.0.1 by default (it serves local files with no auth);
  set PODCLI_HOST=0.0.0.0 to expose on the LAN.
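
The thumbnail fix can be illustrated with a minimal sketch. It assumes only
what the first bullet says: interactive callers pass a namespace that lacks
`output` (or carries None). `resolve_output` and DEFAULT_THUMBNAIL_DIR are
hypothetical names for illustration, not identifiers from backend/cli.py.

```python
import types

DEFAULT_THUMBNAIL_DIR = "./thumbnails"  # default named in this commit message


def resolve_output(args) -> str:
    """Return args.output, falling back when it is missing, None, or empty."""
    # getattr covers a namespace built without the attribute at all;
    # `or` covers an attribute that exists but is None or "".
    return getattr(args, "output", None) or DEFAULT_THUMBNAIL_DIR


bare = types.SimpleNamespace(video="episode.mp4")       # no output attribute
explicit = types.SimpleNamespace(output="/tmp/thumbs")  # caller chose a dir
print(resolve_output(bare))
print(resolve_output(explicit))
```

Resolving the default at the point of use means any caller that bypasses
argparse still gets a usable directory.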
---
 backend/cli.py                    | 11 ++++++++++-
 backend/config_bundle.py          | 18 ++++++++++++++----
 backend/services/transcription.py | 11 ++++++-----
 src/ui/web-server.ts              |  5 ++++-
 4 files changed, 34 insertions(+), 11 deletions(-)

diff --git a/backend/cli.py b/backend/cli.py
index 93244bf..dd735cf 100644
--- a/backend/cli.py
+++ b/backend/cli.py
@@ -2093,6 +2093,11 @@ def cmd_thumbnails(args):
     video = getattr(args, "video", None)
     as_json = getattr(args, "json", False)
 
+    # Interactive callers build a bare namespace without --output's argparse
+    # default, so fall back to the same default the CLI documents.
+    if not getattr(args, "output", None):
+        args.output = "./thumbnails"
+
     # An exact timestamp wins: extract that frame from the video and use it as the photo.
     timestamp = getattr(args, "timestamp", None)
     if photo is None and video and timestamp is not None:
@@ -3474,7 +3479,10 @@ def interactive_menu():
                     "PODCLI_BACKEND": backend_dir,
                     "PYTHON_PATH": sys.executable,
                     "PODCLI_HOME": paths["home"],
-                    "PODCLI_DATA": os.path.dirname(paths["output"]),
+                    # data_dir is the cache's parent — output is now decoupled
+                    # (clips render to the working dir), so don't derive it from output.
+                    "PODCLI_DATA": os.path.dirname(paths["cache"]),
+                    "PODCLI_OUTPUT": paths["output"],
                     "FFMPEG_PATH": os.environ.get("PODCLI_FFMPEG", "ffmpeg"),
                     "FFPROBE_PATH": os.environ.get("PODCLI_FFPROBE", "ffprobe"),
                 }
@@ -4402,6 +4410,7 @@ def _interactive_thumbnails():
         video=video or None,
         logo=None,
         variations=3,
+        output="./thumbnails",
     )
     cmd_thumbnails(args_ns)
 
diff --git a/backend/config_bundle.py b/backend/config_bundle.py
index a47dc63..a36dff2 100644
--- a/backend/config_bundle.py
+++ b/backend/config_bundle.py
@@ -104,8 +104,15 @@ def _legacy_env_file() -> Path:
     return _legacy_project_dir() / ".env"
 
 
+def _global_env_file() -> Path:
+    # Match the file the launcher/loader actually reads (PODCLI_ENV_FILE), so the
+    # migrated secrets land where they'll be loaded and "pending" clears correctly.
+    return Path(os.environ.get("PODCLI_ENV_FILE") or (_global_home() / ".env"))
+
+
 def _legacy_env_pending() -> bool:
-    return _legacy_env_file().is_file() and not (_global_home() / ".env").exists()
+    src = _legacy_env_file()
+    return src.is_file() and src.resolve() != _global_env_file().resolve() and not _global_env_file().exists()
 
 
 def _legacy_home_pending() -> bool:
@@ -116,6 +123,9 @@ def _legacy_home_pending() -> bool:
 
 
 def _legacy_migration_pending() -> bool:
+    # Never treat the global managed dir itself as a legacy project to import.
+    if _legacy_project_dir().resolve() == _global_home().resolve():
+        return False
     return (
         _legacy_home_pending()
         or _legacy_cache_has_content()
@@ -346,7 +356,7 @@ def migrate_legacy_presets(*, dry_run: bool = False) -> dict[str, Any]:
         "skipped": 0,
         "dry_run": dry_run,
     }
-    if not legacy.is_dir():
+    if not legacy.is_dir() or legacy.resolve() == target.resolve():
         return result
     target.mkdir(parents=True, exist_ok=True)
     for src in legacy.glob("*.json"):
@@ -398,14 +408,14 @@ def migrate_legacy_home(*, dry_run: bool = False) -> dict[str, Any]:
 
 def migrate_legacy_env(*, dry_run: bool = False) -> dict[str, Any]:
     src = _legacy_env_file()
-    dest = _global_home() / ".env"
+    dest = _global_env_file()
     result: dict[str, Any] = {
         "source": str(src),
         "target": str(dest),
         "copied": False,
         "dry_run": dry_run,
     }
-    if not src.is_file() or dest.exists():
+    if not src.is_file() or src.resolve() == dest.resolve() or dest.exists():
         return result
     if not dry_run:
         dest.parent.mkdir(parents=True, exist_ok=True)
diff --git a/backend/services/transcription.py b/backend/services/transcription.py
index 6f35470..9ca581e 100644
--- a/backend/services/transcription.py
+++ b/backend/services/transcription.py
@@ -115,11 +115,14 @@ def transcribe_file(
     if progress_callback:
         progress_callback(5, "Loading Whisper model...")
 
+    # Native installs ship whisper.cpp, not openai-whisper. Fall back to it
+    # automatically — whether whisper is missing OR a broken install fails to
+    # load/run — unless the user explicitly asked for the whisper-py engine.
     try:
         import whisper
-    except ImportError as e:
-        # Native installs ship whisper.cpp, not openai-whisper. Fall back to it
-        # automatically unless the user explicitly asked for the whisper-py engine.
+
+        model = whisper.load_model(model_size)
+    except Exception as e:
         if not requested and _whispercpp_ready(model_size):
             return _transcribe_with_whispercpp(file_path, model_size, language, progress_callback)
         raise RuntimeError(
@@ -127,8 +130,6 @@ def transcribe_file(
             "This native install ships whisper.cpp — rerun with --engine whispercpp."
         ) from e
 
-    model = whisper.load_model(model_size)
-
     if progress_callback:
         progress_callback(10, f"Transcribing with Whisper ({model_size})...")
 
diff --git a/src/ui/web-server.ts b/src/ui/web-server.ts
index 86e6ad5..9b7b9e1 100644
--- a/src/ui/web-server.ts
+++ b/src/ui/web-server.ts
@@ -2491,7 +2491,10 @@ async function main() {
     res.sendFile(join(publicDir, "index.html"));
   });
 
-  app.listen(PORT, () => {
+  // Bind to loopback by default — the studio serves local files (clips, assets,
+  // source video) with no auth. Set PODCLI_HOST=0.0.0.0 to expose it on the LAN.
+  const HOST = process.env.PODCLI_HOST || "127.0.0.1";
+  app.listen(PORT, HOST, () => {
     log.info(`podcli running at http://localhost:${PORT}`);
   });
 }

From bdecf3d8a89bd67565349ffc63b53744c7682e62 Mon Sep 17 00:00:00 2001
From: Nika Siradze <nikushasiradzee@gmail.com>
Date: Wed, 17 Jun 2026 00:04:47 +0400
Subject: [PATCH 33/41] Full-audit follow-ups: stdout discipline, asset-path
 rewrite, suggest reset

- clip_generator: Remotion-fallback and boundary-revert messages went to stdout,
  which corrupts the JSON-RPC task channel (create_clip/batch_clips via MCP/studio)
  on the fallback path. Route them to stderr.
- config bundle: asset paths in presets/ui-state were keyed in the rewrite map by
  resolved path but matched against the literal stored string, so any symlinked
  path (macOS /var, /tmp) was left pointing at the source machine after import.
  Register raw/expanded/resolved aliases so the literal value is rewritten. Adds a
  regression test (tmpdirs are symlinked on macOS, reproducing the failure).
- find_moments_from_text: write the prompt via the managed .podcli/tmp helper
  instead of the repo root, matching suggest_with_claude.
- studio: reset deselectedIndices when claude-suggest replaces the suggestion list;
  deselection is positional, so stale indices would silently exclude the wrong
  clips from export.
- whisper.cpp: add timeouts to the ffmpeg and whisper-cli subprocess calls so a
  stalled child can't hang the transcribe task forever.
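
The asset-path failure can be reproduced in isolation. The sketch below is a
simplified stand-in for the alias registration, not the code in
config_bundle.py: `alias_keys` and the archive name are hypothetical, and it
assumes a POSIX system where os.symlink is permitted.

```python
import os
import tempfile
from pathlib import Path


def alias_keys(raw: str) -> list[str]:
    """Collect the raw, user-expanded, and resolved spellings of one path."""
    source = Path(raw).expanduser()
    keys: list[str] = []
    for k in (raw, str(source), str(source.resolve())):
        if k and k not in keys:
            keys.append(k)
    return keys


with tempfile.TemporaryDirectory() as tmp:
    base = Path(tmp).resolve()
    real_dir = base / "real"
    real_dir.mkdir()
    (real_dir / "logo.png").write_bytes(b"x")
    link_dir = base / "link"
    os.symlink(real_dir, link_dir)  # stands in for macOS /var -> /private/var

    stored = str(link_dir / "logo.png")  # literal value a preset would keep
    resolved_only = {str(Path(stored).resolve()): "assets/0-logo.png"}
    with_aliases = {k: "assets/0-logo.png" for k in alias_keys(stored)}

    print(stored in resolved_only)  # literal lookup misses the resolved key
    print(stored in with_aliases)   # literal lookup hits once aliases exist
```

A map keyed only by the resolved path cannot match the string stored in JSON
when any path component is a symlink, which is why every spelling is
registered.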
---
 backend/config_bundle.py                     | 31 +++++++++++++++-----
 backend/services/claude_suggest.py           |  9 +++---
 backend/services/clip_generator.py           |  9 +++---
 backend/services/transcription_whispercpp.py |  3 +-
 src/ui/web-server.ts                         |  2 ++
 tests/test_config_bundle.py                  | 25 ++++++++++++++++
 6 files changed, 63 insertions(+), 16 deletions(-)

diff --git a/backend/config_bundle.py b/backend/config_bundle.py
index a36dff2..b0ad823 100644
--- a/backend/config_bundle.py
+++ b/backend/config_bundle.py
@@ -122,6 +122,18 @@ def _legacy_home_pending() -> bool:
     return _has_managed_content(legacy) and not _has_managed_content(_global_home())
 
 
+def _asset_alias_keys(raw: str, source: Path) -> list[str]:
+    """Every string form an asset path may take in stored JSON. Presets/ui-state
+    keep the literal value the user/app wrote, which differs from its realpath
+    whenever a path component is a symlink (e.g. macOS /var -> /private/var). The
+    bundle rewrite matches literally, so register raw, expanded, and resolved."""
+    keys: list[str] = []
+    for k in (raw, str(Path(raw).expanduser()) if raw else "", str(source), str(source.resolve())):
+        if k and k not in keys:
+            keys.append(k)
+    return keys
+
+
 def _legacy_migration_pending() -> bool:
     # Never treat the global managed dir itself as a legacy project to import.
     if _legacy_project_dir().resolve() == _global_home().resolve():
@@ -207,16 +219,18 @@ def export_config(bundle_path: str, source_home: str | None = None) -> dict[str,
         for index, item in enumerate(raw_assets):
             if not isinstance(item, dict):
                 continue
-            source = Path(str(item.get("path", ""))).expanduser()
+            raw_path = str(item.get("path", ""))
+            source = Path(raw_path).expanduser()
             if not source.exists():
                 continue
             archive_name = _archive_name_for(index, str(item.get("name", "asset")), source)
             archive_path = f"{ASSET_ARCHIVE_DIR}/{archive_name}"
             zf.write(source, arcname=archive_path)
             registry_export.append({**item, "path": archive_path})
-            path_map[str(source.resolve())] = archive_path
+            for key in _asset_alias_keys(raw_path, source):
+                path_map[key] = archive_path
 
-        extra_sources: list[Path] = []
+        extra_sources: list[tuple[str, Path]] = []
         for rel in ["ui-state.json"]:
             src = home / rel
             if src.exists():
@@ -225,7 +239,7 @@ def export_config(bundle_path: str, source_home: str | None = None) -> dict[str,
                     for candidate in _collect_asset_paths(raw):
                         candidate_path = Path(candidate).expanduser()
                         if candidate_path.exists():
-                            extra_sources.append(candidate_path)
+                            extra_sources.append((candidate, candidate_path))
 
         presets_dir = home / "presets"
         if presets_dir.exists():
@@ -236,16 +250,19 @@ def export_config(bundle_path: str, source_home: str | None = None) -> dict[str,
                 for candidate in _collect_asset_paths(raw):
                     candidate_path = Path(candidate).expanduser()
                     if candidate_path.exists():
-                        extra_sources.append(candidate_path)
+                        extra_sources.append((candidate, candidate_path))
 
-        for source in extra_sources:
+        for raw_candidate, source in extra_sources:
             resolved = str(source.resolve())
             if resolved in path_map:
+                # Already archived; still register this literal so its JSON gets rewritten.
+                path_map.setdefault(raw_candidate, path_map[resolved])
                 continue
             archive_name = _archive_name_for(len(path_map), source.stem or "asset", source)
             archive_path = f"{ASSET_ARCHIVE_DIR}/{archive_name}"
             zf.write(source, arcname=archive_path)
-            path_map[resolved] = archive_path
+            for key in _asset_alias_keys(raw_candidate, source):
+                path_map[key] = archive_path
 
         zf.writestr("assets/registry.json", json.dumps({"assets": registry_export}, indent=2) + "\n")
         manifest["path_map"] = path_map
diff --git a/backend/services/claude_suggest.py b/backend/services/claude_suggest.py
index f86947d..e180477 100644
--- a/backend/services/claude_suggest.py
+++ b/backend/services/claude_suggest.py
@@ -437,10 +437,11 @@ def find_moments_from_text(
 Transcript:
 {transcript_text}"""
 
-    project_dir = os.path.join(os.path.dirname(os.path.dirname(os.path.abspath(__file__))), "..")
-    with tempfile.NamedTemporaryFile(mode="w", suffix=".txt", delete=False, dir=project_dir) as f:
-        f.write(prompt)
-        prompt_file = f.name
+    # Prompt goes to .podcli/tmp/ (gitignored), not the repo root, so a crash
+    # mid-run never litters the working tree with transcript dumps.
+    project_dir = os.path.join(os.path.dirname(os.path.abspath(__file__)), "..", "..")
+    from utils.prompt_files import write_prompt_file
+    prompt_file = write_prompt_file(prompt)
 
     try:
         for idx, (cli_path, engine) in enumerate(candidates):
diff --git a/backend/services/clip_generator.py b/backend/services/clip_generator.py
index f5ad3d2..90e2adc 100644
--- a/backend/services/clip_generator.py
+++ b/backend/services/clip_generator.py
@@ -550,16 +550,16 @@ def _render_with_remotion(
         if stdout:
             lines = [l.strip() for l in stdout.strip().split("\n") if l.strip()]
             if lines:
-                print(f"  Remotion: {lines[-1][:120]}", flush=True)
+                print(f"  Remotion: {lines[-1][:120]}", file=sys.stderr, flush=True)
 
-        print("  Remotion: falling back to ASS for this clip", flush=True)
+        print("  Remotion: falling back to ASS for this clip", file=sys.stderr, flush=True)
         return False, None
 
     except subprocess.TimeoutExpired:
-        print("  Remotion: timed out, using ASS for this clip", flush=True)
+        print("  Remotion: timed out, using ASS for this clip", file=sys.stderr, flush=True)
         return False, None
     except Exception:
-        print("  Remotion: render error, using ASS for this clip", flush=True)
+        print("  Remotion: render error, using ASS for this clip", file=sys.stderr, flush=True)
         return False, None
     finally:
         try:
@@ -700,6 +700,7 @@ def generate_clip(
         print(
             f"  Boundary revert: post-trim duration {duration:.1f}s < "
             f"75% of asked {llm_total:.1f}s - using original range",
+            file=sys.stderr,
             flush=True,
         )
         start_second = llm_start_second
diff --git a/backend/services/transcription_whispercpp.py b/backend/services/transcription_whispercpp.py
index c236cfe..d62f8be 100644
--- a/backend/services/transcription_whispercpp.py
+++ b/backend/services/transcription_whispercpp.py
@@ -22,6 +22,7 @@ def _extract_wav(media_path: str, wav_path: str, ffmpeg: str = "ffmpeg") -> None
         [ffmpeg, "-y", "-loglevel", "error", "-i", media_path,
          "-ar", "16000", "-ac", "1", wav_path],
         check=True,
+        timeout=1800,
     )
 
 
@@ -169,7 +170,7 @@ def transcribe_file(
             cmd += ["--vad", "--vad-model", vad_model]
         if language:
             cmd += ["-l", language]
-        subprocess.run(cmd, check=True, stdout=subprocess.DEVNULL, stderr=subprocess.PIPE, text=True)
+        subprocess.run(cmd, check=True, stdout=subprocess.DEVNULL, stderr=subprocess.PIPE, text=True, timeout=7200)
 
         with open(out_base + ".json", encoding="utf-8") as f:
             data = json.load(f)
diff --git a/src/ui/web-server.ts b/src/ui/web-server.ts
index 9b7b9e1..cabbdf3 100644
--- a/src/ui/web-server.ts
+++ b/src/ui/web-server.ts
@@ -2109,6 +2109,8 @@ app.post("/api/claude-suggest", async (req, res) => {
         score: c.score,
         suggested_caption_style: c.suggested_caption_style || "hormozi",
       }));
+      // Deselection is positional; replacing the list invalidates old indices.
+      uiState.deselectedIndices = [];
       uiState.phase = "review";
       broadcastSSE("state-sync", uiState);
     }
diff --git a/tests/test_config_bundle.py b/tests/test_config_bundle.py
index e4e7b7f..be5f61c 100644
--- a/tests/test_config_bundle.py
+++ b/tests/test_config_bundle.py
@@ -83,6 +83,31 @@ def test_export_and_import_round_trip(self):
         self.assertTrue(registry["assets"][0]["path"].startswith(os.path.realpath(self.dst_home)))
         self.assertTrue(os.path.exists(registry["assets"][0]["path"]))
 
+    def test_import_rewrites_preset_and_ui_state_asset_paths(self):
+        # Regression: a preset/ui-state asset path stored as its literal (possibly
+        # symlinked, e.g. macOS /var) value must be rewritten to the new home on
+        # import — otherwise it leaks the source machine's path and breaks once the
+        # source is gone. self.asset_file is the literal src path (under a tmpdir,
+        # which is symlinked on macOS so raw != realpath — exactly the failure case).
+        with open(os.path.join(self.src_home, "presets", "branded.json"), "w", encoding="utf-8") as f:
+            json.dump({"caption_style": "branded", "logo_path": self.asset_file}, f)
+        with open(os.path.join(self.src_home, "ui-state.json"), "w", encoding="utf-8") as f:
+            json.dump({"settings": {"logoPath": self.asset_file}}, f)
+
+        export_config(self.bundle, source_home=self.src_home)
+        import_config(self.bundle, target_home=self.dst_home)
+
+        home_real = os.path.realpath(self.dst_home)
+        with open(os.path.join(self.dst_home, "presets", "branded.json"), encoding="utf-8") as f:
+            preset = json.load(f)
+        with open(os.path.join(self.dst_home, "ui-state.json"), encoding="utf-8") as f:
+            ui = json.load(f)
+
+        self.assertTrue(preset["logo_path"].startswith(home_real), preset["logo_path"])
+        self.assertTrue(os.path.exists(preset["logo_path"]))
+        self.assertTrue(ui["settings"]["logoPath"].startswith(home_real), ui["settings"]["logoPath"])
+        self.assertNotIn(self.src_home, preset["logo_path"])
+
 
     def test_import_restores_backup_on_failure(self):
         export_config(self.bundle, source_home=self.src_home)

From 0111fd462e5617cbe70bf37b8042ea1a431b2c87 Mon Sep 17 00:00:00 2001
From: Nika Siradze <nikushasiradzee@gmail.com>
Date: Wed, 17 Jun 2026 00:10:29 +0400
Subject: [PATCH 34/41] Pin redirect hosts for runtime/model downloads
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

downloadOnce followed redirects to any host while only the checksums.txt fetch
was host-pinned — and Node, CPython, and ffmpeg are downloaded without checksum
verification, then executed. Route all binary/model downloads through a client
that refuses redirects off the known source hosts (GitHub + *.githubusercontent.com,
HuggingFace + its LFS CDN, nodejs.org, evermeet.cx, johnvansickle.com), and use
the pinned client for the GitHub API calls too. Adds tests for the host
allowlist (incl. suffix-spoofing) and redirect refusal.
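
The allowlist rule is an exact match or a dot-delimited suffix match. The
implementation in this patch is Go; the sketch below restates the same rule in
Python purely for illustration, with `allowed_download_host` as a hypothetical
name.

```python
ALLOWED_BASES = (
    "github.com",
    "githubusercontent.com",
    "huggingface.co",
    "nodejs.org",
    "evermeet.cx",
    "johnvansickle.com",
)


def allowed_download_host(host: str) -> bool:
    """True when host is a trusted base domain or a subdomain of one."""
    h = host.lower()
    # Requiring the leading "." stops look-alikes such as "notgithub.com",
    # and a trusted name used as a left-hand label ("github.com.evil.com")
    # fails because the suffix check runs against the end of the host.
    return any(h == base or h.endswith("." + base) for base in ALLOWED_BASES)


for host in ("cdn-lfs.huggingface.co", "github.com.evil.com", "notgithub.com"):
    print(host, allowed_download_host(host))
```

Matching on the full label boundary rather than a bare substring is what
keeps suffix-spoofed hosts out.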
---
 cli/internal/provision/provision.go     | 44 +++++++++++++++++++++---
 cli/internal/provision/redirect_test.go | 45 +++++++++++++++++++++++++
 2 files changed, 85 insertions(+), 4 deletions(-)
 create mode 100644 cli/internal/provision/redirect_test.go

diff --git a/cli/internal/provision/provision.go b/cli/internal/provision/provision.go
index bb0fab9..d37e80c 100644
--- a/cli/internal/provision/provision.go
+++ b/cli/internal/provision/provision.go
@@ -142,8 +142,7 @@ func downloadOnce(url, tmp, label string) (bool, error) {
 	if start > 0 {
 		req.Header.Set("Range", fmt.Sprintf("bytes=%d-", start))
 	}
-	client := &http.Client{Transport: &http.Transport{ResponseHeaderTimeout: 60 * time.Second}}
-	resp, err := client.Do(req)
+	resp, err := downloadHTTPClient().Do(req)
 	if err != nil {
 		return false, err
 	}
@@ -263,7 +262,7 @@ func WhisperCLIBin() string {
 func latestReleaseAssets() (map[string]string, error) {
 	req, _ := http.NewRequest(http.MethodGet, "https://api.github.com/repos/"+podcliRepo+"/releases/latest", nil)
 	req.Header.Set("Accept", "application/vnd.github+json")
-	resp, err := http.DefaultClient.Do(req)
+	resp, err := releaseHTTPClient().Do(req)
 	if err != nil {
 		return nil, err
 	}
@@ -324,6 +323,43 @@ func releaseHTTPClient() *http.Client {
 	}
 }
 
+// allowedDownloadHost pins large-binary/model downloads to their known source
+// hosts and CDNs. Several of these payloads (Node, CPython, ffmpeg) are not
+// checksum-verified, so blocking redirects to unknown hosts is the main defense
+// against a diverted download. Initial request URLs are hardcoded (trusted);
+// this only constrains where a redirect may land.
+func allowedDownloadHost(h string) bool {
+	h = strings.ToLower(h)
+	for _, base := range []string{
+		"github.com",            // release assets
+		"githubusercontent.com", // objects.* / release-assets.* CDN
+		"huggingface.co",        // whisper.cpp models + cdn-lfs*.huggingface.co
+		"nodejs.org",            // hermetic Node
+		"evermeet.cx",           // macOS ffmpeg
+		"johnvansickle.com",     // linux ffmpeg
+	} {
+		if h == base || strings.HasSuffix(h, "."+base) {
+			return true
+		}
+	}
+	return false
+}
+
+func downloadHTTPClient() *http.Client {
+	return &http.Client{
+		Transport: &http.Transport{ResponseHeaderTimeout: 60 * time.Second},
+		CheckRedirect: func(req *http.Request, via []*http.Request) error {
+			if len(via) >= 10 {
+				return fmt.Errorf("too many redirects")
+			}
+			if !allowedDownloadHost(req.URL.Hostname()) {
+				return fmt.Errorf("refusing redirect to untrusted host %q", req.URL.Hostname())
+			}
+			return nil
+		},
+	}
+}
+
 func httpGetBytes(url string) ([]byte, error) {
 	resp, err := releaseHTTPClient().Get(url)
 	if err != nil {
@@ -541,7 +577,7 @@ func pythonAssetURL() (string, error) {
 	}
 	req, _ := http.NewRequest(http.MethodGet, "https://api.github.com/repos/astral-sh/python-build-standalone/releases/latest", nil)
 	req.Header.Set("Accept", "application/vnd.github+json")
-	resp, err := http.DefaultClient.Do(req)
+	resp, err := releaseHTTPClient().Do(req)
 	if err != nil {
 		return "", err
 	}
diff --git a/cli/internal/provision/redirect_test.go b/cli/internal/provision/redirect_test.go
new file mode 100644
index 0000000..8e03c9e
--- /dev/null
+++ b/cli/internal/provision/redirect_test.go
@@ -0,0 +1,45 @@
+package provision
+
+import (
+	"net/http"
+	"net/http/httptest"
+	"testing"
+)
+
+func TestAllowedDownloadHost(t *testing.T) {
+	allow := []string{
+		"github.com", "objects.githubusercontent.com", "release-assets.githubusercontent.com",
+		"huggingface.co", "cdn-lfs.huggingface.co", "cdn-lfs-us-1.huggingface.co",
+		"nodejs.org", "evermeet.cx", "johnvansickle.com", "www.johnvansickle.com",
+	}
+	for _, h := range allow {
+		if !allowedDownloadHost(h) {
+			t.Errorf("expected %q to be allowed", h)
+		}
+	}
+	// Suffix spoofing must not pass: a trusted name as a left-label is not enough.
+	deny := []string{
+		"evil.com", "github.com.evil.com", "huggingface.co.attacker.net",
+		"githubusercontent.com.evil.com", "127.0.0.1", "",
+	}
+	for _, h := range deny {
+		if allowedDownloadHost(h) {
+			t.Errorf("expected %q to be denied", h)
+		}
+	}
+}
+
+func TestDownloadClientRefusesUntrustedRedirect(t *testing.T) {
+	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
+		http.Redirect(w, r, "https://attacker.example/payload", http.StatusFound)
+	}))
+	defer srv.Close()
+
+	resp, err := downloadHTTPClient().Get(srv.URL)
+	if err == nil {
+		if resp != nil {
+			resp.Body.Close()
+		}
+		t.Fatal("expected redirect to an untrusted host to be refused")
+	}
+}

From b1a04eda88b9f871a45d0c8ee3ef09725bc29597 Mon Sep 17 00:00:00 2001
From: Nika Siradze <nikushasiradzee@gmail.com>
Date: Wed, 17 Jun 2026 00:19:35 +0400
Subject: [PATCH 35/41] Auto-provision on first run

Run setup automatically the first time a runtime command (the interactive menu,
process, transcribe, studio, auto) is invoked with no backend present, so a fresh
install works without a separate `podcli setup`. Skipped for lightweight and mcp
commands. Trim narration comments added in recent commits.
---
 backend/config_bundle.py | 19 ++++++-------------
 cli/main.go              | 26 ++++++++++++++++++++++++++
 2 files changed, 32 insertions(+), 13 deletions(-)

diff --git a/backend/config_bundle.py b/backend/config_bundle.py
index b0ad823..1e1e000 100644
--- a/backend/config_bundle.py
+++ b/backend/config_bundle.py
@@ -61,12 +61,8 @@ def _migration_marker_path() -> Path:
     return _data_dir() / MIGRATION_MARKER_NAME
 
 
-# The old ./podcli kept everything project-local. The native CLI keeps the brand
-# brain (presets, knowledge, assets, history, config) and the transcript cache in
-# the global managed dir so they follow the user across directories; only clips
-# stay in the working dir. Migration therefore reads the *working directory* —
-# the old ./podcli folder the user is standing in — and imports it into the global
-# home/cache. PODCLI_CWD is injected by the Go launcher; getcwd() is the fallback.
+# Migration reads the working directory (the old project-local ./podcli folder)
+# and imports it into the global store. PODCLI_CWD is injected by the launcher.
 def _legacy_project_dir() -> Path:
     return Path(os.environ.get("PODCLI_CWD") or os.getcwd()).expanduser().resolve()
 
@@ -123,10 +119,8 @@ def _legacy_home_pending() -> bool:
 
 
 def _asset_alias_keys(raw: str, source: Path) -> list[str]:
-    """Every string form an asset path may take in stored JSON. Presets/ui-state
-    keep the literal value the user/app wrote, which differs from its realpath
-    whenever a path component is a symlink (e.g. macOS /var -> /private/var). The
-    bundle rewrite matches literally, so register raw, expanded, and resolved."""
+    """All string forms of an asset path. The rewrite matches the literal stored
+    value, which differs from its realpath through a symlink (macOS /var)."""
     keys: list[str] = []
     for k in (raw, str(Path(raw).expanduser()) if raw else "", str(source), str(source.resolve())):
         if k and k not in keys:
@@ -394,9 +388,8 @@ def migrate_legacy_presets(*, dry_run: bool = False) -> dict[str, Any]:
 
 
 def migrate_legacy_home(*, dry_run: bool = False) -> dict[str, Any]:
-    """Import a project-local .podcli brand brain (presets, knowledge, assets,
-    history, config) from the working dir into the global home. Only runs when the
-    global home is still empty so it never clobbers an existing global profile."""
+    """Import a project-local .podcli into the global home — only when the global
+    home is empty, so it never clobbers an existing profile."""
     legacy = _legacy_home_dir()
     home = _global_home()
     result: dict[str, Any] = {
diff --git a/cli/main.go b/cli/main.go
index 9f9e048..6d24257 100644
--- a/cli/main.go
+++ b/cli/main.go
@@ -60,8 +60,34 @@ func main() {
 	}
 }
 
+// wantsRuntime gates first-run auto-provisioning: only commands that need the
+// backend trigger the download, not lightweight ones like config.
+func wantsRuntime(args []string) bool {
+	if len(args) == 0 {
+		return true
+	}
+	switch args[0] {
+	case "process", "transcribe", "studio", "auto":
+		return true
+	}
+	return false
+}
+
+// ensureRuntime self-provisions on first run so `podcli` works without a separate
+// `podcli setup`. Not called on the mcp path, whose stdout is the JSON-RPC channel.
+func ensureRuntime() {
+	if _, ok := engine.BackendRoot(); ok {
+		return
+	}
+	fmt.Fprintln(os.Stderr, "First run — setting up podcli (one-time download)…")
+	setup(nil)
+}
+
 func runEngine(args []string) int {
 	update.NotifyIfOutdated(Version)
+	if wantsRuntime(args) {
+		ensureRuntime()
+	}
 	if transcribeEngine(args) == "whispercpp" {
 		model, err := provision.EnsureModel(transcribeModel(args))
 		if err != nil {

From 75de4de73603bcff4ed2ca6ab52ea3933174c553 Mon Sep 17 00:00:00 2001
From: Nika Siradze <nikushasiradzee@gmail.com>
Date: Wed, 17 Jun 2026 00:33:06 +0400
Subject: [PATCH 36/41] Fix thumbnail rendering: ensure the browser and cache
 it stably
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

The thumbnail screenshot opened a Remotion browser via openBrowser without first
ensuring one exists (renderMedia, used for captions, ensures it implicitly), so
the first render just timed out connecting to a browser that was never
provisioned. Call ensureBrowser() before openBrowser(), and run the screenshot
from the runtime root — as caption rendering already does — so the Chrome
Headless Shell caches under a stable .remotion/ and is reused across working
directories instead of re-downloading into (and polluting) each one.
---
 backend/scripts/remotion_screenshot.cjs | 6 +++++-
 backend/services/thumbnail_html.py      | 6 ++++++
 2 files changed, 11 insertions(+), 1 deletion(-)

diff --git a/backend/scripts/remotion_screenshot.cjs b/backend/scripts/remotion_screenshot.cjs
index e6f2f9b..045ceea 100644
--- a/backend/scripts/remotion_screenshot.cjs
+++ b/backend/scripts/remotion_screenshot.cjs
@@ -2,7 +2,7 @@
 
 const path = require("node:path");
 const rendererRoot = path.resolve(__dirname, "..", "..", "node_modules", "@remotion", "renderer", "dist");
-const {openBrowser} = require("@remotion/renderer");
+const {openBrowser, ensureBrowser} = require("@remotion/renderer");
 const {screenshot} = require(path.join(rendererRoot, "puppeteer-screenshot.js"));
 
 const [htmlPath, outputPath, widthArg, heightArg, waitMsArg] = process.argv.slice(2);
@@ -46,6 +46,10 @@ const waitForAssets = async (page) => {
 };
 
 (async () => {
+  // Download the Chrome Headless Shell if it isn't cached yet. renderMedia does
+  // this implicitly; openBrowser does not, so without it the first thumbnail
+  // render just times out connecting to a browser that was never provisioned.
+  await ensureBrowser({logLevel: "error"});
   const browser = await openBrowser("chrome", {logLevel: "error"});
   const closeBrowser = () => { try { browser.close({silent: true}); } catch {} };
   process.on("SIGINT", () => { closeBrowser(); process.exit(1); });
diff --git a/backend/services/thumbnail_html.py b/backend/services/thumbnail_html.py
index c52ea86..bd0738e 100644
--- a/backend/services/thumbnail_html.py
+++ b/backend/services/thumbnail_html.py
@@ -657,6 +657,11 @@ def generate_thumbnail(
         if not commands:
             raise RuntimeError("No browser screenshot command available")
 
+        # Run from the runtime root (where caption rendering runs too) so Remotion's
+        # Chrome Headless Shell caches under a stable .remotion/ and is reused across
+        # working directories instead of re-downloading into each one.
+        screenshot_cwd = os.path.abspath(os.path.join(os.path.dirname(os.path.abspath(__file__)), "..", ".."))
+
         for cmd in commands:
             cmd_label = " ".join(cmd[:3]) if len(cmd) > 3 else " ".join(cmd)
             try:
@@ -665,6 +670,7 @@ def generate_thumbnail(
                     capture_output=True,
                     text=True,
                     timeout=timeout_s,
+                    cwd=screenshot_cwd,
                     # .cmd/npx shims need cmd.exe on Windows; shell=True with a list breaks on POSIX.
                     shell=sys.platform == "win32",
                 )

From d169c5d3227f1a6dd3c0de5dda9a3ccb2940b15d Mon Sep 17 00:00:00 2001
From: Nika Siradze <nikushasiradzee@gmail.com>
Date: Wed, 17 Jun 2026 00:46:30 +0400
Subject: [PATCH 37/41] Always produce a clip: default to ASS caption fallback

Default allow_ass_fallback to true so that a Remotion caption-render failure
(browser unavailable, offline, arch mismatch) falls back to ffmpeg-burned ASS
captions instead of exporting zero clips. Healthy installs still try Remotion
first, so the fallback only rescues runs that would otherwise hard-fail. Also
drop the explanatory comments around ensureBrowser and screenshot_cwd.
---
 backend/presets.py                      | 2 +-
 backend/scripts/remotion_screenshot.cjs | 3 ---
 backend/services/thumbnail_html.py      | 3 ---
 3 files changed, 1 insertion(+), 7 deletions(-)

diff --git a/backend/presets.py b/backend/presets.py
index ff74696..70f30f6 100644
--- a/backend/presets.py
+++ b/backend/presets.py
@@ -40,7 +40,7 @@
     "energy_boost": True,
     "quality": "max",
     "no_speakers": False,
-    "allow_ass_fallback": False,
+    "allow_ass_fallback": True,
     "use_ass_captions": False,
     "generate_thumbnails": True,
     "generate_content": True,
diff --git a/backend/scripts/remotion_screenshot.cjs b/backend/scripts/remotion_screenshot.cjs
index 045ceea..0942669 100644
--- a/backend/scripts/remotion_screenshot.cjs
+++ b/backend/scripts/remotion_screenshot.cjs
@@ -46,9 +46,6 @@ const waitForAssets = async (page) => {
 };
 
 (async () => {
-  // Download the Chrome Headless Shell if it isn't cached yet. renderMedia does
-  // this implicitly; openBrowser does not, so without it the first thumbnail
-  // render just times out connecting to a browser that was never provisioned.
   await ensureBrowser({logLevel: "error"});
   const browser = await openBrowser("chrome", {logLevel: "error"});
   const closeBrowser = () => { try { browser.close({silent: true}); } catch {} };
diff --git a/backend/services/thumbnail_html.py b/backend/services/thumbnail_html.py
index bd0738e..68cf3ea 100644
--- a/backend/services/thumbnail_html.py
+++ b/backend/services/thumbnail_html.py
@@ -657,9 +657,6 @@ def generate_thumbnail(
         if not commands:
             raise RuntimeError("No browser screenshot command available")
 
-        # Run from the runtime root (where caption rendering runs too) so Remotion's
-        # Chrome Headless Shell caches under a stable .remotion/ and is reused across
-        # working directories instead of re-downloading into each one.
         screenshot_cwd = os.path.abspath(os.path.join(os.path.dirname(os.path.abspath(__file__)), "..", ".."))
 
         for cmd in commands:

From 8fe8bc9015beac82a5b47b8f50bf3048b38e4374 Mon Sep 17 00:00:00 2001
From: Nika Siradze <nikushasiradzee@gmail.com>
Date: Wed, 17 Jun 2026 00:51:52 +0400
Subject: [PATCH 38/41] Add node-less native installers (curl / PowerShell)
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

A new machine can install with only curl or PowerShell; Go, Node, Python, and
FFmpeg are not required. install.sh / install.ps1 download the prebuilt,
statically linked binary, verify it against the release's checksums.txt when one
is available, and put it on PATH (install.sh prints the export line instead if it
finds no writable bin directory). The binary provisions its own hermetic runtimes
on first launch. The README install section is rewritten for the native flow.
---
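Reviewer note: the target selection in install.sh reduces to two lookups. A
minimal Python sketch of that mapping follows, assuming the asset naming used in
this patch (`podcli-<os>-<arch>`). `release_asset` is a hypothetical helper name
for illustration and is not part of either installer.

```python
# Illustrative sketch only: mirrors the two `case` statements in install.sh.
GOOS = {"Darwin": "darwin", "Linux": "linux"}
GOARCH = {"x86_64": "amd64", "amd64": "amd64", "arm64": "arm64", "aarch64": "arm64"}


def release_asset(system: str, machine: str) -> str:
    """Map `uname -s` / `uname -m` values to a release asset name."""
    if system not in GOOS:
        raise ValueError(f"unsupported OS: {system}")
    if machine not in GOARCH:
        raise ValueError(f"unsupported architecture: {machine}")
    return f"podcli-{GOOS[system]}-{GOARCH[machine]}"
```

Calling `release_asset("Linux", "aarch64")` yields the same name install.sh
would request on an ARM Linux host; anything outside the two tables is rejected
up front, as in the script.
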
 README.md   | 54 +++++++++++++------------------------
 install.ps1 | 47 ++++++++++++++++++++++++++++++++
 install.sh  | 77 +++++++++++++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 143 insertions(+), 35 deletions(-)
 create mode 100644 install.ps1
 create mode 100755 install.sh

diff --git a/README.md b/README.md
index 7ecaac1..4d4af33 100644
--- a/README.md
+++ b/README.md
@@ -151,59 +151,43 @@ Both halves share the same **knowledge base** (`.podcli/knowledge/`) — your sh
 
 ---
 
-## Prerequisites
+## Install
 
-| Tool                       | Install                                                                                                                                              |
-| -------------------------- | ---------------------------------------------------------------------------------------------------------------------------------------------------- |
-| **Node.js** >= 18          | [nodejs.org](https://nodejs.org)                                                                                                                     |
-| **Python** >= 3.10         | [python.org](https://python.org)                                                                                                                     |
-| **FFmpeg**                 | `brew install ffmpeg` / `sudo apt install ffmpeg`                                                                                                    |
-| **Claude Code** (optional) | [docs.anthropic.com](https://docs.anthropic.com/en/docs/claude-code) — needed for PodStack slash commands                                            |
-| **Codex** (optional)       | [openai.com/codex](https://openai.com/index/introducing-codex/) — alternative AI engine for clip suggestion (auto-detected if Claude is unavailable) |
-
-## Quick Start
+No prerequisites — the install fetches a self-contained binary, and the first run
+provisions everything it needs (Python, Node, FFmpeg, whisper.cpp, models) into a
+managed directory. You don't need Go, Node, Python, or FFmpeg installed.
 
 **macOS / Linux**
 
 ```bash
-git clone https://github.com/nmbrthirteen/podcli.git
-cd podcli
-chmod +x setup.sh podcli
-./setup.sh
+curl -fsSL https://raw.githubusercontent.com/nmbrthirteen/podcli/main/install.sh | sh
 ```
 
-**Windows**
+**Windows (PowerShell)**
 
 ```powershell
-git clone https://github.com/nmbrthirteen/podcli.git
-cd podcli
+irm https://raw.githubusercontent.com/nmbrthirteen/podcli/main/install.ps1 | iex
 ```
 
-Then double-click **`install.cmd`** (or run it in a terminal). It installs everything and keeps the window open so you can see the result. To launch the studio afterwards:
+**With npm** (if you already have Node):
 
-```powershell
-powershell -ExecutionPolicy Bypass -File setup.ps1 -Ui
+```bash
+npm install -g podcli
 ```
 
-This will:
-
-1. Check system dependencies (Node, Python, FFmpeg)
-2. Create a Python virtual environment and install packages
-3. Install Node packages and build TypeScript
-4. Set up PodStack slash commands and knowledge base templates
-5. Create the local `.podcli/` data directory
-6. Launch the web UI at **http://localhost:3847**
-
-### Setup options
+Then just run it — the first launch sets itself up:
 
 ```bash
-./setup.sh              # full install + launch UI
-./setup.sh --install    # install only
-./setup.sh --ui         # launch UI only (skip install)
-./setup.sh --mcp        # print MCP config for Claude
+podcli                       # interactive menu (and Web UI)
+podcli process episode.mp4   # transcribe + export clips
 ```
 
-On Windows, use `.\setup.ps1` with `-Install`, `-Ui`, or `-Mcp`, and run the CLI via `podcli.cmd` (e.g. `podcli process video.mp4 --top 5`).
+**Optional**, for AI clip suggestion and the PodStack slash commands: install
+[Claude Code](https://docs.anthropic.com/en/docs/claude-code) or
+[Codex](https://openai.com/index/introducing-codex/) (auto-detected).
+
+> Building from source needs Go 1.23+ (and Node for the studio bundle); see
+> [`plans/native-cli.md`](plans/native-cli.md).
 
 ---
 
diff --git a/install.ps1 b/install.ps1
new file mode 100644
index 0000000..d89591e
--- /dev/null
+++ b/install.ps1
@@ -0,0 +1,47 @@
+# podcli installer for Windows — downloads the prebuilt native binary (no Go,
+# Node, Python, or ffmpeg needed; the binary provisions those on first run).
+# Usage: irm https://raw.githubusercontent.com/nmbrthirteen/podcli/main/install.ps1 | iex
+$ErrorActionPreference = 'Stop'
+$repo = 'nmbrthirteen/podcli'
+$target = 'windows-amd64'
+
+$homeDir = Join-Path $env:LOCALAPPDATA 'podcli'
+$binDir = Join-Path $homeDir 'bin'
+New-Item -ItemType Directory -Force -Path $binDir | Out-Null
+
+$version = $env:PODCLI_VERSION
+if (-not $version) {
+  $rel = Invoke-RestMethod "https://api.github.com/repos/$repo/releases/latest" -Headers @{ 'User-Agent' = 'podcli-install' }
+  $version = $rel.tag_name -replace '^v', ''
+}
+
+$asset = "podcli-$target.exe"
+$base = "https://github.com/$repo/releases/download/v$version"
+Write-Host "Installing podcli v$version ($target)…"
+
+$dest = Join-Path $binDir 'podcli.exe'
+Invoke-WebRequest "$base/$asset" -OutFile $dest -UseBasicParsing
+
+$sums = try {
+  (Invoke-WebRequest "$base/checksums.txt" -UseBasicParsing).Content
+} catch {
+  Write-Host "  checksum verification skipped: $($_.Exception.Message)"
+}
+$want = if ($sums) { $sums -split "`n" |
+    Where-Object { $_ -match ([regex]::Escape($asset) + '\s*$') } |
+    ForEach-Object { ($_ -split '\s+')[0] } | Select-Object -First 1 }
+if ($want) {
+  $got = (Get-FileHash $dest -Algorithm SHA256).Hash.ToLower()
+  if ($got -ne $want.ToLower()) { Remove-Item $dest -Force; throw "checksum mismatch (got $got want $want)" }
+  Write-Host "  checksum verified"
+} elseif ($sums) {
+  Write-Host "  no checksum entry for $asset, skipped verification"
+}
+
+$userPath = [Environment]::GetEnvironmentVariable('Path', 'User')
+if ($userPath -notlike "*$binDir*") {
+  [Environment]::SetEnvironmentVariable('Path', "$binDir;$userPath", 'User')
+  Write-Host "  added to PATH (restart your terminal)"
+}
+Write-Host ""
+Write-Host "Done — run:  podcli"
diff --git a/install.sh b/install.sh
new file mode 100755
index 0000000..4b2a2db
--- /dev/null
+++ b/install.sh
@@ -0,0 +1,77 @@
+#!/bin/sh
+# podcli installer — downloads the prebuilt native binary (no Go, Node, Python,
+# or ffmpeg needed; the binary provisions those itself on first run).
+# Usage: curl -fsSL https://raw.githubusercontent.com/nmbrthirteen/podcli/main/install.sh | sh
+set -eu
+
+REPO="nmbrthirteen/podcli"
+err() { echo "podcli-install: $*" >&2; exit 1; }
+command -v curl >/dev/null 2>&1 || err "curl is required"
+
+os=$(uname -s 2>/dev/null || echo unknown)
+arch=$(uname -m 2>/dev/null || echo unknown)
+case "$os" in
+  Darwin) goos=darwin; home_dir="$HOME/Library/Application Support/podcli" ;;
+  Linux) goos=linux; home_dir="${XDG_DATA_HOME:-$HOME/.local/share}/podcli" ;;
+  *) err "unsupported OS: $os (on Windows use install.ps1)" ;;
+esac
+case "$arch" in
+  x86_64|amd64) goarch=amd64 ;;
+  arm64|aarch64) goarch=arm64 ;;
+  *) err "unsupported architecture: $arch" ;;
+esac
+target="${goos}-${goarch}"
+bin_dir="$home_dir/bin"
+mkdir -p "$bin_dir"
+
+version="${PODCLI_VERSION:-}"
+if [ -z "$version" ]; then
+  version=$(curl -fsSL "https://api.github.com/repos/$REPO/releases/latest" \
+    | sed -n 's/.*"tag_name":[ ]*"v\{0,1\}\([^"]*\)".*/\1/p' | head -1)
+  [ -n "$version" ] || err "could not resolve the latest release"
+fi
+
+asset="podcli-${target}"
+base="https://github.com/$REPO/releases/download/v${version}"
+echo "Installing podcli v${version} (${target})…"
+
+tmp=$(mktemp -d)
+trap 'rm -rf "$tmp"' EXIT
+curl -fSL --proto '=https' --tlsv1.2 "$base/$asset" -o "$tmp/$asset" || err "download failed"
+
+if curl -fsSL "$base/checksums.txt" -o "$tmp/sums" 2>/dev/null; then
+  want=$(grep "[ /]$asset\$" "$tmp/sums" | awk '{print $1}' | head -1)
+  if [ -n "$want" ]; then
+    if command -v sha256sum >/dev/null 2>&1; then
+      got=$(sha256sum "$tmp/$asset" | awk '{print $1}')
+    else
+      got=$(shasum -a 256 "$tmp/$asset" | awk '{print $1}')
+    fi
+    [ "$got" = "$want" ] || err "checksum mismatch (got $got want $want)"
+    echo "  checksum verified"
+  else
+    echo "  no checksum entry for $asset — skipped verification" >&2
+  fi
+else
+  echo "  no checksums.txt in release — skipped verification" >&2
+fi
+
+cp "$tmp/$asset" "$bin_dir/podcli"
+chmod 0755 "$bin_dir/podcli"
+echo "  installed: $bin_dir/podcli"
+
+linked=""
+for d in /usr/local/bin "$HOME/.local/bin"; do
+  if [ -d "$d" ] && [ -w "$d" ]; then
+    ln -sf "$bin_dir/podcli" "$d/podcli" 2>/dev/null && { linked="$d/podcli"; break; }
+  fi
+done
+
+echo
+if [ -n "$linked" ]; then
+  echo "Done — run:  podcli"
+else
+  echo "Done. Add podcli to your PATH:"
+  echo "  export PATH=\"$bin_dir:\$PATH\""
+  echo "Then run:  podcli"
+fi

From e223dfdd903566be04b7fb47e6931bac6744a2c1 Mon Sep 17 00:00:00 2001
From: Nika Siradze <nikushasiradzee@gmail.com>
Date: Wed, 17 Jun 2026 00:55:18 +0400
Subject: [PATCH 39/41] Harden two file-path surfaces

- Knowledge upload wrote the raw multipart originalname into the knowledge dir,
  so a name like ../../x.md escaped it (multer doesn't sanitize separators).
  Use basename for both the stored name and the extension filter.
- extractTarXz shelled out to system tar with no traversal guard (the unverified
  johnvansickle ffmpeg path). List entries first and reject absolute or ..
  paths before extracting.
---
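Reviewer note: the traversal guard added to extractTarXz can be sketched in a
few lines. This is a Python illustration of the same rule (reject absolute
entries and entries that normalize to a path starting with `..`), not the Go
code itself. `unsafe_entries` is a hypothetical name, and the sketch assumes
entry names use `/` as the separator, as tar listings do.

```python
import posixpath


def unsafe_entries(names):
    """Return the archive entry names that are absolute or climb out via '..'."""
    bad = []
    for raw in names:
        name = raw.strip()
        if not name:
            continue  # blank lines in the listing are ignored
        clean = posixpath.normpath(name)
        if posixpath.isabs(clean) or clean == ".." or clean.startswith("../"):
            bad.append(name)
    return bad
```

An extractor following the patch would refuse the whole archive as soon as this
list is non-empty. Note that a name-based check like this does not cover
symlink entries that point outside the destination.
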
 cli/internal/provision/provision.go | 14 ++++++++++++++
 src/ui/web-server.ts                |  8 +++-----
 2 files changed, 17 insertions(+), 5 deletions(-)

diff --git a/cli/internal/provision/provision.go b/cli/internal/provision/provision.go
index d37e80c..b55319b 100644
--- a/cli/internal/provision/provision.go
+++ b/cli/internal/provision/provision.go
@@ -523,6 +523,20 @@ func extractTarXz(archive string, bins []string, dest string) error {
 		return err
 	}
 	defer os.RemoveAll(tmp)
+	listing, err := exec.Command("tar", "-tf", archive).Output()
+	if err != nil {
+		return fmt.Errorf("tar list (is tar installed?): %w", err)
+	}
+	for _, name := range strings.Split(string(listing), "\n") {
+		name = strings.TrimSpace(name)
+		if name == "" {
+			continue
+		}
+		clean := filepath.Clean(name)
+		if filepath.IsAbs(clean) || clean == ".." || strings.HasPrefix(clean, ".."+string(os.PathSeparator)) {
+			return fmt.Errorf("refusing archive with unsafe path: %q", name)
+		}
+	}
 	cmd := exec.Command("tar", "-xf", archive, "-C", tmp)
 	cmd.Stderr = os.Stderr
 	if err := cmd.Run(); err != nil {
diff --git a/src/ui/web-server.ts b/src/ui/web-server.ts
index cabbdf3..fd9d3f8 100644
--- a/src/ui/web-server.ts
+++ b/src/ui/web-server.ts
@@ -1792,13 +1792,11 @@ const knowledgeUpload = multer({
       await mkdir(paths.knowledge, { recursive: true });
       cb(null, paths.knowledge);
     },
-    filename: (_req, file, cb) => cb(null, file.originalname),
+    filename: (_req, file, cb) => cb(null, basename(file.originalname)),
   }),
   fileFilter: (_req, file, cb) => {
-    if (
-      file.originalname.endsWith(".md") ||
-      file.originalname.endsWith(".txt")
-    ) {
+    const name = basename(file.originalname);
+    if (name.endsWith(".md") || name.endsWith(".txt")) {
       cb(null, true);
     } else {
       cb(new Error("Only .md and .txt files are allowed"));

From 108b8f559bdcd03e2a820cbb4f5137d1d433461d Mon Sep 17 00:00:00 2001
From: Nika Siradze <nikushasiradzee@gmail.com>
Date: Wed, 17 Jun 2026 01:05:28 +0400
Subject: [PATCH 40/41] Checksum-verify Node/CPython downloads; native-flow
 docs; release guide

- Verify the Node tarball against nodejs.org SHASUMS256.txt and the CPython
  build against python-build-standalone's SHA256SUMS after download. Fail closed
  on mismatch; fail open if the manifest is unreachable so a transient network
  issue can't block provisioning.
- README: rewrite usage for the native binary (podcli, not ./podcli), Web UI via
  the menu, and MCP via `podcli mcp install`.
- Add RELEASE.md (tag-driven release + smoke test); align root package.json to 2.0.0.
---
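Reviewer note: the verification policy described above (fail closed on a
mismatch, fail open when the manifest or its entry is missing) can be sketched
as follows. This is a Python illustration under assumptions: `parse_checksums`
is a hypothetical stand-in for the Go `ParseChecksums` helper, whose
implementation is not shown in this patch, and it assumes file names without
spaces.

```python
import hashlib


def parse_checksums(text):
    """Map file name -> lowercase hex digest from sha256sum-style lines."""
    sums = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) >= 2:
            # sha256sum marks binary mode with a leading '*' on the name.
            sums[parts[-1].lstrip("*")] = parts[0].lower()
    return sums


def verify(data, manifest_text, name):
    """Return 'verified' or 'skipped'; raise ValueError on a digest mismatch."""
    if manifest_text is None:  # manifest could not be fetched: fail open
        return "skipped"
    want = parse_checksums(manifest_text).get(name)
    if not want:  # no entry for this file: fail open
        return "skipped"
    got = hashlib.sha256(data).hexdigest()
    if got != want:  # entry exists and differs: fail closed
        raise ValueError(f"checksum mismatch for {name}: got {got} want {want}")
    return "verified"
```

The trade-off is deliberate: a transient failure to fetch the sums file does
not block provisioning, at the cost that an attacker who can suppress the
manifest also suppresses the check.
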
 README.md                           | 41 +++++++---------
 RELEASE.md                          | 50 +++++++++++++++++++
 cli/internal/provision/provision.go | 75 +++++++++++++++++++++++------
 cli/internal/provision/studio.go    |  3 ++
 package.json                        |  2 +-
 5 files changed, 133 insertions(+), 38 deletions(-)
 create mode 100644 RELEASE.md

diff --git a/README.md b/README.md
index 4d4af33..a325018 100644
--- a/README.md
+++ b/README.md
@@ -27,7 +27,7 @@
 <p align="center"><sub>▶ <a href="https://x.com/nikasiradze_/status/2056061654664708570">Watch with sound on X</a></sub></p>
 
 ```bash
-./podcli process episode.mp4
+podcli process episode.mp4
 ```
 
 One command transcribes, picks the best moments, crops to the face, and burns captions in. Nothing leaves your machine.
@@ -70,7 +70,7 @@ The first half is **video processing** — podcli's core engine. The second half
 Drag your video into the Web UI, or use the CLI:
 
 ```bash
-./podcli process episode.mp4
+podcli process episode.mp4
 ```
 
 ### 2. Get clips automatically
@@ -147,7 +147,7 @@ Both halves share the same **knowledge base** (`.podcli/knowledge/`) — your sh
 - **Preset system** — save named configurations per show
 - **MCP server** — 17 tools for Claude Desktop / Claude Code integration
 - **Web UI** — single-page flow at `localhost:3847`
-- **CLI** — one-command processing: `./podcli process episode.mp4`
+- **CLI** — one-command processing: `podcli process episode.mp4`
 
 ---
 
@@ -196,7 +196,7 @@ podcli process episode.mp4   # transcribe + export clips
 ### Web UI
 
 ```bash
-./setup.sh --ui
+podcli            # then choose "Open Web UI"
 # → http://localhost:3847
 ```
 
@@ -211,17 +211,17 @@ podcli process episode.mp4   # transcribe + export clips
 
 ```bash
 # One command. Auto-transcribes, picks moments, renders clips.
-./podcli process episode.mp4
+podcli process episode.mp4
 ```
 
 With more control:
 
 ```bash
 # Use an existing transcript instead of transcribing
-./podcli process episode.mp4 --transcript transcript.txt --top 5
+podcli process episode.mp4 --transcript transcript.txt --top 5
 
 # Full options
-./podcli process episode.mp4 \
+podcli process episode.mp4 \
   --transcript transcript.txt \
   --top 8 \
   --caption-style branded \
@@ -232,9 +232,9 @@ With more control:
 ### Presets
 
 ```bash
-./podcli presets save myshow --caption-style branded --logo logo.png --top 5
-./podcli presets list
-./podcli process video.mp4 --preset myshow
+podcli presets save myshow --caption-style branded --logo logo.png --top 5
+podcli presets list
+podcli process video.mp4 --preset myshow
 ```
 
 ### Content Workflow (PodStack)
@@ -289,30 +289,25 @@ Manage via the web UI at `/knowledge.html` (drag & drop, inline editor) or throu
 
 podcli is a [Model Context Protocol](https://modelcontextprotocol.io) server — Claude can use it as a tool to create clips through conversation.
 
+**Claude Code** — register the bundled MCP server in one command:
+
+```bash
+podcli mcp install
+```
+
 **Claude Desktop** — add to `claude_desktop_config.json`:
 
 ```json
 {
   "mcpServers": {
     "podcli": {
-      "command": "node",
-      "args": ["/path/to/podcli/dist/index.js"],
-      "env": {
-        "PYTHON_PATH": "/path/to/podcli/venv/bin/python3"
-      }
+      "command": "podcli",
+      "args": ["mcp"]
     }
   }
 }
 ```
 
-**Claude Code:**
-
-```bash
-claude mcp add podcli -- node /path/to/podcli/dist/index.js
-```
-
-Run `./setup.sh --mcp` to get the exact config with your paths filled in.
-
 ### MCP Tools
 
 | Tool                 | Description                                                              |
diff --git a/RELEASE.md b/RELEASE.md
new file mode 100644
index 0000000..f518e74
--- /dev/null
+++ b/RELEASE.md
@@ -0,0 +1,50 @@
+# Releasing podcli
+
+Distribution is fully automated by `.github/workflows/release.yml`: pushing a
+`v*` tag builds the launcher for all five platforms, builds whisper.cpp, bundles
+the studio and Remotion, generates `checksums.txt`, publishes a GitHub release,
+and publishes the npm package. End users then install with no prerequisites:
+
+```bash
+curl -fsSL https://raw.githubusercontent.com/nmbrthirteen/podcli/main/install.sh | sh
+# or: npm install -g podcli   (Windows: install.ps1)
+```
+
+## One-time setup
+
+- Set the **`NPM_TOKEN`** repository secret (an npm automation token with publish
+  rights). Without it the GitHub release still publishes; the npm job fails and
+  can be re-run after the secret is added.
+
+## Cutting a release
+
+1. Pick the version `X.Y.Z`. Set it in **`npm/package.json`** (`install.js` fetches
+   the binary from `releases/download/vX.Y.Z/`, so the npm version must equal the
+   tag). Optionally align the root `package.json`.
+2. Merge to `main` and make sure CI is green.
+3. Tag and push:
+   ```bash
+   git tag vX.Y.Z
+   git push origin vX.Y.Z
+   ```
+4. Watch the `release` workflow. It produces these assets (the names the npm
+   wrapper, self-update, and provisioner expect):
+   - `podcli-{darwin,linux,windows}-{amd64,arm64}[.exe]` — static launchers
+   - `whisper-cli-<os>-<arch>[.exe]`
+   - `studio-bundle.tar.gz`
+   - `remotion-<os>-<arch>.tar.gz`
+   - `checksums.txt`
+5. Verify the npm publish succeeded (`npm view podcli version`).
+
+## Smoke test (ideally one machine per OS)
+
+```bash
+curl -fsSL .../install.sh | sh      # or npm i -g podcli
+podcli doctor                        # paths + engine resolution
+podcli process sample.mp4 --top 1    # transcribe -> suggest -> export a clip
+podcli                               # interactive menu -> Open Web UI
+```
+
+The first run downloads the hermetic runtimes (Python, Node, FFmpeg, model) once,
+so it needs outbound HTTPS to github.com, huggingface.co, nodejs.org, and the
+ffmpeg hosts. On Linux, only glibc distributions are supported (Alpine/musl is not).
diff --git a/cli/internal/provision/provision.go b/cli/internal/provision/provision.go
index b55319b..ce31c5d 100644
--- a/cli/internal/provision/provision.go
+++ b/cli/internal/provision/provision.go
@@ -129,6 +129,41 @@ func download(url, dest, wantSHA, label string) error {
 	return nil
 }
 
+// verifyDownload checks a downloaded archive against an upstream sha256sum-style
+// manifest. Fails closed on a mismatch (removes the file); fails open with a
+// warning when the manifest or its entry can't be fetched, so a transient
+// network issue on the sums file doesn't block provisioning.
+func verifyDownload(archive, sumsURL, name string) error {
+	resp, err := downloadHTTPClient().Get(sumsURL)
+	if err != nil {
+		fmt.Fprintf(os.Stderr, "  (could not fetch checksums for %s — skipped verification)\n", name)
+		return nil
+	}
+	defer resp.Body.Close()
+	if resp.StatusCode != http.StatusOK {
+		fmt.Fprintf(os.Stderr, "  (no checksums for %s — skipped verification)\n", name)
+		return nil
+	}
+	data, err := io.ReadAll(io.LimitReader(resp.Body, 4<<20))
+	if err != nil {
+		return err
+	}
+	want := ParseChecksums(data)[name]
+	if want == "" {
+		fmt.Fprintf(os.Stderr, "  (no checksum entry for %s — skipped verification)\n", name)
+		return nil
+	}
+	got, err := sha256file(archive)
+	if err != nil {
+		return err
+	}
+	if got != want {
+		os.Remove(archive)
+		return fmt.Errorf("checksum mismatch for %s: got %s want %s", name, got, want)
+	}
+	return nil
+}
+
 func downloadOnce(url, tmp, label string) (bool, error) {
 	var start int64
 	if fi, err := os.Stat(tmp); err == nil {
@@ -584,20 +619,20 @@ func PythonBin() string {
 // pythonAssetURL resolves a python-build-standalone install_only tarball for
 // this platform via the GitHub latest-release API, so it tracks upstream
 // without a hardcoded version that rots.
-func pythonAssetURL() (string, error) {
+func pythonAssetURL() (url, name, sumsURL string, err error) {
 	triple, ok := pyTriples[runtime.GOOS+"/"+runtime.GOARCH]
 	if !ok {
-		return "", fmt.Errorf("no python build for %s/%s", runtime.GOOS, runtime.GOARCH)
+		return "", "", "", fmt.Errorf("no python build for %s/%s", runtime.GOOS, runtime.GOARCH)
 	}
 	req, _ := http.NewRequest(http.MethodGet, "https://api.github.com/repos/astral-sh/python-build-standalone/releases/latest", nil)
 	req.Header.Set("Accept", "application/vnd.github+json")
 	resp, err := releaseHTTPClient().Do(req)
 	if err != nil {
-		return "", err
+		return "", "", "", err
 	}
 	defer resp.Body.Close()
 	if resp.StatusCode != http.StatusOK {
-		return "", fmt.Errorf("github api: HTTP %d", resp.StatusCode)
+		return "", "", "", fmt.Errorf("github api: HTTP %d", resp.StatusCode)
 	}
 	var rel struct {
 		Assets []struct {
@@ -606,29 +641,36 @@ func pythonAssetURL() (string, error) {
 		} `json:"assets"`
 	}
 	if err := json.NewDecoder(resp.Body).Decode(&rel); err != nil {
-		return "", err
+		return "", "", "", err
 	}
-	match := func(prefer string) string {
+	sums := ""
+	for _, a := range rel.Assets {
+		if a.Name == "SHA256SUMS" {
+			sums = a.URL
+			break
+		}
+	}
+	match := func(prefer string) (string, string) {
 		for _, a := range rel.Assets {
 			if strings.Contains(a.Name, triple) && strings.HasSuffix(a.Name, "install_only.tar.gz") && strings.Contains(a.Name, prefer) {
-				return a.URL
+				return a.URL, a.Name
 			}
 		}
-		return ""
+		return "", ""
 	}
-	if u := match("cpython-3.12."); u != "" {
-		return u, nil
+	if u, n := match("cpython-3.12."); u != "" {
+		return u, n, sums, nil
 	}
-	if u := match("cpython-3."); u != "" {
-		return u, nil
+	if u, n := match("cpython-3."); u != "" {
+		return u, n, sums, nil
 	}
-	return "", fmt.Errorf("no install_only python asset for %s", triple)
+	return "", "", "", fmt.Errorf("no install_only python asset for %s", triple)
 }
 
 func EnsurePython(requirements string) (string, error) {
 	bin := PythonBin()
 	if !have(bin) {
-		url, err := pythonAssetURL()
+		url, name, sumsURL, err := pythonAssetURL()
 		if err != nil {
 			return "", err
 		}
@@ -637,6 +679,11 @@ func EnsurePython(requirements string) (string, error) {
 		if err := fetch(url, archive, "cpython"); err != nil {
 			return "", err
 		}
+		if sumsURL != "" {
+			if err := verifyDownload(archive, sumsURL, name); err != nil {
+				return "", err
+			}
+		}
 		err = extractTarGz(archive, paths.RuntimeDir())
 		os.Remove(archive)
 		if err != nil {
diff --git a/cli/internal/provision/studio.go b/cli/internal/provision/studio.go
index 43a6e17..f11a40e 100644
--- a/cli/internal/provision/studio.go
+++ b/cli/internal/provision/studio.go
@@ -60,6 +60,9 @@ func EnsureNode() (string, error) {
 		return "", err
 	}
 	defer os.Remove(archive)
+	if err := verifyDownload(archive, fmt.Sprintf("https://nodejs.org/dist/v%s/SHASUMS256.txt", nodeVersion), base+"."+ext); err != nil {
+		return "", err
+	}
 	if err := os.RemoveAll(NodeDir()); err != nil {
 		return "", err
 	}
diff --git a/package.json b/package.json
index fde9871..9164b4e 100644
--- a/package.json
+++ b/package.json
@@ -1,6 +1,6 @@
 {
   "name": "podcli",
-  "version": "1.1.0",
+  "version": "2.0.0",
   "description": "AI-powered podcast clip generator for TikTok/YouTube Shorts. Transcribe, find viral moments, export vertical clips with burned captions.",
   "type": "module",
   "license": "AGPL-3.0-only",

From 97d8fdc1155ef8affe7d7e16391a6b31495bbb6a Mon Sep 17 00:00:00 2001
From: Nika Siradze <nikushasiradzee@gmail.com>
Date: Wed, 17 Jun 2026 01:13:04 +0400
Subject: [PATCH 41/41] Align docs to the native install flow

- README: Web UI via the menu (not setup.sh), project tree shows cli/ launcher +
  installers, setup.sh labeled as dev setup.
- CONTRIBUTING: document the Go launcher build + go test/gofmt in the PR checklist
  alongside the Python/TS dev setup.
- AGENTS.podstack: drop the nonexistent `setup.sh --host` flag.
---
 AGENTS.podstack.md |  3 ++-
 CONTRIBUTING.md    | 38 ++++++++++++++++++++++++++++++--------
 README.md          |  7 ++++---
 3 files changed, 36 insertions(+), 12 deletions(-)

diff --git a/AGENTS.podstack.md b/AGENTS.podstack.md
index 13623a9..a046563 100644
--- a/AGENTS.podstack.md
+++ b/AGENTS.podstack.md
@@ -174,7 +174,8 @@ PodStack ships one source-of-truth (`commands/`) and installs to the right locat
 | opencode | `.opencode/commands/*.md` | `AGENTS.md` |
 | Generic | `commands/*.md` | `AGENTS.md` |
 
-Install: `./setup.sh --host <name>`. See `README.md` for per-host usage examples.
+These command files ship with podcli; copy the set for your tool into the command
+directory listed in its row above. See `README.md` for per-host usage examples.
 
 ---
 
diff --git a/CONTRIBUTING.md b/CONTRIBUTING.md
index 3468c15..5f9ddfe 100644
--- a/CONTRIBUTING.md
+++ b/CONTRIBUTING.md
@@ -2,20 +2,41 @@
 
 Thank you for helping improve podcli. This project is AGPL-3.0 — contributions are welcome under the same license.
 
+## Architecture
+
+podcli ships as a native Go launcher (`cli/`) that provisions hermetic runtimes
+and routes commands to the Python backend (`backend/`) and the Node studio/MCP
+server (`src/`). End users install a prebuilt binary; the source trees below are
+for development.
+
 ## Development setup
 
+Backend (Python) + studio (TypeScript):
+
 ```bash
-./setup.sh
+./setup.sh        # venv + Python deps
 npm install
-npm run build
+npm run build     # tsc + studio bundle
+```
+
+Native launcher (Go 1.23+):
+
+```bash
+cd cli
+go generate ./... # sync backend + PodStack commands for go:embed
+go build -o podcli .
 ```
 
-Python backend lives in `backend/`. TypeScript MCP server and Web UI live in `src/`.
+`cli/internal/backend/files/` is generated by `go generate` (gitignored) — edit
+the canonical `backend/` instead.
 
 ## Project layout
 
-| Path                                    | Purpose                                                         |
+| Path                                    | Purpose                                                          |
 | --------------------------------------- | --------------------------------------------------------------- |
+| `cli/`                                  | Go launcher — provisioning, self-update, command routing         |
+| `backend/`                              | Python engine (transcription, rendering, clip generation)        |
+| `src/`                                  | TypeScript MCP server + Web UI                                   |
 | `.podcli/`                              | Config home (knowledge, presets, assets, settings) — gitignored |
 | `data/`                                 | Runtime data (cache, output, working) — gitignored              |
 | `backend/config/paths.py`               | Canonical path resolution (Python)                              |
@@ -25,10 +46,11 @@ Python backend lives in `backend/`. TypeScript MCP server and Web UI live in `sr
 
 ## Before you open a PR
 
-1. Run tests: `python3 -m unittest discover -s tests -v`
-2. Run TypeScript build: `npm run build`
-3. If you change paths, env vars, or cache layout, update `README.md` and add a note to the config migration logic.
-4. Keep diffs focused — one feature or fix per PR when possible.
+1. Python tests: `python3 -m pytest tests/ -q`
+2. TypeScript: `npm run build` and `npx vitest run`
+3. Go (if you touched `cli/`): `cd cli && go build ./... && go test ./... && gofmt -l .`
+4. If you change paths, env vars, or cache layout, update `README.md` and the config migration logic, and keep `backend/config/paths.py` and `src/config/paths.ts` aligned.
+5. Keep diffs focused — one feature or fix per PR when possible.
 
 ## Path and cache conventions
 
diff --git a/README.md b/README.md
index a325018..24fbfde 100644
--- a/README.md
+++ b/README.md
@@ -63,7 +63,7 @@ The first half is **video processing** — podcli's core engine. The second half
 ### 1. Drop in your episode
 
 ```bash
-./setup.sh --ui
+podcli            # then choose "Open Web UI"
 # → http://localhost:3847
 ```
 
@@ -347,8 +347,9 @@ podcli mcp install
 
 ```
 podcli/
-├── podcli                    # CLI entry point
-├── setup.sh                  # one-command install & launch
+├── cli/                      # Go launcher (native binary, provisioning, self-update)
+├── install.sh / install.ps1  # node-less installers
+├── setup.sh                  # dev environment setup (venv + npm)
 ├── package.json
 ├── CLAUDE.md                 # PodStack master config
 │