gander is a one-shot CLI that takes a local media file (image / video / audio)
and returns a transcript (when speech is present) plus a structured
description, by driving an agentic CLI backend (agy / claude / codex)
under the hood. It persists every result in SQLite keyed by content hash and prints a
single JSON envelope to stdout.
It runs no model itself: no weights, no API keys, no local inference, no networking of
its own. ffmpeg/ffprobe handle probing/segmentation/extraction;
agy/claude/codex/ffmpeg/ffprobe are external binaries discovered at runtime.
- Consumers: any agent harness that can run a shell command shells out to
  gander and reads the JSON. No service, no socket, no daemon.
- Backends: agy (full vision+audio+video, PTY), claude (vision-only floor),
  codex (codex exec --yolo, agy-class). Pluggable via one adapter trait.
To take a gander is to take a look — and that is the whole job: hand it a file, it takes a gander and tells you what it sees.
just release # → target/release/gander (single ~6 MB binary)
just test # 54 deterministic tests, no live backends

See Install to put gander on your PATH.
The binary statically bundles SQLite (rusqlite bundled), so there is no libsqlite
runtime dependency. A fully static musl build for Linux release artifacts is best done
on Linux CI or via cross (just musl).
To build — the Rust toolchain (and just for the
dev recipes, optional):
# Rust (https://rustup.rs)
curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh
# just (optional, for `just <recipe>`)
brew install just # macOS
cargo install just # any platform

To run, these external binaries must be on PATH (or pointed to via GANDER_*_BIN env
vars):
| Binary | Purpose | Install |
|---|---|---|
| ffmpeg, ffprobe | probe / segment / frame + audio extraction | brew install ffmpeg · apt install ffmpeg |
| agy and/or claude and/or codex | the analysis backend(s); at least one is required | each is its own CLI, installed + logged in separately |
gander shells out to whichever backend you select; each backend handles its own
login through its own CLI — gander never sees credentials. Everything else (SQLite) is
statically bundled into the binary, so there is nothing else to install at runtime.
Build the binary and copy it onto your PATH (macOS and Linux):
just release
# user-local (no sudo) — ensure ~/.local/bin is on your PATH:
install -m 755 target/release/gander ~/.local/bin/gander
# or system-wide (sudo, on PATH everywhere by default):
sudo install -m 755 target/release/gander /usr/local/bin/gander
gander --version # verify it's on PATH

sudo is only needed for the system-wide path because /usr/local/bin is root-owned;
the user-local option needs none.
gander --check # verify which backends are reachable
gander photo.jpg # describe → human-readable text
gander clip.mp4 --output-format json # describe → JSON envelope
gander recall # browse what you've already analyzed

Re-running on the same file is an instant $0 cache hit. See Usage for all
flags.
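
The cache hit follows from keying results by content hash rather than by path. The sketch below illustrates the idea in Python; it is not gander's implementation. It assumes SHA-256 over the file bytes (suggested by the envelope's content_sha256 field) and a hypothetical one-table schema named `results`. `fake_analyze` stands in for a backend call.

```python
import hashlib
import sqlite3
import tempfile
from pathlib import Path

def content_sha256(path, chunk_size=1 << 20):
    """Hash the file in chunks so large media never has to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for block in iter(lambda: fh.read(chunk_size), b""):
            digest.update(block)
    return digest.hexdigest()

def describe_cached(conn, path, analyze, force=False):
    """Return (envelope, cached). The cache key is the content hash, not the path."""
    key = content_sha256(path)
    if not force:
        row = conn.execute(
            "SELECT envelope FROM results WHERE sha256 = ?", (key,)
        ).fetchone()
        if row is not None:
            return row[0], True
    envelope = analyze(path)
    conn.execute(
        "INSERT OR REPLACE INTO results (sha256, envelope) VALUES (?, ?)",
        (key, envelope),
    )
    return envelope, False

# Hypothetical schema, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE results (sha256 TEXT PRIMARY KEY, envelope TEXT NOT NULL)")

backend_calls = []

def fake_analyze(path):
    """Stand-in for the expensive backend call."""
    backend_calls.append(path)
    return '{"status": "ok"}'

with tempfile.TemporaryDirectory() as tmp:
    original = Path(tmp) / "photo.jpg"
    renamed = Path(tmp) / "copy.jpg"
    original.write_bytes(b"same bytes")
    renamed.write_bytes(b"same bytes")
    first = describe_cached(conn, original, fake_analyze)
    second = describe_cached(conn, renamed, fake_analyze)  # same content, new name
    forced = describe_cached(conn, original, fake_analyze, force=True)
```

Because the key is derived from the bytes, a renamed or copied file still hits the cache, and only a forced run calls the backend again.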
just test # ~54 deterministic tests, ~1s (no backends); CI gates on this
just test-live # optional live smoke against a real agy backend

Notes: gander --check may report agy as down. This is a probe artifact: the no-media
health prompt does not elicit agy's sentinel block, and agy works fine in real describe runs.
Video CHUNKED runs are slow (one backend call per chunk plus a synthesis call), so a
60s clip can take a few minutes.
Run just with no arguments to list every dev recipe.
gander is built to be shelled out to. The contract an integrator codes against:
- Always pass --output-format json. stdout is then a single envelope object with
  every key always present; the picker and all diagnostics go to stderr, so stdout
  stays pure JSON.
- Branch on the exit code, then on status, never on stdout prose. 0 ok/partial,
  2 usage, 3 input (failed), 4 backend/auth, 1 unexpected; partial is 0, so read the
  JSON status/warnings to tell them apart. See Exit codes.
- Non-interactive by construction. With no TTY, gander never prompts and never
  writes config.toml; pass --backend/--model (or GANDER_* env) to pin behavior.
- Free to re-run. Same file ⇒ instant $0 cache hit (keyed by content hash);
  --force recomputes. Untrusted paths ⇒ constrain with --allowed-root.
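
An integrator might wrap the contract above as follows. This is a minimal Python sketch, not an official client: the function names `interpret` and `describe` and the shape of the returned dict are hypothetical. Only the command line, the exit codes, and the status field come from this document.

```python
import json
import subprocess

# Exit-code meanings as listed in the contract above.
EXIT_MEANING = {0: "ok/partial", 1: "unexpected", 2: "usage", 3: "input", 4: "backend/auth"}

def interpret(returncode, stdout):
    """Branch on the exit code first, then on the envelope's status field."""
    try:
        envelope = json.loads(stdout)
    except json.JSONDecodeError:
        envelope = None
    if returncode != 0:
        return {"outcome": EXIT_MEANING.get(returncode, "unexpected"), "envelope": envelope}
    if not isinstance(envelope, dict):
        # Exit 0 without a parseable envelope breaks the contract; treat it as unexpected.
        return {"outcome": "unexpected", "envelope": None}
    # Exit 0 covers both ok and partial; only the JSON tells them apart.
    return {"outcome": envelope.get("status", "unexpected"), "envelope": envelope}

def describe(source, *extra_args):
    """Run gander non-interactively. Arguments go as an argv list, never through a shell."""
    proc = subprocess.run(
        ["gander", source, "--output-format", "json", *extra_args],
        capture_output=True,
        text=True,
        check=False,  # a non-zero exit is data here, not an exception
    )
    return interpret(proc.returncode, proc.stdout)
```

Keeping `interpret` separate from the subprocess call lets the branching logic be unit-tested without a backend installed.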
Command surface:
gander SOURCE --output-format json # describe one local file → envelope
gander recall --query Q --output-format json # full-text search the cache (no model call)
gander recall --output-format json # browse/filter prior results (no model call)
gander cache clear [SOURCE] # forget all assets, or just one file
gander --check # which backends + ffmpeg are reachable

recall and cache never call a backend; they read the local SQLite cache, so they
cost nothing and return instantly. See Recall for the full filter set and
--query for FTS5 search syntax.
gander SOURCE [options] # describe one local file
gander recall [filters] # read-only cache browse (no model call)
gander config {path,show,clear} # inspect / reset persisted defaults
gander cache path # print the cache DB path
gander cache clear [SOURCE] # forget all assets, or just one file
gander --check # health-probe the backends + ffmpeg
gander --version

| Flag | Default | Meaning |
|---|---|---|
| SOURCE | — | local path to one image / video / audio file (no URLs) |
| --output-format {raw,json} | raw | json emits the canonical envelope |
| --model {pro,flash,sonnet,haiku,opus,gpt-5.5,gpt-5.4,gpt-5.4-mini} | pro | primary model |
| --backend {agy,claude,codex} | agy | primary backend |
| --fallback-model {…,none} | flash | model for the fallback attempt (same set + none) |
| --fallback-backend {…,none} | agy | backend for the fallback attempt |
| --no-transcript | off | skip speech transcription |
| --no-translate | off | no English translation block |
| --max-frames N | 12 | evenly-spaced frames per clip/chunk (clamped [1,64]) |
| --fps RATE | — | fixed-rate frame sampling, capped by --max-frames |
| --chunk-length S | 60 | segment length for the chunked tier |
| --max-chunks N | 8 | cap on chunks; over-limit ⇒ widen segment length |
| --max-duration S | unset | hard-reject videos longer than S |
| --force | off | ignore any cached row and recompute |
| --keep-temp | off | keep the per-call temp workdir (path on stderr) |
| --allowed-root DIR | — | restrict SOURCE to paths under DIR |
| --db PATH | ~/.gander/media.db | override the cache DB location |
| --timeout S | 300 | per-backend wall-clock seconds |
| --reconfigure | — | re-run the interactive first-run setup |
| --no-config | off | ignore the persisted config file for this run |
| --check | — | health-probe the backends + ffmpeg (ignores SOURCE) |
| -V, --version | — | print version and exit |
| -h, --help | — | print help |
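
The two frame-sampling flags interact: --max-frames is clamped to [1,64], and --fps is capped by it. The Python sketch below shows one way that could work. The centring of evenly spaced frames, and the choice to keep the earliest frames when the fps cap applies, are assumptions rather than documented behavior.

```python
def frame_timestamps(duration_s, max_frames=12, fps=None):
    """Pick sample times (in seconds) for one clip or chunk."""
    limit = max(1, min(64, max_frames))  # clamp to [1, 64]
    if duration_s <= 0:
        return [0.0]
    if fps is not None and fps > 0:
        # Fixed-rate sampling: one frame every 1/fps seconds, capped by the limit.
        count = min(limit, max(1, int(duration_s * fps)))
        return [i / fps for i in range(count)]
    # Evenly spaced: centre each frame in its own equal slice of the clip (assumption).
    step = duration_s / limit
    return [step * (i + 0.5) for i in range(limit)]
```

With the defaults, a 60-second clip yields twelve timestamps five seconds apart, starting at 2.5 s.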
Two attempts, each an explicit (backend, model) pair. The primary runs first; on
capacity(429)/timeout/transient/empty/unparseable it demotes to the fallback; an
auth signal aborts immediately (no fallback). --fallback-model none (or
--fallback-backend none) disables the second attempt. A backend/model mismatch
(e.g. --backend agy --model sonnet) is a usage error (exit 2).
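
The demotion rule can be pictured as a short loop over at most two attempts. The Python sketch below is illustrative only: the failure-kind strings and the `call` callback are hypothetical stand-ins for however gander classifies backend results internally. Treating any unlisted failure kind as an abort is also an assumption.

```python
# Failure kinds that demote to the fallback, per the paragraph above.
DEMOTE = {"capacity", "timeout", "transient", "empty", "unparseable"}

def plan_attempts(primary, fallback):
    """Each argument is a (backend, model) pair; "none" in either fallback slot disables it."""
    attempts = [primary]
    if "none" not in fallback:
        attempts.append(fallback)
    return attempts

def run_with_fallback(attempts, call):
    """call(backend, model) returns ("ok", result) or (failure_kind, None)."""
    log = []
    for backend, model in attempts:
        kind, result = call(backend, model)
        log.append((backend, model, kind))
        if kind == "ok":
            return result, log
        if kind not in DEMOTE:
            # An auth signal (or, by assumption, any unlisted failure) aborts: no fallback.
            break
    return None, log
```

The returned log plays the role of the envelope's backend.attempts list: one entry per attempt, in order.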
| Duration | Tier | What runs |
|---|---|---|
| < 30s | DIRECT | whole file to the backend |
| 30–60s | SINGLE-BATCH | ffmpeg frames+audio → one backend call |
| > 60s | CHUNKED | stream-copy segments → per-chunk call → deterministic merge + one Flash prose-synthesis call |
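
The tier table and the --chunk-length / --max-chunks flags combine into a small planning step. The Python sketch below assumes the 30–60s band is inclusive at both ends. It also assumes that "widen segment length" means choosing the smallest whole-second length that fits the cap. Neither detail is spelled out here.

```python
import math

def pick_tier(duration_s):
    """Map a duration in seconds to the tier names from the table above."""
    if duration_s < 30:
        return "DIRECT"
    if duration_s <= 60:
        return "SINGLE-BATCH"
    return "CHUNKED"

def plan_chunks(duration_s, chunk_length=60, max_chunks=8):
    """Return (segment_length_s, chunk_count) for the chunked tier."""
    count = math.ceil(duration_s / chunk_length)
    if count > max_chunks:
        # Over the cap: widen the segments instead of dropping content.
        chunk_length = math.ceil(duration_s / max_chunks)
        count = math.ceil(duration_s / chunk_length)
    return chunk_length, count
```

Under these assumptions a 10-minute video with the defaults is split into eight 75-second segments rather than ten 60-second ones.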
gander recall [--query Q] [--keyword K] [--text T] [--rating {keep,review,cull}]
[--language L] [--kind {image,video,audio}] [--min-people N]
[--min-duration S] [--has-transcript|--no-transcript]
[--has-audio|--no-audio] [--chunked] [--include-failed] [--all-versions]
[--order-by {updated_at,created_at,rating,people_count,duration_seconds}]
[--asc] [--limit N] [--db PATH] [--output-format {raw,json}]
gander config path | show | clear # ~/.gander/config.toml
gander cache path # print the cache DB path
gander cache clear [SOURCE] [--db PATH] # forget all assets, or just one file
--query is full-text search (SQLite FTS5, BM25-ranked, porter-stemmed) over
summary, description, transcript, English translation, keywords, and filename.
Results come back best-match first (unless --order-by is given), each with the
asset's source_path and a match_context snippet showing why it matched.
FTS5 query syntax passes through (steel OR crane, transcript:prueba, weld*);
strings that fail to parse as FTS5 are retried as plain quoted terms.
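
The retry behavior can be reproduced with Python's built-in sqlite3 module, provided the linked SQLite was compiled with FTS5. The sketch below illustrates the technique and does not reproduce gander's code. The table name `assets` and the two-column layout are hypothetical, and `quote_terms` is one plausible reading of "plain quoted terms".

```python
import sqlite3

def quote_terms(query):
    """Turn free text into plain quoted FTS5 terms (each token becomes a string literal)."""
    return " ".join('"' + tok.replace('"', '""') + '"' for tok in query.split())

def search(conn, query):
    """Try the query as FTS5 syntax; on a parse error, retry it as quoted terms."""
    sql = (
        "SELECT filename, bm25(assets) AS score FROM assets "
        "WHERE assets MATCH ? ORDER BY score"  # lower bm25 score = better match
    )
    try:
        return conn.execute(sql, (query,)).fetchall()
    except sqlite3.OperationalError:
        return conn.execute(sql, (quote_terms(query),)).fetchall()

conn = sqlite3.connect(":memory:")
# Hypothetical index: porter stemming, two of the searchable fields named above.
conn.execute("CREATE VIRTUAL TABLE assets USING fts5(filename, summary, tokenize='porter')")
conn.executemany(
    "INSERT INTO assets VALUES (?, ?)",
    [
        ("a.jpg", "a crane lifting a steel beam"),
        ("b.mp4", "workers welding near the gate"),
    ],
)

by_operator = search(conn, "steel OR crane")  # FTS5 syntax passes through
by_prefix = search(conn, "weld*")             # prefix match on the stemmed token
by_fallback = search(conn, 'steel "beam')     # unbalanced quote: retried as quoted terms
```

The third query is not valid FTS5 on its own; after quoting, it becomes an implicit AND of two plain terms and still finds the steel-beam asset.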
gander recall --query "steel beam" # ranked full-text search
gander recall --query worker --kind video --rating keep

The JSON envelope is one object; every key is always present (null/""/0/[] for N/A).
0 ok/partial · 2 usage error · 3 input error (failed) · 4 backend/auth
failure · 1 unexpected. partial exits 0 — read the JSON status/warnings.
Per-setting precedence: flag > GANDER_* env > ~/.gander/config.toml > built-in.
On the first interactive run (a real TTY), gander shows an arrow-key picker (↑/↓,
Enter) for the primary and fallback defaults — backend first, then a model scoped to
that backend (so the pair is always valid) — and saves them to
~/.gander/config.toml. A fallback backend of none skips the fallback model question.
The picker renders to stderr, so it never pollutes the stdout envelope. A non-interactive
run (the agent path) never prompts and never writes. --reconfigure re-runs the picker;
--no-config ignores the file for one run.
Env vars: GANDER_DB_PATH, GANDER_AGY_BIN / GANDER_CLAUDE_BIN / GANDER_CODEX_BIN,
GANDER_FFPROBE_BIN / GANDER_FFMPEG_BIN, GANDER_ALLOWED_ROOT,
GANDER_MODEL_DEFAULT / GANDER_BACKEND_DEFAULT /
GANDER_FALLBACK_MODEL_DEFAULT / GANDER_FALLBACK_BACKEND_DEFAULT,
GANDER_PRINT_TIMEOUT_S, GANDER_CHUNK_LEN_S / GANDER_MAX_CHUNKS,
GANDER_MAX_DURATION_S, GANDER_MAX_FRAMES / GANDER_FRAME_FPS.
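
The per-setting precedence (flag > GANDER_* env > config file > built-in) amounts to a first-match lookup. In the Python sketch below, the config key "model" and the treatment of an empty env var as unset are assumptions. Only the env var name and the ordering come from this document.

```python
import os

def resolve(flag_value, env_name, config, config_key, builtin, environ=None):
    """Return the first value set, in the order flag > env > config file > built-in."""
    environ = os.environ if environ is None else environ
    if flag_value is not None:
        return flag_value
    if environ.get(env_name):  # assumption: an empty variable counts as unset
        return environ[env_name]
    if config.get(config_key) is not None:
        return config[config_key]
    return builtin

# Example: no flag given, env var set, config file also set.
model = resolve(
    None,
    "GANDER_MODEL_DEFAULT",
    {"model": "sonnet"},
    "model",
    "pro",
    environ={"GANDER_MODEL_DEFAULT": "flash"},
)
```

In this example the env var wins over the config file, so `model` is "flash". Passing an explicit flag value would override both.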
SOURCE is treated as untrusted: symlinks/.. are resolved first; URLs, NUL bytes,
non-regular files, and paths outside --allowed-root are rejected. Backend args are
passed as argv vectors (no shell). Frames and standalone images are EXIF/GPS-stripped
(-map_metadata -1). The source file is never mutated. The cache DB is 0600.
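
The SOURCE checks can be sketched in Python as below. This is an approximation of the rules listed above, not gander's Rust code. The "://" test is a crude stand-in for URL detection, and `InputError`, `validate_source`, and `is_rejected` are hypothetical names.

```python
import tempfile
from pathlib import Path

class InputError(ValueError):
    """Raised for any SOURCE the checks reject."""

def validate_source(source, allowed_root=None):
    """Return the resolved path, or raise InputError."""
    if "\x00" in source:
        raise InputError("NUL byte in path")
    if "://" in source:
        raise InputError("URLs are not accepted")
    try:
        # resolve() follows symlinks and collapses ".." before any other check runs.
        resolved = Path(source).resolve(strict=True)
    except (OSError, RuntimeError) as exc:
        raise InputError(f"cannot resolve: {exc}") from exc
    if not resolved.is_file():
        raise InputError("not a regular file")
    if allowed_root is not None:
        root = Path(allowed_root).resolve()
        if root not in resolved.parents:
            raise InputError("outside the allowed root")
    return resolved

def is_rejected(source, allowed_root=None):
    """Convenience wrapper: True when validate_source refuses the path."""
    try:
        validate_source(source, allowed_root)
    except InputError:
        return True
    return False

with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp) / "media"
    root.mkdir()
    (root / "photo.jpg").write_bytes(b"x")
    (Path(tmp) / "secret.txt").write_bytes(b"y")
    checks = {
        "inside": is_rejected(str(root / "photo.jpg"), str(root)),
        "dotdot": is_rejected(str(root / ".." / "secret.txt"), str(root)),
        "directory": is_rejected(str(root), str(root)),
        "url": is_rejected("https://example.com/a.jpg"),
        "nul": is_rejected("a\x00b.jpg"),
    }
```

Resolving before comparing against the root is what defeats the `..` case: the comparison sees the real location, not the spelling the caller supplied.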
Footgun: the backends run with
--dangerously-skip-permissions (agy), bypassPermissions (claude), and --yolo=danger-full-access (codex). These are on by design so the one-shot call is non-interactive; use --allowed-root and run against trusted inputs.
{
  "status": "ok", // "ok" | "partial" | "failed"
  "error": null,
  "warnings": [],
  "parse_ok": true,
  "media_kind": "image",
  "content_sha256": "…",
  "cached": false,
  "summary": "…",
  "description": "**Scene:** …",
  "transcript": null,
  "language": null,
  "english_translation": null,
  "structured": { "rating": "keep", "people_count": 0, "keywords": […], … },
  "media": { "duration": null, "wxh": "2048x2048", "has_audio": false, … },
  "backend": { "model_used": "…", "backend_used": "agy", "attempts": [ … ] },
  "schema_version": "2026-06-08.1",
  "tool_version": "0.1.0"
}