hopper

Rent a GPU, run vLLM on it, and stop paying once the daily budget is gone. Works against Vast.ai and RunPod, with Tailscale handling the network so the pod is reachable by hostname from anywhere.

| What you get | How |
| --- | --- |
| Vendors | Vast.ai, RunPod (one CLI, identical command surface) |
| Inference server | vLLM, OpenAI-compatible at `:8000`; any client that speaks OpenAI works |
| Network | Tailscale hostname (e.g. `vllm-qwen27b-vast:8000`); no port-forward, no SSH tunnel per session |
| Cost control | Hard `DAILY_BUDGET` ($/day) and optional `DAILY_HOURS` cap, with a grace countdown before terminate |
| Reproducibility | Per-model + per-GPU profiles in `src/profiles/*.env` |
| Adding a vendor | One file: `src/vendors/<name>.sh` implementing `create`/`ssh`/`terminate`/`status`/`logs` |

New here? There's a course that walks you from zero to a personal coding agent running on a rented GPU, reachable from your laptop and your phone. Start at docs/course/README.md.

Quickstart

```bash
# 1. One-time: secrets + caps in src/.env (gitignored).
cp src/.env.example src/.env
# edit src/.env: VAST_API_KEY (and/or RUNPOD_API_KEY), TAILSCALE_AUTHKEY,
#                DAILY_BUDGET, DAILY_HOURS

# 2. Pick a profile and a vendor.
PROFILE=qwen3.6-27b-4b-vast ./src/gpu.sh vast search   # find an offer (or set VAST_OFFER_ID in the profile)

# 3. Create the pod (provisions, installs vLLM, brings up Tailscale).
PROFILE=qwen3.6-27b-4b-vast ./src/gpu.sh vast create

# 4. In another terminal, watch the budget.
PROFILE=qwen3.6-27b-4b-vast ./src/gpu.sh vast supervise

# 5. Use it (Tailscale handles routing; no port-forward).
curl http://vllm-qwen27b-vast:8000/v1/chat/completions \
  -H 'Content-Type: application/json' \
  -d '{"model":"cyankiwi/Qwen3.6-27B-AWQ-INT4","messages":[{"role":"user","content":"hi"}]}'

# 6. Tear it down.
PROFILE=qwen3.6-27b-4b-vast ./src/gpu.sh vast terminate
```
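
Between steps 3 and 5 the model still has to download and load, so the first request can fail if it is sent too early. A minimal sketch of a readiness poll follows. The helper name `wait_for_vllm` and its defaults are hypothetical and not part of hopper; the only thing it relies on is that vLLM's OpenAI-compatible server answers `GET /v1/models` once it is up.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of hopper): poll a URL until it answers
# with a success status, or give up after a deadline.
# Args: url [timeout_secs] [poll_secs]
wait_for_vllm() {
  local url="$1" timeout_secs="${2:-600}" poll_secs="${3:-5}"
  local deadline=$(( $(date +%s) + timeout_secs ))
  while [ "$(date +%s)" -lt "$deadline" ]; do
    # -f turns HTTP errors into a non-zero exit; --max-time bounds each probe.
    if curl -fsS --max-time 5 "$url" >/dev/null 2>&1; then
      return 0
    fi
    sleep "$poll_secs"
  done
  return 1
}

# Example, using the hostname from the quickstart:
#   wait_for_vllm http://vllm-qwen27b-vast:8000/v1/models && echo ready
```

The function returns non-zero on timeout, so it can gate the `curl` in step 5 with `&&`.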

Layout

| Path | Purpose |
| --- | --- |
| `src/gpu.sh` | Vendor dispatcher. `gpu.sh <vendor> <command>` → `<vendor>_<command>`. |
| `src/lib/gpu-common.sh` | Shared: env loading, spend log, supervise loop, tunnel mgmt. |
| `src/vendors/runpod.sh` | RunPod implementation. |
| `src/vendors/vast.sh` | Vast.ai implementation. |
| `src/profiles/*.env` | Per-model + per-GPU configs. Loaded via `PROFILE=name`. |
| `src/setup-vllm.sh` | Runs on the pod; installs vLLM, joins Tailscale, serves the model. |
| `src/remote-session.sh` | Mac-side tmux helper for mobile sessions over Tailscale. |
| `docs/course/` | End-to-end course: zero → personal coding agent on a rented GPU. |
| `docs/` | Course, cost analysis, GPU runbook, and archived spike notes. |
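
Profiles are plain `.env` files selected with `PROFILE=name`. One common way to load such a file in shell is to source it with auto-export turned on. The sketch below illustrates that pattern only; the real loader in `src/lib/gpu-common.sh` may work differently. The function name `load_profile` and the `MODEL` key are assumptions, while `VAST_OFFER_ID` is the key mentioned in the quickstart.

```bash
#!/usr/bin/env bash
# Hypothetical loader (not hopper's actual code): source <dir>/<name>.env
# and export everything it assigns.
load_profile() {
  local profile_dir="$1" name="$2"
  local file="$profile_dir/$name.env"
  if [ ! -f "$file" ]; then
    echo "no such profile: $file" >&2
    return 1
  fi
  set -a        # export every variable assigned while sourcing
  . "$file"
  set +a
}

# Demo with a throwaway profile. MODEL is a made-up key for illustration.
tmp="$(mktemp -d)"
printf 'MODEL=cyankiwi/Qwen3.6-27B-AWQ-INT4\nVAST_OFFER_ID=\n' > "$tmp/demo.env"
load_profile "$tmp" demo && echo "MODEL=$MODEL"
```

Sourcing executes the file as shell, so this pattern only suits profiles you wrote yourself.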

Commands

```bash
gpu.sh <vendor> create         # provision pod, install vLLM, bring up Tailscale
gpu.sh <vendor> ssh            # interactive shell on pod
gpu.sh <vendor> tunnel         # SSH tunnel localhost:8000 -> pod:8000 (fallback if Tailscale absent)
gpu.sh <vendor> status         # pod state + tunnel + vLLM health
gpu.sh <vendor> logs           # tail vLLM log on the pod
gpu.sh <vendor> spend          # today's accumulated spend + hours
gpu.sh <vendor> supervise      # cost/hours watchdog, terminates on cap with grace countdown
gpu.sh <vendor> terminate      # destroy the pod (network volume survives if used)
gpu.sh vast search             # list affordable Vast offers (works around broken CLI filter)
```

Vendors: `runpod`, `vast`. The default vendor is set by `DEFAULT_GPU_VENDOR` (fallback: `vast`).
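
The dispatch rule `gpu.sh <vendor> <command>` → `<vendor>_<command>` can be sketched in a few lines of shell. This is an illustration of the naming convention, not the contents of `src/gpu.sh`. The `dispatch` function and the `demo` vendor are hypothetical, and how the real script applies `DEFAULT_GPU_VENDOR` is not stated in this README; here it is assumed to kick in when the vendor argument is empty.

```bash
#!/usr/bin/env bash
# Stand-in vendor function. In the repo these live in src/vendors/<name>.sh.
demo_status() { echo "demo: status ok"; }

# Hypothetical dispatcher: build the function name and call it if it exists.
dispatch() {
  # Assumption: an empty vendor falls back to DEFAULT_GPU_VENDOR, then vast.
  local vendor="${1:-${DEFAULT_GPU_VENDOR:-vast}}"
  local command="${2:-}"
  local fn="${vendor}_${command}"
  # command -v also matches external programs of the same name, which is
  # acceptable for a sketch; bash-only code could use `declare -F` instead.
  if [ -z "$command" ] || ! command -v "$fn" >/dev/null 2>&1; then
    echo "usage: gpu.sh <vendor> <command> (nothing named '$fn')" >&2
    return 2
  fi
  shift 2
  "$fn" "$@"
}

dispatch demo status
```

Because lookup is by name, a new vendor file only has to define functions with the right prefix for every command to work.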

Supervisor

The supervisor is a cost/hours watchdog. When the budget or hours cap is hit, it notifies you, counts down `SUPERVISOR_GRACE_SECS` (default 60s) so you can finish up or press Ctrl+C to abort, then terminates the pod. It does not restart anything.

```bash
DAILY_BUDGET=5              # $/day before terminate
DAILY_HOURS=5               # optional uptime cap; empty = no hours cap
SUPERVISOR_POLL_SECS=30     # check interval
SUPERVISOR_GRACE_SECS=60    # countdown before terminate
```
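
To make the cap logic concrete, here is a rough sketch of the two pieces a watchdog like this needs: a cap check and a grace countdown. The helper names `cap_reached` and `grace_countdown` are hypothetical, and treating "at or above the cap" as the trigger is an assumption; the actual `supervise_loop` in `src/lib/gpu-common.sh` may differ.

```bash
#!/usr/bin/env bash
# Hypothetical helper: exit status 0 when either cap has been reached.
# Args: spend_usd budget_usd hours_up hours_cap ("" = no hours cap)
cap_reached() {
  # awk handles the decimal comparison that plain shell arithmetic cannot.
  awk -v s="$1" -v b="$2" -v h="$3" -v c="$4" \
    'BEGIN { exit !((s + 0 >= b + 0) || (c != "" && h + 0 >= c + 0)) }'
}

# Hypothetical helper: print a countdown, one line per second.
# Ctrl+C during the countdown stops the script before anything is destroyed.
grace_countdown() {
  local secs="$1"
  while [ "$secs" -gt 0 ]; do
    printf 'terminating in %ss (Ctrl+C to abort)\n' "$secs"
    sleep 1
    secs=$((secs - 1))
  done
}

# Demo: $5.10 spent against a $5 budget, no hours cap set.
cap_reached 5.10 5 2 "" && echo "cap reached"
```

A supervisor loop would call `cap_reached` every `SUPERVISOR_POLL_SECS`, then run `grace_countdown "$SUPERVISOR_GRACE_SECS"` before invoking the vendor's terminate command.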

Adding a vendor

1. Drop `src/vendors/<name>.sh`.
2. Implement `<name>_create`, `<name>_terminate`, `<name>_ssh`, `<name>_status`, `<name>_logs`. `<name>_tunnel` is optional.
3. `<name>_supervise` and `<name>_spend` can be one-liners: delegate to `supervise_loop` and `compute_spend_today` in `src/lib/gpu-common.sh`.
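
As a starting point, a vendor file for a made-up vendor called `example` might look roughly like the skeleton below. The function names follow the convention above; everything else is an assumption. In particular, the arguments that `supervise_loop` and `compute_spend_today` expect are not documented here, so passing the vendor name is a guess, and the stubs exist only so the sketch runs outside the repo.

```bash
#!/usr/bin/env bash
# src/vendors/example.sh -- skeleton for a hypothetical vendor "example".

# Stubs for standalone use only. In the repo these come from
# src/lib/gpu-common.sh and their real signatures may differ.
command -v supervise_loop >/dev/null 2>&1 ||
  supervise_loop() { echo "supervise_loop: $*"; }
command -v compute_spend_today >/dev/null 2>&1 ||
  compute_spend_today() { echo "compute_spend_today: $*"; }

# Placeholder until each command is wired to the vendor's API.
_example_todo() { echo "example: $1 not implemented" >&2; return 1; }

# Required commands.
example_create()    { _example_todo create; }
example_terminate() { _example_todo terminate; }
example_ssh()       { _example_todo ssh; }
example_status()    { _example_todo status; }
example_logs()      { _example_todo logs; }

# One-liners that delegate to the shared library (argument shape assumed).
example_supervise() { supervise_loop example "$@"; }
example_spend()     { compute_spend_today example "$@"; }
```

Check `src/vendors/vast.sh` or `src/vendors/runpod.sh` for the real calling convention before copying this.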

Setup guides

Course lesson 05 covers client setup: Pi, configured in `~/.pi/agent/models.json` and pointed at the vLLM endpoint, plus Termius (phone → Tailscale → Mac → tmux) and Cline (VS Code → vLLM) configured against the same hostname. The full course (`docs/course/README.md`) walks through provisioning end to end.

Known gotchas

See the `gotchas:` key in `spec/project.yaml` for the full list, and `docs/gpu-runbook.md` for vendor-specific war stories.

References

Cost analysis, "$100/month with 27B": how the GPU and model choices shake out on a real budget.
