ugudlado/hopper

hopper

Rent a GPU, run vLLM on it, and stop paying once the daily budget is gone. Works against Vast.ai and RunPod, with Tailscale handling the network so the pod is reachable by hostname from anywhere.

| What you get | How |
| --- | --- |
| Vendors | Vast.ai, RunPod (one CLI, identical command surface) |
| Inference server | vLLM, OpenAI-compatible at :8000; any client that speaks the OpenAI API works |
| Network | Tailscale hostname (e.g. vllm-qwen27b-vast:8000); no port-forward, no SSH tunnel per session |
| Cost control | Hard DAILY_BUDGET ($/day) and optional DAILY_HOURS cap, with a grace countdown before terminate |
| Reproducibility | Per-model + per-GPU profiles in src/profiles/*.env |
| Adding a vendor | One file: src/vendors/<name>.sh implementing create/ssh/terminate/status/logs |

New here? There's a course that walks you from zero to a personal coding agent running on a rented GPU, reachable from your laptop and your phone. Start at docs/course/README.md.

Quickstart

# 1. One-time: secrets + caps in src/.env (gitignored).
cp src/.env.example src/.env
# edit src/.env: VAST_API_KEY (and/or RUNPOD_API_KEY), TAILSCALE_AUTHKEY,
#                DAILY_BUDGET, DAILY_HOURS

# 2. Pick a profile and a vendor.
PROFILE=qwen3.6-27b-4b-vast ./src/gpu.sh vast search   # find an offer (or set VAST_OFFER_ID in the profile)

# 3. Create the pod (provisions, installs vLLM, brings up Tailscale).
PROFILE=qwen3.6-27b-4b-vast ./src/gpu.sh vast create

# 4. In another terminal, watch the budget.
PROFILE=qwen3.6-27b-4b-vast ./src/gpu.sh vast supervise

# 5. Use it (Tailscale handles routing — no port-forward).
curl http://vllm-qwen27b-vast:8000/v1/chat/completions \
  -H 'Content-Type: application/json' \
  -d '{"model":"cyankiwi/Qwen3.6-27B-AWQ-INT4","messages":[{"role":"user","content":"hi"}]}'

# 6. Tear it down.
PROFILE=qwen3.6-27b-4b-vast ./src/gpu.sh vast terminate

Layout

| Path | Purpose |
| --- | --- |
| src/gpu.sh | Vendor dispatcher: gpu.sh <vendor> <command> → <vendor>_<command>. |
| src/lib/gpu-common.sh | Shared: env loading, spend log, supervise loop, tunnel management. |
| src/vendors/runpod.sh | RunPod implementation. |
| src/vendors/vast.sh | Vast.ai implementation. |
| src/profiles/*.env | Per-model + per-GPU configs. Loaded via PROFILE=name. |
| src/setup-vllm.sh | Runs on the pod; installs vLLM, joins Tailscale, serves the model. |
| src/remote-session.sh | Mac-side tmux helper for mobile sessions over Tailscale. |
| docs/course/ | End-to-end course: zero → personal coding agent on a rented GPU. |
| docs/ | Course, cost analysis, GPU runbook, and archived spike notes. |
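
As a rough illustration of what "Loaded via PROFILE=name" could look like, here is a minimal sketch. The function name load_profile and the PROFILES_DIR override are hypothetical and not taken from the repo; the real loading logic lives in src/lib/gpu-common.sh and may differ.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not from the repo): load <dir>/<name>.env and export
# every variable it assigns, so child processes can see the profile settings.
load_profile() {
  local name="${1:-}"
  local dir="${PROFILES_DIR:-src/profiles}"   # PROFILES_DIR is an assumed override
  local file="${dir}/${name}.env"

  if [ -z "$name" ]; then
    echo "usage: load_profile <profile-name>" >&2
    return 2
  fi
  if [ ! -f "$file" ]; then
    echo "profile not found: $file" >&2
    return 1
  fi

  set -a          # auto-export everything assigned while sourcing
  . "$file"
  set +a
}
```

With a helper like this, `load_profile "$PROFILE"` would make the variables of the chosen profile (for example VAST_OFFER_ID) available to the rest of the script.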

Commands

gpu.sh <vendor> create         # provision pod, install vLLM, bring up Tailscale
gpu.sh <vendor> ssh            # interactive shell on pod
gpu.sh <vendor> tunnel         # SSH tunnel localhost:8000 -> pod:8000 (fallback if Tailscale absent)
gpu.sh <vendor> status         # pod state + tunnel + vLLM health
gpu.sh <vendor> logs           # tail vLLM log on the pod
gpu.sh <vendor> spend          # today's accumulated spend + hours
gpu.sh <vendor> supervise      # cost/hours watchdog, terminates on cap with grace countdown
gpu.sh <vendor> terminate      # destroy the pod (network volume survives if used)
gpu.sh vast search             # list affordable Vast offers (works around broken CLI filter)

Vendors: runpod, vast. The default vendor is set by DEFAULT_GPU_VENDOR (fallback: vast).
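
The mapping from `gpu.sh <vendor> <command>` to a `<vendor>_<command>` function can be sketched roughly as follows. This is an illustration, not the actual src/gpu.sh: the dispatch function, the stand-in status functions, and the rule that an empty vendor argument falls back to DEFAULT_GPU_VENDOR are all assumptions.

```shell
#!/usr/bin/env bash
# Stand-ins for functions that the vendor files would define.
vast_status()   { echo "vast: status"; }
runpod_status() { echo "runpod: status"; }

# Hypothetical dispatcher: "<vendor> <command>" -> "<vendor>_<command>".
dispatch() {
  # Assumption: an empty vendor means "use DEFAULT_GPU_VENDOR, else vast".
  local vendor="${1:-${DEFAULT_GPU_VENDOR:-vast}}"
  local cmd="${2:-}"

  if [ -z "$cmd" ]; then
    echo "usage: gpu.sh <vendor> <command>" >&2
    return 2
  fi

  local fn="${vendor}_${cmd}"
  # command -v succeeds for any resolvable name, including shell functions.
  if ! command -v "$fn" >/dev/null 2>&1; then
    echo "unknown vendor/command: ${vendor} ${cmd}" >&2
    return 2
  fi

  shift 2
  "$fn" "$@"   # pass any remaining arguments through unchanged
}
```

Under these assumptions, `dispatch runpod status` calls runpod_status, and `dispatch "" status` calls vast_status when DEFAULT_GPU_VENDOR is unset.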

Supervisor

The supervisor is a cost/hours watchdog. When the pod reaches the daily budget or the hours cap, it notifies you, counts down SUPERVISOR_GRACE_SECS (default 60s) so you can finish up or press Ctrl+C to abort, then terminates the pod. It does not restart anything.

DAILY_BUDGET=5              # $/day before terminate
DAILY_HOURS=5               # optional uptime cap; empty = no hours cap
SUPERVISOR_POLL_SECS=30     # check interval
SUPERVISOR_GRACE_SECS=60    # countdown before terminate
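
To make the cap logic concrete, here is a minimal sketch of the two checks and the grace countdown. The function names cap_reached and grace_countdown are hypothetical, and treating "reached" as greater-than-or-equal is an assumption; the real loop is supervise_loop in src/lib/gpu-common.sh.

```shell
#!/usr/bin/env bash
DAILY_BUDGET=5
DAILY_HOURS=5
SUPERVISOR_GRACE_SECS=60

# Hypothetical helper: prints which cap was reached and returns 0,
# or returns 1 when the pod is still under both caps.
# Arguments: spend so far today (dollars), uptime so far today (hours).
cap_reached() {
  local spend="$1" hours="$2"
  # awk does the decimal comparison; shell arithmetic is integer-only.
  if awk -v s="$spend" -v b="$DAILY_BUDGET" 'BEGIN { exit !(s >= b) }'; then
    echo "budget"
    return 0
  fi
  # An empty DAILY_HOURS means there is no hours cap.
  if [ -n "${DAILY_HOURS:-}" ] &&
     awk -v h="$hours" -v c="$DAILY_HOURS" 'BEGIN { exit !(h >= c) }'; then
    echo "hours"
    return 0
  fi
  return 1
}

# Hypothetical helper: one line per second until the grace period runs out.
# Pressing Ctrl+C during the countdown interrupts the script before terminate.
grace_countdown() {
  local secs="${SUPERVISOR_GRACE_SECS:-60}"
  while [ "$secs" -gt 0 ]; do
    echo "terminating in ${secs}s (Ctrl+C to abort)"
    sleep 1
    secs=$((secs - 1))
  done
}
```

A supervisor built on these pieces would poll every SUPERVISOR_POLL_SECS, call cap_reached with the current spend and uptime, and on success run grace_countdown followed by the vendor's terminate command.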

Adding a vendor

  1. Drop src/vendors/<name>.sh.
  2. Implement <name>_create, <name>_terminate, <name>_ssh, <name>_status, <name>_logs. <name>_tunnel is optional.
  3. <name>_supervise and <name>_spend can be one-liners — delegate to supervise_loop and compute_spend_today in src/lib/gpu-common.sh.
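
The steps above could translate into a skeleton like the one below for a made-up vendor called "example". The function bodies are placeholders, and passing "$@" to supervise_loop and compute_spend_today is a guess, since this README does not document their arguments.

```shell
#!/usr/bin/env bash
# src/vendors/example.sh: hypothetical skeleton for a vendor named "example".
# A real vendor would replace each placeholder with calls to its API or CLI.

example_create()    { echo "example: create not implemented" >&2; return 1; }
example_terminate() { echo "example: terminate not implemented" >&2; return 1; }
example_ssh()       { echo "example: ssh not implemented" >&2; return 1; }
example_status()    { echo "example: status not implemented" >&2; return 1; }
example_logs()      { echo "example: logs not implemented" >&2; return 1; }

# Optional: example_tunnel is only needed as a fallback when Tailscale is absent.

# One-liners that delegate to the shared helpers in src/lib/gpu-common.sh.
# Assumption: the helpers accept whatever arguments the command was given.
example_supervise() { supervise_loop "$@"; }
example_spend()     { compute_spend_today "$@"; }
```

Keeping supervise and spend as thin wrappers means the budget logic stays in one place and every vendor gets the same cap behavior.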

Setup guides

  • Course lesson 05: configures Pi (via ~/.pi/agent/models.json), Termius (phone → Tailscale → Mac → tmux), and Cline (VS Code → vLLM) against the same vLLM hostname. The full course (docs/course/README.md) walks through provisioning end to end.

Known gotchas

See the gotchas: key in spec/project.yaml for the full list, and docs/gpu-runbook.md for vendor-specific war stories.
