GPT Coding Stack — Self-hosted (DuckDNS + freemodel.dev edition)

LibreChat (chat UI) + code-server (VS Code in browser) + LiteLLM (GPT proxy) + Caddy (auto-HTTPS), behind free DuckDNS subdomains. GPT-5.5 is served through freemodel.dev's OpenAI-compatible endpoint at api.freemodel.dev.

You touch the terminal ONCE to bring it up. After that everything is in the browser.

⚠️ Cannot run alongside claude-stack simultaneously. Both stacks bind to ports 80 and 443. Either stop one before starting the other, or pick one as your daily driver. See the "Switching stacks" section near the bottom.

What you'll end up with

  • https://gptbyorm.duckdns.org — ChatGPT-style web UI, conversation memory, large file uploads
  • https://gptcodebyorm.duckdns.org — Full VS Code in the browser; install Continue.dev to use GPT inside the editor
  • LiteLLM in between, with automatic fallback GPT-5.5 → GPT-5.5-mini → GPT-4o-mini when one model rate-limits

1. Register DuckDNS subdomains

  1. Open https://www.duckdns.org and sign in with GitHub / Google / Twitter / Reddit.
  2. In the "sub domain" box, register two new subdomains (you've probably already used chatbyorm and codebyorm for the Claude stack — these are separate):
    • gptbyorm (or any prefix you like — just keep it consistent)
    • gptcodebyorm
  3. For each subdomain row, paste your VPS public IP into the current ip field and click update ip.

DuckDNS free tier allows up to 5 subdomains per account, so this still fits.

Verify from the VPS:

nslookup gptbyorm.duckdns.org
nslookup gptcodebyorm.duckdns.org

Both must resolve to your VPS public IP. If not, re-check the "current ip" field on DuckDNS.
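
The same check can be scripted so both names are compared against the expected address in one go. This is a minimal sketch, assuming `getent` is available on the VPS (`nslookup` or `dig` would serve equally well); `resolve_ipv4` and `check_dns` are illustrative names, not part of this stack.

```shell
# Resolve a hostname to its first IPv4 address via the system resolver.
# Assumption: getent is present (glibc-based distros); swap in dig/nslookup if not.
resolve_ipv4() {
  getent ahostsv4 "$1" | awk 'NR==1 {print $1}'
}

# Compare each hostname's resolved address with the expected VPS IP.
# Usage: check_dns EXPECTED_IP HOST [HOST...]
check_dns() {
  expected="$1"; shift
  rc=0
  for host in "$@"; do
    got="$(resolve_ipv4 "$host")"
    if [ "$got" = "$expected" ]; then
      echo "OK   $host -> $got"
    else
      echo "FAIL $host -> ${got:-no answer} (expected $expected)"
      rc=1
    fi
  done
  return "$rc"
}
```

Example call, with `203.0.113.7` standing in for your VPS public IP: `check_dns 203.0.113.7 gptbyorm.duckdns.org gptcodebyorm.duckdns.org`. A non-zero exit status means at least one name does not point at the VPS yet.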

Heads-up: Do NOT put DuckDNS records behind Cloudflare's proxy. Caddy needs to answer the Let's Encrypt HTTP-01 challenge directly. Use DuckDNS as-is.

2. Verify ports & firewall (same as claude-stack)

If you already opened 80 + 443 for claude-stack, you're done. Otherwise:

sudo ufw allow 22/tcp
sudo ufw allow 80/tcp
sudo ufw allow 443/tcp
sudo ufw enable

Also open 80 + 443 in your VPS provider's firewall (Oracle, AWS, Hetzner, etc.) if they have one.
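
To confirm the rules took effect, the output of `sudo ufw status` can be scanned for the three ports. The sketch below is one way to do that; `missing_ufw_rules` is an illustrative helper name, and it only inspects the host firewall, not your VPS provider's.

```shell
# Read `ufw status` output on stdin and print each expected rule that has no ALLOW entry.
missing_ufw_rules() {
  status="$(cat)"
  for rule in 22/tcp 80/tcp 443/tcp; do
    # awk exits 0 only if a line like "80/tcp  ALLOW  Anywhere" is present.
    printf '%s\n' "$status" \
      | awk -v r="$rule" '$1 == r && $2 == "ALLOW" { found = 1 } END { exit !found }' \
      || echo "$rule"
  done
}
```

Run it as `sudo ufw status | missing_ufw_rules`; no output means all three rules are present.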

3. Docker — already installed if you did claude-stack first

docker --version
docker compose version

If both print versions, skip ahead. If not:

curl -fsSL https://get.docker.com | sudo sh
sudo usermod -aG docker $USER
newgrp docker

4. Copy this folder to the VPS

From your local machine:

scp -r gpt-stack/ user@VPS_IP:~/
ssh user@VPS_IP
cd ~/gpt-stack

Or via WinSCP / FileZilla over SFTP. Don't forget hidden files (.env, .env.example) — in file managers, enable "Show Hidden Files" (Ctrl+H).

5. Configure secrets

.env is pre-filled with random secrets. You only need to replace three placeholders:

nano .env
| Variable | What to paste |
|---|---|
| OPENAI_API_KEY | Your freemodel.dev key (fe_oa_...). The SAME one you used for the Codex CLI docs. |
| CODE_SERVER_PASSWORD | Strong password for the VS Code browser login |
| CODE_SERVER_SUDO_PASSWORD | A different strong password for sudo inside code-server's terminal |

You can use the same freemodel key in this stack and in claude-stack, provided your dashboard issues one key for both product lines. The two stacks hit different endpoints (api. vs cc.) but draw on the same account.

Save (Ctrl+O → Enter → Ctrl+X).
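
After saving, a quick check can catch a value that was left empty or two passwords that ended up identical. This is a rough sketch that assumes plain `KEY=value` lines in .env; `env_value` and `check_secrets` are illustrative names, and the check cannot tell a leftover placeholder from a real value.

```shell
# Print the value of KEY from an env file (last occurrence wins).
env_value() {
  grep "^$2=" "$1" | tail -n 1 | cut -d= -f2-
}

# Report empty required values and identical code-server passwords.
check_secrets() {
  file="$1"
  rc=0
  for key in OPENAI_API_KEY CODE_SERVER_PASSWORD CODE_SERVER_SUDO_PASSWORD; do
    if [ -z "$(env_value "$file" "$key")" ]; then
      echo "empty: $key"
      rc=1
    fi
  done
  if [ "$(env_value "$file" CODE_SERVER_PASSWORD)" = "$(env_value "$file" CODE_SERVER_SUDO_PASSWORD)" ]; then
    echo "CODE_SERVER_PASSWORD and CODE_SERVER_SUDO_PASSWORD should differ"
    rc=1
  fi
  return "$rc"
}
```

Run `check_secrets .env` from the stack folder; silence and a zero exit status mean all three values are set and the two passwords differ.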

The Caddyfile is already wired for gptbyorm / gptcodebyorm. If you used different prefixes, search-and-replace in Caddyfile. Also confirm the email line points to a real inbox.

6. Sanity-check the freemodel key for GPT BEFORE launching

curl -i https://api.freemodel.dev/v1/chat/completions \
  -H "Authorization: Bearer YOUR_FREEMODEL_KEY" \
  -H "content-type: application/json" \
  -d '{"model":"gpt-5.5","messages":[{"role":"user","content":"say hi"}],"max_tokens":30}'
  • HTTP/2 200 + JSON with text → key works, model slug works, ready to launch.
  • 401 Unauthorized → key is invalid or expired. Re-generate in dashboard.
  • 404 Not Found on the model → freemodel uses a different slug (e.g. gpt-5, gpt-4o). Copy the right slug into litellm-config.yaml and the default: list in librechat.yaml.
  • 404 Not Found on the path → freemodel uses Responses API only. See the "Responses API mode" section at the bottom.
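
The outcomes listed above can be turned into a small triage helper. The sketch below is an illustration only: `classify_response` and `probe_chat` are made-up names, `FREEMODEL_KEY` is an assumed environment variable, and telling a model-level 404 from a path-level 404 by looking for the word "model" in the body is a heuristic, not something freemodel.dev documents.

```shell
# Map an HTTP status (and optionally the response body) to the next action.
classify_response() {
  status="$1"
  body="${2:-}"
  case "$status" in
    200) echo "ok: key and model slug work" ;;
    401) echo "auth: key invalid or expired" ;;
    404)
      # Heuristic (assumption): a model-level 404 mentions "model" in its error body.
      case "$body" in
        *[Mm]odel*) echo "slug: try a different model slug" ;;
        *) echo "path: endpoint missing, try Responses API mode" ;;
      esac ;;
    *) echo "other: inspect the response body" ;;
  esac
}

# Send the same request as the curl above and classify the result.
# Assumption: FREEMODEL_KEY holds your key.
probe_chat() {
  body_file="$(mktemp)"
  status="$(curl -s -o "$body_file" -w '%{http_code}' \
    https://api.freemodel.dev/v1/chat/completions \
    -H "Authorization: Bearer $FREEMODEL_KEY" \
    -H "content-type: application/json" \
    -d '{"model":"gpt-5.5","messages":[{"role":"user","content":"say hi"}],"max_tokens":30}')"
  classify_response "$status" "$(cat "$body_file")"
  rm -f "$body_file"
}
```

With the key exported (`export FREEMODEL_KEY=...`), calling `probe_chat` prints a one-line verdict instead of the raw headers.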

7. Launch

docker compose up -d
docker compose logs -f

First boot takes 2–5 minutes (image pulls + Caddy fetching Let's Encrypt certs). Watch for:

  • litellm Proxy initialized — LiteLLM is up
  • certificate obtained successfully × 2 — TLS works for both subdomains
  • LibreChat server listening on port 3080

Press Ctrl+C to stop tailing (containers keep running).

8. First login

Open https://gptbyorm.duckdns.org and register your account. After your account is created, lock down signup:

sed -i 's/ALLOW_REGISTRATION=true/ALLOW_REGISTRATION=false/' .env
# Recreate rather than restart so the container picks up the edited .env
docker compose up -d --no-deps --force-recreate librechat
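
The `sed` line only matches when .env contains exactly `ALLOW_REGISTRATION=true`. If the line is missing or holds another value, a slightly more defensive helper may be useful. This is a sketch under the assumption of simple `KEY=value` lines and GNU `sed`; `set_env_var` is an illustrative name.

```shell
# Set KEY=VALUE in an env file: replace the existing line if present, append otherwise.
# Limitation: VALUE must not contain "|", "&" or backslashes, which sed would interpret.
set_env_var() {
  file="$1"; key="$2"; value="$3"
  if grep -q "^${key}=" "$file"; then
    sed -i "s|^${key}=.*|${key}=${value}|" "$file"
  else
    printf '%s=%s\n' "$key" "$value" >> "$file"
  fi
}
```

For example, `set_env_var .env ALLOW_REGISTRATION false`, then reload the librechat container as in the step above.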

9. Wire GPT into VS Code (optional)

Inside the browser VS Code at https://gptcodebyorm.duckdns.org:

  1. Extensions (Ctrl+Shift+X) → search Continue by Continue.dev → Install
  2. Open Continue side panel → "Configure manually"
  3. Replace ~/.continue/config.json with:
{
  "models": [
    {
      "title": "GPT-5.5",
      "provider": "openai",
      "model": "gpt-5.5",
      "apiBase": "http://litellm:4000/v1",
      "apiKey": "<paste your LITELLM_MASTER_KEY from .env>"
    },
    {
      "title": "GPT-5.5 mini",
      "provider": "openai",
      "model": "gpt-5.5-mini",
      "apiBase": "http://litellm:4000/v1",
      "apiKey": "<same LITELLM_MASTER_KEY>"
    }
  ]
}

(litellm:4000 is reachable directly inside the Docker network — no Caddy/DuckDNS needed.)
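
Before pointing Continue at the proxy, it can help to confirm from the code-server terminal that LiteLLM answers and lists the model names used in the config. The sketch below assumes the proxy exposes the OpenAI-style `/v1/models` listing and that `LITELLM_MASTER_KEY` is exported in the shell; `list_model_ids` is an illustrative helper that extracts ids with `grep`, which is enough for a sanity check but is not a real JSON parser.

```shell
# Print the model ids found in an OpenAI-style /v1/models JSON response read from stdin.
list_model_ids() {
  grep -o '"id"[[:space:]]*:[[:space:]]*"[^"]*"' | sed 's/.*"\([^"]*\)"$/\1/'
}
```

Usage: `curl -s http://litellm:4000/v1/models -H "Authorization: Bearer $LITELLM_MASTER_KEY" | list_model_ids`. The ids printed should match the `"model"` values in the Continue config.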

Switching between claude-stack and gpt-stack

You can have BOTH folders on the VPS but only ONE running at a time (port 80/443 conflict).

# Switch from claude → gpt
cd ~/claude-stack && docker compose down
cd ~/gpt-stack && docker compose up -d

# Switch from gpt → claude
cd ~/gpt-stack && docker compose down
cd ~/claude-stack && docker compose up -d

Data is preserved across switches — MongoDB / code-server volumes live separately under each folder.
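
If the second stack fails to start after a switch, the usual cause is that something still holds port 80 or 443. A small filter over `ss -ltn` output can confirm that; this is a sketch, and `busy_web_ports` is an illustrative name.

```shell
# Read `ss -ltn` output on stdin and print which of ports 80 and 443 have a listener.
busy_web_ports() {
  awk '$1 == "LISTEN" {
         n = split($4, parts, ":")   # local address is column 4; the port is the last ":" field
         port = parts[n]
         if (port == 80 || port == 443) print port
       }' | sort -un
}
```

Run `ss -ltn | busy_web_ports` after `docker compose down`; empty output means both ports are free for the other stack.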

Common ops

# Update all images
docker compose pull && docker compose up -d

# Tail one service
docker compose logs -f librechat

# Restart after editing config
docker compose restart litellm

# Lock down signup once you've registered
sed -i 's/ALLOW_REGISTRATION=true/ALLOW_REGISTRATION=false/' .env
docker compose up -d --no-deps --force-recreate librechat

Responses API mode (if Chat Completions doesn't work)

freemodel's official Codex docs use wire_api = "responses", which is OpenAI's newer Responses API (/v1/responses instead of /v1/chat/completions). Most resellers expose BOTH endpoints, so Chat Completions usually works. But if step 6's curl gives 404 on the path, switch to Responses mode:

Edit litellm-config.yaml: in each model block, leave api_base ending in /v1 so that LiteLLM can call /v1/responses under it. LiteLLM is expected to switch over when the upstream rejects /chat/completions. If it does not, add this under each litellm_params:

      extra_headers:
        x-prefer-responses-api: "true"

Or as a last resort, set the env var OPENAI_API_TYPE=responses in .env and recreate the container with docker compose up -d --no-deps --force-recreate litellm, so that it picks up the new value.

If you get stuck, paste the LiteLLM error log — there's usually a clear hint in the upstream response body.

Notes / gotchas

  • freemodel.dev product lines: api.freemodel.dev = GPT/Codex side, cc.freemodel.dev = Claude side. Same vendor, different APIs, possibly different keys (check your dashboard).
  • GPT model slugs change — freemodel may version their model strings (gpt-5.5-turbo-20251001 etc.). Check their dashboard if a slug fails.
  • Prompt caching: if the upstream supports OpenAI-style automatic prompt caching, the conversation history re-sent on each turn of a long chat is billed at a reduced cached-input rate, so long chats cost less than their raw token counts suggest.
  • Big file uploads — LibreChat is set to 512 MB per file. Caddy passes that through without re-buffering. Don't sandwich Cloudflare in front (their free plan caps uploads at 100 MB).
  • Conversation memory — every chat is stored in MongoDB and re-sent on each turn. Docker volumes persist them across restarts.
  • DuckDNS account hygiene — log in to duckdns.org once every ~30 days to keep all subdomains active.

File reference

| File | Purpose |
|---|---|
| docker-compose.yml | Defines all 6 services (Caddy, LiteLLM, MongoDB, Meilisearch, LibreChat, code-server) |
| Caddyfile | Reverse-proxy + auto-HTTPS rules |
| litellm-config.yaml | GPT model routing through api.freemodel.dev |
| librechat.yaml | LibreChat → LiteLLM wiring + file-upload limits |
| .env | All secrets (THIS FILE CONTAINS YOUR API KEY, keep private) |
| .env.example | Template only, safe to commit / share |
