LibreChat (chat UI) + code-server (VS Code in browser) + LiteLLM (GPT proxy) + Caddy (auto-HTTPS), behind free DuckDNS subdomains. GPT-5.5 is served via freemodel.dev's OpenAI-compatible endpoint at `api.freemodel.dev`.
You touch the terminal ONCE to bring it up. After that everything is in the browser.
⚠️ Cannot run alongside `claude-stack` at the same time. Both stacks bind to ports 80 and 443. Either stop one before starting the other, or pick one as your daily driver. See the "Switching stacks" section near the bottom.

- https://gptbyorm.duckdns.org : ChatGPT-style web UI, conversation memory, large file uploads
- https://gptcodebyorm.duckdns.org : full VS Code in the browser; install Continue.dev to use GPT inside the editor
- LiteLLM in between, with automatic fallback GPT-5.5 → GPT-5.5-mini → GPT-4o-mini when one model rate-limits
- Open https://www.duckdns.org and sign in with GitHub / Google / Twitter / Reddit.
- In the "sub domain" box, register two new subdomains (you've probably already used `chatbyorm` and `codebyorm` for the Claude stack; these are separate):
  - `gptbyorm` (or any prefix you like, just keep it consistent)
  - `gptcodebyorm`
- For each subdomain row, paste your VPS public IP into the current ip field and click update ip.
DuckDNS free tier allows up to 5 subdomains per account, so this still fits.
Verify from the VPS:
nslookup gptbyorm.duckdns.org
nslookup gptcodebyorm.duckdns.org

Both must resolve to your VPS public IP. If not, re-check the "current ip" field on DuckDNS.
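The check above can be scripted so that both hostnames are compared against the address you expect. This is only a minimal sketch: `resolve_ipv4` and `check_dns` are hypothetical helper names, `203.0.113.10` stands in for your VPS IP, and it assumes a glibc-based Linux where `getent ahostsv4` is available.

```shell
#!/bin/sh
# Hypothetical helpers for illustration; they are not part of the stack.

# Print the first IPv4 address the system resolver returns for a hostname.
resolve_ipv4() {
  getent ahostsv4 "$1" | awk 'NR == 1 { print $1 }'
}

# Compare a hostname's resolved address with the IP you expect.
# Returns 0 on a match, 1 otherwise.
check_dns() {
  host="$1"
  expected="$2"
  actual="$(resolve_ipv4 "$host")"
  if [ "$actual" = "$expected" ]; then
    echo "OK   $host -> $actual"
  else
    echo "FAIL $host -> ${actual:-no answer} (expected $expected)"
    return 1
  fi
}
```

Typical use would be `check_dns gptbyorm.duckdns.org 203.0.113.10 && check_dns gptcodebyorm.duckdns.org 203.0.113.10`, with the placeholder replaced by your real IP.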
Heads-up: Do NOT put DuckDNS records behind Cloudflare's proxy. Caddy needs to answer the Let's Encrypt ACME challenge (HTTP-01 or TLS-ALPN-01) directly. Use DuckDNS as-is.
If you already opened 80 + 443 for claude-stack, you're done. Otherwise:
sudo ufw allow 22/tcp
sudo ufw allow 80/tcp
sudo ufw allow 443/tcp
sudo ufw enable

Also open 80 + 443 in your VPS provider's firewall (Oracle, AWS, Hetzner, etc.) if they have one.
docker --version
docker compose version

If both print versions, skip ahead. If not:
curl -fsSL https://get.docker.com | sudo sh
sudo usermod -aG docker $USER
newgrp docker

From your local machine:
scp -r gpt-stack/ user@VPS_IP:~/
ssh user@VPS_IP
cd ~/gpt-stack

Or via WinSCP / FileZilla over SFTP. Don't forget hidden files (`.env`, `.env.example`): in file managers, enable "Show Hidden Files" (Ctrl+H in many Linux file managers; WinSCP and FileZilla have their own menu option).
.env is pre-filled with random secrets. You only need to replace three
placeholders:
nano .env

| Variable | What to paste |
|---|---|
| `OPENAI_API_KEY` | Your freemodel.dev key (`fe_oa_...`). The SAME one you used for the Codex CLI docs. |
| `CODE_SERVER_PASSWORD` | Strong password for the VS Code browser login |
| `CODE_SERVER_SUDO_PASSWORD` | A different strong password for sudo inside code-server's terminal |
If you're sharing the freemodel key between this stack and `claude-stack`, you can use the same key in both, provided your dashboard issues a single key for both product lines. They hit different endpoints (`api.` vs `cc.`) and consume from the same account.
Save (Ctrl+O → Enter → Ctrl+X).
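Before moving on, it can help to confirm that none of the three variables was left empty. The sketch below assumes `.env` uses plain `KEY=VALUE` lines; `env_value` and `check_env` are hypothetical helper names. It cannot tell a real value from an unreplaced placeholder, since the placeholder text is not known here.

```shell
#!/bin/sh
# Hypothetical helpers for illustration; they are not part of the stack.

# Print the last value assigned to a variable in a KEY=VALUE file
# (prints nothing if the variable is absent or empty).
env_value() {
  sed -n "s/^$2=//p" "$1" | tail -n 1
}

# Report every listed variable that is missing or empty.
# Returns 1 if at least one is, 0 otherwise.
check_env() {
  file="$1"
  shift
  rc=0
  for var in "$@"; do
    if [ -z "$(env_value "$file" "$var")" ]; then
      echo "missing: $var"
      rc=1
    fi
  done
  return "$rc"
}
```

For this stack the call would be `check_env .env OPENAI_API_KEY CODE_SERVER_PASSWORD CODE_SERVER_SUDO_PASSWORD`; no output means all three have a value.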
The Caddyfile is already wired for gptbyorm / gptcodebyorm. If you used
different prefixes, search-and-replace in Caddyfile. Also confirm the email
line points to a real inbox.
curl -i https://api.freemodel.dev/v1/chat/completions \
-H "Authorization: Bearer YOUR_FREEMODEL_KEY" \
-H "content-type: application/json" \
-d '{"model":"gpt-5.5","messages":[{"role":"user","content":"say hi"}],"max_tokens":30}'

- ✅ `HTTP/2 200` + JSON with text → key works, model slug works, ready to launch.
- ❌ `401 Unauthorized` → key is invalid or expired. Re-generate in dashboard.
- ❌ `404 Not Found` on the model → freemodel uses a different slug (e.g. `gpt-5`, `gpt-4o`). Copy the right slug into `litellm-config.yaml` and the `default:` list in `librechat.yaml`.
- ❌ `404 Not Found` on the path → freemodel uses Responses API only. See the "Responses API mode" section at the bottom.
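The outcomes listed above can be turned into a quick triage helper. This is a sketch rather than a complete diagnostic: `explain_status` and `probe_chat_completions` are hypothetical names, and the key is assumed to live in an environment variable called `FREEMODEL_KEY`. A bare status code cannot distinguish a wrong model slug from an unsupported path, so on a 404 you still need to read the response body.

```shell
#!/bin/sh
# Hypothetical helpers for illustration; they are not part of the stack.

# Map the HTTP status code of the test request to the next step.
explain_status() {
  case "$1" in
    200) echo "ok: key and model slug work" ;;
    401) echo "auth: key is invalid or expired" ;;
    404) echo "not found: wrong model slug or Chat Completions path unsupported" ;;
    *)   echo "unexpected status $1: read the response body" ;;
  esac
}

# Send the same test request as above and print only the HTTP status code.
# FREEMODEL_KEY is an assumed variable name holding your freemodel.dev key.
probe_chat_completions() {
  curl -s -o /dev/null -w '%{http_code}' \
    https://api.freemodel.dev/v1/chat/completions \
    -H "Authorization: Bearer $FREEMODEL_KEY" \
    -H "content-type: application/json" \
    -d '{"model":"gpt-5.5","messages":[{"role":"user","content":"say hi"}],"max_tokens":30}'
}
```

With the key exported, `explain_status "$(probe_chat_completions)"` prints a one-line verdict.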
docker compose up -d
docker compose logs -f

First boot takes 2–5 minutes (image pulls + Caddy fetching Let's Encrypt certs). Watch for:

- ✅ `litellm Proxy initialized`: LiteLLM is up
- ✅ `certificate obtained successfully` × 2: TLS works for both subdomains
- ✅ `LibreChat server listening on port 3080`
Press Ctrl+C to stop tailing (containers keep running).
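If you would rather not watch the log stream, the three markers can be counted in one pass. The sketch below assumes the log lines contain the phrases quoted above; the exact wording may differ between image versions, so treat a negative result as a prompt to read the logs, not as proof of failure. `boot_status` is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper for illustration; it is not part of the stack.

# Read compose logs on stdin and count the boot markers.
# Returns 0 only when LiteLLM, both certificates, and LibreChat are all seen.
boot_status() {
  logs="$(cat)"
  proxy="$(printf '%s\n' "$logs" | grep -c 'Proxy initialized' || true)"
  certs="$(printf '%s\n' "$logs" | grep -c 'certificate obtained successfully' || true)"
  chat="$(printf '%s\n' "$logs" | grep -c 'listening on port 3080' || true)"
  echo "litellm=$proxy certs=$certs librechat=$chat"
  [ "$proxy" -ge 1 ] && [ "$certs" -ge 2 ] && [ "$chat" -ge 1 ]
}
```

Run it as `docker compose logs --no-color | boot_status` from the `gpt-stack` folder.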
- Visit https://gptbyorm.duckdns.org → register your first LibreChat account → pick `gpt-5.5` from the model dropdown → say hi
- Visit https://gptcodebyorm.duckdns.org → log in using `CODE_SERVER_PASSWORD`
After your account is created, lock down signup:
sed -i 's/ALLOW_REGISTRATION=true/ALLOW_REGISTRATION=false/' .env
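docker compose up -d librechat

(`docker compose restart` would not pick up the `.env` change; `up -d` recreates the container with the new value.)

Inside the browser VS Code at https://gptcodebyorm.duckdns.org: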
- Extensions (Ctrl+Shift+X) → search Continue by Continue.dev → Install
- Open Continue side panel → "Configure manually"
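- Replace `~/.continue/config.json` with the following (newer Continue releases read `config.yaml` instead, so adapt the format if that is what your install created):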
{
  "models": [
    {
      "title": "GPT-5.5",
      "provider": "openai",
      "model": "gpt-5.5",
      "apiBase": "http://litellm:4000/v1",
      "apiKey": "<paste your LITELLM_MASTER_KEY from .env>"
    },
    {
      "title": "GPT-5.5 mini",
      "provider": "openai",
      "model": "gpt-5.5-mini",
      "apiBase": "http://litellm:4000/v1",
      "apiKey": "<same LITELLM_MASTER_KEY>"
    }
  ]
}

(`litellm:4000` is reachable directly inside the Docker network; no Caddy/DuckDNS needed.)
You can have BOTH folders on the VPS but only ONE running at a time (port 80/443 conflict).
# Switch from claude → gpt
cd ~/claude-stack && docker compose down
cd ~/gpt-stack && docker compose up -d
# Switch from gpt → claude
cd ~/gpt-stack && docker compose down
cd ~/claude-stack && docker compose up -d

Data is preserved across switches: MongoDB / code-server volumes live separately under each folder.
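The two switch sequences can be wrapped in one function so that the running stack is always stopped before the other one starts. This is a sketch under the assumption that both folders sit side by side; `switch_stack` is a hypothetical name, and `STACK_HOME` is an assumed override for the parent directory (it defaults to `$HOME`).

```shell
#!/bin/sh
# Hypothetical helper for illustration; it is not part of either stack.

# Stop one stack and start the other. Argument: gpt or claude.
switch_stack() {
  case "$1" in
    gpt)    from="claude-stack"; to="gpt-stack" ;;
    claude) from="gpt-stack";    to="claude-stack" ;;
    *) echo "usage: switch_stack gpt|claude" >&2; return 2 ;;
  esac
  base="${STACK_HOME:-$HOME}"
  # Free ports 80/443 first; abort if the old stack cannot be stopped.
  (cd "$base/$from" && docker compose down) || return 1
  (cd "$base/$to" && docker compose up -d)
}
```

`switch_stack gpt` then corresponds to the "claude → gpt" sequence and `switch_stack claude` to the reverse.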
# Update all images
docker compose pull && docker compose up -d
# Tail one service
docker compose logs -f librechat
# Restart after editing config
docker compose restart litellm
# Lock down signup once you've registered
sed -i 's/ALLOW_REGISTRATION=true/ALLOW_REGISTRATION=false/' .env
docker compose up -d librechat

freemodel's official Codex docs use `wire_api = "responses"`, which is OpenAI's
newer Responses API (/v1/responses instead of /v1/chat/completions). Most
resellers expose BOTH endpoints, so Chat Completions usually works. But if
step 6's curl gives 404 on the path, switch to Responses mode:
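Edit `litellm-config.yaml`: for each model block, leave `api_base` ending in `/v1` so that LiteLLM can reach `/v1/responses` under the same base URL. LiteLLM may switch over on its own when the upstream rejects `/chat/completions`. Treat the two fallbacks below as things to try, not as guaranteed LiteLLM behavior, and check the LiteLLM docs for your version. If the automatic switch doesn't happen, add this under each `litellm_params`:

extra_headers:
  x-prefer-responses-api: "true"

Or, as a last resort, set the env var `OPENAI_API_TYPE=responses` in `.env` and run `docker compose up -d litellm` (a plain `restart` does not reload `.env`).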
If you get stuck, paste the LiteLLM error log — there's usually a clear hint in the upstream response body.
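To find out which mode you need before editing any config, you can probe both endpoints and compare the status codes. The sketch below makes several assumptions: `post_status` and `pick_wire_api` are hypothetical names, the key is read from an assumed `FREEMODEL_KEY` variable, and the `/v1/responses` request uses the standard OpenAI Responses body (`model` plus `input`), which freemodel may or may not accept.

```shell
#!/bin/sh
# Hypothetical helpers for illustration; they are not part of the stack.

# POST a JSON body ($2) to a path ($1) on api.freemodel.dev and print
# only the HTTP status code. FREEMODEL_KEY is an assumed variable name.
post_status() {
  curl -s -o /dev/null -w '%{http_code}' \
    "https://api.freemodel.dev$1" \
    -H "Authorization: Bearer $FREEMODEL_KEY" \
    -H "content-type: application/json" \
    -d "$2"
}

# Decide which API style to configure from the two status codes:
# $1 = status from /v1/chat/completions, $2 = status from /v1/responses.
pick_wire_api() {
  if [ "$1" = "200" ]; then
    echo "chat"
  elif [ "$2" = "200" ]; then
    echo "responses"
  else
    echo "none"
  fi
}

# Example (needs network access and a valid key):
#   chat="$(post_status /v1/chat/completions '{"model":"gpt-5.5","messages":[{"role":"user","content":"say hi"}]}')"
#   resp="$(post_status /v1/responses '{"model":"gpt-5.5","input":"say hi"}')"
#   pick_wire_api "$chat" "$resp"
```

A result of `chat` means the default config should work as-is, `responses` means this section applies, and `none` points to a key or slug problem instead.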
- freemodel.dev product lines: `api.freemodel.dev` = GPT/Codex side, `cc.freemodel.dev` = Claude side. Same vendor, different APIs, possibly different keys (check your dashboard).
- GPT model slugs change: freemodel may version their model strings (`gpt-5.5-turbo-20251001` etc.). Check their dashboard if a slug fails.
- Prompt caching: where the upstream supports automatic prompt caching, the repeated conversation history in long chats is billed at a discounted rate, so long chats can cost less than the raw token counts suggest. Whether freemodel passes that discount on depends on their pricing.
- Big file uploads — LibreChat is set to 512 MB per file. Caddy passes that through without re-buffering. Don't sandwich Cloudflare in front (their free plan caps uploads at 100 MB).
- Conversation memory — every chat is stored in MongoDB and re-sent on each turn. Docker volumes persist them across restarts.
- DuckDNS account hygiene — log in to duckdns.org once every ~30 days to keep all subdomains active.
| File | Purpose |
|---|---|
| `docker-compose.yml` | Defines all 6 services (Caddy, LiteLLM, MongoDB, Meilisearch, LibreChat, code-server) |
| `Caddyfile` | Reverse-proxy + auto-HTTPS rules |
| `litellm-config.yaml` | GPT model routing through `api.freemodel.dev` |
| `librechat.yaml` | LibreChat → LiteLLM wiring + file-upload limits |
| `.env` | All secrets (THIS FILE CONTAINS YOUR API KEY, keep private) |
| `.env.example` | Template only, safe to commit / share |