██████╗ █████╗ ███╗ ███╗██████╗ ██╗
██╔════╝ ██╔══██╗████╗ ████║██╔══██╗██║
██║ ███╗███████║██╔████╔██║██████╔╝██║
██║ ██║██╔══██║██║╚██╔╝██║██╔══██╗██║
╚██████╔╝██║ ██║██║ ╚═╝ ██║██████╔╝██║
╚═════╝ ╚═╝ ╚═╝╚═╝ ╚═╝╚═════╝ ╚═╝
Gambi is a local-first system for sharing OpenAI-compatible LLM endpoints across a trusted network. A central hub tracks rooms and participants, proxies inference requests, and publishes real-time events over SSE.
Participants now connect through a hub-managed tunnel. The hub never needs direct network reachability to the participant's provider endpoint, so localhost and provider credentials can remain local to the participant machine.
The public name Gambi is the short form of gambiarra. Here it means the good kind: creative improvisation under constraints, turned into a practical tool.
Gambi exposes two distinct surfaces:
- Management plane: native Gambi HTTP endpoints under /v1, plus the operational CLI and SDK management client.
- Inference plane: OpenAI-compatible room-scoped endpoints under /rooms/:code/v1/*, consumed by createGambi() and other OpenAI-compatible clients.
That split is deliberate. The management plane is optimized for agents and automation. The inference plane is optimized for application compatibility.
The default inference protocol is the OpenAI Responses API. Chat Completions remains available as a compatibility surface.
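To make the split concrete, here is a minimal sketch of how a client might build URLs for each plane. The helper names (managementUrl, inferenceUrl) and the InferenceProtocol type are hypothetical and not part of gambi-sdk; only the path prefixes come from this document.

```typescript
// Sketch only: these helpers are hypothetical, not part of gambi-sdk.
type InferenceProtocol = "openResponses" | "chatCompletions";

// Strip trailing slashes so joined URLs never contain "//".
function trimTrailingSlash(url: string): string {
  return url.replace(/\/+$/, "");
}

// Management plane: native Gambi endpoints under /v1.
function managementUrl(hubUrl: string, path: string): string {
  return `${trimTrailingSlash(hubUrl)}/v1/${path.replace(/^\/+/, "")}`;
}

// Inference plane: room-scoped endpoints under /rooms/:code/v1/*.
// Responses is the default protocol; Chat Completions is the compatibility path.
function inferenceUrl(
  hubUrl: string,
  roomCode: string,
  protocol: InferenceProtocol = "openResponses",
): string {
  const suffix = protocol === "openResponses" ? "responses" : "chat/completions";
  return `${trimTrailingSlash(hubUrl)}/rooms/${encodeURIComponent(roomCode)}/v1/${suffix}`;
}

console.log(managementUrl("http://localhost:3000", "rooms"));
console.log(inferenceUrl("http://localhost:3000", "ABC123"));
console.log(inferenceUrl("http://localhost:3000", "ABC123", "chatCompletions"));
```

Any OpenAI-compatible client can be pointed at the room-scoped base URL, while automation talks to the /v1 management paths.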
Linux / macOS:
curl -fsSL https://gambi.sh/install | bash
Windows:
irm https://gambi.sh/install.ps1 | iex
npm / bun:
npm install -g gambi
# or
bun add -g gambi
Verify:
gambi --version
SDK:
npm install gambi-sdk
# or
bun add gambi-sdk
gambi-tui is the human-first monitoring interface. It is separate from the CLI.
bun add -g gambi-tui
Start a hub:
gambi hub serve
Machine-readable dry run:
gambi hub serve --dry-run --format ndjson
Create a room:
gambi room create --name "Demo"
With room defaults from JSON:
gambi room create --name "Demo" --config ./room-defaults.json
Join the room as a participant:
gambi participant join \
  --room ABC123 \
  --participant-id worker-1 \
  --model llama3 \
  --endpoint http://localhost:11434
gambi participant join probes the local endpoint, registers the participant, opens a participant tunnel back to the hub, and keeps the session alive until interrupted. This works the same way for local hubs and remote hubs on the same trusted network: the endpoint can stay loopback-only on the participant machine.
Preview the registration flow:
gambi participant join \
  --room ABC123 \
  --participant-id worker-1 \
  --model llama3 \
  --dry-run \
  --format ndjson
Watch room events:
gambi events watch --room ABC123
As NDJSON for scripts:
gambi events watch --room ABC123 --format ndjson
Room event streams include lifecycle signals such as llm.request, llm.complete, and llm.error.
llm.complete includes baseline observability metrics when available:
- ttftMs
- durationMs
- inputTokens
- outputTokens
- totalTokens
- tokensPerSecond
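As an illustration of consuming these metrics from a script, the sketch below summarizes an NDJSON event stream. It assumes each line is a JSON object with a type and a data field carrying the metrics listed above; that line shape, and the summarize helper itself, are assumptions for illustration rather than a documented contract.

```typescript
// Assumed per-line shape: { "type": string, "data": { ...metrics } }.
interface CompletionMetrics {
  ttftMs?: number;
  durationMs?: number;
  inputTokens?: number;
  outputTokens?: number;
  totalTokens?: number;
  tokensPerSecond?: number;
}

interface RoomEvent {
  type: string;
  data?: CompletionMetrics;
}

interface Summary {
  completions: number;
  errors: number;
  totalOutputTokens: number;
  meanTtftMs: number | null;
}

// Hypothetical helper: aggregates llm.complete and llm.error events.
function summarize(ndjson: string): Summary {
  let completions = 0;
  let errors = 0;
  let totalOutputTokens = 0;
  const ttfts: number[] = [];

  for (const line of ndjson.split("\n")) {
    if (line.trim() === "") continue; // skip blank lines
    const event = JSON.parse(line) as RoomEvent;
    if (event.type === "llm.error") errors += 1;
    if (event.type !== "llm.complete") continue;
    completions += 1;
    // Metrics are only present "when available", so treat each as optional.
    totalOutputTokens += event.data?.outputTokens ?? 0;
    const ttft = event.data?.ttftMs;
    if (typeof ttft === "number") ttfts.push(ttft);
  }

  const meanTtftMs =
    ttfts.length > 0 ? ttfts.reduce((a, b) => a + b, 0) / ttfts.length : null;
  return { completions, errors, totalOutputTokens, meanTtftMs };
}

// Made-up sample lines, not real hub output.
const sample = [
  '{"type":"llm.request","data":{}}',
  '{"type":"llm.complete","data":{"ttftMs":100,"outputTokens":40}}',
  '{"type":"llm.complete","data":{"ttftMs":300,"outputTokens":60}}',
  '{"type":"llm.error","data":{}}',
].join("\n");

console.log(summarize(sample));
```

In practice the input would come from the NDJSON output of gambi events watch rather than an inline sample.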
import { createGambi } from "gambi-sdk";
import { generateText } from "ai";
const gambi = createGambi({
  roomCode: "ABC123",
  hubUrl: "http://localhost:3000",
});
const result = await generateText({
  model: gambi.any(),
  prompt: "Explain how SSE works.",
});
console.log(result.text);
With hub and room discovery:
import { createGambi, resolveGambiTarget } from "gambi-sdk";
import { generateText } from "ai";
const target = await resolveGambiTarget({
  roomCode: "ABC123",
  timeoutMs: 1500,
});
const gambi = createGambi({
  hubUrl: target.hubUrl,
  roomCode: target.roomCode,
});
const result = await generateText({
  model: gambi.any(),
  prompt: "Hello from a discovered room.",
});
Use this when your app is running on a local network and you want to resolve the hub and room before creating the provider. For fixed deployments, you can keep passing hubUrl and roomCode directly.
Management client:
import { createClient } from "gambi-sdk";
const client = createClient({ hubUrl: "http://localhost:3000" });
const created = await client.rooms.create({ name: "Ops" });
console.log(created.data.room.code);
const participants = await client.participants.list(created.data.room.code);
console.log(participants.data.length);
The CLI is resource-oriented:
gambi hub serve
gambi room create
gambi room list
gambi room get
gambi participant join
gambi participant leave
gambi participant heartbeat
gambi events watch
gambi self update
Agent-first behavior:
- --format text|json|ndjson on the operational commands
- --interactive and --no-interactive
- default json or ndjson when stdout is piped
- XDG config at ~/.config/gambi/config.json
- --config - for stdin-driven JSON on commands that accept runtime config
Example config:
{
  "defaultEnv": "local",
  "envs": {
    "local": {
      "hubUrl": "http://localhost:3000",
      "endpoint": "http://localhost:11434"
    },
    "staging": {
      "hubUrl": "http://192.168.1.10:3000",
      "endpoint": "http://localhost:11434"
    }
  }
}
Use createGambi() when your application wants inference through the OpenAI-compatible room endpoints:
const gambi = createGambi({ roomCode: "ABC123" });
gambi.any();
gambi.participant("worker-1");
gambi.model("llama3");
gambi.openResponses.any();
gambi.chatCompletions.any();
The top-level helpers default to openResponses. Use the chatCompletions namespace only when you need explicit compatibility with legacy clients or providers.
Use resolveGambiTarget() when the room or hub should be discovered from the local network first:
import { createGambi, resolveGambiTarget } from "gambi-sdk";
const target = await resolveGambiTarget({
roomCode: "ABC123",
});
const gambi = createGambi(target);
The SDK also exposes discoverHubs() and discoverRooms() for lower-level discovery workflows.
Use createClient() when your application needs operational control:
const client = createClient({ hubUrl: "http://localhost:3000" });
await client.rooms.list();
await client.rooms.get("ABC123");
await client.participants.upsert("ABC123", "worker-1", {
  nickname: "worker-1",
  model: "llama3",
  endpoint: "http://192.168.1.25:11434",
});
await client.participants.heartbeat("ABC123", "worker-1");
await client.participants.remove("ABC123", "worker-1");
Room event watching:
for await (const event of client.events.watchRoom({ roomCode: "ABC123" })) {
  console.log(event.type, event.data);
}
Management API:
- GET /v1/health
- GET /v1/rooms
- POST /v1/rooms
- GET /v1/rooms/:code
- GET /v1/rooms/:code/participants
- PUT /v1/rooms/:code/participants/:id
- DELETE /v1/rooms/:code/participants/:id
- POST /v1/rooms/:code/participants/:id/heartbeat
- GET /v1/rooms/:code/events
Inference API:
- GET /rooms/:code/v1/models
- POST /rooms/:code/v1/responses
- GET /rooms/:code/v1/responses/:id
- DELETE /rooms/:code/v1/responses/:id
- POST /rooms/:code/v1/responses/:id/cancel
- GET /rooms/:code/v1/responses/:id/input_items
- POST /rooms/:code/v1/chat/completions
Management responses use envelopes:
{
  "data": {
    "status": "ok",
    "timestamp": 1743884000000
  },
  "meta": {
    "requestId": "req_123"
  }
}
Management errors are structured:
{
  "error": {
    "code": "ROOM_NOT_FOUND",
    "message": "Room 'ABC123' not found.",
    "hint": "Create the room first or verify the code."
  },
  "meta": {
    "requestId": "req_456"
  }
}
Rooms and participants can both provide runtime defaults. The hub merges them at proxy time with this precedence:
- room defaults
- participant defaults
- request-time overrides
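One way to picture the merge is as layered object composition. The sketch below assumes the list runs from lowest to highest precedence, so request-time overrides win, and that the merge is shallow. Both points are assumptions, as are the RuntimeConfig type, the mergeRuntimeConfig helper, and the example field names; the hub's actual merge may differ.

```typescript
// Hypothetical config shape; the real runtime config fields may differ.
type RuntimeConfig = Record<string, unknown>;

// Assumption: layers are passed lowest precedence first, so later layers win.
function mergeRuntimeConfig(
  roomDefaults: RuntimeConfig,
  participantDefaults: RuntimeConfig,
  requestOverrides: RuntimeConfig,
): RuntimeConfig {
  const merged: RuntimeConfig = {};
  for (const layer of [roomDefaults, participantDefaults, requestOverrides]) {
    for (const [key, value] of Object.entries(layer)) {
      // Skip undefined so an unset field never erases a lower layer's value.
      if (value !== undefined) merged[key] = value;
    }
  }
  return merged;
}

// Illustrative field names only.
const effective = mergeRuntimeConfig(
  { temperature: 0.7, maxOutputTokens: 512 }, // room defaults
  { temperature: 0.2 },                       // participant defaults
  { maxOutputTokens: 128 },                   // request-time overrides
);
console.log(effective);
```

Here the participant's temperature replaces the room's, and the request's maxOutputTokens replaces the room's, which is the behavior the assumed ordering implies.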
Sensitive config is redacted from public management responses. Public room and participant payloads expose safe summaries instead of raw secrets or instructions.
Participant registrations also expose tunnel connection state through connection, including whether the tunnel is currently connected and the timestamp of the last tunnel heartbeat seen by the hub.
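Because every management response arrives as either a data envelope or an error envelope, a raw HTTP client may want to unwrap them in one place. The following is a minimal sketch based only on the two JSON samples shown earlier; the type names, the ManagementError class, and the unwrap helper are hypothetical and not taken from gambi-sdk.

```typescript
interface Meta {
  requestId: string;
}

interface SuccessEnvelope<T> {
  data: T;
  meta: Meta;
}

interface ErrorEnvelope {
  error: { code: string; message: string; hint?: string };
  meta: Meta;
}

type Envelope<T> = SuccessEnvelope<T> | ErrorEnvelope;

// Hypothetical error type that keeps the structured fields together.
class ManagementError extends Error {
  readonly code: string;
  readonly hint: string | undefined;
  readonly requestId: string;

  constructor(envelope: ErrorEnvelope) {
    super(envelope.error.message);
    this.name = "ManagementError";
    this.code = envelope.error.code;
    this.hint = envelope.error.hint;
    this.requestId = envelope.meta.requestId;
  }
}

// Return the payload of a success envelope, or throw for an error envelope.
function unwrap<T>(envelope: Envelope<T>): T {
  if ("error" in envelope) throw new ManagementError(envelope);
  return envelope.data;
}

interface Health {
  status: string;
  timestamp: number;
}

const okBody =
  '{"data":{"status":"ok","timestamp":1743884000000},"meta":{"requestId":"req_123"}}';
const health = unwrap(JSON.parse(okBody) as Envelope<Health>);
console.log(health.status);
```

Keeping requestId on the thrown error makes it easier to correlate a failed call with hub logs.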
Streaming commands always emit NDJSON for machine-readable output. If you pass --format json to a streaming command, the CLI coerces it to ndjson.
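The format rules can be sketched as a small pure function. The coercion of json to ndjson for streaming commands comes from the sentence above. Two parts are assumptions: that text is the default on an interactive terminal, and that piped output picks ndjson for streaming commands and json otherwise. The resolveFormat helper and its FormatContext type are hypothetical.

```typescript
type OutputFormat = "text" | "json" | "ndjson";

interface FormatContext {
  requested?: OutputFormat; // value of --format, if one was passed
  stdoutIsTTY: boolean;     // false when stdout is piped
  streaming: boolean;       // true for commands such as events watch
}

// Hypothetical helper mirroring the documented format behavior.
function resolveFormat(ctx: FormatContext): OutputFormat {
  if (ctx.requested !== undefined) {
    // Documented: streaming commands coerce json to ndjson.
    if (ctx.streaming && ctx.requested === "json") return "ndjson";
    return ctx.requested;
  }
  // Assumption: interactive terminals default to human-readable text.
  if (ctx.stdoutIsTTY) return "text";
  // Assumption: piped output is ndjson for streams, json otherwise.
  return ctx.streaming ? "ndjson" : "json";
}

console.log(resolveFormat({ requested: "json", stdoutIsTTY: true, streaming: true }));
console.log(resolveFormat({ stdoutIsTTY: false, streaming: false }));
```

Resolving the format in one place keeps scripts predictable: a pipeline never receives human-oriented text unless it asks for it.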
bun install
bun run dev
bun run dev:hub
bun run dev:cli -- --help
bun run dev:cli -- room list --format json
bun run dev:cli -- hub serve --dry-run --format ndjson
bun run build
bun run check-types
Root dev workflow:
- bun run dev and bun run dev:hub start the hub with gambi hub serve
- bun run dev:cli -- <subcommand...> forwards any CLI command from the repo root
- bun run dev:monitor is a TUI alias for human-first monitoring
- Prefer bun run dev:cli -- room create --help and bun run dev:cli -- participant join --help for CLI discovery during development
Workspace-specific:
bun run --cwd packages/core check-types
bun run --cwd packages/cli check-types
bun run --cwd packages/sdk check-types
bun run --cwd apps/tui test
Gambi is designed for trusted local networks. The hub does not provide built-in authentication. Do not expose it directly to the public internet without an external proxy and auth layer.
For longer-term product direction, see:
- docs/reference/architecture.md for the current transport and proxy model
- docs/reference/observability.md for baseline metrics and future observability work
- docs/product/vision.md for the future gambi agents direction above the current hub