Flip between local LLM runtimes from your menu bar.
One click to activate. One click to stop everything.
Running local models on an Apple Silicon Mac usually means a sprawl of terminal windows, half-remembered launch scripts, and no clean way to see what's actually running. Model Switchboard puts llama.cpp, MLX, Ollama, vLLM, SGLang, TGI, MLC-LLM, Mistral.rs, oMLX, vLLM-MLX, rVLLM MLX, LM Studio, Jan, and named command launchers behind one menu bar panel. Click Activate, every other model stops, and the one you picked comes up at an OpenAI-compatible endpoint.
No terminals. No orphan processes. No "green dot" lies.
Pick a profile. Click Activate. Every other model stops. The chosen runtime comes up at an OpenAI-compatible endpoint, and the menu bar reflects the state in real time.
Activate stops every other profile and brings the chosen one up. No more forgetting to kill -9 a 24 GB process before starting the next one. Profiles are marked ready only after a real /v1/models probe (or your custom HTTP check) passes — if it says green, it means green. Built with SwiftUI and MenuBarExtra: no Electron, no bundled inference engine, no resident background worker pegging your CPU.
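The activate-one, stop-the-rest behavior can be pictured with a small in-memory model. This is only a sketch: the `Profile` class, the `activate` function, and the injected `probe` callable are hypothetical names for illustration, and the real controller manages actual processes rather than flags.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class Profile:
    # Hypothetical in-memory stand-in for a controller-managed profile.
    name: str
    running: bool = False
    ready: bool = False


def activate(profiles: Dict[str, Profile], target: str,
             probe: Callable[[Profile], bool]) -> Profile:
    """Stop every other profile, start the target, and gate 'ready' on the probe."""
    if target not in profiles:
        raise KeyError(f"unknown profile: {target}")
    for name, profile in profiles.items():
        if name != target:
            # A real controller would terminate the process here.
            profile.running = False
            profile.ready = False
    chosen = profiles[target]
    chosen.running = True
    # 'ready' reflects a passing probe, never just "the process started".
    chosen.ready = probe(chosen)
    return chosen


profiles = {"qwen": Profile("qwen", running=True, ready=True), "gemma": Profile("gemma")}
chosen = activate(profiles, "gemma", probe=lambda p: False)
print(chosen.running, chosen.ready, profiles["qwen"].running)
```

With a probe that fails, the chosen profile is running but not ready, which is the distinction the green status is meant to protect.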
Model Switchboard is runtime-oriented, not model-family-oriented: if your runtime can expose an OpenAI-compatible endpoint, the app can track it, health-check it, switch it, and tag it. That means Qwen, Gemma, Llama, Mistral, GLM, DeepSeek, and other local models are supported through whichever backend serves them.
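As a rough illustration of what health-checking an OpenAI-compatible endpoint involves, the sketch below issues a GET to `/v1/models` and treats anything other than a well-formed 200 response as not ready. The function names and the exact acceptance rule are assumptions for illustration, not the controller's actual implementation.

```python
import http.client
import json
import urllib.error
import urllib.request


def models_response_ok(status: int, body: bytes) -> bool:
    """Return True when a /v1/models response looks like a ready server."""
    if status != 200:
        return False
    try:
        payload = json.loads(body)
    except ValueError:
        return False
    # Assumption: a ready OpenAI-compatible server returns an object with a "data" list.
    return isinstance(payload, dict) and isinstance(payload.get("data"), list)


def probe_ready(base_url: str, timeout: float = 2.0) -> bool:
    """GET <base_url>/v1/models; any network or protocol error means not ready."""
    url = base_url.rstrip("/") + "/v1/models"
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return models_response_ok(response.status, response.read())
    except (urllib.error.URLError, http.client.HTTPException, OSError, ValueError):
        return False
```

For the llama.cpp example profile later in this README, a call would look like `probe_ready("http://127.0.0.1:8080")`. A process that is listening but still loading weights would typically fail this check, which is why a probe is a stronger signal than a running PID.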
| Support level | Runtimes and providers |
|---|---|
| Native command adapters | llama.cpp, MLX / mlx_lm.server, rVLLM MLX, vLLM-MLX, Ollama, vLLM, SGLang, Hugging Face TGI, llama-cpp-python |
| Named launcher profiles | DDTree MLX, TurboQuant, oMLX, Mistral.rs, MLC-LLM, LightLLM, FastChat, OpenLLM, Nexa, ExLlamaV2, Aphrodite, LMDeploy, MLX Omni Server, MLX OpenAI Server, MLX Serve, text-generation-webui, KoboldCpp, TabbyAPI, llamafile |
| External endpoints | LM Studio, Jan, LocalAI, LiteLLM, ollmlx, Triton-backed OpenAI-compatible servers, or any local/custom OpenAI-compatible base URL |
Use RUNTIME_TAGS for model-level traits such as coding, q8, long-context, vision, or agentic. The full canonical runtime table lives in Controller/RUNTIME_SUPPORT.md.
Same codebase, two apps. Pick at install time. They live side by side as Model Switchboard.app and Model Switchboard Plus.app under ~/Applications/.
The controller contract, profile discovery, runtime tags, and launcher support are shared by both editions. Plus adds the extra operator UI: live utilization badges, benchmarks, reopen-last, and integrations.
| Feature | Base | Plus |
|---|---|---|
| Profile list with live status | ✓ | ✓ |
| Activate / Start / Stop / Restart | ✓ | ✓ |
| Refresh / Stop All | ✓ | ✓ |
| Launch At Login + attached Settings / Help | ✓ | ✓ |
| CPU / RAM / GPU utilization badges | — | ✓ |
| Benchmark All + per-profile Benchmark | — | ✓ |
| In-app Benchmarks panel + CSV export | — | ✓ |
| Reopen Last | — | ✓ |
| Sync Droid and future integration adapters | — | ✓ |
- macOS 14 (Sonoma) or later
- Apple Silicon recommended. Intel Macs run the app fine, but MLX runtimes require Apple Silicon.
- A running controller that exposes the controller contract. This repo ships a reference controller under Controller/.
Signed DMG (recommended). Grab the latest from Releases:
- Model-Switchboard-<version>.dmg (Base)
- Model-Switchboard-Plus-<version>.dmg (Plus)
Open, drag to Applications, launch.
From source.
git clone https://github.com/AdityaVG13/Model-Switchboard.git
cd Model-Switchboard
./Scripts/install.sh # Base
./Scripts/install.sh --variant plus # Plus
The installer places a fresh build under ~/Applications/, installs model-switchboardctl to ~/.local/bin, writes bash/zsh/fish completions, registers the app with Launch Services, and forces a Spotlight import so Raycast and Alfred pick it up immediately. Use ./Scripts/install.sh --help for quiet mode, custom install paths, --verify, and --skip-open.
Model Switchboard is the control surface. It does not run models itself. You need a controller that knows how to launch and health-check models.
1. Install the reference controller:
./Controller/install-model-switchboard-controller.sh
The controller exposes its API at http://127.0.0.1:8877 under a per-user LaunchAgent. Use --root, --host, --port, --no-start, or --verify when installing a dedicated controller checkout.
2. Drop a profile manifest into the controller's model-profiles/ folder (the exact path is shown in Settings). If you run the reference controller in this repo, that is Controller/model-profiles/; if you keep a dedicated controller root, it is <controller-root>/model-profiles/. A minimal llama.cpp example:
DISPLAY_NAME=Qwen 3.5 35B Local
RUNTIME=llama.cpp
MODEL_PATH=/path/to/model.gguf
PORT=8080
REQUEST_MODEL=qwen35-local
SERVER_MODEL_ID=qwen35-local
3. Open the menu bar icon. Your profile appears. Click Activate.
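To make the "declarative key/value, not shell" nature of a `.env` profile concrete, here is a minimal sketch of reading the llama.cpp example above as plain data. The parser name, the `#` comment handling, and the literal treatment of values are assumptions; the controller's real parser may differ (for example in how it treats quoting).

```python
from typing import Dict


def parse_env_profile(text: str) -> Dict[str, str]:
    """Parse declarative KEY=VALUE lines without evaluating anything as shell."""
    profile: Dict[str, str] = {}
    for line_number, raw in enumerate(text.splitlines(), start=1):
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blanks and comment lines (comment syntax is an assumption)
        key, sep, value = line.partition("=")
        if not sep or not key.strip():
            raise ValueError(f"line {line_number}: expected KEY=VALUE, got {raw!r}")
        # Values are kept literally: no variable expansion, no command substitution.
        profile[key.strip()] = value.strip()
    return profile


example = """\
DISPLAY_NAME=Qwen 3.5 35B Local
RUNTIME=llama.cpp
MODEL_PATH=/path/to/model.gguf
PORT=8080
REQUEST_MODEL=qwen35-local
SERVER_MODEL_ID=qwen35-local
"""
profile = parse_env_profile(example)
print(profile["DISPLAY_NAME"], profile["PORT"])
```

Splitting on the first `=` only means values may contain spaces or further `=` characters, which matches how `DISPLAY_NAME` is written in the example.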
Every profile must resolve to a unique endpoint. Reusing the same HOST:PORT or BASE_URL across two profiles is a configuration error, and the controller doctor will flag it.
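The uniqueness rule can be checked mechanically. The sketch below is not the controller doctor itself, only an illustration of the idea: normalize each profile to a `host:port` key and report any key claimed more than once. The function names and the default host of `127.0.0.1` are assumptions.

```python
from collections import defaultdict
from typing import Dict, List
from urllib.parse import urlsplit


def endpoint_key(profile: Dict[str, str]) -> str:
    """Normalize a profile to the host:port it would occupy."""
    base_url = profile.get("BASE_URL", "").strip()
    if base_url:
        parts = urlsplit(base_url)
        port = parts.port or (443 if parts.scheme == "https" else 80)
        return f"{(parts.hostname or '').lower()}:{port}"
    # Assumption: HOST defaults to 127.0.0.1 when a profile only sets PORT.
    host = profile.get("HOST", "127.0.0.1").strip().lower()
    return f"{host}:{profile.get('PORT', '').strip()}"


def find_endpoint_collisions(profiles: Dict[str, Dict[str, str]]) -> Dict[str, List[str]]:
    """Map each endpoint claimed by two or more profiles to those profile names."""
    claimed = defaultdict(list)
    for name, profile in profiles.items():
        claimed[endpoint_key(profile)].append(name)
    return {endpoint: names for endpoint, names in claimed.items() if len(names) > 1}


profiles = {
    "qwen": {"PORT": "8080"},
    "gemma": {"HOST": "127.0.0.1", "PORT": "8080"},
    "lmstudio": {"BASE_URL": "http://127.0.0.1:1234/v1"},
}
print(find_endpoint_collisions(profiles))
```

This sketch compares host names as written, so `localhost` and `127.0.0.1` would not be treated as the same endpoint; a stricter check would resolve them first.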
Using your own runtime or launcher? Any OpenAI-compatible endpoint works. The controller has adapters and tags for MLX, Ollama, vLLM, SGLang, TGI, llama-cpp-python, rVLLM MLX, vLLM-MLX, DDTree MLX, TurboQuant, Mistral.rs, MLC-LLM, LightLLM, FastChat, OpenLLM, Nexa, ExLlamaV2, Aphrodite, LMDeploy, LiteLLM, external endpoints, and generic binaries. See runtime support.
I already downloaded the app — set up the controller for me
Paste this prompt into your favorite AI agent to wire the controller up against the runtimes you actually have installed:
I already downloaded Model Switchboard on my Mac.
Set up the reference controller for me, create working model profiles for the runtimes I actually have installed, and make the configuration portable instead of hardcoding your own assumptions.
Rules:
- This is macOS-only, and that is intentional.
- Do not hardcode Homebrew paths, repo-local build paths, or personal directories unless you first verify they exist on this machine.
- Prefer profile-driven config:
  - `MODEL_PATH` or `MODEL_FILE` with `MODEL_ROOT` for llama.cpp
  - `MODEL_DIR` or `MODEL_REPO` for MLX
  - `SERVER_BIN` when a runtime binary is not already on PATH
  - A named `RUNTIME` plus `START_COMMAND` or `SERVER_BIN` for launchers without a native adapter yet
  - JSON profiles for structured values; `.env` profiles are declarative key/value files, not shell scripts
- Use the controller contract and profile format documented in this repo's `SETUP.md`.
- Put profiles in the controller's `model-profiles` directory.
- Verify that each profile can be started, health-checked, and stopped cleanly.
- If something is missing, inspect the machine and ask me only the minimum necessary question.
End state:
- Model Switchboard opens with valid profiles visible
- `Activate` works
- health checks go green only when the endpoint is actually ready
- nothing is tied to one specific Mac beyond what is truly installed here
If something looks off, these labels tell you what the app is seeing right now:
| Label | Where it appears | Meaning |
|---|---|---|
| RUNNING | Profile card badge | Process is currently running. |
| NOT RUNNING | Profile card badge | Process is not running. |
| STARTING / STOPPING / RESTARTING / ACTIVATING | Profile card badge | Action is in progress for that profile. |
| STALE | Footer chip near the clock | Last successful status refresh is older than ~45 seconds. |
| CACHED | Footer chip near the clock | Controller was temporarily unavailable and the app is showing last cached status. |
| ERROR | Footer chip near the clock | Latest refresh or action returned an error. |
| RUNNING / READY | Plus Benchmarks panel | Benchmark job is active / no benchmark currently running. |
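One way to read the three footer chips is as a small decision function over the app's refresh state. The sketch below is an interpretation of the table, not the app's actual logic: the function name, the exact 45-second threshold, the priority order, and the idea that only one chip shows at a time are all assumptions.

```python
from typing import Optional

# The table says "~45 seconds"; the exact threshold used here is an assumption.
STALE_AFTER_SECONDS = 45.0


def footer_chip(seconds_since_last_success: float,
                showing_cached_status: bool,
                last_error: Optional[str]) -> Optional[str]:
    """Pick at most one footer chip; the priority order is an assumption."""
    if last_error:
        return "ERROR"
    if showing_cached_status:
        return "CACHED"
    if seconds_since_last_success > STALE_AFTER_SECONDS:
        return "STALE"
    return None  # fresh status, nothing to flag


print(footer_chip(60.0, False, None))
```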
See CHANGELOG.md for release-by-release detail and Releases for signed DMG downloads. Current version: 1.1.5.
- Raycast: extension scripts under Integrations/Raycast/ call the same controller API the menu bar uses.
- SwiftBar: drop-in plugin at Controller/swiftbar/local-models.15s.sh renders the same status, start, stop, and restart actions in any SwiftBar-friendly menu.
- Factory Droid: Sync Droid (Plus only) pushes managed profiles into Droid's custom-model settings. First of several planned sync adapters.
All the deeper material lives in one place so this README stays skimmable:
- SETUP.md: profile formats, supported runtimes, health checks, controller API contract, build-from-source flow, release pipeline, Raycast power-user notes, troubleshooting, and known limitations.
- Controller/RUNTIME_SUPPORT.md: canonical runtime table, launch modes, profile templates, readiness modes.
- CHANGELOG.md: release-by-release changes and distribution hardening notes.
The app's Help button opens the same doc.
PRs, issues, and profile recipes are welcome. A few ground rules that keep the project reusable:
- Keep the app generic. Runtime-specific behavior belongs in the controller or a profile manifest.
- The controller HTTP contract is the stability boundary. Make additive changes only.
- External tools stay optional integrations, never required features.
- Ship a runnable example with any new adapter.
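For contributors wondering what "additive changes only" means in practice, one possible check is to compare an old and a new JSON response and flag anything removed or retyped while allowing new keys. This is a sketch under stated assumptions: the function name and the sample payload fields are hypothetical and do not describe the real controller contract.

```python
from typing import Any, List


def breaking_changes(old: Any, new: Any, path: str = "$") -> List[str]:
    """List the places where `new` drops or retypes something `old` had."""
    if isinstance(old, dict):
        if not isinstance(new, dict):
            return [f"{path}: object became {type(new).__name__}"]
        problems: List[str] = []
        for key, old_value in old.items():
            if key not in new:
                problems.append(f"{path}.{key}: removed")
            else:
                problems.extend(breaking_changes(old_value, new[key], f"{path}.{key}"))
        return problems  # keys present only in `new` are additive and allowed
    if type(old) is not type(new):
        return [f"{path}: {type(old).__name__} became {type(new).__name__}"]
    return []


# Hypothetical payloads, for illustration only.
old = {"profile": {"name": "qwen", "running": True}}
new = {"profile": {"name": "qwen", "running": False, "tags": ["q8"]}, "extra": 1}
print(breaking_changes(old, new))
```

An empty list means every field an existing client could rely on is still present with the same type, so older app builds should keep working against the newer response.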
Sync Droid is currently Factory-Droid-specific because that's the agent I run. The integration slot is generic, but the adapter is not. PRs that add sync adapters for other local-model terminals or agentic tools are very welcome, including but not limited to Cursor, Windsurf, OpenAI Codex CLI, Zed, Continue, Aider, LM Studio, Ollama chat frontends, or any OpenAI-compatible consumer.
If you build one, follow the shape of Controller/sync-droid-local-models.py and register it under Controller/integrations/ so it shows up in the Plus menu automatically. Full contributor guide and the release pipeline live in SETUP.md.
Before opening a PR:
swift test && ./Scripts/check-cycles.py && ./Scripts/build-app.sh
For maintainers, release prep is scriptable instead of hand-editing version files:
python3 Scripts/bump-version.py patch # or minor / major / x.y.z
./Scripts/release-preflight.sh
MIT © 2026 AdityaVG13



