Run local LLMs on Apple Silicon via [mlx-lm](https://github.com/ml-explore/mlx-examples/tree/main/llms/mlx_lm). Menubar app + CLI with an OpenAI-compatible API on `localhost:11434`.
Updated May 20, 2026 · Swift
Lightweight evaluation framework for Retrieval Augmented Generation systems, focused on simplicity and long-term consistency.
CLI + API server to download, manage, and run 500K+ HuggingFace models locally, with Ollama and OpenAI compatibility.
Android AI inference server with OpenAI-compatible API. Turn your phone into a local LLM co-processor — runs MLC LLM (GGUF) + LiteRT-LM (.litertlm) with dual-engine routing, bearer auth, thermal governor, KV cache, and resumable model downloads. No cloud, no GPU, no friction.