A collection of useful AI services for AI sovereignty.
This repository contains a set of containerized AI services that can be run locally to provide various AI capabilities without relying on external cloud providers. Each service is designed to be easy to deploy and use.
Local LLM inference with multiple backends (vLLM, llama.cpp, SGLang, MLX) and hardware targets (RTX PRO 6000, DGX Spark, AMD Vulkan, Apple Silicon).
| Model | Description | Location |
|---|---|---|
| Qwen 3.5 | Flagship family, 0.8B to 122B variants | models/qwen3.5 |
| Qwen3-Coder-Next | 80B MoE coding specialist | models/qwen3-coder-next |
| Qwopus | Opus-reasoning distilled 27B | models/qwopus |
| GLM-4.7-Flash | 30B MoE, ~3.6B active params | models/glm-4.7-flash |
Shared test and benchmark scripts live in models/shared.
| Service | Description | Location | Port |
|---|---|---|---|
| Whisper | Speech-to-text using OpenAI Whisper | speech/whisper | 8000 |
| Faster Whisper | Optimized Whisper variant | speech/faster-whisper | — |
| Orpheus TTS | High-quality voice synthesis | speech/orpheus | 5005 |
Ollama is a server that runs large language models (LLMs) locally with GPU acceleration support.
- Features: supports various open-source models and provides API access
- Location: ollama
- Port: 11434
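
As a minimal sketch of the API access mentioned above, the snippet below sends a single non-streaming prompt to Ollama's `/api/generate` endpoint using only the Python standard library. The host `localhost` and the model name used in the usage comment are assumptions; the port comes from the list above, and the model must already have been pulled into the local Ollama instance.

```python
import json
import urllib.request

# Port taken from the list above; "localhost" is an assumption.
OLLAMA_URL = "http://localhost:11434"


def build_generate_request(
    model: str, prompt: str, base_url: str = OLLAMA_URL
) -> urllib.request.Request:
    """Build a non-streaming POST request for Ollama's /api/generate endpoint."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        f"{base_url}/api/generate",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(model: str, prompt: str, timeout: float = 120.0) -> str:
    """Send the prompt and return the generated text from the JSON reply."""
    request = build_generate_request(model, prompt)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        body = json.load(response)
    return body["response"]


# Example call (requires a running server and a pulled model; the model
# name here is only an illustration):
# print(generate("gemma3", "Say hello in one sentence."))
```

With `"stream": False`, the server replies with one JSON object rather than a stream of chunks, which keeps the client code short.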
A real-time voice assistant integrating WebRTC, Whisper, Gemma 3, and Orpheus for end-to-end voice chat.
- Location: demoapp
- Port: 7860
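
Once services are up, a quick way to see which ones are reachable is to try a TCP connection on each documented port. The sketch below assumes the services listen on `localhost` and uses the ports listed above. The display names in `SERVICES` are only labels for this example, and an open port shows that something is listening, not that the service is healthy.

```python
import socket

# Ports taken from the tables and lists above; the host is an assumption.
SERVICES = {
    "Whisper": 8000,
    "Orpheus TTS": 5005,
    "Ollama": 11434,
    "Voice assistant demo": 7860,
}


def is_port_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def check_services(
    host: str = "localhost", services: dict[str, int] = SERVICES
) -> dict[str, bool]:
    """Map each service name to whether its port accepts connections."""
    return {name: is_port_open(host, port) for name, port in services.items()}


def print_report(status: dict[str, bool]) -> None:
    """Print one line per service with its reachability."""
    for name, reachable in status.items():
        print(f"{name}: {'reachable' if reachable else 'not reachable'}")
```

Calling `print_report(check_services())` prints one status line per service.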
Each service has its own README.md with specific setup instructions and usage examples. Generally, you can start each service using:
```bash
cd service_directory
docker compose up -d
```

This project would not have been possible without the great work of many people who steadily contribute to the open source community!
- https://canopylabs.ai/
- https://github.com/Lex-au/Orpheus-FastAPI
- https://github.com/richardr1126/LlamaCpp-Orpheus-FastAPI
- https://github.com/freddyaboulton/fastrtc
- https://huggingface.co/
- https://ollama.com/
- https://www.gradio.app/
- https://www.langchain.com/

Running the services requires:

- Docker and Docker Compose
- NVIDIA GPU with CUDA support (recommended for optimal performance)
- Sufficient disk space for model storage
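
The requirements listed above can be checked with a short script before starting any service. This is a rough sketch: it treats the presence of `nvidia-smi` on the `PATH` as a proxy for a working NVIDIA driver, and the 50 GB disk threshold is an arbitrary placeholder, since actual model sizes vary widely.

```python
import shutil
import subprocess


def has_command(name: str) -> bool:
    """Return True if an executable with this name is on the PATH."""
    return shutil.which(name) is not None


def docker_compose_available() -> bool:
    """Return True if `docker compose version` runs successfully."""
    if not has_command("docker"):
        return False
    result = subprocess.run(
        ["docker", "compose", "version"], capture_output=True, text=True
    )
    return result.returncode == 0


def free_disk_gb(path: str = ".") -> float:
    """Return free disk space at the given path in gigabytes (10^9 bytes)."""
    return shutil.disk_usage(path).free / 1e9


def check_prerequisites(min_free_gb: float = 50.0) -> dict[str, bool]:
    """Check each requirement; min_free_gb is an assumed placeholder threshold."""
    return {
        "docker": has_command("docker"),
        "docker compose": docker_compose_available(),
        "nvidia-smi": has_command("nvidia-smi"),
        "disk space": free_disk_gb() >= min_free_gb,
    }
```

`check_prerequisites()` returns a dictionary of booleans, so a caller can print the missing items or stop early when a requirement is not met.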
See the LICENSE file for details.
