Local LLM inference engine written from scratch in Rust — hand-written AVX-512 assembly kernels, Metal & Vulkan compute shaders. Supports Qwen3, Mistral3, ... Q4/INT8/BF16 quantization.
Updated Mar 18, 2026 - Rust
Open Weight Definition (OWD)