trellis-quantization

Here are 2 public repositories matching this topic...

LL4nc33 / llama-tq

llama.cpp fork: TurboQuant KV cache (ktq+vtq, 1-4 bit, 83% smaller at f16 quality, 100k+ ctx on 12GB GPU) + sparse fine-tuning for hybrid MoE+SSM models (Qwen3.5/3.6-A3B, no Mamba backward needed). Turing-tuned Vulkan path.

vulkan cuda inference moe lora quantization ssm mamba fine-tuning kv-cache llm llama-cpp gguf trellis-quantization turboquant polarquant

Updated May 18, 2026
C++

domadomlab / pdf-optimizer-win

💎 LTS Industrial Standard: PDF/Word optimization with scientific Trellis Mimic engine. Features Turbo Parallel processing & Global Camouflage (Ricoh/Fujitsu/Canon 2025 profiles). Embedded Python 3.12, zero-install, no admin needed. Ultimate document privacy for Windows LTSC/Enterprise.

productivity privacy compression cross-platform lts stealth scientific-imaging docx-to-pdf embedded-python industrial-grade pdf-optimizer trellis-quantization

Updated Apr 14, 2026
Python

