semafold

Here are 4 public repositories matching this topic...

mindtro / semafold

Vector compression with TurboQuant codecs for embeddings, retrieval, and KV-cache. 10x compression with a pure NumPy core and optional GPU acceleration via PyTorch (CUDA/MPS) or MLX (Metal).

retrieval quantization vector-database kv-cache llm-inference embedding-compression turboquant vector-compression qjl semafold

Updated Apr 1, 2026
Python

Tobiaszn8972 / turboquant-gpu

Compress KV cache for LLM inference with 5.02x efficiency on NVIDIA GPUs using cuTile kernels.

windows docker retrieval glsl cuda pytorch transformer triton attention multi-gpu vector-database ggml local-ai qjl semafold

Updated May 8, 2026
Python

Ac3v3d0 / semafold

Compress embeddings, retrieval vectors, and KV-cache with TurboQuant codecs for 10x smaller storage and NumPy-first AI workloads.

swift machine-learning retrieval transformers pytorch moe quantization vector-database kv-cache openai-api llm-inference qwen embedding-compression turboquant qjl semafold

Updated May 8, 2026
Python

Labyrinthine-saltiness744 / turboquant-mlx

Compress MLX KV cache on Apple Silicon with TurboQuant mixed-precision and fused Metal kernels for lower memory use and fast decode.

swift machine-learning deep-learning metal retrieval transformers pytorch moe quantization openai-api llm llm-inference qwen turboquant qjl semafold

Updated May 8, 2026
Python
