IbadKhalid7

Popular repositories

turboquant-model Public

Optimize LLM inference with near-optimal 4-bit weight quantization and on-the-fly dequantization for lower memory use and faster matmul

Python
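
As a rough illustration of the idea in the description above, the following is a minimal sketch of 4-bit weight quantization with on-the-fly dequantization. It is not taken from turboquant-model: it assumes plain uniform min-max quantization per group (a simpler scheme than whatever "near-optimal" refers to in the repository), and the names `quantize_row`, `dequant_dot`, and `group_size` are hypothetical.

```python
from typing import List, Sequence, Tuple


def quantize_row(row: Sequence[float], group_size: int = 4) -> Tuple[bytes, List[Tuple[float, float]]]:
    """Quantize one weight row to 4-bit codes (assumed uniform min-max scheme).

    Returns the codes packed two per byte, plus one (scale, minimum) pair per group.
    """
    codes: List[int] = []
    params: List[Tuple[float, float]] = []
    for start in range(0, len(row), group_size):
        group = row[start:start + group_size]
        lo, hi = min(group), max(group)
        # 4 bits give 16 levels (0..15); fall back to 1.0 for a constant group.
        scale = (hi - lo) / 15 or 1.0
        params.append((scale, lo))
        codes.extend(min(15, max(0, round((w - lo) / scale))) for w in group)
    if len(codes) % 2:
        codes.append(0)  # pad so the codes fill whole bytes
    # Low nibble holds the even-indexed code, high nibble the odd-indexed one.
    packed = bytes(codes[i] | (codes[i + 1] << 4) for i in range(0, len(codes), 2))
    return packed, params


def dequant_dot(packed: bytes, params: List[Tuple[float, float]],
                x: Sequence[float], group_size: int = 4) -> float:
    """Dot product of a packed 4-bit row with x, dequantizing each weight as it is used."""
    total = 0.0
    for i, xi in enumerate(x):
        byte = packed[i // 2]
        code = (byte >> 4) if i % 2 else (byte & 0x0F)
        scale, lo = params[i // group_size]
        total += (code * scale + lo) * xi
    return total


weights = [0.1, -0.4, 0.25, 0.9, -1.2, 0.0, 0.3, 0.7]
activations = [1.0] * 8
packed, params = quantize_row(weights)
approx = dequant_dot(packed, params, activations)
exact = sum(w * a for w, a in zip(weights, activations))
```

Here eight weights occupy four bytes plus per-group parameters, and the full-precision row is never materialized. A real kernel would fuse this unpacking into a vectorized or GPU matmul rather than loop in Python.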