Picture for Rawn Henry

Rawn Henry

FineQuant: Unlocking Efficiency with Fine-Grained Weight-Only Quantization for LLMs

Add code
Aug 16, 2023
Viaarxiv icon

Who Says Elephants Can't Run: Bringing Large Scale MoE Models into Cloud Scale Production

Add code
Nov 18, 2022
Viaarxiv icon