Picture for Raffy Fahim

Raffy Fahim

Mixture of Quantized Experts (MoQE): Complementary Effect of Low-bit Quantization and Robustness

Add code
Oct 03, 2023
Viaarxiv icon

FineQuant: Unlocking Efficiency with Fine-Grained Weight-Only Quantization for LLMs

Add code
Aug 16, 2023
Viaarxiv icon

Who Says Elephants Can't Run: Bringing Large Scale MoE Models into Cloud Scale Production

Add code
Nov 18, 2022
Viaarxiv icon