Wanzin Yazar

Understanding the difficulty of low-precision post-training quantization of large language models

Oct 18, 2024
Via arXiv

Scaling laws for post-training quantized large language models

Oct 15, 2024
Via arXiv

Combining multiple post-training techniques to achieve most efficient quantized LLMs

May 12, 2024
Via arXiv

Self-Selected Attention Span for Accelerating Large Language Model Inference

Apr 14, 2024
Via arXiv