Wanzin Yazar

Understanding the difficulty of low-precision post-training quantization of large language models

Oct 18, 2024

Scaling laws for post-training quantized large language models

Oct 15, 2024

Combining multiple post-training techniques to achieve most efficient quantized LLMs

May 12, 2024

Self-Selected Attention Span for Accelerating Large Language Model Inference

Apr 14, 2024