Mart van Baalen

GPTVQ: The Blessing of Dimensionality for LLM Quantization

Feb 23, 2024
Via arXiv

The LLM Surgeon

Dec 28, 2023
Via arXiv

QBitOpt: Fast and Accurate Bitwidth Reallocation during Training

Jul 10, 2023
Via arXiv

Pruning vs Quantization: Which is Better?

Jul 06, 2023
Via arXiv

FP8 versus INT8 for efficient deep learning inference

Mar 31, 2023
Via arXiv

A Practical Mixed Precision Algorithm for Post-Training Quantization

Feb 10, 2023
Via arXiv

Quantized Sparse Weight Decomposition for Neural Network Compression

Jul 22, 2022
Via arXiv

Cyclical Pruning for Sparse Neural Networks

Feb 02, 2022
Via arXiv

A White Paper on Neural Network Quantization

Jun 15, 2021
Via arXiv

Bayesian Bits: Unifying Quantization and Pruning

May 15, 2020
Via arXiv