
Eldar Kurtic

"Give Me BF16 or Give Me Death"? Accuracy-Performance Trade-Offs in LLM Quantization

Nov 04, 2024
Via arXiv

EvoPress: Towards Optimal Dynamic Model Compression via Evolutionary Search

Oct 18, 2024
Via arXiv

Mathador-LM: A Dynamic Benchmark for Mathematical Reasoning on Large Language Models

Jun 18, 2024
Via arXiv

MicroAdam: Accurate Adaptive Optimization with Low Space Overhead and Provable Convergence

May 24, 2024
Via arXiv

Enabling High-Sparsity Foundational Llama Models with Efficient Pretraining and Deployment

May 06, 2024
Via arXiv

How to Prune Your Language Model: Recovering Accuracy on the "Sparsity May Cry" Benchmark

Dec 21, 2023
Via arXiv

Sparse Fine-tuning for Inference Acceleration of Large Language Models

Oct 13, 2023
Via arXiv

Accurate Neural Network Pruning Requires Rethinking Sparse Optimization

Aug 03, 2023
Via arXiv

Error Feedback Can Accurately Compress Preconditioners

Jun 16, 2023
Via arXiv

Vision Models Can Be Efficiently Specialized via Few-Shot Task-Aware Compression

Mar 25, 2023
Via arXiv