Yanshu Wang

Art and Science of Quantizing Large-Scale Models: A Comprehensive Overview

Sep 18, 2024
Via arXiv

HERA: High-efficiency Matrix Compression via Element Replacement

Jul 04, 2024
Via arXiv

Athena: Efficient Block-Wise Post-Training Quantization for Large Language Models Using Second-Order Matrix Derivative Information

May 24, 2024
Via arXiv