Wenyang He

Athena: Efficient Block-Wise Post-Training Quantization for Large Language Models Using Second-Order Matrix Derivative Information

May 24, 2024
Via arXiv

AttentionLego: An Open-Source Building Block For Spatially-Scalable Large Language Model Accelerator With Processing-In-Memory Technology

Jan 21, 2024
Via arXiv