Khalid Shaikh

Unveiling and Harnessing Hidden Attention Sinks: Enhancing Large Language Models without Training through Attention Calibration

Jun 22, 2024
Via arXiv