Renee St. Amant

Lean Attention: Hardware-Aware Scalable Attention Mechanism for the Decode-Phase of Transformers

May 17, 2024
Via arXiv