Georgy Tyukin

Attention Is All You Need But You Don't Need All Of It For Inference of Large Language Models

Jul 22, 2024
Via arXiv

Enhancing Inference Efficiency of Large Language Models: Investigating Optimization Strategies and Architectural Innovations

Apr 02, 2024
Via arXiv