Georgy Tyukin

Attention Is All You Need But You Don't Need All Of It For Inference of Large Language Models

Jul 22, 2024
Via arXiv

Enhancing Inference Efficiency of Large Language Models: Investigating Optimization Strategies and Architectural Innovations

Apr 02, 2024
Via arXiv