
Bharat Venkitesh

Rope to Nope and Back Again: A New Hybrid Attention Strategy

Jan 30, 2025

Aya Expanse: Combining Research Breakthroughs for a New Multilingual Frontier

Dec 05, 2024

BAM! Just Like That: Simple and Efficient Parameter Upcycling for Mixture of Experts

Aug 15, 2024

Aya 23: Open Weight Releases to Further Multilingual Progress

May 23, 2024

SnapKV: LLM Knows What You are Looking for Before Generation

Apr 22, 2024

Intriguing Properties of Quantization at Scale

May 30, 2023

Exploring Low Rank Training of Deep Neural Networks

Sep 27, 2022

Fully Quantizing a Simplified Transformer for End-to-end Speech Recognition

Nov 09, 2019