Anmol Gulati

Gemini 1.5: Unlocking multimodal understanding across millions of tokens of context

Mar 08, 2024
Via arXiv

Gemini: A Family of Highly Capable Multimodal Models

Dec 19, 2023
Via arXiv

Practical Conformer: Optimizing size, speed and flops of Conformer for on-Device and cloud ASR

Mar 31, 2023
Via arXiv

SLAM: A Unified Encoder for Speech and Language Modeling via Speech-Text Joint Pre-Training

Oct 20, 2021
Via arXiv

BigSSL: Exploring the Frontier of Large-Scale Semi-Supervised Learning for Automatic Speech Recognition

Oct 01, 2021
Via arXiv

Scaling End-to-End Models for Large-Scale Multilingual ASR

Apr 30, 2021
Via arXiv

FastEmit: Low-latency Streaming ASR with Sequence-level Emission Regularization

Oct 21, 2020
Via arXiv

Universal ASR: Unify and Improve Streaming ASR with Full-context Modeling

Oct 12, 2020
Via arXiv

Dynamic Sparsity Neural Networks for Automatic Speech Recognition

May 16, 2020
Via arXiv

Conformer: Convolution-augmented Transformer for Speech Recognition

May 16, 2020
Via arXiv