Picture for Gil Keren

Gil Keren

Efficient Streaming LLM for Speech Recognition

Add code
Oct 02, 2024
Viaarxiv icon

M-BEST-RQ: A Multi-Channel Speech Foundation Model for Smart Glasses

Add code
Sep 17, 2024
Figure 1 for M-BEST-RQ: A Multi-Channel Speech Foundation Model for Smart Glasses
Figure 2 for M-BEST-RQ: A Multi-Channel Speech Foundation Model for Smart Glasses
Figure 3 for M-BEST-RQ: A Multi-Channel Speech Foundation Model for Smart Glasses
Figure 4 for M-BEST-RQ: A Multi-Channel Speech Foundation Model for Smart Glasses
Viaarxiv icon

Faster Speech-LLaMA Inference with Multi-token Prediction

Add code
Sep 12, 2024
Viaarxiv icon

Token-Weighted RNN-T for Learning from Flawed Data

Add code
Jun 26, 2024
Viaarxiv icon

Towards Selection of Text-to-speech Data to Augment ASR Training

Add code
May 30, 2023
Viaarxiv icon

Text Generation with Speech Synthesis for ASR Data Augmentation

Add code
May 22, 2023
Figure 1 for Text Generation with Speech Synthesis for ASR Data Augmentation
Figure 2 for Text Generation with Speech Synthesis for ASR Data Augmentation
Figure 3 for Text Generation with Speech Synthesis for ASR Data Augmentation
Viaarxiv icon

A Token-Wise Beam Search Algorithm for RNN-T

Add code
Feb 28, 2023
Viaarxiv icon

Improving Fast-slow Encoder based Transducer with Streaming Deliberation

Add code
Dec 15, 2022
Viaarxiv icon

Scaling ASR Improves Zero and Few Shot Learning

Add code
Nov 29, 2021
Figure 1 for Scaling ASR Improves Zero and Few Shot Learning
Figure 2 for Scaling ASR Improves Zero and Few Shot Learning
Figure 3 for Scaling ASR Improves Zero and Few Shot Learning
Figure 4 for Scaling ASR Improves Zero and Few Shot Learning
Viaarxiv icon

Contextualized Streaming End-to-End Speech Recognition with Trie-Based Deep Biasing and Shallow Fusion

Add code
Apr 05, 2021
Figure 1 for Contextualized Streaming End-to-End Speech Recognition with Trie-Based Deep Biasing and Shallow Fusion
Figure 2 for Contextualized Streaming End-to-End Speech Recognition with Trie-Based Deep Biasing and Shallow Fusion
Figure 3 for Contextualized Streaming End-to-End Speech Recognition with Trie-Based Deep Biasing and Shallow Fusion
Figure 4 for Contextualized Streaming End-to-End Speech Recognition with Trie-Based Deep Biasing and Shallow Fusion
Viaarxiv icon