
George Saon

Exploring the limits of decoder-only models trained on public speech recognition corpora

Jan 31, 2024
Via arXiv

Soft Random Sampling: A Theoretical and Empirical Analysis

Nov 24, 2023
Via arXiv

Semi-Autoregressive Streaming ASR With Label Context

Sep 19, 2023
Via arXiv

Multiple Representation Transfer from Large Language Models to End-to-End ASR Systems

Sep 07, 2023
Via arXiv

Diagonal State Space Augmented Transformers for Speech Recognition

Feb 27, 2023
Via arXiv

VQ-T: RNN Transducers using Vector-Quantized Prediction Network States

Aug 03, 2022
Via arXiv

Extending RNN-T-based speech recognition systems with emotion and language classification

Jul 28, 2022
Via arXiv

Accelerating Inference and Language Model Fusion of Recurrent Neural Network Transducers via End-to-End 4-bit Quantization

Jun 16, 2022
Via arXiv

Effect and Analysis of Large-scale Language Model Rescoring on Competitive ASR Systems

Apr 01, 2022
Via arXiv

Improving Generalization of Deep Neural Network Acoustic Models with Length Perturbation and N-best Based Label Smoothing

Mar 29, 2022
Via arXiv