Venkatesh Ravichandran

Multi-Stage Multi-Modal Pre-Training for Automatic Speech Recognition

Mar 28, 2024

Turn-taking and Backchannel Prediction with Acoustic and Large Language Model Fusion

Jan 26, 2024

Two-pass Endpoint Detection for Speech Recognition

Jan 17, 2024

Multimodal Attention Merging for Improved Speech Recognition and Audio Event Classification

Dec 22, 2023

Improving fairness for spoken language understanding in atypical speech with Text-to-Speech

Nov 16, 2023

Cross-utterance ASR Rescoring with Graph-based Label Propagation

Mar 27, 2023

Adaptive Endpointing with Deep Contextual Multi-armed Bandits

Mar 23, 2023

Stutter-TTS: Controlled Synthesis and Improved Recognition of Stuttered Speech

Nov 04, 2022

Graph-based Multi-View Fusion and Local Adaptation: Mitigating Within-Household Confusability for Speaker Identification

Jul 08, 2022

Enhancing ASR for Stuttered Speech with Limited Data Using Detect and Pass

Feb 08, 2022