Venkatesh Ravichandran

Hearing Between the Lines: Unlocking the Reasoning Power of LLMs for Speech Evaluation

Jan 24, 2026
Via arXiv

Dimension-First Evaluation of Speech-to-Speech Models with Structured Acoustic Cues

Jan 20, 2026
Via arXiv

The Interspeech 2025 Speech Accessibility Project Challenge

Jul 29, 2025
Via arXiv

Mitigating Bad Ground Truth in Supervised Machine Learning based Crop Classification: A Multi-Level Framework with Sentinel-2 Images

Mar 14, 2025
Via arXiv

Multi-Stage Multi-Modal Pre-Training for Automatic Speech Recognition

Mar 28, 2024
Via arXiv

Turn-taking and Backchannel Prediction with Acoustic and Large Language Model Fusion

Jan 26, 2024
Via arXiv

Two-pass Endpoint Detection for Speech Recognition

Jan 17, 2024
Via arXiv

Multimodal Attention Merging for Improved Speech Recognition and Audio Event Classification

Dec 22, 2023
Via arXiv

Improving fairness for spoken language understanding in atypical speech with Text-to-Speech

Nov 16, 2023
Via arXiv

Cross-utterance ASR Rescoring with Graph-based Label Propagation

Mar 27, 2023
Via arXiv