Aparna Khare

Multi-Stage Multi-Modal Pre-Training for Automatic Speech Recognition

Mar 28, 2024

Turn-taking and Backchannel Prediction with Acoustic and Large Language Model Fusion

Jan 26, 2024

Two-pass Endpoint Detection for Speech Recognition

Jan 17, 2024

Cross-utterance ASR Rescoring with Graph-based Label Propagation

Mar 27, 2023

ASR-Aware End-to-end Neural Diarization

Feb 02, 2022

Audiovisual Highlight Detection in Videos

Feb 11, 2021

Self-Supervised learning with cross-modal transformers for emotion recognition

Nov 20, 2020

Multi-modal embeddings using multi-task learning for emotion recognition

Sep 10, 2020

Multiresolution and Multimodal Speech Recognition with Transformers

Apr 29, 2020

Fully Learnable Front-End for Multi-Channel Acoustic Modeling using Semi-Supervised Learning

Feb 01, 2020