Rohit Paturi

Speakers Unembedded: Embedding-free Approach to Long-form Neural Diarization

Jun 26, 2024

AG-LSEC: Audio Grounded Lexical Speaker Error Correction

Jun 25, 2024

SpeechVerse: A Large-scale Generalizable Audio Language Model

May 14, 2024

Generalized zero-shot audio-to-intent classification

Nov 04, 2023

End-to-End Single-Channel Speaker-Turn Aware Conversational Speech Translation

Nov 01, 2023

Speaker Diarization of Scripted Audiovisual Content

Aug 04, 2023

Lexical Speaker Error Correction: Leveraging Language Models for Speaker Diarization Error Correction

Jun 15, 2023

Directed Speech Separation for Automatic Speech Recognition of Long Form Conversational Speech

Dec 10, 2021