Srikanth Ronanki

Zero-resource Speech Translation and Recognition with LLMs

Dec 24, 2024

Speech Retrieval-Augmented Generation without Automatic Speech Recognition

Dec 21, 2024

Sequential Editing for Lifelong Training of Speech Recognition Models

Jun 25, 2024

SpeechGuard: Exploring the Adversarial Robustness of Multimodal Large Language Models

May 14, 2024

SpeechVerse: A Large-scale Generalizable Audio Language Model

May 14, 2024

Retrieve and Copy: Scaling ASR Personalization to Large Catalogs

Nov 14, 2023

Generalized zero-shot audio-to-intent classification

Nov 04, 2023

DCTX-Conformer: Dynamic context carry-over for low latency unified streaming and non-streaming Conformer

Jun 13, 2023

Dynamic Chunk Convolution for Unified Streaming and Non-Streaming Conformer ASR

Apr 25, 2023

Device Directedness with Contextual Cues for Spoken Dialog Systems

Nov 23, 2022