Rama Doddipatla

WHISMA: A Speech-LLM to Perform Zero-shot Spoken Language Understanding

Aug 29, 2024
Via arXiv

Improving Accented Speech Recognition using Data Augmentation based on Unsupervised Text-to-Speech Synthesis

Jul 04, 2024
Via arXiv

Prompting Whisper for QA-driven Zero-shot End-to-end Spoken Language Understanding

Jun 21, 2024
Via arXiv

Geodesic interpolation of frame-wise speaker embeddings for the diarization of meeting scenarios

Jan 08, 2024
Via arXiv

Evaluating Large Language Models for Document-grounded Response Generation in Information-Seeking Dialogues

Sep 21, 2023
Via arXiv

Frame-wise and overlap-robust speaker embeddings for meeting diarization

Jun 01, 2023
Via arXiv

Adversarial learning of neural user simulators for dialogue policy optimisation

Jun 01, 2023
Via arXiv

A Teacher-Student approach for extracting informative speaker embeddings from speech mixtures

Jun 01, 2023
Via arXiv

Self-regularised Minimum Latency Training for Streaming Transformer-based Speech Recognition

Apr 24, 2023
Via arXiv

Non-autoregressive End-to-end Approaches for Joint Automatic Speech Recognition and Spoken Language Understanding

Apr 21, 2023
Via arXiv