Joseph Keshet

Whisper in Medusa's Ear: Multi-head Efficient Decoding for Transformer-based ASR

Sep 24, 2024 (via arXiv)

WhisperNER: Unified Open Named Entity and Speech Recognition

Sep 12, 2024 (via arXiv)

HebDB: a Weakly Supervised Dataset for Hebrew Speech Processing

Jul 10, 2024 (via arXiv)

Tradition or Innovation: A Comparison of Modern ASR Methods for Forced Alignment

Jun 27, 2024 (via arXiv)

Enhanced ASR Robustness to Packet Loss with a Front-End Adaptation Network

Jun 27, 2024 (via arXiv)

Keyword-Guided Adaptation of Automatic Speech Recognition

Jun 04, 2024 (via arXiv)

Combining Language Models For Specialized Domains: A Colorful Approach

Nov 01, 2023 (via arXiv)

DiffAR: Denoising Diffusion Autoregressive Model for Raw Speech Waveform Generation

Oct 02, 2023 (via arXiv)

Open-vocabulary Keyword-spotting with Adaptive Instance Normalization

Sep 13, 2023 (via arXiv)

A Baseline for Detecting Out-of-Distribution Examples in Image Captioning

Jul 12, 2022 (via arXiv)