Shinji Watanabe

CLSP

SQ-Whisper: Speaker-Querying based Whisper Model for Target-Speaker ASR

Dec 07, 2024
Via arXiv

Fusion of Discrete Representations and Self-Augmented Representations for Multilingual Automatic Speech Recognition

Nov 27, 2024
Via arXiv

Dynamic-SUPERB Phase-2: A Collaboratively Expanding Benchmark for Measuring the Capabilities of Spoken Language Models with 180 Tasks

Nov 08, 2024
Via arXiv

Findings of the IWSLT 2024 Evaluation Campaign

Nov 07, 2024
Via arXiv

VoiceTextBlender: Augmenting Large Language Models with Speech Capabilities via Single-Stage Joint Speech-Text Supervised Fine-Tuning

Oct 23, 2024
Via arXiv

FastAdaSP: Multitask-Adapted Efficient Inference for Large Speech Language Model

Oct 03, 2024
Via arXiv

End-to-End Speech Recognition with Pre-trained Masked Language Model

Oct 01, 2024
Via arXiv

Improving Multilingual ASR in the Wild Using Simple N-best Re-ranking

Sep 27, 2024
Via arXiv

ESPnet-Codec: Comprehensive Training and Evaluation of Neural Codecs for Audio, Music, and Speech

Sep 24, 2024
Via arXiv

Codec-SUPERB @ SLT 2024: A lightweight benchmark for neural audio codec models

Sep 21, 2024
Via arXiv