Shinji Watanabe

CLSP

Aligning Text-to-Music Evaluation with Human Preferences

Mar 20, 2025
Via arXiv

ESPnet-SDS: Unified Toolkit and Demo for Spoken Dialogue Systems

Mar 11, 2025
Via arXiv

Talking Turns: Benchmarking Audio Foundation Models on Turn-Taking Dynamics

Mar 03, 2025
Via arXiv

Balancing Speech Understanding and Generation Using Continual Pre-training for Codec-based Speech LLM

Feb 24, 2025
Via arXiv

ESPnet-SpeechLM: An Open Speech Language Model Toolkit

Feb 21, 2025
Via arXiv

Leveraging Allophony in Self-Supervised Speech Models for Atypical Pronunciation Assessment

Feb 10, 2025
Via arXiv

Discrete Speech Unit Extraction via Independent Component Analysis

Jan 11, 2025
Via arXiv

Enhancing Audiovisual Speech Recognition through Bifocal Preference Optimization

Dec 26, 2024
Via arXiv

VERSA: A Versatile Evaluation Toolkit for Speech, Audio, and Music

Dec 23, 2024
Via arXiv

Deep Speech Synthesis from Multimodal Articulatory Representations

Dec 17, 2024
Via arXiv