Takuya Yoshioka

Target conversation extraction: Source separation using turn-taking dynamics

Jul 15, 2024
Via arXiv

Look Once to Hear: Target Speech Hearing with Noisy Examples

May 10, 2024
Via arXiv

Anatomy of Industrial Scale Multilingual ASR

Apr 16, 2024
Via arXiv

Semantic Hearing: Programming Acoustic Scenes with Binaural Hearables

Nov 01, 2023
Via arXiv

Profile-Error-Tolerant Target-Speaker Voice Activity Detection

Sep 21, 2023
Via arXiv

t-SOT FNT: Streaming Multi-talker ASR with Text-only Domain Adaptation Capability

Sep 15, 2023
Via arXiv

DiariST: Streaming Speech Translation with Speaker Diarization

Sep 14, 2023
Via arXiv

SpeechX: Neural Codec Language Model as a Versatile Speech Transformer

Aug 14, 2023
Via arXiv

Adapting Multi-Lingual ASR Models for Handling Multiple Talkers

May 30, 2023
Via arXiv

i-Code Studio: A Configurable and Composable Framework for Integrative AI

May 23, 2023
Via arXiv