Picture for Emiru Tsunoo

Emiru Tsunoo

Chain-of-Thought Reasoning in Streaming Full-Duplex End-to-End Spoken Dialogue Systems

Add code
Oct 02, 2025
Viaarxiv icon

LibriTTS-VI: A Public Corpus and Novel Methods for Efficient Voice Impression Control

Add code
Sep 19, 2025
Viaarxiv icon

Scheduled Interleaved Speech-Text Training for Speech-to-Speech Translation with LLMs

Add code
Jun 12, 2025
Viaarxiv icon

Differentiable K-means for Fully-optimized Discrete Token-based ASR

Add code
May 22, 2025
Viaarxiv icon

ESPnet-SDS: Unified Toolkit and Demo for Spoken Dialogue Systems

Add code
Mar 11, 2025
Figure 1 for ESPnet-SDS: Unified Toolkit and Demo for Spoken Dialogue Systems
Figure 2 for ESPnet-SDS: Unified Toolkit and Demo for Spoken Dialogue Systems
Figure 3 for ESPnet-SDS: Unified Toolkit and Demo for Spoken Dialogue Systems
Figure 4 for ESPnet-SDS: Unified Toolkit and Demo for Spoken Dialogue Systems
Viaarxiv icon

Causal Speech Enhancement with Predicting Semantics based on Quantized Self-supervised Learning Features

Add code
Dec 26, 2024
Viaarxiv icon

Task Arithmetic for Language Expansion in Speech Translation

Add code
Sep 17, 2024
Figure 1 for Task Arithmetic for Language Expansion in Speech Translation
Figure 2 for Task Arithmetic for Language Expansion in Speech Translation
Figure 3 for Task Arithmetic for Language Expansion in Speech Translation
Figure 4 for Task Arithmetic for Language Expansion in Speech Translation
Viaarxiv icon

Decoder-only Architecture for Streaming End-to-end Speech Recognition

Add code
Jun 23, 2024
Figure 1 for Decoder-only Architecture for Streaming End-to-end Speech Recognition
Figure 2 for Decoder-only Architecture for Streaming End-to-end Speech Recognition
Figure 3 for Decoder-only Architecture for Streaming End-to-end Speech Recognition
Viaarxiv icon

Finding Task-specific Subnetworks in Multi-task Spoken Language Understanding Model

Add code
Jun 18, 2024
Figure 1 for Finding Task-specific Subnetworks in Multi-task Spoken Language Understanding Model
Figure 2 for Finding Task-specific Subnetworks in Multi-task Spoken Language Understanding Model
Figure 3 for Finding Task-specific Subnetworks in Multi-task Spoken Language Understanding Model
Figure 4 for Finding Task-specific Subnetworks in Multi-task Spoken Language Understanding Model
Viaarxiv icon

Rapid Language Adaptation for Multilingual E2E Speech Recognition Using Encoder Prompting

Add code
Jun 18, 2024
Figure 1 for Rapid Language Adaptation for Multilingual E2E Speech Recognition Using Encoder Prompting
Figure 2 for Rapid Language Adaptation for Multilingual E2E Speech Recognition Using Encoder Prompting
Figure 3 for Rapid Language Adaptation for Multilingual E2E Speech Recognition Using Encoder Prompting
Figure 4 for Rapid Language Adaptation for Multilingual E2E Speech Recognition Using Encoder Prompting
Viaarxiv icon