Kei Sawada

PSLM: Parallel Generation of Text and Speech with LLMs for Low-Latency Spoken Dialogue Systems

Jun 18, 2024
Via arXiv

Release of Pre-Trained Models for the Japanese Language

Apr 02, 2024
Via arXiv

An Integration of Pre-Trained Speech and Language Models for End-to-End Speech Recognition

Dec 06, 2023
Via arXiv

Towards human-like spoken dialogue generation between AI agents from written dialogue

Oct 02, 2023
Via arXiv

Focused Prefix Tuning for Controllable Text Generation

Jun 10, 2023
Via arXiv

UniFLG: Unified Facial Landmark Generator from Text or Speech

Feb 28, 2023
Via arXiv

Text-Guided Scene Sketch-to-Photo Synthesis

Feb 14, 2023
Via arXiv

End-to-End Text-to-Speech Based on Latent Representation of Speaking Styles Using Spontaneous Dialogue

Jun 24, 2022
Via arXiv

MSR-NV: Neural vocoder using multiple sampling rates

Sep 28, 2021
Via arXiv

Hierarchical Multi-Grained Generative Model for Expressive Speech Synthesis

Sep 17, 2020
Via arXiv