Picture for Yuma Shirahata

Yuma Shirahata

Description-based Controllable Text-to-Speech with Cross-Lingual Voice Control

Add code
Sep 26, 2024
Viaarxiv icon

Universal Score-based Speech Enhancement with High Content Preservation

Add code
Jun 18, 2024
Viaarxiv icon

Audio-conditioned phonemic and prosodic annotation for building text-to-speech models from unlabeled speech data

Add code
Jun 12, 2024
Viaarxiv icon

LibriTTS-P: A Corpus with Speaking Style and Speaker Identity Prompts for Text-to-Speech and Style Captioning

Add code
Jun 12, 2024
Viaarxiv icon

PromptTTS++: Controlling Speaker Identity in Prompt-Based Text-to-Speech Using Natural Language Descriptions

Add code
Sep 15, 2023
Viaarxiv icon

Lightweight and High-Fidelity End-to-End Text-to-Speech with Multi-Band Generation and Inverse Short-Time Fourier Transform

Add code
Oct 28, 2022
Viaarxiv icon

Period VITS: Variational Inference with Explicit Pitch Modeling for End-to-end Emotional Speech Synthesis

Add code
Oct 28, 2022
Viaarxiv icon

Cross-Speaker Emotion Transfer for Low-Resource Text-to-Speech Using Non-Parallel Voice Conversion with Pitch-Shift Data Augmentation

Add code
Apr 21, 2022
Figure 1 for Cross-Speaker Emotion Transfer for Low-Resource Text-to-Speech Using Non-Parallel Voice Conversion with Pitch-Shift Data Augmentation
Figure 2 for Cross-Speaker Emotion Transfer for Low-Resource Text-to-Speech Using Non-Parallel Voice Conversion with Pitch-Shift Data Augmentation
Figure 3 for Cross-Speaker Emotion Transfer for Low-Resource Text-to-Speech Using Non-Parallel Voice Conversion with Pitch-Shift Data Augmentation
Figure 4 for Cross-Speaker Emotion Transfer for Low-Resource Text-to-Speech Using Non-Parallel Voice Conversion with Pitch-Shift Data Augmentation
Viaarxiv icon