Tomoki Koriyama

VAE-based Phoneme Alignment Using Gradient Annealing and SSL Acoustic Features

Jul 03, 2024

An Attribute Interpolation Method in Speech Synthesis by Model Merging

Jun 30, 2024

Frame-Wise Breath Detection with Self-Training: An Exploration of Enhancing Breath Naturalness in Text-to-Speech

Feb 01, 2024

Duration-aware pause insertion using pre-trained language model for multi-speaker text-to-speech

Feb 27, 2023

Structured State Space Decoder for Speech Recognition and Synthesis

Oct 31, 2022

UTMOS: UTokyo-SaruLab System for VoiceMOS Challenge 2022

Apr 05, 2022

Multi-speaker Text-to-speech Synthesis Using Deep Gaussian Processes

Aug 07, 2020

Utterance-level Sequential Modeling For Deep Gaussian Process Based Speech Synthesis Using Simple Recurrent Unit

Apr 22, 2020

Generative Moment Matching Network-based Random Modulation Post-filter for DNN-based Singing Voice Synthesis and Neural Double-tracking

Feb 09, 2019

Sampling-based speech parameter generation using moment-matching networks

Apr 12, 2017