
Shinji Takaki

Embedding a Differentiable Mel-cepstral Synthesis Filter to a Neural Speech Synthesis System

Nov 21, 2022

Neural Sequence-to-Sequence Speech Synthesis Using a Hidden Semi-Markov Model Based Structured Attention Mechanism

Aug 31, 2021

PeriodNet: A non-autoregressive waveform generation model with a structure separating periodic and aperiodic components

Feb 15, 2021

Fast and High-Quality Singing Voice Synthesis System based on Convolutional Neural Networks

Oct 24, 2019

Neural source-filter waveform models for statistical parametric speech synthesis

Apr 27, 2019

Training a Neural Speech Waveform Model using Spectral Losses of Short-Time Fourier Transform and Continuous Wavelet Transform

Apr 07, 2019

Neural source-filter-based waveform model for statistical parametric speech synthesis

Oct 31, 2018

STFT spectral loss for training a neural speech waveform model

Oct 30, 2018

Investigation of enhanced Tacotron text-to-speech synthesis systems with self-attention for pitch accent language

Oct 29, 2018

Wasserstein GAN and Waveform Loss-based Acoustic Model Training for Multi-speaker Text-to-Speech Synthesis Systems Using a WaveNet Vocoder

Jul 31, 2018