Frank Soong

Ordinal Regression via Binary Preference vs Simple Regression: Statistical and Experimental Perspectives

Jul 06, 2022
Via arXiv

NaturalSpeech: End-to-End Text to Speech Synthesis with Human-Level Quality

May 10, 2022
Via arXiv

An Approach to Mispronunciation Detection and Diagnosis with Acoustic, Phonetic and Linguistic (APL) Embeddings

Oct 14, 2021
Via arXiv

A Survey on Neural Speech Synthesis

Jul 23, 2021
Via arXiv

MBNet: MOS Prediction for Synthesized Speech with Mean-Bias Network

Feb 27, 2021
Via arXiv

Improving pronunciation assessment via ordinal regression with anchored reference samples

Oct 26, 2020
Via arXiv

Feature reinforcement with word embedding and parsing information in neural TTS

Jan 03, 2019
Via arXiv

Modeling Multi-speaker Latent Space to Improve Neural TTS: Quick Enrolling New Speaker and Enhancing Premium Voice

Dec 18, 2018
Via arXiv