
Dongyang Dai

Learning discriminative features from spectrograms using center loss for speech emotion recognition

Jan 02, 2025

Disambiguation of Chinese Polyphones in an End-to-End Framework with Semantic Features Extracted by Pre-trained BERT

Jan 02, 2025

Multi-modal Adversarial Training for Zero-Shot Voice Cloning

Aug 28, 2024

RFWave: Multi-band Rectified Flow for Audio Waveform Reconstruction

Mar 08, 2024

Cloning one's voice using very limited data in the wild

Oct 08, 2021

Unsupervised Cross-Lingual Speech Emotion Recognition Using Domain Adversarial Neural Network

Dec 21, 2020

Speaker Independent and Multilingual/Mixlingual Speech-Driven Talking Head Generation Using Phonetic Posteriorgrams

Jun 20, 2020

Noise Robust TTS for Low Resource Speakers using Pre-trained Model and Speech Enhancement

May 26, 2020