Benlai Tang

Prior-agnostic Multi-scale Contrastive Text-Audio Pre-training for Parallelized TTS Frontend Modeling

Apr 14, 2024
Via arXiv

Multi-Modal Automatic Prosody Annotation with Contrastive Pretraining of SSWP

Sep 11, 2023
Via arXiv

TranssionADD: A multi-frame reinforcement based sequence tagging model for audio deepfake detection

Jun 27, 2023
Via arXiv

CPNet: Exploiting CLIP-based Attention Condenser and Probability Map Guidance for High-fidelity Talking Face Generation

May 23, 2023
Via arXiv

Towards Realistic Visual Dubbing with Heterogeneous Sources

Jan 17, 2022
Via arXiv

Towards Using Clothes Style Transfer for Scenario-aware Person Video Generation

Oct 25, 2021
Via arXiv

Towards High-fidelity Singing Voice Conversion with Acoustic Reference and Contrastive Predictive Coding

Oct 10, 2021
Via arXiv

PPG-based singing voice conversion with adversarial representation learning

Oct 28, 2020
Via arXiv

Improving Accent Conversion with Reference Encoder and End-To-End Text-To-Speech

May 19, 2020
Via arXiv