Mengxiao Bi

Revealing Directions for Text-guided 3D Face Editing

Oct 07, 2024

E1 TTS: Simple and Fast Non-Autoregressive TTS

Sep 14, 2024

MacST: Multi-Accent Speech Synthesis via Text Transliteration for Accent Conversion

Sep 14, 2024

HIMO: A New Benchmark for Full-Body Human Interacting with Multiple Objects

Jul 17, 2024

DualVC 3: Leveraging Language Model Generated Pseudo Context for End-to-end Low Latency Streaming Voice Conversion

Jun 12, 2024

EDTalk: Efficient Disentanglement for Emotional Talking Head Synthesis

Apr 02, 2024

DualVC 2: Dynamic Masked Convolution for Unified Streaming and Non-Streaming Voice Conversion

Sep 27, 2023

Multi-GradSpeech: Towards Diffusion-based Multi-Speaker Text-to-speech Using Consistent Diffusion Models

Aug 31, 2023

DualVC: Dual-mode Voice Conversion using Intra-model Knowledge Distillation and Hybrid Predictive Coding

May 21, 2023

Expressive-VC: Highly Expressive Voice Conversion with Attention Fusion of Bottleneck and Perturbation Features

Nov 09, 2022