Yuepeng Jiang

Drop the beat! Freestyler for Accompaniment Conditioned Rapping Voice Generation

Aug 28, 2024

Towards Expressive Zero-Shot Speech Synthesis with Hierarchical Prosody Modeling

Jun 11, 2024

WenetSpeech4TTS: A 12,800-hour Mandarin TTS Corpus for Large Speech Generation Model Benchmark

Jun 11, 2024

VITS-Based Singing Voice Conversion Leveraging Whisper and multi-scale F0 Modeling

Oct 04, 2023

DualVC 2: Dynamic Masked Convolution for Unified Streaming and Non-Streaming Voice Conversion

Sep 27, 2023

HiGNN-TTS: Hierarchical Prosody Modeling with Graph Neural Networks for Expressive Long-form TTS

Sep 25, 2023

DualVC: Dual-mode Voice Conversion using Intra-model Knowledge Distillation and Hybrid Predictive Coding

May 21, 2023

BEV-Net: Assessing Social Distancing Compliance by Joint People Localization and Geometric Reasoning

Oct 12, 2021