Picture for Dake Guo

Dake Guo

The ISCSLP 2024 Conversational Voice Clone (CoVoC) Challenge: Tasks, Results and Findings

Add code
Oct 31, 2024
Figure 1 for The ISCSLP 2024 Conversational Voice Clone (CoVoC) Challenge: Tasks, Results and Findings
Figure 2 for The ISCSLP 2024 Conversational Voice Clone (CoVoC) Challenge: Tasks, Results and Findings
Figure 3 for The ISCSLP 2024 Conversational Voice Clone (CoVoC) Challenge: Tasks, Results and Findings
Figure 4 for The ISCSLP 2024 Conversational Voice Clone (CoVoC) Challenge: Tasks, Results and Findings
Viaarxiv icon

The NPU-HWC System for the ISCSLP 2024 Inspirational and Convincing Audio Generation Challenge

Add code
Oct 31, 2024
Viaarxiv icon

NPU-NTU System for Voice Privacy 2024 Challenge

Add code
Sep 06, 2024
Figure 1 for NPU-NTU System for Voice Privacy 2024 Challenge
Figure 2 for NPU-NTU System for Voice Privacy 2024 Challenge
Viaarxiv icon

SoCodec: A Semantic-Ordered Multi-Stream Speech Codec for Efficient Language Model Based Text-to-Speech Synthesis

Add code
Sep 02, 2024
Viaarxiv icon

Text-aware and Context-aware Expressive Audiobook Speech Synthesis

Add code
Jun 12, 2024
Viaarxiv icon

WenetSpeech4TTS: A 12,800-hour Mandarin TTS Corpus for Large Speech Generation Model Benchmark

Add code
Jun 11, 2024
Figure 1 for WenetSpeech4TTS: A 12,800-hour Mandarin TTS Corpus for Large Speech Generation Model Benchmark
Figure 2 for WenetSpeech4TTS: A 12,800-hour Mandarin TTS Corpus for Large Speech Generation Model Benchmark
Figure 3 for WenetSpeech4TTS: A 12,800-hour Mandarin TTS Corpus for Large Speech Generation Model Benchmark
Figure 4 for WenetSpeech4TTS: A 12,800-hour Mandarin TTS Corpus for Large Speech Generation Model Benchmark
Viaarxiv icon

Automatic channel selection and spatial feature integration for multi-channel speech recognition across various array topologies

Add code
Dec 15, 2023
Viaarxiv icon

HiGNN-TTS: Hierarchical Prosody Modeling with Graph Neural Networks for Expressive Long-form TTS

Add code
Sep 25, 2023
Viaarxiv icon