Liumeng Xue

The ISCSLP 2024 Conversational Voice Clone (CoVoC) Challenge: Tasks, Results and Findings

Oct 31, 2024

Text-aware and Context-aware Expressive Audiobook Speech Synthesis

Jun 12, 2024

WenetSpeech4TTS: A 12,800-hour Mandarin TTS Corpus for Large Speech Generation Model Benchmark

Jun 11, 2024

Single-Codec: Single-Codebook Speech Codec towards High-Performance Speech Generation

Jun 11, 2024

An Investigation of Time-Frequency Representation Discriminators for High-Fidelity Vocoder

Apr 26, 2024

ChatMusician: Understanding and Generating Music Intrinsically with LLM

Feb 25, 2024

SingVisio: Visual Analytics of Diffusion Model for Singing Voice Conversion

Feb 20, 2024

Transfer the linguistic representations from TTS to accent conversion with non-parallel data

Jan 07, 2024

Amphion: An Open-Source Audio, Music and Speech Generation Toolkit

Dec 15, 2023

Multi-Scale Sub-Band Constant-Q Transform Discriminator for High-Fidelity Vocoder

Nov 25, 2023