Yicheng Gu

Emilia: An Extensive, Multilingual, and Diverse Speech Dataset for Large-Scale Speech Generation

Jul 07, 2024
Via arXiv

FoleyCrafter: Bring Silent Videos to Life with Lifelike and Synchronized Sounds

Jul 01, 2024
Via arXiv

An Investigation of Time-Frequency Representation Discriminators for High-Fidelity Vocoder

Apr 26, 2024
Via arXiv

Amphion: An Open-Source Audio, Music and Speech Generation Toolkit

Dec 15, 2023
Via arXiv

Multi-Scale Sub-Band Constant-Q Transform Discriminator for High-Fidelity Vocoder

Nov 25, 2023
Via arXiv

Leveraging Content-based Features from Multiple Acoustic Models for Singing Voice Conversion

Oct 17, 2023
Via arXiv