Xueyao Zhang

Vevo: Controllable Zero-Shot Voice Imitation with Self-Supervised Disentanglement

Feb 11, 2025

Metis: A Foundation Speech Generation Model with Masked Generative Pre-training

Feb 05, 2025

Overview of the Amphion Toolkit (v0.2)

Jan 26, 2025

Noro: A Noise-Robust One-shot Voice Conversion System with Hidden Speaker Representation Capabilities

Nov 29, 2024

An Investigation of Time-Frequency Representation Discriminators for High-Fidelity Vocoder

Apr 26, 2024

SingVisio: Visual Analytics of Diffusion Model for Singing Voice Conversion

Feb 20, 2024

Reconfigurable Intelligent Computational Surfaces for MEC-Assisted Autonomous Driving Networks

Feb 01, 2024

Amphion: An Open-Source Audio, Music and Speech Generation Toolkit

Dec 15, 2023

Multi-Scale Sub-Band Constant-Q Transform Discriminator for High-Fidelity Vocoder

Nov 25, 2023

Leveraging Content-based Features from Multiple Acoustic Models for Singing Voice Conversion

Oct 17, 2023