Zhaoheng Ni

Serialized Speech Information Guidance with Overlapped Encoding Separation for Multi-Speaker Automatic Speech Recognition

Sep 01, 2024

High Fidelity Text-Guided Music Generation and Editing via Single-Stage Flow Matching

Jul 04, 2024

URGENT Challenge: Universality, Robustness, and Generalizability For Speech Enhancement

Jun 07, 2024

An Empirical Study on the Impact of Positional Encoding in Transformer-based Monaural Speech Enhancement

Jan 18, 2024

On The Open Prompt Challenge In Conditional Audio Generation

Nov 01, 2023

TorchAudio 2.1: Advancing speech recognition, self-supervised learning, and audio processing components for PyTorch

Oct 27, 2023

FoleyGen: Visually-Guided Audio Generation

Sep 19, 2023

Exploring Speech Enhancement for Low-resource Speech Synthesis

Sep 19, 2023

Stack-and-Delay: a new codebook pattern for music generation

Sep 15, 2023

Enhance audio generation controllability through representation similarity regularization

Sep 15, 2023