Zhaoxi Mu

Spiking Vocos: An Energy-Efficient Neural Vocoder

Sep 16, 2025

SepALM: Audio Language Models Are Error Correctors for Robust Speech Separation

May 06, 2025

Separate in the Speech Chain: Cross-Modal Conditional Audio-Visual Target Speech Extraction

Apr 19, 2024

Self-Supervised Disentangled Representation Learning for Robust Target Speech Extraction

Dec 16, 2023

Multi-Dimensional and Multi-Scale Modeling for Speech Separation Optimized by Discriminative Learning

Mar 07, 2023

A Multi-Stage Triple-Path Method for Speech Separation in Noisy and Reverberant Environments

Mar 07, 2023

Review of end-to-end speech synthesis technology based on deep learning

Apr 20, 2021