Zexu Pan

ClearerVoice-Studio: Bridging Advanced Speech Processing Research and Practical Deployment

Jun 24, 2025

Plug-and-Play Co-Occurring Face Attention for Robust Audio-Visual Speaker Extraction

May 27, 2025

Causal Self-supervised Pretrained Frontend with Predictive Code for Speech Separation

Apr 03, 2025

Context-Aware Two-Step Training Scheme for Domain Invariant Speech Separation

Mar 16, 2025

Conditional Latent Diffusion-Based Speech Enhancement Via Dual Context Learning

Jan 17, 2025

HiFi-SR: A Unified Generative Transformer-Convolutional Adversarial Network for High-Fidelity Speech Super-Resolution

Jan 17, 2025

Improved Feature Extraction Network for Neuro-Oriented Target Speaker Extraction

Jan 03, 2025

pTSE-T: Presentation Target Speaker Extraction using Unaligned Text Cues

Nov 05, 2024

Speech Separation with Pretrained Frontend to Minimize Domain Mismatch

Nov 05, 2024

Emotional Dimension Control in Language Model-Based Text-to-Speech: Spanning a Broad Spectrum of Human Emotions

Sep 25, 2024