Zexu Pan

Emotional Dimension Control in Language Model-Based Text-to-Speech: Spanning a Broad Spectrum of Human Emotions

Sep 25, 2024

Enhanced Reverberation as Supervision for Unsupervised Speech Separation

Aug 06, 2024

TF-Locoformer: Transformer with Local Modeling by Convolution for Speech Separation and Enhancement

Aug 06, 2024

NIIRF: Neural IIR Filter Field for HRTF Upsampling and Personalization

Feb 27, 2024

NeuroHeed+: Improving Neuro-steered Speaker Extraction with Joint Auditory Attention Detection

Dec 12, 2023

Scenario-Aware Audio-Visual TF-GridNet for Target Speech Extraction

Oct 30, 2023

LocSelect: Target Speaker Localization with an Auditory Selective Hearing Mechanism

Oct 17, 2023

Generation or Replication: Auscultating Audio Latent Diffusion Models

Oct 16, 2023

Audio-Visual Active Speaker Extraction for Sparsely Overlapped Multi-talker Speech

Sep 15, 2023

NeuroHeed: Neuro-Steered Speaker Extraction using EEG Signals

Jul 26, 2023