Haizhou Li

IML-Spikeformer: Input-aware Multi-Level Spiking Transformer for Speech Processing

Jul 10, 2025

VP-SelDoA: Visual-prompted Selective DoA Estimation of Target Sound via Semantic-Spatial Matching

Jul 10, 2025

SpeechRefiner: Towards Perceptual Quality Refinement for Front-End Algorithms

Jun 16, 2025

Incorporating Linguistic Constraints from External Knowledge Source for Audio-Visual Target Speech Extraction

Jun 11, 2025

SongBloom: Coherent Song Generation via Interleaved Autoregressive Sketching and Diffusion Refinement

Jun 09, 2025

Exploring Length Generalization For Transformer-based Speech Enhancement

Jun 07, 2025

From Word to World: Evaluate and Mitigate Culture Bias via Word Association Test

May 24, 2025

Towards Emotionally Consistent Text-Based Speech Editing: Introducing EmoCorrector and The ECD-TSE Dataset

May 24, 2025

PersonaTAB: Predicting Personality Traits using Textual, Acoustic, and Behavioral Cues in Fully-Duplex Speech Dialogs

May 20, 2025

Chain-Talker: Chain Understanding and Rendering for Empathetic Conversational Speech Synthesis

May 19, 2025