Zheng-Hua Tan

Aalborg University

xLSTM-SENet: xLSTM for Single-Channel Speech Enhancement

Jan 10, 2025
Via arXiv

Vocal Tract Length Warped Features for Spoken Keyword Spotting

Jan 07, 2025
Via arXiv

Noise-Robust Target-Speaker Voice Activity Detection Through Self-Supervised Pretraining

Jan 06, 2025
Via arXiv

BiSSL: Bilevel Optimization for Self-Supervised Pre-Training and Fine-Tuning

Oct 03, 2024
Via arXiv

Detecting and Defending Against Adversarial Attacks on Automatic Speech Recognition via Diffusion Models

Sep 12, 2024
Via arXiv

Speaker and Style Disentanglement of Speech Based on Contrastive Predictive Coding Supported Factorized Variational Autoencoder

Sep 05, 2024
Via arXiv

Audio xLSTMs: Learning Self-Supervised Audio Representations with xLSTMs

Sep 02, 2024
Via arXiv

The Effect of Training Dataset Size on Discriminative and Diffusion-Based Speech Enhancement Systems

Jun 10, 2024
Via arXiv

Zero-Shot Audio Captioning Using Soft and Hard Prompts

Jun 10, 2024
Via arXiv

Audio Mamba: Selective State Spaces for Self-Supervised Audio Representations

Jun 04, 2024
Via arXiv