Xiao-Lei Zhang

DualSpec: Text-to-spatial-audio Generation via Dual-Spectrogram Guided Diffusion Model

Feb 26, 2025
Via arXiv

UniForm: A Unified Diffusion Transformer for Audio-Video Generation

Feb 08, 2025
Via arXiv

Enhancing Intelligibility for Generative Target Speech Extraction via Joint Optimization with Target Speaker ASR

Jan 24, 2025
Via arXiv

Speaker Contrastive Learning for Source Speaker Tracing

Sep 16, 2024
Via arXiv

Rethinking the Output Architecture for Sound Source Localization

Nov 21, 2023
Via arXiv

Diffusion-Based Adversarial Purification for Speaker Verification

Oct 24, 2023
Via arXiv

Spatial-temporal Graph Based Multi-channel Speaker Verification With Ad-hoc Microphone Arrays

Jul 03, 2023
Via arXiv

Soft Label Coding for End-to-end Sound Source Localization With Ad-hoc Microphone Arrays

Apr 15, 2023
Via arXiv

Optimizing Quantum Federated Learning Based on Federated Quantum Natural Gradient Descent

Feb 27, 2023
Via arXiv

Interpretable Spectrum Transformation Attacks to Speaker Recognition

Feb 21, 2023
Via arXiv