Xujiang Xing

A Joint Noise Disentanglement and Adversarial Training Framework for Robust Speaker Verification

Aug 22, 2024

Xujiang Xing, Mingxing Xu, Thomas Fang Zheng


Abstract: Automatic Speaker Verification (ASV) suffers from performance degradation in noisy conditions. To address this issue, we propose a novel adversarial learning framework that incorporates noise disentanglement to establish a noise-independent, speaker-invariant embedding space. Specifically, the disentanglement module includes two encoders that separate speaker-related and speaker-irrelevant information, respectively. The reconstruction module serves as a regularization term to constrain the noise. A feature-robust loss is also used to supervise the speaker encoder so that it learns noise-independent speaker embeddings without losing speaker information. In addition, adversarial training is introduced to discourage the speaker encoder from encoding acoustic condition information, yielding a speaker-invariant embedding space. Experiments on VoxCeleb1 indicate that the proposed method improves the performance of the speaker verification system under both clean and noisy conditions.

* 5 pages, accepted by Interspeech 2024
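
The abstract names four ingredients: two encoders, a reconstruction term, a feature-robust loss, and adversarial training. The sketch below shows one way these could fit together in a single objective. It is an illustration under stated assumptions, not the authors' implementation: the linear "encoders", the mean-squared-error losses, the sign-flipped adversarial term (a common stand-in for gradient reversal), and every name and weight (`speaker_encoder`, `noise_encoder`, `decoder`, `w_rec`, `w_fr`, `w_adv`) are hypothetical and do not come from the paper.

```python
import random

FEAT_DIM, EMB_DIM = 8, 4  # toy sizes, chosen only for illustration


def linear(weights, x):
    """Apply a weight matrix (a list of rows) to a vector."""
    return [sum(w * v for w, v in zip(row, x)) for row in weights]


def mse(a, b):
    """Mean squared error between two equal-length vectors."""
    return sum((p - q) ** 2 for p, q in zip(a, b)) / len(a)


def random_matrix(rows, cols, rng):
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


rng = random.Random(0)
# Assumption: each module is a single linear map; a real system would use deep networks.
speaker_encoder = random_matrix(EMB_DIM, FEAT_DIM, rng)  # speaker-related information
noise_encoder = random_matrix(EMB_DIM, FEAT_DIM, rng)    # speaker-irrelevant information
decoder = random_matrix(FEAT_DIM, 2 * EMB_DIM, rng)      # reconstruction module


def disentangle_losses(clean, noisy):
    """Compute the two disentanglement-related terms for a clean/noisy feature pair."""
    spk_clean = linear(speaker_encoder, clean)
    spk_noisy = linear(speaker_encoder, noisy)
    noise_code = linear(noise_encoder, noisy)
    # Reconstruct the noisy input from the concatenated speaker and noise codes.
    recon = linear(decoder, spk_noisy + noise_code)
    return {
        "reconstruction": mse(recon, noisy),
        # Assumed form: pull the noisy speaker embedding toward the clean one.
        "feature_robust": mse(spk_noisy, spk_clean),
    }


def total_loss(losses, speaker_cls_loss, condition_cls_loss,
               w_rec=1.0, w_fr=1.0, w_adv=0.1):
    """Encoder-side objective; the adversarial term enters with a negative sign,
    so the encoder is rewarded when the condition classifier does poorly."""
    return (speaker_cls_loss
            + w_rec * losses["reconstruction"]
            + w_fr * losses["feature_robust"]
            - w_adv * condition_cls_loss)


clean = [0.1 * i for i in range(FEAT_DIM)]
noisy = [c + 0.5 for c in clean]  # crude additive "noise"
terms = disentangle_losses(clean, noisy)
print(total_loss(terms, speaker_cls_loss=1.2, condition_cls_loss=0.7))
```

In this toy setup the feature-robust term is zero when the noisy input equals the clean one and grows as the two speaker embeddings drift apart, which matches the intent the abstract describes for that loss.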
