Man-wai Mak

Spectral-Aware Low-Rank Adaptation for Speaker Verification

Jan 07, 2025

Zhe Li, Man-wai Mak, Mert Pilanci, Hung-yi Lee, Helen Meng

Abstract: Previous research has shown that the principal singular vectors of a pre-trained model's weight matrices capture critical knowledge. In contrast, those associated with small singular values may contain noise or less reliable information. As a result, the LoRA-based parameter-efficient fine-tuning (PEFT) approach, which does not constrain the use of the spectral space, may not be effective for tasks that demand high representation capacity. In this study, we enhance existing PEFT techniques by incorporating the spectral information of pre-trained weight matrices into the fine-tuning process. We investigate spectral adaptation strategies with a particular focus on the additive adjustment of top singular vectors. This is accomplished by applying singular value decomposition (SVD) to the pre-trained weight matrices and restricting the fine-tuning within the top spectral space. Extensive speaker verification experiments on VoxCeleb1 and CN-Celeb1 demonstrate enhanced tuning performance with the proposed approach. Code is released at https://github.com/lizhepolyu/SpectralFT.

* Accepted by ICASSP 2025
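
The additive adjustment of top singular vectors described in the abstract above can be illustrated with a small NumPy sketch. This is not the authors' released implementation. The function name `adapted_weight`, the offset matrices `dU` and `dV`, the rank `r`, and the random stand-in matrix `W` are all assumptions made for illustration. The sketch decomposes a weight matrix with SVD, adds trainable offsets only to the top-`r` left and right singular vectors, and keeps the rest of the spectrum frozen.

```python
import numpy as np

def adapted_weight(U, s, Vt, dU, dV, r):
    """Rebuild W with additive offsets on the top-r singular vectors only."""
    U_top = U[:, :r] + dU                  # (m, r) tuned left singular vectors
    V_top = Vt[:r, :].T + dV               # (n, r) tuned right singular vectors
    top = (U_top * s[:r]) @ V_top.T        # tuned top spectral component
    rest = (U[:, r:] * s[r:]) @ Vt[r:, :]  # frozen remainder of the spectrum
    return top + rest

rng = np.random.default_rng(0)
W = rng.standard_normal((8, 6))  # hypothetical stand-in for a pre-trained weight
r = 2                            # hypothetical number of tuned singular directions
U, s, Vt = np.linalg.svd(W, full_matrices=False)

# Zero-initialised offsets leave the pre-trained weight unchanged.
dU = np.zeros((8, r))
dV = np.zeros((6, r))
W_init = adapted_weight(U, s, Vt, dU, dV, r)

# Small non-zero offsets play the role of a fine-tuning update.
dU_new = 0.01 * rng.standard_normal((8, r))
dV_new = 0.01 * rng.standard_normal((6, r))
W_tuned = adapted_weight(U, s, Vt, dU_new, dV_new, r)
```

With zero offsets the reconstruction equals `W`, so fine-tuning would start from the pre-trained model. In this sketch the update `W_tuned - W` has rank at most `2 * r`, and only `r * (m + n)` values are trainable for an `m x n` matrix.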

Cluster-Guided Unsupervised Domain Adaptation for Deep Speaker Embedding

Mar 28, 2023

Haiquan Mao, Feng Hong, Man-wai Mak

Abstract: Recent studies have shown that pseudo labels can contribute to unsupervised domain adaptation (UDA) for speaker verification. Inspired by the self-training strategies that use an existing classifier to label the unlabeled data for retraining, we propose a cluster-guided UDA framework that labels the target domain data by clustering and combines the labeled source domain data and pseudo-labeled target domain data to train a speaker embedding network. To improve the cluster quality, we train a speaker embedding network dedicated to clustering by minimizing the contrastive center loss. The goal is to reduce the distance between an embedding and its assigned cluster center while enlarging the distance between the embedding and the other cluster centers. Using VoxCeleb2 as the source domain and CN-Celeb1 as the target domain, we demonstrate that the proposed method can achieve an equal error rate (EER) of 8.10% on the CN-Celeb1 evaluation set without using any labels from the target domain. This result outperforms the supervised baseline by 39.6% and is the state-of-the-art UDA performance on this corpus.
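
The abstract above describes a loss that pulls an embedding toward its assigned cluster center and pushes it away from the other centers. A minimal NumPy sketch of one common formulation of the contrastive center loss follows, assuming the ratio form (squared distance to the assigned center divided by the summed squared distances to all other centers plus a constant). The exact form used in the paper may differ. The function name, the `delta` constant, and the toy data are illustrative assumptions.

```python
import numpy as np

def contrastive_center_loss(emb, labels, centers, delta=1.0):
    """Ratio of intra-cluster to inter-cluster squared distances, summed over samples."""
    # Squared Euclidean distance from every embedding to every center: shape (N, K).
    d2 = ((emb[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    idx = np.arange(len(labels))
    intra = d2[idx, labels]          # distance to the assigned center
    inter = d2.sum(axis=1) - intra   # summed distance to all other centers
    # delta (an assumed constant) keeps the denominator away from zero.
    return 0.5 * np.sum(intra / (inter + delta))

# Toy data: two cluster centers and two embeddings that sit exactly on them.
centers = np.array([[0.0, 0.0], [4.0, 0.0]])
emb = np.array([[0.0, 0.0], [4.0, 0.0]])

loss_good = contrastive_center_loss(emb, np.array([0, 1]), centers)  # correct assignment
loss_bad = contrastive_center_loss(emb, np.array([1, 0]), centers)   # swapped assignment
```

In this toy case the correctly assigned embeddings give a loss of zero, while the swapped assignment gives a strictly larger loss, which is the behaviour the abstract describes.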
