Title: Multi-Scale Aggregation Using Feature Pyramid Module for Text-Independent Speaker Verification

Apr 14, 2020

Youngmoon Jung, Seong Min Kye, Yeunju Choi, Myunghun Jung, Hoirin Kim

Abstract: Currently, the most widely used approach to speaker verification is deep speaker embedding learning. In this approach, convolutional neural networks are mainly used as frame-level feature extractors, and speaker embeddings are extracted from the last layer of the feature extractor. Multi-scale aggregation (MSA), which utilizes multi-scale features from different layers of the feature extractor, has recently been introduced into this approach and has shown improved performance for both short and long utterances. This paper improves MSA by using a feature pyramid module, which enhances the speaker-discriminative information of features at multiple layers via a top-down pathway and lateral connections. We extract speaker embeddings from the enhanced features, which contain rich speaker information at different resolutions. Experiments on the VoxCeleb dataset show that the proposed module improves on previous MSA methods with fewer parameters, providing better performance than state-of-the-art approaches.

* Submitted to Interspeech 2020
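The top-down pathway with lateral connections described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names, the feature-map shapes, the 32-channel width, the nearest-neighbour upsampling, and the use of global average pooling before concatenation are all assumptions made for illustration, and the smoothing convolutions commonly used in feature pyramid networks are omitted.

```python
import numpy as np

def lateral(x, w):
    """Lateral connection as a 1x1 convolution: (C_in, F, T) -> (C_out, F, T)."""
    return np.einsum("oc,cft->oft", w, x)

def upsample_to(x, shape):
    """Nearest-neighbour upsampling of x (C, F, T) to the spatial size `shape`."""
    f_idx = np.arange(shape[0]) * x.shape[1] // shape[0]
    t_idx = np.arange(shape[1]) * x.shape[2] // shape[1]
    return x[:, f_idx][:, :, t_idx]

def feature_pyramid(features, weights):
    """Enhance bottom-up maps (ordered shallow to deep) via a top-down pathway."""
    # Start from the deepest, lowest-resolution map.
    top = lateral(features[-1], weights[-1])
    enhanced = [top]
    # Walk back towards the shallow layers, merging at each level.
    for x, w in zip(reversed(features[:-1]), reversed(weights[:-1])):
        lat = lateral(x, w)
        top = lat + upsample_to(top, lat.shape[1:])
        enhanced.append(top)
    return enhanced[::-1]  # restore shallow-to-deep order

def aggregate(enhanced):
    """Pool each enhanced map over frequency and time, then concatenate."""
    return np.concatenate([m.mean(axis=(1, 2)) for m in enhanced])

# Hypothetical feature maps from three layers: (channels, frequency, time).
rng = np.random.default_rng(0)
shapes = [(16, 8, 40), (32, 4, 20), (64, 2, 10)]
features = [rng.standard_normal(s) for s in shapes]
# Hypothetical lateral weights projecting every level to 32 channels.
weights = [rng.standard_normal((32, s[0])) * 0.1 for s in shapes]

enhanced = feature_pyramid(features, weights)
embedding = aggregate(enhanced)
print([m.shape for m in enhanced], embedding.shape)
```

With these assumed shapes, each enhanced map keeps its own resolution but shares a 32-channel width, so the concatenated vector has 3 × 32 = 96 dimensions. A real system would learn the lateral weights jointly with the feature extractor and would likely use a pooling layer designed for variable-length utterances.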
