Title: Reliable Visualization for Deep Speaker Recognition

Apr 12, 2022

Pengqi Li, Lantian Li, Askar Hamdulla, Dong Wang

Abstract: In spite of the impressive success of convolutional neural networks (CNNs) in speaker recognition, our understanding of CNNs' internal functions is still limited. A major obstacle is that some popular visualization tools, for example those producing saliency maps, are difficult to apply. The reason is that speaker information does not show clear spatial patterns in the temporal-frequency space, which makes it hard to interpret the visualization results, and hence hard to confirm the reliability of a visualization tool. In this paper, we conduct an extensive analysis of three popular CAM-based visualization methods, Grad-CAM, Score-CAM and Layer-CAM, to investigate their reliability for speaker recognition tasks. Experiments conducted on a state-of-the-art ResNet34SE model show that the Layer-CAM algorithm can produce reliable visualization, and thus can be used as a promising tool to explain CNN-based speaker models. The source code and examples are available on our project page: http://project.cslt.org/.

* submitted to INTERSPEECH 2022
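
The abstract names the CAM variants without defining them. As a rough illustration only, the sketch below contrasts the commonly published Grad-CAM and Layer-CAM weighting rules on toy data. It is not the authors' code: the function names `grad_cam` and `layer_cam`, the nested-list layout (channels, then frequency rows, then time columns), and the toy values are all assumptions made for this example, and Score-CAM is omitted because it needs forward passes through a real model.

```python
def grad_cam(activations, gradients):
    """Grad-CAM style map: one weight per channel (the mean gradient).

    activations, gradients: lists of K channel maps, each an F x T nested list.
    (Hypothetical layout chosen for this sketch.)
    """
    rows, cols = len(activations[0]), len(activations[0][0])
    # Average each channel's gradient over all positions to get its weight.
    weights = [sum(sum(row) for row in g) / (rows * cols) for g in gradients]
    # Weighted sum of channels at each position, followed by ReLU.
    return [[max(0.0, sum(w * a[i][j] for w, a in zip(weights, activations)))
             for j in range(cols)] for i in range(rows)]


def layer_cam(activations, gradients):
    """Layer-CAM style map: one weight per channel AND position (ReLU of gradient)."""
    rows, cols = len(activations[0]), len(activations[0][0])
    # Keep only positive gradients, multiply element-wise, sum channels, ReLU.
    return [[max(0.0, sum(max(0.0, g[i][j]) * a[i][j]
                          for a, g in zip(activations, gradients)))
             for j in range(cols)] for i in range(rows)]


# Toy input: a single channel with one frequency bin and two time frames.
acts = [[[1.0, 1.0]]]
grads = [[[2.0, -2.0]]]
print(grad_cam(acts, grads))   # channel-level averaging cancels the gradients
print(layer_cam(acts, grads))  # position-level weights keep the first frame
```

In this toy case the channel-level average used by the Grad-CAM rule is zero, so its map is empty, while the position-wise Layer-CAM rule still highlights the first frame. This only illustrates how the two rules differ in spatial resolution; it says nothing about the reliability results reported in the paper.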
