Title: Enhancing Sound Texture in CNN-Based Acoustic Scene Classification

Jan 06, 2019

Yuzhong Wu, Tan Lee

Abstract: Acoustic scene classification is the task of identifying the scene in which an audio signal was recorded. Convolutional neural network (CNN) models are widely adopted, with proven success, in acoustic scene classification. However, there is little insight into how an audio scene is perceived by a CNN, in contrast to what has been demonstrated in image recognition research. In the present study, Class Activation Mapping (CAM) is used to analyze how the log-magnitude Mel-scale filter-bank (log-Mel) features of different acoustic scenes are learned by a CNN classifier. It is noted that distinct high-energy time-frequency components of audio signals generally do not correspond to strong activation on CAM, while the background sound texture is well learned by the CNN. To make the sound texture more salient, we propose applying the Difference of Gaussian (DoG) and Sobel operators to the log-Mel features to enhance edge information in the time-frequency image. Experimental results on the DCASE 2017 ASC challenge show that using edge-enhanced log-Mel images as the input feature of a CNN significantly improves the performance of audio scene classification.

* Submitted to ICASSP 2019
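
The edge-enhancement step described in the abstract can be illustrated with a minimal sketch. It treats a log-Mel feature matrix (Mel bands by time frames) as an image and applies a Difference of Gaussian filter and a Sobel gradient magnitude. Everything specific here is an assumption made for illustration, not taken from the paper: the helper names (`gaussian_kernel`, `filter_separable`, `difference_of_gaussians`, `sobel_magnitude`), the sigma values, the edge padding, and the random stand-in for real log-Mel features.

```python
import numpy as np

def gaussian_kernel(sigma, truncate=3.0):
    """1-D Gaussian kernel, normalised so its taps sum to 1."""
    radius = max(1, int(truncate * sigma + 0.5))
    x = np.arange(-radius, radius + 1)
    k = np.exp(-0.5 * (x / sigma) ** 2)
    return k / k.sum()

def filter_separable(img, k_axis0, k_axis1):
    """Apply a separable 2-D filter: k_axis0 along axis 0, then k_axis1 along axis 1.

    Borders are handled by repeating the edge values (an assumed choice).
    """
    def along(a, k, axis):
        r = len(k) // 2
        pad = [(0, 0), (0, 0)]
        pad[axis] = (r, r)
        padded = np.pad(a, pad, mode="edge")
        return np.apply_along_axis(
            lambda v: np.convolve(v, k, mode="valid"), axis, padded
        )
    return along(along(img, k_axis0, 0), k_axis1, 1)

def difference_of_gaussians(img, sigma_small=1.0, sigma_large=2.0):
    """Band-pass the image by subtracting a wide blur from a narrow blur.

    The sigma values are illustrative defaults, not the paper's settings.
    """
    g_small = gaussian_kernel(sigma_small)
    g_large = gaussian_kernel(sigma_large)
    return (filter_separable(img, g_small, g_small)
            - filter_separable(img, g_large, g_large))

def sobel_magnitude(img):
    """Gradient magnitude from the two 3x3 Sobel kernels (in separable form)."""
    smooth = np.array([1.0, 2.0, 1.0])
    deriv = np.array([1.0, 0.0, -1.0])
    grad_freq = filter_separable(img, deriv, smooth)  # change across Mel bands
    grad_time = filter_separable(img, smooth, deriv)  # change across frames
    return np.hypot(grad_freq, grad_time)

# Stand-in for a real log-Mel image: 40 Mel bands x 100 frames of random values.
rng = np.random.default_rng(0)
log_mel = rng.normal(size=(40, 100))

dog_image = difference_of_gaussians(log_mel)
sobel_image = sobel_magnitude(log_mel)
print(dog_image.shape, sobel_image.shape)
```

Both outputs keep the shape of the input, so under these assumptions they could be fed to a CNN in place of, or stacked with, the original log-Mel image. How the paper actually combines them is not stated in the abstract.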
