Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Title:QTI Submission to DCASE 2021: residual normalization for device-imbalanced acoustic scene classification with efficient design

Jun 28, 2022

Byeonggeun Kim, Seunghan Yang, Jangho Kim, Simyung Chang

Figure 1 for QTI Submission to DCASE 2021: residual normalization for device-imbalanced acoustic scene classification with efficient design

Figure 2 for QTI Submission to DCASE 2021: residual normalization for device-imbalanced acoustic scene classification with efficient design

Figure 3 for QTI Submission to DCASE 2021: residual normalization for device-imbalanced acoustic scene classification with efficient design

Figure 4 for QTI Submission to DCASE 2021: residual normalization for device-imbalanced acoustic scene classification with efficient design

Share this with someone who'll enjoy it:

Abstract:This technical report describes the details of our TASK1A submission of the DCASE2021 challenge. The goal of the task is to design an audio scene classification system for device-imbalanced datasets under the constraints of model complexity. This report introduces four methods to achieve the goal. First, we propose Residual Normalization, a novel feature normalization method that uses instance normalization with a shortcut path to discard unnecessary device-specific information without losing useful information for classification. Second, we design an efficient architecture, BC-ResNet-Mod, a modified version of the baseline architecture with a limited receptive field. Third, we exploit spectrogram-to-spectrogram translation from one to multiple devices to augment training data. Finally, we utilize three model compression schemes: pruning, quantization, and knowledge distillation to reduce model complexity. The proposed system achieves an average test accuracy of 76.3% in TAU Urban Acoustic Scenes 2020 Mobile, development dataset with 315k parameters, and average test accuracy of 75.3% after compression to 61.0KB of non-zero parameters.

* tech report; won 1st place in DCASE2021 challenge. arXiv admin note: substantial text overlap with arXiv:2111.06531

View paper on

Share this with someone who'll enjoy it:

Title:QTI Submission to DCASE 2021: residual normalization for device-imbalanced acoustic scene classification with efficient design

Paper and Code