Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Jen-Wei Huang

A Dataset and Baselines for Measuring and Predicting the Music Piece Memorability

May 21, 2024

Li-Yang Tseng, Tzu-Ling Lin, Hong-Han Shuai, Jen-Wei Huang, Wen-Whei Chang

Abstract:Nowadays, humans are constantly exposed to music, whether through voluntary streaming services or incidental encounters during commercial breaks. Despite the abundance of music, certain pieces remain more memorable and often gain greater popularity. Inspired by this phenomenon, we focus on measuring and predicting music memorability. To achieve this, we collect a new music piece dataset with reliable memorability labels using a novel interactive experimental procedure. We then train baselines to predict and analyze music memorability, leveraging both interpretable features and audio mel-spectrograms as inputs. To the best of our knowledge, we are the first to explore music memorability using data-driven deep learning-based methods. Through a series of experiments and ablation studies, we demonstrate that while there is room for improvement, predicting music memorability with limited data is possible. Certain intrinsic elements, such as higher valence, arousal, and faster tempo, contribute to memorable music. As prediction techniques continue to evolve, real-life applications like music recommendation systems and music style transfer will undoubtedly benefit from this new area of research.

* Proceedings of the 24th International Society for Music Information Retrieval Conference, 174-181. Milan, Italy, November 5-9, 2023

Via

Access Paper or Ask Questions

Multichannel Speech Enhancement by Raw Waveform-mapping using Fully Convolutional Networks

Sep 26, 2019

Chang-Le Liu, Szu-Wei Fu, You-Jin Lee, Yu Tsao, Jen-Wei Huang, Hsin-Min Wang

Figure 1 for Multichannel Speech Enhancement by Raw Waveform-mapping using Fully Convolutional Networks

Figure 2 for Multichannel Speech Enhancement by Raw Waveform-mapping using Fully Convolutional Networks

Figure 3 for Multichannel Speech Enhancement by Raw Waveform-mapping using Fully Convolutional Networks

Figure 4 for Multichannel Speech Enhancement by Raw Waveform-mapping using Fully Convolutional Networks

Abstract:In recent years, waveform-mapping-based speech enhancement (SE) methods have garnered significant attention. These methods generally use a deep learning model to directly process and reconstruct speech waveforms. Because both the input and output are in waveform format, the waveform-mapping-based SE methods can overcome the distortion caused by imperfect phase estimation, which may be encountered in spectral-mapping-based SE systems. So far, most waveform-mapping-based SE methods have focused on single-channel tasks. In this paper, we propose a novel fully convolutional network (FCN) with Sinc and dilated convolutional layers (termed SDFCN) for multichannel SE that operates in the time domain. We also propose an extended version of SDFCN, called the residual SDFCN (termed rSDFCN). The proposed methods are evaluated on two multichannel SE tasks, namely the dual-channel inner-ear microphones SE task and the distributed microphones SE task. The experimental results confirm the outstanding denoising capability of the proposed SE systems on both tasks and the benefits of using the residual architecture on the overall SE performance.

Via

Access Paper or Ask Questions