Title: Predicting emotion from music videos: exploring the relative contribution of visual and auditory information to affective responses

Feb 19, 2022

Phoebe Chua, Dimos Makris, Dorien Herremans, Gemma Roig, Kat Agres

Abstract: Although media content is increasingly produced, distributed, and consumed in multiple combinations of modalities, how individual modalities contribute to the perceived emotion of a media item remains poorly understood. In this paper we present MusicVideos (MuVi), a novel dataset for affective multimedia content analysis to study how the auditory and visual modalities contribute to the perceived emotion of media. The data were collected by presenting music videos to participants in three conditions: music, visual, and audiovisual. Participants annotated the music videos for valence and arousal over time, as well as the overall emotion conveyed. We present detailed descriptive statistics for key measures in the dataset and the results of feature importance analyses for each condition. Finally, we propose a novel transfer learning architecture to train Predictive models Augmented with Isolated modality Ratings (PAIR) and demonstrate the potential of isolated modality ratings for enhancing multimodal emotion recognition. Our results suggest that perceptions of arousal are influenced primarily by auditory information, while perceptions of valence are more subjective and can be influenced by both visual and auditory information. The dataset is made publicly available.

* 16 pages with 9 figures
