Title: Attention-based Region of Interest (ROI) Detection for Speech Emotion Recognition

Mar 03, 2022

Jay Desai, Houwei Cao, Ravi Shah

Abstract: Automatic emotion recognition for real-life applications is a challenging task. Human emotion expressions are subtle and can be conveyed by a combination of several emotions. In most existing emotion recognition studies, each audio utterance/video clip is labelled/classified in its entirety. However, utterance/clip-level labelling and classification can be too coarse to capture the subtle intra-utterance/clip temporal dynamics. For example, an utterance/video clip usually contains only a few emotion-salient regions and many emotionless regions. In this study, we propose to use an attention mechanism in deep recurrent neural networks to detect the Regions-of-Interest (ROI) that are more emotionally salient in human emotional speech/video, and further estimate the temporal emotion dynamics by aggregating those emotionally salient regions-of-interest. We compare the ROI from audio and video and analyse them. We compare the performance of the proposed attention networks with the state-of-the-art LSTM models on a multi-class classification task of recognizing six basic human emotions, and the proposed attention models exhibit significantly better performance. Furthermore, the attention weight distribution can be used to interpret how an utterance can be expressed as a mixture of possible emotions.
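
The abstract describes using attention weights over recurrent-network outputs to locate emotionally salient regions and to aggregate them. As a rough illustration of that general idea only, the sketch below applies generic soft attention pooling to a sequence of per-frame feature vectors. It is not the authors' model: the scoring vector `w`, the helper names `attention_pool` and `salient_regions`, the toy `frames` data, and the choice of the uniform weight 1/T as the saliency threshold are all assumptions made for this example.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(frames, w):
    """Score each frame with a dot product against w (hypothetical
    scoring vector), normalise the scores into attention weights,
    and return the weights plus the weighted sum of the frames."""
    scores = [sum(wi * hi for wi, hi in zip(w, h)) for h in frames]
    alphas = softmax(scores)
    dim = len(frames[0])
    context = [sum(a * h[d] for a, h in zip(alphas, frames)) for d in range(dim)]
    return alphas, context


def salient_regions(alphas, threshold=None):
    """Return (start, end) index pairs of contiguous frames whose
    attention weight exceeds the threshold. The default threshold,
    the uniform weight 1/T, is an assumption of this sketch."""
    if threshold is None:
        threshold = 1.0 / len(alphas)
    regions, start = [], None
    for t, a in enumerate(alphas):
        if a > threshold and start is None:
            start = t
        elif a <= threshold and start is not None:
            regions.append((start, t - 1))
            start = None
    if start is not None:
        regions.append((start, len(alphas) - 1))
    return regions


# Toy stand-in for recurrent hidden states: frames 2 and 3 carry most
# of the signal, the rest are near zero.
frames = [[0.0, 0.0], [0.1, 0.0], [2.0, 1.0], [2.2, 1.1], [0.0, 0.1], [0.0, 0.0]]
w = [1.0, 1.0]

alphas, context = attention_pool(frames, w)
regions = salient_regions(alphas)
print(regions)
```

In a trained model the frame vectors would come from the recurrent layers and `w` would be learned; here both are fixed by hand so that the weight distribution and the resulting regions are easy to inspect.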

* Paper written in 2019
