Title: Inconsistency-Aware Cross-Attention for Audio-Visual Fusion in Dimensional Emotion Recognition

May 21, 2024

R Gnana Praveen, Jahangir Alam

Abstract: Leveraging complementary relationships across modalities has recently drawn a lot of attention in multimodal emotion recognition. Most of the existing approaches explored cross-attention to capture the complementary relationships across the modalities. However, the modalities may also exhibit weak complementary relationships, which may deteriorate the cross-attended features, resulting in poor multimodal feature representations. To address this problem, we propose Inconsistency-Aware Cross-Attention (IACA), which can adaptively select the most relevant features on-the-fly based on the strong or weak complementary relationships across audio and visual modalities. Specifically, we design a two-stage gating mechanism that can adaptively select the appropriate relevant features to deal with weak complementary relationships. Extensive experiments are conducted on the challenging Aff-Wild2 dataset to show the robustness of the proposed model.

* arXiv admin note: substantial text overlap with arXiv:2403.19554
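
The abstract gives no equations for the gating mechanism, so the following is only a minimal sketch of the general idea: a softmax gate that blends candidate feature vectors (for example, unattended and cross-attended features of one modality). The function names, the gate weights, the temperature parameter, and the way the two stages are chained are all assumptions made for illustration and should not be read as the authors' architecture.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax over a list of floats."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def soft_gate(candidates, gate_weights, temperature=1.0):
    """Blend candidate feature vectors using softmax gate scores.

    candidates:   list of equal-length feature vectors.
    gate_weights: one weight vector per candidate (hypothetical learned
                  parameters); each logit is the dot product of a
                  candidate with its weight vector.
    Returns (blended_vector, gate_scores).
    """
    logits = [
        sum(w * x for w, x in zip(weights, features))
        for weights, features in zip(gate_weights, candidates)
    ]
    scores = softmax(logits, temperature)
    dim = len(candidates[0])
    blended = [
        sum(score * cand[i] for score, cand in zip(scores, candidates))
        for i in range(dim)
    ]
    return blended, scores


def two_stage_gate(unattended, cross_attended, gate_weights, temperature=1.0):
    """Hypothetical chaining of two gates (an assumption, not the paper's design).

    Stage 1 blends unattended and cross-attended features.
    Stage 2 blends the unattended features with the stage-1 output.
    """
    stage1, scores1 = soft_gate([unattended, cross_attended], gate_weights, temperature)
    stage2, scores2 = soft_gate([unattended, stage1], gate_weights, temperature)
    return stage2, (scores1, scores2)


# Toy inputs: made-up numbers, not data from the paper.
unattended = [0.2, -0.1, 0.4]
cross_attended = [0.9, 0.3, -0.2]
weights = [[1.0, 1.0, 1.0], [1.0, 1.0, 1.0]]

fused, (scores1, scores2) = two_stage_gate(
    unattended, cross_attended, weights, temperature=0.1
)
```

A low temperature pushes the gate scores toward a hard selection of one candidate, while a high temperature gives a more even blend. In a trained model the gate weights would be learned jointly with the rest of the network rather than fixed as they are here.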
