Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Title:WisdoM: Improving Multimodal Sentiment Analysis by Fusing Contextual World Knowledge

Jan 12, 2024

Wenbin Wang, Liang Ding, Li Shen, Yong Luo, Han Hu, Dacheng Tao

Share this with someone who'll enjoy it:

Abstract:Sentiment analysis is rapidly advancing by utilizing various data modalities (e.g., text, image). However, most previous works relied on superficial information, neglecting the incorporation of contextual world knowledge (e.g., background information derived from but beyond the given image and text pairs) and thereby restricting their ability to achieve better multimodal sentiment analysis. In this paper, we proposed a plug-in framework named WisdoM, designed to leverage contextual world knowledge induced from the large vision-language models (LVLMs) for enhanced multimodal sentiment analysis. WisdoM utilizes a LVLM to comprehensively analyze both images and corresponding sentences, simultaneously generating pertinent context. To reduce the noise in the context, we also introduce a training-free Contextual Fusion mechanism. Experimental results across diverse granularities of multimodal sentiment analysis tasks consistently demonstrate that our approach has substantial improvements (brings an average +1.89 F1 score among five advanced methods) over several state-of-the-art methods. Code will be released.

View paper on

Share this with someone who'll enjoy it:

Title:WisdoM: Improving Multimodal Sentiment Analysis by Fusing Contextual World Knowledge

Paper and Code