Title: Enhancing Sentiment Analysis through Multimodal Fusion: A BERT-DINOv2 Approach

Mar 11, 2025

Taoxu Zhao, Meisi Li, Kehao Chen, Liye Wang, Xucheng Zhou, Kunal Chaturvedi, Mukesh Prasad, Ali Anaissi, Ali Braytee

Figures 1-4 for Enhancing Sentiment Analysis through Multimodal Fusion: A BERT-DINOv2 Approach

Abstract: Multimodal sentiment analysis enhances conventional sentiment analysis, which traditionally relies solely on text, by incorporating information from different modalities such as images, text, and audio. This paper proposes a novel multimodal sentiment analysis architecture that integrates text and image data to provide a more comprehensive understanding of sentiments. For text feature extraction, we utilize BERT, a natural language processing model. For image feature extraction, we employ DINOv2, a vision-transformer-based model. The textual and visual latent features are integrated using the proposed fusion techniques, namely the Basic Fusion Model, Self Attention Fusion Model, and Dual Attention Fusion Model. Experiments on three datasets (Memotion 7k, MVSA single, and MVSA multi) demonstrate the viability and practicality of the proposed multimodal architecture.
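The abstract names the fusion models but does not describe their internals. As a rough illustration of what fusing a text vector with an image vector can look like, the sketch below makes two assumptions that are not taken from the paper: "basic" fusion is read as plain concatenation, and "self attention" fusion is read as scaled dot-product attention over the two modality vectors with no learned projections, followed by mean pooling. The function names and the tiny feature vectors are hypothetical stand-ins for real BERT and DINOv2 outputs, and plain Python lists are used instead of a tensor library to keep the example dependency-free.

```python
import math


def concat_fusion(text_vec, image_vec):
    """Assumed 'basic' fusion: join the two modality vectors end to end."""
    return list(text_vec) + list(image_vec)


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention_fusion(text_vec, image_vec):
    """Assumed attention-style fusion (not the paper's exact model).

    Treats the two modality vectors as a two-token sequence, applies
    scaled dot-product self-attention without learned projections,
    and mean-pools the two attended tokens into one fused vector.
    """
    if len(text_vec) != len(image_vec):
        raise ValueError("both modality vectors must share one dimension")
    tokens = [list(text_vec), list(image_vec)]
    dim = len(text_vec)
    attended = []
    for query in tokens:
        # Similarity of this token to every token, scaled by sqrt(dim).
        scores = [
            sum(q * k for q, k in zip(query, key)) / math.sqrt(dim)
            for key in tokens
        ]
        weights = softmax(scores)
        # Weighted sum of the token vectors.
        attended.append(
            [sum(w * tok[i] for w, tok in zip(weights, tokens)) for i in range(dim)]
        )
    # Mean-pool the two attended tokens.
    return [(attended[0][i] + attended[1][i]) / 2 for i in range(dim)]


# Made-up 4-dimensional features; real encoder outputs are much larger.
text_features = [0.2, -0.1, 0.4, 0.0]
image_features = [0.1, 0.3, -0.2, 0.5]

fused_basic = concat_fusion(text_features, image_features)
fused_attn = self_attention_fusion(text_features, image_features)
```

In this sketch, concatenation keeps both vectors intact and leaves any cross-modal interaction to a downstream classifier, whereas the attention variant mixes the two vectors before classification. A practical version would add learned query, key, and value projections and would be trained end to end.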

* 12 pages
