Title: XMeCap: Meme Caption Generation with Sub-Image Adaptability

Jul 24, 2024

Yuyan Chen, Songzhou Yan, Zhihong Zhu, Zhixu Li, Yanghua Xiao

Abstract: Humor, deeply rooted in societal meanings and cultural details, poses a unique challenge for machines. While natural language processing has advanced, real-world humor often thrives in a multi-modal context, of which memes are a distinctive example. This paper places particular emphasis on the impact of multiple images on meme captioning. We then introduce the XMeCap framework, a novel approach that adopts supervised fine-tuning and reinforcement learning based on an innovative reward model, which factors in both global and local similarities between visuals and text. Our results, benchmarked against contemporary models, show a marked improvement in caption generation for both single-image and multi-image memes, as well as across different meme categories. XMeCap achieves an average evaluation score of 75.85 for single-image memes and 66.32 for multi-image memes, outperforming the best baseline by 3.71% and 4.82%, respectively. This research not only establishes a new frontier in meme-related studies but also underscores the potential of machines in understanding and generating humor in a multi-modal setting.

* Accepted to MM 2024
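
The abstract describes a reward model that factors in both global and local similarities between visuals and text, but it does not give the formula. The sketch below is one possible reading, not the authors' implementation. It assumes that the whole meme, each sub-image, and the caption have already been embedded into a shared vector space by some encoder. The names `cosine`, `caption_reward`, and the mixing weight `alpha` are hypothetical and do not come from the paper.

```python
import math
from typing import Sequence

Vector = Sequence[float]


def cosine(u: Vector, v: Vector) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def caption_reward(
    global_image: Vector,
    sub_images: Sequence[Vector],
    caption: Vector,
    alpha: float = 0.5,  # hypothetical weight between global and local terms
) -> float:
    """Blend a global image-caption similarity with a mean per-sub-image similarity.

    Assumption: the weighted sum and the mean over sub-images are illustrative
    choices; the paper's actual reward may combine these signals differently.
    """
    global_sim = cosine(global_image, caption)
    if sub_images:
        local_sim = sum(cosine(s, caption) for s in sub_images) / len(sub_images)
    else:
        # Single-image meme: fall back to the global term only.
        local_sim = global_sim
    return alpha * global_sim + (1.0 - alpha) * local_sim
```

In a reinforcement learning setup, a scalar of this kind would be computed for each generated caption and used as the reward signal. For a multi-image meme, the local term lets a caption that matches only some panels score lower than one that matches all of them.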
