Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Title:ADS-Cap: A Framework for Accurate and Diverse Stylized Captioning with Unpaired Stylistic Corpora

Aug 02, 2023

Kanzhi Cheng, Zheng Ma, Shi Zong, Jianbing Zhang, Xinyu Dai, Jiajun Chen

Share this with someone who'll enjoy it:

Abstract:Generating visually grounded image captions with specific linguistic styles using unpaired stylistic corpora is a challenging task, especially since we expect stylized captions with a wide variety of stylistic patterns. In this paper, we propose a novel framework to generate Accurate and Diverse Stylized Captions (ADS-Cap). Our ADS-Cap first uses a contrastive learning module to align the image and text features, which unifies paired factual and unpaired stylistic corpora during the training process. A conditional variational auto-encoder is then used to automatically memorize diverse stylistic patterns in latent space and enhance diversity through sampling. We also design a simple but effective recheck module to boost style accuracy by filtering style-specific captions. Experimental results on two widely used stylized image captioning datasets show that regarding consistency with the image, style accuracy and diversity, ADS-Cap achieves outstanding performances compared to various baselines. We finally conduct extensive analyses to understand the effectiveness of our method. Our code is available at https://github.com/njucckevin/ADS-Cap.

* Accepted at Natural Language Processing and Chinese Computing (NLPCC) 2022

View paper on

Share this with someone who'll enjoy it:

Title:ADS-Cap: A Framework for Accurate and Diverse Stylized Captioning with Unpaired Stylistic Corpora

Paper and Code