We present a novel multi-modal chitchat dialogue dataset, TikTalk, aimed at facilitating research on intelligent chatbots. It consists of videos and the corresponding dialogues that users generate on video social applications. In contrast to existing multi-modal dialogue datasets, we construct dialogue corpora from video comment-reply pairs, which more closely resemble chitchat in real-world dialogue scenarios. Our dialogue context includes three modalities: text, vision, and audio. Compared with previous image-based dialogue datasets, the richer sources of context in TikTalk lead to greater diversity of conversations. TikTalk contains over 38K videos and 367K dialogues. Data analysis shows that responses in TikTalk correlate with various contexts and external knowledge, which poses a great challenge for the deep understanding of multi-modal information and the generation of responses. We evaluate several baselines on three types of automatic metrics and conduct case studies. Experimental results demonstrate that there is still large room for future improvement on TikTalk. Our dataset is available at \url{https://github.com/RUC-AIMind/TikTalk}.