Title: Creating a Multimodal Dataset of Images and Text to Study Abusive Language

May 05, 2020

Alessio Palmero Aprosio, Stefano Menini, Sara Tonelli

Abstract: The availability of datasets containing the linguistic phenomena of interest is crucial for studying online hate speech. However, for specific target groups such as teenagers, collecting such data may be problematic because of consent and privacy restrictions. Furthermore, while text-only datasets of this kind have been widely used, limitations set by image-based social media platforms like Instagram make it difficult for researchers to experiment with multimodal hate speech data. We therefore developed CREENDER, an annotation tool that has been used in school classes to create a multimodal dataset of images and abusive comments, which we make freely available under the Apache 2.0 license. The corpus, with Italian comments, has been analysed from different perspectives to investigate whether the subject of an image plays a role in triggering a comment. We find that users judge the same images in different ways, although the presence of a person in the picture increases the probability of receiving an offensive comment.
