Abstract:We propose an ontology enhanced model for sentence based claim detection. We fused ontology embeddings from a knowledge base with BERT sentence embeddings to perform claim detection for the ClaimBuster and the NewsClaims datasets. Our ontology enhanced approach showed the best results with these small-sized unbalanced datasets, compared to other statistical and neural machine learning models. The experiments demonstrate that adding domain specific features (either trained word embeddings or knowledge graph metadata) can improve traditional ML methods. In addition, adding domain knowledge in the form of ontology embeddings helps avoid the bias encountered in neural network based models, for example the pure BERT model bias towards larger classes in our small corpus.
Abstract:In this working paper, we turn our attention to two exemplary, cross-media shitstorms directed against well-known individuals from the business world. Both have in common, first, the trigger, a controversial statement by the person who thereby becomes the target of the shitstorm, and second, the identity of this target as relatively privileged: cis-male, white, successful. We examine the spread of the outrage wave across two media at a time and test the applicability of computational linguistic methods for analyzing its time course. Assuming that harmful language spreads like a virus in digital space, we are primarily interested in the events and constellations that lead to the use of harmful language, and whether and how a linguistic formation of "tribes" occurs. Our research therefore focuses, first, on the distribution of linguistic features within the overall shitstorm: are individual words or phrases increasingly used after their introduction, and through which pathways they spread. Second, we ask whether "tribes," for example, one group of supporters and one of opponents of the target, have a distinguished linguistic form. Our hypothesis is that supporters remain equally active over time, while the dynamic "ripple" effect of the shitstorm is based on the varying participation of opponents.
Abstract:Learned dynamic weighting of the conditioning signal (attention) has been shown to improve neural language generation in a variety of settings. The weights applied when generating a particular output sequence have also been viewed as providing a potentially explanatory insight into the internal workings of the generator. In this paper, we reverse the direction of this connection and ask whether through the control of the attention of the model we can control its output. Specifically, we take a standard neural image captioning model that uses attention, and fix the attention to pre-determined areas in the image. We evaluate whether the resulting output is more likely to mention the class of the object in that area than the normally generated caption. We introduce three effective methods to control the attention and find that these are producing expected results in up to 28.56% of the cases.