



Abstract:Hate speech on social media threatens the mental and physical well-being of individuals and is further responsible for real-world violence. An important driver behind the spread of hate speech and thus why hateful posts can go viral are reshares, yet little is known about why users reshare hate speech. In this paper, we present a comprehensive, causal analysis of the user attributes that make users reshare hate speech. However, causal inference from observational social media data is challenging, because such data likely suffer from selection bias, and there is further confounding due to differences in the vulnerability of users to hate speech. We develop a novel, three-step causal framework: (1) We debias the observational social media data by applying inverse propensity scoring. (2) We use the debiased propensity scores to model the latent vulnerability of users to hate speech as a latent embedding. (3) We model the causal effects of user attributes on users' probability of sharing hate speech, while controlling for the latent vulnerability of users to hate speech. Compared to existing baselines, a particular strength of our framework is that it models causal effects that are non-linear, yet still explainable. We find that users with fewer followers, fewer friends, and fewer posts share more hate speech. Younger accounts, in return, share less hate speech. Overall, understanding the factors that drive users to share hate speech is crucial for detecting individuals at risk of engaging in harmful behavior and for designing effective mitigation strategies.

Abstract:The 2022 Russian invasion of Ukraine was accompanied by a large-scale, pro-Russian propaganda campaign on social media. However, the strategy behind the dissemination of propaganda has remained unclear, particularly how the online discourse was strategically shaped by the propagandists' community. Here, we analyze the strategy of the Twitter community using an inverse reinforcement learning (IRL) approach. Specifically, IRL allows us to model online behavior as a Markov decision process, where the goal is to infer the underlying reward structure that guides propagandists when interacting with users with a supporting or opposing stance toward the invasion. Thereby, we aim to understand empirically whether and how between-user interactions are strategically used to promote the proliferation of Russian propaganda. For this, we leverage a large-scale dataset with 349,455 posts with pro-Russian propaganda from 132,131 users. We show that bots and humans follow a different strategy: bots respond predominantly to pro-invasion messages, suggesting that they seek to drive virality; while messages indicating opposition primarily elicit responses from humans, suggesting that they tend to engage in critical discussions. To the best of our knowledge, this is the first study analyzing the strategy behind propaganda from the 2022 Russian invasion of Ukraine through the lens of IRL.

Abstract:Online propaganda poses a severe threat to the integrity of societies. However, existing datasets for detecting online propaganda have a key limitation: they were annotated using weak labels that can be noisy and even incorrect. To address this limitation, our work makes the following contributions: (1) We present HQP: a novel dataset (N=30,000) for detecting online propaganda with high-quality labels. To the best of our knowledge, HQP is the first dataset for detecting online propaganda that was created through human annotation. (2) We show empirically that state-of-the-art language models fail in detecting online propaganda when trained with weak labels (AUC: 64.03). In contrast, state-of-the-art language models can accurately detect online propaganda when trained with our high-quality labels (AUC: 92.25), which is an improvement of ~44%. (3) To address the cost of labeling, we extend our work to few-shot learning. Specifically, we show that prompt-based learning using a small sample of high-quality labels can still achieve a reasonable performance (AUC: 80.27). Finally, we discuss implications for the NLP community to balance the cost and quality of labeling. Crucially, our work highlights the importance of high-quality labels for sensitive NLP tasks such as propaganda detection.
