Abstract:Generics (unquantified generalizations) are thought to be pervasive in communication and when they are about social groups, this may offend and polarize people because generics gloss over variations between individuals. Generics about social groups might be particularly common on Twitter (X). This remains unexplored, however. Using machine learning (ML) techniques, we therefore developed an automatic classifier for social generics, applied it to more than a million tweets about people, and analyzed the tweets. We found that most tweets (78%) about people contained no generics. However, tweets with social generics received more 'likes' and retweets. Furthermore, while recent psychological research may lead to the prediction that tweets with generics about political groups are more common than tweets with generics about ethnic groups, we found the opposite. However, consistent with recent claims that political animosity is less constrained by social norms than animosity against gender and ethnic groups, negative tweets with generics about political groups were significantly more prevalent and retweeted than negative tweets about ethnic groups. Our study provides the first ML-based insights into the use and impact of social generics on Twitter.
Abstract:Attitudes about vaccination have become more polarized; it is common to see vaccine disinformation and fringe conspiracy theories online. An observational study of Twitter vaccine discourse is found in Ojea Quintana et al. (2021): the authors analyzed approximately six months' of Twitter discourse -- 1.3 million original tweets and 18 million retweets between December 2019 and June 2020, ranging from before to after the establishment of Covid-19 as a pandemic. This work expands upon Ojea Quintana et al. (2021) with two main contributions from data science. First, based on the authors' initial network clustering and qualitative analysis techniques, we are able to clearly demarcate and visualize the language patterns used in discourse by Antivaxxers (anti-vaccination campaigners and vaccine deniers) versus other clusters (collectively, Others). Second, using the characteristics of Antivaxxers' tweets, we develop text classifiers to determine the likelihood a given user is employing anti-vaccination language, ultimately contributing to an early-warning mechanism to improve the health of our epistemic environment and bolster (and not hinder) public health initiatives.