Title: Robustness and Confounders in the Demographic Alignment of LLMs with Human Perceptions of Offensiveness

Nov 13, 2024

Shayan Alipour, Indira Sen, Mattia Samory, Tanushree Mitra

Abstract: Large language models (LLMs) are known to exhibit demographic biases, yet few studies systematically evaluate these biases across multiple datasets or account for confounding factors. In this work, we examine LLM alignment with human annotations in five offensive language datasets, comprising approximately 220K annotations. Our findings reveal that while demographic traits, particularly race, influence alignment, these effects are inconsistent across datasets and often entangled with other factors. Confounders -- such as document difficulty, annotator sensitivity, and within-group agreement -- account for more variation in alignment patterns than demographic traits alone. Specifically, alignment increases with higher annotator sensitivity and group agreement, while greater document difficulty corresponds to reduced alignment. Our results underscore the importance of multi-dataset analyses and confounder-aware methodologies in developing robust measures of demographic bias in LLMs.

* 18 pages, 8 figures, ACL'25
