Abstract: Large Language Models (LLMs) hold considerable promise for scalable content moderation, including hate speech detection. However, they are also known to be brittle and biased against marginalised communities and dialects. This calls for their application to high-stakes tasks like hate speech detection to be critically scrutinised. In this work, we investigate the robustness of hate speech classification using LLMs, particularly when explicit and implicit markers of the speaker's ethnicity are injected into the input. For the explicit markers, we inject a phrase that mentions the speaker's identity; for the implicit markers, we inject dialectal features. By analysing how frequently model outputs flip in the presence of these markers, we reveal varying degrees of brittleness across four popular LLMs and five ethnicities. We find that implicit dialect markers in the input cause model outputs to flip more often than explicit markers, and that the percentage of flips varies across ethnicities. Finally, we find that larger models are more robust. Our findings indicate the need to exercise caution when deploying LLMs for high-stakes tasks like hate speech detection.
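To make the flip-rate measurement concrete, the sketch below shows one way the robustness check described above could be implemented. The `classify` callable, the marker-injection functions, and the toy keyword classifier are illustrative assumptions, not the paper's actual models, prompts, or dialect transformations.

```python
from typing import Callable, Iterable


def flip_rate(
    texts: Iterable[str],
    inject_marker: Callable[[str], str],
    classify: Callable[[str], str],
) -> float:
    """Fraction of inputs whose predicted label changes after the marker is injected."""
    flips, total = 0, 0
    for text in texts:
        original = classify(text)
        perturbed = classify(inject_marker(text))
        flips += int(original != perturbed)
        total += 1
    return flips / max(total, 1)


def explicit_marker(text: str) -> str:
    # Explicit marker: a phrase naming the speaker's identity (assumed wording).
    return "As a Black person, " + text


def implicit_marker(text: str) -> str:
    # Implicit marker: a toy stand-in for dialectal rewriting; a real study would
    # use a dialect-transformation tool or human rewrites rather than this rule.
    return text.replace(" is not ", " ain't ")


if __name__ == "__main__":
    # Toy keyword classifier standing in for an LLM-backed zero-shot classifier;
    # it flags the dialect feature itself, illustrating the kind of brittleness probed.
    def toy_classify(text: str) -> str:
        return "hate" if "ain't" in text else "not hate"

    sample = ["This film is not good at all."]
    print(flip_rate(sample, implicit_marker, toy_classify))  # -> 1.0 (label flips)
```

In practice, `classify` would wrap an LLM prompt and the flip rate would be compared across explicit and implicit markers and across ethnicities.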
Abstract: Language Models have ushered in a new age of AI, gaining traction within the NLP community as well as amongst the general population. Their ability to make predictions, generate text, and support sensitive decision-making scenarios makes it all the more important to study these models for biases that may exist and be amplified. We conduct a comparative study and establish a framework to evaluate language models with respect to two kinds of bias, gender and race, in a professional setting. We find that while gender bias has been reduced substantially in newer models compared to older ones, racial bias persists.
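As one possible instantiation of such an evaluation framework, the sketch below probes a model with profession templates whose subject name signals a demographic group, so continuations can be compared across groups. The template, the name lists, and the `generate` placeholder are hypothetical and are not taken from the study.

```python
from itertools import product
from typing import Callable, Dict, Tuple

# Hypothetical probe template and name lists (assumptions for illustration only).
TEMPLATE = "{name} works as a {profession}. Colleagues describe {name} as"
NAMES = {"male": "James", "female": "Emily"}  # gender-signalling first names (assumed)
PROFESSIONS = ["nurse", "engineer", "CEO", "teacher"]


def collect_continuations(
    generate: Callable[[str], str],
) -> Dict[Tuple[str, str], str]:
    """One model continuation per (group, profession) cell, for side-by-side comparison."""
    results: Dict[Tuple[str, str], str] = {}
    for (group, name), profession in product(NAMES.items(), PROFESSIONS):
        prompt = TEMPLATE.format(name=name, profession=profession)
        results[(group, profession)] = generate(prompt)
    return results


if __name__ == "__main__":
    # Echoing generator standing in for a real language-model completion call.
    outputs = collect_continuations(lambda prompt: "<continuation for: " + prompt + ">")
    for cell, continuation in outputs.items():
        print(cell, continuation)
```

The same loop extends to race by swapping in name lists that signal different racial groups, with the resulting continuations compared qualitatively or scored for sentiment and competence-related language.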