Title: NaijaHate: Evaluating Hate Speech Detection on Nigerian Twitter Using Representative Data

Mar 28, 2024

Manuel Tonneau, Pedro Vitor Quinta de Castro, Karim Lasri, Ibrahim Farouq, Lakshminarayanan Subramanian, Victor Orozco-Olvera, Samuel Fraiberger

Abstract: To address the global issue of hateful content proliferating on online platforms, hate speech detection (HSD) models are typically developed on datasets collected in the United States, and they therefore fail to generalize to English dialects from the Majority World. Furthermore, HSD models are often evaluated on curated samples, raising concerns about overestimating model performance in real-world settings. In this work, we introduce NaijaHate, the first dataset annotated for HSD which contains a representative sample of Nigerian tweets. We demonstrate that HSD evaluated on biased datasets traditionally used in the literature largely overestimates real-world performance on representative data. We also propose NaijaXLM-T, a pretrained model tailored to the Nigerian Twitter context, and establish the key role played by domain-adaptive pretraining and finetuning in maximizing HSD performance. Finally, we show that in this context, a human-in-the-loop approach to content moderation where humans review 1% of Nigerian tweets flagged as hateful would make it possible to moderate 60% of all hateful content. Taken together, these results pave the way towards robust HSD systems and better protection of social media users from hateful content in low-resource settings.
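
The human-in-the-loop figure in the abstract is a trade-off between a review budget and the share of hateful content caught. The sketch below is one possible way to compute that kind of quantity, not the authors' evaluation code: it ranks items by a model's hatefulness score, assumes reviewers inspect only the top-scoring fraction, and reports what share of all hateful items falls inside that fraction. The function name `recall_at_review_budget`, the `budget` parameter, and the toy scores and labels are all hypothetical and do not come from the paper.

```python
from typing import Sequence


def recall_at_review_budget(
    scores: Sequence[float], labels: Sequence[int], budget: float
) -> float:
    """Share of all hateful items found when reviewers inspect only the
    top `budget` fraction of items, ranked by model score (hypothetical helper)."""
    if len(scores) != len(labels):
        raise ValueError("scores and labels must have the same length")
    if not 0.0 < budget <= 1.0:
        raise ValueError("budget must be in (0, 1]")

    total_hateful = sum(labels)
    if total_hateful == 0:
        return 0.0  # nothing to catch

    # Number of items the reviewers can look at (at least one).
    n_reviewed = max(1, int(len(scores) * budget))

    # Highest-scoring items are sent to review first.
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    caught = sum(label for _, label in ranked[:n_reviewed])
    return caught / total_hateful


# Toy data (invented for illustration): 1 = hateful, 0 = not hateful.
toy_scores = [0.9, 0.8, 0.7, 0.2, 0.1, 0.05, 0.6, 0.3, 0.4, 0.15]
toy_labels = [1, 0, 1, 0, 0, 0, 1, 0, 0, 0]

for b in (0.2, 0.5, 1.0):
    print(f"budget={b:.0%} recall={recall_at_review_budget(toy_scores, toy_labels, b):.2f}")
```

On a representative sample, where hateful items are rare, sweeping `budget` over small values gives a curve of this kind; the specific 1% and 60% figures above are the paper's reported result and are not reproduced by this toy data.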
