Title: On the Role of Speech Data in Reducing Toxicity Detection Bias

Nov 12, 2024

Samuel J. Bell, Mariano Coria Meglioli, Megan Richards, Eduardo Sánchez, Christophe Ropers, Skyler Wang, Adina Williams, Levent Sagun, Marta R. Costa-jussà

Abstract: Text toxicity detection systems exhibit significant biases, producing disproportionate rates of false positives on samples mentioning demographic groups. But what about toxicity detection in speech? To investigate the extent to which text-based biases are mitigated by speech-based systems, we produce a set of high-quality group annotations for the multilingual MuTox dataset, and then leverage these annotations to systematically compare speech- and text-based toxicity classifiers. Our findings indicate that access to speech data during inference supports reduced bias against group mentions, particularly for ambiguous and disagreement-inducing samples. Our results also suggest that improving classifiers, rather than transcription pipelines, is more helpful for reducing group bias. We publicly release our annotations and provide recommendations for future toxicity dataset construction.
