Title: Investigating Bias In Automatic Toxic Comment Detection: An Empirical Study

Aug 14, 2021

Ayush Kumar, Pratik Kumar

Abstract: With the surge in online platforms, user engagement on these platforms via comments and reactions has risen sharply. A large portion of these textual comments are abusive, rude, and offensive to the audience. When machine learning systems are put in place to screen comments arriving on a platform, biases present in the training data are passed on to the classifier, leading to discrimination against a set of classes, religions, and genders. In this work, we evaluate different classifiers and features to estimate the bias in these classifiers along with their performance on the downstream task of toxicity classification. Results show that improvement in the performance of automatic toxic comment detection models is positively correlated with the mitigation of bias in these models. In our work, an LSTM with an attention mechanism proved to be a better modelling strategy than a CNN model. Further analysis shows that fastText embeddings are marginally preferable to GloVe embeddings for training toxic comment detection models. Deeper analysis reveals that such automatic models are particularly biased with respect to specific identity groups even when the model has a high AUC score. Finally, in an effort to mitigate bias in toxicity detection models, a multi-task setup trained with an auxiliary task of toxicity sub-types proved useful, leading to a gain of up to 0.26% (6% relative) in AUC scores.
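The observation that a model can score a high overall AUC while still being biased with respect to specific identity groups can be illustrated with a small sketch. It assumes bias is probed by recomputing AUC on only the comments that mention a given identity group; the abstract does not state which bias metric the authors used, so this is one plausible reading, not their method. The function names, the group names `identity_x` and `identity_y`, and the toy data are all hypothetical.

```python
from typing import Dict, Sequence


def auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """ROC AUC as the probability that a random toxic comment is scored
    higher than a random non-toxic one (ties count as half a win)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative example")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def subgroup_auc(
    labels: Sequence[int],
    scores: Sequence[float],
    mentions: Dict[str, Sequence[bool]],
) -> Dict[str, float]:
    """AUC restricted to the comments that mention each identity group."""
    result = {}
    for group, mask in mentions.items():
        group_labels = [y for y, m in zip(labels, mask) if m]
        group_scores = [s for s, m in zip(scores, mask) if m]
        result[group] = auc(group_labels, group_scores)
    return result


# Toy data (hypothetical): 1 = toxic, 0 = non-toxic, scores are model outputs.
labels = [1, 0, 1, 0, 1, 0, 1, 0]
scores = [0.9, 0.1, 0.8, 0.2, 0.6, 0.7, 0.95, 0.05]
mentions = {
    "identity_x": [False, False, True, False, True, True, False, False],
    "identity_y": [True, True, False, False, False, False, True, True],
}

overall = auc(labels, scores)
per_group = subgroup_auc(labels, scores, mentions)
print(f"overall AUC: {overall:.4f}")
for group, value in per_group.items():
    print(f"{group} AUC: {value:.4f}")
```

On this toy data the overall AUC is high, while the AUC on comments mentioning `identity_x` is no better than chance. A gap of this kind stays hidden when only the aggregate score is reported.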
