Prathyush Potluri

Offense Detection in Dravidian Languages using Code-Mixing Index based Focal Loss

Nov 12, 2021

Debapriya Tula, Shreyas MS, Viswanatha Reddy, Pranjal Sahu, Sumanth Doddapaneni, Prathyush Potluri, Rohan Sukumaran, Parth Patwa

Abstract: Over the past decade, we have seen exponential growth in online content fueled by social media platforms. Data generation at this scale comes with the caveat of an overwhelming amount of offensive content. Identifying offensive content is made harder by the use of multiple modalities (image, language, etc.), code-mixed language, and more. Moreover, even if we carefully sample and annotate offensive content, there will always be significant class imbalance between offensive and non-offensive content. In this paper, we introduce a novel Code-Mixing Index (CMI) based focal loss which addresses two challenges in Dravidian language offense detection: (1) code mixing in languages and (2) class imbalance. We also replace the conventional dot-product-based classifier with a cosine-based classifier, which results in a boost in performance. Further, we use multilingual models that help transfer characteristics learnt across languages to work effectively with low-resource languages. It is also important to note that our model handles instances of mixed script (say, Latin together with a Dravidian script such as Tamil) as well. Our model can handle offensive language detection in a low-resource, class-imbalanced, multilingual, and code-mixed setting.
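
The ingredients named in this abstract can be illustrated with a minimal sketch. It is not the paper's implementation. The CMI function follows the commonly cited Das and Gambäck definition, and the focal loss is the standard form. The way CMI enters the loss (`cmi_weighted_focal_loss`), the `"univ"` tag for language-independent tokens, and the `scale` parameter of the cosine classifier are assumptions made here for illustration, since the abstract does not give the exact formulation.

```python
import math


def code_mixing_index(lang_tags):
    """CMI = 100 * (1 - max_lang_count / (n - u)), or 0 when n == u.

    lang_tags holds one language tag per token; the tag "univ" (a
    hypothetical label) marks language-independent tokens such as
    punctuation, numbers, or emoji.
    """
    n = len(lang_tags)
    counts = {}
    for tag in lang_tags:
        if tag != "univ":
            counts[tag] = counts.get(tag, 0) + 1
    u = n - sum(counts.values())  # number of language-independent tokens
    if n == u:
        return 0.0
    return 100.0 * (1.0 - max(counts.values()) / (n - u))


def focal_loss(p_true, gamma=2.0, alpha=1.0):
    """Standard focal loss for the probability assigned to the true class."""
    return -alpha * (1.0 - p_true) ** gamma * math.log(p_true)


def cmi_weighted_focal_loss(p_true, cmi, gamma=2.0):
    """ASSUMPTION: up-weight heavily code-mixed samples by (1 + CMI/100).

    The paper's actual combination of CMI and focal loss may differ.
    """
    return (1.0 + cmi / 100.0) * focal_loss(p_true, gamma=gamma)


def cosine_logit(features, class_weight, scale=10.0):
    """Cosine-based classifier logit: scaled cosine similarity between the
    feature vector and a class weight vector, replacing the raw dot product."""
    dot = sum(f * w for f, w in zip(features, class_weight))
    norm_f = math.sqrt(sum(f * f for f in features))
    norm_w = math.sqrt(sum(w * w for w in class_weight))
    return scale * dot / (norm_f * norm_w)
```

With `gamma=0` the focal loss reduces to ordinary cross-entropy, and larger `gamma` shrinks the contribution of examples the model already classifies confidently, which is how focal loss counters class imbalance. The cosine logit depends only on the angle between the two vectors, so its value is unaffected by the magnitude of the features or the class weights.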


Two Stage Transformer Model for COVID-19 Fake News Detection and Fact Checking

Nov 26, 2020

Rutvik Vijjali, Prathyush Potluri, Siddharth Kumar, Sundeep Teki

Abstract: The rapid advancement of technology in online communication via social media platforms has led to a prolific rise in the spread of misinformation and fake news. Fake news is especially rampant in the current COVID-19 pandemic, leading people to believe false and potentially harmful claims and stories. Detecting fake news quickly can alleviate the spread of panic, chaos, and potential health hazards. We developed a two-stage automated pipeline for COVID-19 fake news detection using state-of-the-art machine learning models for natural language processing. The first model leverages a novel fact-checking algorithm that retrieves the most relevant facts concerning user claims about COVID-19. The second model verifies the level of truth in the claim by computing the textual entailment between the claim and the true facts retrieved from a manually curated COVID-19 dataset. The dataset is based on a publicly available knowledge source consisting of more than 5000 COVID-19 false claims and verified explanations, a subset of which was internally annotated and cross-validated to train and evaluate our models. We evaluate a series of models, ranging from those based on classical text-based features to more contextual Transformer-based models, and observe that a model pipeline based on BERT and ALBERT for the two stages respectively yields the best results.
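
The two-stage structure described in this abstract (retrieve relevant facts, then check entailment against them) can be sketched as follows. This is only an outline under stated assumptions: the function names, the `k` and `threshold` parameters, and the bag-of-words cosine similarity are hypothetical stand-ins chosen so the example runs with the standard library alone. The paper reports BERT and ALBERT models for the two stages, which would replace `similarity` and `entail_score` here.

```python
import math
from collections import Counter


def bow_cosine(a, b):
    """Bag-of-words cosine similarity between two strings.

    A stand-in for a learned sentence-similarity model (assumption).
    """
    ca, cb = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(ca[t] * cb[t] for t in ca)
    na = math.sqrt(sum(v * v for v in ca.values()))
    nb = math.sqrt(sum(v * v for v in cb.values()))
    return dot / (na * nb) if na and nb else 0.0


def retrieve_facts(claim, facts, k=2, similarity=bow_cosine):
    """Stage 1: return the k facts most similar to the claim."""
    ranked = sorted(facts, key=lambda fact: similarity(claim, fact), reverse=True)
    return ranked[:k]


def verify_claim(claim, facts, entail_score, k=2, threshold=0.5):
    """Stage 2: score entailment between the claim and each retrieved fact.

    entail_score(claim, fact) should return a value in [0, 1]; the claim is
    treated as supported if any retrieved fact scores at or above threshold
    (a decision rule assumed here for illustration).
    """
    top_facts = retrieve_facts(claim, facts, k=k)
    best = max((entail_score(claim, fact) for fact in top_facts), default=0.0)
    return best >= threshold, top_facts
```

Keeping the two stages behind plain function arguments means either model can be swapped independently, which mirrors how the abstract evaluates different model choices per stage.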
