Abstract: Online gender-based violence has grown concomitantly with the adoption of the internet and social media. Its effects are worse in the Global Majority, where many users use social media in languages other than English. The scale and volume of conversations on the internet have necessitated automated detection of hate speech and, more specifically, gendered abuse. There is, however, a lack of language-specific and contextual data to build such automated tools. In this paper we present a dataset on gendered abuse in three languages: Hindi, Tamil and Indian English. The dataset comprises tweets annotated along three questions pertaining to the experience of gender abuse, by experts who identify as women or as members of the LGBTQIA community in South Asia. Through this dataset we demonstrate a participatory approach to creating datasets that drive AI systems.
Abstract: In this paper, we build models incrementally using online outlier detection algorithms in a streaming environment. We identify a strong need for streaming models to handle streaming data. The objective of this project is to study and analyze the importance of streaming models that are applicable in real-world environments. In this work, we build various outlier detection (OD) algorithms, viz., One-Class Support Vector Machine (OC-SVM), Isolation Forest Adaptive Sliding Window approach (IForest ASD), Exact Storm, Angle-Based Outlier Detection (ABOD), Local Outlier Factor (LOF), KitNet, and KNN ASD. We evaluate the effectiveness and validity of these models on various finance problems, such as credit card fraud detection, churn prediction, and Ethereum fraud prediction. Further, we analyze the performance of the models on health care prediction problems, such as heart stroke prediction and diabetes prediction. The results show that the models perform well on highly imbalanced datasets, in which the negative class is the majority and the positive class is the minority. Among all the models, the ensemble-based IForest ASD model performed better than the others in most cases, ranking among the top three models in almost all cases.
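To illustrate the idea of scoring points against a sliding window of recent data, the following is a minimal sketch of a distance-based detector. It is loosely in the spirit of window-based methods such as Exact Storm, but it is a simplification and not the implementation evaluated in the paper: it only judges each point on arrival, and the class name, parameters (`window_size`, `radius`, `k`) and toy stream are all assumptions made for this example.

```python
from collections import deque
import math


class SlidingWindowDistanceDetector:
    """Flags a new point as an outlier when fewer than k points in the
    current window lie within `radius` of it. Names are hypothetical."""

    def __init__(self, window_size, radius, k):
        # deque with maxlen drops the oldest point automatically
        self.window = deque(maxlen=window_size)
        self.radius = radius
        self.k = k

    def update(self, point):
        """Score the incoming point against the window, then store it."""
        neighbours = sum(
            1 for other in self.window
            if math.dist(point, other) <= self.radius
        )
        # Only judge once the window holds at least k points
        is_outlier = len(self.window) >= self.k and neighbours < self.k
        self.window.append(point)
        return is_outlier


# Toy stream: 30 tightly clustered points followed by one distant point
detector = SlidingWindowDistanceDetector(window_size=50, radius=1.0, k=3)
stream = [(0.1 * (i % 5), 0.1 * (i % 3)) for i in range(30)] + [(10.0, 10.0)]
flags = [detector.update(p) for p in stream]
```

In this toy stream only the final, distant point is flagged. A fuller streaming detector would also re-evaluate stored points as their neighbours expire from the window.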
Abstract: Graph neural networks (GNNs) are achieving remarkable performance in a variety of application domains. However, GNNs are vulnerable to noise and adversarial attacks in input data. Making GNNs robust against noise and adversarial attacks is an important problem. Existing defense methods for GNNs are computationally demanding and not scalable. In this paper, we propose a generic framework for robustifying GNNs known as Weighted Laplacian GNN (RWL-GNN). The method combines weighted graph Laplacian learning with the GNN implementation. The proposed method benefits from the positive semi-definiteness of the Laplacian matrix, feature smoothness, and latent features by formulating a unified optimization framework, which ensures that adversarial/noisy edges are discarded and connections in the graph are appropriately weighted. For demonstration, the experiments are conducted with the graph convolutional neural network (GCNN) architecture; however, the proposed framework is easily amenable to any existing GNN architecture. The simulation results with benchmark data establish the efficacy of the proposed method in both accuracy and computational efficiency. Code can be accessed at https://github.com/Bharat-Runwal/RWL-GNN.
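As a small illustration of two properties named above, the sketch below builds a weighted graph Laplacian L = D - W and evaluates the standard feature-smoothness term tr(XᵀLX), which equals the sum over edges of w_ij·||x_i - x_j||². This is not the RWL-GNN optimization itself; the function names, the three-node graph and the feature values are assumptions chosen for this example.

```python
def weighted_laplacian(n, weighted_edges):
    """Build L = D - W for an undirected graph with non-negative weights."""
    L = [[0.0] * n for _ in range(n)]
    for i, j, w in weighted_edges:
        L[i][j] -= w
        L[j][i] -= w
        L[i][i] += w
        L[j][j] += w
    return L


def smoothness(L, X):
    """Compute tr(X^T L X); non-negative because L is positive semi-definite."""
    n, d = len(L), len(X[0])
    total = 0.0
    for c in range(d):
        for i in range(n):
            for j in range(n):
                total += X[i][c] * L[i][j] * X[j][c]
    return total


# Three nodes with 1-D features; node 2 is far from nodes 0 and 1 in feature space
X = [[0.0], [0.1], [5.0]]

# Edge (1, 2) joins dissimilar nodes, standing in for a noisy/adversarial edge
noisy = smoothness(weighted_laplacian(3, [(0, 1, 1.0), (1, 2, 1.0)]), X)

# Setting that edge's weight to zero discards it and lowers the smoothness term
clean = smoothness(weighted_laplacian(3, [(0, 1, 1.0), (1, 2, 0.0)]), X)
```

Here the edge joining dissimilar nodes dominates the smoothness term, so an objective that penalizes tr(XᵀLX) while learning edge weights has an incentive to shrink that edge's weight.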