Abstract:The paper describes the best performing system for the SemEval-2018 Affect in Tweets (English) sub-tasks. The system focuses on the ordinal classification and regression sub-tasks for valence and emotion. For ordinal classification valence is classified into 7 different classes ranging from -3 to 3 whereas emotion is classified into 4 different classes 0 to 3 separately for each emotion namely anger, fear, joy and sadness. The regression sub-tasks estimate the intensity of valence and each emotion. The system performs domain adaptation of 4 different models and creates an ensemble to give the final prediction. The proposed system achieved 1st position out of 75 teams which participated in the fore-mentioned sub-tasks. We outperform the baseline model by margins ranging from 49.2% to 76.4%, thus, pushing the state-of-the-art significantly.
Abstract:An Unreliable news is any piece of information which is false or misleading, deliberately spread to promote political, ideological and financial agendas. Recently the problem of unreliable news has got a lot of attention as the number instances of using news and social media outlets for propaganda have increased rapidly. This poses a serious threat to society, which calls for technology to automatically and reliably identify unreliable news sources. This paper is an effort made in this direction to build systems for detecting unreliable news articles. In this paper, various NLP algorithms were built and evaluated on Unreliable News Data 2017 dataset. Variants of hierarchical attention networks (HAN) are presented for encoding and classifying news articles which achieve the best results of 0.944 ROC-AUC. Finally, Attention layer weights are visualized to understand and give insight into the decisions made by HANs. The results obtained are very promising and encouraging to deploy and use these systems in the real world to mitigate the problem of unreliable news.
Abstract:The paper describes experiments on estimating emotion intensity in tweets using a generalized regressor system. The system combines lexical, syntactic and pre-trained word embedding features, trains them on general regressors and finally combines the best performing models to create an ensemble. The proposed system stood 3rd out of 22 systems in the leaderboard of WASSA-2017 Shared Task on Emotion Intensity.
Abstract:This paper presents models for detecting agreement/disagreement in online discussions. In this work we show that by using a Siamese inspired architecture to encode the discussions, we no longer need to rely on hand-crafted features to exploit the meta thread structure. We evaluate our model on existing online discussion corpora - ABCD, IAC and AWTP. Experimental results on ABCD dataset show that by fusing lexical and word embedding features, our model achieves the state of the art performance of 0.804 average F1 score. We also show that the model trained on ABCD dataset performs competitively on relatively smaller annotated datasets (IAC and AWTP).