Abstract:In this study, we introduce ANGST, a novel, first-of-its kind benchmark for depression-anxiety comorbidity classification from social media posts. Unlike contemporary datasets that often oversimplify the intricate interplay between different mental health disorders by treating them as isolated conditions, ANGST enables multi-label classification, allowing each post to be simultaneously identified as indicating depression and/or anxiety. Comprising 2876 meticulously annotated posts by expert psychologists and an additional 7667 silver-labeled posts, ANGST posits a more representative sample of online mental health discourse. Moreover, we benchmark ANGST using various state-of-the-art language models, ranging from Mental-BERT to GPT-4. Our results provide significant insights into the capabilities and limitations of these models in complex diagnostic scenarios. While GPT-4 generally outperforms other models, none achieve an F1 score exceeding 72% in multi-class comorbid classification, underscoring the ongoing challenges in applying language models to mental health diagnostics.
Abstract:In this paper we present an approach for estimating the quality of machine translation system. There are various methods for estimating the quality of output sentences, but in this paper we focus on Na\"ive Bayes classifier to build model using features which are extracted from the input sentences. These features are used for finding the likelihood of each of the sentences of the training data which are then further used for determining the scores of the test data. On the basis of these scores we determine the class labels of the test data.
Abstract:This paper considers the problem for estimating the quality of machine translation outputs which are independent of human intervention and are generally addressed using machine learning techniques.There are various measures through which a machine learns translations quality. Automatic Evaluation metrics produce good co-relation at corpus level but cannot produce the same results at the same segment or sentence level. In this paper 16 features are extracted from the input sentences and their translations and a quality score is obtained based on Bayesian inference produced from training data.