Abstract: In cognitive psychology, automatic and self-reinforcing irrational thought patterns are known as cognitive distortions. Left unchecked, patients exhibiting these types of thoughts can become stuck in negative feedback loops of unhealthy thinking, leading to inaccurate perceptions of reality commonly associated with anxiety and depression. In this paper, we present a machine learning framework for the automatic detection and classification of 15 common cognitive distortions in two novel mental health free-text datasets, one collected through crowdsourcing and one from a real-world online therapy program. When differentiating between distorted and non-distorted passages, our model achieved a weighted F1 score of 0.88. For classifying distorted passages into one of 15 distortion categories, our model yielded weighted F1 scores of 0.68 on the larger crowdsourced dataset and 0.45 on the smaller online counseling dataset, both of which outperformed random baseline metrics by a large margin. For both tasks, we also identified the most discriminative words and phrases between classes to highlight common thematic elements for improving targeted and therapist-guided mental health treatment. Furthermore, we performed an exploratory analysis using unsupervised content-based clustering and topic modeling algorithms as a first step towards a data-driven perspective on the thematic relationships between similar cognitive distortions traditionally deemed unique. Finally, we highlight the difficulties of applying mental health-based machine learning in a real-world setting and comment on the implications and benefits of our framework for improving automated delivery of therapeutic treatment in conjunction with traditional cognitive-behavioral therapy.
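For readers unfamiliar with the metric reported above, the following is a minimal sketch of how a weighted F1 score is commonly computed: the per-class F1 scores are averaged with weights proportional to each class's share of the true labels. The function name `weighted_f1` and the toy labels are illustrative assumptions, not the authors' code or data.

```python
from collections import Counter


def weighted_f1(y_true, y_pred):
    """Support-weighted average of per-class F1 scores (illustrative sketch)."""
    support = Counter(y_true)  # number of true examples per class
    total = len(y_true)
    score = 0.0
    for label, n in support.items():
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = n - tp  # true examples of this class that were missed
        denom = 2 * tp + fp + fn
        f1 = 2 * tp / denom if denom else 0.0
        score += (n / total) * f1  # weight by the class's share of true labels
    return score


# Hypothetical toy labels, not drawn from either dataset in the paper.
y_true = ["distorted", "distorted", "distorted", "none"]
y_pred = ["distorted", "distorted", "none", "none"]
print(round(weighted_f1(y_true, y_pred), 4))
```

Because each class is weighted by its frequency, a model that performs well on frequent classes can obtain a high weighted F1 even when rare classes are poorly recognized, which is worth keeping in mind when reading scores for a 15-class task.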
Abstract: As the popularity of social media platforms continues to rise, an ever-increasing amount of human communication and self-expression takes place online. Most recent research has focused on mining social media for public user opinion about external entities, such as product reviews or sentiment towards political news. However, less attention has been paid to analyzing users' internalized thoughts and emotions from a mental health perspective. In this paper, we quantify the semantic difference between public Tweets and private mental health journals used in online cognitive behavioral therapy. We use deep transfer learning techniques to analyze the semantic gap between the two domains. We show that for the task of emotional valence prediction, social media can be successfully harnessed to create more accurate, robust, and personalized mental health models. Our results suggest that the semantic gap between public and private self-expression is small, and that using the abundance of available social media data is one way to overcome the small sample sizes of mental health data, which are commonly limited by availability and privacy concerns.