Abstract:Self-supervised learning (SSL) has been investigated to generate task-agnostic representations across various domains. However, such investigation has not been conducted for detecting multiple mental disorders. The rationale behind the existence of a task-agnostic representation lies in the overlapping symptoms among multiple mental disorders. Consequently, the behavioural data collected for mental health assessment may carry a mixed bag of attributes related to multiple disorders. Motivated by that, in this study, we explore a task-agnostic representation derived through SSL in the context of detecting major depressive disorder (MDD) and post-traumatic stress disorder (PTSD) using audio and video data collected during interactive sessions. This study employs SSL models trained by predicting multiple fixed targets or masked frames. We propose a list of fixed targets to make the generated representation more efficient for detecting MDD and PTSD. Furthermore, we modify the hyper-parameters of the SSL encoder predicting fixed targets to generate global representations that capture varying temporal contexts. Both these innovations are noted to yield improved detection performances for considered mental disorders and exhibit task-agnostic traits. In the context of the SSL model predicting masked frames, the generated global representations are also noted to exhibit task-agnostic traits.
Abstract:The primary method for identifying mental disorders automatically has traditionally involved using binary classifiers. These classifiers are trained using behavioral data obtained from an interview setup. In this training process, data from individuals with the specific disorder under consideration are categorized as the positive class, while data from all other participants constitute the negative class. In practice, it is widely recognized that certain mental disorders share similar symptoms, causing the collected behavioral data to encompass a variety of attributes associated with multiple disorders. Consequently, attributes linked to the targeted mental disorder might also be present within the negative class. This data impurity may lead to sub-optimal training of the classifier for a mental disorder of interest. In this study, we investigate this hypothesis in the context of major depressive disorder (MDD) and post-traumatic stress disorder detection (PTSD). The results show that upon removal of such data impurity, MDD and PTSD detection performances are significantly improved.
Abstract:Several studies have been reported in the literature for the automatic detection of mental disorders. It is reported that mental disorders are highly correlated. The exploration of this fact for the automatic detection of mental disorders is yet to explore. Emotion regulation difficulties (ERD) characterize several mental disorders. Motivated by that, we investigated the use of ERD for the detection of two opted mental disorders in this study. For this, we have collected audio-video data of human subjects while conversing with a computer agent based on a specific questionnaire. Subsequently, a subject's responses are collected to obtain the ground truths of the audio-video data of that subject. The results indicate that the ERD can be used as an intermediate representation of audio-video data for detecting mental disorders.