Abstract: Emotions are an essential part of human behavior and can affect thinking, decision-making, and communication. The ability to accurately monitor and identify emotions is therefore useful in many human-centered applications, such as behavioral training, tracking emotional well-being, and the development of human-computer interfaces. The correlation between patterns in physiological data and affective states has enabled the use of deep learning techniques that can accurately detect a person's affective state. However, the generalisability of existing models is often limited by subject-dependent noise in the physiological data, which arises from variations in how individual subjects react to stimuli. Hence, we propose a novel cost function that employs Optimal Transport Theory, specifically the Wasserstein distance, to scale the importance of subject-dependent data: higher importance is assigned to patterns that are common across all participants, and lower importance to patterns that result from subject-dependent noise. The performance of the proposed cost function is demonstrated through an autoencoder with a multi-class classifier attached to the latent space and trained simultaneously to detect different affective states. An autoencoder trained with a state-of-the-art loss function, i.e., Mean Squared Error, is used as a baseline for comparison with our model across four commonly used datasets. The centroid distance and the minimum distance between classes are used as metrics to indicate the separation between different classes in the latent space. Averaged over all datasets, the proposed loss function increased the minimum and centroid Euclidean distances by 14.75% and 17.75%, respectively, relative to the baseline.
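
To make the two separation metrics concrete, the following is a minimal sketch of how they might be computed from latent vectors. It is not taken from the paper's implementation. It assumes that the centroid distance is the Euclidean distance between class means and that the minimum distance is the smallest Euclidean distance between any two points from different classes. The function names `centroid` and `class_separation` and the toy data are hypothetical.

```python
from itertools import combinations
from math import dist  # Euclidean distance, Python 3.8+


def centroid(points):
    """Component-wise mean of a list of equal-length vectors."""
    n = len(points)
    return [sum(p[k] for p in points) / n for k in range(len(points[0]))]


def class_separation(latents, labels):
    """Map each class pair to (centroid distance, minimum distance).

    Assumed definitions (not confirmed by the source):
      - centroid distance: Euclidean distance between the two class means
      - minimum distance: smallest Euclidean distance between a point of
        one class and a point of the other
    """
    # Group latent vectors by their class label.
    by_class = {}
    for z, y in zip(latents, labels):
        by_class.setdefault(y, []).append(z)

    result = {}
    for a, b in combinations(sorted(by_class), 2):
        centroid_d = dist(centroid(by_class[a]), centroid(by_class[b]))
        min_d = min(dist(p, q) for p in by_class[a] for q in by_class[b])
        result[(a, b)] = (centroid_d, min_d)
    return result


# Toy 2-D latent space with two classes (illustrative values only).
latents = [[0.0, 0.0], [0.0, 2.0], [3.0, 0.0], [5.0, 0.0]]
labels = [0, 0, 1, 1]
separation = class_separation(latents, labels)
```

Under these assumed definitions, larger values of either metric indicate that the classes lie further apart in the latent space, which is the sense in which the reported percentage increases would reflect better class separation.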