Abstract: Lies and deception are common in society, in both our private and professional lives. However, humans are notoriously poor at detecting deception. According to the literature, humans distinguish lies from truthful statements with an average accuracy of 54%, only slightly better than a random guess. While this is of little concern in everyday situations, in high-stakes situations such as interrogations for serious crimes and the evaluation of testimony in court cases, accurate deception detection methods are highly desirable. To achieve reliable, covert, and non-invasive deception detection, we propose a novel method that jointly extracts reliable low- and high-level facial features, namely 3D facial geometry, skin reflectance, expression, head pose, and scene illumination, from a video sequence. These features are then modeled using a Recurrent Neural Network to learn the temporal characteristics of deceptive and honest behavior. We evaluate the proposed method on the Real-Life Trial (RLT) dataset, which contains high-stakes deceptive and honest videos recorded in courtrooms. Our results show that the proposed method, with an accuracy of 72.8%, improves on the state of the art and outperforms the use of manually coded facial attributes (67.6%) in deception detection.
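As a rough illustration of the temporal modeling step, the following minimal sketch runs a plain (Elman-style) recurrent network over a sequence of per-frame feature vectors and outputs a probability for the "deceptive" class. It is not the architecture used in this work: the abstract does not specify the recurrent cell, layer sizes, or training procedure, so the function names, dimensions, untrained random weights, and synthetic input features below are all assumptions made for illustration.

```python
import math
import random


def rnn_classify(frames, w_in, w_rec, w_out):
    """Run a plain recurrent network over per-frame feature vectors.

    Returns the probability, in (0, 1), assigned to the 'deceptive' class.
    All names and shapes here are hypothetical, not taken from the paper.
    """
    hidden = [0.0] * len(w_rec)
    for x in frames:  # one feature vector per video frame
        # New hidden state depends on the current frame and the previous state.
        hidden = [
            math.tanh(
                sum(w * xi for w, xi in zip(w_in[j], x))
                + sum(w * hi for w, hi in zip(w_rec[j], hidden))
            )
            for j in range(len(w_rec))
        ]
    # Logistic read-out of the final hidden state.
    logit = sum(w * h for w, h in zip(w_out, hidden))
    return 1.0 / (1.0 + math.exp(-logit))


def random_matrix(rows, cols, rng, scale=0.1):
    """Small random weights; a real model would learn these from data."""
    return [[rng.uniform(-scale, scale) for _ in range(cols)] for _ in range(rows)]


rng = random.Random(0)
FEATURE_DIM, HIDDEN_DIM, NUM_FRAMES = 8, 4, 30  # hypothetical sizes

w_in = random_matrix(HIDDEN_DIM, FEATURE_DIM, rng)
w_rec = random_matrix(HIDDEN_DIM, HIDDEN_DIM, rng)
w_out = [rng.uniform(-0.1, 0.1) for _ in range(HIDDEN_DIM)]

# Synthetic stand-in for the extracted facial features of one video.
frames = [[rng.gauss(0.0, 1.0) for _ in range(FEATURE_DIM)] for _ in range(NUM_FRAMES)]

p_deceptive = rnn_classify(frames, w_in, w_rec, w_out)
print(0.0 < p_deceptive < 1.0)
```

In practice, each frame's vector would concatenate the extracted geometry, reflectance, expression, pose, and illumination parameters, and the weights would be trained on labeled videos rather than drawn at random.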