Abstract:Language has always been one of humanity's defining characteristics. Visual Language Identification (VLI) is a relatively new field of research that is complex and largely understudied. In this paper, we present a preliminary study in which we use linguistic information as a soft biometric trait to enhance the performance of a visual (auditory-free) identification system based on lip movement. We report a significant improvement in the identification performance of the proposed visual system as a result of the integration of these data using a score-based fusion strategy. Methods of Deep and Machine Learning are considered and evaluated. To the experimentation purposes, the dataset called laBial Articulation for the proBlem of the spokEn Language rEcognition (BABELE), consisting of eight different languages, has been created. It includes a collection of different features of which the spoken language represents the most relevant, while each sample is also manually labelled with gender and age of the subjects.
Abstract:Head pose estimation is a crucial challenge for many real-world applications, such as attention and human behavior analysis. This paper aims to estimate head pose from a single image by applying notions of network curvature. In the real world, many complex networks have groups of nodes that are well connected to each other with significant functional roles. Similarly, the interactions of facial landmarks can be represented as complex dynamic systems modeled by weighted graphs. The functionalities of such systems are therefore intrinsically linked to the topology and geometry of the underlying graph. In this work, using the geometric notion of Ollivier-Ricci curvature (ORC) on weighted graphs as input to the XGBoost regression model, we show that the intrinsic geometric basis of ORC offers a natural approach to discovering underlying common structure within a pool of poses. Experiments on the BIWI, AFLW2000 and Pointing'04 datasets show that the ORC_XGB method performs well compared to state-of-the-art methods, both landmark-based and image-only.