Xavier Baró

AI Competitions and Benchmarks: Dataset Development

Apr 15, 2024

Romain Egele, Julio C. S. Jacques Junior, Jan N. van Rijn, Isabelle Guyon, Xavier Baró, Albert Clapés, Prasanna Balaprakash, Sergio Escalera, Thomas Moeslund, Jun Wan

Abstract: Machine learning is now used in many applications thanks to its ability to predict, generate, or discover patterns from large quantities of data. However, the process of collecting and transforming data for practical use is intricate. Even in today's digital era, where substantial data is generated daily, it is uncommon for it to be readily usable; most often, it necessitates meticulous manual data preparation. The haste in developing new models can frequently result in various shortcomings, potentially posing risks when deployed in real-world scenarios (e.g., social discrimination, critical failures), leading to the failure or substantial escalation of costs in AI-based projects. This chapter provides a comprehensive overview of established methodological tools, enriched by our practical experience, in the development of datasets for machine learning. First, we lay out the tasks involved in dataset development and offer insights into their effective management (including requirements, design, implementation, evaluation, distribution, and maintenance). Then, we provide more details about the implementation process, which includes data collection, transformation, and quality evaluation. Finally, we address practical considerations regarding dataset distribution and maintenance.

* Preprint version of the 3rd Chapter of the book: Competitions and Benchmarks, the science behind the contests (https://sites.google.com/chalearn.org/book/home)

Person Perception Biases Exposed: Revisiting the First Impressions Dataset

Nov 30, 2020

Julio C. S. Jacques Junior, Agata Lapedriza, Cristina Palmero, Xavier Baró, Sergio Escalera

Abstract: This work revisits the ChaLearn First Impressions database, annotated for personality perception using pairwise comparisons via crowdsourcing. We analyse for the first time the original pairwise annotations, and reveal existing person perception biases associated with perceived attributes like gender, ethnicity, age and face attractiveness. We show how person perception bias can influence data labelling of a subjective task, which has received little attention from the computer vision and machine learning communities so far. We further show that the mechanism used to convert pairwise annotations to continuous values may magnify the biases if no special treatment is considered. The findings of this study are relevant for the computer vision community that is still creating new datasets on subjective tasks, and using them for practical applications, ignoring these perceptual biases.

* Accepted at the 11th International Workshop on Human Behavior Understanding (HBU), organized as part of WACV 2021
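
The abstract above mentions converting pairwise annotations into continuous values but does not name the mechanism. As a rough illustration only, the sketch below fits a Bradley-Terry model with a standard minorization-maximization update, which is one common way to turn "A was preferred over B" votes into per-item scores. The choice of model, the function name `bradley_terry`, and the toy data are assumptions, not details taken from the paper.

```python
from collections import defaultdict


def bradley_terry(comparisons, n_items, iters=200):
    """Estimate one continuous score per item from pairwise outcomes.

    comparisons: list of (winner, loser) index pairs (hypothetical input format).
    Returns a list of non-negative scores that sum to 1.
    """
    wins = [0] * n_items
    pair_counts = defaultdict(int)  # number of comparisons per unordered pair
    for winner, loser in comparisons:
        wins[winner] += 1
        pair_counts[(min(winner, loser), max(winner, loser))] += 1

    scores = [1.0] * n_items
    for _ in range(iters):
        denom = [0.0] * n_items
        for (i, j), n in pair_counts.items():
            share = n / (scores[i] + scores[j])
            denom[i] += share
            denom[j] += share
        # MM update: wins divided by the expected-comparison term.
        updated = [
            wins[i] / denom[i] if denom[i] > 0 else scores[i]
            for i in range(n_items)
        ]
        total = sum(updated)
        scores = [s / total for s in updated]
    return scores


# Toy example: item 0 usually beats item 1, which usually beats item 2.
votes = [(0, 1)] * 3 + [(1, 0)] + [(1, 2)] * 3 + [(2, 1)] + [(0, 2)] * 2
print(bradley_terry(votes, n_items=3))
```

Because every vote feeds directly into the fitted scores, any systematic preference among annotators is carried into the continuous labels, which is the kind of effect the abstract warns about.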

Multi-task human analysis in still images: 2D/3D pose, depth map, and multi-part segmentation

May 08, 2019

Daniel Sánchez, Marc Oliu, Meysam Madadi, Xavier Baró, Sergio Escalera

Abstract: While many individual tasks in the domain of human analysis have recently received an accuracy boost from deep learning approaches, multi-task learning has mostly been ignored due to a lack of data. New synthetic datasets are being released, filling this gap with synthetically generated data. In this work, we analyze four related human analysis tasks in still images in a multi-task scenario by leveraging such datasets. Specifically, we study the correlation of 2D/3D pose estimation, body part segmentation and full-body depth estimation. These tasks are learned via the well-known Stacked Hourglass module such that each of the task-specific streams shares information with the others. The main goal is to analyze how training these four related tasks together can benefit each individual task for a better generalization. Results on the newly released SURREAL dataset show that all four tasks benefit from the multi-task approach, but with different combinations of tasks: while combining all four tasks improves 2D pose estimation the most, 2D pose improves neither 3D pose nor full-body depth estimation. On the other hand, 2D part segmentation can benefit from 2D pose but not from 3D pose. In all cases, as expected, the maximum improvement is achieved on those human body parts that show more variability in terms of spatial distribution, appearance and shape, e.g., wrists and ankles.

* 8 pages, 4 Figures, 5 Tables, Conference Faces and Gestures 2019

On the effect of age perception biases for real age regression

Feb 20, 2019

Julio C. S. Jacques Junior, Cagri Ozcinar, Marina Marjanovic, Xavier Baró, Gholamreza Anbarjafari, Sergio Escalera

Abstract: Automatic age estimation from facial images represents an important task in computer vision. This paper analyses the effect of gender, age, ethnicity, makeup and expression attributes of faces as sources of bias to improve deep apparent age prediction. Following recent works where it is shown that apparent age labels benefit real age estimation, rather than direct real-to-real age regression, our main contribution is the integration, in an end-to-end architecture, of face attributes for apparent age prediction with an additional loss for real age regression. Experimental results on the APPA-REAL dataset indicate the proposed network successfully takes advantage of the adopted attributes to improve both apparent and real age estimation. Our model outperformed a state-of-the-art architecture proposed to separately address apparent and real age regression. Finally, we present preliminary results and discussion of a proof of concept application using the proposed model to regress the apparent age of an individual based on the gender of an external observer.

* Accepted in the 14th IEEE International Conference on Automatic Face and Gesture Recognition (FG 2019)

First Impressions: A Survey on Computer Vision-Based Apparent Personality Trait Analysis

Apr 21, 2018

Julio C. S. Jacques Junior, Yağmur Güçlütürk, Marc Pérez, Umut Güçlü, Carlos Andujar, Xavier Baró, Hugo Jair Escalante, Isabelle Guyon, Marcel A. J. van Gerven, Rob van Lier (+1 more)

Abstract: Personality analysis has been widely studied in psychology, neuropsychology, and signal processing, among other fields. From the computing point of view, speech and text have been by far the most analyzed cues for analyzing personality. However, recently there has been an increasing interest from the computer vision community in analyzing personality starting from visual information. Recent computer vision approaches are able to accurately analyze human faces, body postures and behaviors, and use this information to infer apparent personality traits. Because of the overwhelming research interest in this topic, and of the potential impact that such methods could have in society, we present in this paper an up-to-date review of existing computer vision-based visual and multimodal approaches for apparent personality trait recognition. We describe seminal and cutting edge works on the subject, discussing and comparing their distinctive features. More importantly, future avenues of research in the field are identified and discussed. Furthermore, aspects of subjectivity in data labeling/evaluation, as well as current datasets and challenges organized to push research in the field are reviewed. Hence, the survey provides an up-to-date review of research progress in a wide range of aspects of this research theme.

* submitted to IEEE Transactions on Affective Computing

Exploiting feature representations through similarity learning, post-ranking and ranking aggregation for person re-identification

Apr 12, 2018

Julio C. S. Jacques Junior, Xavier Baró, Sergio Escalera

Abstract: Person re-identification has received special attention from the human analysis community in the last few years. To address the challenges in this field, many researchers have proposed different strategies, which basically exploit either cross-view invariant features or cross-view robust metrics. In this work, we propose to exploit a post-ranking approach and combine different feature representations through ranking aggregation. Spatial information, which potentially benefits the person matching, is represented using a 2D body model, from which color and texture information are extracted and combined. We also consider background/foreground information, automatically extracted via a Deep Decompositional Network, and the usage of Convolutional Neural Network (CNN) features. To describe the matching between images we use the polynomial feature map, also taking into account local and global information. The Discriminant Context Information Analysis based post-ranking approach is used to improve initial ranking lists. Finally, the Stuart ranking aggregation method is employed to combine complementary ranking lists obtained from different feature representations. Experimental results demonstrate that we improve the state of the art on the VIPeR and PRID450s datasets, achieving 67.21% and 75.64% in top-1 rank recognition rate, respectively, as well as obtaining competitive results on the CUHK01 dataset.

* Preprint submitted to Image and Vision Computing
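
To make the idea of ranking aggregation concrete, here is a minimal sketch that merges several best-first ranking lists into one. It uses a Borda count, which is a deliberately simpler technique than the Stuart method named in the abstract and is swapped in purely for illustration. The function name `borda_aggregate` and the candidate identifiers are hypothetical.

```python
def borda_aggregate(rankings):
    """Combine best-first ranking lists with a Borda count.

    rankings: list of lists, each ordering the same candidate ids from
    best match to worst (for example, one list per feature representation).
    Returns a single best-first list of candidate ids.
    """
    scores = {}
    for ranking in rankings:
        n = len(ranking)
        for position, candidate in enumerate(ranking):
            # The top entry earns n - 1 points, the last entry earns 0.
            scores[candidate] = scores.get(candidate, 0) + (n - 1 - position)
    # Highest total first; ties are broken by candidate id for determinism.
    return sorted(scores, key=lambda c: (-scores[c], str(c)))


# Three hypothetical ranking lists over the same gallery identities.
lists = [["a", "b", "c"], ["b", "a", "c"], ["a", "c", "b"]]
print(borda_aggregate(lists))  # ['a', 'b', 'c']
```

The point of any such scheme is that a candidate ranked consistently high across complementary feature representations rises above one that is ranked first by a single representation only.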

End-to-end semantic face segmentation with conditional random fields as convolutional, recurrent and adversarial networks

Mar 09, 2017

Umut Güçlü, Yağmur Güçlütürk, Meysam Madadi, Sergio Escalera, Xavier Baró, Jordi González, Rob van Lier, Marcel A. J. van Gerven

Abstract: Recent years have seen a sharp increase in the number of related yet distinct advances in semantic segmentation. Here, we tackle this problem by leveraging the respective strengths of these advances. That is, we formulate a conditional random field over a four-connected graph as end-to-end trainable convolutional and recurrent networks, and estimate them via an adversarial process. Importantly, our model learns not only unary potentials but also pairwise potentials, while aggregating multi-scale contexts and controlling higher-order inconsistencies. We evaluate our model on two standard benchmark datasets for semantic face segmentation, achieving state-of-the-art results on both of them.
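
For readers unfamiliar with the terminology, the sketch below evaluates the energy of a labeling under a conditional random field on a four-connected pixel grid: one unary cost per pixel plus one pairwise cost per horizontal or vertical neighbor pair. In the paper these potentials are learned by networks; here they are fixed lookup tables, and the function name `crf_energy` and the toy values are assumptions made for illustration.

```python
def crf_energy(labels, unary, pairwise):
    """Energy of a labeling on a four-connected grid.

    labels:   H x W nested list of integer labels.
    unary:    H x W x K nested list, unary[r][c][k] = cost of label k at (r, c).
    pairwise: K x K nested list, pairwise[a][b] = cost of neighboring labels a, b.
    """
    height, width = len(labels), len(labels[0])
    energy = 0.0
    for r in range(height):
        for c in range(width):
            k = labels[r][c]
            energy += unary[r][c][k]
            # Count each edge once: right neighbor and bottom neighbor only.
            if c + 1 < width:
                energy += pairwise[k][labels[r][c + 1]]
            if r + 1 < height:
                energy += pairwise[k][labels[r + 1][c]]
    return energy


# 2 x 2 toy grid, two labels, Potts-style pairwise cost (1 when labels differ).
labels = [[0, 0], [0, 1]]
unary = [[[0.0, 1.0] for _ in range(2)] for _ in range(2)]
potts = [[0.0, 1.0], [1.0, 0.0]]
print(crf_energy(labels, unary, potts))  # 3.0
```

In the toy grid the single pixel labeled 1 pays a unary cost of 1 and disagrees with two neighbors, giving a total energy of 3.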

ChaLearn Looking at People: A Review of Events and Resources

Feb 15, 2017

Sergio Escalera, Xavier Baró, Hugo Jair Escalante, Isabelle Guyon

Abstract: This paper reviews the history of ChaLearn Looking at People (LAP) events. We started in 2011 (with the release of the first Kinect device) to run challenges related to human action/activity and gesture recognition. Since then we have regularly organized events in a series of competitions covering all aspects of visual analysis of humans. So far we have organized more than 10 international challenges and events in this field. This paper reviews associated events, and introduces the ChaLearn LAP platform where public resources (including code, data and preprints of papers) related to the organized events are available. We also provide a discussion on perspectives of ChaLearn LAP activities.

* Paper to appear in proceedings of IJCNN 2017 - IEEE - Associated website: http://chalearnlap.cvc.uab.es

Non-Verbal Communication Analysis in Victim-Offender Mediations

Jan 19, 2015

Víctor Ponce-López, Sergio Escalera, Marc Pérez, Oriol Janés, Xavier Baró

Abstract: In this paper we present a non-invasive ambient intelligence framework for the semi-automatic analysis of non-verbal communication applied to the restorative justice field. In particular, we propose the use of computer vision and social signal processing technologies in real scenarios of Victim-Offender Mediations, applying feature extraction techniques to multi-modal audio-RGB-depth data. We compute a set of behavioral indicators that define communicative cues from the fields of psychology and observational methodology. We test our methodology on data captured in real-world Victim-Offender Mediation sessions in Catalonia in collaboration with the regional government. We define the ground truth based on expert opinions when annotating the observed social responses. Using different state-of-the-art binary classification approaches, our system achieves recognition accuracies of 86% when predicting satisfaction, and 79% when predicting both agreement and receptivity. Applying a regression strategy, we obtain a mean deviation for the predictions between 0.5 and 0.7 in the range [1-5] for the computed social signals.

* Supplementary video material: http://sunai.uoc.edu/~vponcel/video/VOMSessionSample.mp4
