Andrea Toaiari

New Fashion Products Performance Forecasting: A Survey on Evolutions, Models and Emerging Trends

Jan 17, 2025

Andrea Avogaro, Luigi Capogrosso, Andrea Toaiari, Franco Fummi, Marco Cristani

Abstract: The fast fashion industry's insatiable demand for new styles and rapid production cycles has led to a significant environmental burden. Overproduction, excessive waste, and harmful chemicals have all contributed to the industry's negative environmental impact. To mitigate these issues, a paradigm shift that prioritizes sustainability and efficiency is urgently needed. Integrating learning-based predictive analytics into the fashion industry represents a significant opportunity to address environmental challenges and drive sustainable practices. By forecasting fashion trends and optimizing production, brands can reduce their ecological footprint while remaining competitive in a rapidly changing market. However, one of the key challenges in forecasting fashion sales is the dynamic nature of consumer preferences. Fashion is acyclical, with trends constantly evolving and resurfacing. In addition, cultural changes and unexpected events can disrupt established patterns. This problem is known as New Fashion Products Performance Forecasting (NFPPF), and it has recently attracted growing interest in the global research landscape. Given its multidisciplinary nature, NFPPF has been approached from many different angles. This survey aims to provide an up-to-date overview focused on learning-based NFPPF strategies. It follows the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) methodological flow, allowing for a systematic and complete literature review. In particular, we propose the first taxonomy that covers the learning panorama for NFPPF, examining in detail the different methodologies used to increase the amount of multimodal information, as well as the state-of-the-art available datasets. Finally, we discuss the challenges and future directions.

* Accepted at the Springer Nature Computer Science journal

Upper-Body Pose-based Gaze Estimation for Privacy-Preserving 3D Gaze Target Detection

Sep 26, 2024

Andrea Toaiari, Vittorio Murino, Marco Cristani, Cigdem Beyan

Abstract: Gaze Target Detection (GTD), i.e., determining where a person is looking within a scene from an external viewpoint, is a challenging task, particularly in 3D space. Existing approaches rely heavily on analyzing the person's appearance, primarily focusing on the face to predict the gaze target. This paper presents a novel approach that uses the person's upper-body pose and available depth maps to extract a 3D gaze direction, and employs either a multi-stage or an end-to-end pipeline to predict the gazed target. When predicted accurately, the human body pose can provide valuable information about the head pose, which is a good approximation of the gaze direction, as well as the position of the arms and hands, which are linked to the activity the person is performing and the objects they are likely focusing on. Consequently, in addition to performing gaze estimation in 3D, we are also able to perform GTD simultaneously. We demonstrate state-of-the-art results on the most comprehensive publicly accessible 3D gaze target detection dataset without requiring images of the person's face, thus promoting privacy preservation in various application contexts. The code is available at https://github.com/intelligolabs/privacy-gtd-3D.

* Accepted in the T-CAP workshop at ECCV 2024

SITUATE: Indoor Human Trajectory Prediction through Geometric Features and Self-Supervised Vision Representation

Sep 01, 2024

Luigi Capogrosso, Andrea Toaiari, Andrea Avogaro, Uzair Khan, Aditya Jivoji, Franco Fummi, Marco Cristani

Abstract: Patterns of human motion in outdoor and indoor environments are substantially different due to the scope of the environment and the typical intentions of people therein. While outdoor trajectory forecasting has received significant attention, indoor forecasting is still an underexplored research area. This paper proposes SITUATE, a novel approach to indoor human trajectory prediction that leverages equivariant and invariant geometric features and a self-supervised vision representation. The geometric learning modules model the intrinsic symmetries and human movements inherent in indoor spaces. This becomes particularly important because indoor trajectories are often characterized by self-loops at various scales and rapid direction changes. The vision representation module, in turn, is used to acquire spatial-semantic information about the environment in order to predict users' future locations more accurately. We evaluate our method through comprehensive experiments on the two best-known indoor trajectory forecasting datasets, i.e., THÖR and Supermarket, obtaining state-of-the-art performance. Furthermore, we also achieve competitive results in outdoor scenarios, showing that indoor-oriented forecasting models generalize better than outdoor-oriented ones. The source code is available at https://github.com/intelligolabs/SITUATE.

* Accepted at the 27th International Conference on Pattern Recognition (ICPR 2024)
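
The distinction between invariant and equivariant geometric features mentioned in the SITUATE abstract can be illustrated with a toy 2D trajectory. This is only a sketch of the general definitions, not the SITUATE architecture: the helper names (`rotation`, `displacements`, `step_lengths`) and the example trajectory are assumptions made here for illustration.

```python
import numpy as np

def rotation(theta):
    """2D rotation matrix for an angle in radians."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s], [s, c]])

def displacements(traj):
    """Per-step displacement vectors; they rotate with the trajectory (equivariant)."""
    return np.diff(traj, axis=0)

def step_lengths(traj):
    """Per-step distances; unchanged by rotation and translation (invariant)."""
    return np.linalg.norm(displacements(traj), axis=1)

# Hypothetical trajectory forming a small self-loop (a unit square).
traj = np.array([[0.0, 0.0], [1.0, 0.0], [1.0, 1.0], [0.0, 1.0], [0.0, 0.0]])

# Apply an arbitrary rigid motion: rotate, then translate.
R = rotation(0.7)
shift = np.array([3.0, -2.0])
moved = traj @ R.T + shift

# Invariant feature: identical before and after the rigid motion.
invariant_ok = np.allclose(step_lengths(moved), step_lengths(traj))
# Equivariant feature: transforms by the same rotation as the input.
equivariant_ok = np.allclose(displacements(moved), displacements(traj) @ R.T)
```

Both checks hold for any rotation angle and translation, which is the property that makes such features convenient when the same motion pattern can appear anywhere in a room and in any orientation.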

Exploring 3D Human Pose Estimation and Forecasting from the Robot's Perspective: The HARPER Dataset

Mar 23, 2024

Andrea Avogaro, Andrea Toaiari, Federico Cunico, Xiangmin Xu, Haralambos Dafas, Alessandro Vinciarelli, Emma Li, Marco Cristani

Abstract: We introduce HARPER, a novel dataset for 3D body pose estimation and forecasting in dyadic interactions between users and Spot, the quadruped robot manufactured by Boston Dynamics. The key novelty is the focus on the robot's perspective, i.e., on the data captured by the robot's sensors. This makes 3D body pose analysis challenging because the sensors, being close to the ground, capture humans only partially. The scenario underlying HARPER includes 15 actions, of which 10 involve physical contact between the robot and users. The corpus contains not only the recordings of Spot's built-in stereo cameras, but also those of a 6-camera OptiTrack system (all recordings are synchronized). This leads to ground-truth skeletal representations with sub-millimeter precision. In addition, the corpus includes reproducible benchmarks on 3D Human Pose Estimation, Human Pose Forecasting, and Collision Prediction, all based on publicly available baseline approaches. This enables future HARPER users to rigorously compare their results with those we provide in this work.

Disentangled Latent Spaces Facilitate Data-Driven Auxiliary Learning

Oct 13, 2023

Geri Skenderi, Luigi Capogrosso, Andrea Toaiari, Matteo Denitto, Franco Fummi, Simone Melzi, Marco Cristani

Abstract: In deep learning, auxiliary objectives are often used to facilitate learning in situations where data is scarce or the principal task is extremely complex. This idea is primarily inspired by the improved generalization capability induced by solving multiple tasks simultaneously, which leads to a more robust shared representation. Nevertheless, finding optimal auxiliary tasks that give rise to the desired improvement is a crucial problem that often requires hand-crafted solutions or expensive meta-learning approaches. In this paper, we propose a novel framework, dubbed Detaux, whereby a weakly supervised disentanglement procedure is used to discover new unrelated classification tasks and the associated labels that can be exploited with the principal task in any Multi-Task Learning (MTL) model. The disentanglement procedure works at the representation level, isolating a subspace related to the principal task, plus an arbitrary number of orthogonal subspaces. In the most disentangled subspaces, through a clustering procedure, we generate the additional classification tasks, and the associated labels become their representatives. Subsequently, the original data, the labels associated with the principal task, and the newly discovered ones can be fed into any MTL framework. Extensive validation on both synthetic and real data, along with various ablation studies, demonstrates promising results, revealing the potential in what has been, so far, an unexplored connection between learning disentangled representations and MTL. The code will be made publicly available upon acceptance.

* Under review in Pattern Recognition Letters
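
The step in the Detaux abstract where clustering in a disentangled subspace yields labels for a new classification task can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the latent layout (principal-task dimensions first, auxiliary dimensions after), the use of plain k-means, and the names `kmeans_labels` and `auxiliary_labels` are all hypothetical, and the latents are synthetic toy data.

```python
import numpy as np

def kmeans_labels(features, k, n_iters=50, seed=0):
    """Plain k-means; returns one integer cluster id per row of `features`."""
    rng = np.random.default_rng(seed)
    # Initialise centroids from k distinct data points (fancy indexing copies).
    centroids = features[rng.choice(len(features), size=k, replace=False)]
    for _ in range(n_iters):
        # Assign each point to its nearest centroid.
        dists = np.linalg.norm(features[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Recompute centroids; keep the old one if a cluster is empty.
        for j in range(k):
            members = features[labels == j]
            if len(members) > 0:
                centroids[j] = members.mean(axis=0)
    return labels

def auxiliary_labels(latents, principal_dims, k):
    """Cluster the latent dimensions that are not reserved for the principal task."""
    aux_subspace = latents[:, principal_dims:]
    return kmeans_labels(aux_subspace, k)

# Toy latents (assumed layout): 2 principal dims of noise, then 2 auxiliary
# dims containing two well-separated blobs of 100 points each.
rng = np.random.default_rng(0)
principal = rng.normal(size=(200, 2))
aux_blobs = np.concatenate([
    rng.normal(-5.0, 0.5, size=(100, 2)),
    rng.normal(5.0, 0.5, size=(100, 2)),
])
latents = np.concatenate([principal, aux_blobs], axis=1)

# Cluster ids act as labels for a newly discovered auxiliary task.
aux_y = auxiliary_labels(latents, principal_dims=2, k=2)
```

The resulting `aux_y` array could then be passed, together with the original inputs and the principal-task labels, to whichever multi-task model is being trained.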

A Masked Face Classification Benchmark

Nov 23, 2022

Federico Cunico, Andrea Toaiari, Marco Cristani

Abstract: We propose a novel image dataset focused on tiny faces wearing face masks for mask classification purposes, dubbed Small Face MASK (SF-MASK). It is a collection of 20k low-resolution images exported from diverse and heterogeneous datasets, ranging from 7 x 7 to 64 x 64 pixel resolution. An accurate visualization of this collection, through counting grids, made it possible to highlight gaps in the variety of poses assumed by the heads of the pedestrians. In particular, faces filmed by cameras mounted very high, in which the facial features appear strongly skewed, are absent. To address this structural deficiency, we produced a set of synthetic images which resulted in a satisfactory coverage of the intra-class variance. Furthermore, a small subsample of 1701 images contains badly worn face masks, opening the way to multi-class classification challenges. Experiments on SF-MASK focus on face mask classification using several classifiers. Results show that the richness of SF-MASK (real + synthetic images) leads all of the tested classifiers to perform better than when using comparable face mask datasets, on a fixed test set of 1077 images. Dataset and evaluation code are publicly available here: https://github.com/HumaticsLAB/sf-mask

* 15 pages, 7 figures. Accepted at T-CAP workshop @ ICPR 2022
