Manuel Marques

Benchmarking 3D Human Pose Estimation Models Under Occlusions

Apr 14, 2025

Filipa Lino, Carlos Santiago, Manuel Marques

Abstract:This paper addresses critical challenges in 3D Human Pose Estimation (HPE) by analyzing the robustness and sensitivity of existing models to occlusions, camera position, and action variability. Using a novel synthetic dataset, BlendMimic3D, which includes diverse scenarios with multi-camera setups and several occlusion types, we conduct specific tests on several state-of-the-art models. Our study focuses on the discrepancy in keypoint formats between common 3D datasets such as Human3.6M and 2D datasets such as COCO, which are commonly used to train 2D detection models whose outputs frequently serve as input to 3D HPE models. Our work explores the impact of occlusions on model performance and the generality of models trained exclusively under standard conditions. The findings suggest significant sensitivity to occlusions and camera settings, revealing a need for models that better adapt to real-world variability and occlusion scenarios. This research contributes to ongoing efforts to improve the fidelity and applicability of 3D HPE systems in complex environments.

Which cycling environment appears safer? Learning cycling safety perceptions from pairwise image comparisons

Dec 13, 2024

Miguel Costa, Manuel Marques, Carlos Lima Azevedo, Felix Wilhelm Siebert, Filipe Moura

Abstract:Cycling is critical for cities to transition to more sustainable transport modes. Yet, safety concerns remain a critical deterrent for individuals to cycle. If individuals perceive an environment as unsafe for cycling, it is likely that they will prefer other means of transportation. Yet, capturing and understanding how individuals perceive cycling risk is complex and often slow, with researchers defaulting to traditional surveys and in-loco interviews. In this study, we tackle this problem. We base our approach on pairwise comparisons of real-world images, repeatedly presenting respondents with pairs of road environments and asking them to select the one they perceive as safer for cycling, if any. Using the collected data, we train a Siamese convolutional neural network using a multi-loss framework that learns from individuals' responses, learns preferences directly from images, and includes ties (often discarded in the literature). Effectively, this model learns to predict human-style perceptions, evaluating which cycling environments are perceived as safer. Our model achieves good results, showing that this approach can have real-life impact, such as improving the effectiveness of interventions. Furthermore, it facilitates the continuous assessment of changing cycling environments, permitting short-term evaluations of measures to enhance perceived cycling safety. Finally, our method can be efficiently deployed in different locations with a growing number of openly available street-view images.

* IEEE Transactions on Intelligent Transportation Systems, 2024
* © 2024 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.

Learning Visual-Semantic Subspace Representations for Propositional Reasoning

May 25, 2024

Gabriel Moreira, Alexander Hauptmann, Manuel Marques, João Paulo Costeira

Abstract:Learning representations that capture rich semantic relationships and accommodate propositional calculus poses a significant challenge. Existing approaches are either contrastive, lacking theoretical guarantees, or fall short in effectively representing the partial orders inherent to rich visual-semantic hierarchies. In this paper, we propose a novel approach for learning visual representations that not only conform to a specified semantic structure but also facilitate probabilistic propositional reasoning. Our approach is based on a new nuclear norm-based loss. We show that its minimum encodes the spectral geometry of the semantics in a subspace lattice, where logical propositions can be represented by projection operators.

3D Human Pose Estimation with Occlusions: Introducing BlendMimic3D Dataset and GCN Refinement

Apr 24, 2024

Filipa Lino, Carlos Santiago, Manuel Marques

Abstract:In the field of 3D Human Pose Estimation (HPE), accurately estimating human pose, especially in scenarios with occlusions, is a significant challenge. This work identifies and addresses a gap in the current state of the art in 3D HPE concerning the scarcity of data and strategies for handling occlusions. We introduce our novel BlendMimic3D dataset, designed to mimic real-world situations where occlusions occur for seamless integration in 3D HPE algorithms. Additionally, we propose a 3D pose refinement block, employing a Graph Convolutional Network (GCN) to enhance pose representation through a graph model. This GCN block acts as a plug-and-play solution, adaptable to various 3D HPE frameworks without requiring retraining them. By training the GCN with occluded data from BlendMimic3D, we demonstrate significant improvements in resolving occluded poses, with comparable results for non-occluded ones. Project web page is available at https://blendmimic3d.github.io/BlendMimic3D/.

* Accepted at 6th Workshop and Competition on Affective Behavior Analysis in-the-wild - CVPR 2024 Workshop

Latent Embedding Clustering for Occlusion Robust Head Pose Estimation

Mar 29, 2024

José Celestino, Manuel Marques, Jacinto C. Nascimento

Abstract:Head pose estimation has become a crucial area of research in computer vision given its usefulness in a wide range of applications, including robotics, surveillance, and driver attention monitoring. One of the most difficult challenges in this field is managing head occlusions that frequently take place in real-world scenarios. In this paper, we propose a novel and efficient framework that is robust in real-world head occlusion scenarios. In particular, we propose an unsupervised latent embedding clustering with regression and classification components for each pose angle. The model optimizes latent feature representations for occluded and non-occluded images through a clustering term while improving fine-grained angle predictions. Experimental evaluation on in-the-wild head pose benchmark datasets reveals competitive performance in comparison to state-of-the-art methodologies, with the advantage of a significant data reduction. We observe a substantial improvement in occluded head pose estimation. Also, an ablation study is conducted to ascertain the impact of the clustering term within our proposed framework.

* Accepted at 18th IEEE International Conference on Automatic Face and Gesture Recognition (FG'24)

2D Image head pose estimation via latent space regression under occlusion settings

Nov 10, 2023

José Celestino, Manuel Marques, Jacinto C. Nascimento, João Paulo Costeira

Abstract:Head orientation estimation is a challenging Computer Vision problem that has been extensively researched and has a wide variety of applications. However, current state-of-the-art systems still underperform in the presence of occlusions and are unreliable for many applications in such scenarios. This work proposes a novel deep learning approach for the problem of head pose estimation (HPE) under occlusions. The strategy is based on latent space regression as a fundamental key to better structure the problem for occluded scenarios. Our model surpasses several state-of-the-art methodologies for occluded HPE, and achieves similar accuracy for non-occluded scenarios. We demonstrate the usefulness of the proposed approach with: (i) two synthetically occluded versions of the BIWI and AFLW2000 datasets, (ii) real-life occlusions of the Pandora dataset, and (iii) a real-life application to human-robot interaction scenarios where face occlusions often occur, specifically autonomous feeding by a robotic arm.

* Pattern Recognition, Volume 137, May 2023

Hyperbolic vs Euclidean Embeddings in Few-Shot Learning: Two Sides of the Same Coin

Sep 18, 2023

Gabriel Moreira, Manuel Marques, João Paulo Costeira, Alexander Hauptmann

Abstract:Recent research in representation learning has shown that hierarchical data lends itself to low-dimensional and highly informative representations in hyperbolic space. However, even if hyperbolic embeddings have gathered attention in image recognition, their optimization is prone to numerical hurdles. Further, it remains unclear which applications stand to benefit the most from the implicit bias imposed by hyperbolicity, when compared to traditional Euclidean features. In this paper, we focus on prototypical hyperbolic neural networks, in particular on the tendency of hyperbolic embeddings to converge to the boundary of the Poincaré ball in high dimensions and the effect this has on few-shot classification. We show that the best few-shot results are attained for hyperbolic embeddings at a common hyperbolic radius. In contrast to prior benchmark results, we demonstrate that better performance can be achieved by a fixed-radius encoder equipped with the Euclidean metric, regardless of the embedding dimension.

* Accepted for WACV 2024
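
To make the geometry in this abstract concrete, here is a minimal sketch of the geodesic distance in the Poincaré ball of curvature -1. It uses the standard textbook formula and is not code from the paper; the function name `poincare_distance` and the sample points are assumptions chosen for illustration.

```python
import math


def poincare_distance(u, v):
    """Geodesic distance between two points of the open unit (Poincare) ball.

    Assumes curvature -1; `u` and `v` are equal-length sequences of floats.
    """
    sq_norm_u = sum(x * x for x in u)
    sq_norm_v = sum(x * x for x in v)
    if sq_norm_u >= 1.0 or sq_norm_v >= 1.0:
        raise ValueError("points must lie strictly inside the unit ball")
    sq_diff = sum((a - b) ** 2 for a, b in zip(u, v))
    # d(u, v) = arcosh(1 + 2 * |u - v|^2 / ((1 - |u|^2) * (1 - |v|^2)))
    return math.acosh(1.0 + 2.0 * sq_diff / ((1.0 - sq_norm_u) * (1.0 - sq_norm_v)))


if __name__ == "__main__":
    origin = [0.0, 0.0]
    # The same Euclidean step costs far more hyperbolic distance near the boundary.
    print(poincare_distance(origin, [0.5, 0.0]))
    print(poincare_distance(origin, [0.99, 0.0]))
```

Because the denominator shrinks as either point approaches norm 1, distances grow without bound near the boundary of the ball, which is the region the abstract says embeddings tend to converge to.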

Scoring Cycling Environments Perceived Safety using Pairwise Image Comparisons

Jul 31, 2023

Miguel Costa, Manuel Marques, Felix Wilhelm Siebert, Carlos Lima Azevedo, Filipe Moura

Abstract:Today, many cities seek to transition to more sustainable transportation systems. Cycling is critical in this transition for shorter trips, including first-and-last-mile links to transit. Yet, if individuals perceive cycling as unsafe, they will not cycle and choose other transportation modes. This study presents a novel approach to identifying how the perception of cycling safety can be analyzed and understood and the impact of the built environment and cycling contexts on such perceptions. We base our work on other perception studies and pairwise comparisons, using real-world images to survey respondents. We repeatedly show respondents two road environments and ask them to select the one they perceive as safer for cycling. We compare several methods capable of rating cycling environments from pairwise comparisons and classify cycling environments perceived as safe or unsafe. Urban planning can use this score to improve interventions' effectiveness and improve cycling promotion campaigns. Furthermore, this approach facilitates the continuous assessment of changing cycling environments, allows for a short-term evaluation of measures, and is efficiently deployed in different locations or contexts.
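
One simple way to turn pairwise choices like these into per-image scores is a Bradley-Terry model. The sketch below is illustrative only: the abstract does not say which rating methods were compared, and the function name `bradley_terry`, the `(winner, loser)` input format, and the image labels are assumptions made for this example.

```python
from collections import defaultdict


def bradley_terry(comparisons, iterations=100):
    """Fit Bradley-Terry strengths from (winner, loser) pairs.

    Uses the classic minorization-maximization update and returns a dict of
    strengths normalized to sum to 1. Assumes winner != loser in every pair.
    """
    wins = defaultdict(int)         # total wins per item
    pair_counts = defaultdict(int)  # comparisons per unordered pair
    items = set()
    for winner, loser in comparisons:
        wins[winner] += 1
        pair_counts[frozenset((winner, loser))] += 1
        items.update((winner, loser))

    strengths = {item: 1.0 / len(items) for item in items}
    for _ in range(iterations):
        updated = {}
        for item in items:
            denom = 0.0
            for pair, count in pair_counts.items():
                if item in pair:
                    (other,) = pair - {item}
                    denom += count / (strengths[item] + strengths[other])
            updated[item] = wins[item] / denom if denom > 0 else 0.0
        total = sum(updated.values())
        strengths = {item: value / total for item, value in updated.items()}
    return strengths


if __name__ == "__main__":
    # Hypothetical survey: image "A" was judged safer than "B" in 3 of 4 trials.
    votes = [("A", "B")] * 3 + [("B", "A")]
    print(bradley_terry(votes))
```

Under this model, the estimated probability that one image is judged safer than another is its strength divided by the sum of the two strengths, so the strengths can serve directly as a perceived-safety score.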

A Cluster-Based Trip Prediction Graph Neural Network Model for Bike Sharing Systems

Jan 03, 2022

Bárbara Tavares, Cláudia Soares, Manuel Marques

Abstract:Bike Sharing Systems (BSSs) are emerging as an innovative transportation service. Ensuring the proper functioning of a BSS is crucial given that these systems aim to address many current global concerns by promoting environmental and economic sustainability and contributing to improving the quality of life of the population. Good knowledge of users' transition patterns is a decisive contribution to the quality and operability of the service. Analogous and unbalanced user transition patterns cause these systems to suffer from bicycle imbalance, leading to a drastic customer loss in the long term. Strategies for bicycle rebalancing become important to tackle this problem, and for this, bicycle traffic prediction is essential, as it allows operators to act more efficiently and to react in advance. In this work, we propose a bicycle trip predictor based on Graph Neural Network embeddings, taking into consideration station groupings, meteorological conditions, geographical distances, and trip patterns. We evaluated our approach on the New York City BSS (CitiBike) data and compared it with four baselines, including the non-clustered approach. To address our problem's specificities, we developed the Adaptive Transition Constraint Clustering Plus (AdaTC+) algorithm, eliminating shortcomings of previous work. Our experiments demonstrate the relevance of clustering (88% accuracy compared with 83% without clustering) and show which clustering technique best suits this problem. Accuracy on the Link Prediction task is always higher for AdaTC+ than for benchmark clustering methods when the stations are the same, while performance does not degrade when the network is upgraded, creating a mismatch with the trained model.

* 12 pages, 15 figures, 4 tables

Rotation Averaging in a Split Second: A Primal-Dual Method and a Closed-Form for Cycle Graphs

Sep 16, 2021

Gabriel Moreira, Manuel Marques, João Paulo Costeira

Abstract:A cornerstone of geometric reconstruction, rotation averaging seeks the set of absolute rotations that optimally explains a set of measured relative orientations between them. In spite of being an integral part of bundle adjustment and structure-from-motion, averaging rotations is both a non-convex and high-dimensional optimization problem. In this paper, we address it from a maximum likelihood estimation standpoint and make a twofold contribution. Firstly, we set forth a novel initialization-free primal-dual method which we show empirically to converge to the global optimum. Further, we derive what is, to our knowledge, the first optimal closed-form solution for rotation averaging in cycle graphs and contextualize this result within spectral graph theory. Our proposed methods achieve a significant gain both in precision and performance.
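
For readers unfamiliar with the problem, a commonly used least-squares statement of rotation averaging is sketched below. The notation is an assumption made for illustration and may differ from the paper's: $R_i$ are the unknown absolute rotations, $\tilde{R}_{ij}$ are the measured relative rotations under the convention $\tilde{R}_{ij} \approx R_i R_j^\top$, and $E$ is the edge set of the measurement graph.

```latex
\min_{R_1, \dots, R_n \in SO(3)} \;
\sum_{(i,j) \in E} \left\| \tilde{R}_{ij} - R_i R_j^\top \right\|_F^2
```

The feasible set $SO(3)^n$ is what makes the problem non-convex. In a cycle graph, $E = \{(1,2), (2,3), \dots, (n-1,n), (n,1)\}$, so each rotation appears in exactly two measurements.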
