Tom Wehrbein

Utilizing Uncertainty in 2D Pose Detectors for Probabilistic 3D Human Mesh Recovery

Nov 25, 2024

Tom Wehrbein, Marco Rudolph, Bodo Rosenhahn, Bastian Wandt

Abstract: Monocular 3D human pose and shape estimation is an inherently ill-posed problem due to depth ambiguities, occlusions, and truncations. Recent probabilistic approaches learn a distribution over plausible 3D human meshes by maximizing the likelihood of the ground-truth pose given an image. We show that this objective function alone is not sufficient to best capture the full distributions. Instead, we propose to additionally supervise the learned distributions by minimizing the distance to distributions encoded in heatmaps of a 2D pose detector. Moreover, we reveal that current methods often generate incorrect hypotheses for invisible joints, which the evaluation protocols do not detect. We demonstrate that person segmentation masks can be utilized during training to significantly decrease the number of invalid samples, and we introduce two metrics to evaluate this. Our normalizing flow-based approach predicts plausible 3D human mesh hypotheses that are consistent with the image evidence while maintaining high diversity for ambiguous body parts. Experiments on 3DPW and EMDB show that we outperform other state-of-the-art probabilistic methods. Code is available for research purposes at https://github.com/twehrbein/humr.

* WACV 2025

Personalized 3D Human Pose and Shape Refinement

Mar 18, 2024

Tom Wehrbein, Bodo Rosenhahn, Iain Matthews, Carsten Stoll

Abstract: Recently, regression-based methods have dominated the field of 3D human pose and shape estimation. Despite their promising results, a common issue is the misalignment between predictions and image observations, often caused by minor joint rotation errors that accumulate along the kinematic chain. To address this issue, we propose to construct dense correspondences between initial human model estimates and the corresponding images that can be used to refine the initial predictions. To this end, we utilize renderings of the 3D models to predict per-pixel 2D displacements between the synthetic renderings and the RGB images. This allows us to effectively integrate and exploit appearance information of the persons. Our per-pixel displacements can be efficiently transformed to per-visible-vertex displacements and then used for 3D model refinement by minimizing a reprojection loss. To demonstrate the effectiveness of our approach, we refine the initial 3D human mesh predictions of multiple models using different refinement procedures on 3DPW and RICH. We show that our approach not only consistently leads to better image-model alignment, but also to improved 3D accuracy.

* 2023 IEEE/CVF International Conference on Computer Vision Workshops (ICCVW)

Asymmetric Student-Teacher Networks for Industrial Anomaly Detection

Oct 18, 2022

Marco Rudolph, Tom Wehrbein, Bodo Rosenhahn, Bastian Wandt

Abstract: Industrial defect detection is commonly addressed with anomaly detection (AD) methods where no or only incomplete data of potentially occurring defects is available. This work uncovers previously unknown problems of student-teacher approaches for AD, in which two neural networks are trained to produce the same output for defect-free training examples, and proposes a solution. The core assumption of student-teacher networks is that the distance between the outputs of both networks is larger for anomalies since they are absent in training. However, previous methods suffer from the similarity of student and teacher architecture, such that the distance is undesirably small for anomalies. For this reason, we propose asymmetric student-teacher networks (AST). We train a normalizing flow for density estimation as a teacher and a conventional feed-forward network as a student to trigger large distances for anomalies: the bijectivity of the normalizing flow enforces a divergence of teacher outputs for anomalies compared to normal data. Outside the training distribution, the student cannot imitate this divergence due to its fundamentally different architecture. Our AST network compensates for wrongly estimated likelihoods by a normalizing flow, which was alternatively used for anomaly detection in previous work. We show that our method produces state-of-the-art results on the two currently most relevant defect detection datasets, MVTec AD and MVTec 3D-AD, regarding image-level anomaly detection on RGB and 3D data.

* Accepted to WACV 2023
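
The student-teacher assumption in the abstract above (a larger output distance signals an anomaly) can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the squared-distance measure averaged over channels, and the max aggregation to an image-level score are all assumptions made for illustration, and the toy outputs are made up.

```python
def anomaly_map(teacher_out, student_out):
    """Per-position squared teacher-student distance, averaged over channels.

    Both inputs are lists of positions, each position a list of channel values.
    The distance measure is an assumption, not taken from the paper.
    """
    return [
        sum((t - s) ** 2 for t, s in zip(t_vec, s_vec)) / len(t_vec)
        for t_vec, s_vec in zip(teacher_out, student_out)
    ]


def image_score(distances):
    """Aggregate the per-position distances into one image-level score (assumed: max)."""
    return max(distances)


# Toy outputs for three positions with two channels each (hypothetical values).
teacher = [[1.0, 2.0], [0.0, 0.0], [3.0, -1.0]]
student = [[1.0, 2.0], [0.1, -0.1], [1.0, 1.0]]

distances = anomaly_map(teacher, student)
score = image_score(distances)  # dominated by the third position
```

In this toy case the student matches the teacher at the first two positions and diverges at the third, so the third position determines the image-level score.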

Fully Convolutional Cross-Scale-Flows for Image-based Defect Detection

Oct 06, 2021

Marco Rudolph, Tom Wehrbein, Bodo Rosenhahn, Bastian Wandt

Abstract: In industrial manufacturing processes, errors frequently occur at unpredictable times and in unknown manifestations. We tackle the problem of automatic defect detection without requiring any image samples of defective parts. Recent works model the distribution of defect-free image data, using either strong statistical priors or overly simplified data representations. In contrast, our approach handles fine-grained representations incorporating the global and local image context while flexibly estimating the density. To this end, we propose a novel fully convolutional cross-scale normalizing flow (CS-Flow) that jointly processes multiple feature maps of different scales. Using normalizing flows to assign meaningful likelihoods to input samples allows for efficient defect detection at the image level. Moreover, because the spatial arrangement is preserved, the latent space of the normalizing flow is interpretable, which makes it possible to localize defective regions in the image. Our work sets a new state of the art in image-level defect detection on the benchmark datasets Magnetic Tile Defects and MVTec AD, achieving 100% AUROC on 4 out of 15 classes.
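
How a normalizing flow assigns a likelihood to a sample can be shown with a toy example. The sketch below is a one-dimensional affine flow with a standard normal base distribution, not CS-Flow itself; the function name and the parameters `shift` and `scale` are hypothetical and stand in for learned values.

```python
import math


def affine_flow_log_likelihood(x, shift, scale):
    """Log-likelihood of x under the flow z = (x - shift) / scale with z ~ N(0, 1).

    Change of variables: log p(x) = log N(z; 0, 1) + log |dz/dx|.
    """
    z = (x - shift) / scale
    log_base = -0.5 * (z * z + math.log(2.0 * math.pi))  # standard normal log-density
    log_det = -math.log(scale)                           # log |dz/dx| = -log(scale)
    return log_base + log_det


# Hypothetical feature values: one near the training data, one far from it.
normal_ll = affine_flow_log_likelihood(5.0, shift=5.0, scale=2.0)
defect_ll = affine_flow_log_likelihood(15.0, shift=5.0, scale=2.0)
```

A sample far from the modeled data receives a lower log-likelihood, and thresholding that value is one simple way to flag it as a defect.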

Probabilistic Monocular 3D Human Pose Estimation with Normalizing Flows

Aug 02, 2021

Tom Wehrbein, Marco Rudolph, Bodo Rosenhahn, Bastian Wandt

Abstract: 3D human pose estimation from monocular images is a highly ill-posed problem due to depth ambiguities and occlusions. Nonetheless, most existing works ignore these ambiguities and only estimate a single solution. In contrast, we generate a diverse set of hypotheses that represents the full posterior distribution of feasible 3D poses. To this end, we propose a normalizing-flow-based method that exploits the deterministic 3D-to-2D mapping to solve the ambiguous inverse 2D-to-3D problem. Additionally, uncertain detections and occlusions are effectively modeled by incorporating uncertainty information of the 2D detector as a condition. Further keys to success are a learned 3D pose prior and a generalization of the best-of-M loss. We evaluate our approach on the two benchmark datasets Human3.6M and MPI-INF-3DHP, outperforming all comparable methods in most metrics. The implementation is available on GitHub.

* Accepted to ICCV 2021
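
For readers unfamiliar with the best-of-M loss mentioned in the abstract above, the following is a minimal sketch of the plain (non-generalized) variant: among M hypotheses, only the one closest to the ground truth contributes. The paper's generalization is not reproduced here; the function names, the mean per-joint Euclidean error, and the toy poses are assumptions for illustration.

```python
import math


def pose_error(pred, target):
    """Mean per-joint Euclidean distance between two poses given as lists of (x, y, z)."""
    return sum(math.dist(p, t) for p, t in zip(pred, target)) / len(target)


def best_of_m_loss(hypotheses, target):
    """Error of the hypothesis closest to the ground truth (plain best-of-M)."""
    return min(pose_error(h, target) for h in hypotheses)


# Toy two-joint ground-truth pose and M = 2 hypotheses (hypothetical values).
target = [(0.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
hypotheses = [
    [(0.0, 0.0, 1.0), (0.0, 1.0, 1.0)],  # both joints shifted by 1 along z
    [(0.0, 0.0, 0.0), (0.0, 1.0, 0.5)],  # only the second joint off by 0.5
]

loss = best_of_m_loss(hypotheses, target)
```

Because only the best hypothesis is penalized, the remaining hypotheses are free to spread over other plausible poses, which is what encourages diversity.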