Title: Learning to Estimate 3D Human Pose and Shape from a Single Color Image

May 10, 2018

Georgios Pavlakos, Luyang Zhu, Xiaowei Zhou, Kostas Daniilidis

Abstract: This work addresses the problem of estimating full-body 3D human pose and shape from a single color image. Iterative optimization-based solutions have typically prevailed on this task, while Convolutional Networks (ConvNets) have suffered from a lack of training data and from low-resolution 3D predictions. Our work aims to bridge this gap and proposes an efficient and effective direct prediction method based on ConvNets. Central to our approach is the incorporation of a parametric statistical body shape model (SMPL) within our end-to-end framework. This allows us to obtain very detailed 3D mesh results while estimating only a small number of parameters, which makes the problem well suited to direct network prediction. Interestingly, we demonstrate that these parameters can be predicted reliably from 2D keypoints and masks alone. These are typical outputs of generic 2D human analysis ConvNets, which allows us to relax the demanding requirement that images with 3D shape ground truth be available for training. At the same time, by maintaining differentiability, at training time we generate the 3D mesh from the estimated parameters and optimize explicitly for the surface using a 3D per-vertex loss. Finally, a differentiable renderer is employed to project the 3D mesh to the image, which enables further refinement of the network by optimizing for the consistency of the projection with 2D annotations (i.e., 2D keypoints or masks). The proposed approach outperforms previous baselines on this task and offers an attractive solution for direct prediction of 3D shape from a single color image.

* CVPR 2018 Camera Ready
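
The two supervision signals named in the abstract, a 3D per-vertex loss and consistency of the projected result with 2D keypoints, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the mean-distance form of each loss, and the weak-perspective camera (`scale`, `trans`) are all assumptions made for illustration, and the pure-Python code only computes the quantities. Actual training would need an autodiff framework and the differentiable renderer the abstract mentions.

```python
import math
from typing import List, Sequence, Tuple

Point3 = Tuple[float, float, float]
Point2 = Tuple[float, float]


def per_vertex_loss(pred: Sequence[Point3], target: Sequence[Point3]) -> float:
    """Mean Euclidean distance between corresponding mesh vertices.

    Assumes both meshes share one topology (same vertex count and order),
    as is the case for meshes generated by a single parametric body model.
    """
    if len(pred) != len(target) or not pred:
        raise ValueError("meshes must be non-empty and share one topology")
    return sum(math.dist(p, t) for p, t in zip(pred, target)) / len(pred)


def project_weak_perspective(
    points: Sequence[Point3], scale: float, trans: Point2
) -> List[Point2]:
    """Drop depth, then scale and shift (an assumed camera model)."""
    return [(scale * x + trans[0], scale * y + trans[1]) for x, y, _ in points]


def keypoint_reprojection_loss(
    joints3d: Sequence[Point3],
    keypoints2d: Sequence[Point2],
    visible: Sequence[bool],
    scale: float,
    trans: Point2,
) -> float:
    """Mean squared 2D error over the annotated (visible) keypoints only."""
    projected = project_weak_perspective(joints3d, scale, trans)
    errors = [
        math.dist(p, k) ** 2
        for p, k, v in zip(projected, keypoints2d, visible)
        if v
    ]
    return sum(errors) / len(errors) if errors else 0.0


if __name__ == "__main__":
    # Toy two-vertex "mesh": one vertex matches, one is off by 1 along z.
    pred = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
    target = [(0.0, 0.0, 0.0), (1.0, 0.0, 1.0)]
    print(per_vertex_loss(pred, target))  # 0.5

    joints = [(1.0, 2.0, 5.0), (0.0, 0.0, 1.0)]
    annotations = [(2.5, 3.0), (10.0, 10.0)]
    # The second keypoint is marked as not annotated, so it is ignored.
    print(keypoint_reprojection_loss(joints, annotations, [True, False], 2.0, (0.5, -1.0)))  # 0.0
```

Masking by visibility reflects the fact that 2D annotations are often incomplete; how the original method weights or combines these terms is not stated in the abstract.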
