Rıza Alp Güler

MeshPose: Unifying DensePose and 3D Body Mesh reconstruction

Jun 14, 2024

Eric-Tuan Lê, Antonis Kakolyris, Petros Koutras, Himmy Tam, Efstratios Skordos, George Papandreou, Rıza Alp Güler, Iasonas Kokkinos

Abstract: DensePose provides a pixel-accurate association of images with 3D mesh coordinates, but does not provide a 3D mesh, while Human Mesh Reconstruction (HMR) systems have high 2D reprojection error, as measured by DensePose localization metrics. In this work we introduce MeshPose to jointly tackle DensePose and HMR. For this we first introduce new losses that allow us to use weak DensePose supervision to accurately localize in 2D a subset of the mesh vertices ('VertexPose'). We then lift these vertices to 3D, yielding a low-poly body mesh ('MeshPose'). Our system is trained in an end-to-end manner and is the first HMR method to attain competitive DensePose accuracy, while also being lightweight and amenable to efficient inference, making it suitable for real-time AR applications.

* CVPR 2024
* IEEE Conference on Computer Vision and Pattern Recognition (CVPR)

Slim DensePose: Thrifty Learning from Sparse Annotations and Motion Cues

Jun 13, 2019

Natalia Neverova, James Thewlis, Rıza Alp Güler, Iasonas Kokkinos, Andrea Vedaldi

Abstract: DensePose supersedes traditional landmark detectors by densely mapping image pixels to body surface coordinates. This power, however, comes at the cost of greatly increased annotation time, as supervising the model requires manually labeling hundreds of points per pose instance. In this work, we thus seek methods to significantly slim down the DensePose annotations, proposing more efficient data collection strategies. In particular, we demonstrate that if annotations are collected in video frames, their efficacy can be multiplied for free by using motion cues. To explore this idea, we introduce DensePose-Track, a dataset of videos where selected frames are annotated in the traditional DensePose manner. Then, building on geometric properties of the DensePose mapping, we use the video dynamics to propagate ground-truth annotations in time as well as to learn from Siamese equivariance constraints. Having performed an exhaustive empirical evaluation of various data annotation and learning strategies, we demonstrate that doing so can deliver significantly improved pose estimation results over strong baselines. However, despite what is suggested by some recent works, we show that merely synthesizing motion patterns by applying geometric transformations to isolated frames is significantly less effective, and that motion cues help much more when they are extracted from videos.
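The equivariance idea mentioned in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: it assumes each frame's prediction is a single scalar per pixel (a stand-in for the part and surface-coordinate outputs), that motion is given as integer per-pixel displacements, and the function name `equivariance_loss` is hypothetical. The constraint is that a pixel in frame A and the pixel it moves to in frame B should receive the same surface prediction.

```python
def equivariance_loss(pred_a, pred_b, flow):
    """Mean squared difference between pred_a at (y, x) and pred_b at the
    flow-displaced pixel (y + dy, x + dx).

    pred_a, pred_b: 2D lists of scalar predictions (an assumed simplification).
    flow: 2D list of (dy, dx) integer displacements from frame A to frame B.
    Pixels that move outside the frame are skipped.
    """
    height, width = len(pred_a), len(pred_a[0])
    total, count = 0.0, 0
    for y in range(height):
        for x in range(width):
            dy, dx = flow[y][x]
            ty, tx = y + dy, x + dx
            if 0 <= ty < height and 0 <= tx < width:
                total += (pred_a[y][x] - pred_b[ty][tx]) ** 2
                count += 1
    return total / count if count else 0.0


# Frame B is frame A shifted one pixel to the right.
frame_a = [[0.1, 0.2], [0.3, 0.4]]
frame_b = [[0.0, 0.1], [0.0, 0.3]]
shift_right = [[(0, 1), (0, 1)], [(0, 1), (0, 1)]]
loss = equivariance_loss(frame_a, frame_b, shift_right)
```

When the predictions in frame B follow the motion exactly, as in the usage lines above, the loss is zero; any disagreement between corresponding pixels increases it.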

* CVPR 2019

DensePose: Dense Human Pose Estimation In The Wild

Feb 01, 2018

Rıza Alp Güler, Natalia Neverova, Iasonas Kokkinos

Abstract: In this work, we establish dense correspondences between an RGB image and a surface-based representation of the human body, a task we refer to as dense human pose estimation. We first gather dense correspondences for 50K persons appearing in the COCO dataset by introducing an efficient annotation pipeline. We then use our dataset to train CNN-based systems that deliver dense correspondence 'in the wild', namely in the presence of background, occlusions and scale variations. We improve our training set's effectiveness by training an 'inpainting' network that can fill in missing ground-truth values and report clear improvements with respect to the best results that were previously achievable. We experiment with fully-convolutional networks and region-based models and observe a superiority of the latter; we further improve accuracy through cascading, obtaining a system that delivers highly accurate results in real time. Supplementary materials and videos are provided on the project page http://densepose.org

DenseReg: Fully Convolutional Dense Shape Regression In-the-Wild

Jun 19, 2017

Rıza Alp Güler, George Trigeorgis, Epameinondas Antonakos, Patrick Snape, Stefanos Zafeiriou, Iasonas Kokkinos

Abstract: In this paper we propose to learn a mapping from image pixels into a dense template grid through a fully convolutional network. We formulate this task as a regression problem and train our network by leveraging manually annotated facial landmarks "in-the-wild". We use such landmarks to establish a dense correspondence field between a three-dimensional object template and the input image, which then serves as the ground-truth for training our regression system. We show that we can combine ideas from semantic segmentation with regression networks, yielding a highly accurate "quantized regression" architecture. Our system, called DenseReg, allows us to estimate dense image-to-template correspondences in a fully convolutional manner. As such, our network can provide useful correspondence information as a stand-alone system, while when used as an initialization for Statistical Deformable Models we obtain landmark localization results that largely outperform the current state-of-the-art on the challenging 300W benchmark. We thoroughly evaluate our method on a host of facial analysis tasks and also provide qualitative results for dense human body correspondence. We make our code available at http://alpguler.com/DenseReg.html along with supplementary materials.
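One common reading of "quantized regression" is to split a continuous target into a coarse bin, predicted by classification as in semantic segmentation, plus a within-bin residual, predicted by regression. The sketch below illustrates only that target encoding and its inverse; it assumes coordinates normalized to [0, 1), and the function names and bin count are hypothetical rather than taken from the DenseReg code.

```python
import math


def encode(u, num_bins):
    """Split u in [0, 1) into a bin index (classification target) and a
    residual in [0, 1) measured in units of one bin (regression target)."""
    scaled = u * num_bins
    q = min(int(math.floor(scaled)), num_bins - 1)  # clamp guards against u == 1.0
    r = scaled - q
    return q, r


def decode(q, r, num_bins):
    """Recombine a bin index and a residual into a continuous coordinate."""
    return (q + r) / num_bins


q, r = encode(0.37, 10)
u_reconstructed = decode(q, r, 10)
```

In a network following this scheme, the bin index would be trained with a classification loss and the residual with a regression loss, and the two outputs would be recombined with `decode` at inference time.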

* CVPR 2017
