Steven Borkman

PeopleSansPeople: A Synthetic Data Generator for Human-Centric Computer Vision

Dec 17, 2021

Salehe Erfanian Ebadi, You-Cyuan Jhang, Alex Zook, Saurav Dhakad, Adam Crespi, Pete Parisi, Steven Borkman, Jonathan Hogins, Sujoy Ganguly

Abstract: In recent years, person detection and human pose estimation have made great strides, helped by large-scale labeled datasets. However, these datasets had no guarantees or analysis of human activities, poses, or context diversity. Additionally, privacy, legal, safety, and ethical concerns may limit the ability to collect more human data. An emerging alternative to real-world data that alleviates some of these issues is synthetic data. However, creating synthetic data generators is incredibly challenging, which prevents researchers from exploring their usefulness. Therefore, we release a human-centric synthetic data generator, PeopleSansPeople, which contains simulation-ready 3D human assets and a parameterized lighting and camera system, and generates 2D and 3D bounding box, instance and semantic segmentation, and COCO pose labels. Using PeopleSansPeople, we performed benchmark synthetic data training using a Detectron2 Keypoint R-CNN variant [1]. We found that pre-training a network using synthetic data and fine-tuning on target real-world data (few-shot transfer to limited subsets of COCO-person train [2]) resulted in a keypoint AP of $60.37 \pm 0.48$ (COCO test-dev2017), outperforming models trained with the same real data alone (keypoint AP of $55.80$) and models pre-trained with ImageNet (keypoint AP of $57.50$). This freely available data generator should enable a wide range of research into the emerging field of simulation-to-real transfer learning in the critical area of human-centric computer vision.
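The abstract mentions COCO pose labels without showing what they look like. As a rough illustration only, the sketch below decodes the standard COCO person-keypoint layout: 17 named joints stored as a flat list of (x, y, v) triples, where v is 0 for unlabeled, 1 for labeled but not visible, and 2 for labeled and visible. The helper names `parse_keypoints` and `count_labeled` and the sample coordinates are hypothetical; this is not code from the PeopleSansPeople repository.

```python
from typing import Dict, List, Tuple

# Standard COCO person keypoint order (17 joints).
COCO_KEYPOINT_NAMES = [
    "nose", "left_eye", "right_eye", "left_ear", "right_ear",
    "left_shoulder", "right_shoulder", "left_elbow", "right_elbow",
    "left_wrist", "right_wrist", "left_hip", "right_hip",
    "left_knee", "right_knee", "left_ankle", "right_ankle",
]


def parse_keypoints(flat: List[float]) -> Dict[str, Tuple[float, float, int]]:
    """Map a flat COCO keypoint list [x1, y1, v1, ...] to {name: (x, y, v)}."""
    expected = 3 * len(COCO_KEYPOINT_NAMES)
    if len(flat) != expected:
        raise ValueError(f"expected {expected} values, got {len(flat)}")
    return {
        name: (flat[3 * i], flat[3 * i + 1], int(flat[3 * i + 2]))
        for i, name in enumerate(COCO_KEYPOINT_NAMES)
    }


def count_labeled(flat: List[float]) -> int:
    """Count keypoints with visibility flag v > 0 (COCO's num_keypoints)."""
    return sum(1 for v in flat[2::3] if v > 0)


# Hypothetical annotation: nose visible, left eye labeled but occluded,
# all other joints unlabeled (zeros).
sample = [100.0, 50.0, 2, 110.0, 45.0, 1] + [0, 0, 0] * 15
joints = parse_keypoints(sample)
labeled = count_labeled(sample)
```

Keypoint AP figures such as those quoted in the abstract are computed over annotations in this layout, so a decoder like this is typically the first step when inspecting generated labels.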

* The PeopleSansPeople template Unity environment, benchmark binaries, and source code are available at: https://github.com/Unity-Technologies/PeopleSansPeople
