In this work we introduce DoPose, a dataset of highly cluttered and closely stacked objects for segmentation and 6D pose estimation. We show how a careful choice of synthetic data, fine-tuning on our real dataset, and a rational training strategy can boost the performance of existing CNN architectures, allowing them to generalize to real data and produce results comparable to SOTA methods even without post-processing or refinement. Our DoPose dataset, network models, pipeline code, and ROS driver are available online.

Title: Category-agnostic Segmentation for Robotic Grasping

Paper and Code