Abstract: This paper proposes a new deep learning approach to antipodal grasp detection, named Double-Dot Network (DD-Net). It follows the recent anchor-free object detection framework, which does not depend on empirically pre-set anchors and thus allows more general and flexible prediction on unseen objects. Specifically, unlike the widely used 5-dimensional rectangle, the gripper configuration is defined as a pair of fingertips. An effective CNN architecture is introduced to localize these fingertips, and with the help of auxiliary centers for refinement, it infers grasp candidates accurately and robustly. Additionally, we design a specialized loss function to measure the quality of grasps; in contrast to the IoU scores of bounding boxes adopted in object detection, it is more consistent with the grasp detection task. Both simulation and robotic experiments are conducted, and state-of-the-art accuracy is achieved, showing that DD-Net is superior to its counterparts in handling unseen objects.
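To make the contrast between the two grasp representations concrete, the following minimal sketch converts a fingertip pair into the conventional 5-dimensional rectangle (center, angle, width, height). It is an illustration only, not DD-Net's code: the names `RectGrasp` and `fingertips_to_rect` are hypothetical, and the `height` value (the gripper plate size) is an assumed caller-supplied constant, since two fingertip points alone do not determine it.

```python
import math
from typing import NamedTuple, Tuple

Point = Tuple[float, float]


class RectGrasp(NamedTuple):
    """Conventional 5-dimensional grasp rectangle (hypothetical container)."""
    cx: float      # rectangle center, x
    cy: float      # rectangle center, y
    theta: float   # orientation in radians, folded into (-pi/2, pi/2]
    width: float   # gripper opening (distance between the fingertips)
    height: float  # gripper plate size; NOT recoverable from the fingertips


def fingertips_to_rect(p1: Point, p2: Point, height: float = 20.0) -> RectGrasp:
    """Convert a pair of fingertip points into a 5-D grasp rectangle.

    `height` is an assumed constant supplied by the caller.
    """
    cx = (p1[0] + p2[0]) / 2.0
    cy = (p1[1] + p2[1]) / 2.0
    dx = p2[0] - p1[0]
    dy = p2[1] - p1[1]
    width = math.hypot(dx, dy)
    theta = math.atan2(dy, dx)
    # An antipodal grasp is symmetric under swapping the two fingertips,
    # so fold the angle into (-pi/2, pi/2] to get one canonical value.
    if theta > math.pi / 2:
        theta -= math.pi
    elif theta <= -math.pi / 2:
        theta += math.pi
    return RectGrasp(cx, cy, theta, width, height)


if __name__ == "__main__":
    print(fingertips_to_rect((10.0, 20.0), (40.0, 60.0)))
```

Swapping the two fingertips yields the same rectangle, which reflects the symmetry of a parallel-jaw grasp.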
Abstract: Recent years have witnessed great progress in deep learning based object detection. However, due to the domain shift problem, applying off-the-shelf detectors to an unseen domain leads to a significant performance drop. To address this issue, this paper proposes a novel coarse-to-fine feature adaptation approach to cross-domain object detection. At the coarse-grained stage, in contrast to the rough image-level or instance-level feature alignment used in the literature, foreground regions are extracted with an attention mechanism and aligned according to their marginal distributions via multi-layer adversarial learning in the common feature space. At the fine-grained stage, we conduct conditional distribution alignment of foregrounds by minimizing the distance between global prototypes of the same category from different domains. Thanks to this coarse-to-fine feature adaptation, domain knowledge in foreground regions can be effectively transferred. Extensive experiments are carried out in various cross-domain detection scenarios. The results are state-of-the-art, demonstrating the broad applicability and effectiveness of the proposed approach.
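The fine-grained stage can be illustrated with a small sketch of prototype-based alignment. The abstract does not specify how prototypes are computed or which distance is minimized, so this example assumes that a prototype is the mean foreground feature of a category and that the distance is squared Euclidean, averaged over categories present in both domains. The function names are hypothetical, and in practice the target-domain labels would have to be predicted (pseudo-labels), which is also an assumption here.

```python
from collections import defaultdict
from typing import Dict, List, Sequence

Vector = Sequence[float]


def class_prototypes(features: Sequence[Vector],
                     labels: Sequence[int]) -> Dict[int, List[float]]:
    """Return the mean feature vector (prototype) of each category.

    Assumption: a prototype is a plain per-category mean of foreground features.
    """
    sums: Dict[int, List[float]] = {}
    counts: Dict[int, int] = defaultdict(int)
    for feat, label in zip(features, labels):
        if label not in sums:
            sums[label] = [0.0] * len(feat)
        for i, value in enumerate(feat):
            sums[label][i] += value
        counts[label] += 1
    return {c: [s / counts[c] for s in sums[c]] for c in sums}


def prototype_alignment_loss(src: Dict[int, List[float]],
                             tgt: Dict[int, List[float]]) -> float:
    """Mean squared Euclidean distance between same-category prototypes.

    Only categories present in both domains contribute (assumed behaviour).
    """
    shared = src.keys() & tgt.keys()
    if not shared:
        return 0.0
    total = sum(
        sum((a - b) ** 2 for a, b in zip(src[c], tgt[c])) for c in shared
    )
    return total / len(shared)


if __name__ == "__main__":
    source = class_prototypes([[0.0, 0.0], [2.0, 2.0]], [1, 1])
    target = class_prototypes([[3.0, 1.0]], [1])  # labels assumed predicted
    print(prototype_alignment_loss(source, target))
```

Minimizing this quantity with respect to the feature extractor pulls same-category foreground features from the two domains toward each other, which is the intuition behind conditional distribution alignment.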