Abstract:To achieve high-accuracy manipulation in the presence of unknown dynamics and external disturbance, we propose an efficient and robust motion controller (named TvUDE) for robotic manipulators. The controller incorporates a disturbance estimation mechanism that utilizes reformulated robot dynamics and filtering operations to obtain uncertainty and disturbance without requiring measurement of acceleration. Furthermore, we design a time-varying control input gain to enhance the control system's robustness. Finally, we analyze the boundness of the control signal and the stability of the closed-loop system, and conduct a set of experiments on a six-DOF robotic manipulator. The experimental results verify the effectiveness of TvUDE in handling internal uncertainty and external static or transient disturbance.
Abstract:In this paper, we present a novel training scheme, namely Teach-DETR, to learn better DETR-based detectors from versatile teacher detectors. We show that the predicted boxes from teacher detectors are effective medium to transfer knowledge of teacher detectors, which could be either RCNN-based or DETR-based detectors, to train a more accurate and robust DETR model. This new training scheme can easily incorporate the predicted boxes from multiple teacher detectors, each of which provides parallel supervisions to the student DETR. Our strategy introduces no additional parameters and adds negligible computational cost to the original detector during training. During inference, Teach-DETR brings zero additional overhead and maintains the merit of requiring no non-maximum suppression. Extensive experiments show that our method leads to consistent improvement for various DETR-based detectors. Specifically, we improve the state-of-the-art detector DINO with Swin-Large backbone, 4 scales of feature maps and 36-epoch training schedule, from 57.8% to 58.9% in terms of mean average precision on MSCOCO 2017 validation set. Code will be available at https://github.com/LeonHLJ/Teach-DETR.
Abstract:When using cut-and-paste to acquire a composite image, the geometry inconsistency between foreground and background may severely harm its fidelity. To address the geometry inconsistency in composite images, several existing works learned to warp the foreground object for geometric correction. However, the absence of annotated dataset results in unsatisfactory performance and unreliable evaluation. In this work, we contribute a Spatial TRAnsformation for virtual Try-on (STRAT) dataset covering three typical application scenarios. Moreover, previous works simply concatenate foreground and background as input without considering their mutual correspondence. Instead, we propose a novel correspondence learning network (CorrelNet) to model the correspondence between foreground and background using cross-attention maps, based on which we can predict the target coordinate that each source coordinate of foreground should be mapped to on the background. Then, the warping parameters of foreground object can be derived from pairs of source and target coordinates. Additionally, we learn a filtering mask to eliminate noisy pairs of coordinates to estimate more accurate warping parameters. Extensive experiments on our STRAT dataset demonstrate that our proposed CorrelNet performs more favorably against previous methods.