Abstract:The versatility and adaptability of human grasping catalyze advancing dexterous robotic manipulation. While significant strides have been made in dexterous grasp generation, current research endeavors pivot towards optimizing object manipulation while ensuring functional integrity, emphasizing the synthesis of functional grasps following desired affordance instructions. This paper addresses the challenge of synthesizing functional grasps tailored to diverse dexterous robotic hands by proposing DexGrasp-Diffusion, an end-to-end modularized diffusion-based pipeline. DexGrasp-Diffusion integrates MultiHandDiffuser, a novel unified data-driven diffusion model for multi-dexterous hands grasp estimation, with DexDiscriminator, which employs a Physics Discriminator and a Functional Discriminator with open-vocabulary setting to filter physically plausible functional grasps based on object affordances. The experimental evaluation conducted on the MultiDex dataset provides substantiating evidence supporting the superior performance of MultiHandDiffuser over the baseline model in terms of success rate, grasp diversity, and collision depth. Moreover, we demonstrate the capacity of DexGrasp-Diffusion to reliably generate functional grasps for household objects aligned with specific affordance instructions.
Abstract:Target detection and tracking provides crucial information for motion planning and decision making in autonomous driving. This paper proposes an online multi-object tracking (MOT) framework with tracking-by-detection for maneuvering vehicles under motion uncertainty in dynamic road context. We employ a point cloud based vehicle detector to provide real-time 3D bounding boxes of detected vehicles and conduct the online bipartite optimization of the maneuver-orientated data association between the detections and the targets. Kalman Filter (KF) is adopted as the backbone for multi-object tracking. In order to entertain the maneuvering uncertainty, we leverage the interacting multiple model (IMM) approach to obtain the \textit{a-posterior} residual as the cost for each association hypothesis, which is calculated with the hybrid model posterior (after mode-switch). Road context is integrated to conduct adjustments of the time varying transition probability matrix (TPM) of the IMM to regulate the maneuvers according to road segments and traffic sign/signals, with which the data association is performed in a unified spatial-temporal fashion. Experiments show our framework is able to effectively track multiple vehicles with maneuvers subject to dynamic road context and localization drift.
Abstract:Autonomous driving requires 3D perception of vehicles and other objects in the in environment. Much of the current methods support 2D vehicle detection. This paper proposes a flexible pipeline to adopt any 2D detection network and fuse it with a 3D point cloud to generate 3D information with minimum changes of the 2D detection networks. To identify the 3D box, an effective model fitting algorithm is developed based on generalised car models and score maps. A two-stage convolutional neural network (CNN) is proposed to refine the detected 3D box. This pipeline is tested on the KITTI dataset using two different 2D detection networks. The 3D detection results based on these two networks are similar, demonstrating the flexibility of the proposed pipeline. The results rank second among the 3D detection algorithms, indicating its competencies in 3D detection.
Abstract:When a vehicle observes another one, the two vehicles' poses are correlated by this spatial relative observation, which can be used in cooperative localization for further increasing localization accuracy and precision. To use spatial relative observations, we propose to add them into a pose graph for optimal pose estimation. Before adding them, we need to know the identities of the observed vehicles. The vehicle identification is formulated as a linear assignment problem, which can be solved efficiently. By using pose graph techniques and the start-of-the-art factor composition/decomposition method, our cooperative localization algorithm is robust against communication delay, packet loss, and out-of-sequence packet reception. We demonstrate the usability of our framework and effectiveness of our algorithm through both simulations and real-world experiments using three vehicles on the road.