Marcelo H. Ang Jr.

Reasoning and Learning a Perceptual Metric for Self-Training of Reflective Objects in Bin-Picking with a Low-cost Camera

Mar 26, 2025

Peiyuan Ni, Chee Meng Chew, Marcelo H. Ang Jr., Gregory S. Chirikjian

Abstract: Bin-picking of metal objects using low-cost RGB-D cameras often suffers from sparse depth information and reflective surface textures, leading to errors and the need for manual labeling. To reduce human intervention, we propose a two-stage framework consisting of a metric learning stage and a self-training stage. Specifically, to automatically process data captured by a low-cost camera (LC), we introduce a Multi-object Pose Reasoning (MoPR) algorithm that optimizes pose hypotheses under depth, collision, and boundary constraints. To further refine pose candidates, we adopt a Symmetry-aware Lie-group based Bayesian Gaussian Mixture Model (SaL-BGMM), integrated with the Expectation-Maximization (EM) algorithm, for symmetry-aware filtering. Additionally, we propose a Weighted Ranking Information Noise Contrastive Estimation (WR-InfoNCE) loss to enable the LC to learn a perceptual metric from reconstructed data, supporting self-training on untrained or even unseen objects. Experimental results show that our approach outperforms several state-of-the-art methods on both the ROBI dataset and our newly introduced Self-ROBI dataset.

* 9 pages, 10 figures
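
For readers unfamiliar with the contrastive loss family that WR-InfoNCE builds on, the following is a minimal sketch of the standard InfoNCE loss for a single anchor. It is not the paper's weighted ranking variant; the function name `info_nce`, the similarity scores, and the temperature value are all illustrative assumptions.

```python
import math

def info_nce(pos_score, neg_scores, temperature=0.1):
    """Standard InfoNCE loss for one anchor (hypothetical helper).

    Computes -log( exp(s_pos / t) / sum_k exp(s_k / t) ), where the sum
    runs over the positive and all negatives.
    """
    logits = [pos_score / temperature] + [s / temperature for s in neg_scores]
    m = max(logits)  # subtract the max for numerical stability
    log_denom = m + math.log(sum(math.exp(l - m) for l in logits))
    return log_denom - logits[0]

# Illustrative similarity scores (assumed values, not from the paper).
loss_good = info_nce(0.9, [0.1, 0.2, 0.0])  # positive ranks highest
loss_bad = info_nce(0.1, [0.9, 0.8, 0.7])   # positive ranks lowest
print(loss_good < loss_bad)
```

The loss is small when the positive pair scores higher than the negatives and grows as negatives overtake it, which is the property a learned perceptual metric relies on.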

DexGrasp-Diffusion: Diffusion-based Unified Functional Grasp Synthesis Pipeline for Multi-Dexterous Robotic Hands

Jul 13, 2024

Zhengshen Zhang, Lei Zhou, Chenchen Liu, Zhiyang Liu, Chengran Yuan, Sheng Guo, Ruiteng Zhao, Marcelo H. Ang Jr., Francis EH Tay

Abstract: The versatility and adaptability of human grasping motivate advances in dexterous robotic manipulation. While significant strides have been made in dexterous grasp generation, current research is shifting toward optimizing object manipulation while ensuring functional integrity, emphasizing the synthesis of functional grasps that follow desired affordance instructions. This paper addresses the challenge of synthesizing functional grasps tailored to diverse dexterous robotic hands by proposing DexGrasp-Diffusion, an end-to-end modularized diffusion-based pipeline. DexGrasp-Diffusion integrates MultiHandDiffuser, a novel unified data-driven diffusion model for grasp estimation across multiple dexterous hands, with DexDiscriminator, which employs a Physics Discriminator and a Functional Discriminator in an open-vocabulary setting to filter physically plausible functional grasps based on object affordances. Experimental evaluation on the MultiDex dataset shows that MultiHandDiffuser outperforms the baseline model in success rate, grasp diversity, and collision depth. Moreover, we demonstrate the capacity of DexGrasp-Diffusion to reliably generate functional grasps for household objects aligned with specific affordance instructions.

Online Multi-Target Tracking for Maneuvering Vehicles in Dynamic Road Context

Dec 02, 2019

Zehui Meng, Qi Heng Ho, Zefan Huang, Hongliang Guo, Marcelo H. Ang Jr., Daniela Rus

Abstract: Target detection and tracking provide crucial information for motion planning and decision making in autonomous driving. This paper proposes an online multi-object tracking (MOT) framework with tracking-by-detection for maneuvering vehicles under motion uncertainty in dynamic road context. We employ a point cloud based vehicle detector to provide real-time 3D bounding boxes of detected vehicles and conduct online bipartite optimization of the maneuver-oriented data association between the detections and the targets. A Kalman Filter (KF) is adopted as the backbone for multi-object tracking. To account for maneuvering uncertainty, we leverage the interacting multiple model (IMM) approach to obtain the a posteriori residual as the cost for each association hypothesis, which is calculated with the hybrid model posterior (after mode switch). Road context is integrated to adjust the time-varying transition probability matrix (TPM) of the IMM, regulating the maneuvers according to road segments and traffic signs and signals, so that the data association is performed in a unified spatial-temporal fashion. Experiments show that our framework is able to effectively track multiple vehicles with maneuvers subject to dynamic road context and localization drift.

* Submitted to ICRA 2020
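
To illustrate how a context-dependent TPM can change an IMM filter's behavior, here is a minimal sketch of the IMM mode-probability prediction step, in which the predicted probability of mode j is the sum over i of p_ij times the current probability of mode i. The two modes (constant velocity and turn), the TPM values, and the names `predict_mode_probs`, `TPM_STRAIGHT`, and `TPM_JUNCTION` are assumptions for illustration, not values from the paper.

```python
def predict_mode_probs(mode_probs, tpm):
    """IMM mode-probability prediction: c_j = sum_i tpm[i][j] * mu_i.

    tpm[i][j] is the probability of switching from mode i to mode j,
    so every row of tpm must sum to 1.
    """
    n = len(mode_probs)
    return [sum(tpm[i][j] * mode_probs[i] for i in range(n)) for j in range(n)]

# Hypothetical modes: index 0 = constant velocity, index 1 = turn.
# Hypothetical TPMs selected by road context (assumed values).
TPM_STRAIGHT = [[0.95, 0.05],
                [0.30, 0.70]]  # straight segment: turning is unlikely
TPM_JUNCTION = [[0.60, 0.40],
                [0.20, 0.80]]  # near a junction: turning is more likely

mu = [0.9, 0.1]  # current mode probabilities for one tracked vehicle
straight = predict_mode_probs(mu, TPM_STRAIGHT)
junction = predict_mode_probs(mu, TPM_JUNCTION)
print(straight, junction)
```

With the same prior, the junction TPM assigns more predicted probability to the turn mode than the straight-segment TPM does, which is the kind of regulation by road context the abstract describes.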

A General Pipeline for 3D Detection of Vehicles

Feb 12, 2018

Xinxin Du, Marcelo H. Ang Jr., Sertac Karaman, Daniela Rus

Abstract: Autonomous driving requires 3D perception of vehicles and other objects in the environment. Most current methods support 2D vehicle detection. This paper proposes a flexible pipeline to adopt any 2D detection network and fuse it with a 3D point cloud to generate 3D information with minimal changes to the 2D detection networks. To identify the 3D box, an effective model fitting algorithm is developed based on generalised car models and score maps. A two-stage convolutional neural network (CNN) is proposed to refine the detected 3D box. This pipeline is tested on the KITTI dataset using two different 2D detection networks. The 3D detection results based on these two networks are similar, demonstrating the flexibility of the proposed pipeline. The results rank second among the 3D detection algorithms, indicating that the pipeline is competitive in 3D detection.

* Accepted at ICRA 2018

A General Framework for Multi-vehicle Cooperative Localization Using Pose Graph

Apr 05, 2017

Xiaotong Shen, Hans Andersen, Wei Kang Leong, Hai Xun Kong, Marcelo H. Ang Jr., Daniela Rus

Abstract: When a vehicle observes another one, the two vehicles' poses are correlated by this spatial relative observation, which can be used in cooperative localization to further increase localization accuracy and precision. To use spatial relative observations, we propose to add them into a pose graph for optimal pose estimation. Before adding them, we need to know the identities of the observed vehicles. The vehicle identification is formulated as a linear assignment problem, which can be solved efficiently. By using pose graph techniques and the state-of-the-art factor composition/decomposition method, our cooperative localization algorithm is robust against communication delay, packet loss, and out-of-sequence packet reception. We demonstrate the usability of our framework and effectiveness of our algorithm through both simulations and real-world experiments using three vehicles on the road.
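
As a rough illustration of vehicle identification posed as a linear assignment problem, the sketch below matches observed positions to candidate vehicle positions by minimizing total Euclidean distance. The cost function, the coordinates, and the name `assign_identities` are assumptions; the exhaustive search is used only because it is short and adequate for a handful of vehicles, whereas an efficient solver such as the Hungarian algorithm would be the usual choice at scale.

```python
from itertools import permutations
import math

def assign_identities(observed, candidates):
    """Brute-force linear assignment (hypothetical helper).

    Returns (assignment, cost), where assignment[i] is the index of the
    candidate matched to observation i. Requires
    len(observed) <= len(candidates).
    """
    best, best_cost = None, math.inf
    for perm in permutations(range(len(candidates)), len(observed)):
        # Total cost of this one-to-one matching (assumed cost: distance).
        cost = sum(math.dist(observed[i], candidates[j])
                   for i, j in enumerate(perm))
        if cost < best_cost:
            best, best_cost = perm, cost
    return best, best_cost

# Illustrative 2D positions in a shared frame (assumed values).
observed = [(10.2, 0.1), (4.9, 3.1)]                 # relative observations
candidates = [(5.0, 3.0), (10.0, 0.0), (20.0, 8.0)]  # reported vehicle poses
match, cost = assign_identities(observed, candidates)
print(match, round(cost, 3))
```

Here the first observation is matched to candidate 1 and the second to candidate 0, leaving the distant third candidate unassigned.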