Abstract: In this paper, we propose a novel plane clustering method specialized for cluttered scenes captured with an RGB-D camera and validate its effectiveness through robot grasping experiments. Unlike existing methods, which focus on large-scale indoor structures, our approach, Multi-Object RANSAC, targets cluttered environments that contain a wide range of objects at different scales. It enhances plane segmentation by generating subplanes in a Deep Plane Clustering (DPC) module, which are then merged with the final planes by post-processing. DPC rearranges the point cloud through voting layers to form subplane clusters and is trained in a self-supervised manner using pseudo-labels generated from RANSAC. Multi-Object RANSAC demonstrates superior plane instance segmentation performance over other recent RANSAC applications. We conducted an experiment on robot suction-based grasping, comparing our method with a vision-based grasping network and RANSAC applications. The results from this real-world scenario showed that our method outperformed the baseline methods, highlighting its potential for advanced scene understanding and manipulation.
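The abstract above states that DPC is trained on pseudo-labels generated from RANSAC but does not say how those labels are produced. The following is a minimal sketch of one plausible way to do it: sequential RANSAC plane fitting that assigns a plane index to each point. It is not the authors' implementation. The function names (`fit_plane`, `ransac_plane`, `pseudo_labels`) and all parameter values (`threshold`, `iterations`, `max_planes`, `min_inliers`) are assumptions made for illustration.

```python
import numpy as np

def fit_plane(p0, p1, p2):
    """Plane through three points as (unit normal n, offset d) with n.x + d = 0.
    Returns None when the points are (nearly) collinear."""
    n = np.cross(p1 - p0, p2 - p0)
    norm = np.linalg.norm(n)
    if norm < 1e-12:
        return None
    n = n / norm
    return n, -float(n @ p0)

def ransac_plane(points, threshold=0.01, iterations=200, seed=0):
    """Boolean inlier mask of the best plane found by plain RANSAC."""
    rng = np.random.default_rng(seed)
    best_mask = np.zeros(len(points), dtype=bool)
    for _ in range(iterations):
        idx = rng.choice(len(points), size=3, replace=False)
        model = fit_plane(*points[idx])
        if model is None:
            continue
        n, d = model
        # Point-to-plane distance for every point; keep the largest consensus set.
        mask = np.abs(points @ n + d) < threshold
        if mask.sum() > best_mask.sum():
            best_mask = mask
    return best_mask

def pseudo_labels(points, max_planes=3, min_inliers=50, **ransac_kwargs):
    """Assumed labelling scheme: extract planes one at a time and label
    their inliers 0, 1, ...; points on no accepted plane keep label -1."""
    labels = np.full(len(points), -1)
    remaining = np.arange(len(points))
    for k in range(max_planes):
        if len(remaining) < 3:
            break
        mask = ransac_plane(points[remaining], **ransac_kwargs)
        if mask.sum() < min_inliers:
            break
        labels[remaining[mask]] = k
        remaining = remaining[~mask]
    return labels
```

Calling `pseudo_labels(points)` on an `(N, 3)` array returns one integer label per point. In this sketch the largest plane is extracted first, so label indices follow decreasing inlier count.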
Abstract: Object slip perception is essential for mobile manipulation robots to perform manipulation tasks reliably in the dynamic real world. Traditional approaches to slip perception for robot arms use tactile or vision sensors. However, mobile robots must also deal with noise in their sensor signals caused by the robot's movement in a changing environment. To solve this problem, we present an anomaly detection method that uses multisensory data and is based on a deep autoencoder model. The proposed framework integrates heterogeneous data streams collected from various robot sensors, including RGB and depth cameras, a microphone, and a force-torque sensor. The integrated data is used to train a deep autoencoder to construct latent representations of the multisensory data that indicate the normal status. Anomalies can then be identified by error scores, measured as the difference between the trained encoder's latent values for the input and its latent values for the reconstructed input data. To evaluate the proposed framework, we conducted an experiment that mimics an object slip by a mobile service robot operating in a real-world environment with diverse household objects and different moving patterns. The experimental results verified that the proposed framework reliably detects anomalies in object slip situations despite varied object types and robot behaviors as well as visual and auditory noise in the environment.
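The error score described in the abstract above compares the encoder's latent values for an input with its latent values for the reconstruction of that input. A minimal sketch of that score follows, assuming `encode` and `decode` are the two halves of an already trained autoencoder. The toy encoder and decoder are stand-ins, not a deep network. The Euclidean norm and the mean-plus-`k`-standard-deviations threshold are also assumptions, since the abstract specifies neither the distance nor the decision rule. All names are hypothetical.

```python
import numpy as np

def latent_error_scores(encode, decode, x):
    """Score each row of x by || E(x) - E(D(E(x))) ||_2."""
    z = encode(x)                   # latent values of the input
    z_rec = encode(decode(z))       # latent values of the reconstructed input
    return np.linalg.norm(z - z_rec, axis=1)

def fit_threshold(normal_scores, k=3.0):
    """Assumed decision rule: mean + k * std of scores on normal data."""
    return float(np.mean(normal_scores) + k * np.std(normal_scores))

# Toy stand-in for a trained autoencoder: 3-D input, 2-D latent.
W = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [0.0, 0.0]])

def toy_encode(x):
    return x @ W

def toy_decode(z):
    # Saturating decoder: it can only reproduce latents in the range
    # seen on normal data ([-1, 1] here), mimicking a model that
    # reconstructs normal patterns well and anomalous ones poorly.
    return np.clip(z, -1.0, 1.0) @ W.T
```

With these stand-ins, inputs whose latents stay inside the normal range score zero, while an input that pushes a latent outside that range gets a positive score and can be flagged by comparing it against `fit_threshold(...)`. A purely linear autoencoder with an exact inverse decoder would score every input as zero, so the score is only informative when the decoder fails to reproduce out-of-distribution latents.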