Abstract: Excavation of irregular rigid objects in clutter, such as fragmented rocks and wood blocks, is very challenging due to their complex interaction dynamics and highly variable geometries. In this paper, we adopt reinforcement learning (RL) to tackle this challenge and learn policies that plan a sequence of excavation trajectories for irregular rigid objects, given point clouds of excavation scenes. Moreover, we separately learn a compact representation of the point cloud on geometric tasks that do not require human labeling. We show that using the representation reduces training time for RL while achieving similar asymptotic performance compared to an end-to-end RL algorithm. When a policy trained in simulation is applied directly to a real scene, we show that the policy trained with the representation outperforms end-to-end RL. To the best of our knowledge, this paper presents the first application of RL to planning a sequence of excavation trajectories for irregular rigid objects in clutter.
Abstract: Autonomous excavation of hard or compact materials, especially irregular rigid objects, is challenging due to the high variance of the geometric and physical properties of objects and the large resistive forces during excavation. In this paper, we propose a novel learning-based excavation planning method for rigid objects in clutter. Our method consists of a convolutional neural network that predicts excavation success and a sampling-based optimization method that plans high-quality excavation trajectories by leveraging the learned prediction model. To reduce the sim2real gap for excavation learning, we propose a voxel-based representation of the excavation scene. We perform excavation experiments in both simulation and the real world to evaluate the learning-based excavation planners. We further compare with two heuristic baseline excavation planners and one data-driven scene-independent planner. The experimental results show that our method can plan high-quality excavations for rigid objects in clutter and outperforms the baseline methods by large margins. As far as we know, our work presents the first learning-based excavation planner for cluttered and irregular rigid objects.
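The pairing of a learned success predictor with a sampling-based optimizer can be sketched as follows. This is a minimal illustration only: the abstract does not name the optimizer, so the cross-entropy method is swapped in here as one common sampling-based choice, and `predicted_success`, its two trajectory parameters (depth and angle), and all numeric settings are hypothetical stand-ins, not the paper's network or parameterization.

```python
import math
import random


def predicted_success(params):
    """Hypothetical stand-in for a learned excavation success predictor.

    Returns a score in (0, 1] that peaks at an arbitrary 'best' trajectory
    (depth 0.3, angle 0.5). A real predictor would also take the scene as input.
    """
    depth, angle = params
    return math.exp(-((depth - 0.3) ** 2 + (angle - 0.5) ** 2) / 0.05)


def cem_plan(score_fn, dim=2, iters=30, pop=64, n_elite=8, seed=0):
    """Cross-entropy method: sample, keep the top scorers, refit a Gaussian."""
    rng = random.Random(seed)
    mean = [0.0] * dim
    std = [1.0] * dim
    for _ in range(iters):
        # Draw candidate trajectory parameters from the current Gaussian.
        samples = [[rng.gauss(mean[d], std[d]) for d in range(dim)]
                   for _ in range(pop)]
        # Rank candidates by predicted success and keep the elites.
        samples.sort(key=score_fn, reverse=True)
        elites = samples[:n_elite]
        # Refit the per-dimension mean and standard deviation to the elites.
        for d in range(dim):
            vals = [e[d] for e in elites]
            mean[d] = sum(vals) / n_elite
            var = sum((v - mean[d]) ** 2 for v in vals) / n_elite
            std[d] = math.sqrt(var) + 1e-3  # small floor keeps some exploration
    return mean


best = cem_plan(predicted_success)
```

Here `best` holds the planned parameters. In a real planner the score function would be the trained network evaluated on the voxelized scene together with each candidate trajectory.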
Abstract: Learning-based approaches to grasp planning are preferred over analytical methods due to their ability to better generalize to new, partially observed objects. However, data collection remains one of the biggest bottlenecks for grasp learning methods, particularly for multi-fingered hands. The relatively high-dimensional configuration space of the hands, coupled with the diversity of objects common in daily life, requires a significant number of samples to produce robust and confident grasp success classifiers. In this paper, we present the first active learning approach to grasping that searches over the grasp configuration space and classifier confidence in a unified manner. Our real-robot grasping experiment shows that our active grasp planner, using less training data, achieves success rates comparable to those of a passive supervised planner trained with geometrical grasping data. We also compute the differential entropy to demonstrate that our active learner generates grasps with larger diversity than passive supervised learning using more heuristic data. We base our approach on recent success in planning multi-fingered grasps as probabilistic inference with a learned neural network likelihood function. We embed this within a multi-armed bandit formulation of sample selection. We show that our active grasp learning approach uses fewer training samples to produce grasp success rates comparable to those of the passive supervised learning method trained with grasping data generated by an analytical planner.
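A multi-armed bandit over sample-selection strategies can be illustrated with the sketch below. The abstract does not say which bandit algorithm or which arms are used, so UCB1 is swapped in as a well-known rule, and the three arms, their reward probabilities, and the function names `ucb1_select` and `run_bandit` are assumptions made purely for illustration.

```python
import math
import random


def ucb1_select(counts, totals, t):
    """Pick the arm with the highest UCB1 score; try every arm once first."""
    for arm, n in enumerate(counts):
        if n == 0:
            return arm
    scores = [
        totals[a] / counts[a] + math.sqrt(2.0 * math.log(t) / counts[a])
        for a in range(len(counts))
    ]
    return scores.index(max(scores))


def run_bandit(reward_probs, rounds=2000, seed=0):
    """Simulate Bernoulli arms; each arm is a hypothetical selection strategy."""
    rng = random.Random(seed)
    counts = [0] * len(reward_probs)
    totals = [0.0] * len(reward_probs)
    for t in range(1, rounds + 1):
        arm = ucb1_select(counts, totals, t)
        # Reward 1.0 means the chosen strategy produced a useful sample.
        reward = 1.0 if rng.random() < reward_probs[arm] else 0.0
        counts[arm] += 1
        totals[arm] += reward
    return counts


# Made-up reward probabilities for three hypothetical strategies.
counts = run_bandit([0.2, 0.5, 0.8])
```

`counts` records how often each strategy was chosen. Over many rounds the rule concentrates its pulls on the arm with the highest average reward while still occasionally revisiting the others.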
Abstract: We propose a novel approach to multi-fingered grasp planning leveraging learned deep neural network models. We train a voxel-based 3D convolutional neural network to predict grasp success probability as a function of both visual information of an object and grasp configuration. We can then formulate grasp planning as inferring the grasp configuration that maximizes the probability of grasp success. In addition, we learn a prior over grasp configurations as a mixture density network conditioned on our voxel-based object representation. We show that this object-conditional prior, when used with the learned grasp success prediction network, improves grasp inference compared to a learned, object-agnostic prior or an uninformed uniform prior. Our work is the first to directly plan high-quality multi-fingered grasps in configuration space using a deep neural network without the need for an external planner. We validate our inference method by performing multi-fingered grasping on a physical robot. Our experimental results show that our planning method outperforms existing grasp planning methods for neural networks.
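One standard way to write the inference described above is as a maximum a posteriori problem. The notation is an assumption, since the abstract introduces no symbols: $\theta$ is the grasp configuration, $o$ the voxel-based object representation, $Y = 1$ the event of grasp success, and $\pi_k$, $\mu_k$, $\Sigma_k$ the weights, means, and covariances that a mixture density network with $K$ components would output.

```latex
\theta^{*} = \arg\max_{\theta} \; \log p(Y = 1 \mid \theta, o) + \log p(\theta \mid o),
\qquad
p(\theta \mid o) = \sum_{k=1}^{K} \pi_k(o)\, \mathcal{N}\!\left(\theta;\, \mu_k(o),\, \Sigma_k(o)\right)
```

Under this reading, an object-agnostic prior drops the dependence of $\pi_k$, $\mu_k$, $\Sigma_k$ on $o$, and a uniform prior makes the second term constant, so only the success likelihood drives the optimization.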
Abstract: Deep learning has enabled remarkable improvements in grasp synthesis for previously unseen objects viewed from partial views. However, existing approaches lack the ability to explicitly reason about the full 3D geometry of the object when selecting a grasp, relying instead on indirect geometric reasoning derived when learning grasp success networks. This abandons common-sense geometric reasoning, such as avoiding undesired robot-object collisions. We propose to utilize a novel, learned 3D reconstruction to enable geometric awareness in a grasping system. We leverage the structure of the reconstruction network to learn a grasp success classifier that serves as the objective function for a continuous grasp optimization. We additionally constrain the optimization explicitly to avoid undesired contact, directly using the reconstruction. By using the reconstruction network, our method can grasp objects from a new camera viewpoint that was not seen during training. Our results, based on real robot execution, show that utilizing learned geometry outperforms alternative formulations for partial-view information. Our results can be found at https://sites.google.com/view/reconstruction-grasp/.
Abstract: Different manipulation tasks require different types of grasps. For example, holding a heavy tool like a hammer requires a multi-fingered power grasp offering stability, while holding a pen to write requires a multi-fingered precision grasp to impart dexterity on the object. In this paper, we propose a probabilistic grasp planner that explicitly models grasp type for planning high-quality precision and power grasps in real time. We take a learning approach in order to plan grasps of different types for previously unseen objects when only partial visual information is available. Our work demonstrates the first supervised learning approach to grasp planning that can explicitly plan both power and precision grasps for a given object. Additionally, we compare our learned grasp model with a model that does not encode type and show that modeling grasp type improves the success rate of generated grasps. Furthermore, we show the benefit of learning a prior over grasp configurations to improve grasp inference with a learned classifier.
Abstract: We propose a novel approach to multi-fingered grasp planning leveraging learned deep neural network models. We train a convolutional neural network to predict grasp success as a function of both visual information of an object and grasp configuration. We can then formulate grasp planning as inferring the grasp configuration that maximizes the probability of grasp success. We efficiently perform this inference using a gradient-ascent optimization inside the neural network using the backpropagation algorithm. Our work is the first to directly plan high-quality multi-fingered grasps in configuration space using a deep neural network without the need for an external planner. We validate our inference method by performing both multi-fingered and two-fingered grasps on real robots. Our experimental results show that our planning method outperforms existing planning methods for neural networks, while offering several other benefits, including being data-efficient in learning and fast enough to be deployed in real robotic applications.
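The idea of planning by gradient ascent on a learned success model can be sketched as below. This is an illustration under stated assumptions, not the paper's implementation: a two-parameter logistic model with made-up weights `W` and `B` stands in for the trained network, its gradient is written analytically rather than obtained by backpropagation, and the box limits with clipping are an added assumption for keeping the configuration in a valid range.

```python
import math

# Hypothetical stand-in for a trained success predictor: a logistic model
# over a 2-D grasp configuration. The weights are made up for illustration.
W = [1.5, -2.0]
B = 0.1


def success_prob(theta):
    """Predicted probability of grasp success for configuration theta."""
    z = sum(w * x for w, x in zip(W, theta)) + B
    return 1.0 / (1.0 + math.exp(-z))


def grad_log_prob(theta):
    """Gradient of log sigmoid(w . theta + b), which is (1 - p) * w."""
    p = success_prob(theta)
    return [(1.0 - p) * w for w in W]


def plan_grasp(theta0, lower, upper, step=0.1, iters=200):
    """Projected gradient ascent on log success probability within box limits."""
    theta = list(theta0)
    for _ in range(iters):
        g = grad_log_prob(theta)
        # Take an ascent step, then clip each coordinate to its limits.
        theta = [min(max(x + step * gi, lo), hi)
                 for x, gi, lo, hi in zip(theta, g, lower, upper)]
    return theta


theta_star = plan_grasp([0.0, 0.0], lower=[-1.0, -1.0], upper=[1.0, 1.0])
```

With a real network, `grad_log_prob` would be replaced by the gradient of the network output with respect to the grasp configuration input, computed by an automatic differentiation library while the network weights stay fixed.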