Abstract:This paper presents a new self-supervised system for learning to detect novel and previously unseen categories of objects in images. The proposed system receives as input several unlabeled videos of scenes containing various objects. The frames of the videos are segmented into objects using depth information, and the segments are tracked along each video. The system then constructs a weighted graph that connects sequences based on the similarities between the objects that they contain. The similarity between two sequences of objects is measured by using generic visual features, after automatically re-arranging the frames in the two sequences to align the viewpoints of the objects. The graph is used to sample triplets of similar and dissimilar examples by performing random walks. The triplet examples are finally used to train a siamese neural network that projects the generic visual features into a low-dimensional manifold. Experiments on three public datasets, YCB-Video, CORe50 and RGBD-Object, show that the projected low-dimensional features improve the accuracy of clustering unknown objects into novel categories, and outperform several recent unsupervised clustering techniques.
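The triplet-sampling step described above can be illustrated with a small sketch. The abstract does not specify the exact sampling rule, so this rests on assumptions: a positive is taken to be the endpoint of a short weighted random walk from the anchor, and a negative is any sequence that none of the walks reached. The names `random_walk`, `sample_triplet`, `steps` and `n_walks` are hypothetical and do not come from the paper.

```python
import random

def random_walk(graph, start, steps, rng):
    """Follow up to `steps` edges from `start`, picking neighbours in proportion to edge weight."""
    node = start
    for _ in range(steps):
        neighbours = graph.get(node, {})
        if not neighbours:
            break  # dead end: stop the walk early
        nodes = list(neighbours)
        weights = [neighbours[n] for n in nodes]
        node = rng.choices(nodes, weights=weights, k=1)[0]
    return node

def sample_triplet(graph, anchor, rng, steps=2, n_walks=50):
    """Return (anchor, positive, negative), or None if no valid triplet exists.

    Assumption: nodes reached by short walks count as similar to the anchor,
    and nodes never reached count as dissimilar.
    """
    reached = [random_walk(graph, anchor, steps, rng) for _ in range(n_walks)]
    positives = [n for n in reached if n != anchor]
    reached_set = set(reached) | {anchor}
    negatives = [n for n in graph if n not in reached_set]
    if not positives or not negatives:
        return None
    return anchor, rng.choice(positives), rng.choice(negatives)

# Toy similarity graph: sequences a, b, c show one object; d, e show another.
graph = {
    "a": {"b": 1.0, "c": 1.0},
    "b": {"a": 1.0, "c": 1.0},
    "c": {"a": 1.0, "b": 1.0},
    "d": {"e": 1.0},
    "e": {"d": 1.0},
}
triplet = sample_triplet(graph, "a", random.Random(0))
```

In this toy graph the two groups are disconnected, so the positive always comes from `{b, c}` and the negative from `{d, e}`. On a real similarity graph with weak cross-category edges, the walk length would control how strict the notion of "similar" is.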
Abstract:This paper introduces a new technique for learning probabilistic models of mass and friction distributions of unknown objects, and performing robust sliding actions by using the learned models. The proposed method is executed in two consecutive phases. In the exploration phase, a table-top object is poked by a robot from different angles. The observed motions of the object are compared against simulated motions with various hypothesized mass and friction models. The simulation-to-reality gap is then differentiated with respect to the unknown mass and friction parameters, and the analytically computed gradient is used to optimize those parameters. Since it is difficult to disentangle the mass from the friction coefficients in low-data and quasi-static motion regimes, our approach retains a set of locally optimal pairs of mass and friction models. A probability distribution on the models is computed based on the relative accuracy of each pair of models. In the exploitation phase, a probabilistic planner is used to select a goal configuration and waypoints that are stable with high confidence. The proposed technique is evaluated on real objects using a real manipulator. The results show that this technique can not only accurately identify the mass and friction coefficients of non-uniform heterogeneous objects, but can also be used to successfully slide an unknown object to the edge of a table and pick it up from there, without any human assistance or feedback.
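One way to read the step that turns the relative accuracy of the retained models into a probability distribution is sketched below. The abstract does not give the actual formula, so the exponential (softmax-style) weighting, the `temperature` parameter, and the function names are assumptions made for illustration only.

```python
import math

def model_probabilities(errors, temperature=1.0):
    """Map per-model prediction errors to probabilities (lower error, higher probability).

    Assumption: an exponential weighting of the errors; the paper may use a different rule.
    """
    lowest = min(errors)  # shift by the minimum for numerical stability
    weights = [math.exp(-(e - lowest) / temperature) for e in errors]
    total = sum(weights)
    return [w / total for w in weights]

def stability_confidence(probabilities, is_stable):
    """Total probability of the models under which a candidate waypoint is stable."""
    return sum(p for p, stable in zip(probabilities, is_stable) if stable)

# Three hypothetical (mass, friction) model pairs with their simulation-to-reality errors.
errors = [0.02, 0.05, 0.40]
probs = model_probabilities(errors, temperature=0.1)

# A waypoint that is stable under the first two models but not the third.
confidence = stability_confidence(probs, [True, True, False])
```

A planner in the spirit of the exploitation phase could then keep only waypoints whose `confidence` exceeds a chosen threshold.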
Abstract:We propose a new technique for pushing an unknown object from an initial configuration to a goal configuration with stability constraints. The proposed method leverages recent progress in differentiable physics models to learn unknown mechanical properties of pushed objects, such as their distributions of mass and coefficients of friction. The proposed learning technique computes the gradient of the distance between predicted poses of objects and their actual observed poses and utilizes that gradient to search for values of the mechanical properties that reduce the reality gap. The proposed approach is also utilized to optimize a policy to efficiently push an object toward the desired goal configuration. Experiments with real objects using a real robot to gather data show that the proposed approach can identify the mechanical properties of heterogeneous objects from a small number of pushing actions.
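The idea of following the gradient of the reality gap can be shown on a deliberately simple stand-in for a differentiable physics model. The sketch below assumes a block pushed with initial speed `v0` on a flat surface, which slides a distance `v0**2 / (2 * mu * g)` under Coulomb friction, and it recovers the friction coefficient `mu` by gradient descent on the squared distance error. The one-dimensional model, step size, and all names are illustrative assumptions, not the paper's engine or parameters.

```python
G = 9.81  # gravitational acceleration in m/s^2

def predicted_distance(v0, mu):
    """Sliding distance of a block released at speed v0 with friction coefficient mu."""
    return v0 ** 2 / (2.0 * mu * G)

def loss_and_gradient(mu, pushes):
    """Mean squared gap between predicted and observed distances, and its derivative in mu."""
    loss, grad = 0.0, 0.0
    for v0, observed in pushes:
        residual = predicted_distance(v0, mu) - observed
        d_pred_d_mu = -v0 ** 2 / (2.0 * G * mu ** 2)  # analytic derivative of the model
        loss += residual ** 2
        grad += 2.0 * residual * d_pred_d_mu
    n = len(pushes)
    return loss / n, grad / n

def identify_friction(pushes, mu_init=0.6, step_size=2.0, iterations=200):
    """Plain gradient descent on mu; the clamp keeps mu strictly positive."""
    mu = mu_init
    for _ in range(iterations):
        _, grad = loss_and_gradient(mu, pushes)
        mu = max(mu - step_size * grad, 1e-3)
    return mu

# Synthetic "observations" generated with a known coefficient, then recovered.
TRUE_MU = 0.3
pushes = [(v0, predicted_distance(v0, TRUE_MU)) for v0 in (0.5, 0.8, 1.0)]
mu_hat = identify_friction(pushes)
```

Three pushes suffice here because the toy model has a single unknown. With both mass and friction distributions unknown, as in the abstract, the same loop would run over a vector of parameters with gradients supplied by the physics engine.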
Abstract:This paper proposes a new method for manipulating unknown objects through a sequence of non-prehensile actions that displace an object from its initial configuration to a given goal configuration on a flat surface. The proposed method leverages recent progress in differentiable physics models to identify unknown mechanical properties of manipulated objects, such as the inertia matrix, friction coefficients and external forces acting on the object. To this end, a recently proposed differentiable physics engine for two-dimensional objects is adopted in this work and extended to deal with forces in three-dimensional space. The proposed model identification technique analytically computes the gradient of the distance between forecasted poses of objects and their actual observed poses and utilizes that gradient to search for values of the mechanical properties that reduce the reality gap. Experiments with real objects using a real robot to gather data show that the proposed approach can identify the mechanical properties of heterogeneous objects on the fly.
Abstract:This paper considers the problem of rearrangement planning, i.e., finding a sequence of manipulation actions that displace multiple objects from an initial configuration to a given goal configuration. Rearrangement is a critical skill for robots so that they can effectively operate in confined spaces that contain clutter. Examples of tasks that require rearrangement include packing objects inside a bin, wherein objects need to lie according to a predefined pattern. In tight bins, collision-free grasps are often unavailable. Nonprehensile actions, such as pushing and sliding, are preferred because they can be performed using minimalistic end-effectors that can easily be inserted in the bin. Rearrangement with nonprehensile actions is a challenging problem as it requires reasoning about object interactions in a combinatorially large configuration space of multiple objects. This work revisits several existing rearrangement planning techniques and introduces a new one that exploits nested nonprehensile actions by pushing several similar objects simultaneously along the same path, which removes the need to rearrange each object individually. Experiments in simulation and using a real Kuka robotic arm show the ability of the proposed approach to solve difficult rearrangement tasks while reducing the length of the end-effector's trajectories.
Abstract:We present a probabilistic approach for building, on the fly, 3-D models of unknown objects while they are being manipulated by a robot. We specifically consider manipulation tasks in piles of clutter that contain previously unseen objects. Most manipulation algorithms for performing such tasks require known geometric models of the objects in order to grasp or rearrange them robustly. One of the novel aspects of this work is the utilization of a physics engine for verifying hypothesized geometries in simulation. The evidence provided by physics simulations is used in a probabilistic framework that accounts for the fact that mechanical properties of the objects are uncertain. We present an efficient algorithm for inferring occluded parts of objects based on their observed motions and mutual interactions. Experiments using a robot show that this approach is efficient for constructing physically realistic 3-D models, which can be useful for manipulation planning. Experiments also show that the proposed approach significantly outperforms alternative approaches in terms of shape accuracy.
Abstract:Advances in sensor technologies, object detection algorithms, planning frameworks and hardware designs have motivated the deployment of robots in warehouse automation. A variety of such applications, like order fulfillment or packing tasks, require picking objects from unstructured piles and carefully arranging them in bins or containers. Solutions need to be low-cost, easily deployable and controllable, which makes minimalistic hardware choices desirable. The challenge in designing an effective solution to this problem lies in appropriately integrating multiple components, so as to achieve a robust pipeline that minimizes failure conditions. The current work proposes a complete pipeline for solving such packing tasks, given access only to RGB-D data and a single robot arm with a vacuum-based end-effector, which is also used as a pushing finger. To achieve the desired level of robustness, three key manipulation primitives are identified, which take advantage of the environment and simple operations to successfully pack multiple cubic objects. The overall approach is demonstrated to be robust to execution and perception errors. The impact of each manipulation primitive is evaluated by considering different versions of the proposed pipeline, which incrementally introduce reasoning about object poses and corrective manipulation actions.
Abstract:We propose new methods to speed up convergence of the Alternating Direction Method of Multipliers (ADMM), a common optimization tool in the context of large-scale and distributed learning. The proposed method accelerates convergence by automatically deciding the constraint penalty needed for parameter consensus in each iteration. In addition, we propose an extension of the method that adaptively determines the maximum number of iterations to update the penalty. We show that this approach effectively leads to an adaptive, dynamic network topology underlying the distributed optimization. The utility of the new penalty update schemes is demonstrated on both synthetic and real data, including a computer vision application of distributed structure from motion.
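To make the role of the consensus penalty concrete, the sketch below runs scaled-form consensus ADMM on a toy problem in which each node `i` holds a scalar target `a_i` and all nodes must agree on the value minimizing the sum of `0.5 * (x - a_i)**2`, which is the mean of the targets. The penalty update shown is the widely used residual-balancing heuristic, swapped in as a stand-in; it is not the update scheme proposed in the paper, which the abstract does not spell out. The cap `adapt_iterations` is a fixed number here, whereas the abstract describes choosing that limit adaptively. All names and default values are assumptions.

```python
import math

def consensus_admm(targets, rho=0.01, iterations=50, adapt_iterations=20,
                   mu=10.0, tau=2.0):
    """Scaled-form consensus ADMM for min sum_i 0.5 * (x - a_i)^2 with an adaptive penalty rho."""
    n = len(targets)
    z = 0.0          # shared consensus variable
    u = [0.0] * n    # scaled dual variables, one per node
    for k in range(iterations):
        # Local updates: closed-form minimizer of each node's augmented objective.
        x = [(a + rho * (z - ui)) / (1.0 + rho) for a, ui in zip(targets, u)]
        z_old = z
        z = sum(xi + ui for xi, ui in zip(x, u)) / n
        u = [ui + xi - z for xi, ui in zip(x, u)]

        primal = math.sqrt(sum((xi - z) ** 2 for xi in x))  # disagreement between nodes
        dual = rho * math.sqrt(n) * abs(z - z_old)          # movement of the consensus value

        # Residual balancing, applied only during the first adapt_iterations steps.
        if k < adapt_iterations:
            if primal > mu * dual:
                rho *= tau
                u = [ui / tau for ui in u]  # rescale so that rho * u is unchanged
            elif dual > mu * primal:
                rho /= tau
                u = [ui * tau for ui in u]
    return z, rho

z, final_rho = consensus_admm([1.0, 2.0, 6.0])
```

Freezing the penalty after a bounded number of updates is a common safeguard, since convergence arguments for ADMM typically assume the penalty eventually stops changing.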