Adithyavairavan Murali

DiffusionSeeder: Seeding Motion Optimization with Diffusion for Rapid Motion Planning

Oct 22, 2024

Huang Huang, Balakumar Sundaralingam, Arsalan Mousavian, Adithyavairavan Murali, Ken Goldberg, Dieter Fox

Abstract: Running optimization across many parallel seeds leveraging GPU compute has relaxed the need for a good initialization, but this can fail if the problem is highly non-convex, as all seeds could get stuck in local minima. One such setting is collision-free motion optimization for robot manipulation, where optimization converges quickly on easy problems but struggles in obstacle-dense environments (e.g., a cluttered cabinet or table). In these situations, graph-based planning algorithms are used to obtain seeds, resulting in significant slowdowns. We propose DiffusionSeeder, a diffusion-based approach that generates trajectories to seed motion optimization for rapid robot motion planning. DiffusionSeeder takes the initial depth image observation of the scene and generates high-quality, multi-modal trajectories that are then fine-tuned with a few iterations of motion optimization. We integrate DiffusionSeeder to generate the seed trajectories for cuRobo, a GPU-accelerated motion optimization method, which results in a 12x speedup on average, and a 36x speedup for more complicated problems, while achieving a 10% higher success rate in partially observed simulation environments. Our results show the effectiveness of using diverse solutions from a learned diffusion model. Physical experiments on a Franka robot demonstrate the sim2real transfer of DiffusionSeeder to the real robot, with an average success rate of 86% and planning time of 26ms, improving on cuRobo with a 51% higher success rate while also being 2.5x faster.
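The local-minimum failure mode described in this abstract can be illustrated with a toy sketch. It is not cuRobo or DiffusionSeeder: the 1-D cost function, the seed values, and the helper names (`cost`, `grad`, `optimize`, `best_of`) are all assumptions chosen for illustration. It shows only that running many seeds helps when at least one seed starts in the right basin.

```python
# Toy illustration only: a 1-D non-convex cost stands in for a trajectory cost.
def cost(x):
    return x**4 - 3 * x**2 + x


def grad(x):
    return 4 * x**3 - 6 * x + 1


def optimize(seed, lr=0.01, steps=500):
    """Plain gradient descent from a single seed."""
    x = seed
    for _ in range(steps):
        x -= lr * grad(x)
    return x


def best_of(seeds):
    """Optimize every seed independently and keep the lowest-cost result."""
    return min((optimize(s) for s in seeds), key=cost)


poor_seeds = [0.5, 1.0, 2.0]         # hypothetical seeds, all in the worse basin
diverse_seeds = poor_seeds + [-0.5]  # one extra seed in the better basin

stuck = best_of(poor_seeds)
found = best_of(diverse_seeds)
print(f"poor seeds:    x={stuck:.2f}, cost={cost(stuck):.2f}")
print(f"diverse seeds: x={found:.2f}, cost={cost(found):.2f}")
```

In this sketch the three poor seeds all converge to the same shallow minimum, while adding a single well-placed seed lets the best-of-seeds selection reach the deeper one.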

RoboPoint: A Vision-Language Model for Spatial Affordance Prediction for Robotics

Jun 15, 2024

Wentao Yuan, Jiafei Duan, Valts Blukis, Wilbert Pumacay, Ranjay Krishna, Adithyavairavan Murali, Arsalan Mousavian, Dieter Fox

Abstract: From rearranging objects on a table to putting groceries onto shelves, robots must plan precise action points to perform tasks accurately and reliably. Despite the recent adoption of vision-language models (VLMs) to control robot behavior, VLMs struggle to precisely articulate robot actions using language. We introduce an automatic synthetic data generation pipeline that instruction-tunes VLMs to robotic domains and needs. Using the pipeline, we train RoboPoint, a VLM that predicts image keypoint affordances given language instructions. Compared to alternative approaches, our method requires no real-world data collection or human demonstration, making it much more scalable to diverse environments and viewpoints. In addition, RoboPoint is a general model that enables several downstream applications such as robot navigation, manipulation, and augmented reality (AR) assistance. Our experiments demonstrate that RoboPoint outperforms state-of-the-art VLMs (GPT-4o) and visual prompting techniques (PIVOT) by 21.8% in the accuracy of predicting spatial affordance and by 30.5% in the success rate of downstream tasks. Project website: https://robo-point.github.io.

M2T2: Multi-Task Masked Transformer for Object-centric Pick and Place

Nov 02, 2023

Wentao Yuan, Adithyavairavan Murali, Arsalan Mousavian, Dieter Fox

Abstract: With the advent of large language models and large-scale robotic datasets, there has been tremendous progress in high-level decision-making for object manipulation. These generic models are able to interpret complex tasks using language commands, but they often have difficulty generalizing to out-of-distribution objects due to the limitations of low-level action primitives. In contrast, existing task-specific models excel in low-level manipulation of unknown objects, but only work for a single type of action. To bridge this gap, we present M2T2, a single model that supplies different types of low-level actions that work robustly on arbitrary objects in cluttered scenes. M2T2 is a transformer model that reasons about contact points and predicts valid gripper poses for different action modes given a raw point cloud of the scene. Trained on a large-scale synthetic dataset with 128K scenes, M2T2 achieves zero-shot sim2real transfer on the real robot, outperforming the baseline system with state-of-the-art task-specific models by about 19% in overall performance and 37.5% in challenging scenes where the object needs to be re-oriented for collision-free placement. M2T2 also achieves state-of-the-art results on a subset of language-conditioned tasks in RLBench. Videos of robot experiments on unseen objects in both real world and simulation are available on our project website https://m2-t2.github.io.

* 12 pages, 8 figures, accepted by CoRL 2023

CabiNet: Scaling Neural Collision Detection for Object Rearrangement with Procedural Scene Generation

Apr 18, 2023

Adithyavairavan Murali, Arsalan Mousavian, Clemens Eppner, Adam Fishman, Dieter Fox

Abstract: We address the important problem of generalizing robotic rearrangement to clutter without any explicit object models. We first generate over 650K cluttered scenes - orders of magnitude more than prior work - in diverse everyday environments, such as cabinets and shelves. We render synthetic partial point clouds from this data and use it to train our CabiNet model architecture. CabiNet is a collision model that accepts object and scene point clouds, captured from a single-view depth observation, and predicts collisions for SE(3) object poses in the scene. Our representation has a fast inference speed of 7 microseconds per query with nearly 20% higher performance than baseline approaches in challenging environments. We use this collision model in conjunction with a Model Predictive Path Integral (MPPI) planner to generate collision-free trajectories for picking and placing in clutter. CabiNet also predicts waypoints, computed from the scene's signed distance field (SDF), that allow the robot to navigate tight spaces during rearrangement. This improves rearrangement performance by nearly 35% compared to baselines. We systematically evaluate our approach, procedurally generate simulated experiments, and demonstrate that our approach directly transfers to the real world, despite training exclusively in simulation. Robot experiment demos in completely unknown scenes and objects can be found at https://cabinet-object-rearrangement.github.io
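For readers unfamiliar with MPPI, the following is a minimal sketch of the sampling-and-reweighting update on a toy problem. It is not the CabiNet planner: the 1-D single-integrator dynamics, the squared-distance cost (standing in for a learned collision cost), and every parameter value and function name (`rollout_cost`, `mppi_step`, `run`) are assumptions made for illustration.

```python
import math
import random


def rollout_cost(x0, controls, goal, dt):
    """Cost of a 1-D single-integrator rollout: summed squared distance to goal."""
    x, total = x0, 0.0
    for u in controls:
        x += u * dt
        total += (x - goal) ** 2
    return total


def mppi_step(x0, nominal, goal, rng, dt=0.1, samples=256, sigma=1.0, lam=0.5):
    """One MPPI update: perturb the plan, score rollouts, softmin-average the noise."""
    horizon = len(nominal)
    noises = [[rng.gauss(0.0, sigma) for _ in range(horizon)] for _ in range(samples)]
    costs = [
        rollout_cost(x0, [u + e for u, e in zip(nominal, eps)], goal, dt)
        for eps in noises
    ]
    best = min(costs)  # subtract the minimum cost for numerical stability
    weights = [math.exp(-(c - best) / lam) for c in costs]
    norm = sum(weights)
    return [
        u + sum(w * eps[t] for w, eps in zip(weights, noises)) / norm
        for t, u in enumerate(nominal)
    ]


def run(goal=1.0, steps=80, horizon=15, dt=0.1, seed=0):
    """Receding-horizon loop: replan, execute the first control, shift the plan."""
    rng = random.Random(seed)
    x, nominal = 0.0, [0.0] * horizon
    for _ in range(steps):
        nominal = mppi_step(x, nominal, goal, rng, dt=dt)
        x += nominal[0] * dt           # execute only the first control
        nominal = nominal[1:] + [0.0]  # shift the remaining plan forward
    return x


final_x = run()
print(f"final position: {final_x:.3f}")
```

In a planner of the kind the abstract describes, the rollout cost would presumably also include a collision term; here it is reduced to goal distance so the update rule itself stays visible.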

Neural Motion Fields: Encoding Grasp Trajectories as Implicit Value Functions

Jun 29, 2022

Yun-Chun Chen, Adithyavairavan Murali, Balakumar Sundaralingam, Wei Yang, Animesh Garg, Dieter Fox

Abstract: The pipeline of current robotic pick-and-place methods typically consists of several stages: grasp pose detection, finding inverse kinematic solutions for the detected poses, planning a collision-free trajectory, and then executing the open-loop trajectory to the grasp pose with a low-level tracking controller. While these grasping methods have shown good performance on grasping static objects on a table-top, grasping dynamic objects in constrained environments remains an open problem. We present Neural Motion Fields, a novel object representation which encodes both object point clouds and the relative task trajectories as an implicit value function parameterized by a neural network. This object-centric representation models a continuous distribution over the SE(3) space and allows us to perform grasping reactively by leveraging sampling-based MPC to optimize this value function.

* RSS 2022 Workshop on Implicit Representations for Robotic Manipulation

HandoverSim: A Simulation Framework and Benchmark for Human-to-Robot Object Handovers

May 19, 2022

Yu-Wei Chao, Chris Paxton, Yu Xiang, Wei Yang, Balakumar Sundaralingam, Tao Chen, Adithyavairavan Murali, Maya Cakmak, Dieter Fox

Abstract: We introduce a new simulation benchmark "HandoverSim" for human-to-robot object handovers. To simulate the giver's motion, we leverage a recent motion capture dataset of hand grasping of objects. We create training and evaluation environments for the receiver with standardized protocols and metrics. We analyze the performance of a set of baselines and show a correlation with a real-world evaluation. Code is open sourced at https://handover-sim.github.io.

* Accepted to ICRA 2022

Same Object, Different Grasps: Data and Semantic Knowledge for Task-Oriented Grasping

Nov 13, 2020

Adithyavairavan Murali, Weiyu Liu, Kenneth Marino, Sonia Chernova, Abhinav Gupta

Abstract: Despite the enormous progress and generalization in robotic grasping in recent years, existing methods have yet to scale and generalize task-oriented grasping to the same extent. This is largely due to the scale of the datasets, both in terms of the number of objects and tasks studied. We address these concerns with the TaskGrasp dataset, which is more diverse both in terms of objects and tasks, and an order of magnitude larger than previous datasets. The dataset contains 250K task-oriented grasps for 56 tasks and 191 objects along with their RGB-D information. We take advantage of this new breadth and diversity in the data and present the GCNGrasp framework, which uses the semantic knowledge of objects and tasks encoded in a knowledge graph to generalize to new object instances, classes and even new tasks. Our framework shows a significant improvement of around 12% on held-out settings compared to baseline methods which do not use semantics. We demonstrate that our dataset and model are applicable for the real world by executing task-oriented grasps on a real robot on unknown objects. Code, data and supplementary video can be found at https://sites.google.com/view/taskgrasp

* Accepted to Conference on Robot Learning (CoRL) 2020

6-DOF Grasping for Target-driven Object Manipulation in Clutter

Dec 08, 2019

Adithyavairavan Murali, Arsalan Mousavian, Clemens Eppner, Chris Paxton, Dieter Fox

Abstract: Grasping in cluttered environments is a fundamental but challenging robotic skill. It requires both reasoning about unseen object parts and potential collisions with the manipulator. Most existing data-driven approaches avoid this problem by limiting themselves to top-down planar grasps, which is insufficient for many real-world scenarios and greatly limits possible grasps. We present a method that plans 6-DOF grasps for any desired object in a cluttered scene from partial point cloud observations. Our method achieves a grasp success of 80.3%, outperforming baseline approaches by 17.6% and clearing 9 cluttered table scenes (which contain 23 unknown objects and 51 picks in total) on a real robotic platform. By using our learned collision checking module, we can even reason about effective grasp sequences to retrieve objects that are not immediately accessible. Supplementary video can be found at https://youtu.be/w0B5S-gCsJk.

PyRobot: An Open-source Robotics Framework for Research and Benchmarking

Jun 19, 2019

Adithyavairavan Murali, Tao Chen, Kalyan Vasudev Alwala, Dhiraj Gandhi, Lerrel Pinto, Saurabh Gupta, Abhinav Gupta

Abstract: This paper introduces PyRobot, an open-source robotics framework for research and benchmarking. PyRobot is a light-weight, high-level interface on top of ROS that provides a consistent set of hardware-independent mid-level APIs to control different robots. PyRobot abstracts away details about low-level controllers and inter-process communication, and allows non-robotics researchers (ML, CV researchers) to focus on building high-level AI applications. PyRobot aims to provide a research ecosystem with convenient access to robotics datasets, algorithm implementations and models that can be used to quickly create a state-of-the-art baseline. We believe PyRobot, when paired with low-cost robot platforms such as LoCoBot, will reduce the entry barrier into robotics, and democratize robotics. PyRobot is open-source, and can be accessed via https://pyrobot.org.

Hardware Conditioned Policies for Multi-Robot Transfer Learning

Jan 13, 2019

Tao Chen, Adithyavairavan Murali, Abhinav Gupta

Abstract: Deep reinforcement learning could be used to learn dexterous robotic policies, but it is challenging to transfer them to new robots with vastly different hardware properties. It is also prohibitively expensive to learn a new policy from scratch for each robot hardware due to the high sample complexity of modern state-of-the-art algorithms. We propose a novel approach called Hardware Conditioned Policies where we train a universal policy conditioned on a vector representation of robot hardware. We consider robots in simulation with varied dynamics, kinematic structure, kinematic lengths and degrees-of-freedom. First, we use the kinematic structure directly as the hardware encoding and show great zero-shot transfer to completely novel robots not seen during training. For robots with lower zero-shot success rate, we also demonstrate that fine-tuning the policy network is significantly more sample-efficient than training a model from scratch. In tasks where knowing the agent dynamics is important for success, we learn an embedding for robot hardware and show that policies conditioned on the encoding of hardware tend to generalize and transfer well. The code and videos are available on the project webpage: https://sites.google.com/view/robot-transfer-hcp.
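One common way to condition a policy on a hardware vector is to concatenate that vector with the state before the policy's first layer. The sketch below assumes that design; the abstract does not specify the architecture. The linear policy, the weights, and the two hardware encodings are hypothetical values, not taken from the paper.

```python
def conditioned_policy(state, hardware, weights, bias):
    """Linear policy over the concatenation [state; hardware encoding]."""
    features = list(state) + list(hardware)
    return [
        sum(w * f for w, f in zip(row, features)) + b
        for row, b in zip(weights, bias)
    ]


# Hypothetical numbers: a 2-D state, a 2-D hardware encoding (e.g. link lengths),
# and a single action output.
weights = [[1.0, 0.5, -2.0, 1.0]]
bias = [0.0]
state = [0.1, -0.2]
robot_a = [0.30, 0.25]
robot_b = [0.50, 0.40]

action_a = conditioned_policy(state, robot_a, weights, bias)
action_b = conditioned_policy(state, robot_b, weights, bias)
print(action_a, action_b)
```

The same weights produce different actions for the same state once the hardware encoding changes, which is the property a single shared policy needs in order to serve several robots.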
