Utkarsh A. Mishra

Generative Trajectory Stitching through Diffusion Composition

Mar 07, 2025

Yunhao Luo, Utkarsh A. Mishra, Yilun Du, Danfei Xu

Abstract: Effective trajectory stitching for long-horizon planning is a significant challenge in robotic decision-making. While diffusion models have shown promise in planning, they are limited to solving tasks similar to those seen in their training data. We propose CompDiffuser, a novel generative approach that can solve new tasks by learning to compositionally stitch together shorter trajectory chunks from previously seen tasks. Our key insight is modeling the trajectory distribution by subdividing it into overlapping chunks and learning their conditional relationships through a single bidirectional diffusion model. This allows information to propagate between segments during generation, ensuring physically consistent connections. We conduct experiments on benchmark tasks of various difficulties, covering different environment sizes, agent state dimensions, trajectory types, and training data quality, and show that CompDiffuser significantly outperforms existing methods.

* Project page: https://comp-diffuser.github.io/
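The abstract describes subdividing a trajectory into overlapping chunks. The sketch below illustrates only that data layout, under the assumption of a fixed chunk length and a fixed overlap; it does not reproduce CompDiffuser's diffusion model, and the function names `split_into_overlapping_chunks` and `stitch_chunks` are hypothetical.

```python
def split_into_overlapping_chunks(trajectory, chunk_len, overlap):
    """Split a list of states into chunks that share `overlap` states with their neighbor."""
    if not 0 <= overlap < chunk_len:
        raise ValueError("overlap must satisfy 0 <= overlap < chunk_len")
    stride = chunk_len - overlap
    chunks = []
    start = 0
    # Only emit full-length chunks; a trailing remainder is dropped.
    while start + chunk_len <= len(trajectory):
        chunks.append(trajectory[start:start + chunk_len])
        start += stride
    return chunks


def stitch_chunks(chunks, overlap):
    """Concatenate chunks, skipping the states each chunk shares with the previous one."""
    if not chunks:
        return []
    stitched = list(chunks[0])
    for chunk in chunks[1:]:
        stitched.extend(chunk[overlap:])
    return stitched


# Toy example: integer "states" stand in for real robot states.
trajectory = list(range(10))
chunks = split_into_overlapping_chunks(trajectory, chunk_len=4, overlap=2)
restored = stitch_chunks(chunks, overlap=2)
```

In this toy case the shared states are identical by construction. In a generative setting the overlapping states of neighboring chunks would have to be made to agree, which is the role the abstract assigns to the conditional relationships learned by the bidirectional model.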

RAIL: Reachability-Aided Imitation Learning for Safe Policy Execution

Sep 28, 2024

Wonsuhk Jung, Dennis Anthony, Utkarsh A. Mishra, Nadun Ranawaka Arachchige, Matthew Bronars, Danfei Xu, Shreyas Kousik

Abstract: Imitation learning (IL) has shown great success in learning complex robot manipulation tasks. However, there remains a need for practical safety methods to justify widespread deployment. In particular, it is important to certify that a system obeys hard constraints on unsafe behavior in settings where it is unacceptable to design a tradeoff between performance and safety via tuning the policy (i.e., soft constraints). This leads to the question: how does enforcing hard constraints impact the performance (meaning safely completing tasks) of an IL policy? To answer this question, this paper builds a reachability-based safety filter to enforce hard constraints on IL, which we call Reachability-Aided Imitation Learning (RAIL). Through evaluations with state-of-the-art IL policies in mobile robots and manipulation tasks, we make two key findings. First, the highest-performing policies are sometimes only so because they frequently violate constraints, and they lose significant performance under hard constraints. Second, surprisingly, hard constraints on the lower-performing policies can occasionally increase their ability to perform tasks safely. Finally, hardware evaluation confirms the method can operate in real time.

* * denotes equal contribution

Generative Factor Chaining: Coordinated Manipulation with Diffusion-based Factor Graph

Sep 24, 2024

Utkarsh A. Mishra, Yongxin Chen, Danfei Xu

Abstract: Learning to plan for multi-step, multi-manipulator tasks is notoriously difficult because of the large search space and the complex constraint satisfaction problems. We present Generative Factor Chaining (GFC), a composable generative model for planning. GFC represents a planning problem as a spatial-temporal factor graph, where nodes represent objects and robots in the scene, spatial factors capture the distributions of valid relationships among nodes, and temporal factors represent the distributions of skill transitions. Each factor is implemented as a modular diffusion model, and the factors are composed during inference to generate feasible long-horizon plans through bi-directional message passing. We show that GFC can solve complex bimanual manipulation tasks and exhibits strong generalization to unseen planning tasks with novel combinations of objects and constraints. More details can be found at: https://generative-fc.github.io/

* 28 pages, 17 figures, 2024 Conference on Robot Learning

Learning Representations for Pixel-based Control: What Matters and Why?

Nov 15, 2021

Manan Tomar, Utkarsh A. Mishra, Amy Zhang, Matthew E. Taylor

Abstract: Learning representations for pixel-based control has garnered significant attention recently in reinforcement learning. A wide range of methods have been proposed to enable efficient learning, leading to sample complexities similar to those in the full state setting. However, moving beyond carefully curated pixel data sets (centered crop, appropriate lighting, clear background, etc.) remains challenging. In this paper, we adopt a more difficult setting, incorporating background distractors, as a first step towards addressing this challenge. We present a simple baseline approach that can learn meaningful representations with no metric-based learning, no data augmentations, no world-model learning, and no contrastive learning. We then analyze when and why previously proposed methods are likely to fail or reduce to the same performance as the baseline in this harder setting, and why we should think carefully about extending such methods beyond well-curated environments. Our results show that finer categorization of benchmarks on the basis of characteristics like density of reward, planning horizon of the problem, presence of task-irrelevant components, etc., is crucial in evaluating algorithms. Based on these observations, we propose different metrics to consider when evaluating an algorithm on benchmark tasks. We hope such a data-centric view can motivate researchers to rethink representation learning when investigating how to best apply RL to real-world tasks.

Linear Policies are Sufficient to Realize Robust Bipedal Walking on Challenging Terrains

Oct 05, 2021

Lokesh Krishna, Guillermo A. Castillo, Utkarsh A. Mishra, Ayonga Hereid, Shishir Kolathaya

Abstract: In this work, we demonstrate robust walking in the bipedal robot Digit on uneven terrains by learning just a single linear policy. In particular, we propose a new control pipeline, wherein the high-level trajectory modulator shapes the end-foot ellipsoidal trajectories, and the low-level gait controller regulates the torso and ankle orientation. The foot-trajectory modulator uses a linear policy and the regulator uses a linear PD control law. As opposed to neural network-based policies, the proposed linear policy has only 13 learnable parameters, thereby not only guaranteeing sample-efficient learning but also enabling simplicity and interpretability of the policy. This is achieved with no loss of performance on challenging terrains like slopes, stairs, and outdoor landscapes. We first demonstrate robust walking in the custom simulation environment, MuJoCo, and then directly transfer to hardware with no modification of the control pipeline. We subject the biped to a series of pushes and terrain height changes, both indoors and outdoors, thereby validating the presented work.

* 8 pages, 10 figures

Learning Linear Policies for Robust Bipedal Locomotion on Terrains with Varying Slopes

Apr 04, 2021

Lokesh Krishna, Utkarsh A. Mishra, Guillermo A. Castillo, Ayonga Hereid, Shishir Kolathaya

Abstract: In this paper, with a view toward deployment of light-weight control frameworks for bipedal walking robots, we realize end-foot trajectories that are shaped by a single linear feedback policy. We learn this policy via a model-free, gradient-free learning algorithm, Augmented Random Search (ARS), on the two robot platforms Rabbit and Digit. Our contributions are two-fold: a) By using torso and support plane orientation as inputs, we achieve robust walking on slopes of up to 20 degrees in simulation. b) We demonstrate additional behaviors like walking backwards, stepping-in-place, and recovery from external pushes of up to 120 N. The end result is a robust and fast feedback control law for bipedal walking on terrains with varying slopes. Towards the end, we also provide preliminary results of hardware transfer to Digit.

* 7 pages, 7 figures
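For readers unfamiliar with Augmented Random Search, the following is a minimal sketch of one update step in its simplest form: antithetic perturbations of the policy parameters, with the step scaled by the standard deviation of the collected rewards. It assumes a flat parameter list and a caller-supplied reward function; state normalization and top-direction selection are omitted, and `ars_step` and `toy_reward` are hypothetical names, not code from the paper.

```python
import statistics


def ars_step(theta, deltas, reward_fn, step_size=0.1, noise=0.05):
    """Return updated parameters after one ARS-style update.

    theta:     current policy parameters (flat list of floats)
    deltas:    perturbation directions, each the same length as theta
    reward_fn: maps a parameter list to a scalar episode reward
    """
    reward_pairs = []
    for delta in deltas:
        plus = [t + noise * d for t, d in zip(theta, delta)]
        minus = [t - noise * d for t, d in zip(theta, delta)]
        reward_pairs.append((reward_fn(plus), reward_fn(minus)))

    # Scale the step by the spread of all collected rewards (fall back to 1.0 if zero).
    all_rewards = [r for pair in reward_pairs for r in pair]
    sigma = statistics.pstdev(all_rewards) or 1.0

    # Finite-difference estimate of the ascent direction.
    update = [0.0] * len(theta)
    for (r_plus, r_minus), delta in zip(reward_pairs, deltas):
        for i, d in enumerate(delta):
            update[i] += (r_plus - r_minus) * d

    scale = step_size / (len(deltas) * sigma)
    return [t + scale * u for t, u in zip(theta, update)]


def toy_reward(params):
    """Stand-in for an episode return: highest when every parameter equals 1."""
    return -sum((p - 1.0) ** 2 for p in params)


# One deterministic step with a single hand-picked direction.
new_theta = ars_step([0.0], deltas=[[1.0]], reward_fn=toy_reward,
                     step_size=0.1, noise=0.1)
```

In practice the directions would be drawn at random (for example with `random.gauss(0.0, 1.0)` per entry) and the reward would come from simulated walking rollouts rather than a closed-form function.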

Planning Brachistochrone Hip Trajectory for a Toe-Foot Bipedal Robot going Downstairs

Dec 02, 2020

Gaurav Bhardwaj, Utkarsh A. Mishra, N. Sukavanam, R. Balasubramanian

Abstract: A novel, efficient downstairs trajectory is proposed for a 9-link biped robot model with toe-foot. The brachistochrone is the fastest descent trajectory for a particle moving only under the influence of gravity. In most situations, while descending stairs, the human hip also follows a brachistochrone trajectory for a more responsive motion. Here, an adaptive trajectory planning algorithm is developed so that biped robots of varying link lengths and masses can descend staircases of varying dimensions. We assume that the center of gravity (COG) of the biped concerned lies on the hip. A Zero Moment Point (ZMP) based COG trajectory is considered and its stability is ensured. A cycloidal trajectory is considered for the ankle of the swing leg. The parameters of both the cycloid and the brachistochrone depend on the dimensions of the staircase steps. Hence this paper can be broadly divided into 4 steps: 1) Developing a ZMP-based brachistochrone trajectory for the hip 2) Cycloidal trajectory planning for the ankle under proper collision constraints 3) Solving inverse kinematics using an unsupervised artificial neural network (ANN) 4) Comparison between the proposed trajectory, a circular arc, and a virtual-slope-based hip trajectory. The proposed algorithms have been implemented using MATLAB.

* 6 pages, 6 figures, Accepted for presentation at the RoAI 2020: International Conference on Robotics and Artificial Intelligence 2020, IIT Madras and will be published in the Proceedings by the Journal of Physics: Conference Series. arXiv admin note: substantial text overlap with arXiv:2012.01417
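As background for the abstract above, the classical brachistochrone from rest under uniform gravity is an arc of a cycloid. The block below states that standard textbook result; the symbols $r$, $\theta$, and $g$ are introduced here for illustration, and the paper's own ZMP-based parameterization in terms of staircase dimensions is not reproduced.

```latex
% Brachistochrone starting from rest at the origin, with y measured downward.
% r: radius of the generating circle, \theta: rolling angle, g: gravitational acceleration.
\begin{align}
  x(\theta) &= r\,(\theta - \sin\theta), \\
  y(\theta) &= r\,(1 - \cos\theta), \\
  t(\theta) &= \theta \sqrt{\frac{r}{g}} .
\end{align}
% The last line follows from energy conservation, v = \sqrt{2 g y},
% and the arc length ds = r \sqrt{2 (1 - \cos\theta)}\, d\theta, which give dt = \sqrt{r/g}\, d\theta.
```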

Cycloidal Trajectory Realization on Staircase with Optimal Trajectory Tracking Control based on Neural Network Temporal Quantized Lagrange Dynamics (NNTQLD)

Dec 02, 2020

Gaurav Bhardwaj, Utkarsh A. Mishra, N. Sukavanam, R. Balasubramanian

Abstract: In this paper, a novel optimal technique for joint-angle trajectory tracking control of a biped robot with toe-foot is proposed. For the task of climbing stairs with a 9-link biped model, a cycloid trajectory for the swing phase is proposed such that the cycloid variables depend on the staircase dimensions. The Zero Moment Point (ZMP) criterion is used to satisfy the stability constraint. This paper can mainly be divided into 4 steps: 1) Planning a stable cycloid trajectory for the initial step and subsequent steps for climbing upstairs. 2) Inverse kinematics using an unsupervised artificial neural network with a knot-shifting procedure for jerk minimization. 3) Modeling the dynamics of the toe-foot biped model using Lagrange dynamics, along with contact modeling using a spring-damper system, and finally 4) Real-time joint-angle trajectory tracking optimization using Temporal Quantized Lagrange Dynamics, which takes the inverse kinematics output from the neural network as its input. Generated patterns have been simulated in MATLAB.

* 13 pages, 18 figures, Preprint Submitted to Elsevier Robotics and Autonomous Systems Journal on November 29, 2020
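To make the idea of a cycloid whose variables depend on the staircase dimensions concrete, here is a minimal sketch that samples one arch of a standard cycloid whose horizontal span equals a given tread length. This is an assumption for illustration only: the abstract does not state the paper's parameterization, the sketch ignores the riser height (the foot lands at the level it started from), and `cycloid_swing_path` is a hypothetical name.

```python
import math


def cycloid_swing_path(tread_length, num_points=50):
    """Sample (x, z) points along one cycloid arch spanning `tread_length` horizontally."""
    # One full arch of a cycloid spans 2*pi*radius, so pick the radius to match the tread.
    radius = tread_length / (2.0 * math.pi)
    path = []
    for i in range(num_points):
        theta = 2.0 * math.pi * i / (num_points - 1)
        x = radius * (theta - math.sin(theta))  # forward progress of the foot
        z = radius * (1.0 - math.cos(theta))    # foot clearance above the start level
        path.append((x, z))
    return path


# Example: a 0.3 m tread sampled at 51 points.
path = cycloid_swing_path(tread_length=0.3, num_points=51)
peak_clearance = max(z for _, z in path)
```

With this choice the peak clearance is fixed at `tread_length / math.pi`; a planner that must also clear the riser would need an additional vertical term or a different curve.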
