Zhenyang Chen

SAIL: Faster-than-Demonstration Execution of Imitation Learning Policies

Jun 13, 2025

Nadun Ranawaka Arachchige, Zhenyang Chen, Wonsuhk Jung, Woo Chul Shin, Rohan Bansal, Pierre Barroso, Yu Hang He, Yingyang Celine Lin, Benjamin Joffe, Shreyas Kousik(+1 more)

Abstract: Offline Imitation Learning (IL) methods such as Behavior Cloning are effective at acquiring complex robotic manipulation skills. However, existing IL-trained policies are confined to executing the task at the same speed as shown in demonstration data. This limits the task throughput of a robotic system, a critical requirement for applications such as industrial automation. In this paper, we introduce and formalize the novel problem of enabling faster-than-demonstration execution of visuomotor policies and identify fundamental challenges in robot dynamics and state-action distribution shifts. We instantiate the key insights as SAIL (Speed Adaptation for Imitation Learning), a full-stack system integrating four tightly connected components: (1) a consistency-preserving action inference algorithm for smooth motion at high speed, (2) high-fidelity tracking of controller-invariant motion targets, (3) adaptive speed modulation that dynamically adjusts execution speed based on motion complexity, and (4) action scheduling to handle real-world system latencies. Experiments on 12 tasks across simulation and two real, distinct robot platforms show that SAIL achieves up to a 4x speedup over demonstration speed in simulation and up to 3.2x speedup in the real world. Additional details are available at https://nadunranawaka1.github.io/sail-policy

* The first two authors contributed equally
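
To make "faster-than-demonstration execution" concrete, the sketch below resamples a fixed-rate 1-D waypoint list so that it is traversed `factor` times faster, using linear interpolation. This is only an illustration of uniform time-scaling, not SAIL's method: the function name `speed_up`, the 1-D waypoints, and the constant speedup factor are assumptions, and none of the four SAIL components (consistent action inference, target tracking, adaptive speed modulation, latency-aware scheduling) is modeled.

```python
import math

def speed_up(waypoints, factor):
    """Resample a fixed-rate 1-D waypoint list so it is traversed `factor` times faster.

    Hypothetical helper for illustration only; not part of SAIL.
    """
    if factor <= 0:
        raise ValueError("factor must be positive")
    if len(waypoints) < 2:
        return list(waypoints)
    last = len(waypoints) - 1            # demonstration length in control steps
    steps = math.ceil(last / factor)     # control steps needed after the speedup
    out = []
    for k in range(steps + 1):
        t = min(k * factor, last)        # position on the original time axis
        i = int(math.floor(t))
        if i >= last:
            out.append(float(waypoints[last]))   # clamp to the final waypoint
        else:
            frac = t - i                 # linear interpolation between neighbors
            out.append((1 - frac) * waypoints[i] + frac * waypoints[i + 1])
    return out
```

For example, `speed_up([0, 1, 2, 3, 4], 2)` keeps the same endpoints but covers them in two control steps instead of four. A constant factor like this ignores the dynamics and distribution-shift issues the abstract identifies, which is why the paper adjusts speed adaptively.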

DemoGen: Synthetic Demonstration Generation for Data-Efficient Visuomotor Policy Learning

Feb 24, 2025

Zhengrong Xue, Shuying Deng, Zhenyang Chen, Yixuan Wang, Zhecheng Yuan, Huazhe Xu

Abstract: Visuomotor policies have shown great promise in robotic manipulation but often require substantial amounts of human-collected data for effective performance. A key reason underlying the data demands is their limited spatial generalization capability, which necessitates extensive data collection across different object configurations. In this work, we present DemoGen, a low-cost, fully synthetic approach for automatic demonstration generation. Using only one human-collected demonstration per task, DemoGen generates spatially augmented demonstrations by adapting the demonstrated action trajectory to novel object configurations. Visual observations are synthesized by leveraging 3D point clouds as the modality and rearranging the subjects in the scene via 3D editing. Empirically, DemoGen significantly enhances policy performance across a diverse range of real-world manipulation tasks, showing its applicability even in challenging scenarios involving deformable objects, dexterous hand end-effectors, and bimanual platforms. Furthermore, DemoGen can be extended to enable additional out-of-distribution capabilities, including disturbance resistance and obstacle avoidance.

* Project website: https://demo-generation.github.io

Catch It! Learning to Catch in Flight with Mobile Dexterous Hands

Sep 16, 2024

Yuanhang Zhang, Tianhai Liang, Zhenyang Chen, Yanjie Ze, Huazhe Xu

Abstract: Catching objects in flight (i.e., thrown objects) is a common daily skill for humans, yet it presents a significant challenge for robots. This task requires a robot with agile and accurate motion, a large spatial workspace, and the ability to interact with diverse objects. In this paper, we build a mobile manipulator composed of a mobile base, a 6-DoF arm, and a 12-DoF dexterous hand to tackle such a challenging task. We propose a two-stage reinforcement learning framework to efficiently train a whole-body-control catching policy for this high-DoF system in simulation. The objects' throwing configurations, shapes, and sizes are randomized during training to enhance policy adaptivity to various trajectories and object characteristics in flight. The results show that our trained policy catches diverse objects with randomly thrown trajectories, at a high success rate of about 80% in simulation, with a significant improvement over the baselines. The policy trained in simulation can be directly deployed in the real world with onboard sensing and computation, where it catches sandbags of various shapes, randomly thrown by humans. Our project page is available at https://mobile-dex-catch.github.io/.

Learning Prehensile Dexterity by Imitating and Emulating State-only Observations

Apr 12, 2024

Yunhai Han, Zhenyang Chen, Harish Ravichandar

Abstract: When humans acquire physical skills (e.g., tennis) from experts, we tend to first learn from merely observing the expert. But this is often insufficient. We then engage in practice, where we try to emulate the expert and ensure that our actions produce similar effects on our environment. Inspired by this observation, we introduce Combining IMitation and Emulation for Motion Refinement (CIMER) -- a two-stage framework to learn dexterous prehensile manipulation skills from state-only observations. CIMER's first stage involves imitation: simultaneously encode the complex interdependent motions of the robot hand and the object in a structured dynamical system. This results in a reactive motion generation policy that provides a reasonable motion prior, but lacks the ability to reason about contact effects due to the lack of action labels. The second stage involves emulation: learn a motion refinement policy via reinforcement that adjusts the robot hand's motion prior such that the desired object motion is reenacted. CIMER is both task-agnostic (no task-specific reward design or shaping) and intervention-free (no additional teleoperated or labeled demonstrations). Detailed experiments with prehensile dexterity reveal that i) imitation alone is insufficient, but adding emulation drastically improves performance, ii) CIMER outperforms existing methods in terms of sample efficiency and the ability to generate realistic and stable motions, and iii) CIMER can either zero-shot generalize or learn to adapt to novel objects from the YCB dataset, even outperforming expert policies trained with action labels in most cases. Source code and videos are available at https://sites.google.com/view/cimer-2024/.

* Under review by RA-L

Efficient Belief Road Map for Planning Under Uncertainty

Sep 17, 2023

Zhenyang Chen, Hongzhe Yu, Yongxin Chen

Abstract: Robotic systems, particularly in demanding environments like narrow corridors or disaster zones, often grapple with imperfect state estimation. Addressing this challenge requires a trajectory plan that not only navigates these restrictive spaces but also manages the inherent uncertainty of the system. We present a novel approach for graph-based belief space planning via the use of an efficient covariance control algorithm. By adaptively steering state statistics via output state feedback, we efficiently craft a belief roadmap characterized by nodes with controlled uncertainty and edges representing collision-free mean trajectories. The roadmap's structured design then paves the way for precise path searches that balance control costs and uncertainty considerations. Our numerical experiments affirm the efficacy and advantage of our method in different motion planning tasks. Our open-source implementation can be found at https://github.com/hzyu17/VIMP/tree/BRM.
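
The path search over such a roadmap can be illustrated with a small sketch. It assumes each directed edge already carries a scalar control cost and a scalar uncertainty measure, and it combines them as `control_cost + weight * uncertainty`; that weighted sum, the function name `best_path`, and the edge tuple layout are all assumptions made for illustration, not the cost or API of the linked implementation. The covariance steering that would produce these edge values is not modeled.

```python
import heapq

def best_path(edges, start, goal, weight=1.0):
    """Dijkstra over a directed roadmap with cost = control_cost + weight * uncertainty.

    `edges` is a list of (from_node, to_node, control_cost, uncertainty) tuples.
    Hypothetical helper for illustration only.
    """
    graph = {}
    for u, v, control_cost, uncertainty in edges:
        graph.setdefault(u, []).append((v, control_cost + weight * uncertainty))
    best = {start: 0.0}          # cheapest known cost to each node
    parent = {}                  # back-pointers for path reconstruction
    queue = [(0.0, start)]
    while queue:
        dist, node = heapq.heappop(queue)
        if node == goal:
            break
        if dist > best.get(node, float("inf")):
            continue             # stale queue entry
        for nxt, cost in graph.get(node, []):
            cand = dist + cost
            if cand < best.get(nxt, float("inf")):
                best[nxt] = cand
                parent[nxt] = node
                heapq.heappush(queue, (cand, nxt))
    if goal not in best:
        return None, float("inf")
    path = [goal]
    while path[-1] != start:
        path.append(parent[path[-1]])
    return path[::-1], best[goal]
```

With `weight=0.0` the search picks the route with the lowest control cost. Raising `weight` shifts it toward routes with lower accumulated uncertainty, which is the trade-off the abstract describes.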

Graph Attention Collaborative Similarity Embedding for Recommender System

Feb 05, 2021

Jinbo Song, Chao Chang, Fei Sun, Zhenyang Chen, Guoyong Hu, Peng Jiang

Abstract: We present Graph Attention Collaborative Similarity Embedding (GACSE), a new recommendation framework that exploits collaborative information in the user-item bipartite graph for representation learning. Our framework consists of two parts: the first part learns explicit graph collaborative filtering information, such as user-item associations, through embedding propagation with an attention mechanism, and the second part learns implicit graph collaborative information, such as user-user and item-item similarities, through an auxiliary loss. We design a new loss function that combines BPR loss with an adaptive margin and a similarity loss for similarity learning. Extensive experiments on three benchmarks show that our model is consistently better than the latest state-of-the-art models.
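
As a rough illustration of the pairwise term in such a loss, the sketch below computes the standard BPR loss for one (positive, negative) score pair with a margin subtracted inside the sigmoid. The abstract does not say how the adaptive margin is computed, so here `margin` is simply passed in by the caller. That interface, the function name `bpr_margin_loss`, and the omission of the similarity loss are assumptions, not GACSE's actual formulation.

```python
import math

def bpr_margin_loss(pos_score, neg_score, margin=0.0):
    """Return -log(sigmoid(pos_score - neg_score - margin)), computed stably.

    Hypothetical helper for illustration only; the margin rule is left to the caller.
    """
    x = pos_score - neg_score - margin
    # -log(sigmoid(x)) equals softplus(-x); branch to avoid overflow in exp().
    if x >= 0:
        return math.log1p(math.exp(-x))
    return -x + math.log1p(math.exp(x))
```

A larger margin raises the loss for the same score gap, so the model has to rank the positive item further ahead of the negative one before the pair stops contributing.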

Deploy Large-Scale Deep Neural Networks in Resource Constrained IoT Devices with Local Quantization Region

May 24, 2018

Yi Yang, Andy Chen, Xiaoming Chen, Jiang Ji, Zhenyang Chen, Yan Dai

Abstract: Implementing large-scale deep neural networks with high computational complexity on low-cost IoT devices may be constrained by limited computational resources, making it hard for the devices to respond in real time. This mismatch makes state-of-the-art deep learning algorithms, e.g. CNNs (Convolutional Neural Networks), incompatible with the IoT world. We present a low-bit (ranging from 8-bit to 1-bit) scheme with our local quantization region algorithm. We use models in the Caffe model zoo as our example tasks to evaluate the effect of our low-precision data representation scheme. With the local quantization region available, we find that implementations on top of those schemes largely retain model accuracy while reducing computational complexity. For example, our 8-bit scheme has no drop in top-1 and top-5 accuracy with a 2x speedup on the Intel Edison IoT platform. Implementations based on our 4-bit, 2-bit, or 1-bit scheme are also applicable to IoT devices, with the advantage of low computational complexity. For example, the drop on our task is only 0.7% when using the 2-bit scheme, a scheme which could largely save transistors. Making low-bit schemes usable here opens a new door for further optimization on commodity IoT controllers, e.g. extra speedup could be achieved by replacing multiply-accumulate operations with the proposed table look-up operations. The whole study offers a new approach to relieving the challenge of bringing advanced deep learning algorithms to resource-constrained, low-cost IoT devices.
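
The abstract does not spell out the local quantization region algorithm, so the following is only a sketch of one plausible reading: split the values into fixed-size regions and quantize each region uniformly over its own min/max range, instead of using one global range. The function name `quantize_local`, the fixed `region_size`, and the min/max range choice are assumptions, not the paper's method.

```python
def quantize_local(values, bits, region_size):
    """Uniformly quantize each region of `region_size` values over its own min/max range.

    Returns the dequantized values. Hypothetical helper for illustration only.
    """
    if bits < 1 or region_size < 1:
        raise ValueError("bits and region_size must be at least 1")
    levels = (1 << bits) - 1             # number of steps between lowest and highest code
    restored = []
    for start in range(0, len(values), region_size):
        region = values[start:start + region_size]
        lo, hi = min(region), max(region)
        step = (hi - lo) / levels
        for v in region:
            # Integer code in [0, levels]; a constant region maps to code 0.
            code = round((v - lo) / step) if step > 0 else 0
            restored.append(lo + code * step)    # dequantized value
    return restored
```

When values cluster differently in different regions, a per-region range wastes fewer codes than a single global range. With `[0.0, 0.1, 0.2, 0.3, 10.0, 10.5, 11.0, 11.5]` at 2 bits, two regions of four values reconstruct almost exactly, while one region of eight collapses the small values to zero.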
