Abstract: Recent advances in diffusion-based robot policies have demonstrated significant potential in imitating multi-modal behaviors. However, these approaches typically require large quantities of demonstration data paired with corresponding robot action labels, creating a substantial data collection burden. In this work, we propose a plan-then-control framework aimed at improving the action-data efficiency of inverse dynamics controllers by leveraging observational demonstration data. Specifically, we adopt a Deep Koopman Operator framework to model the dynamical system and utilize observation-only trajectories to learn a latent action representation. This latent representation can then be effectively mapped to real high-dimensional continuous actions using a linear action decoder, requiring minimal action-labeled data. Through experiments on simulated robot manipulation tasks and a real-robot experiment with multi-modal expert demonstrations, we demonstrate that our approach significantly enhances action-data efficiency and achieves high task success rates with limited action data.
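A minimal Python sketch of the idea as described in the abstract, not the paper's code: a neural encoder with linear (Koopman-style) latent dynamics is trained on observation-only pairs, a latent action is inferred from consecutive latent states, and a linear decoder is then fit on a small set of action-labeled pairs by least squares. All dimensions, module names, and the random stand-in data below are illustrative assumptions.

```python
# Sketch of a Deep-Koopman-style latent model with latent actions (assumed design).
import torch
import torch.nn as nn

obs_dim, latent_dim, latent_act_dim, act_dim = 10, 16, 4, 7   # illustrative sizes

encoder = nn.Sequential(nn.Linear(obs_dim, 64), nn.ReLU(), nn.Linear(64, latent_dim))
A = nn.Linear(latent_dim, latent_dim, bias=False)      # Koopman state matrix
B = nn.Linear(latent_act_dim, latent_dim, bias=False)  # Koopman control matrix
act_head = nn.Linear(latent_dim * 2, latent_act_dim)   # infers a latent action from (z_t, z_{t+1})

params = (list(encoder.parameters()) + list(A.parameters())
          + list(B.parameters()) + list(act_head.parameters()))
opt = torch.optim.Adam(params, lr=1e-3)

def koopman_loss(obs_t, obs_tp1):
    """Observation-only objective: z_{t+1} ~ A z_t + B u_t, with u_t inferred from the pair."""
    z_t, z_tp1 = encoder(obs_t), encoder(obs_tp1)
    u_t = act_head(torch.cat([z_t, z_tp1], dim=-1))
    pred = A(z_t) + B(u_t)
    return ((pred - z_tp1) ** 2).mean()

# Train on observation-only transitions (random tensors as a stand-in for demonstrations).
obs_t, obs_tp1 = torch.randn(256, obs_dim), torch.randn(256, obs_dim)
for _ in range(100):
    opt.zero_grad()
    koopman_loss(obs_t, obs_tp1).backward()
    opt.step()

# Fit a linear action decoder by least squares on a few action-labeled pairs.
with torch.no_grad():
    z_t, z_tp1 = encoder(obs_t[:32]), encoder(obs_tp1[:32])
    u = act_head(torch.cat([z_t, z_tp1], dim=-1))       # latent actions for the labeled subset
a_labeled = torch.randn(32, act_dim)                     # stand-in for real action labels
W = torch.linalg.lstsq(u, a_labeled).solution            # maps latent action -> real action
a_pred = u @ W
```

In this sketch the least-squares matrix W plays the role of the linear action decoder; only the small labeled subset ever touches real actions, which is what gives the claimed action-data efficiency.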
Abstract: In this work, we focus on the problem of safe policy transfer in reinforcement learning: we seek to leverage existing policies when learning a new task with specified constraints. This problem is important for safety-critical applications where interactions are costly and unconstrained policies can lead to undesirable or dangerous outcomes, e.g., with physical robots that interact with humans. We propose a Constrained Markov Decision Process (CMDP) formulation that simultaneously enables the transfer of policies and adherence to safety constraints. Our formulation cleanly separates task goals from safety considerations and permits the specification of a wide variety of constraints. Our approach relies on a novel extension of generalized policy improvement to constrained settings via a Lagrangian formulation. We devise a dual optimization algorithm that estimates the optimal dual variable of a target task, thus enabling safe transfer of policies derived from successor features learned on source tasks. Our experiments in simulated domains show that our approach is effective; it visits unsafe states less frequently and outperforms alternative state-of-the-art methods when taking safety constraints into account.
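A rough Python illustration under stated assumptions, not the authors' algorithm: generalized policy improvement over successor features where action selection uses a Lagrangian combination of reward and cost weights, paired with a projected dual-ascent step on the multiplier. The sizes, the cost estimate, and the budget are invented for the example.

```python
# Sketch of Lagrangian GPI with successor features (assumed mechanism).
import numpy as np

rng = np.random.default_rng(0)
n_policies, n_actions, d = 3, 4, 8                    # illustrative sizes

psi = rng.normal(size=(n_policies, n_actions, d))     # successor features psi_i(s, a) at the current state
w_reward = rng.normal(size=d)                         # target-task reward weights
w_cost = np.abs(rng.normal(size=d))                   # target-task cost weights

def constrained_gpi_action(psi, w_reward, w_cost, lam):
    """GPI over the Lagrangian value: argmax_a max_i psi_i(s,a)^T (w_reward - lam * w_cost)."""
    q_lagrangian = psi @ (w_reward - lam * w_cost)    # shape (n_policies, n_actions)
    i_star, a_star = np.unravel_index(np.argmax(q_lagrangian), q_lagrangian.shape)
    return int(i_star), int(a_star)

def dual_update(lam, cost_estimate, budget, lr=0.05):
    """Projected dual ascent: grow lam while the estimated cost exceeds the budget."""
    return max(0.0, lam + lr * (cost_estimate - budget))

lam = 0.0
for _ in range(20):
    i_star, a_star = constrained_gpi_action(psi, w_reward, w_cost, lam)
    cost_estimate = float(psi[i_star, a_star] @ w_cost)   # cost predicted by the selected source policy
    lam = dual_update(lam, cost_estimate, budget=1.0)
```

The separation the abstract describes is visible here: the task is captured by w_reward, safety by w_cost and the budget, and only the scalar dual variable couples them during transfer.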
Abstract: This paper proposes SCALES, a general framework that translates well-established fairness principles into a common representation based on the Constrained Markov Decision Process (CMDP). With the help of causal language, our framework can place constraints on both the procedure of decision making (procedural fairness) as well as the outcomes resulting from decisions (outcome fairness). Specifically, we show that well-known fairness principles can be encoded either as a utility component, a non-causal component, or a causal component in a SCALES-CMDP. We illustrate SCALES using a set of case studies involving a simulated healthcare scenario and the real-world COMPAS dataset. Experiments demonstrate that our framework produces fair policies that embody alternative fairness principles in single-step and sequential decision-making scenarios.
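A toy, assumption-heavy Python sketch of the general recipe, not the SCALES implementation: a one-step CMDP in which a utility component is maximized subject to a demographic-parity-style (non-causal) constraint, solved as a linear program over per-group action probabilities. The groups, utilities, and fairness budget are invented for illustration.

```python
# One-step constrained decision problem with a parity-style fairness constraint (toy example).
import numpy as np
from scipy.optimize import linprog

# Two applicant groups with equal prior, two actions (0 = deny, 1 = accept).
prior = np.array([0.5, 0.5])
utility = np.array([[0.0, -0.2],   # expected utility of deny/accept for group 0
                    [0.0,  0.6]])  # expected utility of deny/accept for group 1
eps = 0.1                          # allowed gap in acceptance rates (fairness budget)

# Decision variables: p[g, a] = P(action = a | group = g), flattened as [p00, p01, p10, p11].
c = -(prior[:, None] * utility).ravel()        # linprog minimizes, so negate expected utility

# Each group's action probabilities must sum to 1.
A_eq = np.array([[1, 1, 0, 0],
                 [0, 0, 1, 1]], dtype=float)
b_eq = np.array([1.0, 1.0])

# |P(accept | group 0) - P(accept | group 1)| <= eps, written as two linear inequalities.
A_ub = np.array([[0,  1, 0, -1],
                 [0, -1, 0,  1]], dtype=float)
b_ub = np.array([eps, eps])

res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq, bounds=[(0, 1)] * 4)
policy = res.x.reshape(2, 2)
print("P(accept | group):", policy[:, 1])
```

With the numbers above, the unconstrained optimum would accept group 1 only; the parity constraint forces the acceptance rates to within eps of each other, and the LP returns rates of roughly 0.9 and 1.0. Causal constraint components would replace the linear parity inequality with quantities computed from an assumed causal model, which is beyond this toy sketch.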