Yoonchang Sung

ROTATE: Regret-driven Open-ended Training for Ad Hoc Teamwork

May 29, 2025

Caroline Wang, Arrasy Rahman, Jiaxun Cui, Yoonchang Sung, Peter Stone

Abstract:Developing AI agents capable of collaborating with previously unseen partners is a fundamental generalization challenge in multi-agent learning, known as Ad Hoc Teamwork (AHT). Existing AHT approaches typically adopt a two-stage pipeline, where first, a fixed population of teammates is generated with the idea that they should be representative of the teammates that will be seen at deployment time, and second, an AHT agent is trained to collaborate well with agents in the population. To date, the research community has focused on designing separate algorithms for each stage. This separation has led to algorithms that generate teammate pools with limited coverage of possible behaviors, and that ignore whether the generated teammates are easy to learn from for the AHT agent. Furthermore, algorithms for training AHT agents typically treat the set of training teammates as static, thus attempting to generalize to previously unseen partner agents without assuming any control over the distribution of training teammates. In this paper, we present a unified framework for AHT by reformulating the problem as an open-ended learning process between an ad hoc agent and an adversarial teammate generator. We introduce ROTATE, a regret-driven, open-ended training algorithm that alternates between improving the AHT agent and generating teammates that probe its deficiencies. Extensive experiments across diverse AHT environments demonstrate that ROTATE significantly outperforms baselines at generalizing to an unseen set of evaluation teammates, thus establishing a new standard for robust and generalizable teamwork.

Effort Allocation for Deadline-Aware Task and Motion Planning: A Metareasoning Approach

Oct 08, 2024

Yoonchang Sung, Shahaf S. Shperberg, Qi Wang, Peter Stone

Abstract:In robot planning, tasks can often be achieved through multiple options, each consisting of several actions. This work specifically addresses deadline constraints in task and motion planning, aiming to find a plan that can be executed within the deadline despite uncertain planning and execution times. We propose an effort allocation problem, formulated as a Markov decision process (MDP), to find such a plan by leveraging metareasoning perspectives to allocate computational resources among the given options. We formally prove the NP-hardness of the problem by reducing it from the knapsack problem. Both a model-based approach, where transition models are learned from past experience, and a model-free approach, which overcomes the unavailability of prior data acquisition through reinforcement learning, are explored. For the model-based approach, we investigate Monte Carlo tree search (MCTS) to approximately solve the proposed MDP and further design heuristic schemes to tackle NP-hardness, leading to the approximate yet efficient algorithm called DP_Rerun. In experiments, DP_Rerun demonstrates promising performance comparable to MCTS while requiring negligible computation time.

* 48 pages, 6 figures

PRESTO: Fast motion planning using diffusion models based on key-configuration environment representation

Sep 24, 2024

Mingyo Seo, Yoonyoung Cho, Yoonchang Sung, Peter Stone, Yuke Zhu, Beomjoon Kim

Abstract:We introduce a learning-guided motion planning framework that provides initial seed trajectories using a diffusion model for trajectory optimization. Given a workspace, our method approximates the configuration space (C-space) obstacles through a key-configuration representation that consists of a sparse set of task-related key configurations, and uses this as an input to the diffusion model. The diffusion model integrates regularization terms that encourage collision avoidance and smooth trajectories during training, and trajectory optimization refines the generated seed trajectories to further correct any colliding segments. Our experimental results demonstrate that using high-quality trajectory priors, learned through our C-space-grounded diffusion model, enables efficient generation of collision-free trajectories in narrow-passage environments, outperforming prior learning- and planning-based baselines. Videos and additional materials can be found on the project page: https://kiwi-sherbet.github.io/PRESTO.

* Submitted to ICRA 2025

Asynchronous Task Plan Refinement for Multi-Robot Task and Motion Planning

Sep 16, 2023

Yoonchang Sung, Rahul Shome, Peter Stone

Abstract:This paper explores general multi-robot task and motion planning, where multiple robots in close proximity manipulate objects while satisfying constraints and a given goal. In particular, we formulate the plan refinement problem--which, given a task plan, finds valid assignments of variables corresponding to solution trajectories--as a hybrid constraint satisfaction problem. The proposed algorithm follows several design principles that yield the following features: (1) efficient solution finding due to sequential heuristics and implicit time and roadmap representations, and (2) maximized feasible solution space obtained by introducing minimally necessary coordination-induced constraints and not relying on prevalent simplifications that exist in the literature. The evaluation results demonstrate the planning efficiency of the proposed algorithm, outperforming the synchronous approach in terms of makespan.

Decision-Theoretic Approaches for Robotic Environmental Monitoring -- A Survey

Aug 04, 2023

Yoonchang Sung, Jnaneshwar Das, Pratap Tokekar

Abstract:Robotics has dramatically increased our ability to gather data about our environments. This is an opportune time for the robotics and algorithms community to come together to contribute novel solutions to pressing environmental monitoring problems. In order to do so, it is useful to consider a taxonomy of problems and methods in this realm. We present the first comprehensive summary of decision theoretic approaches that are enabling efficient sampling of various kinds of environmental processes. Representations for different kinds of environments are explored, followed by a discussion of tasks of interest such as learning, localization, or monitoring. Finally, various algorithms to carry out these tasks are presented, along with a few illustrative prior results from the community.

Motion Planning (In)feasibility Detection using a Prior Roadmap via Path and Cut Search

May 18, 2023

Yoonchang Sung, Peter Stone

Abstract:Motion planning seeks a collision-free path in a configuration space (C-space), representing all possible robot configurations in the environment. As it is challenging to construct a C-space explicitly for a high-dimensional robot, we generally build a graph structure called a roadmap, a discrete approximation of a complex continuous C-space, to reason about connectivity. Checking collision-free connectivity in the roadmap requires expensive edge-evaluation computations, and thus, reducing the number of evaluations has become a significant research objective. However, in practice, we often face infeasible problems: those in which there is no collision-free path in the roadmap between the start and the goal locations. Existing studies often overlook the possibility of infeasibility, becoming highly inefficient by performing many edge evaluations. In this work, we address this oversight in scenarios where a prior roadmap is available; that is, the edges of the roadmap contain the probability of being a collision-free edge learned from past experience. To this end, we propose an algorithm called iterative path and cut finding (IPC) that iteratively searches for a path and a cut in a prior roadmap to detect infeasibility while reducing expensive edge evaluations as much as possible. We further improve the efficiency of IPC by introducing a second algorithm, iterative decomposition and path and cut finding (IDPC), that leverages the fact that cut-finding algorithms partition the roadmap into smaller subgraphs. We analyze the theoretical properties of IPC and IDPC, such as completeness and computational complexity, and evaluate their performance in terms of completion time and the number of edge evaluations in large-scale simulations.

* 18 pages, 19 figures, Published in Robotics: Science and Systems (RSS), 2023
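
To make the setting concrete, here is a minimal sketch of lazy path search on a prior roadmap. It is not IPC or IDPC: it covers only the path side and declares infeasibility once the roadmap is disconnected after removing edges found to be in collision, with no cut search. The function names, the dictionary-based roadmap, and the `is_collision_free` callback are assumptions made for illustration.

```python
import heapq
import math


def most_likely_path(edge_prob, start, goal):
    """Dijkstra on -log(p) weights: finds the path whose edges have the
    highest joint prior probability of being collision-free, or None."""
    adj = {}
    for (u, v), p in edge_prob.items():
        if p <= 0.0:
            continue  # edge known to be in collision
        w = -math.log(p)
        adj.setdefault(u, []).append((v, w))
        adj.setdefault(v, []).append((u, w))
    dist = {start: 0.0}
    parent = {}
    heap = [(0.0, start)]
    while heap:
        d, u = heapq.heappop(heap)
        if u == goal:
            path = [goal]
            while path[-1] != start:
                path.append(parent[path[-1]])
            return path[::-1]
        if d > dist.get(u, math.inf):
            continue  # stale heap entry
        for v, w in adj.get(u, []):
            nd = d + w
            if nd < dist.get(v, math.inf):
                dist[v] = nd
                parent[v] = u
                heapq.heappush(heap, (nd, v))
    return None


def lazy_search(prior, start, goal, is_collision_free):
    """Evaluate edges only along candidate paths.
    Returns (path or None, number of edge evaluations)."""
    belief = dict(prior)  # hypothetical prior: edge -> P(collision-free)
    evaluated = set()
    evaluations = 0
    while True:
        path = most_likely_path(belief, start, goal)
        if path is None:
            return None, evaluations  # no path left in the roadmap
        all_valid = True
        for u, v in zip(path, path[1:]):
            key = (u, v) if (u, v) in belief else (v, u)
            if key in evaluated:
                continue
            evaluations += 1
            evaluated.add(key)
            if is_collision_free(key):
                belief[key] = 1.0
            else:
                belief[key] = 0.0
                all_valid = False
                break
        if all_valid:
            return path, evaluations


# Toy roadmap: two routes from "s" to "g" with different priors.
prior = {("s", "a"): 0.9, ("a", "g"): 0.9, ("s", "b"): 0.5, ("b", "g"): 0.5}
blocked = {("a", "g")}
path, n_evals = lazy_search(prior, "s", "g", lambda e: e not in blocked)
```

In the infeasible case this baseline can only conclude that no path exists after it has evaluated an edge on every candidate path it tried, which is the kind of inefficiency the abstract describes.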

Learning to Correct Mistakes: Backjumping in Long-Horizon Task and Motion Planning

Nov 15, 2022

Yoonchang Sung, Zizhao Wang, Peter Stone

Abstract:As robots become increasingly capable of manipulation and long-term autonomy, long-horizon task and motion planning problems are becoming increasingly important. A key challenge in such problems is that early actions in the plan may make future actions infeasible. When reaching a dead-end in the search, most existing planners use backtracking, which exhaustively reevaluates motion-level actions, often resulting in inefficient planning, especially when the search depth is large. In this paper, we propose to learn backjumping heuristics which identify the culprit action directly using supervised learning models to guide the task-level search. Based on evaluations on two different tasks, we find that our method significantly improves planning efficiency compared to backtracking and also generalizes to problems with novel numbers of objects.

* 17 pages, 3 figures, Published in the Conference on Robot Learning (CoRL), 2022

Towards Optimal Correlational Object Search

Oct 19, 2021

Kaiyu Zheng, Rohan Chitnis, Yoonchang Sung, George Konidaris, Stefanie Tellex

Abstract:In realistic applications of object search, robots will need to locate target objects in complex environments while coping with unreliable sensors, especially for small or hard-to-detect objects. In such settings, correlational information can be valuable for planning efficiently: when looking for a fork, the robot could start by locating the easier-to-detect refrigerator, since forks would probably be found nearby. Previous approaches to object search with correlational information typically resort to ad-hoc or greedy search strategies. In this paper, we propose the Correlational Object Search POMDP (COS-POMDP), which can be solved to produce search strategies that use correlational information. COS-POMDPs contain a correlation-based observation model that allows us to avoid the exponential blow-up of maintaining a joint belief about all objects, while preserving the optimal solution to this naive, exponential POMDP formulation. We propose a hierarchical planning algorithm to scale up COS-POMDP for practical domains. We conduct experiments using AI2-THOR, a realistic simulator of household environments, as well as YOLOv5, a widely used object detector. Our results show that, particularly for hard-to-detect objects such as a scrub brush and a remote control, our method offers the most robust performance compared to baselines that ignore correlations as well as a greedy, next-best-view approach.

* 10 pages, 4 figures, 3 tables

Reactive Task and Motion Planning under Temporal Logic Specifications

Mar 26, 2021

Shen Li, Daehyung Park, Yoonchang Sung, Julie A. Shah, Nicholas Roy

Abstract:We present a task-and-motion planning (TAMP) algorithm robust against a human operator's cooperative or adversarial interventions. Interventions often invalidate the current plan and require replanning on the fly. Replanning can be computationally expensive and often interrupts seamless task execution. We introduce a dynamically reconfigurable planning methodology with behavior tree-based control strategies toward reactive TAMP, which takes advantage of previous plans and incremental graph search during temporal logic-based reactive synthesis. Our algorithm also shows efficient recovery functionalities that minimize the number of replanning steps. Finally, our algorithm produces a robust, efficient, and complete TAMP solution. Our experimental results show that the algorithm achieves superior manipulation performance in both simulated and real-world tasks.

* 7 pages, 6 figures, Published in IEEE International Conference on Robotics and Automation (ICRA), 2021

Learning When to Quit: Meta-Reasoning for Motion Planning

Mar 07, 2021

Yoonchang Sung, Leslie Pack Kaelbling, Tomás Lozano-Pérez

Abstract:Anytime motion planners are widely used in robotics. However, the relationship between their solution quality and computation time is not well understood, and thus it is unclear when to quit planning and start execution. In this paper, we address the problem of deciding when to stop deliberation under bounded computational capacity, so-called meta-reasoning, for anytime motion planning. We propose data-driven learning methods, model-based and model-free meta-reasoning, that are applicable to different environment distributions and agnostic to the choice of anytime motion planners. As a part of the framework, we design a convolutional neural network-based optimal solution predictor that predicts the optimal path length from a given 2D workspace image. We empirically evaluate the performance of the proposed methods in simulation in comparison with baselines.

* 8 pages, 5 figures, Submitted to IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2021
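
The stopping decision can be illustrated with a simple myopic rule, sketched below. This is not the paper's learned model-based or model-free method. It assumes a predicted performance profile (path cost after each planning step) is already available and that planning time is charged at a fixed rate. The names `predicted_costs`, `time_step`, and `time_cost_rate` are hypothetical.

```python
def myopic_stopping_index(predicted_costs, time_step, time_cost_rate):
    """Return the first step index at which one more planning step is
    predicted to improve path cost by no more than the time it costs.

    predicted_costs: predicted path cost after each planning step (assumed given)
    time_step:       duration of one planning step, in seconds
    time_cost_rate:  cost charged per second of deliberation (assumed constant)
    """
    step_price = time_cost_rate * time_step
    for k in range(len(predicted_costs) - 1):
        predicted_gain = predicted_costs[k] - predicted_costs[k + 1]
        if predicted_gain <= step_price:
            return k  # further deliberation is not predicted to pay off
    return len(predicted_costs) - 1  # profile exhausted; stop at the last step


# Toy profile with diminishing returns: the gains are 3.0, 1.0, 0.2, 0.05.
profile = [10.0, 7.0, 6.0, 5.8, 5.75]
stop_at = myopic_stopping_index(profile, time_step=1.0, time_cost_rate=0.5)
```

A myopic rule looks only one step ahead, so it can stop too early when a large improvement follows a plateau. That limitation is one reason to learn the stopping policy from data instead.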
