Mingyu Cai

Bridging Deep Reinforcement Learning and Motion Planning for Model-Free Navigation in Cluttered Environments

Apr 09, 2025

Licheng Luo, Mingyu Cai

Abstract:Deep Reinforcement Learning (DRL) has emerged as a powerful model-free paradigm for learning optimal policies. However, in real-world navigation tasks, DRL methods often suffer from insufficient exploration, particularly in cluttered environments with sparse rewards or complex dynamics under system disturbances. To address this challenge, we bridge general graph-based motion planning with DRL, enabling agents to explore cluttered spaces more effectively and achieve desired navigation performance. Specifically, we design a dense reward function grounded in a graph structure that spans the entire state space. This graph provides rich guidance, steering the agent toward optimal strategies. We validate our approach in challenging environments, demonstrating substantial improvements in exploration efficiency and task success rates. The project website is available at: https://plen1lune.github.io/overcome_exploration/

* 10 pages
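
The abstract above describes a dense reward grounded in a graph that spans the state space. A minimal sketch of that idea follows, under loud assumptions: a small occupancy grid stands in for the state space, breadth-first search stands in for the graph-based planner, and the names `graph_distances`, `dense_reward`, `GRID`, and `DIST` are hypothetical. It is not the authors' implementation.

```python
from collections import deque

def graph_distances(grid, goal):
    """BFS over free cells (0 = free, 1 = obstacle).

    Returns a dict mapping each reachable cell to its hop count from the goal.
    Assumption: a 4-connected grid stands in for the planning graph.
    """
    rows, cols = len(grid), len(grid[0])
    dist = {goal: 0}
    queue = deque([goal])
    while queue:
        r, c = queue.popleft()
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            inside = 0 <= nr < rows and 0 <= nc < cols
            if inside and grid[nr][nc] == 0 and (nr, nc) not in dist:
                dist[(nr, nc)] = dist[(r, c)] + 1
                queue.append((nr, nc))
    return dist

def dense_reward(dist, state, next_state):
    """Reward equals the decrease in graph distance to the goal (hypothetical shaping)."""
    return float(dist[state] - dist[next_state])

# A wall (row 1) separates the start (0, 0) from the goal (2, 0).
GRID = [
    [0, 0, 0],
    [1, 1, 0],
    [0, 0, 0],
]
DIST = graph_distances(GRID, goal=(2, 0))
```

In this toy grid the start is two cells from the goal in a straight line but six hops away along the graph, so a straight-line distance reward would push the agent into the wall while the graph-based reward leads it around.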

TLINet: Differentiable Neural Network Temporal Logic Inference

May 14, 2024

Danyang Li, Mingyu Cai, Cristian-Ioan Vasile, Roberto Tron

Abstract:There has been a growing interest in extracting formal descriptions of the system behaviors from data. Signal Temporal Logic (STL) is an expressive formal language used to describe spatial-temporal properties with interpretability. This paper introduces TLINet, a neural-symbolic framework for learning STL formulas. The computation in TLINet is differentiable, enabling the usage of off-the-shelf gradient-based tools during the learning process. In contrast to existing approaches, we introduce approximation methods for max operator designed specifically for temporal logic-based gradient techniques, ensuring the correctness of STL satisfaction evaluation. Our framework not only learns the structure but also the parameters of STL formulas, allowing flexible combinations of operators and various logical structures. We validate TLINet against state-of-the-art baselines, demonstrating that our approach outperforms these baselines in terms of interpretability, compactness, rich expressibility, and computational efficiency.

Hierarchical Deep Learning for Intention Estimation of Teleoperation Manipulation in Assembly Tasks

Mar 28, 2024

Mingyu Cai, Karankumar Patel, Soshi Iba, Songpo Li

Abstract:In human-robot collaboration, shared control presents an opportunity to teleoperate robotic manipulation to improve the efficiency of manufacturing and assembly processes. Robots are expected to assist in executing the user's intentions. To this end, robust and prompt intention estimation is needed, relying on behavioral observations. The proposed framework estimates intention at hierarchical levels, i.e., low-level actions and high-level tasks, by incorporating multi-scale hierarchical information in neural networks. Technically, we employ a hierarchical dependency loss to boost overall accuracy. Furthermore, we propose a multi-window method that assigns proper hierarchical prediction windows of input data. An analysis of the predictive power with various inputs demonstrates the advantage of the deep hierarchical model in terms of prediction accuracy and early intention identification. We implement the algorithm on a virtual reality (VR) setup to teleoperate robotic hands in a simulation with various assembly tasks to show the effectiveness of online estimation.

* ICRA 2024

Model-free Motion Planning of Autonomous Agents for Complex Tasks in Partially Observable Environments

Apr 30, 2023

Junchao Li, Mingyu Cai, Zhen Kan, Shaoping Xiao

Abstract:Motion planning of autonomous agents in partially known environments with incomplete information is a challenging problem, particularly for complex tasks. This paper proposes a model-free reinforcement learning approach to address this problem. We formulate motion planning as a probabilistic-labeled partially observable Markov decision process (PL-POMDP) problem and use linear temporal logic (LTL) to express the complex task. The LTL formula is then converted to a limit-deterministic generalized Büchi automaton (LDGBA). The problem is redefined as finding an optimal policy on the product of PL-POMDP with LDGBA based on model-checking techniques to satisfy the complex task. We implement deep Q learning with long short-term memory (LSTM) to process the observation history and task recognition. Our contributions include the proposed method, the utilization of LTL and LDGBA, and the LSTM-enhanced deep Q learning. We demonstrate the applicability of the proposed method by conducting simulations in various environments, including grid worlds, a virtual office, and a multi-agent warehouse. The simulation results demonstrate that our proposed method effectively addresses environment, action, and observation uncertainties. This indicates its potential for real-world applications, including the control of unmanned aerial vehicles (UAVs).

* 32 pages, 22 figures, submitted to Autonomous Agents and Multi-Agent Systems

Efficient LQR-CBF-RRT*: Safe and Optimal Motion Planning

Apr 04, 2023

Guang Yang, Mingyu Cai, Ahmad Ahmad, Calin Belta, Roberto Tron

Abstract:Control Barrier Functions (CBF) are a powerful tool for designing safety-critical controllers and motion planners. The safety requirements are encoded as a continuously differentiable function that maps from state variables to a real value, in which the sign of its output determines whether safety is violated. In practice, CBFs can be used to enforce safety by imposing them as constraints in a Quadratic Program (QP) solved point-wise in time. However, this approach costs computational resources and could lead to infeasibility in solving the QP. In this paper, we propose a novel motion planning framework that combines sampling-based methods with Linear Quadratic Regulator (LQR) and CBFs. Our approach does not require solving the QPs for control synthesis and avoids explicit collision checking during sampling. Instead, it uses LQR to generate optimal controls and CBF to reject unsafe trajectories. To improve sampling efficiency, we employ the Cross-Entropy Method (CEM) for importance sampling (IS) to sample configurations that will enhance the path with higher probability and store computed optimal gain matrices in a hash table to avoid re-computation during the rewiring procedure. We demonstrate the effectiveness of our method on nonlinear control affine systems in simulation.
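
As a rough illustration of the sign convention described in the abstract above, the sketch below encodes a circular obstacle as a barrier function and rejects any trajectory containing a state where the function is negative. This is a simplification made for illustration: it checks only the sign of the function at sampled states and omits the derivative condition a full CBF imposes, and the names `barrier` and `is_safe` as well as the obstacle geometry are hypothetical.

```python
def barrier(state, center, radius):
    """h(x) = ||x - center||^2 - radius^2; h >= 0 means outside the obstacle."""
    dx = state[0] - center[0]
    dy = state[1] - center[1]
    return dx * dx + dy * dy - radius * radius

def is_safe(trajectory, center, radius):
    """Reject a trajectory if the barrier value is negative at any sampled state.

    Assumption: a pointwise sign check only; a full CBF also constrains
    how fast h may decrease along the system dynamics.
    """
    return all(barrier(s, center, radius) >= 0.0 for s in trajectory)

OBSTACLE_CENTER = (1.0, 2.0)
OBSTACLE_RADIUS = 1.0

# Passes below the obstacle: every sampled state keeps h >= 0.
safe_path = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
# Cuts through the obstacle: h < 0 at (1.0, 1.5).
unsafe_path = [(0.0, 0.0), (1.0, 1.5)]
```

A sampling-based planner could call a check like `is_safe` on each candidate edge in place of a geometric collision test.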

Learning Minimally-Violating Continuous Control for Infeasible Linear Temporal Logic Specifications

Oct 06, 2022

Mingyu Cai, Makai Mann, Zachary Serlin, Kevin Leahy, Cristian-Ioan Vasile

Abstract:This paper explores continuous-time control synthesis for target-driven navigation to satisfy complex high-level tasks expressed as linear temporal logic (LTL). We propose a model-free framework using deep reinforcement learning (DRL) where the underlying dynamic system is unknown (an opaque box). Unlike prior work, this paper considers scenarios where the given LTL specification might be infeasible and therefore cannot be accomplished globally. Instead of modifying the given LTL formula, we provide a general DRL-based approach to satisfy it with minimal violation. To do this, we transform a previously multi-objective DRL problem, which requires simultaneous automata satisfaction and minimum violation cost, into a single objective. By guiding the DRL agent with a sampling-based path planning algorithm for the potentially infeasible LTL task, the proposed approach mitigates the myopic tendencies of DRL, which are often an issue when learning general LTL tasks that can have long or infinite horizons. This is achieved by decomposing an infeasible LTL formula into several reach-avoid sub-tasks with shorter horizons, which can be trained in a modular DRL architecture. Furthermore, we overcome the challenge of the exploration process for DRL in complex and cluttered environments by using path planners to design rewards that are dense in the configuration space. The benefits of the presented approach are demonstrated through testing on various complex nonlinear systems and compared with state-of-the-art baselines. A video demonstration is available on YouTube: https://youtu.be/jBhx6Nv224E.

Learning Signal Temporal Logic through Neural Network for Interpretable Classification

Oct 04, 2022

Danyang Li, Mingyu Cai, Cristian-Ioan Vasile, Roberto Tron

Abstract:Machine learning techniques using neural networks have achieved promising success for time-series data classification. However, the models that they produce are challenging to verify and interpret. In this paper, we propose an explainable neural-symbolic framework for the classification of time-series behaviors. In particular, we use an expressive formal language, namely Signal Temporal Logic (STL), to constrain the search of the computation graph for a neural network. We design a novel time function and sparse softmax function to improve the soundness and precision of the neural-STL framework. As a result, we can efficiently learn a compact STL formula for the classification of time-series data through off-the-shelf gradient-based tools. We demonstrate the computational efficiency, compactness, and interpretability of the proposed method through driving scenarios and naval surveillance case studies, compared with state-of-the-art baselines.

A Robotic Visual Grasping Design: Rethinking Convolution Neural Network with High-Resolutions

Sep 16, 2022

Zhangli Zhou, Shaochen Wang, Ziyang Chen, Mingyu Cai, Zhen Kan

Abstract:High-resolution representations are important for vision-based robotic grasping problems. Existing works generally encode the input images into low-resolution representations via sub-networks and then recover high-resolution representations. This loses spatial information, and errors introduced by the decoder become more serious when multiple types of objects are considered or objects are far away from the camera. To address these issues, we revisit the design paradigm of CNN for robotic perception tasks. We demonstrate that using parallel branches as opposed to serial stacked convolutional layers is a more powerful design for robotic visual grasping tasks. In particular, guidelines of neural network design are provided for robotic perception tasks, e.g., high-resolution representation and lightweight design, which respond to the challenges in different manipulation scenarios. We then develop a novel grasping visual architecture referred to as HRG-Net, a parallel-branch structure that always maintains a high-resolution representation and repeatedly exchanges information across resolutions. Extensive experiments validate that these two designs can effectively enhance the accuracy of visual-based grasping and accelerate network training. We show a series of comparative experiments in real physical environments on YouTube: https://youtu.be/Jhlsp-xzHFY.

Overcoming Exploration: Deep Reinforcement Learning in Complex Environments from Temporal Logic Specifications

Feb 01, 2022

Mingyu Cai, Erfan Aasi, Calin Belta, Cristian-Ioan Vasile

Abstract:We present a Deep Reinforcement Learning (DRL) algorithm for a task-guided robot with unknown continuous-time dynamics deployed in a large-scale complex environment. Linear Temporal Logic (LTL) is applied to express a rich robotic specification. To overcome the environmental challenge, we propose a novel path planning-guided reward scheme that is dense over the state space, and crucially, robust to infeasibility of computed geometric paths due to the unknown robot dynamics. To facilitate LTL satisfaction, our approach decomposes the LTL mission into sub-tasks that are solved using distributed DRL, where the sub-tasks are trained in parallel, using Deep Policy Gradient algorithms. Our framework is shown to significantly improve performance (effectiveness, efficiency) and exploration of robots tasked with complex missions in large-scale complex environments.

Time-Incremental Learning from Data Using Temporal Logics

Dec 28, 2021

Erfan Aasi, Mingyu Cai, Cristian Ioan Vasile, Calin Belta

Abstract:Real-time and human-interpretable decision-making in cyber-physical systems is a significant but challenging task, which usually requires predictions of possible future events from limited data. In this paper, we introduce a time-incremental learning framework: given a dataset of labeled signal traces with a common time horizon, we propose a method to predict the label of a signal that is received incrementally over time, referred to as a prefix signal. Prefix signals are the signals that are being observed as they are generated, and their time length is shorter than the common horizon of signals. We present a novel decision-tree based approach to generate a finite number of Signal Temporal Logic (STL) specifications from the given dataset, and construct a predictor based on them. Each STL specification, as a binary classifier of time-series data, captures the temporal properties of the dataset over time. The predictor is constructed by assigning time-variant weights to the STL formulas. The weights are learned by using neural networks, with the goal of minimizing the misclassification rate for the prefix signals defined over the given dataset. The learned predictor is used to predict the label of a prefix signal, by computing the weighted sum of the robustness of the prefix signal with respect to each STL formula. The effectiveness and classification performance of our algorithm are evaluated on urban-driving and naval-surveillance case studies.
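
The last step of the abstract above, predicting a label from the weighted sum of robustness values, can be sketched as follows. The sketch makes several assumptions: only two simple formula templates ("always x > c" and "eventually x > c") are used, the weights are fixed constants rather than time-variant outputs of a neural network, and every name (`rob_always_gt`, `rob_eventually_gt`, `predict`) is hypothetical.

```python
def rob_always_gt(signal, c):
    """Robustness of 'always x > c' over the observed prefix: worst-case margin."""
    return min(x - c for x in signal)

def rob_eventually_gt(signal, c):
    """Robustness of 'eventually x > c' over the observed prefix: best-case margin."""
    return max(x - c for x in signal)

def predict(prefix, formulas, weights):
    """Label a prefix signal by the sign of the weighted robustness sum.

    Assumption: weights are given constants here; in the abstract they are
    time-variant and learned with neural networks.
    """
    score = sum(w * f(prefix) for f, w in zip(formulas, weights))
    label = 1 if score >= 0 else -1
    return label, score

prefix = [1.0, 2.0, 3.0]
formulas = [
    lambda s: rob_always_gt(s, 0.5),      # satisfied with margin 0.5
    lambda s: rob_eventually_gt(s, 4.0),  # violated with margin -1.0
]
weights = [0.75, 0.25]
label, score = predict(prefix, formulas, weights)
```

Here the heavily weighted first formula is satisfied, so the prefix is labeled positive even though the second formula is violated so far.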
