Jianglin Lan

Grower-in-the-Loop Interactive Reinforcement Learning for Greenhouse Climate Control

May 29, 2025

Maxiu Xiao, Jianglin Lan, Jingxing Yu, Eldert van Henten, Congcong Sun

Abstract: Climate control is crucial for greenhouse production as it directly affects crop growth and resource use. Reinforcement learning (RL) has received increasing attention in this field, but still faces challenges, including limited training efficiency and high reliance on initial learning conditions. Interactive RL, which combines human (grower) input with the RL agent's learning, offers a potential solution to overcome these challenges. However, interactive RL has not yet been applied to greenhouse climate control and may face challenges related to imperfect inputs. Therefore, this paper aims to explore the possibility and performance of applying interactive RL with imperfect inputs to greenhouse climate control, by: (1) developing three representative interactive RL algorithms tailored for greenhouse climate control (reward shaping, policy shaping and control sharing); (2) analyzing how input characteristics often conflict, and how the trade-offs between them make growers' inputs difficult to perfect; (3) proposing a neural network-based approach to enhance the robustness of interactive RL agents under limited input availability; (4) conducting a comprehensive evaluation of the three interactive RL algorithms with imperfect inputs in a simulated greenhouse environment. The demonstration shows that interactive RL incorporating imperfect grower inputs has the potential to improve the performance of the RL agent. RL algorithms that influence action selection, such as policy shaping and control sharing, perform better when dealing with imperfect inputs, achieving 8.4% and 6.8% improvement in profit, respectively. In contrast, reward shaping, an algorithm that manipulates the reward function, is sensitive to imperfect inputs and leads to a 9.4% decrease in profit. This highlights the importance of selecting an appropriate mechanism when incorporating imperfect inputs.
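The three mechanisms named in the abstract above differ in where the grower's input enters the learning loop. The sketch below shows generic textbook forms of each, not the tailored algorithms of the paper, whose details the abstract does not give. All function names, the `weight` and `share_prob` parameters, and the fallback rules are assumptions made for illustration.

```python
import random


def reward_shaping(env_reward, grower_signal, weight=0.5):
    """Reward shaping: add a weighted grower feedback signal to the reward.

    The input changes the learning target itself, so a wrong signal
    directly distorts what the agent optimises.
    """
    return env_reward + weight * grower_signal


def policy_shaping(agent_probs, grower_probs):
    """Policy shaping: multiply the two action distributions and renormalise."""
    combined = [a * g for a, g in zip(agent_probs, grower_probs)]
    total = sum(combined)
    if total == 0.0:
        # Assumed fallback: the distributions do not overlap, keep the agent's.
        return list(agent_probs)
    return [c / total for c in combined]


def control_sharing(agent_action, grower_action, share_prob, rng=random):
    """Control sharing: with probability share_prob, execute the grower's action.

    grower_action may be None when no input is available at this step.
    """
    if grower_action is not None and rng.random() < share_prob:
        return grower_action
    return agent_action
```

In this simplified picture, policy shaping and control sharing only bias which action is taken, while reward shaping alters the quantity being maximised, which is one plausible reading of why the abstract reports reward shaping as more sensitive to imperfect inputs.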

Automatic Robot Task Planning by Integrating Large Language Model with Genetic Programming

Feb 11, 2025

Azizjon Kobilov, Jianglin Lan

Abstract: Accurate task planning is critical for controlling autonomous systems, such as robots, drones, and self-driving vehicles. Behavior Trees (BTs) are considered one of the most prominent control-policy-defining frameworks in task planning, due to their modularity, flexibility, and reusability. Generating reliable and accurate BT-based control policies for robotic systems remains challenging and often requires domain expertise. In this paper, we present the LLM-GP-BT technique, which leverages a Large Language Model (LLM) and Genetic Programming (GP) to automate the generation and configuration of BTs. The LLM-GP-BT technique processes robot task commands expressed in natural language and converts them into accurate and reliable BT-based task plans in a computationally efficient and user-friendly manner. The proposed technique is systematically developed and validated through simulation experiments, demonstrating its potential to streamline task planning for autonomous systems.

* Submitted to IEEE Conference

Enhanced Visual SLAM for Collision-free Driving with Lightweight Autonomous Cars

Aug 21, 2024

Zhihao Lin, Zhen Tian, Qi Zhang, Hanyang Zhuang, Jianglin Lan

Abstract: The paper presents a vision-based obstacle avoidance strategy for lightweight self-driving cars that can be run on a CPU-only device using a single RGB-D camera. The method consists of two steps: visual perception and path planning. The visual perception part uses ORB-SLAM3 enhanced with optical flow to estimate the car's poses and extract rich texture information from the scene. In the path planning phase, we employ a method combining a control Lyapunov function and a control barrier function in the form of a quadratic program (CLF-CBF-QP), together with an obstacle shape reconstruction process (SRP), to plan safe and stable trajectories. To validate the performance and robustness of the proposed method, simulation experiments were conducted with a car in various complex indoor environments using the Gazebo simulation environment. Our method can effectively avoid obstacles in the scenes. The proposed algorithm outperforms benchmark algorithms in achieving more stable and shorter trajectories across multiple simulated scenes.

* 16 pages; Submitted to a journal
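To illustrate the control barrier function half of the CLF-CBF-QP idea in the abstract above, the following sketch filters a nominal command so that a circular obstacle is avoided. It is not the paper's formulation: it assumes single-integrator dynamics `x_dot = u`, one circular obstacle, and no CLF term or shape reconstruction, so the quadratic program has a single linear constraint and can be solved in closed form by projecting onto a half-space. The function name and the `alpha` gain are hypothetical.

```python
def cbf_safety_filter(x, u_nom, obstacle, radius, alpha=1.0):
    """Minimally modify u_nom so that h(x) = ||x - obstacle||^2 - radius^2 stays >= 0.

    Assumes x_dot = u (a stand-in model, not the car dynamics of the paper).
    Solves  min ||u - u_nom||^2  s.t.  grad_h(x) . u >= -alpha * h(x).
    """
    dx = [xi - oi for xi, oi in zip(x, obstacle)]
    h = sum(d * d for d in dx) - radius ** 2
    a = [2.0 * d for d in dx]      # gradient of h with respect to x
    b = -alpha * h                 # constraint reads a . u >= b
    a_dot_u = sum(ai * ui for ai, ui in zip(a, u_nom))
    if a_dot_u >= b:
        return list(u_nom)         # nominal command already satisfies the constraint
    a_norm_sq = sum(ai * ai for ai in a)
    if a_norm_sq == 0.0:
        return list(u_nom)         # gradient vanishes at the obstacle centre
    # Project u_nom onto the boundary of the half-space a . u >= b.
    scale = (b - a_dot_u) / a_norm_sq
    return [ui + scale * ai for ui, ai in zip(u_nom, a)]
```

For example, at `x = (2, 0)` with an obstacle of radius 1 at the origin, a nominal command `(-5, 0)` heading straight at the obstacle is slowed to `(-0.75, 0)`, while a command pointing away is returned unchanged.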

A Conflicts-free, Speed-lossless KAN-based Reinforcement Learning Decision System for Interactive Driving in Roundabouts

Aug 15, 2024

Zhihao Lin, Zhen Tian, Qi Zhang, Ziyang Ye, Hanyang Zhuang, Jianglin Lan

Abstract: Safety and efficiency are crucial for autonomous driving in roundabouts, especially in the context of mixed traffic where autonomous vehicles (AVs) and human-driven vehicles coexist. This paper introduces a learning-based algorithm tailored to foster safe and efficient driving behaviors across varying levels of traffic flow in roundabouts. The proposed algorithm employs a deep Q-learning network to effectively learn safe and efficient driving strategies in complex multi-vehicle roundabouts. Additionally, a KAN (Kolmogorov-Arnold network) enhances the AVs' ability to learn their surroundings robustly and precisely. An action inspector is integrated to replace dangerous actions and thereby avoid collisions when the AV interacts with the environment, and a route planner is proposed to enhance the driving efficiency and safety of the AVs. Moreover, a model predictive controller is adopted to ensure stability and precision of the driving actions. The results show that our proposed system consistently achieves safe and efficient driving whilst maintaining a stable training process, as evidenced by the smooth convergence of the reward function and the low variance in the training curves across various traffic flows. Compared to state-of-the-art benchmarks, the proposed algorithm achieves a lower number of collisions and reduced travel time to destination.

* 15 pages, 12 figures, submitted to an IEEE journal
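The abstract above mentions an action inspector that replaces dangerous actions chosen by the Q-network. It does not say how the replacement is picked, so the sketch below assumes one simple rule: take the highest-valued action that passes a safety check. The function name and the `is_dangerous` callback are hypothetical.

```python
def inspect_action(q_values, is_dangerous):
    """Return the index of the highest-Q action that the safety check accepts.

    q_values: list of Q-values, one per discrete action.
    is_dangerous: callable taking an action index, returning True if unsafe.
    Assumed fallback: if every action is flagged, return the greedy action.
    """
    ranked = sorted(range(len(q_values)), key=lambda i: q_values[i], reverse=True)
    for action in ranked:
        if not is_dangerous(action):
            return action
    return ranked[0]
```

Keeping the inspector outside the network means the learned values are untouched and only the executed action changes.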

Efficient model predictive control for nonlinear systems modelled by deep neural networks

May 16, 2024

Jianglin Lan

Abstract: This paper presents a model predictive control (MPC) scheme for dynamic systems whose nonlinearity and uncertainty are modelled by deep neural networks (NNs), under input and state constraints. Since the NN output contains a high-order complex nonlinearity of the system state and control input, the MPC problem is nonlinear and challenging to solve for real-time control. This paper proposes two types of methods for solving the MPC problem: the mixed integer programming (MIP) method, which produces an exact solution to the nonlinear MPC, and linear relaxation (LR) methods, which generally give suboptimal solutions but are computationally much cheaper. Extensive numerical simulation of an inverted pendulum system modelled by ReLU NNs of various sizes is used to demonstrate and compare the performance of the MIP and LR methods.

* 8 pages, 5 figures
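The contrast between the exact MIP method and the cheaper LR methods in the abstract above can be seen on a single ReLU unit `y = max(0, x)` with known pre-activation bounds `lower < 0 < upper`. The sketch below checks two standard constraint sets: the big-M mixed-integer encoding with one binary variable `z`, and the triangle linear relaxation. These are common encodings from the NN verification literature and are assumed here for illustration; the abstract does not state which formulations the paper uses.

```python
def relu(x):
    return max(0.0, x)


def satisfies_big_m(x, y, z, lower, upper, tol=1e-9):
    """Exact mixed-integer encoding of y = relu(x), with z in {0, 1}.

    z = 0 forces y = 0 (and x <= 0); z = 1 forces y = x (and x >= 0).
    """
    return (
        z in (0, 1)
        and y >= -tol
        and y >= x - tol
        and y <= upper * z + tol
        and y <= x - lower * (1 - z) + tol
    )


def satisfies_triangle(x, y, lower, upper, tol=1e-9):
    """Linear relaxation: the binary variable is dropped and y is only
    required to lie in the triangle between relu(x) and the chord."""
    chord = upper * (x - lower) / (upper - lower)
    return y >= -tol and y >= x - tol and y <= chord + tol
```

With bounds `[-2, 3]`, the point `(x, y) = (-1, 0.5)` is rejected by the exact encoding for both values of `z` but accepted by the relaxation, which is why a relaxed MPC solution can be suboptimal for the true network.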

Real-Time Safe Control of Neural Network Dynamic Models with Sound Approximation

Apr 20, 2024

Hanjiang Hu, Jianglin Lan, Changliu Liu

Abstract: Safe control of neural network dynamic models (NNDMs) is important to robotics and many applications. However, it remains challenging to compute an optimal safe control in real time for NNDMs. To enable real-time computation, we propose to use a sound approximation of the NNDM in the control synthesis. In particular, we propose Bernstein over-approximated neural dynamics (BOND) based on the Bernstein polynomial over-approximation (BPO) of ReLU activation functions in the NNDM. To mitigate the errors introduced by the approximation and to ensure persistent feasibility of the safe control problems, we synthesize a worst-case safety index offline using the most unsafe approximated state within the BPO relaxation of the NNDM. For the online real-time optimization, we formulate the first-order Taylor approximation of the nonlinear worst-case safety constraint as an additional linear layer of the NNDM, with an l2-bounded bias term for the higher-order remainder. Comprehensive experiments with different neural dynamics and safety constraints show that, with safety guaranteed, our NNDMs with sound approximation are 10-100 times faster than the safe control baseline that uses mixed integer programming (MIP), validating the effectiveness of the worst-case safety index and the scalability of the proposed BOND in real-time large-scale settings.

* L4DC 2024, 12 pages, 3 figures, 4 tables
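The Bernstein polynomial over-approximation of ReLU mentioned in the abstract above rests on a classical fact: the Bernstein polynomial of a convex function on an interval lies on or above the function. The sketch below builds that polynomial for ReLU on `[lower, upper]` so the property can be checked numerically. It illustrates only this upper bound for a single activation; how BOND assembles such bounds across a whole network is not described in the abstract, and the function name and default `degree` are assumptions.

```python
from math import comb


def relu(x):
    return max(0.0, x)


def bernstein_relu(x, lower, upper, degree=4):
    """Degree-n Bernstein polynomial of relu on [lower, upper], evaluated at x.

    B_n(x) = sum_k relu(node_k) * C(n, k) * t^k * (1 - t)^(n - k),
    with t = (x - lower) / (upper - lower) and equally spaced nodes.
    Because relu is convex, B_n(x) >= relu(x) for all x in the interval.
    """
    t = (x - lower) / (upper - lower)
    total = 0.0
    for k in range(degree + 1):
        node = lower + k * (upper - lower) / degree
        total += relu(node) * comb(degree, k) * t ** k * (1 - t) ** (degree - k)
    return total
```

The polynomial matches ReLU at both endpoints and is smooth in between, which is what makes it usable inside gradient-based or polynomial optimisation where the kink of ReLU is inconvenient.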

Runtime Monitoring and Fault Detection for Neural Network-Controlled Systems

Mar 24, 2024

Jianglin Lan, Siyuan Zhan, Ron Patton, Xianxian Zhao

Abstract: There is an emerging trend in applying deep learning methods to control complex nonlinear systems. This paper considers enhancing the runtime safety of nonlinear systems controlled by neural networks in the presence of disturbance and measurement noise. A robustly stable interval observer is designed to generate sound and precise lower and upper bounds for the neural network, nonlinear function, and system state. The obtained interval is utilised to monitor the real-time system safety and detect faults in the system outputs or actuators. An adaptive cruise control vehicular system is simulated to demonstrate the effectiveness of the proposed design.

* Accepted to SAFEPROCESS 2024

Provably Robust and Plausible Counterfactual Explanations for Neural Networks via Robust Optimisation

Sep 22, 2023

Junqi Jiang, Jianglin Lan, Francesco Leofante, Antonio Rago, Francesca Toni

Abstract: Counterfactual Explanations (CEs) have received increasing interest as a major methodology for explaining neural network classifiers. Usually, CEs for an input-output pair are defined as data points with minimum distance to the input that are classified with a different label than the output. To tackle the established problem that CEs are easily invalidated when model parameters are updated (e.g. retrained), studies have proposed ways to certify the robustness of CEs under model parameter changes bounded by a norm ball. However, existing methods targeting this form of robustness are not sound or complete, and they may generate implausible CEs, i.e., outliers with respect to the training dataset. In fact, no existing method simultaneously optimises for proximity and plausibility while preserving robustness guarantees. In this work, we propose Provably RObust and PLAusible Counterfactual Explanations (PROPLACE), a method leveraging robust optimisation techniques to address the aforementioned limitations in the literature. We formulate an iterative algorithm to compute provably robust CEs and prove its convergence, soundness and completeness. Through a comparative experiment involving six baselines, five of which target robustness, we show that PROPLACE achieves state-of-the-art performance on metrics covering three evaluation aspects.

* Accepted at ACML 2023, camera-ready version

Data-Driven Cooperative Adaptive Cruise Control for Unknown Nonlinear Vehicle Platoons

Jul 21, 2023

Jianglin Lan

Abstract: This paper studies cooperative adaptive cruise control (CACC) for vehicle platoons with consideration of the unknown nonlinear vehicle dynamics that are normally ignored in the literature. A unified data-driven CACC design is proposed for platoons of pure automated vehicles (AVs) or of mixed AVs and human-driven vehicles (HVs). The CACC leverages sufficient data samples of vehicle accelerations, spacing and relative velocities collected online. The data-driven control design is formulated as a semidefinite program (SDP) that can be solved efficiently using off-the-shelf solvers. The efficacy and advantage of the proposed CACC are demonstrated through a comparison with the classic adaptive cruise control (ACC) method on a platoon of pure AVs and a mixed platoon under a representative aggressive driving profile.

* 6 pages, 5 figures; This paper is under submission

Data-driven dual-loop control for platooning mixed human-driven and automated vehicles

Jul 21, 2023

Jianglin Lan

Abstract: This paper considers controlling automated vehicles (AVs) to form a platoon with human-driven vehicles (HVs) under consideration of unknown HV model parameters and propulsion time constants. The proposed design is a data-driven dual-loop control strategy for the ego AVs, where the inner loop controller ensures platoon stability and the outer loop controller keeps a safe inter-vehicular spacing under control input limits. The inner loop controller is a constant-gain state feedback controller obtained by solving a semidefinite program (SDP) using the online collected data of platooning errors. The outer loop is a model predictive controller (MPC) that embeds a data-driven internal model to predict the future platooning error evolution. The proposed design is evaluated on a mixed platoon with a representative aggressive reference velocity profile, the SFTP-US06 Drive Cycle. The results confirm the efficacy of the design and its advantages over the existing single-loop data-driven MPC in terms of platoon stability and computational cost.

* 10 pages, 6 figures. This paper has been accepted by IET Intelligent Transport Systems
