Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Andreas A. Malikopoulos

A Communication-Efficient Decentralized Actor-Critic Algorithm

Oct 22, 2025

Xiaoxing Ren, Nicola Bastianello, Thomas Parisini, Andreas A. Malikopoulos

Abstract:In this paper, we study the problem of reinforcement learning in multi-agent systems where communication among agents is limited. We develop a decentralized actor-critic learning framework in which each agent performs several local updates of its policy and value function, where the latter is approximated by a multi-layer neural network, before exchanging information with its neighbors. This local training strategy substantially reduces the communication burden while maintaining coordination across the network. We establish finite-time convergence analysis for the algorithm under Markov-sampling. Specifically, to attain the $\varepsilon$-accurate stationary point, the sample complexity is of order $\mathcal{O}(\varepsilon^{-3})$ and the communication complexity is of order $\mathcal{O}(\varepsilon^{-1}\tau^{-1})$, where tau denotes the number of local training steps. We also show how the final error bound depends on the neural network's approximation quality. Numerical experiments in a cooperative control setting illustrate and validate the theoretical findings.

Via

Access Paper or Ask Questions

VisioPath: Vision-Language Enhanced Model Predictive Control for Safe Autonomous Navigation in Mixed Traffic

Jul 08, 2025

Shanting Wang, Panagiotis Typaldos, Chenjun Li, Andreas A. Malikopoulos

Abstract:In this paper, we introduce VisioPath, a novel framework combining vision-language models (VLMs) with model predictive control (MPC) to enable safe autonomous driving in dynamic traffic environments. The proposed approach leverages a bird's-eye view video processing pipeline and zero-shot VLM capabilities to obtain structured information about surrounding vehicles, including their positions, dimensions, and velocities. Using this rich perception output, we construct elliptical collision-avoidance potential fields around other traffic participants, which are seamlessly integrated into a finite-horizon optimal control problem for trajectory planning. The resulting trajectory optimization is solved via differential dynamic programming with an adaptive regularization scheme and is embedded in an event-triggered MPC loop. To ensure collision-free motion, a safety verification layer is incorporated in the framework that provides an assessment of potential unsafe trajectories. Extensive simulations in Simulation of Urban Mobility (SUMO) demonstrate that VisioPath outperforms conventional MPC baselines across multiple metrics. By combining modern AI-driven perception with the rigorous foundation of optimal control, VisioPath represents a significant step forward in safe trajectory planning for complex traffic systems.

Via

Access Paper or Ask Questions

AI Recommendation Systems for Lane-Changing Using Adherence-Aware Reinforcement Learning

Apr 28, 2025

Weihao Sun, Heeseung Bang, Andreas A. Malikopoulos

Figure 1 for AI Recommendation Systems for Lane-Changing Using Adherence-Aware Reinforcement Learning

Figure 2 for AI Recommendation Systems for Lane-Changing Using Adherence-Aware Reinforcement Learning

Figure 3 for AI Recommendation Systems for Lane-Changing Using Adherence-Aware Reinforcement Learning

Figure 4 for AI Recommendation Systems for Lane-Changing Using Adherence-Aware Reinforcement Learning

Abstract:In this paper, we present an adherence-aware reinforcement learning (RL) approach aimed at seeking optimal lane-changing recommendations within a semi-autonomous driving environment to enhance a single vehicle's travel efficiency. The problem is framed within a Markov decision process setting and is addressed through an adherence-aware deep Q network, which takes into account the partial compliance of human drivers with the recommended actions. This approach is evaluated within CARLA's driving environment under realistic scenarios.

* 6 pages, 5 figures, conference

Via

Access Paper or Ask Questions

Off-Policy Evaluation for Sequential Persuasion Process with Unobserved Confounding

Apr 01, 2025

Nishanth Venkatesh S., Heeseung Bang, Andreas A. Malikopoulos

Figure 1 for Off-Policy Evaluation for Sequential Persuasion Process with Unobserved Confounding

Figure 2 for Off-Policy Evaluation for Sequential Persuasion Process with Unobserved Confounding

Figure 3 for Off-Policy Evaluation for Sequential Persuasion Process with Unobserved Confounding

Figure 4 for Off-Policy Evaluation for Sequential Persuasion Process with Unobserved Confounding

Abstract:In this paper, we expand the Bayesian persuasion framework to account for unobserved confounding variables in sender-receiver interactions. While traditional models assume that belief updates follow Bayesian principles, real-world scenarios often involve hidden variables that impact the receiver's belief formation and decision-making. We conceptualize this as a sequential decision-making problem, where the sender and receiver interact over multiple rounds. In each round, the sender communicates with the receiver, who also interacts with the environment. Crucially, the receiver's belief update is affected by an unobserved confounding variable. By reformulating this scenario as a Partially Observable Markov Decision Process (POMDP), we capture the sender's incomplete information regarding both the dynamics of the receiver's beliefs and the unobserved confounder. We prove that finding an optimal observation-based policy in this POMDP is equivalent to solving for an optimal signaling strategy in the original persuasion framework. Furthermore, we demonstrate how this reformulation facilitates the application of proximal learning for off-policy evaluation in the persuasion process. This advancement enables the sender to evaluate alternative signaling strategies using only observational data from a behavioral policy, thus eliminating the necessity for costly new experiments.

* 8 pages, 4 Figures

Via

Access Paper or Ask Questions

CorrA: Leveraging Large Language Models for Dynamic Obstacle Avoidance of Autonomous Vehicles

Mar 03, 2025

Shanting Wang, Panagiotis Typaldos, Andreas A. Malikopoulos

Abstract:In this paper, we present Corridor-Agent (CorrA), a framework that integrates large language models (LLMs) with model predictive control (MPC) to address the challenges of dynamic obstacle avoidance in autonomous vehicles. Our approach leverages LLM reasoning ability to generate appropriate parameters for sigmoid-based boundary functions that define safe corridors around obstacles, effectively reducing the state-space of the controlled vehicle. The proposed framework adjusts these boundaries dynamically based on real-time vehicle data that guarantees collision-free trajectories while also ensuring both computational efficiency and trajectory optimality. The problem is formulated as an optimal control problem and solved with differential dynamic programming (DDP) for constrained optimization, and the proposed approach is embedded within an MPC framework. Extensive simulation and real-world experiments demonstrate that the proposed framework achieves superior performance in maintaining safety and efficiency in complex, dynamic environments compared to a baseline MPC approach.

Via

Access Paper or Ask Questions

Safe Merging in Mixed Traffic with Confidence

Mar 09, 2024

Heeseung Bang, Aditya Dave, Andreas A. Malikopoulos

Abstract:In this letter, we present an approach for learning human driving behavior, without relying on specific model structures or prior distributions, in a mixed-traffic environment where connected and automated vehicles (CAVs) coexist with human-driven vehicles (HDVs). We employ conformal prediction to obtain theoretical safety guarantees and use real-world traffic data to validate our approach. Then, we design a controller that ensures effective merging of CAVs with HDVs with safety guarantees. We provide numerical simulations to illustrate the efficacy of the control approach.

* 6 pages, 5 figures

Via

Access Paper or Ask Questions

A Framework for Effective AI Recommendations in Cyber-Physical-Human Systems

Mar 08, 2024

Aditya Dave, Heeseung Bang, Andreas A. Malikopoulos

Abstract:Many cyber-physical-human systems (CPHS) involve a human decision-maker who may receive recommendations from an artificial intelligence (AI) platform while holding the ultimate responsibility of making decisions. In such CPHS applications, the human decision-maker may depart from an optimal recommended decision and instead implement a different one for various reasons. In this letter, we develop a rigorous framework to overcome this challenge. In our framework, we consider that humans may deviate from AI recommendations as they perceive and interpret the system's state in a different way than the AI platform. We establish the structural properties of optimal recommendation strategies and develop an approximate human model (AHM) used by the AI. We provide theoretical bounds on the optimality gap that arises from an AHM and illustrate the efficacy of our results in a numerical example.

Via

Access Paper or Ask Questions

Multi-Robot Cooperative Navigation in Crowds: A Game-Theoretic Learning-Based Model Predictive Control Approach

Oct 10, 2023

Viet-Anh Le, Vaishnav Tadiparthi, Behdad Chalaki, Hossein Nourkhiz Mahjoub, Jovin D'sa, Ehsan Moradi-Pari, Andreas A. Malikopoulos

Figure 1 for Multi-Robot Cooperative Navigation in Crowds: A Game-Theoretic Learning-Based Model Predictive Control Approach

Figure 2 for Multi-Robot Cooperative Navigation in Crowds: A Game-Theoretic Learning-Based Model Predictive Control Approach

Figure 3 for Multi-Robot Cooperative Navigation in Crowds: A Game-Theoretic Learning-Based Model Predictive Control Approach

Figure 4 for Multi-Robot Cooperative Navigation in Crowds: A Game-Theoretic Learning-Based Model Predictive Control Approach

Abstract:In this paper, we develop a control framework for the coordination of multiple robots as they navigate through crowded environments. Our framework comprises of a local model predictive control (MPC) for each robot and a social long short-term memory model that forecasts pedestrians' trajectories. We formulate the local MPC formulation for each individual robot that includes both individual and shared objectives, in which the latter encourages the emergence of coordination among robots. Next, we consider the multi-robot navigation and human-robot interaction, respectively, as a potential game and a two-player game, then employ an iterative best response approach to solve the resulting optimization problems in a centralized and distributed fashion. Finally, we demonstrate the effectiveness of coordination among robots in simulated crowd navigation.

Via

Access Paper or Ask Questions

A Q-learning Approach for Adherence-Aware Recommendations

Sep 12, 2023

Ioannis Faros, Aditya Dave, Andreas A. Malikopoulos

Figure 1 for A Q-learning Approach for Adherence-Aware Recommendations

Figure 2 for A Q-learning Approach for Adherence-Aware Recommendations

Figure 3 for A Q-learning Approach for Adherence-Aware Recommendations

Figure 4 for A Q-learning Approach for Adherence-Aware Recommendations

Abstract:In many real-world scenarios involving high-stakes and safety implications, a human decision-maker (HDM) may receive recommendations from an artificial intelligence while holding the ultimate responsibility of making decisions. In this letter, we develop an "adherence-aware Q-learning" algorithm to address this problem. The algorithm learns the "adherence level" that captures the frequency with which an HDM follows the recommended actions and derives the best recommendation policy in real time. We prove the convergence of the proposed Q-learning algorithm to the optimal value and evaluate its performance across various scenarios.

Via

Access Paper or Ask Questions

Connected and Automated Vehicles in Mixed-Traffic: Learning Human Driver Behavior for Effective On-Ramp Merging

Apr 01, 2023

Nishanth Venkatesh, Viet-Anh Le, Aditya Dave, Andreas A. Malikopoulos

Figure 1 for Connected and Automated Vehicles in Mixed-Traffic: Learning Human Driver Behavior for Effective On-Ramp Merging

Figure 2 for Connected and Automated Vehicles in Mixed-Traffic: Learning Human Driver Behavior for Effective On-Ramp Merging

Figure 3 for Connected and Automated Vehicles in Mixed-Traffic: Learning Human Driver Behavior for Effective On-Ramp Merging

Figure 4 for Connected and Automated Vehicles in Mixed-Traffic: Learning Human Driver Behavior for Effective On-Ramp Merging

Abstract:Highway merging scenarios featuring mixed traffic conditions pose significant modeling and control challenges for connected and automated vehicles (CAVs) interacting with incoming on-ramp human-driven vehicles (HDVs). In this paper, we present an approach to learn an approximate information state model of CAV-HDV interactions for a CAV to maneuver safely during highway merging. In our approach, the CAV learns the behavior of an incoming HDV using approximate information states before generating a control strategy to facilitate merging. First, we validate the efficacy of this framework on real-world data by using it to predict the behavior of an HDV in mixed traffic situations extracted from the Next-Generation Simulation repository. Then, we generate simulation data for HDV-CAV interactions in a highway merging scenario using a standard inverse reinforcement learning approach. Without assuming a prior knowledge of the generating model, we show that our approximate information state model learns to predict the future trajectory of the HDV using only observations. Subsequently, we generate safe control policies for a CAV while merging with HDVs, demonstrating a spectrum of driving behaviors, from aggressive to conservative. We demonstrate the effectiveness of the proposed approach by performing numerical simulations.

Via

Access Paper or Ask Questions