Abstract:We introduce an Implicit Game-Theoretic MPC (IGT-MPC), a decentralized algorithm for two-agent motion planning that uses a learned value function that predicts the game-theoretic interaction outcomes as the terminal cost-to-go function in a model predictive control (MPC) framework, guiding agents to implicitly account for interactions with other agents and maximize their reward. This approach applies to competitive and cooperative multi-agent motion planning problems which we formulate as constrained dynamic games. Given a constrained dynamic game, we randomly sample initial conditions and solve for the generalized Nash equilibrium (GNE) to generate a dataset of GNE solutions, computing the reward outcome of each game-theoretic interaction from the GNE. The data is used to train a simple neural network to predict the reward outcome, which we use as the terminal cost-to-go function in an MPC scheme. We showcase emerging competitive and coordinated behaviors using IGT-MPC in scenarios such as two-vehicle head-to-head racing and un-signalized intersection navigation. IGT-MPC offers a novel method integrating machine learning and game-theoretic reasoning into model-based decentralized multi-agent motion planning.
Abstract:We present an approach for predictive braking of a four-wheeled vehicle on a nonplanar road. Our main contribution is a methodology to consider friction and road contact safety on general smooth road geometry. We use this to develop an active safety system to preemptively reduce vehicle speed for upcoming road geometry, such as off-camber turns. Our system may be used for human-driven or autonomous vehicles and we demonstrate it with a simulated ADAS scenario. We show that loss of control due to driver error on nonplanar roads can be mitigated by our approach.
Abstract:We present a novel control-oriented motorcycle model and use it for computing racing lines on a nonplanar racetrack. The proposed model combines recent advances in nonplanar road models with the dynamics of motorcycles. Our approach considers the additional camber degree of freedom of the motorcycle body with a simplified model of the rider and front steering fork bodies. We demonstrate the effectiveness of our model by computing minimum-time racing trajectories on a nonplanar racetrack.
Abstract:Dynamic games can be an effective approach for modeling interactive behavior between multiple competitive agents in autonomous racing and they provide a theoretical framework for simultaneous prediction and control in such scenarios. In this work, we propose DG-SQP, a numerical method for the solution of local generalized Nash equilibria (GNE) for open-loop general-sum dynamic games for agents with nonlinear dynamics and constraints. In particular, we formulate a sequential quadratic programming (SQP) approach which requires only the solution of a single convex quadratic program at each iteration. The three key elements of the method are a non-monotonic line search for solving the associated KKT equations, a merit function to handle zero sum costs, and a decaying regularization scheme for SQP step selection. We show that our method achieves linear convergence in the neighborhood of local GNE and demonstrate the effectiveness of the approach in the context of head-to-head car racing, where we show significant improvement in solver success rate when comparing against the state-of-the-art PATH solver for dynamic games. An implementation of our solver can be found at https://github.com/zhu-edward/DGSQP.
Abstract:This paper introduces a novel data-driven hierarchical control scheme for managing a fleet of nonlinear, capacity-constrained autonomous agents in an iterative environment. We propose a control framework consisting of a high-level dynamic task assignment and routing layer and low-level motion planning and tracking layer. Each layer of the control hierarchy uses a data-driven Model Predictive Control (MPC) policy, maintaining bounded computational complexity at each calculation of a new task assignment or actuation input. We utilize collected data to iteratively refine estimates of agent capacity usage, and update MPC policy parameters accordingly. Our approach leverages tools from iterative learning control to integrate learning at both levels of the hierarchy, and coordinates learning between levels in order to maintain closed-loop feasibility and performance improvement of the connected architecture.
Abstract:We propose a hierarchical architecture designed for scalable real-time Model Predictive Control (MPC) in complex, multi-modal traffic scenarios. This architecture comprises two key components: 1) RAID-Net, a novel attention-based Recurrent Neural Network that predicts relevant interactions along the MPC prediction horizon between the autonomous vehicle and the surrounding vehicles using Lagrangian duality, and 2) a reduced Stochastic MPC problem that eliminates irrelevant collision avoidance constraints, enhancing computational efficiency. Our approach is demonstrated in a simulated traffic intersection with interactive surrounding vehicles, showcasing a 12x speed-up in solving the motion planning problem. A video demonstrating the proposed architecture in multiple complex traffic scenarios can be found here: https://youtu.be/-TcMeolCLWc
Abstract:We propose a Stochastic MPC (SMPC) formulation for path planning with autonomous vehicles in scenarios involving multiple agents with multi-modal predictions. The multi-modal predictions capture the uncertainty of urban driving in distinct modes/maneuvers (e.g., yield, keep speed) and driving trajectories (e.g., speed, turning radius), which are incorporated for multi-modal collision avoidance chance constraints for path planning. In the presence of multi-modal uncertainties, it is challenging to reliably compute feasible path planning solutions at real-time frequencies ($\geq$ 10 Hz). Our main technological contribution is a convex SMPC formulation that simultaneously (1) optimizes over parameterized feedback policies and (2) allocates risk levels for each mode of the prediction. The use of feedback policies and risk allocation enhances the feasibility and performance of the SMPC formulation against multi-modal predictions with large uncertainty. We evaluate our approach via simulations and road experiments with a full-scale vehicle interacting in closed-loop with virtual vehicles. We consider distinct, multi-modal driving scenarios: 1) Negotiating a traffic light and a fast, tailgating agent, 2) Executing an unprotected left turn at a traffic intersection, and 3) Changing lanes in the presence of multiple agents. For all of these scenarios, our approach reliably computes multi-modal solutions to the path-planning problem at real-time frequencies.
Abstract:This work presents a novel Learning Model Predictive Control (LMPC) strategy for autonomous racing at the handling limit that can iteratively explore and learn unknown dynamics in high-speed operational domains. We start from existing LMPC formulations and modify the system dynamics learning method. In particular, our approach uses a nominal, global, nonlinear, physics-based model with a local, linear, data-driven learning of the error dynamics. We conduct experiments in simulation, 1/10th scale hardware, and deployed the proposed LMPC on a full-scale autonomous race car used in the Indy Autonomous Challenge (IAC) with closed loop experiments at the Putnam Park Road Course in Indiana, USA. The results show that the proposed control policy exhibits improved robustness to parameter tuning and data scarcity. Incremental and safety-aware exploration toward the limit of handling and iterative learning of the vehicle dynamics in high-speed domains is observed both in simulations and experiments.
Abstract:We present two approaches to compute raceline trajectories for quadrotors by solving an optimal control problem. The approaches involve expressing quadrotor pose in either a Euclidean or non-Euclidean frame of reference and are both based on collocation. The compute times of both approaches are over 100x faster than published methods. Additionally, both approaches compute trajectories with faster lap time and show improved numerical convergence. In the last part of the paper we devise a novel method to compute racelines in dense obstacle fields using the non-Euclidean approach.
Abstract:We present a data-driven optimization approach for robotic controlled deposition with a degradable tool. Existing methods make the assumption that the tool tip is not changing or is replaced frequently. Errors can accumulate over time as the tool wears away and this leads to poor performance in the case where the tool degradation is unaccounted for during deposition. In the proposed approach, we utilize visual and force feedback to update the unknown model parameters of our tool-tip. Subsequently, we solve a constrained finite time optimal control problem for tracking a reference deposition profile, where our robot plans with the learned tool degradation dynamics. We focus on a robotic drawing problem as an illustrative example. Using real-world experiments, we show that the error in target vs actual deposition decreases when learned degradation models are used in the control design.