Abstract: The objective of this work is to evaluate multi-agent artificial intelligence methods deployed on teams of unmanned surface vehicles (USVs) in an adversarial environment. Autonomous agents were evaluated in real-world scenarios using the Aquaticus test bed, a Capture-the-Flag (CTF) style competition involving teams of USVs. Cooperative teaming algorithms founded on behavior-based optimization and on deep reinforcement learning (RL) were deployed on these USVs in two-versus-two teams and tested against each other during a competition period in the fall of 2023. Deep RL was applied to the USV agents via the Pyquaticus test bed, a lightweight gymnasium environment that allows simulated CTF training in a low-level environment. The results of the experiment demonstrate that behavior-based agents using rule-based cooperation outperformed agents trained in deep RL paradigms, as implemented in these competitions. Further integration of the Pyquaticus gymnasium environment for RL with MOOS-IvP, in terms of configuration and control schema, will allow for more competitive CTF games in future studies. As the development of experimental deep RL methods continues, the authors expect the competitive gap between behavior-based autonomy and deep RL to narrow. This report therefore outlines the overall competition, methods, and results, with an emphasis on future work such as reward shaping, sim-to-real methodologies, and extending rule-based cooperation among agents so that they react to safety and security events in accordance with human experts' intent and rules for executing safety and security processes.
Abstract: This paper addresses the long-standing open problem of observability of mass and inertia plant parameters in the adaptive identification (AID) of second-order nonlinear models of 6-degree-of-freedom (6-DOF) rigid-body dynamical systems subject to externally applied forces and moments. Although stable methods for AID of plant parameters for this class of systems, as well as numerous approaches to stable model-based direct adaptive trajectory-tracking control of such systems, have been reported, these studies have been unable to prove analytically that the adaptive parameter estimates converge to the true plant parameter values. This paper reports necessary and sufficient conditions for the uniform complete observability (UCO) of 6-DOF plant inertial parameters for a stable adaptive identifier for this class of systems. When the UCO condition is satisfied, the adaptive parameter estimates are shown to converge to the true plant parameter values. To the best of our knowledge, this is the first reported proof, for this class of systems, of UCO of plant parameters and of convergence of adaptive parameter estimates to the true parameter values. We also report a numerical simulation study of this AID approach, which shows that (a) the UCO condition can be met for fully actuated plants as well as underactuated plants with the proper choice of control input and (b) the adaptive parameter estimates converge to the true parameter values. We conjecture that this approach can be extended to include other parameters that appear in rigid-body plant models, including parameters for drag, buoyancy, added mass, bias, and actuators.