Chace Ashcraft

Backdoors in DRL: Four Environments Focusing on In-distribution Triggers

May 22, 2025

Chace Ashcraft, Ted Staley, Josh Carney, Cameron Hickert, Derek Juba, Kiran Karra, Nathan Drenkow

Abstract: Backdoor attacks, or trojans, pose a security risk by concealing undesirable behavior in deep neural network models. Open-source neural networks are downloaded from the internet daily, possibly containing backdoors, and third-party model developers are common. To advance research on backdoor attack mitigation, we develop several trojans for deep reinforcement learning (DRL) agents. We focus on in-distribution triggers, which occur within the agent's natural data distribution, since they pose a more significant security threat than out-of-distribution triggers due to their ease of activation by the attacker during model deployment. We implement backdoor attacks in four reinforcement learning (RL) environments: LavaWorld, Randomized LavaWorld, Colorful Memory, and Modified Safety Gymnasium. We train various models, both clean and backdoored, to characterize these attacks. We find that in-distribution triggers can require additional effort to implement and be more challenging for models to learn, but are nevertheless viable threats in DRL even using basic data poisoning attacks.

Investigating the Treacherous Turn in Deep Reinforcement Learning

Apr 11, 2025

Chace Ashcraft, Kiran Karra, Josh Carney, Nathan Drenkow

Abstract: The Treacherous Turn refers to the scenario where an artificial intelligence (AI) agent subtly, and perhaps covertly, learns to perform a behavior that benefits itself but is deemed undesirable and potentially harmful to a human supervisor. During training, the agent learns to behave as expected by the human supervisor, but when deployed to perform its task, it performs an alternate behavior without the supervisor there to prevent it. Initial experiments applying DRL to an implementation of the A Link to the Past example do not produce the treacherous turn effect naturally, despite various modifications to the environment intended to produce it. However, in this work, we find the treacherous behavior to be reproducible in a DRL agent when using other trojan injection strategies. This approach deviates from the prototypical treacherous turn behavior since the behavior is explicitly trained into the agent, rather than occurring as an emergent consequence of environmental complexity or poor objective specification. Nonetheless, these experiments provide new insights into the challenges of producing agents capable of true treacherous turn behavior.

Difference Learning for Air Quality Forecasting Transport Emulation

Feb 22, 2024

Reed River Chen, Christopher Ribaudo, Jennifer Sleeman, Chace Ashcraft, Collin Kofroth, Marisa Hughes, Ivanka Stajner, Kevin Viner, Kai Wang

Abstract: Poor air quality negatively impacts human health, including an increased risk of respiratory and cardiovascular disease. Due to a recent increase in extreme air quality events, both globally and locally in the United States, finer resolution air quality forecasting guidance is needed to effectively adapt to these events. The National Oceanic and Atmospheric Administration provides air quality forecasting guidance for the Continental United States. Their air quality forecasting model is based on a 15 km spatial resolution; however, the goal is to reach a 3 km spatial resolution. This is currently not feasible due in part to prohibitive computational requirements for modeling the transport of chemical species. In this work, we describe a deep learning transport emulator that is able to reduce computations while maintaining skill comparable with the existing numerical model. We show how this method maintains skill in the presence of extreme air quality events, making it a potential candidate for operational use. We also explore evaluating how well this model maintains the physical properties of the modeled transport for a given set of species.

Neuro-Symbolic Bi-Directional Translation -- Deep Learning Explainability for Climate Tipping Point Research

Jun 19, 2023

Chace Ashcraft, Jennifer Sleeman, Caroline Tang, Jay Brett, Anand Gnanadesikan

Abstract: In recent years, there has been an increase in using deep learning for climate and weather modeling. Though results have been impressive, explainability and interpretability of deep learning models are still a challenge. A third wave of Artificial Intelligence (AI), which includes logic and reasoning, has been described as a way to address these issues. Neuro-symbolic AI is a key component of this integration of logic and reasoning with deep learning. In this work we propose a neuro-symbolic approach called Neuro-Symbolic Question-Answer Program Translator, or NS-QAPT, to address explainability and interpretability for deep learning climate simulation, applied to climate tipping point discovery. The NS-QAPT method includes a bidirectional encoder-decoder architecture that translates between domain-specific questions and executable programs used to direct the climate simulation, acting as a bridge between climate scientists and deep learning models. We show early compelling results of this translation method and introduce a domain-specific language and associated executable programs for a commonly known tipping point, the collapse of the Atlantic Meridional Overturning Circulation (AMOC).

A Generative Adversarial Network for Climate Tipping Point Discovery (TIP-GAN)

Feb 16, 2023

Jennifer Sleeman, David Chung, Anand Gnanadesikan, Jay Brett, Yannis Kevrekidis, Marisa Hughes, Thomas Haine, Marie-Aude Pradal, Renske Gelderloos, Chace Ashcraft (+3 more)

Abstract: We propose a new Tipping Point Generative Adversarial Network (TIP-GAN) for better characterizing potential climate tipping points in Earth system models. We describe an adversarial game to explore the parameter space of these models, detect upcoming tipping points, and discover the drivers of tipping points. In this setup, a set of generators learn to construct model configurations that will invoke a climate tipping point. The discriminator learns to identify which generators are generating each model configuration and whether a given configuration will lead to a tipping point. The discriminator is trained using an oracle (a surrogate climate model) to test if a generated model configuration leads to a tipping point or not. We demonstrate the application of this GAN to invoke the collapse of the Atlantic Meridional Overturning Circulation (AMOC). We share experimental results of modifying the loss functions and the number of generators to exploit the area of uncertainty in model state space near a climate tipping point. In addition, we show that our trained discriminator can predict AMOC collapse with a high degree of accuracy without the use of the oracle. This approach could generalize to other tipping points, and could augment climate modeling research by directing users interested in studying tipping points to parameter sets likely to induce said tipping points in their computationally intensive climate models.

Using Artificial Intelligence to aid Scientific Discovery of Climate Tipping Points

Feb 14, 2023

Jennifer Sleeman, David Chung, Chace Ashcraft, Jay Brett, Anand Gnanadesikan, Yannis Kevrekidis, Marisa Hughes, Thomas Haine, Marie-Aude Pradal, Renske Gelderloos (+3 more)

Abstract: We propose a hybrid Artificial Intelligence (AI) climate modeling approach that aids climate modelers in scientific discovery using a climate-targeted simulation methodology based on a novel combination of deep neural networks and mathematical methods for modeling dynamical systems. The simulations are grounded by a neuro-symbolic language that both enables question answering of what is learned by the AI methods and provides a means of explainability. We describe how this methodology can be applied to the discovery of climate tipping points and, in particular, the collapse of the Atlantic Meridional Overturning Circulation (AMOC). We show how this methodology is able to predict AMOC collapse with a high degree of accuracy using a surrogate climate model for ocean interaction. We also show preliminary results of neuro-symbolic method performance when translating between natural language questions and symbolically learned representations. Our AI methodology shows promising early results, potentially enabling faster climate tipping point related research that would otherwise be computationally infeasible.

* This is the preprint of work presented at the 2022 AAAI Fall Symposium Series, Third Symposium on Knowledge-Guided ML, November 2022

Context-Adaptive Deep Neural Networks via Bridge-Mode Connectivity

Nov 28, 2022

Nathan Drenkow, Alvin Tan, Chace Ashcraft, Kiran Karra

Abstract: The deployment of machine learning models in safety-critical applications comes with the expectation that such models will perform well over a range of contexts (e.g., a vision model for classifying street signs should work in rural, city, and highway settings under varying lighting/weather conditions). However, these one-size-fits-all models are typically optimized for average case performance, encouraging them to achieve high performance in nominal conditions but exposing them to unexpected behavior in challenging or rare contexts. To address this concern, we develop a new method for training context-dependent models. We extend Bridge-Mode Connectivity (BMC) (Garipov et al., 2018) to train an infinite ensemble of models over a continuous measure of context such that we can sample model parameters specifically tuned to the corresponding evaluation context. We explore the definition of context in image classification tasks through multiple lenses including changes in the risk profile, long-tail image statistics/appearance, and context-dependent distribution shift. We develop novel extensions of the BMC optimization for each of these cases and our experiments demonstrate that model performance can be successfully tuned to context in each scenario.

* Accepted to the NeurIPS 2022 ML Safety Workshop

Latent Properties of Lifelong Learning Systems

Jul 28, 2022

Corban Rivera, Chace Ashcraft, Alexander New, James Schmidt, Gautam Vallabha

Abstract: Creating artificial intelligence (AI) systems capable of demonstrating lifelong learning is a fundamental challenge, and many approaches and metrics have been proposed to analyze algorithmic properties. However, for existing lifelong learning metrics, algorithmic contributions are confounded by task and scenario structure. To mitigate this issue, we introduce an algorithm-agnostic explainable surrogate-modeling approach to estimate latent properties of lifelong learning algorithms. We validate the approach for estimating these properties via experiments on synthetic data. To validate the structure of the surrogate model, we analyze real performance data from a collection of popular lifelong learning approaches and baselines adapted for lifelong classification and lifelong reinforcement learning.

* Accepted at 1st Conference on Lifelong Learning Agents (CoLLAs) Workshop Track, 2022

L2Explorer: A Lifelong Reinforcement Learning Assessment Environment

Mar 14, 2022

Erik C. Johnson, Eric Q. Nguyen, Blake Schreurs, Chigozie S. Ewulum, Chace Ashcraft, Neil M. Fendley, Megan M. Baker, Alexander New, Gautam K. Vallabha

Abstract: Despite groundbreaking progress in reinforcement learning for robotics, gameplay, and other complex domains, major challenges remain in applying reinforcement learning to the evolving, open-world problems often found in critical application spaces. Reinforcement learning solutions tend to generalize poorly when exposed to new tasks outside of the data distribution they are trained on, prompting an interest in continual learning algorithms. In tandem with research on continual learning algorithms, there is a need for challenge environments, carefully designed experiments, and metrics to assess research progress. We address the latter need by introducing a framework for continual reinforcement-learning development and assessment using Lifelong Learning Explorer (L2Explorer), a new, Unity-based, first-person 3D exploration environment that can be continuously reconfigured to generate a range of tasks and task variants structured into complex and evolving evaluation curricula. In contrast to procedurally generated worlds with randomized components, we have developed a systematic approach to defining curricula in response to controlled changes with accompanying metrics to assess transfer, performance recovery, and data efficiency. Taken together, the L2Explorer environment and evaluation approach provide a framework for developing future evaluation methodologies in open-world settings and rigorously evaluating approaches to lifelong learning.

* 10 Pages submitted to AAAI AI for Open Worlds Symposium 2022

Meta Arcade: A Configurable Environment Suite for Meta-Learning

Dec 01, 2021

Edward W. Staley, Chace Ashcraft, Benjamin Stoler, Jared Markowitz, Gautam Vallabha, Christopher Ratto, Kapil D. Katyal

Abstract: Most approaches to deep reinforcement learning (DRL) attempt to solve a single task at a time. As a result, most existing research benchmarks consist of individual games or suites of games that have common interfaces but little overlap in their perceptual features, objectives, or reward structures. To facilitate research into knowledge transfer among trained agents (e.g. via multi-task and meta-learning), more environment suites that provide configurable tasks with enough commonality to be studied collectively are needed. In this paper we present Meta Arcade, a tool to easily define and configure custom 2D arcade games that share common visuals, state spaces, action spaces, game components, and scoring mechanisms. Meta Arcade differs from prior environments in that both task commonality and configurability are prioritized: entire sets of games can be constructed from common elements, and these elements are adjustable through exposed parameters. We include a suite of 24 predefined games that collectively illustrate the possibilities of this framework and discuss how these games can be configured for research applications. We provide several experiments that illustrate how Meta Arcade could be used, including single-task benchmarks of predefined games, sample curriculum-based approaches that change game parameters over a set schedule, and an exploration of transfer learning between games.

* 17 pages, 6 figures, 6 tables, extended version of an accepted paper to NeurIPS DRL Workshop 2021
