Faizan Ahmed

Higher-Order Action Regularization in Deep Reinforcement Learning: From Continuous Control to Building Energy Management

Jan 05, 2026

Faizan Ahmed, Aniket Dixit, James Brusey

Abstract: Deep reinforcement learning agents often exhibit erratic, high-frequency control behaviors that hinder real-world deployment due to excessive energy consumption and mechanical wear. We systematically investigate action smoothness regularization through higher-order derivative penalties, progressing from theoretical understanding in continuous control benchmarks to practical validation in building energy management. Our comprehensive evaluation across four continuous control environments demonstrates that third-order derivative penalties (jerk minimization) consistently achieve superior smoothness while maintaining competitive performance. We extend these findings to HVAC control systems where smooth policies reduce equipment switching by 60%, translating to significant operational benefits. Our work establishes higher-order action regularization as an effective bridge between RL optimization and operational constraints in energy-critical applications.

* 6 pages, accepted at NeurIPS workshop 2025
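
The abstract above does not spell out the penalty, so the following is only a minimal sketch of one common way to penalize higher-order action derivatives: take repeated finite differences of the recent action sequence and subtract their mean square from the reward. The function names, the `weight` parameter, and the use of a scalar action are assumptions made for illustration and are not taken from the paper.

```python
def action_derivative_penalty(actions, order=3):
    """Mean squared order-th finite difference of a scalar action sequence.

    order=1 penalizes action changes, order=2 their rate of change,
    and order=3 corresponds to a discrete 'jerk' penalty.
    """
    diffs = list(actions)
    for _ in range(order):
        # Each pass replaces the sequence by its consecutive differences.
        diffs = [b - a for a, b in zip(diffs, diffs[1:])]
    if not diffs:
        # Too few actions to form a difference of this order.
        return 0.0
    return sum(d * d for d in diffs) / len(diffs)


def regularized_reward(reward, recent_actions, weight=0.1, order=3):
    """Hypothetical shaped reward: task reward minus a weighted smoothness penalty."""
    return reward - weight * action_derivative_penalty(recent_actions, order)


# A quadratic action ramp has zero third difference, so it is not penalized.
smooth = [0.0, 1.0, 4.0, 9.0, 16.0]
# A cubic ramp has a constant, non-zero third difference.
jerky = [0.0, 1.0, 8.0, 27.0]
penalty_smooth = action_derivative_penalty(smooth, order=3)
penalty_jerky = action_derivative_penalty(jerky, order=3)
```

In this sketch the penalty only needs the last `order + 1` actions, so it could be computed from a short action history kept alongside the environment state.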


Learning from Less: SINDy Surrogates in RL

Apr 25, 2025

Aniket Dixit, Muhammad Ibrahim Khan, Faizan Ahmed, James Brusey

Abstract: This paper introduces an approach for developing surrogate environments in reinforcement learning (RL) using the Sparse Identification of Nonlinear Dynamics (SINDy) algorithm. We demonstrate the effectiveness of our approach through extensive experiments in OpenAI Gym environments, particularly Mountain Car and Lunar Lander. Our results show that SINDy-based surrogate models can accurately capture the underlying dynamics of these environments while reducing computational costs by 20-35%. With only 75 interactions for Mountain Car and 1000 for Lunar Lander, we achieve state-wise correlations exceeding 0.997, with mean squared errors as low as 3.11e-06 for Mountain Car velocity and 1.42e-06 for Lunar Lander position. RL agents trained in these surrogate environments require fewer total steps (65,075 vs. 100,000 for Mountain Car and 801,000 vs. 1,000,000 for Lunar Lander) while achieving comparable performance to those trained in the original environments, exhibiting similar convergence patterns and final performance metrics. This work contributes to the field of model-based RL by providing an efficient method for generating accurate, interpretable surrogate environments.

* World Models @ ICLR 2025
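
As a rough illustration of the idea behind a SINDy surrogate, the sketch below fits a sparse model of the dynamics by sequentially thresholded least squares over a small library of candidate terms, then uses the fitted model to step a state forward. It is a toy under stated assumptions: the scalar state, the polynomial library, the threshold, the Euler step, and the synthetic data `dx/dt = -2x` are all chosen for illustration and do not reproduce the paper's Mountain Car or Lunar Lander setup.

```python
import numpy as np


def library(x):
    """Candidate terms [1, x, x^2] for a scalar state (an assumed library)."""
    return np.column_stack([np.ones_like(x), x, x**2])


def stlsq(theta, dxdt, threshold=0.05, iterations=10):
    """Sequentially thresholded least squares: fit, zero small terms, refit."""
    coef = np.linalg.lstsq(theta, dxdt, rcond=None)[0]
    for _ in range(iterations):
        small = np.abs(coef) < threshold
        coef[small] = 0.0
        big = ~small
        if not big.any():
            break
        # Refit using only the terms that survived thresholding.
        coef[big] = np.linalg.lstsq(theta[:, big], dxdt, rcond=None)[0]
    return coef


def surrogate_step(x, coef, dt):
    """One explicit Euler step of the identified model (assumed integrator)."""
    x_arr = np.atleast_1d(np.asarray(x, dtype=float))
    return x_arr + dt * (library(x_arr) @ coef)


# Synthetic samples from dx/dt = -2 x, standing in for logged environment data.
x = np.linspace(-1.0, 1.0, 50)
dxdt = -2.0 * x
coef = stlsq(library(x), dxdt)
next_state = surrogate_step(1.0, coef, dt=0.1)
```

Because only the linear term survives thresholding here, the fitted model stays readable as an equation, which is the sense in which such surrogates are interpretable.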


C-SHAP for time series: An approach to high-level temporal explanations

Apr 15, 2025

Annemarie Jutte, Faizan Ahmed, Jeroen Linssen, Maurice van Keulen

Abstract: Time series are ubiquitous in domains such as energy forecasting, healthcare, and industry. Using AI systems, some tasks within these domains can be efficiently handled. Explainable AI (XAI) aims to increase the reliability of AI solutions by explaining model reasoning. For time series, many XAI methods provide point- or sequence-based attribution maps. These methods explain model reasoning in terms of low-level patterns. However, they do not capture high-level patterns that may also influence model reasoning. We propose a concept-based method to provide explanations in terms of these high-level patterns. In this paper, we present C-SHAP for time series, an approach which determines the contribution of concepts to a model outcome. We provide a general definition of C-SHAP and present an example implementation using time series decomposition. Additionally, we demonstrate the effectiveness of the methodology through a use case from the energy domain.

* 10 pages, 6 figures
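
To make the notion of concept-level attribution concrete, here is a minimal sketch that computes exact Shapley values over the components of an additively decomposed series. It is not the paper's definition of C-SHAP: it assumes the concepts are given as additive components (for example trend, seasonal, residual), that an absent concept is replaced by zeros, and that `model` is any function mapping a series to a number. All names are hypothetical.

```python
from itertools import combinations
from math import factorial


def concept_shapley(components, model):
    """Exact Shapley value of each named component for a scalar model output.

    components: dict mapping concept name -> list of values (equal lengths).
    model: callable taking a list of floats and returning a float.
    """
    names = list(components)
    n = len(names)
    length = len(next(iter(components.values())))

    def value(subset):
        # Rebuild the series from the included concepts; excluded ones are zero.
        series = [sum(components[name][t] for name in subset) for t in range(length)]
        return model(series)

    contributions = {}
    for name in names:
        others = [m for m in names if m != name]
        total = 0.0
        for k in range(n):
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                total += weight * (value(subset + (name,)) - value(subset))
        contributions[name] = total
    return contributions


# Toy decomposition of a three-step series.
parts = {
    "trend": [1.0, 2.0, 3.0],
    "seasonal": [1.0, -1.0, 1.0],
    "residual": [0.0, 0.5, 0.0],
}
mean_attr = concept_shapley(parts, lambda s: sum(s) / len(s))
peak_attr = concept_shapley(parts, max)
```

The enumeration is exponential in the number of concepts, which is acceptable for a handful of decomposition components but would need sampling for larger concept sets.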
