Abstract:Motivated by the increasing popularity of overparameterized Stochastic Differential Equations (SDEs) like Neural SDEs, Wang, Blanchet and Glynn recently introduced the generator gradient estimator, a novel unbiased stochastic gradient estimator for SDEs whose computation time remains stable in the number of parameters. In this note, we demonstrate that this estimator is in fact an adjoint state method, an approach which is known to scale with the number of states and not the number of parameters in the case of Ordinary Differential Equations (ODEs). In addition, we show that the generator gradient estimator is a close analogue to the exact Integral Path Algorithm (eIPA) estimator which was introduced by Gupta, Rathinam and Khammash for a class of Continuous-Time Markov Chains (CTMCs) known as stochastic chemical reactions networks (CRNs).
Abstract:Stochastic filtering is a vibrant area of research in both control theory and statistics, with broad applications in many scientific fields. Despite its extensive historical development, there still lacks an effective method for joint parameter-state estimation in SDEs. The state-of-the-art particle filtering methods suffer from either sample degeneracy or information loss, with both issues stemming from the dynamics of the particles generated to represent system parameters. This paper provides a novel and effective approach for joint parameter-state estimation in SDEs via Rao-Blackwellization and modularization. Our method operates in two layers: the first layer estimates the system states using a bootstrap particle filter, and the second layer marginalizes out system parameters explicitly. This strategy circumvents the need to generate particles representing system parameters, thereby mitigating their associated problems of sample degeneracy and information loss. Moreover, our method employs a modularization approach when integrating out the parameters, which significantly reduces the computational complexity. All these designs ensure the superior performance of our method. Finally, a numerical example is presented to illustrate that our method outperforms existing approaches by a large margin.