Abstract:We introduce a methodology for performing parameter inference in high-dimensional, non-linear diffusion processes. We illustrate its applicability for obtaining insights into the evolution of and relationships between species, including ancestral state reconstruction. Estimation is performed by utilising score matching to approximate diffusion bridges, which are subsequently used in an importance sampler to estimate log-likelihoods. The entire setup is differentiable, allowing gradient ascent on approximated log-likelihoods. This allows both parameter inference and diffusion mean estimation. This novel, numerically stable, score matching-based parameter inference framework is presented and demonstrated on biological two- and three-dimensional morphometry data.
Abstract:The diffusion bridge is a type of diffusion process that conditions on hitting a specific state within a finite time period. It has broad applications in fields such as Bayesian inference, financial mathematics, control theory, and shape analysis. However, simulating the diffusion bridge for natural data can be challenging due to both the intractability of the drift term and continuous representations of the data. Although several methods are available to simulate finite-dimensional diffusion bridges, infinite-dimensional cases remain unresolved. In the paper, we present a solution to this problem by merging score-matching techniques with operator learning, enabling a direct approach to score-matching for the infinite-dimensional bridge. We construct the score to be discretization invariant, which is natural given the underlying spatially continuous process. We conduct a series of experiments, ranging from synthetic examples with closed-form solutions to the stochastic nonlinear evolution of real-world biological shape data, and our method demonstrates high efficacy, particularly due to its ability to adapt to any resolution without extra training.
Abstract:Generative diffusion models and many stochastic models in science and engineering naturally live in infinite dimensions before discretisation. To incorporate observed data for statistical and learning tasks, one needs to condition on observations. While recent work has treated conditioning linear processes in infinite dimensions, conditioning non-linear processes in infinite dimensions has not been explored. This paper conditions function valued stochastic processes without prior discretisation. To do so, we use an infinite-dimensional version of Girsanov's theorem to condition a function-valued stochastic process, leading to a stochastic differential equation (SDE) for the conditioned process involving the score. We apply this technique to do time series analysis for shapes of organisms in evolutionary biology, where we discretise via the Fourier basis and then learn the coefficients of the score function with score matching methods.