Abstract: We operate through the lens of ordinary differential equations and control theory to study the concept of observability in the context of neural state-space models and the Mamba architecture. We develop strategies to enforce observability that are tailored to a learning context, specifically one in which the hidden states are learnable at the initial time as well as over the time continuum, and are high-dimensional. We also highlight that our methods emphasize eigenvalues, roots of unity, or both. Our methods enforce observability in a computationally efficient manner, sometimes at great scale. We formulate observability conditions in machine learning based on classical control theory and discuss their computational complexity. Our nontrivial results are fivefold. We discuss observability through the use of permutations in neural applications with learnable matrices, without requiring high precision. We present two results built upon the Fourier transform that yield observability with high probability, up to the randomness in the learning. These results rely on the interplay of representations in Fourier space and their eigenstructure, nonlinear mappings, and the observability matrix. We present a result for Mamba that is similar to a Hautus-type condition, but employs an argument using a Vandermonde matrix rather than eigenvectors. Our final result is a shared-parameter construction of the Mamba system, which is computationally efficient under high exponentiation. We develop a training algorithm with this coupling, showing that it satisfies a Robbins-Monro condition under certain orthogonality assumptions, whereas a more classical training procedure fails to be a contraction, exhibiting a high Lipschitz constant.
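For orientation, the classical rank test underlying these observability conditions can be sketched numerically. The snippet below is a minimal illustration of the standard Kalman observability test for a linear state-space pair (A, C); the matrices, function names, and tolerance are illustrative assumptions and do not reproduce the paper's permutation, Fourier, or Vandermonde constructions.

```python
# Minimal sketch (not the paper's method): the classical Kalman rank test for
# observability of a linear state-space model  x_{k+1} = A x_k,  y_k = C x_k.
# The pair (A, C) is observable iff O = [C; CA; ...; CA^{n-1}] has full rank n.
import numpy as np

def observability_matrix(A: np.ndarray, C: np.ndarray) -> np.ndarray:
    n = A.shape[0]
    blocks = [C]
    for _ in range(n - 1):
        blocks.append(blocks[-1] @ A)   # next block is C A^k
    return np.vstack(blocks)

def is_observable(A: np.ndarray, C: np.ndarray, tol: float = 1e-10) -> bool:
    O = observability_matrix(A, C)
    return np.linalg.matrix_rank(O, tol=tol) == A.shape[0]

# Hypothetical example: a 3-state system observed through one output channel.
A = np.array([[0.9, 0.1, 0.0],
              [0.0, 0.8, 0.2],
              [0.0, 0.0, 0.7]])
C = np.array([[1.0, 0.0, 0.0]])
print(is_observable(A, C))  # True: the single output reaches every state
```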
Abstract: We develop Riemannian approaches to variational autoencoders (VAEs) for PDE-type ambient data with regularizing geometric latent dynamics, which we refer to as VAE-DLM, or VAEs with dynamical latent manifolds. We redevelop the VAE framework so that manifold geometries embedded in Euclidean space, subject to a geometric flow, are learned in the intermediary latent space constructed by the encoder and decoder. We reformulate the traditional evidence lower bound (ELBO) loss with a careful choice of prior. We develop a linear geometric flow with a steady-state regularizing term. This geometric flow requires automatic differentiation of only one time derivative and can be solved in moderately high dimensions in a physics-informed manner, allowing more expressive latent representations. We discuss how this flow can be formulated as a gradient flow and how it maintains entropy away from metric singularities. This, along with an eigenvalue penalization condition, helps ensure the manifold is sufficiently large in measure, nondegenerate, and of canonical geometry, properties that contribute to a robust representation. Our methods focus on a modified multi-layer perceptron architecture with tanh activations for the manifold encoder-decoder. We demonstrate, on our datasets of interest, that our methods perform at least as well as the traditional VAE, and often better. Our methods can outperform a standard VAE, as well as a VAE endowed with our proposed architecture, by up to a 25% reduction in out-of-distribution (OOD) error, and potentially more. We highlight our method on ambient PDEs whose solutions exhibit minimal variation at late times. Our approaches are particularly favorable under severe OOD effects. We provide empirical justification for how latent Riemannian manifolds improve robust learning of external dynamics with VAEs.
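For reference, the standard ELBO that the abstract reformulates is, in its usual form (the modified prior and geometric-flow terms of the method are not reproduced here):

\[
\log p_\theta(x) \;\ge\; \mathbb{E}_{q_\phi(z \mid x)}\!\left[\log p_\theta(x \mid z)\right] \;-\; \mathrm{KL}\!\left(q_\phi(z \mid x)\,\|\,p(z)\right).
\]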
Abstract: We present a manifold-based autoencoder method for learning nonlinear dynamics in time, notably partial differential equations (PDEs), in which the manifold latent space evolves according to Ricci flow. This is accomplished by simulating Ricci flow in a physics-informed setting, in which manifold quantities are matched so that Ricci flow is empirically achieved. With our methodology, the manifold is learned as part of the training procedure, so ideal geometries may be discerned, while the evolution simultaneously induces a more accommodating latent representation than static methods. We present our method on a range of numerical experiments consisting of PDEs that encompass desirable characteristics such as periodicity and randomness, reporting error in both in-distribution and extrapolation scenarios.
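As a point of reference, the flow being simulated is the standard Ricci flow on the latent metric g,

\[
\partial_t g_{ij} = -2\,R_{ij}(g),
\]

and a physics-informed treatment can penalize the residual \(\lVert \partial_t g + 2\,\mathrm{Ric}(g) \rVert^2\) at sampled points; the particular manifold quantities matched during training are those of the paper and are not restated here.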
Abstract: Optimal transport (OT) offers a versatile framework to compare complex data distributions in a geometrically meaningful way. Traditional methods for computing the Wasserstein distance and geodesic between probability measures require mesh-dependent domain discretization and suffer from the curse of dimensionality. We present GeONet, a mesh-invariant deep neural operator network that learns the nonlinear mapping from an input pair of initial and terminal distributions to the Wasserstein geodesic connecting the two endpoint distributions. In the offline training stage, GeONet learns the saddle-point optimality conditions for the dynamic formulation of the OT problem in the primal and dual spaces, which are characterized by a coupled PDE system. The subsequent inference stage is instantaneous and can be deployed for real-time predictions in the online learning setting. We demonstrate that GeONet achieves comparable testing accuracy to standard OT solvers on a simulation example and the CIFAR-10 dataset, with inference-stage computational cost reduced by orders of magnitude.
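The dynamic formulation referenced above is the standard Benamou-Brenier problem, stated here for context:

\[
W_2^2(\mu_0,\mu_1) \;=\; \min_{\rho,\,v} \int_0^1\!\!\int \rho(x,t)\,\lvert v(x,t)\rvert^2 \,dx\,dt
\quad \text{s.t.} \quad \partial_t \rho + \nabla\!\cdot(\rho v) = 0,\;\; \rho(\cdot,0)=\mu_0,\;\; \rho(\cdot,1)=\mu_1,
\]

whose first-order optimality conditions couple this continuity equation with a Hamilton-Jacobi equation \(\partial_t u + \tfrac{1}{2}\lvert \nabla u \rvert^2 = 0\) with \(v = \nabla u\); this is the type of coupled primal-dual PDE system referred to in the abstract.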