Abstract:Recent advances in recording technology have allowed neuroscientists to monitor activity from thousands of neurons simultaneously. Latent variable models are increasingly valuable for distilling these recordings into compact and interpretable representations. Here we propose a new approach to neural data analysis that leverages advances in conditional generative modeling to enable the unsupervised inference of disentangled behavioral variables from recorded neural activity. Our approach builds on InfoDiffusion, which augments diffusion models with a set of latent variables that capture important factors of variation in the data. We apply our model, called Generating Neural Observations Conditioned on Codes with High Information (GNOCCHI), to time series neural data and test its application to synthetic and biological recordings of neural activity during reaching. In comparison to a VAE-based sequential autoencoder, GNOCCHI learns higher-quality latent spaces that are more clearly structured and more disentangled with respect to key behavioral variables. These properties enable accurate generation of novel samples (unseen behavioral conditions) through simple linear traversal of the latent spaces produced by GNOCCHI. Our work demonstrates the potential of unsupervised, information-based models for the discovery of interpretable latent spaces from neural data, enabling researchers to generate high-quality samples from unseen conditions.