Edward Chien

Partially Observed Trajectory Inference using Optimal Transport and a Dynamics Prior

Jun 11, 2024

Anming Gu, Edward Chien, Kristjan Greenewald

Abstract: Trajectory inference seeks to recover the temporal dynamics of a population from snapshots of its (uncoupled) temporal marginals, i.e., where observed particles are not tracked over time. Lavenant et al. arXiv:2102.09204 addressed this challenging problem under a stochastic differential equation (SDE) model with a gradient-driven drift in the observed space, introducing a minimum entropy estimator relative to the Wiener measure. Chizat et al. arXiv:2205.07146 then provided a practical grid-free mean-field Langevin (MFL) algorithm using Schrödinger bridges. Motivated by the overwhelming success of observable state space models in the traditional paired trajectory inference problem (e.g. target tracking), we extend the above framework to a class of latent SDEs in the form of observable state space models. In this setting, we use partial observations to infer trajectories in the latent space under a specified dynamics model (e.g. the constant velocity/acceleration models from target tracking). We introduce PO-MFL to solve this latent trajectory inference problem and provide theoretical guarantees by extending the results of arXiv:2102.09204 to the partially observed setting. We leverage the MFL framework of arXiv:2205.07146, yielding an algorithm based on entropic OT between dynamics-adjusted adjacent time marginals. Experiments validate the robustness of our method and the exponential convergence of the MFL dynamics, and demonstrate significant outperformance over the latent-free method of arXiv:2205.07146 in key scenarios.

* 32 pages, 9 figures
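
As a rough illustration of the entropic OT building block mentioned in the abstract, the sketch below runs plain Sinkhorn iterations between two small point clouds. It is not PO-MFL itself and contains no dynamics adjustment; the function name `sinkhorn_plan`, the regularization strength `eps`, the iteration count, and the toy data are all assumptions made for illustration.

```python
import numpy as np

def sinkhorn_plan(x, y, eps=1.0, n_iters=500):
    """Entropic OT plan between uniform empirical measures on x and y.

    Illustrative only: `eps` and `n_iters` are arbitrary choices.
    """
    n, m = len(x), len(y)
    a = np.full(n, 1.0 / n)  # uniform weights on the first snapshot
    b = np.full(m, 1.0 / m)  # uniform weights on the second snapshot
    # Squared Euclidean cost between every pair of points.
    cost = ((x[:, None, :] - y[None, :, :]) ** 2).sum(axis=-1)
    K = np.exp(-cost / eps)  # Gibbs kernel
    u = np.ones(n)
    v = np.ones(m)
    for _ in range(n_iters):
        u = a / (K @ v)    # rescale to match row marginals
        v = b / (K.T @ u)  # rescale to match column marginals
    return u[:, None] * K * v[None, :]

# Toy "adjacent time marginals": two unpaired point clouds.
rng = np.random.default_rng(0)
x = rng.normal(size=(5, 2))
y = rng.normal(size=(6, 2)) + 1.0
plan = sinkhorn_plan(x, y)
```

Because the loop ends on the column update, the columns of `plan` sum to the target weights up to floating-point error, while the rows match only approximately until the iterations converge.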

k-Mixup Regularization for Deep Learning via Optimal Transport

Jun 05, 2021

Kristjan Greenewald, Anming Gu, Mikhail Yurochkin, Justin Solomon, Edward Chien

Abstract: Mixup is a popular regularization technique for training deep neural networks that can improve generalization and increase adversarial robustness. It perturbs input training data in the direction of other randomly-chosen instances in the training set. To better leverage the structure of the data, we extend mixup to $k$-mixup by perturbing $k$-batches of training points in the direction of other $k$-batches using displacement interpolation, i.e. interpolation under the Wasserstein metric. We demonstrate theoretically and in simulations that $k$-mixup preserves cluster and manifold structures, and we extend theory studying the efficacy of standard mixup. Our empirical results show that training with $k$-mixup further improves generalization and robustness on benchmark datasets.
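
The displacement interpolation described above can be sketched for two equal-size batches with uniform weights, where an optimal transport plan can be taken to be a one-to-one matching. The example below is a minimal sketch under that assumption, not the authors' implementation: the function name `k_mixup_batch`, the squared Euclidean cost, the Beta(1, 1) mixing weight, and the toy data are all illustrative choices.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def k_mixup_batch(x1, y1, x2, y2, lam):
    """Interpolate two k-batches along an optimal matching (illustrative sketch)."""
    # Pairwise squared Euclidean cost between the two batches.
    cost = ((x1[:, None, :] - x2[None, :, :]) ** 2).sum(axis=-1)
    # For uniform weights and equal batch sizes, an optimal plan is a permutation.
    rows, cols = linear_sum_assignment(cost)
    # Move each point a fraction `lam` of the way toward its matched partner.
    x_mix = (1 - lam) * x1[rows] + lam * x2[cols]
    y_mix = (1 - lam) * y1[rows] + lam * y2[cols]
    return x_mix, y_mix

rng = np.random.default_rng(0)
k = 4
x1 = rng.normal(size=(k, 2))
x2 = rng.normal(size=(k, 2)) + 3.0
y1 = np.eye(2)[rng.integers(0, 2, size=k)]  # one-hot labels
y2 = np.eye(2)[rng.integers(0, 2, size=k)]
lam = rng.beta(1.0, 1.0)  # mixing weight; Beta(1, 1) is an assumed choice
x_mix, y_mix = k_mixup_batch(x1, y1, x2, y2, lam)
```

With `k = 1` the matching is trivial and this reduces to standard mixup on a single pair of points.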

Incorporating Unlabeled Data into Distributionally Robust Learning

Dec 18, 2019

Charlie Frogner, Sebastian Claici, Edward Chien, Justin Solomon

Abstract: We study a robust alternative to empirical risk minimization called distributionally robust learning (DRL), in which one learns to perform against an adversary who can choose the data distribution from a specified set of distributions. We illustrate a problem with current DRL formulations, which rely on an overly broad definition of allowed distributions for the adversary, leading to learned classifiers that are unable to predict with any confidence. We propose a solution that incorporates unlabeled data into the DRL problem to further constrain the adversary. We show that this new formulation is tractable for stochastic gradient-based optimization and yields a computable guarantee on the future performance of the learned classifier, analogous to -- but tighter than -- guarantees from conventional DRL. We examine the performance of this new formulation on 14 real datasets and find that it often yields effective classifiers with nontrivial performance guarantees in situations where conventional DRL produces neither. Inspired by these results, we extend our DRL formulation to active learning with a novel, distributionally-robust version of the standard model-change heuristic. Our active learning algorithm often achieves superior learning performance to the original heuristic on real datasets.

Alleviating Label Switching with Optimal Transport

Nov 10, 2019

Pierre Monteiller, Sebastian Claici, Edward Chien, Farzaneh Mirzazadeh, Justin Solomon, Mikhail Yurochkin

Abstract: Label switching is a phenomenon arising in mixture model posterior inference that prevents one from meaningfully assessing posterior statistics using standard Monte Carlo procedures. This issue arises due to invariance of the posterior under actions of a group; for example, permuting the ordering of mixture components has no effect on the likelihood. We propose a resolution to label switching that leverages machinery from optimal transport. Our algorithm efficiently computes posterior statistics in the quotient space of the symmetry group. We give conditions under which there is a meaningful solution to label switching and demonstrate advantages over alternative approaches on simulated and real data.

* 33rd Conference on Neural Information Processing Systems (NeurIPS 2019), Vancouver, Canada
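
To make the permutation invariance concrete, the sketch below simulates posterior samples of three component means whose ordering is shuffled at random, then aligns every sample to a reference by an optimal matching before averaging. This is a deliberately simplified stand-in (a single alignment pass against one reference sample), not the algorithm from the paper; `align_to_reference`, the simulated data, and the choice of reference are assumptions for illustration.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def align_to_reference(sample, reference):
    """Reorder the components of `sample` to best match `reference`."""
    # cost[i, j]: squared distance from reference component i to sample component j.
    cost = ((reference[:, None, :] - sample[None, :, :]) ** 2).sum(axis=-1)
    _, cols = linear_sum_assignment(cost)
    return sample[cols]

# Simulated posterior draws of 3 component means with random label order.
true_means = np.array([[-5.0], [0.0], [5.0]])
rng = np.random.default_rng(0)
samples = np.stack([
    true_means[rng.permutation(3)] + 0.1 * rng.normal(size=(3, 1))
    for _ in range(200)
])

# Naive averaging mixes labels, so every component collapses toward the middle.
naive_mean = samples.mean(axis=0)

# Averaging after alignment recovers well-separated component means.
reference = samples[0]
aligned = np.stack([align_to_reference(s, reference) for s in samples])
aligned_mean = aligned.mean(axis=0)
```

The aligned means come out in the arbitrary order of the reference sample, which reflects the fact that only the unordered set of components is identifiable.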

Hierarchical Optimal Transport for Document Representation

Jun 26, 2019

Mikhail Yurochkin, Sebastian Claici, Edward Chien, Farzaneh Mirzazadeh, Justin Solomon

Abstract: The ability to measure similarity between documents enables intelligent summarization and analysis of large corpora. Past distances between documents suffer from either an inability to incorporate semantic similarities between words or from scalability issues. As an alternative, we introduce hierarchical optimal transport as a meta-distance between documents, where documents are modeled as distributions over topics, which themselves are modeled as distributions over words. We then solve an optimal transport problem on the smaller topic space to compute a similarity score. We give conditions on the topics under which this construction defines a distance, and we relate it to the word mover's distance. We evaluate our technique for $k$-NN classification and show better interpretability and scalability with comparable performance to current methods at a fraction of the cost.
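
The two-level construction can be sketched on toy data: an OT problem over words gives a cost between each pair of topics, and a second OT problem over topics, using those costs, compares two documents. The code below is a minimal sketch under stated assumptions, not the authors' implementation: the helper names `ot_cost` and `hott`, the 2-D "word embeddings", the topic and document weights, and the generic linear-programming solver are all illustrative.

```python
import numpy as np
from scipy.optimize import linprog

def ot_cost(a, b, cost):
    """Exact OT cost between histograms a and b, solved as a linear program."""
    n, m = cost.shape
    # Equality constraints on the flattened plan: row sums = a, column sums = b.
    A_eq = np.zeros((n + m, n * m))
    for i in range(n):
        A_eq[i, i * m:(i + 1) * m] = 1.0
    for j in range(m):
        A_eq[n + j, j::m] = 1.0
    res = linprog(cost.ravel(), A_eq=A_eq, b_eq=np.concatenate([a, b]),
                  bounds=(0, None), method="highs")
    return res.fun

# Toy vocabulary: 4 words with made-up 2-D embeddings.
words = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
word_cost = np.linalg.norm(words[:, None, :] - words[None, :, :], axis=-1)

# Three toy topics, each a distribution over the 4 words.
topics = np.array([
    [0.7, 0.3, 0.0, 0.0],
    [0.0, 0.0, 0.4, 0.6],
    [0.25, 0.25, 0.25, 0.25],
])

# Lower level: cost between topics is an OT cost over words.
topic_cost = np.array([[ot_cost(s, t, word_cost) for t in topics] for s in topics])

def hott(doc_a, doc_b):
    """Upper level: OT between documents, each a distribution over topics."""
    return ot_cost(doc_a, doc_b, topic_cost)

doc_a = np.array([0.8, 0.1, 0.1])
doc_b = np.array([0.1, 0.8, 0.1])
dist_ab = hott(doc_a, doc_b)
```

The topic-to-topic costs are computed once, so each document comparison only solves an OT problem whose size is the number of topics rather than the vocabulary size.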

Stochastic Wasserstein Barycenters

Jun 07, 2018

Sebastian Claici, Edward Chien, Justin Solomon

Abstract: We present a stochastic algorithm to compute the barycenter of a set of probability distributions under the Wasserstein metric from optimal transport. Unlike previous approaches, our method extends to continuous input distributions and allows the support of the barycenter to be adjusted in each iteration. We tackle the problem without regularization, allowing us to recover a sharp output whose support is contained within the support of the true barycenter. We give examples where our algorithm recovers a more meaningful barycenter than previous work. Our method is versatile and can be extended to applications such as generating super samples from a given distribution and recovering blue noise approximations.

* ICML 2018
