Ullrich Köthe

OOD Detection with immature Models

Feb 02, 2025

Behrooz Montazeran, Ullrich Köthe

Abstract:Likelihood-based deep generative models (DGMs) have gained significant attention for their ability to approximate the distributions of high-dimensional data. However, these models lack a performance guarantee in assigning higher likelihood values to in-distribution (ID) inputs, data the models are trained on, compared to out-of-distribution (OOD) inputs. This counter-intuitive behaviour is particularly pronounced when ID inputs are more complex than OOD data points. One potential approach to address this challenge involves leveraging the gradient of a data point with respect to the parameters of the DGMs. A recent OOD detection framework proposed estimating the joint density of layer-wise gradient norms for a given data point as a model-agnostic method, demonstrating superior performance compared to the Typicality Test across likelihood-based DGMs and image dataset pairs. In particular, most existing methods presuppose access to fully converged models, the training of which is both time-intensive and computationally demanding. In this work, we demonstrate that using immature models, stopped at early stages of training, can mostly achieve equivalent or even superior results on this downstream task compared to mature models capable of generating high-quality samples that closely resemble ID data. This novel finding enhances our understanding of how DGMs learn the distribution of ID data and highlights the potential of leveraging partially trained models for downstream tasks. Furthermore, we offer a possible explanation for this unexpected behaviour through the concept of support overlap.

* 17 pages, 2 Tables, 9 Figures

TRADE: Transfer of Distributions between External Conditions with Normalizing Flows

Oct 25, 2024

Stefan Wahl, Armand Rousselot, Felix Draxler, Ullrich Köthe

Abstract:Modeling distributions that depend on external control parameters is a common scenario in diverse applications like molecular simulations, where system properties like temperature affect molecular configurations. Despite the relevance of these applications, existing solutions are unsatisfactory as they require severely restricted model architectures or rely on backward training, which is prone to unstable training. We introduce TRADE, which overcomes these limitations by formulating the learning process as a boundary value problem. By initially training the model for a specific condition using either i.i.d. samples or backward KL training, we establish a boundary distribution. We then propagate this information across other conditions using the gradient of the unnormalized density with respect to the external parameter. This formulation, akin to the principles of physics-informed neural networks, allows us to efficiently learn parameter-dependent distributions without restrictive assumptions. Experimentally, we demonstrate that TRADE achieves excellent results in a wide range of applications, ranging from Bayesian inference and molecular simulations to physical lattice models.

* Preprint, under review

Analyzing Generative Models by Manifold Entropic Metrics

Oct 25, 2024

Daniel Galperin, Ullrich Köthe

Abstract:Good generative models should not only synthesize high quality data, but also utilize interpretable representations that aid human understanding of their behavior. However, it is difficult to measure objectively if and to what degree desirable properties of disentangled representations have been achieved. Inspired by the principle of independent mechanisms, we address this difficulty by introducing a novel set of tractable information-theoretic evaluation metrics. We demonstrate the usefulness of our metrics on illustrative toy examples and conduct an in-depth comparison of various normalizing flow architectures and $\beta$-VAEs on the EMNIST dataset. Our method allows sorting latent features by importance and assessing the amount of residual correlations of the resulting concepts. The most interesting finding of our experiments is a ranking of model architectures and training procedures in terms of their inductive bias to converge to aligned and disentangled representations during training.

Learning Distances from Data with Normalizing Flows and Score Matching

Jul 12, 2024

Peter Sorrenson, Daniel Behrend-Uriarte, Christoph Schnörr, Ullrich Köthe

Abstract:Density-based distances (DBDs) offer an elegant solution to the problem of metric learning. By defining a Riemannian metric which increases with decreasing probability density, shortest paths naturally follow the data manifold and points are clustered according to the modes of the data. We show that existing methods to estimate Fermat distances, a particular choice of DBD, suffer from poor convergence in both low and high dimensions due to i) inaccurate density estimates and ii) reliance on graph-based paths which are increasingly rough in high dimensions. To address these issues, we propose learning the densities using a normalizing flow, a generative model with tractable density estimation, and employing a smooth relaxation method using a score model initialized from a graph-based proposal. Additionally, we introduce a dimension-adapted Fermat distance that exhibits more intuitive behavior when scaled to high dimensions and offers better numerical properties. Our work paves the way for practical use of density-based distances, especially in high-dimensional spaces.
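
For orientation, the graph-based baseline that the abstract criticizes can be sketched as follows. This is a minimal illustration of one common sample-based Fermat distance estimator, not the authors' method: each edge between two samples costs the Euclidean distance raised to a power `alpha`, and the distance is the cheapest path through the sample graph. The function name `sample_fermat_distances`, the parameter `alpha`, and the toy points are assumptions made for this example.

```python
import math


def sample_fermat_distances(points, alpha=2.0):
    """All-pairs graph-based Fermat distances (illustrative sketch).

    Edge cost between two samples is their Euclidean distance raised to
    `alpha`; for alpha > 1 long jumps are penalized, so cheapest paths
    prefer many short hops through densely sampled regions.
    """
    n = len(points)
    # Start from the complete graph with power-weighted edges.
    dist = [[math.dist(p, q) ** alpha for q in points] for p in points]
    # Floyd-Warshall: relax every pair through every intermediate sample.
    for k in range(n):
        for i in range(n):
            for j in range(n):
                via_k = dist[i][k] + dist[k][j]
                if via_k < dist[i][j]:
                    dist[i][j] = via_k
    return dist


# Toy data: three closely spaced samples and one far-away sample.
points = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (10.0, 0.0)]
d = sample_fermat_distances(points, alpha=2.0)
```

With `alpha=2`, the path from the first to the third point through the middle sample costs 1 + 1 = 2, which is cheaper than the direct edge of cost 4. This shows how such paths follow the samples rather than straight lines. The cubic-time loop is only suitable for small sample sets.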

Deciphering the Definition of Adversarial Robustness for post-hoc OOD Detectors

Jun 25, 2024

Peter Lorenz, Mario Fernandez, Jens Müller, Ullrich Köthe

Abstract:Detecting out-of-distribution (OOD) inputs is critical for safely deploying deep learning models in real-world scenarios. In recent years, many OOD detectors have been developed, and even the benchmarking has been standardized, i.e. OpenOOD. The number of post-hoc detectors is growing fast, offering an option to protect a pre-trained classifier against natural distribution shifts and claiming to be ready for real-world scenarios. However, their efficacy in handling adversarial examples has been neglected in the majority of studies. This paper investigates the adversarial robustness of 16 post-hoc detectors on several evasion attacks and discusses a roadmap towards adversarial defense in OOD detectors.

Detecting Model Misspecification in Amortized Bayesian Inference with Neural Networks: An Extended Investigation

Jun 06, 2024

Marvin Schmitt, Paul-Christian Bürkner, Ullrich Köthe, Stefan T. Radev

Abstract:Recent advances in probabilistic deep learning enable efficient amortized Bayesian inference in settings where the likelihood function is only implicitly defined by a simulation program (simulation-based inference; SBI). But how faithful is such inference if the simulation represents reality somewhat inaccurately, that is, if the true system behavior at test time deviates from the one seen during training? We conceptualize the types of such model misspecification arising in SBI and systematically investigate how the performance of neural posterior approximators gradually deteriorates as a consequence, making inference results less and less trustworthy. To notify users about this problem, we propose a new misspecification measure that can be trained in an unsupervised fashion (i.e., without training data from the true distribution) and reliably detects model misspecification at test time. Our experiments clearly demonstrate the utility of our new measure both on toy examples with an analytical ground-truth and on representative scientific tasks in cell biology, cognitive decision making, disease outbreak dynamics, and computer vision. We show how the proposed misspecification test warns users about suspicious outputs, raises an alarm when predictions are not trustworthy, and guides model designers in their search for better simulators.

* Extended version of the conference paper https://doi.org/10.1007/978-3-031-54605-1_35. arXiv admin note: text overlap with arXiv:2112.08866

DALSA: Domain Adaptation for Supervised Learning From Sparsely Annotated MR Images

Mar 12, 2024

Michael Götz, Christian Weber, Franciszek Binczyk, Joanna Polanska, Rafal Tarnawski, Barbara Bobek-Billewicz, Ullrich Köthe, Jens Kleesiek, Bram Stieltjes, Klaus H. Maier-Hein

Abstract:We propose a new method that employs transfer learning techniques to effectively correct sampling selection errors introduced by sparse annotations during supervised learning for automated tumor segmentation. The practicality of current learning-based automated tissue classification approaches is severely impeded by their dependency on manually segmented training databases that need to be recreated for each scenario of application, site, or acquisition setup. The comprehensive annotation of reference datasets can be highly labor-intensive, complex, and error-prone. The proposed method derives high-quality classifiers for the different tissue classes from sparse and unambiguous annotations and employs domain adaptation techniques for effectively correcting sampling selection errors introduced by the sparse sampling. The new approach is validated on labeled, multi-modal MR images of 19 patients with malignant gliomas and by comparative analysis on the BraTS 2013 challenge data sets. Compared to training on fully labeled data, we reduced the time for labeling and training by a factor greater than 70 and 180 respectively without sacrificing accuracy. This dramatically eases the establishment and constant extension of large annotated databases in various scenarios and imaging setups and thus represents an important step towards practical applicability of learning-based approaches in tissue classification.

* IEEE Transactions on Medical Imaging (Volume: 35, Issue: 1, January 2016)

On the Universality of Coupling-based Normalizing Flows

Feb 09, 2024

Felix Draxler, Stefan Wahl, Christoph Schnörr, Ullrich Köthe

Abstract:We present a novel theoretical framework for understanding the expressive power of coupling-based normalizing flows such as RealNVP. Despite their prevalence in scientific applications, a comprehensive understanding of coupling flows remains elusive due to their restricted architectures. Existing theorems fall short as they require the use of arbitrarily ill-conditioned neural networks, limiting practical applicability. Additionally, we demonstrate that these constructions inherently lead to volume-preserving flows, a property which we show to be a fundamental constraint for expressivity. We propose a new distributional universality theorem for coupling-based normalizing flows, which overcomes several limitations of prior work. Our results support the general wisdom that the coupling architecture is expressive and provide a nuanced view for choosing the expressivity of coupling functions, bridging a gap between empirical results and theoretical understanding.

* under review
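
As background for the coupling architecture named in the abstract, here is a minimal sketch of a single affine coupling layer in the style of RealNVP. It is an illustration only and does not reproduce the paper's constructions: the `conditioner` function is a fixed, hypothetical stand-in for the learned neural network, and all function names are assumptions of this example.

```python
import math


def conditioner(x_passive, n_out):
    """Hypothetical stand-in for the learned network.

    Maps the passive half of the input to a log-scale s and a shift t
    for each of the n_out active coordinates.
    """
    h = sum(x_passive)
    s = [math.tanh(0.5 * h + 0.1 * k) for k in range(n_out)]
    t = [math.sin(h + k) for k in range(n_out)]
    return s, t


def coupling_forward(x):
    """Leave the first half unchanged; affinely transform the second half."""
    d = len(x) // 2
    x_a, x_b = x[:d], x[d:]
    s, t = conditioner(x_a, len(x_b))
    y_b = [xb * math.exp(si) + ti for xb, si, ti in zip(x_b, s, t)]
    # The Jacobian is triangular, so log|det J| is the sum of the log-scales.
    log_det = sum(s)
    return x_a + y_b, log_det


def coupling_inverse(y):
    """Exact inverse: recompute s and t from the untouched first half."""
    d = len(y) // 2
    y_a, y_b = y[:d], y[d:]
    s, t = conditioner(y_a, len(y_b))
    x_b = [(yb - ti) * math.exp(-si) for yb, si, ti in zip(y_b, s, t)]
    return y_a + x_b


x = [0.3, -1.2, 0.7, 2.0]
y, log_det = coupling_forward(x)
x_rec = coupling_inverse(y)
```

Because the passive half passes through unchanged, the inverse can recompute the same scale and shift without inverting the conditioner. This is why the conditioner may be an arbitrary function. Setting every log-scale to zero would make the layer volume-preserving, the restricted case the abstract refers to.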

Towards Context-Aware Domain Generalization: Representing Environments with Permutation-Invariant Networks

Dec 15, 2023

Jens Müller, Lars Kühmichel, Martin Rohbeck, Stefan T. Radev, Ullrich Köthe

Abstract:In this work, we show that information about the context of an input $X$ can improve the predictions of deep learning models when applied in new domains or production environments. We formalize the notion of context as a permutation-invariant representation of a set of data points that originate from the same environment/domain as the input itself. These representations are jointly learned with a standard supervised learning objective, providing incremental information about the unknown outcome. Furthermore, we offer a theoretical analysis of the conditions under which our approach can, in principle, yield benefits, and formulate two necessary criteria that can be easily verified in practice. Additionally, we contribute insights into the kind of distribution shifts for which our approach promises robustness. Our empirical evaluation demonstrates the effectiveness of our approach for both low-dimensional and high-dimensional data sets. Finally, we demonstrate that we can reliably detect scenarios where a model is tasked with unwarranted extrapolation in out-of-distribution (OOD) domains, identifying potential failure cases. Consequently, we showcase a method to select between the most predictive and the most robust model, circumventing the well-known trade-off between predictive performance and robustness.

Learning Distributions on Manifolds with Free-form Flows

Dec 15, 2023

Peter Sorrenson, Felix Draxler, Armand Rousselot, Sander Hummerich, Ullrich Köthe

Abstract:Many real world data, particularly in the natural sciences and computer vision, lie on known Riemannian manifolds such as spheres, tori or the group of rotation matrices. The predominant approaches to learning a distribution on such a manifold require solving a differential equation in order to sample from the model and evaluate densities. The resulting sampling times are slowed down by a high number of function evaluations. In this work, we propose an alternative approach which only requires a single function evaluation followed by a projection to the manifold. Training is achieved by an adaptation of the recently proposed free-form flow framework to Riemannian manifolds. The central idea is to estimate the gradient of the negative log-likelihood via a trace evaluated in the tangent space. We evaluate our method on various manifolds, and find significantly faster inference at competitive performance compared to previous work. We make our code public at https://github.com/vislearn/FFF.

* Preprint, under review
