Vivek Narayanaswamy

Leveraging Registers in Vision Transformers for Robust Adaptation

Jan 08, 2025

Srikar Yellapragada, Kowshik Thopalli, Vivek Narayanaswamy, Wesam Sakla, Yang Liu, Yamen Mubarka, Dimitris Samaras, Jayaraman J. Thiagarajan

Abstract: Vision Transformers (ViTs) have shown success across a variety of tasks due to their ability to capture global image representations. Recent studies have identified the existence of high-norm tokens in ViTs, which can interfere with unsupervised object discovery. To address this, the use of "registers", additional tokens that isolate high-norm patch tokens while capturing global image-level information, has been proposed. While registers have been studied extensively for object discovery, their generalization properties, particularly in out-of-distribution (OOD) scenarios, remain underexplored. In this paper, we examine the utility of register token embeddings in providing additional features for improving generalization and anomaly rejection. To that end, we propose a simple method that combines the special CLS token embedding commonly employed in ViTs with the average-pooled register embeddings to create feature representations, which are subsequently used for training a downstream classifier. We find that this enhances OOD generalization and anomaly rejection while maintaining in-distribution (ID) performance. Extensive experiments across multiple ViT backbones trained with and without registers reveal consistent improvements of 2-4% in top-1 OOD accuracy and a 2-3% reduction in false positive rates for anomaly detection. Importantly, these gains are achieved without additional computational overhead.

* Accepted at ICASSP 2025
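
The feature construction described in the abstract above can be sketched roughly as follows. This is only an illustration, not the authors' code: the token layout `[CLS, registers, patches]`, the function name `cls_plus_register_features`, and the choice of concatenation as the way to "combine" the two embeddings are all assumptions.

```python
import numpy as np

def cls_plus_register_features(tokens, num_registers):
    """Build features from the CLS token and average-pooled register tokens.

    tokens: array of shape (batch, 1 + num_registers + num_patches, dim).
    The layout [CLS, registers, patches] is an assumption for this sketch.
    """
    cls = tokens[:, 0, :]                            # (batch, dim)
    registers = tokens[:, 1:1 + num_registers, :]    # (batch, num_registers, dim)
    pooled = registers.mean(axis=1)                  # average-pool over registers
    # Concatenation is one plausible way to combine the two (assumption).
    return np.concatenate([cls, pooled], axis=-1)    # (batch, 2 * dim)

# Random stand-in for the output tokens of a ViT with 4 registers.
rng = np.random.default_rng(0)
tokens = rng.normal(size=(2, 1 + 4 + 16, 8))
features = cls_plus_register_features(tokens, num_registers=4)
print(features.shape)  # (2, 16)
```

The resulting array would then be passed to whatever downstream classifier is being trained, in place of the CLS embedding alone.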

DECIDER: Leveraging Foundation Model Priors for Improved Model Failure Detection and Explanation

Aug 01, 2024

Rakshith Subramanyam, Kowshik Thopalli, Vivek Narayanaswamy, Jayaraman J. Thiagarajan

Abstract: Reliably detecting when a deployed machine learning model is likely to fail on a given input is crucial for ensuring safe operation. In this work, we propose DECIDER (Debiasing Classifiers to Identify Errors Reliably), a novel approach that leverages priors from large language models (LLMs) and vision-language models (VLMs) to detect failures in image classification models. DECIDER utilizes LLMs to specify task-relevant core attributes and constructs a "debiased" version of the classifier by aligning its visual features to these core attributes using a VLM, and detects potential failure by measuring disagreement between the original and debiased models. In addition to proactively identifying samples on which the model would fail, DECIDER also provides human-interpretable explanations for failure through a novel attribute-ablation strategy. Through extensive experiments across diverse benchmarks spanning subpopulation shifts (spurious correlations, class imbalance) and covariate shifts (synthetic corruptions, domain shifts), DECIDER consistently achieves state-of-the-art failure detection performance, significantly outperforming baselines in terms of the overall Matthews correlation coefficient as well as failure and success recall. Our code can be accessed at https://github.com/kowshikthopalli/DECIDER/

* Accepted at ECCV (European Conference on Computer Vision) 2024
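
As a rough illustration of disagreement-based failure detection and of the Matthews correlation coefficient (MCC) mentioned in the abstract above, consider the sketch below. It is not the DECIDER implementation: treating "disagreement" as a mismatch between the two models' argmax predictions is an assumption, and the logits and labels are made-up toy values.

```python
import numpy as np

def flag_failures(logits_original, logits_debiased):
    """Flag a sample when the two models predict different classes (assumed rule)."""
    return logits_original.argmax(axis=1) != logits_debiased.argmax(axis=1)

def matthews_corrcoef(y_true, y_pred):
    """MCC for binary outcomes given as boolean arrays."""
    y_true = np.asarray(y_true, dtype=bool)
    y_pred = np.asarray(y_pred, dtype=bool)
    tp = np.sum(y_true & y_pred)
    tn = np.sum(~y_true & ~y_pred)
    fp = np.sum(~y_true & y_pred)
    fn = np.sum(y_true & ~y_pred)
    denom = np.sqrt(float((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)))
    return float(tp * tn - fp * fn) / denom if denom > 0 else 0.0

# Toy values: four samples, two classes.
logits_original = np.array([[2.0, 0.1], [0.2, 1.5], [1.0, 0.9], [0.3, 0.8]])
logits_debiased = np.array([[1.8, 0.2], [1.1, 0.4], [0.2, 1.3], [0.1, 0.9]])
labels = np.array([0, 0, 0, 1])

flagged = flag_failures(logits_original, logits_debiased)
actual_failures = logits_original.argmax(axis=1) != labels
mcc = matthews_corrcoef(actual_failures, flagged)
print(flagged, mcc)
```

Here the detector is scored by how well the flagged samples match the samples the original classifier actually gets wrong.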

On the Use of Anchoring for Training Vision Models

Jun 01, 2024

Vivek Narayanaswamy, Kowshik Thopalli, Rushil Anirudh, Yamen Mubarka, Wesam Sakla, Jayaraman J. Thiagarajan

Abstract: Anchoring is a recent, architecture-agnostic principle for training deep neural networks that has been shown to significantly improve uncertainty estimation, calibration, and extrapolation capabilities. In this paper, we systematically explore anchoring as a general protocol for training vision models, providing fundamental insights into its training and inference processes and their implications for generalization and safety. Despite its promise, we identify a critical problem in anchored training that can lead to an increased risk of learning undesirable shortcuts, thereby limiting its generalization capabilities. To address this, we introduce a new anchored training protocol that employs a simple regularizer to mitigate this issue and significantly enhances generalization. We empirically evaluate our proposed approach across datasets and architectures of varying scales and complexities, demonstrating substantial performance gains in generalization and safety metrics compared to the standard training protocol.

PAGER: A Framework for Failure Analysis of Deep Regression Models

Sep 20, 2023

Jayaraman J. Thiagarajan, Vivek Narayanaswamy, Puja Trivedi, Rushil Anirudh

Abstract: Safe deployment of AI models requires proactive detection of potential prediction failures to prevent costly errors. While failure detection in classification problems has received significant attention, characterizing failure modes in regression tasks is more complicated and less explored. Existing approaches rely on epistemic uncertainties or feature inconsistency with the training distribution to characterize model risk. However, we show that uncertainties are necessary but insufficient to accurately characterize failure, owing to the various sources of error. In this paper, we propose PAGER (Principled Analysis of Generalization Errors in Regressors), a framework to systematically detect and characterize failures in deep regression models. Built upon the recently proposed idea of anchoring in deep models, PAGER unifies both epistemic uncertainties and novel, complementary non-conformity scores to organize samples into different risk regimes, thereby providing a comprehensive analysis of model errors. Additionally, we introduce novel metrics for evaluating failure detectors in regression tasks. We demonstrate the effectiveness of PAGER on synthetic and real-world benchmarks. Our results highlight the capability of PAGER to identify regions of accurate generalization and detect failure cases in out-of-distribution and out-of-support scenarios.

An L2-Normalized Spatial Attention Network For Accurate And Fast Classification Of Brain Tumors In 2D T1-Weighted CE-MRI Images

Aug 01, 2023

Grace Billingsley, Julia Dietlmeier, Vivek Narayanaswamy, Andreas Spanias, Noel E. O'Connor

Abstract: We propose an accurate and fast network for classifying brain tumors in MRI images that outperforms all lightweight methods investigated in terms of accuracy. We test our model on a challenging 2D T1-weighted CE-MRI dataset containing three types of brain tumors: Meningioma, Glioma and Pituitary. We introduce an l2-normalized spatial attention mechanism that acts as a regularizer against overfitting during training. We compare our results against the state of the art on this dataset and show that by integrating l2-normalized spatial attention into a baseline network we achieve a performance gain of 1.79 percentage points. Even better accuracy can be attained by combining our model in an ensemble with the pretrained VGG16, at the expense of execution speed. Our code is publicly available at https://github.com/juliadietlmeier/MRI_image_classification

* Accepted to be published in: IEEE International Conference on Image Processing (ICIP), Kuala Lumpur October 8-11, 2023

Single Model Uncertainty Estimation via Stochastic Data Centering

Jul 14, 2022

Jayaraman J. Thiagarajan, Rushil Anirudh, Vivek Narayanaswamy, Peer-Timo Bremer

Abstract: We are interested in estimating the uncertainties of deep neural networks, which play an important role in many scientific and engineering problems. In this paper, we present a striking new finding that an ensemble of neural networks with the same weight initialization, trained on datasets that are shifted by a constant bias, gives rise to slightly inconsistent trained models, where the differences in predictions are a strong indicator of epistemic uncertainties. Using the neural tangent kernel (NTK), we demonstrate that this phenomenon occurs in part because the NTK is not shift-invariant. Since this is achieved via a trivial input transformation, we show that it can therefore be approximated using just a single neural network -- using a technique that we call $\Delta$-UQ -- that estimates uncertainty around prediction by marginalizing out the effect of the biases. We show that $\Delta$-UQ's uncertainty estimates are superior to many of the current methods on a variety of benchmarks -- outlier rejection, calibration under distribution shift, and sequential design optimization of black box functions.
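
The idea of marginalizing over constant shifts at inference time can be sketched as below. This is a hedged illustration only: the abstract does not say how the shift is presented to the network, so the two-argument interface `model(residual, anchor)`, the function name `anchored_predict`, and the `toy_model` stand-in are all assumptions.

```python
import numpy as np

def anchored_predict(model, x, anchors):
    """Predict under several constant shifts and summarize across them.

    model(residual, anchor) -> prediction is an assumed interface.
    Returns the mean prediction and its spread (standard deviation) per sample.
    """
    preds = np.stack([model(x - c, c) for c in anchors], axis=0)
    return preds.mean(axis=0), preds.std(axis=0)

# Hypothetical stand-in for a trained network whose output depends
# slightly on the shift it was given.
def toy_model(residual, anchor):
    return (residual + anchor).sum(axis=-1) + 0.1 * np.sin(anchor.sum())

x = np.array([[0.5, -1.0], [2.0, 0.0]])
anchors = [np.array([0.0, 0.0]), np.array([1.0, -1.0]), np.array([0.3, 0.7])]
mean, std = anchored_predict(toy_model, x, anchors)
print(mean, std)
```

The spread across shifts plays the role of the uncertainty estimate; with a real network it would vary from sample to sample rather than being constant as in this toy stand-in.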

Revisiting Inlier and Outlier Specification for Improved Out-of-Distribution Detection

Jul 12, 2022

Vivek Narayanaswamy, Yamen Mubarka, Rushil Anirudh, Deepta Rajan, Andreas Spanias, Jayaraman J. Thiagarajan

Abstract: Accurately detecting out-of-distribution (OOD) data with varying levels of semantic and covariate shifts with respect to the in-distribution (ID) data is critical for deployment of safe and reliable models. This is particularly the case when dealing with highly consequential applications (e.g., medical imaging and self-driving cars). The goal is to design a detector that can accept meaningful variations of the ID data, while also rejecting examples from OOD regimes. In practice, this dual objective can be realized by enforcing consistency using an appropriate scoring function (e.g., energy) and calibrating the detector to reject a curated set of OOD data (referred to as outlier exposure, or OE for short). While OE methods are widely adopted, assembling representative OOD datasets is both costly and challenging due to the unpredictability of real-world scenarios, hence the recent trend of designing OE-free detectors. In this paper, we make a surprising finding that controlled generalization to ID variations and exposure to diverse (synthetic) outlier examples are essential to simultaneously improving semantic and modality shift detection. In contrast to existing methods, our approach samples inliers in the latent space and constructs outlier examples via negative data augmentation. Through a rigorous empirical study on medical imaging benchmarks (MedMNIST, ISIC2019 and NCT), we demonstrate significant performance gains (15%-35% in AUROC) over existing OE-free OOD detection approaches under both semantic and modality shifts.
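
For readers unfamiliar with the energy scoring function that the abstract gives as an example, a minimal sketch follows. It shows only the standard energy score computed from classifier logits, not the paper's training procedure; the toy logits and the rejection `threshold` are hypothetical values chosen for illustration.

```python
import numpy as np

def energy_score(logits, temperature=1.0):
    """Energy score E(x) = -T * logsumexp(logits / T).

    Lower energy corresponds to more confident, ID-like inputs.
    """
    z = logits / temperature
    m = z.max(axis=-1, keepdims=True)                 # subtract max for stability
    lse = m.squeeze(-1) + np.log(np.exp(z - m).sum(axis=-1))
    return -temperature * lse

# Toy logits: one confident prediction, one nearly flat prediction.
logits = np.array([[8.0, 0.5, 0.1], [0.4, 0.5, 0.3]])
scores = energy_score(logits)

threshold = -3.0              # hypothetical; normally calibrated on held-out data
is_ood = scores > threshold   # higher energy is treated as OOD
print(scores, is_ood)
```

In practice the threshold would be set on validation data, for example to achieve a target true positive rate on ID samples.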

Designing Counterfactual Generators using Deep Model Inversion

Oct 05, 2021

Jayaraman J. Thiagarajan, Vivek Narayanaswamy, Deepta Rajan, Jason Liang, Akshay Chaudhari, Andreas Spanias

Abstract: Explanation techniques that synthesize small, interpretable changes to a given image while producing desired changes in the model prediction have become popular for introspecting black-box models. Commonly referred to as counterfactuals, the synthesized explanations are required to contain discernible changes (for easy interpretability) while also being realistic (consistent with the data manifold). In this paper, we focus on the case where we have access only to the trained deep classifier and not the actual training data. While the problem of inverting deep models to synthesize images from the training distribution has been explored, our goal is to develop a deep inversion approach to generate counterfactual explanations for a given query image. Despite their effectiveness in conditional image synthesis, we show that existing deep inversion methods are insufficient for producing meaningful counterfactuals. We propose DISC (Deep Inversion for Synthesizing Counterfactuals), which improves upon deep inversion by (a) utilizing stronger image priors, (b) incorporating a novel manifold consistency objective, and (c) adopting a progressive optimization strategy. We find that, in addition to producing visually meaningful explanations, the counterfactuals from DISC are effective at learning classifier decision boundaries and are robust to unknown test-time corruptions.

* NeurIPS 2021

Loss Estimators Improve Model Generalization

Mar 05, 2021

Vivek Narayanaswamy, Jayaraman J. Thiagarajan, Deepta Rajan, Andreas Spanias

Abstract: With increased interest in adopting AI methods for clinical diagnosis, a vital step towards safe deployment of such tools is to ensure that the models not only produce accurate predictions but also do not generalize to data regimes where the training data provide no meaningful evidence. Existing approaches for ensuring that the distribution of model predictions is similar to the true distribution rely on explicit uncertainty estimators that are inherently hard to calibrate. In this paper, we propose to train a loss estimator alongside the predictive model, using a contrastive training objective, to directly estimate the prediction uncertainties. Interestingly, we find that, in addition to producing well-calibrated uncertainties, this approach improves the generalization behavior of the predictor. Using a dermatology use-case, we show the impact of loss estimators on model generalization, in terms of both its fidelity on in-distribution data and its ability to detect out-of-distribution samples or new classes unseen during training.

Using Deep Image Priors to Generate Counterfactual Explanations

Oct 22, 2020

Vivek Narayanaswamy, Jayaraman J. Thiagarajan, Andreas Spanias

Abstract: Through the use of carefully tailored convolutional neural network architectures, a deep image prior (DIP) can be used to obtain pre-images from latent representation encodings. Though DIP inversion has been known to be superior to conventional regularized inversion strategies such as total variation, such an over-parameterized generator is able to effectively reconstruct even images that are not in the original data distribution. This limitation makes it challenging to utilize such priors for tasks such as counterfactual reasoning, wherein the goal is to generate small, interpretable changes to an image that systematically lead to changes in the model prediction. To this end, we propose a novel regularization strategy based on an auxiliary loss estimator jointly trained with the predictor, which efficiently guides the prior to recover natural pre-images. Our empirical studies with a real-world ISIC skin lesion detection problem clearly demonstrate the effectiveness of the proposed approach in synthesizing meaningful counterfactuals. In comparison, we find that the standard DIP inversion often proposes visually imperceptible perturbations to irrelevant parts of the image, thus providing no additional insights into the model behavior.
