Daniel Glasner

LatentCRF: Continuous CRF for Efficient Latent Diffusion

Dec 24, 2024

Kanchana Ranasinghe, Sadeep Jayasumana, Andreas Veit, Ayan Chakrabarti, Daniel Glasner, Michael S Ryoo, Srikumar Ramalingam, Sanjiv Kumar

Abstract: Latent Diffusion Models (LDMs) produce high-quality, photo-realistic images; however, the latency incurred by multiple costly inference iterations can restrict their applicability. We introduce LatentCRF, a continuous Conditional Random Field (CRF) model, implemented as a neural network layer, that models the spatial and semantic relationships among the latent vectors in the LDM. By replacing some of the computationally intensive LDM inference iterations with our lightweight LatentCRF, we achieve a superior balance between quality, speed and diversity. We increase inference efficiency by 33% with no loss in image quality or diversity compared to the full LDM. LatentCRF is an easy add-on, which does not require modifying the LDM.

SPEGTI: Structured Prediction for Efficient Generative Text-to-Image Models

Aug 14, 2023

Sadeep Jayasumana, Daniel Glasner, Srikumar Ramalingam, Andreas Veit, Ayan Chakrabarti, Sanjiv Kumar

Abstract: Modern text-to-image generation models produce high-quality images that are both photorealistic and faithful to the text prompts. However, this quality comes at significant computational cost: nearly all of these models are iterative and require running inference multiple times with large models. This iterative process is needed to ensure that different regions of the image are not only aligned with the text prompt, but also compatible with each other. In this work, we propose a lightweight approach to achieving this compatibility between different regions of an image, using a Markov Random Field (MRF) model. This method is shown to work in conjunction with the recently proposed Muse model. The MRF encodes the compatibility among image tokens at different spatial locations and enables us to significantly reduce the required number of Muse prediction steps. Inference with the MRF is significantly cheaper, and its parameters can be quickly learned through back-propagation by modeling MRF inference as a differentiable neural-network layer. Our full model, SPEGTI, uses this proposed MRF model to speed up Muse by 1.5X with no loss in output image quality.

Balancing Robustness and Sensitivity using Feature Contrastive Learning

May 19, 2021

Seungyeon Kim, Daniel Glasner, Srikumar Ramalingam, Cho-Jui Hsieh, Kishore Papineni, Sanjiv Kumar

Abstract: It is generally believed that robust training of extremely large networks is critical to their success in real-world applications. However, when taken to the extreme, methods that promote robustness can hurt the model's sensitivity to rare or underrepresented patterns. In this paper, we discuss this trade-off between sensitivity and robustness to natural (non-adversarial) perturbations by introducing two notions: contextual feature utility and contextual feature sensitivity. We propose Feature Contrastive Learning (FCL) that encourages a model to be more sensitive to the features that have higher contextual utility. Empirical results demonstrate that models trained with FCL achieve a better balance of robustness and sensitivity, leading to improved generalization in the presence of noise on both vision and NLP datasets.

* 31 pages, 5 figures, 3 tables

Balancing Constraints and Submodularity in Data Subset Selection

Apr 26, 2021

Srikumar Ramalingam, Daniel Glasner, Kaushal Patel, Raviteja Vemulapalli, Sadeep Jayasumana, Sanjiv Kumar

Abstract: Deep learning has yielded extraordinary results in vision and natural language processing, but this achievement comes at a cost. Most deep learning models require enormous resources during training, both in terms of computation and in human labeling effort. In this paper, we show that one can achieve similar accuracy to traditional deep-learning models, while using less training data. Much of the previous work in this area relies on using uncertainty or some form of diversity to select subsets of a larger training set. Submodularity, a discrete analogue of convexity, has been exploited to model diversity in various settings including data subset selection. In contrast to prior methods, we propose a novel diversity-driven objective function, together with balancing constraints on class labels and decision boundaries expressed using matroids. This allows us to use efficient greedy algorithms with approximation guarantees for subset selection. We outperform baselines on standard image classification datasets such as CIFAR-10, CIFAR-100, and ImageNet. In addition, we also show that the proposed balancing constraints can play a key role in boosting performance on long-tailed datasets such as CIFAR-100-LT.
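
The greedy-selection idea in this abstract can be illustrated with a minimal sketch. It is not the paper's objective or code: the facility-location objective, the cosine similarity, and the names `greedy_balanced_subset` and `per_class_cap` are all assumptions chosen for illustration. A per-class cap is one simple example of a partition-matroid constraint on class labels. For monotone submodular objectives under a matroid constraint, the classical greedy algorithm is known to reach at least half of the optimal value.

```python
import math


def cosine_similarity(u, v):
    """Non-negative cosine similarity between two vectors (0 for zero vectors)."""
    nu = math.sqrt(sum(x * x for x in u))
    nv = math.sqrt(sum(x * x for x in v))
    if nu == 0.0 or nv == 0.0:
        return 0.0
    return max(0.0, sum(a * b for a, b in zip(u, v)) / (nu * nv))


def greedy_balanced_subset(features, labels, budget, per_class_cap):
    """Greedily pick up to `budget` indices, at most `per_class_cap` per label.

    Objective (an assumption, not taken from the paper): facility location,
    f(S) = sum over all points j of max_{i in S} sim(i, j).
    """
    n = len(features)
    sim = [[cosine_similarity(features[i], features[j]) for j in range(n)]
           for i in range(n)]
    cover = [0.0] * n   # best similarity of each point to the current selection
    counts = {}         # number of selected items per class label
    selected = []
    while len(selected) < budget:
        best_idx, best_gain = None, -1.0
        for i in range(n):
            # Skip items already chosen or whose class has reached its cap.
            if i in selected or counts.get(labels[i], 0) >= per_class_cap:
                continue
            # Marginal gain of adding item i to the selection.
            gain = sum(max(c, s) - c for c, s in zip(cover, sim[i]))
            if gain > best_gain:
                best_idx, best_gain = i, gain
        if best_idx is None:   # no feasible item remains
            break
        selected.append(best_idx)
        counts[labels[best_idx]] = counts.get(labels[best_idx], 0) + 1
        cover = [max(c, s) for c, s in zip(cover, sim[best_idx])]
    return selected


if __name__ == "__main__":
    feats = [[1.0, 0.0], [1.0, 0.1], [0.0, 1.0], [0.0, 1.1]]
    labs = [0, 0, 1, 1]
    print(greedy_balanced_subset(feats, labs, budget=2, per_class_cap=1))
```

With a cap of one item per class, the selection in the usage example is forced to cover both labels even when one class is more central than the other.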

Understanding Robustness of Transformers for Image Classification

Mar 26, 2021

Srinadh Bhojanapalli, Ayan Chakrabarti, Daniel Glasner, Daliang Li, Thomas Unterthiner, Andreas Veit

Abstract: Deep Convolutional Neural Networks (CNNs) have long been the architecture of choice for computer vision tasks. Recently, Transformer-based architectures like Vision Transformer (ViT) have matched or even surpassed ResNets for image classification. However, details of the Transformer architecture -- such as the use of non-overlapping patches -- lead one to wonder whether these networks are as robust. In this paper, we perform an extensive study of a variety of different measures of robustness of ViT models and compare the findings to ResNet baselines. We investigate robustness to input perturbations as well as robustness to model perturbations. We find that when pre-trained with a sufficient amount of data, ViT models are at least as robust as the ResNet counterparts on a broad range of perturbations. We also find that Transformers are robust to the removal of almost any single layer, and that while activations from later layers are highly correlated with each other, they nevertheless play an important role in classification.

A Global Approach for Solving Edge-Matching Puzzles

Feb 10, 2015

Shahar Z. Kovalsky, Daniel Glasner, Ronen Basri

Abstract: We consider apictorial edge-matching puzzles, in which the goal is to arrange a collection of puzzle pieces with colored edges so that the colors match along the edges of adjacent pieces. We devise an algebraic representation for this problem and provide conditions under which it exactly characterizes a puzzle. Using the new representation, we recast the combinatorial, discrete problem of solving puzzles as a global, polynomial system of equations with continuous variables. We further propose new algorithms for generating approximate solutions to the continuous problem by solving a sequence of convex relaxations.
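
To make the discrete problem statement concrete, here is a small sketch of the matching condition only. It does not implement the paper's algebraic representation or its convex relaxations. The square-piece encoding `(top, right, bottom, left)`, the rectangular grid, and the names `rotate` and `is_solved` are assumptions made for illustration. Border edges are left unconstrained.

```python
def rotate(piece, quarter_turns):
    """Rotate a (top, right, bottom, left) piece clockwise by quarter turns."""
    k = quarter_turns % 4
    if k == 0:
        return tuple(piece)
    piece = tuple(piece)
    # After one clockwise turn the old left edge faces up, the old top faces right.
    return piece[-k:] + piece[:-k]


def is_solved(grid):
    """Return True if all horizontally and vertically adjacent edges match.

    grid[r][c] is an already-oriented piece (top, right, bottom, left).
    """
    rows, cols = len(grid), len(grid[0])
    for r in range(rows):
        for c in range(cols):
            _top, right, bottom, _left = grid[r][c]
            # Right edge must match the left edge of the piece to the east.
            if c + 1 < cols and right != grid[r][c + 1][3]:
                return False
            # Bottom edge must match the top edge of the piece to the south.
            if r + 1 < rows and bottom != grid[r + 1][c][0]:
                return False
    return True


if __name__ == "__main__":
    a = ("red", "blue", "green", "red")
    b = ("green", "red", "red", "blue")
    print(is_solved([[a, b]]))             # edges between a and b match
    print(is_solved([[a, rotate(b, 1)]]))  # rotating b breaks the match
```

A brute-force solver would search over all placements and rotations with this check, which is the combinatorial cost that a continuous reformulation aims to avoid.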

* SIAM J. Imaging Sciences, Vol. 8, Issue 2, 916--938, 2015
