Gautam Sreekumar

Compositional World Knowledge leads to High Utility Synthetic data

Mar 06, 2025

Sachit Gaudi, Gautam Sreekumar, Vishnu Boddeti

Abstract: Machine learning systems struggle with robustness under subpopulation shifts. This problem becomes especially pronounced when only a subset of attribute combinations is observed during training, a severe form of subpopulation shift referred to as compositional shift. To address this problem, we ask the following question: can we improve robustness by training on synthetic data that spans all possible attribute combinations? We first show that training conditional diffusion models on limited data leads to an incorrect underlying distribution. Therefore, synthetic data sampled from such models is unfaithful and does not improve the performance of downstream machine learning systems. To address this problem, we propose CoInD, which reflects the compositional nature of the world by enforcing conditional independence through minimizing Fisher's divergence between joint and marginal distributions. We demonstrate that synthetic data generated by CoInD is faithful, and that this translates to state-of-the-art worst-group accuracy on compositional shift tasks on CelebA.
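As a rough illustration of the worst-group accuracy metric mentioned in the abstract, the sketch below computes accuracy separately for each attribute combination and reports the minimum. The function name, the toy labels, and the attribute names are hypothetical and are not taken from the paper or its code.

```python
from collections import defaultdict


def worst_group_accuracy(labels, predictions, groups):
    """Return (worst accuracy, per-group accuracy) over attribute groups.

    Each entry of `groups` identifies the attribute combination of one sample.
    """
    correct = defaultdict(int)
    total = defaultdict(int)
    for y, y_hat, g in zip(labels, predictions, groups):
        total[g] += 1
        correct[g] += int(y == y_hat)
    per_group = {g: correct[g] / total[g] for g in total}
    return min(per_group.values()), per_group


# Hypothetical toy data: each group is an (attribute_a, attribute_b) pair.
labels = [1, 1, 0, 0, 1, 0]
predictions = [1, 1, 0, 1, 1, 0]
groups = [
    ("blond", "male"),
    ("blond", "male"),
    ("dark", "female"),
    ("dark", "female"),
    ("blond", "female"),
    ("dark", "male"),
]

worst, per_group = worst_group_accuracy(labels, predictions, groups)
print(worst, per_group)
```

In this toy case the average accuracy is 5/6, while the worst group reaches only 0.5, which is why the metric is sensitive to rarely observed or unobserved attribute combinations.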

CoInD: Enabling Logical Compositions in Diffusion Models

Mar 03, 2025

Sachit Gaudi, Gautam Sreekumar, Vishnu Boddeti

Abstract: How can we learn generative models to sample data with arbitrary logical compositions of statistically independent attributes? The prevailing solution is to sample from distributions expressed as a composition of attributes' conditional marginal distributions, under the assumption that they are statistically independent. This paper shows that standard conditional diffusion models violate this assumption, even when all attribute compositions are observed during training. This violation is significantly more severe when only a subset of the compositions is observed. We propose CoInD to address this problem. It explicitly enforces statistical independence between the conditional marginal distributions by minimizing Fisher's divergence between the joint and marginal distributions. The theoretical advantages of CoInD are reflected in both qualitative and quantitative experiments, demonstrating a significantly more faithful and controlled generation of samples for arbitrary logical compositions of attributes. The benefit is more pronounced for scenarios that current solutions relying on the assumption of conditionally independent marginals struggle with, namely, logical compositions involving the NOT operation and cases where only a subset of compositions is observed during training.
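One way to read the independence constraint described in the abstract is sketched below for two attributes. It assumes that the attributes $c_1, c_2$ are conditionally independent given the sample $x$, and that the model exposes score functions for the joint, the single-attribute conditionals, and the unconditional distribution. The symbol $\mathcal{L}_{\mathrm{CI}}$ and the notation $p_\theta$ are illustrative choices, not quoted from the paper.

```latex
% Assumption: c_1 and c_2 are conditionally independent given x.
p(c_1, c_2 \mid x) = p(c_1 \mid x)\, p(c_2 \mid x)
\;\Longrightarrow\;
p(x \mid c_1, c_2) \propto \frac{p(x \mid c_1)\, p(x \mid c_2)}{p(x)}

% Taking the gradient of the log with respect to x gives a relation between scores.
\nabla_x \log p(x \mid c_1, c_2)
  = \nabla_x \log p(x \mid c_1) + \nabla_x \log p(x \mid c_2) - \nabla_x \log p(x)

% A Fisher-divergence style penalty on the mismatch between the two sides.
\mathcal{L}_{\mathrm{CI}}
  = \mathbb{E}\left[ \left\|
      \nabla_x \log p_\theta(x \mid c_1, c_2)
      - \nabla_x \log p_\theta(x \mid c_1)
      - \nabla_x \log p_\theta(x \mid c_2)
      + \nabla_x \log p_\theta(x)
    \right\|_2^2 \right]
```

The penalty vanishes exactly when the learned joint score equals the composition of the marginal scores, which is the property that composition-based sampling relies on.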

OASIS Uncovers: High-Quality T2I Models, Same Old Stereotypes

Jan 01, 2025

Sepehr Dehdashtian, Gautam Sreekumar, Vishnu Naresh Boddeti

Abstract: Images generated by text-to-image (T2I) models often exhibit visual biases and stereotypes of concepts such as culture and profession. Existing quantitative measures of stereotypes are based on statistical parity, which does not align with the sociological definition of stereotypes and, therefore, incorrectly categorizes biases as stereotypes. Instead of oversimplifying stereotypes as biases, we propose a quantitative measure of stereotypes that aligns with their sociological definition. We then propose OASIS to measure the stereotypes in a generated dataset and understand their origins within the T2I model. OASIS includes two scores to measure stereotypes from a generated image dataset: (M1) Stereotype Score to measure the distributional violation of stereotypical attributes, and (M2) WALS to measure spectral variance in the images along a stereotypical attribute. OASIS also includes two methods to understand the origins of stereotypes in T2I models: (U1) StOP to discover attributes that the T2I model internally associates with a given concept, and (U2) SPI to quantify the emergence of stereotypical attributes in the latent space of the T2I model during image generation. Despite the considerable progress in image fidelity, using OASIS, we conclude that newer T2I models such as FLUX.1 and SDv3 contain strong stereotypical predispositions about concepts and still generate images with widespread stereotypical attributes. Additionally, the quantity of stereotypes worsens for nationalities with lower Internet footprints.

Spurious Correlations and Where to Find Them

Aug 21, 2023

Gautam Sreekumar, Vishnu Naresh Boddeti

Abstract: Spurious correlations occur when a model learns unreliable features from the data and are a well-known drawback of data-driven learning. Although several algorithms have been proposed to mitigate them, we have yet to jointly derive the indicators of spurious correlations. As a result, the solutions built upon standalone hypotheses fail to beat simple ERM baselines. We collect some of the commonly studied hypotheses behind the occurrence of spurious correlations and investigate their influence on standard ERM baselines using synthetic datasets generated from causal graphs. Subsequently, we observe patterns connecting these hypotheses and model design choices.

* 2nd Workshop on SCIS, ICML 2023

On the Biometric Capacity of Generative Face Models

Aug 03, 2023

Vishnu Naresh Boddeti, Gautam Sreekumar, Arun Ross

Abstract: There has been tremendous progress in generating realistic faces with high fidelity over the past few years. Despite this progress, a crucial question remains unanswered: "Given a generative face model, how many unique identities can it generate?" In other words, what is the biometric capacity of the generative face model? A scientific basis for answering this question will benefit evaluating and comparing different generative face models and establish an upper bound on their scalability. This paper proposes a statistical approach to estimate the biometric capacity of generated face images in a hyperspherical feature space. We employ our approach on multiple generative models, including unconditional generators like StyleGAN, Latent Diffusion Model, and "Generated Photos," as well as DCFace, a class-conditional generator. We also estimate capacity w.r.t. demographic attributes such as gender and age. Our capacity estimates indicate that (a) under ArcFace representation at a false acceptance rate (FAR) of 0.1%, StyleGAN3 and DCFace have a capacity upper bound of $1.43\times10^6$ and $1.190\times10^4$, respectively; (b) the capacity reduces drastically as we lower the desired FAR with an estimate of $1.796\times10^4$ and $562$ at FAR of 1% and 10%, respectively, for StyleGAN3; (c) there is no discernible disparity in the capacity w.r.t. gender; and (d) for some generative models, there is an appreciable disparity in the capacity w.r.t. age. Code is available at https://github.com/human-analysis/capacity-generative-face-models.

* IJCB 2023

Neuro-DynaStress: Predicting Dynamic Stress Distributions in Structural Components

Dec 19, 2022

Hamed Bolandi, Gautam Sreekumar, Xuyang Li, Nizar Lajnef, Vishnu Naresh Boddeti

Abstract: Structural components are typically exposed to dynamic loading, such as earthquakes, wind, and explosions. Structural engineers should be able to conduct real-time analysis during or in the aftermath of extreme disaster events that require immediate corrections to avoid fatal failures. As a result, it is crucial to predict dynamic stress distributions during highly disruptive events in real time. Currently available high-fidelity methods, such as Finite Element Models (FEMs), suffer from their inherent high complexity and are computationally prohibitive. Therefore, to reduce computational cost while preserving accuracy, a deep learning model, Neuro-DynaStress, is proposed to predict the entire sequence of stress distribution based on finite element simulations using a partial differential equation (PDE) solver. The model was designed and trained to use the geometry, boundary conditions, and sequence of loads as input and to predict the sequences of high-resolution stress contours. The performance of the proposed framework is compared to finite element simulations using a PDE solver.

* 16 pages, 12 figures. arXiv admin note: text overlap with arXiv:2211.16190

Physics Informed Neural Network for Dynamic Stress Prediction

Nov 28, 2022

Hamed Bolandi, Gautam Sreekumar, Xuyang Li, Nizar Lajnef, Vishnu Naresh Boddeti

Abstract: Structural failures are often caused by catastrophic events such as earthquakes and winds. As a result, it is crucial to predict dynamic stress distributions during highly disruptive events in real time. Currently available high-fidelity methods, such as Finite Element Models (FEMs), suffer from their inherent high complexity. Therefore, to reduce computational cost while maintaining accuracy, a Physics Informed Neural Network (PINN) model, PINN-Stress, is proposed to predict the entire sequence of stress distribution based on Finite Element simulations using a partial differential equation (PDE) solver. Using automatic differentiation, we embed a PDE into a deep neural network's loss function to incorporate information from measurements and PDEs. The PINN-Stress model can predict the sequence of stress distribution in almost real time and can generalize better than the model without PINN.
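The idea of embedding a PDE in the loss function can be sketched generically as follows. The abstract does not state the governing equation, the network outputs, or the loss weighting, so the differential operator $\mathcal{N}$, the weight $\lambda$, the field $u_\theta$, and the sample counts $N$ and $M$ are placeholders introduced here for illustration only.

```latex
% Data term: mismatch between the network output and reference values
% (for example, from finite element simulations) at N sample points.
\mathcal{L}_{\mathrm{data}}
  = \frac{1}{N} \sum_{i=1}^{N}
    \left\| u_\theta(x_i, t_i) - u_i^{\mathrm{ref}} \right\|_2^2

% Physics term: residual of the governing PDE, \mathcal{N}[u] = 0,
% evaluated at M collocation points; the derivatives inside \mathcal{N}
% are obtained by automatic differentiation of the network.
\mathcal{L}_{\mathrm{PDE}}
  = \frac{1}{M} \sum_{j=1}^{M}
    \left\| \mathcal{N}[u_\theta](x_j, t_j) \right\|_2^2

% Combined objective with a hypothetical weighting factor \lambda.
\mathcal{L} = \mathcal{L}_{\mathrm{data}} + \lambda\, \mathcal{L}_{\mathrm{PDE}}
```

Under this reading, the physics term acts as a regularizer that penalizes predictions violating the PDE even where no reference data are available.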

* 14 pages, 13 figures

Neural Architecture Transfer

May 12, 2020

Zhichao Lu, Gautam Sreekumar, Erik Goodman, Wolfgang Banzhaf, Kalyanmoy Deb, Vishnu Naresh Boddeti

Abstract: Neural architecture search (NAS) has emerged as a promising avenue for automatically designing task-specific neural networks. Most existing NAS approaches require one complete search for each deployment specification of hardware or objective. This is a computationally impractical endeavor given the potentially large number of application scenarios. In this paper, we propose Neural Architecture Transfer (NAT) to overcome this limitation. NAT is designed to efficiently generate task-specific custom models that are competitive even under multiple conflicting objectives. To realize this goal we learn task-specific supernets from which specialized subnets can be sampled without any additional training. The key to our approach is an integrated online transfer learning and many-objective evolutionary search procedure. A pre-trained supernet is iteratively adapted while simultaneously searching for task-specific subnets. We demonstrate the efficacy of NAT on 11 benchmark image classification tasks ranging from large-scale multi-class to small-scale fine-grained datasets. In all cases, including ImageNet, NATNets improve upon the state-of-the-art under mobile settings ($\leq$ 600M Multiply-Adds). Surprisingly, small-scale fine-grained datasets benefit the most from NAT. At the same time, the architecture search and transfer is orders of magnitude more efficient than existing NAS methods. Overall, experimental evaluation indicates that across diverse image classification tasks and computational objectives, NAT is an appreciably more effective alternative to fine-tuning based transfer learning. Code is available at https://github.com/human-analysis/neural-architecture-transfer

* 17 pages
