Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Krishnan Raghavan

Sampling Imbalanced Data with Multi-objective Bilevel Optimization

Jun 12, 2025

Karen Medlin, Sven Leyffer, Krishnan Raghavan

Abstract:Two-class classification problems are often characterized by an imbalance between the number of majority and minority datapoints resulting in poor classification of the minority class in particular. Traditional approaches, such as reweighting the loss function or na\"ive resampling, risk overfitting and subsequently fail to improve classification because they do not consider the diversity between majority and minority datasets. Such consideration is infeasible because there is no metric that can measure the impact of imbalance on the model. To obviate these challenges, we make two key contributions. First, we introduce MOODS~(Multi-Objective Optimization for Data Sampling), a novel multi-objective bilevel optimization framework that guides both synthetic oversampling and majority undersampling. Second, we introduce a validation metric -- `$\epsilon/ \delta$ non-overlapping diversification metric' -- that quantifies the goodness of a sampling method towards model performance. With this metric we experimentally demonstrate state-of-the-art performance with improvement in diversity driving a $1-15 \%$ increase in $F1$ scores.

Via

Access Paper or Ask Questions

A Bilevel Optimization Framework for Imbalanced Data Classification

Oct 15, 2024

Karen Medlin, Sven Leyffer, Krishnan Raghavan

Abstract:Data rebalancing techniques, including oversampling and undersampling, are a common approach to addressing the challenges of imbalanced data. To tackle unresolved problems related to both oversampling and undersampling, we propose a new undersampling approach that: (i) avoids the pitfalls of noise and overlap caused by synthetic data and (ii) avoids the pitfall of under-fitting caused by random undersampling. Instead of undersampling majority data randomly, our method undersamples datapoints based on their ability to improve model loss. Using improved model loss as a proxy measurement for classification performance, our technique assesses a datapoint's impact on loss and rejects those unable to improve it. In so doing, our approach rejects majority datapoints redundant to datapoints already accepted and, thereby, finds an optimal subset of majority training data for classification. The accept/reject component of our algorithm is motivated by a bilevel optimization problem uniquely formulated to identify the optimal training set we seek. Experimental results show our proposed technique with F1 scores up to 10% higher than state-of-the-art methods.

Via

Access Paper or Ask Questions

Large Language Models for Anomaly Detection in Computational Workflows: from Supervised Fine-Tuning to In-Context Learning

Jul 24, 2024

Hongwei Jin, George Papadimitriou, Krishnan Raghavan, Pawel Zuk, Prasanna Balaprakash, Cong Wang, Anirban Mandal, Ewa Deelman

Figure 1 for Large Language Models for Anomaly Detection in Computational Workflows: from Supervised Fine-Tuning to In-Context Learning

Figure 2 for Large Language Models for Anomaly Detection in Computational Workflows: from Supervised Fine-Tuning to In-Context Learning

Figure 3 for Large Language Models for Anomaly Detection in Computational Workflows: from Supervised Fine-Tuning to In-Context Learning

Figure 4 for Large Language Models for Anomaly Detection in Computational Workflows: from Supervised Fine-Tuning to In-Context Learning

Abstract:Anomaly detection in computational workflows is critical for ensuring system reliability and security. However, traditional rule-based methods struggle to detect novel anomalies. This paper leverages large language models (LLMs) for workflow anomaly detection by exploiting their ability to learn complex data patterns. Two approaches are investigated: 1) supervised fine-tuning (SFT), where pre-trained LLMs are fine-tuned on labeled data for sentence classification to identify anomalies, and 2) in-context learning (ICL) where prompts containing task descriptions and examples guide LLMs in few-shot anomaly detection without fine-tuning. The paper evaluates the performance, efficiency, generalization of SFT models, and explores zero-shot and few-shot ICL prompts and interpretability enhancement via chain-of-thought prompting. Experiments across multiple workflow datasets demonstrate the promising potential of LLMs for effective anomaly detection in complex executions.

* 12 pages, 14 figures, paper is accepted by SC'24, source code, see: https://github.com/PoSeiDon-Workflows/LLM_AD

Via

Access Paper or Ask Questions

Forward Gradients for Data-Driven CFD Wall Modeling

Nov 28, 2023

Jan Hückelheim, Tadbhagya Kumar, Krishnan Raghavan, Pinaki Pal

Abstract:Computational Fluid Dynamics (CFD) is used in the design and optimization of gas turbines and many other industrial/ scientific applications. However, the practical use is often limited by the high computational cost, and the accurate resolution of near-wall flow is a significant contributor to this cost. Machine learning (ML) and other data-driven methods can complement existing wall models. Nevertheless, training these models is bottlenecked by the large computational effort and memory footprint demanded by back-propagation. Recent work has presented alternatives for computing gradients of neural networks where a separate forward and backward sweep is not needed and storage of intermediate results between sweeps is not required because an unbiased estimator for the gradient is computed in a single forward sweep. In this paper, we discuss the application of this approach for training a subgrid wall model that could potentially be used as a surrogate in wall-bounded flow CFD simulations to reduce the computational overhead while preserving predictive accuracy.

Via

Access Paper or Ask Questions

Self-supervised Learning for Anomaly Detection in Computational Workflows

Oct 02, 2023

Hongwei Jin, Krishnan Raghavan, George Papadimitriou, Cong Wang, Anirban Mandal, Ewa Deelman, Prasanna Balaprakash

Figure 1 for Self-supervised Learning for Anomaly Detection in Computational Workflows

Figure 2 for Self-supervised Learning for Anomaly Detection in Computational Workflows

Figure 3 for Self-supervised Learning for Anomaly Detection in Computational Workflows

Figure 4 for Self-supervised Learning for Anomaly Detection in Computational Workflows

Abstract:Anomaly detection is the task of identifying abnormal behavior of a system. Anomaly detection in computational workflows is of special interest because of its wide implications in various domains such as cybersecurity, finance, and social networks. However, anomaly detection in computational workflows~(often modeled as graphs) is a relatively unexplored problem and poses distinct challenges. For instance, when anomaly detection is performed on graph data, the complex interdependency of nodes and edges, the heterogeneity of node attributes, and edge types must be accounted for. Although the use of graph neural networks can help capture complex inter-dependencies, the scarcity of labeled anomalous examples from workflow executions is still a significant challenge. To address this problem, we introduce an autoencoder-driven self-supervised learning~(SSL) approach that learns a summary statistic from unlabeled workflow data and estimates the normal behavior of the computational workflow in the latent space. In this approach, we combine generative and contrastive learning objectives to detect outliers in the summary statistics. We demonstrate that by estimating the distribution of normal behavior in the latent space, we can outperform state-of-the-art anomaly detection methods on our benchmark datasets.

Via

Access Paper or Ask Questions

Learning Continually on a Sequence of Graphs -- The Dynamical System Way

May 19, 2023

Krishnan Raghavan, Prasanna Balaprakash

Abstract:Continual learning~(CL) is a field concerned with learning a series of inter-related task with the tasks typically defined in the sense of either regression or classification. In recent years, CL has been studied extensively when these tasks are defined using Euclidean data-- data, such as images, that can be described by a set of vectors in an n-dimensional real space. However, the literature is quite sparse, when the data corresponding to a CL task is nonEuclidean-- data , such as graphs, point clouds or manifold, where the notion of similarity in the sense of Euclidean metric does not hold. For instance, a graph is described by a tuple of vertices and edges and similarities between two graphs is not well defined through a Euclidean metric. Due to this fundamental nature of the data, developing CL for nonEuclidean data presents several theoretical and methodological challenges. In particular, CL for graphs requires explicit modelling of nonstationary behavior of vertices and edges and their effects on the learning problem. Therefore, in this work, we develop a adaptive dynamic programming viewpoint for CL with graphs. In this work, we formulate a two-player sequential game between the act of learning new tasks~(generalization) and remembering previously learned tasks~(forgetting). We prove mathematically the existence of a solution to the game and demonstrate convergence to the solution of the game. Finally, we demonstrate the efficacy of our method on a number of graph benchmarks with a comprehensive ablation study while establishing state-of-the-art performance.

Via

Access Paper or Ask Questions

Quantifying uncertainty for deep learning based forecasting and flow-reconstruction using neural architecture search ensembles

Feb 20, 2023

Romit Maulik, Romain Egele, Krishnan Raghavan, Prasanna Balaprakash

Figure 1 for Quantifying uncertainty for deep learning based forecasting and flow-reconstruction using neural architecture search ensembles

Figure 2 for Quantifying uncertainty for deep learning based forecasting and flow-reconstruction using neural architecture search ensembles

Figure 3 for Quantifying uncertainty for deep learning based forecasting and flow-reconstruction using neural architecture search ensembles

Figure 4 for Quantifying uncertainty for deep learning based forecasting and flow-reconstruction using neural architecture search ensembles

Abstract:Classical problems in computational physics such as data-driven forecasting and signal reconstruction from sparse sensors have recently seen an explosion in deep neural network (DNN) based algorithmic approaches. However, most DNN models do not provide uncertainty estimates, which are crucial for establishing the trustworthiness of these techniques in downstream decision making tasks and scenarios. In recent years, ensemble-based methods have achieved significant success for the uncertainty quantification in DNNs on a number of benchmark problems. However, their performance on real-world applications remains under-explored. In this work, we present an automated approach to DNN discovery and demonstrate how this may also be utilized for ensemble-based uncertainty quantification. Specifically, we propose the use of a scalable neural and hyperparameter architecture search for discovering an ensemble of DNN models for complex dynamical systems. We highlight how the proposed method not only discovers high-performing neural network ensembles for our tasks, but also quantifies uncertainty seamlessly. This is achieved by using genetic algorithms and Bayesian optimization for sampling the search space of neural network architectures and hyperparameters. Subsequently, a model selection approach is used to identify candidate models for an ensemble set construction. Afterwards, a variance decomposition approach is used to estimate the uncertainty of the predictions from the ensemble. We demonstrate the feasibility of this framework for two tasks - forecasting from historical data and flow reconstruction from sparse sensors for the sea-surface temperature. We demonstrate superior performance from the ensemble in contrast with individual high-performing models and other benchmarks.

Via

Access Paper or Ask Questions

Cooperative Deep $Q$-learning Framework for Environments Providing Image Feedback

Oct 28, 2021

Krishnan Raghavan, Vignesh Narayanan, Jagannathan Sarangapani

Figure 1 for Cooperative Deep $Q$-learning Framework for Environments Providing Image Feedback

Figure 2 for Cooperative Deep $Q$-learning Framework for Environments Providing Image Feedback

Figure 3 for Cooperative Deep $Q$-learning Framework for Environments Providing Image Feedback

Figure 4 for Cooperative Deep $Q$-learning Framework for Environments Providing Image Feedback

Abstract:In this paper, we address two key challenges in deep reinforcement learning setting, sample inefficiency and slow learning, with a dual NN-driven learning approach. In the proposed approach, we use two deep NNs with independent initialization to robustly approximate the action-value function in the presence of image inputs. In particular, we develop a temporal difference (TD) error-driven learning approach, where we introduce a set of linear transformations of the TD error to directly update the parameters of each layer in the deep NN. We demonstrate theoretically that the cost minimized by the error-driven learning (EDL) regime is an approximation of the empirical cost and the approximation error reduces as learning progresses, irrespective of the size of the network. Using simulation analysis, we show that the proposed methods enables faster learning and convergence and requires reduced buffer size (thereby increasing the sample efficiency).

Via

Access Paper or Ask Questions

Learning to Control using Image Feedback

Oct 28, 2021

Krishnan Raghavan, Vignesh Narayanan, Jagannathan Saraangapani

Figure 1 for Learning to Control using Image Feedback

Figure 2 for Learning to Control using Image Feedback

Figure 3 for Learning to Control using Image Feedback

Abstract:Learning to control complex systems using non-traditional feedback, e.g., in the form of snapshot images, is an important task encountered in diverse domains such as robotics, neuroscience, and biology (cellular systems). In this paper, we present a two neural-network (NN)-based feedback control framework to design control policies for systems that generate feedback in the form of images. In particular, we develop a deep $Q$-network (DQN)-driven learning control strategy to synthesize a sequence of control inputs from snapshot images that encode the information pertaining to the current state and control action of the system. Further, to train the networks we employ a direct error-driven learning (EDL) approach that utilizes a set of linear transformations of the NN training error to update the NN weights in each layer. We verify the efficacy of the proposed control strategy using numerical examples.

Via

Access Paper or Ask Questions

AutoDEUQ: Automated Deep Ensemble with Uncertainty Quantification

Oct 26, 2021

Romain Egele, Romit Maulik, Krishnan Raghavan, Prasanna Balaprakash, Bethany Lusch

Figure 1 for AutoDEUQ: Automated Deep Ensemble with Uncertainty Quantification

Figure 2 for AutoDEUQ: Automated Deep Ensemble with Uncertainty Quantification

Figure 3 for AutoDEUQ: Automated Deep Ensemble with Uncertainty Quantification

Figure 4 for AutoDEUQ: Automated Deep Ensemble with Uncertainty Quantification

Abstract:Deep neural networks are powerful predictors for a variety of tasks. However, they do not capture uncertainty directly. Using neural network ensembles to quantify uncertainty is competitive with approaches based on Bayesian neural networks while benefiting from better computational scalability. However, building ensembles of neural networks is a challenging task because, in addition to choosing the right neural architecture or hyperparameters for each member of the ensemble, there is an added cost of training each model. We propose AutoDEUQ, an automated approach for generating an ensemble of deep neural networks. Our approach leverages joint neural architecture and hyperparameter search to generate ensembles. We use the law of total variance to decompose the predictive variance of deep ensembles into aleatoric (data) and epistemic (model) uncertainties. We show that AutoDEUQ outperforms probabilistic backpropagation, Monte Carlo dropout, deep ensemble, distribution-free ensembles, and hyper ensemble methods on a number of regression benchmarks.

Via

Access Paper or Ask Questions