Gabriel Terejanu

CGLearn: Consistent Gradient-Based Learning for Out-of-Distribution Generalization

Nov 09, 2024

Jawad Chowdhury, Gabriel Terejanu

Abstract: Improving generalization and achieving highly predictive, robust machine learning models necessitates learning the underlying causal structure of the variables of interest. A prominent and effective method for this is learning invariant predictors across multiple environments. In this work, we introduce a simple yet powerful approach, CGLearn, which relies on the agreement of gradients across various environments. This agreement serves as a strong indication of reliable features, while disagreement suggests less reliability due to potential differences in underlying causal mechanisms. Our proposed method demonstrates superior performance compared to state-of-the-art methods in both linear and nonlinear settings across various regression and classification tasks. CGLearn remains applicable even in the absence of separate environments by exploiting invariance across different subsamples of observational data. Comprehensive experiments on both synthetic and real-world datasets highlight its effectiveness in diverse scenarios. Our findings underscore the importance of leveraging gradient agreement for learning causal invariance, providing a significant step forward in the field of robust machine learning. The source code for the linear and nonlinear implementations of CGLearn is open source and available at: https://github.com/hasanjawad001/CGLearn.

* 9 pages, 3 figures
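
The abstract describes gradient agreement across environments only at a high level. The sketch below is one possible reading for a linear regression model, not the authors' implementation: it computes one gradient per environment and updates a weight only when the ratio of the absolute mean gradient to its standard deviation across environments reaches a threshold. The function names, the ratio-based criterion, and the `threshold` parameter are all assumptions made for illustration.

```python
import numpy as np

def env_gradients(w, envs):
    """Gradient of the mean squared error of a linear model, one row per environment."""
    return np.stack([2.0 * X.T @ (X @ w - y) / len(y) for X, y in envs])

def agreement_mask(grads, threshold=1.0, eps=1e-12):
    """Hypothetical agreement test: |mean gradient| / std across environments >= threshold."""
    mean = grads.mean(axis=0)
    std = grads.std(axis=0)
    return (np.abs(mean) / (std + eps)) >= threshold

def masked_step(w, envs, lr=0.1, threshold=1.0):
    """One gradient step that only updates weights whose gradients agree across environments."""
    grads = env_gradients(w, envs)
    mask = agreement_mask(grads, threshold)
    return w - lr * mask * grads.mean(axis=0)

# Two toy environments: feature 0 pulls in the same direction in both,
# feature 1 pulls in opposite directions and is therefore masked out.
envs = [
    (np.array([[1.0, 1.0]]), np.array([1.0])),
    (np.array([[1.0, -1.0]]), np.array([1.0])),
]
w_next = masked_step(np.zeros(2), envs)
```

In this toy case only the first weight moves, because the two environments disagree on the sign of the gradient for the second feature.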

Two-Stage Stance Labeling: User-Hashtag Heuristics with Graph Neural Networks

Apr 16, 2024

Joshua Melton, Shannon Reid, Gabriel Terejanu, Siddharth Krishnan

Abstract: The high volume and rapid evolution of content on social media present major challenges for studying the stance of social media users. In this work, we develop a two-stage stance labeling method that utilizes the user-hashtag bipartite graph and the user-user interaction graph. In the first stage, a simple and efficient heuristic for stance labeling uses the user-hashtag bipartite graph to iteratively update the stance association of user and hashtag nodes via a label propagation mechanism. This set of soft labels is then integrated with the user-user interaction graph to train a graph neural network (GNN) model using semi-supervised learning. We evaluate this method on two large-scale datasets containing tweets related to climate change from June 2021 to June 2022 and gun control from January 2022 to January 2023. Experiments demonstrate that our user-hashtag heuristic and the semi-supervised GNN method outperform zero-shot stance labeling using LLMs such as GPT4. Further analysis illustrates how the stance labeling information and interaction graph can be used for evaluating the polarization of social media interactions on divisive issues such as climate change and gun control.
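
As a rough illustration of the first stage, the sketch below runs a simple label propagation over a user-hashtag usage matrix. It is not the paper's heuristic: the averaging update, the clamping of seed hashtags, the stance range of [-1, 1], and the name `propagate_stance` are assumptions chosen to show the general idea of alternating updates between the two sides of a bipartite graph.

```python
import numpy as np

def propagate_stance(B, seed_hashtags, n_iter=20):
    """B: (n_users, n_hashtags) usage counts; seed_hashtags: {hashtag index: stance in [-1, 1]}."""
    B = np.asarray(B, dtype=float)
    tag = np.zeros(B.shape[1])
    for j, s in seed_hashtags.items():
        tag[j] = s
    # Guard against division by zero for users or hashtags with no usage.
    user_deg = np.maximum(B.sum(axis=1), 1e-12)
    tag_deg = np.maximum(B.sum(axis=0), 1e-12)
    for _ in range(n_iter):
        user = B @ tag / user_deg      # user stance: usage-weighted mean of hashtag stances
        tag = B.T @ user / tag_deg     # hashtag stance: usage-weighted mean of user stances
        for j, s in seed_hashtags.items():
            tag[j] = s                 # keep the seed hashtags fixed
    user = B @ tag / user_deg
    return user, tag

# Two users who never share a hashtag; hashtags 0 and 2 are seeded with opposite stances.
B = [[1, 1, 0, 0],
     [0, 0, 1, 1]]
user_stance, tag_stance = propagate_stance(B, {0: 1.0, 2: -1.0})
```

The resulting soft labels for users are the kind of signal that a second stage could pass to a semi-supervised model.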

Machine Fault Classification using Hamiltonian Neural Networks

Jan 04, 2023

Jeremy Shen, Jawad Chowdhury, Sourav Banerjee, Gabriel Terejanu

Abstract: A new approach is introduced to classify faults in rotating machinery based on the total energy signature estimated from sensor measurements. The overall goal is to go beyond using black-box models and incorporate additional physical constraints that govern the behavior of mechanical systems. Observational data is used to train Hamiltonian neural networks that describe the conserved energy of the system for normal and various abnormal regimes. The estimated total energy function, in the form of the weights of the Hamiltonian neural network, serves as the new feature vector to discriminate between the faults using off-the-shelf classification models. The experimental results are obtained using the MaFaulDa database, where the proposed model yields a promising area under the curve (AUC) of $0.78$ for the binary classification (normal vs. abnormal) and $0.84$ for the multi-class problem (normal and $5$ different abnormal regimes).

* ICPRAM 2023

Evaluation of Induced Expert Knowledge in Causal Structure Learning by NOTEARS

Jan 04, 2023

Jawad Chowdhury, Rezaur Rashid, Gabriel Terejanu

Abstract: Causal modeling provides us with powerful counterfactual reasoning and interventional mechanisms to generate predictions and reason under various what-if scenarios. However, causal discovery using observational data remains a nontrivial task due to unobserved confounding factors, finite sampling, and changes in the data distribution. These can lead to spurious cause-effect relationships. To mitigate these challenges in practice, researchers augment causal learning with known causal relations. The goal of the paper is to study the impact of expert knowledge on causal relations in the form of additional constraints used in the formulation of the nonparametric NOTEARS. We provide a comprehensive set of comparative analyses of biasing the model using different types of knowledge. We found that (i) knowledge that corrects the mistakes of the NOTEARS model can lead to statistically significant improvements, (ii) constraints on active edges have a larger positive impact on causal discovery than inactive edges, and surprisingly, (iii) the induced knowledge does not correct on average more incorrect active and/or inactive edges than expected. We also demonstrate the behavior of the model and the effectiveness of domain knowledge on a real-world dataset.

* ICPRAM 2023
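
For context, NOTEARS turns the search for a DAG into a continuous problem by penalizing a smooth acyclicity function. The sketch below evaluates the function from the original linear formulation, h(W) = tr(exp(W ∘ W)) − d, using a power series truncated at d terms so that only NumPy is needed. The paper above studies the nonparametric variant with expert-knowledge constraints, which this sketch does not cover; the function name and the truncation are choices made for illustration.

```python
import numpy as np

def acyclicity(W):
    """Truncated series for tr(exp(W * W)) - d; zero exactly when W has no directed cycle."""
    d = W.shape[0]
    A = W * W                      # elementwise square, so all entries are non-negative
    term = np.eye(d)
    h = 0.0
    for k in range(1, d + 1):
        term = term @ A / k        # term now holds A^k / k!
        h += np.trace(term)        # tr(A^k) > 0 only if a cycle of length k exists
    return h

dag = np.array([[0.0, 1.0],
                [0.0, 0.0]])       # single edge 0 -> 1
cyclic = np.array([[0.0, 1.0],
                   [1.0, 0.0]])    # edges 0 -> 1 and 1 -> 0

h_dag = acyclicity(dag)
h_cyclic = acyclicity(cyclic)
```

Truncating at d terms is sufficient for the zero test because any directed cycle in a graph with d nodes contains a simple cycle of length at most d.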

From Causal Pairs to Causal Graphs

Nov 08, 2022

Rezaur Rashid, Jawad Chowdhury, Gabriel Terejanu

Abstract: Causal structure learning from observational data remains a non-trivial task due to various factors such as finite sampling, unobserved confounding factors, and measurement errors. Constraint-based and score-based methods tend to suffer from high computational complexity due to the combinatorial nature of estimating the directed acyclic graph (DAG). Motivated by the 'Cause-Effect Pair' NIPS 2013 Workshop on Causality Challenge, in this paper, we take a different approach and generate a probability distribution over all possible graphs informed by the cause-effect pair features proposed in response to the workshop challenge. The goal of the paper is to propose new methods based on this probabilistic information and compare their performance with traditional and state-of-the-art approaches. Our experiments, on both synthetic and real datasets, show that our proposed methods not only perform statistically similarly to or better than some traditional approaches but are also computationally faster.

* ICMLA 2022

Explainable Deep Modeling of Tabular Data using TableGraphNet

Feb 12, 2020

Gabriel Terejanu, Jawad Chowdhury, Rezaur Rashid, Asif Chowdhury

Abstract: The vast majority of research on explainability focuses on post-explainability rather than explainable modeling. Namely, an explanation model is derived to explain a complex black box model built with the sole purpose of achieving the highest performance possible. In part, this trend might be driven by the misconception that there is a trade-off between explainability and accuracy. Furthermore, the consequential work on Shapley values, grounded in game theory, has also contributed to a new wave of post-explainability research on better approximations for various machine learning models, including deep learning models. We propose a new architecture that inherently produces explainable predictions in the form of additive feature attributions. Our approach learns a graph representation for each record in the dataset. Attribute-centric features are then derived from the graph and fed into a contribution deep set model to produce the final predictions. We show that our explainable model attains the same level of performance as black box models. Finally, we provide an augmented model training approach that leverages the missingness property and yields high levels of consistency (as required for the Shapley values) without loss of accuracy.

Approximate Sampling using an Accelerated Metropolis-Hastings based on Bayesian Optimization and Gaussian Processes

Oct 21, 2019

Asif J. Chowdhury, Gabriel Terejanu

Abstract: Markov Chain Monte Carlo (MCMC) methods have a drawback when working with a target distribution or likelihood function that is computationally expensive to evaluate, especially when working with big data. This paper focuses on the Metropolis-Hastings (MH) algorithm for unimodal distributions. Here, an enhanced MH algorithm is proposed that requires fewer expensive function evaluations, has a shorter burn-in period, and uses a better proposal distribution. The main innovations include the use of Bayesian optimization to reach the high-probability region quickly, emulating the target distribution using Gaussian processes (GP), and using a Laplace approximation of the GP to build a proposal distribution that better captures the underlying correlation. The experiments show significant improvement over the regular MH. A statistical comparison between the results of the two algorithms is presented.
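
For reference, the baseline that the abstract calls regular MH can be sketched as a one-dimensional random-walk sampler. This is the textbook algorithm with a symmetric Gaussian proposal, not the accelerated method proposed in the paper; the function name, the `step` parameter, and the standard normal target are assumptions made for the example. Each iteration evaluates `log_target` once, which is the cost the paper aims to reduce when that evaluation is expensive.

```python
import numpy as np

def metropolis_hastings(log_target, x0, n_samples, step=1.0, seed=0):
    """Random-walk Metropolis-Hastings for a one-dimensional log density."""
    rng = np.random.default_rng(seed)
    x = float(x0)
    logp = log_target(x)
    samples = np.empty(n_samples)
    for i in range(n_samples):
        proposal = x + step * rng.normal()
        logp_new = log_target(proposal)     # one target evaluation per iteration
        # The proposal is symmetric, so the acceptance ratio is target(proposal) / target(x).
        if np.log(rng.uniform()) < logp_new - logp:
            x, logp = proposal, logp_new
        samples[i] = x                      # a rejected proposal repeats the current state
    return samples

# Standard normal target, written up to an additive constant in log space.
samples = metropolis_hastings(lambda x: -0.5 * x * x, x0=0.0, n_samples=20000)
kept = samples[2000:]                       # discard an initial burn-in segment
```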

A Multiple Filter Based Neural Network Approach to the Extrapolation of Adsorption Energies on Metal Surfaces for Catalysis Applications

Oct 01, 2019

Asif J. Chowdhury, Wenqiang Yang, Kareem E. Abdelfatah, Mehdi Zare, Andreas Heyden, Gabriel Terejanu

Abstract: Computational catalyst discovery involves the development of microkinetic reactor models based on estimated parameters determined from density functional theory (DFT). For complex surface chemistries, the cost of calculating the adsorption energies by DFT for a large number of reaction intermediates can become prohibitive. Here, we have identified appropriate descriptors and machine learning models that can be used to predict part of these adsorption energies given data on the rest of them. Our investigations also included the case when the species data used to train the predictive model is of a different size relative to the species the model tries to predict, an extrapolation in the data space that is typically difficult with regular machine learning models. We have developed a neural-network-based predictive model that combines an established model with the concepts of a convolutional neural network and that, when extrapolating, achieves significant improvement over the previous models.

An Approximate Bayesian Long Short-Term Memory Algorithm for Outlier Detection

Dec 23, 2017

Chao Chen, Xiao Lin, Gabriel Terejanu

Abstract: Long Short-Term Memory networks trained with gradient descent and back-propagation have achieved great success in various applications. However, point estimation of the weights of the networks is prone to over-fitting and lacks important uncertainty information associated with the estimation. At the same time, exact Bayesian neural network methods are intractable and impractical for real-world applications. In this study, we propose an approximate estimation of the weight uncertainty using the Ensemble Kalman Filter, which is easily scalable to a large number of weights. Furthermore, we optimize the covariance of the noise distribution in the ensemble update step using maximum likelihood estimation. To assess the proposed algorithm, we apply it to outlier detection in five real-world events retrieved from the Twitter platform.
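
To make the ensemble update step concrete, the sketch below implements a generic stochastic Ensemble Kalman Filter analysis step with perturbed observations. It treats each ensemble member as a parameter vector with a matching model prediction and is not the paper's LSTM training procedure; the function name, the diagonal observation noise, and the fixed `obs_noise_var` (which the paper instead tunes by maximum likelihood) are assumptions.

```python
import numpy as np

def enkf_update(ensemble, predictions, y_obs, obs_noise_var, rng):
    """ensemble: (n_members, n_params); predictions: (n_members, n_obs); y_obs: (n_obs,)."""
    n = ensemble.shape[0]
    dX = ensemble - ensemble.mean(axis=0)
    dY = predictions - predictions.mean(axis=0)
    C_xy = dX.T @ dY / (n - 1)                      # parameter-prediction cross-covariance
    C_yy = dY.T @ dY / (n - 1)                      # prediction covariance
    R = obs_noise_var * np.eye(predictions.shape[1])
    K = C_xy @ np.linalg.inv(C_yy + R)              # Kalman gain, shape (n_params, n_obs)
    # Each member is compared against its own noisy copy of the observation.
    perturbed = y_obs + rng.normal(0.0, np.sqrt(obs_noise_var), size=predictions.shape)
    return ensemble + (perturbed - predictions) @ K.T

# Toy case: one parameter observed directly, prior N(0, 1), observation 2.0 with small noise.
rng = np.random.default_rng(0)
prior = rng.normal(size=(500, 1))
posterior = enkf_update(prior, prior.copy(), np.array([2.0]), 0.01, rng)
```

The update needs only sample covariances of the ensemble, never a full covariance over all weights, which is what makes this family of methods attractive for large parameter vectors.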

Environmental Modeling Framework using Stacked Gaussian Processes

Jun 18, 2017

Kareem Abdelfatah, Junshu Bao, Gabriel Terejanu

Abstract: A network of independently trained Gaussian processes (StackedGP) is introduced to obtain predictions of quantities of interest with quantified uncertainties. The main applications of the StackedGP framework are to integrate different datasets through model composition, to enhance predictions of quantities of interest through a cascade of intermediate predictions, and to propagate uncertainties through emulated dynamical systems driven by uncertain forcing variables. By using analytical first- and second-order moments of a Gaussian process with uncertain inputs under squared exponential and polynomial kernels, approximated expectations of quantities of interest that require an arbitrary composition of functions can be obtained. The StackedGP model is extended to any number of layers and nodes per layer, and it provides flexibility in kernel selection for the input nodes. The proposed nonparametric stacked model is validated using synthetic datasets, and its performance in model composition and cascading predictions is measured in two applications using real data.
