Abir De

Differentiable Adversarial Attacks for Marked Temporal Point Processes

Jan 17, 2025

Pritish Chakraborty, Vinayak Gupta, Rahul R, Srikanta J. Bedathur, Abir De

Abstract: Marked temporal point processes (MTPPs) have been shown to be extremely effective in modeling continuous time event sequences (CTESs). In this work, we present adversarial attacks designed specifically for MTPP models. A key criterion for a good adversarial attack is its imperceptibility. For objects such as images or text, this is often achieved by bounding the perturbation within some fixed $L_p$ norm-ball. However, similarly minimizing distance norms between two CTESs in the context of MTPPs is challenging due to their sequential nature and varying time-scales and lengths. We address this challenge by first permuting the events and then adding noise to the arrival timestamps. However, the worst-case optimization of such adversarial attacks is a hard combinatorial problem, requiring exploration across a permutation space that is factorially large in the length of the input sequence. As a result, we propose PERMTPP, a novel differentiable scheme with which we can perform adversarial attacks by learning to minimize the likelihood while minimizing the distance between two CTESs. Our experiments on four real-world datasets demonstrate the offensive and defensive capabilities and the lower inference times of PERMTPP.

* AAAI 2025 (Main Track)
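The two perturbation steps named in the abstract (permute the events, then add noise to the arrival timestamps) can be illustrated with a small sketch. This is only one possible reading and not the PERMTPP method itself: the function name `perturb_sequence`, the choice to permute marks across fixed time slots, the clipping bound `max_shift`, and the monotonicity repair are all assumptions made for illustration. The differentiable relaxation described in the abstract is not shown.

```python
def perturb_sequence(times, marks, perm, noise, max_shift):
    """Permute marks across time slots, then add bounded noise to timestamps.

    Hypothetical illustration only; not the procedure from the paper.
    """
    n = len(times)
    if sorted(perm) != list(range(n)):
        raise ValueError("perm must be a permutation of 0..n-1")

    # Step 1: permutation. Slot i receives the mark that was at slot perm[i].
    new_marks = [marks[p] for p in perm]

    # Step 2: additive noise, clipped to [-max_shift, max_shift] so the
    # perturbed timestamps stay close to the originals.
    shifted = [t + max(-max_shift, min(max_shift, e))
               for t, e in zip(times, noise)]

    # Repair: force timestamps to be non-decreasing so the result is still
    # a valid event sequence.
    new_times = []
    prev = float("-inf")
    for t in shifted:
        prev = max(prev, t)
        new_times.append(prev)
    return new_times, new_marks


times, marks = perturb_sequence(
    times=[1.0, 2.0, 3.5],
    marks=["a", "b", "c"],
    perm=[1, 0, 2],
    noise=[0.3, -5.0, 0.1],
    max_shift=0.5,
)
print(times, marks)
```

An actual attack would search over `perm` and `noise` to reduce the model likelihood; here both are fixed inputs so that the effect of each step is easy to inspect.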

Graph Edit Distance with General Costs Using Neural Set Divergence

Sep 26, 2024

Eeshaan Jain, Indradyumna Roy, Saswat Meher, Soumen Chakrabarti, Abir De

Abstract: Graph Edit Distance (GED) measures the (dis-)similarity between two given graphs, in terms of the minimum-cost edit sequence that transforms one graph to the other. However, the exact computation of GED is NP-Hard, which has recently motivated the design of neural methods for GED estimation. These methods, however, do not explicitly account for edit operations with different costs. In response, we propose GRAPHEDX, a neural GED estimator that can work with general costs specified for the four edit operations, viz., edge deletion, edge addition, node deletion and node addition. We first present GED as a quadratic assignment problem (QAP) that incorporates these four costs. Then, we represent each graph as a set of node and edge embeddings and use them to design a family of neural set divergence surrogates. We replace the QAP terms corresponding to each operation with their surrogates. Computing such a neural set divergence requires aligning the nodes and edges of the two graphs. We learn these alignments using a Gumbel-Sinkhorn permutation generator, additionally ensuring that the node and edge alignments are consistent with each other. Moreover, these alignments are cognizant of both the presence and absence of edges between node-pairs. Experiments on several datasets, under a variety of edit cost settings, show that GRAPHEDX consistently outperforms state-of-the-art methods and heuristics in terms of prediction error.

* Advances in Neural Information Processing Systems, 38 (2024)
* Published at NeurIPS 2024
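The Gumbel-Sinkhorn step mentioned in the abstract turns a matrix of alignment scores into a soft permutation (an approximately doubly stochastic matrix). The sketch below shows the generic operator in plain Python; it is not GRAPHEDX code. The function names, default temperature, and iteration count are illustrative assumptions, and in the paper the scores would come from learned node and edge embeddings rather than being hand-written.

```python
import math
import random


def logsumexp(values):
    """Numerically stable log(sum(exp(v)))."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))


def gumbel_sinkhorn(scores, temperature=0.5, n_iters=50,
                    noise_scale=1.0, rng=None):
    """Return a soft permutation matrix for a square score matrix."""
    rng = rng or random.Random(0)
    n = len(scores)

    # Perturb each score with Gumbel(0, 1) noise, then divide by the
    # temperature (lower temperature gives a sharper matrix).
    log_p = []
    for i in range(n):
        row = []
        for j in range(n):
            u = rng.uniform(1e-12, 1.0 - 1e-12)
            gumbel = -math.log(-math.log(u))
            row.append((scores[i][j] + noise_scale * gumbel) / temperature)
        log_p.append(row)

    # Sinkhorn iterations in log space: alternately normalize rows and columns.
    for _ in range(n_iters):
        for i in range(n):
            z = logsumexp(log_p[i])
            log_p[i] = [v - z for v in log_p[i]]
        for j in range(n):
            z = logsumexp([log_p[i][j] for i in range(n)])
            for i in range(n):
                log_p[i][j] -= z

    return [[math.exp(v) for v in row] for row in log_p]


soft_perm = gumbel_sinkhorn([[2.0, 0.0, 0.0],
                             [0.0, 2.0, 0.0],
                             [0.0, 0.0, 2.0]])
for row in soft_perm:
    print([round(v, 3) for v in row])
```

Because every operation is a smooth function of the scores, gradients can flow through the returned matrix, which is what makes this operator usable inside a trainable alignment model.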

Continuous Treatment Effect Estimation Using Gradient Interpolation and Kernel Smoothing

Jan 27, 2024

Lokesh Nagalapatti, Akshay Iyer, Abir De, Sunita Sarawagi

Abstract: We address the individualized continuous treatment effect (ICTE) estimation problem, where we predict the effect of any continuous-valued treatment on an individual using observational data. The main challenge in this estimation task is the potential confounding of treatment assignment with an individual's covariates in the training data, whereas during inference ICTE requires prediction on independently sampled treatments. In contrast to prior work that relied on regularizers or unstable GAN training, we advocate the direct approach of augmenting training individuals with independently sampled treatments and inferred counterfactual outcomes. We infer counterfactual outcomes using a two-pronged strategy: Gradient Interpolation for close-to-observed treatments, and Gaussian-Process-based Kernel Smoothing, which allows us to downweight high-variance inferences. We evaluate our method on five benchmarks and show that it outperforms six state-of-the-art methods on the counterfactual estimation error. We analyze the superior performance of our method by showing that (1) our inferred counterfactual responses are more accurate, and (2) adding them to the training data reduces the distributional distance between the confounded training distribution and the test distribution, where treatment is independent of covariates. Our proposed method is model-agnostic and we show that it improves the ICTE accuracy of several existing models.

* Accepted at AAAI 24

Generator Assisted Mixture of Experts For Feature Acquisition in Batch

Dec 19, 2023

Vedang Asgaonkar, Aditya Jain, Abir De

Abstract: Given a set of observations, feature acquisition is about finding the subset of unobserved features which would enhance accuracy. Such problems have been explored in a sequential setting in prior work. Here, the model receives feedback from every new feature acquired and chooses to explore more features or to predict. However, sequential acquisition is not feasible in some settings where time is of the essence. We consider the problem of feature acquisition in batch, where the subset of features to be queried is chosen based on the currently observed features, and then acquired as a batch, followed by prediction. We solve this problem using several technical innovations. First, we use a feature generator to draw a subset of the synthetic features for some examples, which reduces the cost of oracle queries. Second, to make the feature acquisition problem tractable for the large heterogeneous observed features, we partition the data into buckets by borrowing tools from locality-sensitive hashing and then train a mixture-of-experts model. Third, we design a tractable lower bound of the original objective. We use a greedy algorithm combined with model training to solve the underlying problem. Experiments with four datasets show that our approach outperforms these methods in terms of the trade-off between accuracy and feature acquisition cost.

* Accepted in AAAI-24

Retrieving Continuous Time Event Sequences using Neural Temporal Point Processes with Learnable Hashing

Jul 13, 2023

Vinayak Gupta, Srikanta Bedathur, Abir De

Abstract: Temporal sequences have become pervasive in various real-world applications. Consequently, the volume of data generated in the form of continuous time-event sequence(s) or CTES(s) has increased exponentially in the past few years. Thus, a significant fraction of the ongoing research on CTES datasets involves designing models to address downstream tasks such as next-event prediction, long-term forecasting, sequence classification, etc. The recent developments in predictive modeling using marked temporal point processes (MTPP) have enabled an accurate characterization of several real-world applications involving CTESs. However, due to the complex nature of these CTES datasets, the task of large-scale retrieval of temporal sequences has been overlooked in past literature. In detail, by CTES retrieval we mean that for an input query sequence, a retrieval system must return a ranked list of relevant sequences from a large corpus. To tackle this, we propose NeuroSeqRet, a first-of-its-kind framework designed specifically for end-to-end CTES retrieval. Specifically, NeuroSeqRet introduces multiple enhancements over standard retrieval frameworks. It first applies a trainable unwarping function on the query sequence, which makes it comparable with corpus sequences, especially when a relevant query-corpus pair has individually different attributes. Next, it feeds the unwarped query sequence and the corpus sequence into MTPP-guided neural relevance models. We develop four variants of the relevance model for different kinds of applications based on the trade-off between accuracy and efficiency. We also propose an optimization framework to learn binary sequence embeddings from the relevance scores, suitable for locality-sensitive hashing. Our experiments show the significant accuracy boost of NeuroSeqRet as well as the efficacy of our hashing mechanism.

* Extended version of Gupta et al. [arXiv:2202.11485] (AAAI 2022). Under review in a journal

Modeling Continuous Time Sequences with Intermittent Observations using Marked Temporal Point Processes

Jun 23, 2022

Vinayak Gupta, Srikanta Bedathur, Sourangshu Bhattacharya, Abir De

Abstract: A large fraction of data generated via human activities such as online purchases, health records, spatial mobility, etc. can be represented as a sequence of events over continuous time. Training deep learning models over these continuous-time event sequences is a non-trivial task, as it involves modeling the ever-increasing event timestamps, inter-event time gaps, event types, and the influences between different events within and across different sequences. In recent years, neural enhancements to marked temporal point processes (MTPP) have emerged as a powerful framework to model the underlying generative mechanism of asynchronous events localized in continuous time. However, most existing models and inference methods in the MTPP framework consider only the complete observation scenario, i.e., the event sequence being modeled is completely observed with no missing events, an ideal setting that is rarely applicable in real-world applications. A recent line of work that considers missing events while training MTPP utilizes supervised learning techniques that require additional knowledge of a missing or observed label for each event in a sequence, which further restricts its practicability, as in several scenarios the details of missing events are not known a priori. In this work, we provide a novel unsupervised model and inference method for learning MTPP in the presence of event sequences with missing events. Specifically, we first model the generative processes of observed events and missing events using two MTPPs, where the missing events are represented as latent random variables. Then, we devise an unsupervised training method that jointly learns both MTPPs by means of variational inference. Such a formulation can effectively impute the missing data among the observed events and can identify the optimal position of missing events in a sequence.

* ACM TIST

Learning Temporal Point Processes for Efficient Retrieval of Continuous Time Event Sequences

Feb 17, 2022

Vinayak Gupta, Srikanta Bedathur, Abir De

Abstract: Recent developments in predictive modeling using marked temporal point processes (MTPP) have enabled an accurate characterization of several real-world applications involving continuous-time event sequences (CTESs). However, the retrieval problem of such sequences remains largely unaddressed in the literature. To tackle this, we propose NEUROSEQRET, which learns to retrieve and rank a relevant set of continuous-time event sequences for a given query sequence from a large corpus of sequences. More specifically, NEUROSEQRET first applies a trainable unwarping function on the query sequence, which makes it comparable with corpus sequences, especially when a relevant query-corpus pair has individually different attributes. Next, it feeds the unwarped query sequence and the corpus sequence into MTPP-guided neural relevance models. We develop two variants of the relevance model, which offer a tradeoff between accuracy and efficiency. We also propose an optimization framework to learn binary sequence embeddings from the relevance scores, suitable for locality-sensitive hashing, leading to a significant speedup in returning top-K results for a given query sequence. Our experiments with several datasets show the significant accuracy boost of NEUROSEQRET beyond several baselines, as well as the efficacy of our hashing mechanism.

* AAAI 2022
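To make the role of binary embeddings in locality-sensitive hashing concrete, the sketch below hashes real-valued vectors to bit codes and retrieves corpus items whose codes lie within a small Hamming distance of the query code. It swaps in random-hyperplane (sign random projection) hashing for the learned binary embeddings described in the abstract, and it treats each sequence as an already-computed embedding vector. All function names and parameters are hypothetical.

```python
import random


def make_hyperplanes(dim, n_bits, seed=0):
    """Draw n_bits random Gaussian hyperplanes in dim dimensions."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n_bits)]


def binary_code(vec, hyperplanes):
    """One bit per hyperplane: which side of the hyperplane vec lies on."""
    return tuple(
        1 if sum(w * x for w, x in zip(h, vec)) >= 0 else 0
        for h in hyperplanes
    )


def hamming(a, b):
    """Number of positions where two codes differ."""
    return sum(x != y for x, y in zip(a, b))


def build_index(corpus, hyperplanes):
    """Group corpus item ids into buckets keyed by their binary code."""
    buckets = {}
    for idx, vec in enumerate(corpus):
        buckets.setdefault(binary_code(vec, hyperplanes), []).append(idx)
    return buckets


def query(vec, buckets, hyperplanes, max_hamming=1):
    """Return ids from buckets within max_hamming bits, nearest codes first."""
    code = binary_code(vec, hyperplanes)
    hits = []
    for bucket_code, ids in buckets.items():
        d = hamming(code, bucket_code)
        if d <= max_hamming:
            hits.extend((d, i) for i in ids)
    return [i for _, i in sorted(hits)]


planes = make_hyperplanes(dim=2, n_bits=8)
corpus = [[1.0, 0.0], [-1.0, 0.0], [0.0, 1.0]]
index = build_index(corpus, planes)
print(query([2.0, 0.0], index, planes))
```

This toy version scans every bucket key, so it only illustrates the ranking logic. A practical index would probe only the query's bucket and its near neighbours in code space, which is where the speedup over exhaustive scoring comes from.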

Global Convergence Using Policy Gradient Methods for Model-free Markovian Jump Linear Quadratic Control

Nov 30, 2021

Santanu Rathod, Manoj Bhadu, Abir De

Abstract: Owing to the growth of interest in reinforcement learning in the last few years, gradient-based policy control methods have been gaining popularity for control problems as well. And rightly so, since gradient policy methods have the advantage of optimizing a metric of interest in an end-to-end manner, along with being relatively easy to implement without complete knowledge of the underlying system. In this paper, we study the global convergence of gradient-based policy optimization methods for quadratic control of discrete-time and model-free Markovian jump linear systems (MJLS). We surmount myriad challenges that arise because of more than one state, coupled with a lack of knowledge of the system dynamics, and show global convergence of the policy using gradient descent and natural policy gradient methods. We also provide simulation studies to corroborate our claims.

* 42 pages, 3 figures

Integrating Transductive And Inductive Embeddings Improves Link Prediction Accuracy

Aug 23, 2021

Chitrank Gupta, Yash Jain, Abir De, Soumen Chakrabarti

Abstract: In recent years, inductive graph embedding models, viz., graph neural networks (GNNs), have become increasingly accurate at link prediction (LP) in online social networks. The performance of such networks depends strongly on the input node features, which vary across networks and applications. Selecting appropriate node features remains application-dependent and generally an open question. Moreover, owing to privacy and ethical issues, use of personalized node features is often restricted. In fact, many publicly available datasets from online social networks do not contain any node features (e.g., demography). In this work, we provide a comprehensive experimental analysis which shows that harnessing a transductive technique (e.g., Node2Vec) for obtaining initial node representations, after which an inductive node embedding technique takes over, leads to substantial improvements in link prediction accuracy. We demonstrate that, for a wide variety of GNN variants, node representation vectors obtained from Node2Vec serve as high-quality input features to GNNs, thereby improving LP performance.

* 5 Pages, Accepted by CIKM 2021

Training for the Future: A Simple Gradient Interpolation Loss to Generalize Along Time

Aug 15, 2021

Anshul Nasery, Soumyadeep Thakur, Vihari Piratla, Abir De, Sunita Sarawagi

Abstract: In several real-world applications, machine learning models are deployed to make predictions on data whose distribution changes gradually along time, leading to a drift between the train and test distributions. Such models are often re-trained on new data periodically, and they hence need to generalize to data not too far into the future. In this context, there is much prior work on enhancing temporal generalization, e.g., continuous transportation of past data, kernel-smoothed time-sensitive parameters and, more recently, adversarial learning of time-invariant features. However, these methods share several limitations, e.g., poor scalability, training instability, and dependence on unlabeled data from the future. Responding to the above limitations, we propose a simple method that starts with a model with time-sensitive parameters but regularizes its temporal complexity using a Gradient Interpolation (GI) loss. GI allows the decision boundary to change along time and can still prevent overfitting to the limited training time snapshots by allowing task-specific control over changes along time. We compare our method to existing baselines on multiple real-world datasets, which show that GI outperforms more complicated generative and adversarial approaches on the one hand, and simpler gradient regularization methods on the other.
