Abstract: In the realm of artificial intelligence, the generation of realistic training data for supervised learning tasks presents a significant challenge. This is particularly true for the synthesis of electrocardiograms (ECGs), where the objective is to develop a synthetic 12-lead ECG model. The primary complexity of this task stems from accurately modeling the intricate biological and physiological interactions among the different ECG leads. Although mathematical process simulators have shed light on these dynamics, effectively incorporating this understanding into generative models is not straightforward. In this work, we introduce a method that employs ordinary differential equations (ODEs) to enhance the fidelity of generated 12-lead ECG data. The approach integrates a system of ODEs representing cardiac dynamics directly into the generative model's optimization process, allowing the production of biologically plausible ECG training data that authentically reflects real-world variability and inter-lead dependencies. In an empirical analysis of thousands of ECGs, we find that incorporating cardiac-simulation insights into the data generation process significantly improves the accuracy of heart-abnormality classifiers trained on this synthetic 12-lead ECG data.
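A minimal sketch of the core idea, in PyTorch: the generative objective is augmented with a penalty on the residual of an assumed cardiac ODE system dx/dt = f(x). The names (ode_residual_penalty, cardiac_ode, generative_loss, lam) are illustrative, not the paper's.

import torch

def ode_residual_penalty(x, f, dt=1.0):
    # x: (batch, time, leads) generated ECG signals
    # f: callable returning the modeled derivative dx/dt for each state
    dxdt = (x[:, 1:, :] - x[:, :-1, :]) / dt  # finite-difference time derivative
    fx = f(x[:, :-1, :])                      # derivative predicted by the ODE system
    return ((dxdt - fx) ** 2).mean()          # mean squared ODE residual

# Hypothetical composite objective used during training:
# total_loss = generative_loss + lam * ode_residual_penalty(fake_ecg, cardiac_ode)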
Abstract: Proteins play a vital role in biological processes and are indispensable for living organisms. Accurate representation of proteins is crucial, especially in drug development. Recently, there has been growing interest in using machine learning and deep learning techniques for unsupervised learning of protein representations. However, these approaches often focus solely on a protein's amino acid sequence and lack factual knowledge about proteins and their interactions, limiting their performance. In this study, we present GOProteinGNN, a novel architecture that enhances protein language models by integrating protein knowledge graph information during the creation of amino acid level representations. Our approach allows information to be integrated at both the individual amino acid level and the entire protein level, enabling a comprehensive and effective learning process through graph-based learning. By doing so, we capture complex relationships and dependencies between proteins and their functional annotations, resulting in more robust and contextually enriched protein representations. Unlike previous fusion methods, GOProteinGNN learns over the entire protein knowledge graph during training, allowing it to capture broader relational nuances and dependencies beyond the mere triplets used in previous work. We perform a comprehensive evaluation on several downstream tasks, demonstrating that GOProteinGNN consistently outperforms previous methods and establishing it as a state-of-the-art solution for protein representation learning.
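A hedged sketch of how such a fusion layer might look: one knowledge-graph message-passing step produces a protein-level context vector that is injected into every amino-acid token representation. The wiring (KGFusionLayer, a dense adjacency matrix, protein_idx) is illustrative, not GOProteinGNN's exact design.

import torch
import torch.nn as nn

class KGFusionLayer(nn.Module):
    def __init__(self, dim):
        super().__init__()
        self.gnn_lin = nn.Linear(dim, dim)   # one round of KG neighbor aggregation
        self.fuse = nn.Linear(2 * dim, dim)  # combine token and KG context

    def forward(self, tokens, node_emb, adj, protein_idx=0):
        # tokens: (batch, seq, dim) amino-acid representations from the protein LM
        # node_emb: (nodes, dim) KG node embeddings; adj: (nodes, nodes) adjacency
        msg = torch.relu(self.gnn_lin(adj @ node_emb))       # aggregate KG neighbors
        ctx = msg[protein_idx].expand(tokens.size(0), tokens.size(1), -1)
        return self.fuse(torch.cat([tokens, ctx], dim=-1))   # inject protein-level context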
Abstract: Multivariate time series forecasting is a pivotal task in several domains, including financial planning, medical diagnostics, and climate science. This paper presents the Neural Fourier Transform (NFT) algorithm, which combines multi-dimensional Fourier transforms with Temporal Convolutional Network layers to improve both the accuracy and the interpretability of forecasts. The Neural Fourier Transform is empirically validated on fourteen diverse datasets, showing superior performance across multiple forecasting horizons and lookback windows and setting new benchmarks in the field. This work advances multivariate time series forecasting by providing a model that is both interpretable and highly predictive, making it a valuable tool for practitioners and researchers alike. The code for this study is publicly available.
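To make the frequency-domain idea concrete, here is a minimal sketch of one possible building block: the series is mapped to the frequency domain, filtered with learned complex weights, and mapped back; stacking such blocks with dilated temporal convolutions would approximate the described architecture. All names and the exact parameterization are assumptions.

import torch
import torch.nn as nn

class FourierBlock(nn.Module):
    def __init__(self, seq_len, channels):
        super().__init__()
        n_freq = seq_len // 2 + 1  # number of rFFT frequency bins
        self.weight = nn.Parameter(0.02 * torch.randn(channels, n_freq, dtype=torch.cfloat))

    def forward(self, x):                    # x: (batch, channels, seq_len)
        spec = torch.fft.rfft(x, dim=-1)     # transform each series to the frequency domain
        spec = spec * self.weight            # learned per-channel, per-frequency filter
        return torch.fft.irfft(spec, n=x.size(-1), dim=-1)  # back to the time domain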
Abstract: Consumer demand forecasting is of high importance for many e-commerce applications, including supply chain optimization, advertisement placement, and delivery speed optimization. However, reliable time series sales forecasting for e-commerce is difficult, especially during periods with many anomalies, as often happens during pandemics, abnormal weather, or sports events. Although many time series algorithms have been applied to the task, prediction during anomalies remains a challenge. In this work, we hypothesize that leveraging external knowledge found in world events can help overcome the challenge of prediction under anomalies. We mine a large repository of 40 years of world events and their textual representations. We then present a novel transformer-based methodology to construct an embedding of a day from the relations among that day's events. These embeddings are used to forecast future consumer behavior. We empirically evaluate the methods on a large e-commerce product sales dataset extracted from eBay, one of the world's largest online marketplaces. We show, across numerous product categories, that our method outperforms state-of-the-art baselines during anomalies.
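As an illustration of the pipeline's first step, a day could be embedded by encoding the day's event texts and pooling them; the paper's method is relation-aware and transformer-based, so the mean pooling and the encoder choice below are simplifying assumptions.

from sentence_transformers import SentenceTransformer  # any text encoder would do

def day_embedding(event_texts, encoder):
    # event_texts: textual descriptions of the events that occurred on one day
    vecs = encoder.encode(event_texts)  # (n_events, dim) event embeddings
    return vecs.mean(axis=0)            # pooled day vector, fed to the forecaster

# encoder = SentenceTransformer("all-MiniLM-L6-v2")
# x_day = day_embedding(["Olympic games open in ...", "Severe storm hits ..."], encoder)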
Abstract: Mitigating social biases typically requires identifying the social groups associated with each data sample. In this paper, we present DAFair, a novel approach to addressing social bias in language models. Unlike traditional methods that rely on explicit demographic labels, our approach requires no such information. Instead, we leverage predefined prototypical demographic texts and incorporate a regularization term during fine-tuning to mitigate bias in the model's representations. Our empirical results across two tasks and two models demonstrate the effectiveness of our method compared to previous approaches that do not rely on labeled data. Moreover, when only limited demographically annotated data is available, our approach outperforms common debiasing approaches.
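A hedged sketch of such a regularizer: a sample's representation is pushed to be equally similar to every prototypical demographic text embedding, so that it carries no group signal. DAFair's exact loss may differ; dafair_style_reg and the KL-to-uniform choice are illustrative.

import torch
import torch.nn.functional as F

def dafair_style_reg(h, prototypes):
    # h: (batch, dim) sample representations
    # prototypes: (groups, dim) embeddings of prototypical demographic texts
    sims = h @ prototypes.t()                    # similarity to each group prototype
    log_p = F.log_softmax(sims, dim=-1)
    uniform = torch.full_like(log_p, 1.0 / prototypes.size(0))
    return F.kl_div(log_p, uniform, reduction="batchmean")  # push similarities toward uniform

# total_loss = task_loss + lam * dafair_style_reg(h, prototypes)  # lam: hyperparameter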
Abstract: Natural language processing models tend to learn and encode social biases present in the data. One popular approach for addressing such biases is to eliminate the encoded information from the model's representations. However, current methods are restricted to removing only linearly encoded information. In this work, we propose Iterative Gradient-Based Projection (IGBP), a novel method for removing non-linearly encoded concepts from neural representations. Our method iteratively trains neural classifiers to predict a particular attribute we seek to eliminate, then projects the representations onto a hypersurface such that the classifiers become oblivious to the target attribute. We evaluate the effectiveness of our method on removing gender and race information as sensitive attributes. Our results demonstrate that IGBP is effective in mitigating bias under both intrinsic and extrinsic evaluations, with minimal impact on downstream task accuracy.
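A minimal sketch of one projection step, under the assumption that moving each point to the classifier's decision surface along the loss gradient is an adequate stand-in for the paper's projection; igbp_step and the inner gradient loop are illustrative.

import torch

def igbp_step(X, clf, lr=0.1, steps=50):
    # X: (n, dim) representations; clf: classifier trained to predict the attribute
    X = X.clone().requires_grad_(True)
    for _ in range(steps):
        logits = clf(X).squeeze(-1)
        loss = logits.pow(2).mean()          # drive logits to 0, i.e., to the decision surface
        grad, = torch.autograd.grad(loss, X)
        X = (X - lr * grad).detach().requires_grad_(True)
    return X.detach()

# Outer loop: retrain clf on the projected X and repeat until the attribute's
# prediction accuracy drops to chance.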
Abstract: Temporal graph neural networks (temporal GNNs) have been widely researched, reaching state-of-the-art results on multiple prediction tasks. A common approach in most previous works is to apply a layer that aggregates information from the historical neighbors of a node. Taking a different research direction, we propose tBDFS, a novel temporal GNN architecture. tBDFS applies a layer that efficiently aggregates information from temporal paths to a given (target) node in the graph. For each node, the aggregation is applied in two stages: (1) a single representation is learned for each temporal path ending in that node, and (2) all path representations are aggregated into a final node representation. Overall, our goal is not to add new information to a node, but rather to observe the same information from a new perspective, allowing the model to directly capture patterns that are path-oriented rather than neighborhood-oriented. This can be thought of as a Depth-First Search (DFS) traversal over the temporal graph, in contrast to the popular Breadth-First Search (BFS) traversal applied in previous works. We evaluate tBDFS on multiple link prediction tasks and show its favorable performance compared to state-of-the-art baselines. To the best of our knowledge, we are the first to apply a temporal-DFS neural network.
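A sketch of the two-stage aggregation, assuming temporal paths are pre-extracted and padded to a fixed length; the GRU path encoder and attention pooling below are illustrative choices, not necessarily tBDFS's exact layers.

import torch
import torch.nn as nn

class PathAggregator(nn.Module):
    def __init__(self, dim):
        super().__init__()
        self.path_rnn = nn.GRU(dim, dim, batch_first=True)  # stage 1: encode each temporal path
        self.attn = nn.Linear(dim, 1)                       # stage 2: weight the paths

    def forward(self, paths):                   # paths: (n_paths, path_len, dim), time-ordered
        _, h = self.path_rnn(paths)             # h: (1, n_paths, dim) final state per path
        h = h.squeeze(0)
        w = torch.softmax(self.attn(h), dim=0)  # attention weight per path
        return (w * h).sum(dim=0)               # final target-node representation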
Abstract: Many texts, especially in chemistry and biology, describe complex processes. We focus on texts that describe a chemical reaction process and on questions that ask about the process's outcome under different environmental conditions. To answer questions about such processes, one needs to understand the interactions between the different entities involved and to simulate their state transitions during process execution under different conditions, where a state transition is the modification a program makes to its variables during execution. We hypothesize that generating code and executing it to simulate the process allows answering such questions. We therefore define a domain-specific language (DSL) to represent processes. We contribute to the community a unique dataset curated by chemists and annotated by computer scientists, composed of process texts, simulation questions, and their corresponding computer codes represented in the DSL. We propose a neural program synthesis approach based on reinforcement learning with a novel state-transition semantic reward, which is based on the run-time semantic similarity between the predicted code and the reference code. This allows simulating complex process transitions and thus answering simulation questions. Our approach yields a significant boost in accuracy on simulation questions: 88% accuracy, as opposed to 83% for state-of-the-art neural program synthesis approaches and 54% for state-of-the-art end-to-end text-based approaches.
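To illustrate the reward, here is a toy version for straight-line programs: both codes are executed statement by statement, and the reward is the fraction of steps at which their variable states agree. Real DSL programs with control flow would need a proper interpreter; this is a simplification.

def semantic_reward(pred_lines, ref_lines):
    env_p, env_r, matches, total = {}, {}, 0, 0
    for lp, lr in zip(pred_lines, ref_lines):
        try:
            exec(lp, {}, env_p)   # run one predicted statement (may fail)
        except Exception:
            pass
        exec(lr, {}, env_r)       # run the corresponding reference statement
        total += 1
        matches += int(env_p == env_r)  # compare the full variable states
    return matches / max(total, 1)

# semantic_reward(["x = 2", "y = x * 3"], ["x = 2", "y = x + x + x"])  ->  1.0
# even though the two programs are textually different.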
Abstract: Pretrained language models based on the transformer architecture have shown great success in NLP. Textual training data often comes from the web and is thus tagged with time-specific information, but most language models ignore this information: they are trained on the text alone, limiting their ability to generalize temporally. In this work, we extend the key component of the transformer architecture, the self-attention mechanism, and propose temporal attention, a time-aware self-attention mechanism. Temporal attention can be applied to any transformer model and requires the input texts to be accompanied by their relevant time points. It allows the transformer to capture this temporal information and create time-specific contextualized word representations. We leverage these representations for the task of semantic change detection: we apply our proposed mechanism to BERT and experiment on three datasets in different languages (English, German, and Latin) that also vary in time span, size, and genre. Our proposed model achieves state-of-the-art results on all three datasets.
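A minimal sketch of a time-aware self-attention layer: each token carries a discrete time point whose embedding adds a time-time interaction term to the attention scores. The exact formulation in the paper may differ; TemporalSelfAttention and the additive score term are assumptions.

import math
import torch
import torch.nn as nn

class TemporalSelfAttention(nn.Module):
    def __init__(self, dim, n_times):
        super().__init__()
        self.q, self.k, self.v = (nn.Linear(dim, dim) for _ in range(3))
        self.time_emb = nn.Embedding(n_times, dim)  # one embedding per time point
        self.scale = 1 / math.sqrt(dim)

    def forward(self, x, t):                 # x: (batch, seq, dim); t: (batch, seq) time indices
        q, k, v = self.q(x), self.k(x), self.v(x)
        te = self.time_emb(t)                # time embedding of each token
        scores = (q @ k.transpose(-2, -1) + te @ te.transpose(-2, -1)) * self.scale
        return torch.softmax(scores, dim=-1) @ v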
Abstract: Our world is constantly evolving, and so is the content on the web. Consequently, our languages, often said to mirror the world, are dynamic in nature. However, most current contextual language models are static and cannot adapt to changes over time. In this work, we propose a temporal contextual language model called TempoBERT, which uses time as an additional context of texts. Our technique is based on modifying texts with temporal information and performing time masking, i.e., masking specific to the supplementary time information. We leverage our approach for the tasks of semantic change detection and sentence time prediction, experimenting on datasets that are diverse in time, size, genre, and language. Our extensive evaluation shows that both tasks benefit from exploiting time masking.
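A sketch of the preprocessing this describes: a time token is prepended to each training text and is itself occasionally masked, so the model learns both time-conditioned language and time prediction. The token format and masking probability are illustrative.

import random

def add_time_token(text, year, mask_prob=0.15, mask_token="[MASK]"):
    time_token = f"<{year}>"       # supplementary time information
    if random.random() < mask_prob:
        time_token = mask_token    # "time masking": hide the time, predict it from the text
    return f"{time_token} {text}"

# add_time_token("the automobile was a novelty", 1910)  ->  "<1910> the automobile was a novelty"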