Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Rishita Anubhai

BioBridge: Bridging Biomedical Foundation Models via Knowledge Graph

Oct 12, 2023

Zifeng Wang, Zichen Wang, Balasubramaniam Srinivasan, Vassilis N. Ioannidis, Huzefa Rangwala, Rishita Anubhai

Abstract:Foundation models (FMs) are able to leverage large volumes of unlabeled data to demonstrate superior performance across a wide range of tasks. However, FMs developed for biomedical domains have largely remained unimodal, i.e., independently trained and used for tasks on protein sequences alone, small molecule structures alone, or clinical data alone. To overcome this limitation of biomedical FMs, we present BioBridge, a novel parameter-efficient learning framework, to bridge independently trained unimodal FMs to establish multimodal behavior. BioBridge achieves it by utilizing Knowledge Graphs (KG) to learn transformations between one unimodal FM and another without fine-tuning any underlying unimodal FMs. Our empirical results demonstrate that BioBridge can beat the best baseline KG embedding methods (on average by around 76.3%) in cross-modal retrieval tasks. We also identify BioBridge demonstrates out-of-domain generalization ability by extrapolating to unseen modalities or relations. Additionally, we also show that BioBridge presents itself as a general purpose retriever that can aid biomedical multimodal question answering as well as enhance the guided generation of novel drugs.

* this paper needs further internal review for being published

Via

Access Paper or Ask Questions

Instruction Tuning for Few-Shot Aspect-Based Sentiment Analysis

Oct 12, 2022

Siddharth Varia, Shuai Wang, Kishaloy Halder, Robert Vacareanu, Miguel Ballesteros, Yassine Benajiba, Neha Anna John, Rishita Anubhai, Smaranda Muresan, Dan Roth

Figure 1 for Instruction Tuning for Few-Shot Aspect-Based Sentiment Analysis

Figure 2 for Instruction Tuning for Few-Shot Aspect-Based Sentiment Analysis

Figure 3 for Instruction Tuning for Few-Shot Aspect-Based Sentiment Analysis

Figure 4 for Instruction Tuning for Few-Shot Aspect-Based Sentiment Analysis

Abstract:Aspect-based Sentiment Analysis (ABSA) is a fine-grained sentiment analysis task which involves four elements from user-generated texts: aspect term, aspect category, opinion term, and sentiment polarity. Most computational approaches focus on some of the ABSA sub-tasks such as tuple (aspect term, sentiment polarity) or triplet (aspect term, opinion term, sentiment polarity) extraction using either pipeline or joint modeling approaches. Recently, generative approaches have been proposed to extract all four elements as (one or more) quadruplets from text as a single task. In this work, we take a step further and propose a unified framework for solving ABSA, and the associated sub-tasks to improve the performance in few-shot scenarios. To this end, we fine-tune a T5 model with instructional prompts in a multi-task learning fashion covering all the sub-tasks, as well as the entire quadruple prediction task. In experiments with multiple benchmark data sets, we show that the proposed multi-task prompting approach brings performance boost (by absolute $6.75$ F1) in the few-shot learning setting.

Via

Access Paper or Ask Questions

Label Semantics for Few Shot Named Entity Recognition

Mar 16, 2022

Jie Ma, Miguel Ballesteros, Srikanth Doss, Rishita Anubhai, Sunil Mallya, Yaser Al-Onaizan, Dan Roth

Figure 1 for Label Semantics for Few Shot Named Entity Recognition

Figure 2 for Label Semantics for Few Shot Named Entity Recognition

Figure 3 for Label Semantics for Few Shot Named Entity Recognition

Figure 4 for Label Semantics for Few Shot Named Entity Recognition

Abstract:We study the problem of few shot learning for named entity recognition. Specifically, we leverage the semantic information in the names of the labels as a way of giving the model additional signal and enriched priors. We propose a neural architecture that consists of two BERT encoders, one to encode the document and its tokens and another one to encode each of the labels in natural language format. Our model learns to match the representations of named entities computed by the first encoder with label representations computed by the second encoder. The label semantics signal is shown to support improved state-of-the-art results in multiple few shot NER benchmarks and on-par performance in standard benchmarks. Our model is especially effective in low resource settings.

* Findings of ACL 2022

Via

Access Paper or Ask Questions

Multi-Task Learning and Adapted Knowledge Models for Emotion-Cause Extraction

Jun 17, 2021

Elsbeth Turcan, Shuai Wang, Rishita Anubhai, Kasturi Bhattacharjee, Yaser Al-Onaizan, Smaranda Muresan

Figure 1 for Multi-Task Learning and Adapted Knowledge Models for Emotion-Cause Extraction

Figure 2 for Multi-Task Learning and Adapted Knowledge Models for Emotion-Cause Extraction

Figure 3 for Multi-Task Learning and Adapted Knowledge Models for Emotion-Cause Extraction

Figure 4 for Multi-Task Learning and Adapted Knowledge Models for Emotion-Cause Extraction

Abstract:Detecting what emotions are expressed in text is a well-studied problem in natural language processing. However, research on finer grained emotion analysis such as what causes an emotion is still in its infancy. We present solutions that tackle both emotion recognition and emotion cause detection in a joint fashion. Considering that common-sense knowledge plays an important role in understanding implicitly expressed emotions and the reasons for those emotions, we propose novel methods that combine common-sense knowledge via adapted knowledge models with multi-task learning to perform joint emotion classification and emotion cause tagging. We show performance improvement on both tasks when including common-sense reasoning and a multitask framework. We provide a thorough analysis to gain insights into model performance.

* 15 pages, 6 figures. Findings of ACL 2021

Via

Access Paper or Ask Questions

Structured Prediction as Translation between Augmented Natural Languages

Jan 28, 2021

Giovanni Paolini, Ben Athiwaratkun, Jason Krone, Jie Ma, Alessandro Achille, Rishita Anubhai, Cicero Nogueira dos Santos, Bing Xiang, Stefano Soatto

Figure 1 for Structured Prediction as Translation between Augmented Natural Languages

Figure 2 for Structured Prediction as Translation between Augmented Natural Languages

Figure 3 for Structured Prediction as Translation between Augmented Natural Languages

Figure 4 for Structured Prediction as Translation between Augmented Natural Languages

Abstract:We propose a new framework, Translation between Augmented Natural Languages (TANL), to solve many structured prediction language tasks including joint entity and relation extraction, nested named entity recognition, relation classification, semantic role labeling, event extraction, coreference resolution, and dialogue state tracking. Instead of tackling the problem by training task-specific discriminative classifiers, we frame it as a translation task between augmented natural languages, from which the task-relevant information can be easily extracted. Our approach can match or outperform task-specific models on all tasks, and in particular, achieves new state-of-the-art results on joint entity and relation extraction (CoNLL04, ADE, NYT, and ACE2005 datasets), relation classification (FewRel and TACRED), and semantic role labeling (CoNLL-2005 and CoNLL-2012). We accomplish this while using the same architecture and hyperparameters for all tasks and even when training a single model to solve all tasks at the same time (multi-task learning). Finally, we show that our framework can also significantly improve the performance in a low-resource regime, thanks to better use of label semantics.

Via

Access Paper or Ask Questions

To BERT or Not to BERT: Comparing Task-specific and Task-agnostic Semi-Supervised Approaches for Sequence Tagging

Oct 27, 2020

Kasturi Bhattacharjee, Miguel Ballesteros, Rishita Anubhai, Smaranda Muresan, Jie Ma, Faisal Ladhak, Yaser Al-Onaizan

Figure 1 for To BERT or Not to BERT: Comparing Task-specific and Task-agnostic Semi-Supervised Approaches for Sequence Tagging

Figure 2 for To BERT or Not to BERT: Comparing Task-specific and Task-agnostic Semi-Supervised Approaches for Sequence Tagging

Figure 3 for To BERT or Not to BERT: Comparing Task-specific and Task-agnostic Semi-Supervised Approaches for Sequence Tagging

Figure 4 for To BERT or Not to BERT: Comparing Task-specific and Task-agnostic Semi-Supervised Approaches for Sequence Tagging

Abstract:Leveraging large amounts of unlabeled data using Transformer-like architectures, like BERT, has gained popularity in recent times owing to their effectiveness in learning general representations that can then be further fine-tuned for downstream tasks to much success. However, training these models can be costly both from an economic and environmental standpoint. In this work, we investigate how to effectively use unlabeled data: by exploring the task-specific semi-supervised approach, Cross-View Training (CVT) and comparing it with task-agnostic BERT in multiple settings that include domain and task relevant English data. CVT uses a much lighter model architecture and we show that it achieves similar performance to BERT on a set of sequence tagging tasks, with lesser financial and environmental impact.

* Accepted in the Proceedings of 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP 2020)(https://2020.emnlp.org/papers/main)

Via

Access Paper or Ask Questions

Resource-Enhanced Neural Model for Event Argument Extraction

Oct 06, 2020

Jie Ma, Shuai Wang, Rishita Anubhai, Miguel Ballesteros, Yaser Al-Onaizan

Figure 1 for Resource-Enhanced Neural Model for Event Argument Extraction

Figure 2 for Resource-Enhanced Neural Model for Event Argument Extraction

Figure 3 for Resource-Enhanced Neural Model for Event Argument Extraction

Figure 4 for Resource-Enhanced Neural Model for Event Argument Extraction

Abstract:Event argument extraction (EAE) aims to identify the arguments of an event and classify the roles that those arguments play. Despite great efforts made in prior work, there remain many challenges: (1) Data scarcity. (2) Capturing the long-range dependency, specifically, the connection between an event trigger and a distant event argument. (3) Integrating event trigger information into candidate argument representation. For (1), we explore using unlabeled data in different ways. For (2), we propose to use a syntax-attending Transformer that can utilize dependency parses to guide the attention mechanism. For (3), we propose a trigger-aware sequence encoder with several types of trigger-dependent sequence representations. We also support argument extraction either from text annotated with gold entities or from plain text. Experiments on the English ACE2005 benchmark show that our approach achieves a new state-of-the-art.

* Findings of EMNLP 2020

Via

Access Paper or Ask Questions

Severing the Edge Between Before and After: Neural Architectures for Temporal Ordering of Events

Apr 08, 2020

Miguel Ballesteros, Rishita Anubhai, Shuai Wang, Nima Pourdamghani, Yogarshi Vyas, Jie Ma, Parminder Bhatia, Kathleen McKeown, Yaser Al-Onaizan

Figure 1 for Severing the Edge Between Before and After: Neural Architectures for Temporal Ordering of Events

Abstract:In this paper, we propose a neural architecture and a set of training methods for ordering events by predicting temporal relations. Our proposed models receive a pair of events within a span of text as input and they identify temporal relations (Before, After, Equal, Vague) between them. Given that a key challenge with this task is the scarcity of annotated data, our models rely on either pretrained representations (i.e. RoBERTa, BERT or ELMo), transfer and multi-task learning (by leveraging complementary datasets), and self-training techniques. Experiments on the MATRES dataset of English documents establish a new state-of-the-art on this task.

Via

Access Paper or Ask Questions

Deep Speech 2: End-to-End Speech Recognition in English and Mandarin

Dec 08, 2015

Dario Amodei, Rishita Anubhai, Eric Battenberg, Carl Case, Jared Casper, Bryan Catanzaro, Jingdong Chen, Mike Chrzanowski, Adam Coates, Greg Diamos(+24 more)

Figure 1 for Deep Speech 2: End-to-End Speech Recognition in English and Mandarin

Figure 2 for Deep Speech 2: End-to-End Speech Recognition in English and Mandarin

Figure 3 for Deep Speech 2: End-to-End Speech Recognition in English and Mandarin

Figure 4 for Deep Speech 2: End-to-End Speech Recognition in English and Mandarin

Abstract:We show that an end-to-end deep learning approach can be used to recognize either English or Mandarin Chinese speech--two vastly different languages. Because it replaces entire pipelines of hand-engineered components with neural networks, end-to-end learning allows us to handle a diverse variety of speech including noisy environments, accents and different languages. Key to our approach is our application of HPC techniques, resulting in a 7x speedup over our previous system. Because of this efficiency, experiments that previously took weeks now run in days. This enables us to iterate more quickly to identify superior architectures and algorithms. As a result, in several cases, our system is competitive with the transcription of human workers when benchmarked on standard datasets. Finally, using a technique called Batch Dispatch with GPUs in the data center, we show that our system can be inexpensively deployed in an online setting, delivering low latency when serving users at scale.

Via

Access Paper or Ask Questions