Abstract: Various robustness evaluation methodologies from different perspectives have been proposed for different natural language processing (NLP) tasks. These methods have often focused on either universal or task-specific generalization capabilities. In this work, we propose a multilingual robustness evaluation platform for NLP tasks (TextFlint) that incorporates universal text transformation, task-specific transformation, adversarial attack, subpopulation, and their combinations to provide comprehensive robustness analysis. TextFlint enables practitioners to automatically evaluate their models from all aspects or to customize their evaluations as desired with just a few lines of code. To guarantee user acceptability, all the text transformations are linguistically based, and we provide a human evaluation for each one. TextFlint generates complete analytical reports as well as targeted augmented data to address the shortcomings of the model's robustness. To validate TextFlint's utility, we performed large-scale empirical evaluations (over 67,000 evaluations) on state-of-the-art deep learning models, classic supervised methods, and real-world systems. Almost all models showed significant performance degradation, including a decline of more than 50% in BERT's prediction accuracy on tasks such as aspect-level sentiment classification, named entity recognition, and natural language inference. Therefore, we call for robustness to be included in model evaluation, so as to promote the healthy development of NLP technology.
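The evaluation idea described above (apply a text transformation, then compare model accuracy before and after) can be illustrated with a minimal sketch. This is not TextFlint's API: the names `swap_typo`, `keyword_model`, and `robustness_report`, as well as the toy data, are hypothetical and chosen only for illustration.

```python
from typing import Callable, Dict, List, Tuple

Sample = Tuple[str, int]  # (text, label); hypothetical data format


def accuracy(model: Callable[[str], int], data: List[Sample]) -> float:
    """Fraction of samples for which the model predicts the gold label."""
    return sum(model(text) == label for text, label in data) / len(data)


def swap_typo(text: str) -> str:
    """Hypothetical transformation: swap the first two characters of every
    word longer than three characters, imitating a typing error."""
    words = []
    for w in text.split():
        words.append(w[1] + w[0] + w[2:] if len(w) > 3 else w)
    return " ".join(words)


def keyword_model(text: str) -> int:
    """Toy sentiment classifier (assumption): 1 if a positive keyword occurs."""
    tokens = text.lower().split()
    return int(any(k in tokens for k in ("good", "great")))


def robustness_report(
    model: Callable[[str], int],
    data: List[Sample],
    transform: Callable[[str], str],
) -> Dict[str, float]:
    """Compare accuracy on the original data and on the transformed data."""
    original = accuracy(model, data)
    transformed = accuracy(model, [(transform(t), y) for t, y in data])
    return {
        "original": original,
        "transformed": transformed,
        "drop": original - transformed,
    }


data = [
    ("a good movie", 1),
    ("great acting", 1),
    ("a dull plot", 0),
    ("bad pacing", 0),
]
report = robustness_report(keyword_model, data, swap_typo)
print(report)
```

In this toy setting the keyword classifier is perfect on the original sentences but misses both positive examples once the typo transformation is applied, which is the kind of degradation a robustness report is meant to expose.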
Abstract: Recent deep generative models for static graphs, now under active development, have achieved significant success in areas such as molecule design. However, many real-world problems involve temporal graphs whose topology and attribute values evolve dynamically over time, including important applications such as protein folding, human mobility networks, and social network growth. Deep generative models for temporal graphs are not yet well understood, and existing techniques for static graphs are not adequate for temporal graphs since they cannot 1) encode and decode continuously-varying graph topology chronologically, 2) enforce validity via temporal constraints, or 3) ensure efficiency for information-lossless temporal resolution. To address these challenges, we propose a new model, called "Temporal Graph Generative Adversarial Network" (TG-GAN), for continuous-time temporal graph generation, by modeling the deep generative process for truncated temporal random walks and their compositions. Specifically, we first propose a novel temporal graph generator that jointly models truncated edge sequences, time budgets, and node attributes, with novel activation functions that enforce temporal validity constraints under a recurrent architecture. In addition, a new temporal graph discriminator is proposed, which combines time and node encoding operations over a recurrent architecture to distinguish the generated sequences from the real ones sampled by a newly developed truncated temporal random walk sampler. Extensive experiments on both synthetic and real-world datasets demonstrate that TG-GAN significantly outperforms the comparison methods in efficiency and effectiveness.
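To make the notion of a truncated temporal random walk concrete, the following is a minimal sketch of one possible sampler. It is not the sampler from the paper: it assumes a temporal graph given as directed `(source, target, timestamp)` edges, strictly increasing timestamps along a walk, a maximum walk length `max_len`, and a time horizon `t_end`. The function name and all parameters are hypothetical.

```python
import random
from typing import List, Tuple

Edge = Tuple[int, int, float]  # (source, target, timestamp); assumed format


def sample_truncated_temporal_walk(
    edges: List[Edge], max_len: int, t_end: float, rng: random.Random
) -> List[Edge]:
    """Sample a walk whose edges are connected head-to-tail, have strictly
    increasing timestamps, stay within t_end, and number at most max_len."""
    valid = [e for e in edges if e[2] <= t_end]
    if not valid or max_len <= 0:
        return []
    walk = [rng.choice(valid)]  # start from a uniformly chosen edge
    while len(walk) < max_len:
        _, node, t = walk[-1]
        # Temporal validity: continue only along edges that leave the
        # current node at a later time, within the remaining time budget.
        candidates = [e for e in valid if e[0] == node and e[2] > t]
        if not candidates:
            break  # the walk is truncated early
        walk.append(rng.choice(candidates))
    return walk


edges = [(0, 1, 0.1), (1, 2, 0.3), (2, 0, 0.5), (1, 3, 0.2), (3, 1, 0.9)]
walk = sample_truncated_temporal_walk(edges, max_len=3, t_end=1.0, rng=random.Random(0))
print(walk)
```

Walks of this kind could serve as the "real" sequences shown to a discriminator, while a generator would have to produce edge sequences that satisfy the same ordering and budget constraints.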