Rui Fang

Dual Adversarial Alignment for Realistic Support-Query Shift Few-shot Learning

Sep 05, 2023

Siyang Jiang, Rui Fang, Hsi-Wen Chen, Wei Ding, Ming-Syan Chen

Abstract: Support-query shift few-shot learning aims to classify unseen examples (the query set) against labeled data (the support set) based on a learned embedding in a low-dimensional space, under a distribution shift between the support set and the query set. However, in real-world scenarios the shifts are usually unknown and varied, making them difficult to estimate in advance. Therefore, in this paper, we propose a novel but more difficult challenge, RSQS, focusing on Realistic Support-Query Shift few-shot learning. The key feature of RSQS is that the individual samples in each meta-task are subjected to multiple distribution shifts. In addition, we propose a unified adversarial feature alignment method called DUal adversarial ALignment framework (DuaL) to relieve RSQS from two aspects, i.e., inter-domain bias and intra-domain variance. On the one hand, for the inter-domain bias, we corrupt the original data in advance and use the synthesized perturbed inputs to train the repairer network by minimizing distance at the feature level. On the other hand, for the intra-domain variance, we propose a generator network to synthesize hard, i.e., less similar, examples from the support set in a self-supervised manner and introduce regularized optimal transportation to derive a smooth optimal transportation plan. Lastly, a benchmark of RSQS is built with several state-of-the-art baselines on three datasets (CIFAR100, mini-ImageNet, and Tiered-ImageNet). Experimental results show that DuaL significantly outperforms the state-of-the-art methods in our benchmark.

* Best student paper in PAKDD 2022
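
The "regularized optimal transportation" mentioned in the abstract above is commonly realized with entropic regularization and Sinkhorn iterations. The sketch below is a generic illustration under that assumption, not the DuaL implementation: the function name `sinkhorn_plan`, the uniform marginals, and the parameters `reg` and `n_iters` are all hypothetical choices made for this example.

```python
import math


def sinkhorn_plan(cost, reg=0.1, n_iters=200):
    """Entropy-regularized transport plan between two uniform distributions.

    cost[i][j] is the cost of moving mass from support item i to query item j.
    Assumption: both marginals are uniform; the paper may use something else.
    """
    n, m = len(cost), len(cost[0])
    a = [1.0 / n] * n  # mass on each support item
    b = [1.0 / m] * m  # mass on each query item
    # Gibbs kernel: a larger reg gives a smoother (more spread-out) plan.
    kernel = [[math.exp(-c / reg) for c in row] for row in cost]
    u = [1.0] * n
    v = [1.0] * m
    for _ in range(n_iters):
        # Alternately rescale rows and columns to match the marginals.
        u = [a[i] / sum(kernel[i][j] * v[j] for j in range(m)) for i in range(n)]
        v = [b[j] / sum(kernel[i][j] * u[i] for i in range(n)) for j in range(m)]
    return [[u[i] * kernel[i][j] * v[j] for j in range(m)] for i in range(n)]


if __name__ == "__main__":
    # Two support items and two query items; matching indices are cheap to pair.
    plan = sinkhorn_plan([[0.0, 1.0], [1.0, 0.0]], reg=0.5)
    for row in plan:
        print([round(p, 4) for p in row])
```

Each row and column of the returned plan sums to the corresponding marginal, and most of the mass sits on the low-cost pairs. Increasing `reg` spreads the mass more evenly, which is one way to read "smooth" in this context.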

Filter Pruning via Filters Similarity in Consecutive Layers

Apr 26, 2023

Xiaorui Wang, Jun Wang, Xin Tang, Peng Gao, Rui Fang, Guotong Xie

Abstract: Filter pruning is widely adopted to compress and accelerate Convolutional Neural Networks (CNNs), but most previous works ignore the relationship between filters and channels in different layers. Processing each layer independently fails to utilize the collaborative relationship across layers. In this paper, we propose a novel pruning method that explicitly leverages the Filters Similarity in Consecutive Layers (FSCL). FSCL compresses models by pruning filters whose corresponding features are less valuable to the model. Extensive experiments demonstrate the effectiveness of FSCL, which yields remarkable improvement over the state of the art in accuracy, FLOPs, and parameter reduction on several benchmark models and datasets.

* Accepted by ICASSP 2023 (oral)

STAGE: Span Tagging and Greedy Inference Scheme for Aspect Sentiment Triplet Extraction

Nov 29, 2022

Shuo Liang, Wei Wei, Xian-Ling Mao, Yuanyuan Fu, Rui Fang, Dangyang Chen

Abstract: Aspect Sentiment Triplet Extraction (ASTE) has become an emerging task in sentiment analysis research, aiming to extract triplets of the aspect term, its corresponding opinion term, and its associated sentiment polarity from a given sentence. Recently, many neural-network-based models with different tagging schemes have been proposed, but almost all of them have limitations: they rely heavily on 1) the prior assumption that each word is associated with only a single role (e.g., aspect term or opinion term) and 2) word-level interactions, treating each opinion/aspect as a set of independent words. As a result, they perform poorly on complex ASTE cases, such as a word associated with multiple roles or an aspect/opinion term with multiple words. We therefore propose a novel approach, Span TAgging and Greedy infErence (STAGE), to extract sentiment triplets at the span level, where each span may consist of multiple words and play different roles simultaneously. To this end, this paper formulates the ASTE task as a multi-class span classification problem. Specifically, STAGE generates more accurate aspect sentiment triplet extractions by exploring span-level information and constraints, and consists of two components, namely, a span tagging scheme and a greedy inference strategy. The former tags all possible candidate spans based on a newly defined tagging set. The latter retrieves the aspect/opinion term with the maximum length from the candidate sentiment snippet to output sentiment triplets. Furthermore, we propose a simple but effective model based on STAGE, which outperforms the state of the art by a large margin on four widely used datasets. Moreover, STAGE can be easily generalized to other pair/triplet extraction tasks, which also demonstrates the superiority of the proposed scheme.

* Accepted by AAAI 2023

HCL-TAT: A Hybrid Contrastive Learning Method for Few-shot Event Detection with Task-Adaptive Threshold

Oct 17, 2022

Ruihan Zhang, Wei Wei, Xian-Ling Mao, Rui Fang, Dangyang Chen

Abstract: Conventional event detection models under supervised learning settings cannot transfer to newly emerged event types owing to a lack of sufficient annotations. A commonly adopted solution is to follow an identify-then-classify manner, which first identifies the triggers and then converts the classification task via a few-shot learning paradigm. However, these methods still fall far short of expectations due to: (i) insufficient learning of discriminative representations in low-resource scenarios, and (ii) trigger misidentification caused by the overlap of the learned representations of triggers and non-triggers. To address these problems, in this paper, we propose a novel Hybrid Contrastive Learning method with a Task-Adaptive Threshold (abbreviated as HCLTAT), which enables discriminative representation learning with a two-view contrastive loss (support-support and prototype-query), and devises an easily adapted threshold to alleviate misidentification of triggers. Extensive experiments on the benchmark dataset FewEvent demonstrate that our method achieves better results than the state of the art. All the code and data of this paper will be available for online public access.

* This paper has been accepted by Findings of EMNLP 2022
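
Contrastive objectives like the one named in the abstract above are often built on an InfoNCE-style loss, which pulls an anchor toward a positive example and pushes it away from negatives. The sketch below shows that generic loss only; it is not the HCL-TAT loss, and the names `cosine` and `info_nce`, the cosine similarity, and the `temperature` value are assumptions made for illustration.

```python
import math


def cosine(x, y):
    """Cosine similarity between two non-zero vectors given as lists."""
    dot = sum(p * q for p, q in zip(x, y))
    norm_x = math.sqrt(sum(p * p for p in x))
    norm_y = math.sqrt(sum(q * q for q in y))
    return dot / (norm_x * norm_y)


def info_nce(anchor, positive, negatives, temperature=0.1):
    """InfoNCE loss for one anchor: -log softmax score of the positive pair."""
    logits = [cosine(anchor, positive) / temperature]
    logits += [cosine(anchor, neg) / temperature for neg in negatives]
    # Log-sum-exp with the max subtracted for numerical stability.
    top = max(logits)
    log_denominator = top + math.log(sum(math.exp(l - top) for l in logits))
    return log_denominator - logits[0]


if __name__ == "__main__":
    # Hypothetical 2-d embeddings: the positive is aligned with the anchor.
    aligned = info_nce([1.0, 0.0], [1.0, 0.0], [[0.0, 1.0]])
    misaligned = info_nce([1.0, 0.0], [0.0, 1.0], [[1.0, 0.0]])
    print(aligned, misaligned)
```

The loss is small when the anchor is closer to its positive than to every negative and grows when a negative is closer. In a prototype-query view, the anchor could be a query embedding and the positive its class prototype, although how the paper pairs them is not specified here.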

Multi-level Contrastive Learning Framework for Sequential Recommendation

Aug 27, 2022

Ziyang Wang, Huoyu Liu, Wei Wei, Yue Hu, Xian-Ling Mao, Shaojian He, Rui Fang, Dangyang Chen

Abstract: Sequential recommendation (SR) aims to predict the subsequent behaviors of users by understanding their successive historical behaviors. Recently, some methods for SR have been devoted to alleviating the data sparsity problem (i.e., limited supervised signals for training) by using contrastive learning to incorporate self-supervised signals into SR. Despite their achievements, they are far from sufficient for learning informative user/item embeddings, due to the inadequate modeling of complex collaborative information and co-action information, such as user-item relations, user-user relations, and item-item relations. In this paper, we study the problem of SR and propose a novel multi-level contrastive learning framework for sequential recommendation, named MCLSR. Different from previous contrastive learning-based methods for SR, MCLSR learns the representations of users and items through a cross-view contrastive learning paradigm from four specific views at two different levels (i.e., interest- and feature-level). Specifically, the interest-level contrastive mechanism jointly learns the collaborative information with the sequential transition patterns, and the feature-level contrastive mechanism re-observes the relation between users and items by capturing the co-action information (i.e., co-occurrence). Extensive experiments on four real-world datasets show that the proposed MCLSR consistently outperforms the state-of-the-art methods.

* CIKM 2022

Improving Personality Consistency in Conversation by Persona Extending

Aug 23, 2022

Yifan Liu, Wei Wei, Jiayi Liu, Xianling Mao, Rui Fang, Dangyang Chen

Abstract: Endowing chatbots with a consistent personality plays a vital role for agents to deliver human-like interactions. However, existing personalized approaches commonly generate responses in light of static predefined personas depicted with textual descriptions, which may severely restrict the interactivity between the human and the chatbot, especially when the agent needs to answer a query not covered by the predefined personas. This is the so-called out-of-predefined persona problem (named OOP for simplicity). To alleviate the problem, in this paper we propose a novel retrieval-to-prediction paradigm consisting of two subcomponents, namely, (1) a Persona Retrieval Model (PRM), which retrieves a persona from a global collection based on a Natural Language Inference (NLI) model so that the inferred persona is consistent with the predefined personas; and (2) a Posterior-scored Transformer (PS-Transformer), which adopts a persona posterior distribution that further considers the actual personas used in the ground response, maximally mitigating the gap between training and inference. Furthermore, we present a dataset called IT-ConvAI2 that is the first to highlight the OOP problem in personalized dialogue. Extensive experiments on both IT-ConvAI2 and ConvAI2 demonstrate that our proposed model yields considerable improvements in both automatic metrics and human evaluations.

* Accepted by CIKM 2022

Improving Knowledge-aware Recommendation with Multi-level Interactive Contrastive Learning

Aug 22, 2022

Ding Zou, Wei Wei, Ziyang Wang, Xian-Ling Mao, Feida Zhu, Rui Fang, Dangyang Chen

Abstract: Incorporating Knowledge Graphs (KG) into recommender systems has attracted considerable attention. Recently, the technical trend of Knowledge-aware Recommendation (KGR) is to develop end-to-end models based on graph neural networks (GNNs). However, the extremely sparse user-item interactions significantly degrade the performance of GNN-based models, as: 1) sparse interactions mean inadequate supervision signals and limit supervised GNN-based models; 2) the combination of sparse interactions (CF part) and redundant KG facts (KG part) results in unbalanced information utilization. Besides, the GNN paradigm aggregates local neighbors for node representation learning while ignoring non-local KG facts, making the knowledge extraction insufficient. Inspired by the recent success of contrastive learning in mining supervised signals from the data itself, in this paper, we focus on exploring contrastive learning in KGR and propose a novel multi-level interactive contrastive learning mechanism. Different from traditional contrastive learning methods, which contrast nodes of two generated graph views, the interactive contrastive mechanism conducts layer-wise self-supervised learning by contrasting layers of different parts within graphs, which is also an "interaction" action. Specifically, we first construct local and non-local graphs for users/items in the KG, exploring more KG facts for KGR. Then an intra-graph level interactive contrastive learning is performed within each graph, which contrasts layers of the CF and KG parts for more consistent information leveraging. Besides, an inter-graph level interactive contrastive learning is performed between the local and non-local graphs to sufficiently and coherently extract non-local KG signals. Extensive experiments conducted on three benchmark datasets show the superior performance of our proposed method over the state of the art.

* Accepted to CIKM 2022

PASH at TREC 2021 Deep Learning Track: Generative Enhanced Model for Multi-stage Ranking

May 24, 2022

Yixuan Qiao, Hao Chen, Yongquan Lai, Jun Wang, Tuozhen Liu, Xianbin Ye, Rui Fang, Peng Gao, Wenfeng Xie, Guotong Xie

Abstract: This paper describes the PASH participation in the TREC 2021 Deep Learning Track. In the recall stage, we adopt a scheme combining sparse and dense retrieval methods. In the multi-stage ranking phase, point-wise and pair-wise ranking strategies are used one after another, based on a model continually pre-trained on general knowledge and document-level data. Compared to the TREC 2020 Deep Learning Track, we have additionally introduced the generative model T5 to further enhance the performance.

* TREC 2021
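
The abstract above does not say how the sparse and dense retrieval results are combined. One widely used option is reciprocal rank fusion (RRF), sketched below purely as an illustration. The function name `reciprocal_rank_fusion`, the document ids, and the constant `k=60` are assumptions for this example and are not taken from the PASH system.

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Merge several ranked lists of document ids into one ranking.

    Each document scores 1 / (k + rank) in every list that contains it,
    and the scores are summed across lists. Ties are broken by document id.
    """
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


if __name__ == "__main__":
    # Hypothetical candidate lists from a sparse and a dense retriever.
    sparse_hits = ["d1", "d2", "d3"]
    dense_hits = ["d3", "d1", "d4"]
    print(reciprocal_rank_fusion([sparse_hits, dense_hits]))
    # prints ['d1', 'd3', 'd2', 'd4']
```

Documents returned by both retrievers rise to the top, and the fused list can then be passed to a downstream ranking stage. RRF needs only ranks, so it avoids calibrating sparse and dense scores against each other.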

End-to-end speaker diarization with transformer

Dec 14, 2021

Yongquan Lai, Xin Tang, Yuanyuan Fu, Rui Fang

Abstract: Speaker diarization is connected to semantic segmentation in computer vision. Inspired by MaskFormer (Cheng et al., 2021), which treats semantic segmentation as a set-prediction problem, we propose an end-to-end approach to predict a set of targets consisting of binary masks, vocal activities and speaker vectors. Our model, which we coin DiFormer, is mainly based on a speaker encoder and a feature pyramid network (FPN) module to extract multi-scale speaker features, which are then fed into a transformer encoder-decoder to predict a set of diarization targets from learned query embeddings. To account for the temporal characteristics of speech signals, bidirectional LSTMs are inserted into the mask prediction module to improve temporal consistency. Our model handles an unknown number of speakers, speech overlaps, and vocal activity detection in a unified way. Experiments on multimedia and meeting datasets demonstrate the effectiveness of our approach.

* Submitted to ICASSP 2022

Visual-Semantic Transformer for Scene Text Recognition

Dec 02, 2021

Xin Tang, Yongquan Lai, Ying Liu, Yuanyuan Fu, Rui Fang

Abstract: Modeling semantic information is helpful for scene text recognition. In this work, we propose to model semantic and visual information jointly with a Visual-Semantic Transformer (VST). The VST first explicitly extracts primary semantic information from visual feature maps with a transformer module and a primary visual-semantic alignment module. The semantic information is then joined with the visual feature maps (viewed as a sequence) to form a pseudo multi-domain sequence combining visual and semantic information, which is subsequently fed into a transformer-based interaction module to enable learning of interactions between visual and semantic features. In this way, the visual features can be enhanced by the semantic information and vice versa. The enhanced visual features are further decoded by a secondary visual-semantic alignment module which shares weights with the primary one. Finally, the decoded visual features and the enhanced semantic features are jointly processed by the third transformer module to obtain the final text prediction. Experiments on seven public benchmarks, including regular and irregular text recognition datasets, verify the effectiveness of our proposed model, which reaches the state of the art on four of the seven benchmarks.
