Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Heng-Da Xu

A Distributed Collaborative Retrieval Framework Excelling in All Queries and Corpora based on Zero-shot Rank-Oriented Automatic Evaluation

Dec 16, 2024

Tian-Yi Che, Xian-Ling Mao, Chun Xu, Cheng-Xin Xin, Heng-Da Xu, Jin-Yu Liu, Heyan Huang

Figure 1 for A Distributed Collaborative Retrieval Framework Excelling in All Queries and Corpora based on Zero-shot Rank-Oriented Automatic Evaluation

Figure 2 for A Distributed Collaborative Retrieval Framework Excelling in All Queries and Corpora based on Zero-shot Rank-Oriented Automatic Evaluation

Figure 3 for A Distributed Collaborative Retrieval Framework Excelling in All Queries and Corpora based on Zero-shot Rank-Oriented Automatic Evaluation

Figure 4 for A Distributed Collaborative Retrieval Framework Excelling in All Queries and Corpora based on Zero-shot Rank-Oriented Automatic Evaluation

Abstract:Numerous retrieval models, including sparse, dense and llm-based methods, have demonstrated remarkable performance in predicting the relevance between queries and corpora. However, the preliminary effectiveness analysis experiments indicate that these models fail to achieve satisfactory performance on the majority of queries and corpora, revealing their effectiveness restricted to specific scenarios. Thus, to tackle this problem, we propose a novel Distributed Collaborative Retrieval Framework (DCRF), outperforming each single model across all queries and corpora. Specifically, the framework integrates various retrieval models into a unified system and dynamically selects the optimal results for each user's query. It can easily aggregate any retrieval model and expand to any application scenarios, illustrating its flexibility and scalability.Moreover, to reduce maintenance and training costs, we design four effective prompting strategies with large language models (LLMs) to evaluate the quality of ranks without reliance of labeled data. Extensive experiments demonstrate that proposed framework, combined with 8 efficient retrieval models, can achieve performance comparable to effective listwise methods like RankGPT and ListT5, while offering superior efficiency. Besides, DCRF surpasses all selected retrieval models on the most datasets, indicating the effectiveness of our prompting strategies on rank-oriented automatic evaluation.

Via

Access Paper or Ask Questions

Read, Listen, and See: Leveraging Multimodal Information Helps Chinese Spell Checking

May 26, 2021

Heng-Da Xu, Zhongli Li, Qingyu Zhou, Chao Li, Zizhen Wang, Yunbo Cao, Heyan Huang, Xian-Ling Mao

Figure 1 for Read, Listen, and See: Leveraging Multimodal Information Helps Chinese Spell Checking

Figure 2 for Read, Listen, and See: Leveraging Multimodal Information Helps Chinese Spell Checking

Figure 3 for Read, Listen, and See: Leveraging Multimodal Information Helps Chinese Spell Checking

Figure 4 for Read, Listen, and See: Leveraging Multimodal Information Helps Chinese Spell Checking

Abstract:Chinese Spell Checking (CSC) aims to detect and correct erroneous characters for user-generated text in the Chinese language. Most of the Chinese spelling errors are misused semantically, phonetically or graphically similar characters. Previous attempts noticed this phenomenon and try to use the similarity for this task. However, these methods use either heuristics or handcrafted confusion sets to predict the correct character. In this paper, we propose a Chinese spell checker called ReaLiSe, by directly leveraging the multimodal information of the Chinese characters. The ReaLiSe model tackles the CSC task by (1) capturing the semantic, phonetic and graphic information of the input characters, and (2) selectively mixing the information in these modalities to predict the correct output. Experiments on the SIGHAN benchmarks show that the proposed model outperforms strong baselines by a large margin.

* In ACL Findings 2021

Via

Access Paper or Ask Questions

Generating Informative Dialogue Responses with Keywords-Guided Networks

Jul 03, 2020

Heng-Da Xu, Xian-Ling Mao, Zewen Chi, Jing-Jing Zhu, Fanshu Sun, Heyan Huang

Figure 1 for Generating Informative Dialogue Responses with Keywords-Guided Networks

Figure 2 for Generating Informative Dialogue Responses with Keywords-Guided Networks

Figure 3 for Generating Informative Dialogue Responses with Keywords-Guided Networks

Figure 4 for Generating Informative Dialogue Responses with Keywords-Guided Networks

Abstract:Recently, open-domain dialogue systems have attracted growing attention. Most of them use the sequence-to-sequence (Seq2Seq) architecture to generate responses. However, traditional Seq2Seq-based open-domain dialogue models tend to generate generic and safe responses, which are less informative, unlike human responses. In this paper, we propose a simple but effective keywords-guided Sequence-to-Sequence model (KW-Seq2Seq) which uses keywords information as guidance to generate open-domain dialogue responses. Specifically, KW-Seq2Seq first uses a keywords decoder to predict some topic keywords, and then generates the final response under the guidance of them. Extensive experiments demonstrate that the KW-Seq2Seq model produces more informative, coherent and fluent responses, yielding substantive gain in both automatic and human evaluation metrics.

Via

Access Paper or Ask Questions

Complicated Table Structure Recognition

Aug 28, 2019

Zewen Chi, Heyan Huang, Heng-Da Xu, Houjin Yu, Wanxuan Yin, Xian-Ling Mao

Figure 1 for Complicated Table Structure Recognition

Figure 2 for Complicated Table Structure Recognition

Figure 3 for Complicated Table Structure Recognition

Figure 4 for Complicated Table Structure Recognition

Abstract:The task of table structure recognition aims to recognize the internal structure of a table, which is a key step to make machines understand tables. Currently, there are lots of studies on this task for different file formats such as ASCII text and HTML. It also attracts lots of attention to recognize the table structures in PDF files. However, it is hard for the existing methods to accurately recognize the structure of complicated tables in PDF files. The complicated tables contain spanning cells which occupy at least two columns or rows. To address the issue, we propose a novel graph neural network for recognizing the table structure in PDF files, named GraphTSR. Specifically, it takes table cells as input, and then recognizes the table structures by predicting relations among cells. Moreover, to evaluate the task better, we construct a large-scale table structure recognition dataset from scientific papers, named SciTSR, which contains 15,000 tables from PDF files and their corresponding structure labels. Extensive experiments demonstrate that our proposed model is highly effective for complicated tables and outperforms state-of-the-art baselines over a benchmark dataset and our new constructed dataset.

Via

Access Paper or Ask Questions