Honghong Zhao

SparkRA: A Retrieval-Augmented Knowledge Service System Based on Spark Large Language Model

Aug 13, 2024

Dayong Wu, Jiaqi Li, Baoxin Wang, Honghong Zhao, Siyuan Xue, Yanjie Yang, Zhijun Chang, Rui Zhang, Li Qian, Bo Wang(+3 more)

Abstract: Large language models (LLMs) have shown remarkable achievements across various language tasks. To enhance the performance of LLMs in scientific literature services, we developed the scientific literature LLM (SciLit-LLM) through pre-training and supervised fine-tuning on scientific literature, building upon the iFLYTEK Spark LLM. Furthermore, we present a knowledge service system, Spark Research Assistant (SparkRA), based on our SciLit-LLM. SparkRA is accessible online and provides three primary functions: literature investigation, paper reading, and academic writing. As of July 30, 2024, SparkRA has garnered over 50,000 registered users, with a total usage count exceeding 1.3 million.

Overview of CTC 2021: Chinese Text Correction for Native Speakers

Aug 11, 2022

Honghong Zhao, Baoxin Wang, Dayong Wu, Wanxiang Che, Zhigang Chen, Shijin Wang

Abstract: In this paper, we present an overview of CTC 2021, a Chinese text correction task for native speakers. We give detailed descriptions of the task definition and of the data used for training and evaluation. We also summarize the approaches investigated by the participants of this task. We hope the data sets collected and annotated for this task can facilitate and expedite future development in this research area. Therefore, the pseudo training data, gold-standard validation data, and the entire leaderboard are publicly available online at https://destwang.github.io/CTC2021-explorer/.

BERT-EMD: Many-to-Many Layer Mapping for BERT Compression with Earth Mover's Distance

Oct 13, 2020

Jianquan Li, Xiaokang Liu, Honghong Zhao, Ruifeng Xu, Min Yang, Yaohong Jin

Abstract: Pre-trained language models (e.g., BERT) have achieved significant success in various natural language processing (NLP) tasks. However, high storage and computational costs prevent pre-trained language models from being effectively deployed on resource-constrained devices. In this paper, we propose a novel BERT distillation method based on many-to-many layer mapping, which allows each intermediate student layer to learn from any intermediate teacher layer. In this way, our model can learn from different teacher layers adaptively for various NLP tasks, motivated by the intuition that different NLP tasks require different levels of linguistic knowledge contained in the intermediate layers of BERT. In addition, we leverage Earth Mover's Distance (EMD) to compute the minimum cumulative cost that must be paid to transfer knowledge from the teacher network to the student network. EMD enables effective matching for many-to-many layer mapping: it can be applied to network layers of different sizes and effectively measures the semantic distance between the teacher network and the student network. Furthermore, we propose a cost attention mechanism to learn the layer weights used in EMD automatically, which is intended to further improve the model's performance and accelerate convergence. Extensive experiments on the GLUE benchmark demonstrate that our model achieves competitive performance compared to strong competitors in terms of both accuracy and model compression.

* EMNLP 2020
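
The many-to-many layer mapping described in the BERT-EMD abstract can be illustrated with a small sketch. It is not the authors' implementation: it assumes uniform layer weights (the paper learns them with a cost attention mechanism), uses mean squared error as a stand-in layer cost, and replaces an exact EMD solver with the entropy-regularised Sinkhorn approximation. All function names, the toy vectors, and the `reg` and `iters` parameters are hypothetical.

```python
import math

def layer_cost(teacher_vec, student_vec):
    """Mean squared error between two layer representations (assumed cost)."""
    n = len(teacher_vec)
    return sum((t - s) ** 2 for t, s in zip(teacher_vec, student_vec)) / n

def sinkhorn_plan(teacher_w, student_w, cost, reg=0.05, iters=200):
    """Approximate the transport plan between teacher and student layers.

    This is the Sinkhorn approximation, not an exact EMD solver.
    """
    kernel = [[math.exp(-c / reg) for c in row] for row in cost]
    rows, cols = len(teacher_w), len(student_w)
    u = [1.0] * rows
    v = [1.0] * cols
    for _ in range(iters):
        # Rescale rows so each teacher layer ships out its weight.
        u = [teacher_w[i] / sum(kernel[i][j] * v[j] for j in range(cols))
             for i in range(rows)]
        # Rescale columns so each student layer receives its weight.
        v = [student_w[j] / sum(kernel[i][j] * u[i] for i in range(rows))
             for j in range(cols)]
    return [[u[i] * kernel[i][j] * v[j] for j in range(cols)]
            for i in range(rows)]

def emd_from_plan(plan, cost):
    """Total transport work divided by total flow."""
    work = sum(p * c for p_row, c_row in zip(plan, cost)
               for p, c in zip(p_row, c_row))
    flow = sum(sum(row) for row in plan)
    return work / flow

# Toy example: 4 teacher layers, 2 student layers, 2-dimensional features.
teacher_layers = [[0.0, 0.0], [0.1, 0.1], [0.9, 0.9], [1.0, 1.0]]
student_layers = [[0.0, 0.0], [1.0, 1.0]]
teacher_w = [0.25] * 4   # uniform weights, an assumption of this sketch
student_w = [0.5] * 2

cost = [[layer_cost(t, s) for s in student_layers] for t in teacher_layers]
plan = sinkhorn_plan(teacher_w, student_w, cost)
distance = emd_from_plan(plan, cost)
print(f"approximate EMD: {distance:.4f}")
```

Each row of `plan` shows how one teacher layer's weight is split across student layers, so a student layer can draw on several teacher layers at once rather than on a single fixed one. A smaller `reg` brings the result closer to the exact EMD but makes `math.exp` more prone to underflow when costs are large.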
