Jiaheng Huang

Consistency Matters: Explore LLMs Consistency From a Black-Box Perspective

Mar 02, 2024

Fufangchen Zhao, Guoqiang Jin, Jiaheng Huang, Rui Zhao, Fei Tan

Abstract: Both commercial and open-source academic LLMs have become the mainstream models of NLP. However, there is still a lack of research on LLM consistency, meaning that throughout the various stages of LLM research and deployment, a model's internal parameters and capabilities should remain unchanged. This issue exists in both the industrial and academic sectors. Solving it is often time-consuming and labor-intensive, and secondary deployment adds further cost, resulting in economic and time losses. To fill this gap, we build an LLM consistency task dataset and design several baselines. Additionally, we choose models of diverse scales for the main experiments. Specifically, in the LightGBM experiment, we use traditional NLG metrics (i.e., ROUGE, BLEU, METEOR) as the features for model training. The final result exceeds manual evaluation, GPT3.5, and the other models in the main experiment, achieving the best performance. Finally, we use the best-performing LightGBM model as the base model to build an evaluation tool, which can effectively assist in the deployment of business models. Our code and tool demo are available at https://github.com/heavenhellchen/Consistency.git
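The abstract describes feeding traditional NLG metrics to a LightGBM model as features. As a rough illustration of what such a feature vector might look like, the sketch below computes simplified unigram-overlap scores for a pair of model responses. Everything here is an assumption rather than the authors' implementation: the function names are hypothetical, the scores are simplified stand-ins for ROUGE-1 and BLEU-1 (no brevity penalty, no higher-order n-grams, no METEOR), and the classifier step is omitted.

```python
from collections import Counter


def tokenize(text: str) -> list[str]:
    """Lowercase whitespace tokenization (a deliberate simplification)."""
    return text.lower().split()


def unigram_overlap(candidate: str, reference: str) -> int:
    """Clipped count of unigrams shared by the two texts."""
    cand = Counter(tokenize(candidate))
    ref = Counter(tokenize(reference))
    # Counter intersection keeps the minimum count of each shared token.
    return sum((cand & ref).values())


def rouge1_f1(candidate: str, reference: str) -> float:
    """Simplified ROUGE-1 F1: harmonic mean of unigram precision and recall."""
    overlap = unigram_overlap(candidate, reference)
    if overlap == 0:
        return 0.0
    precision = overlap / len(tokenize(candidate))
    recall = overlap / len(tokenize(reference))
    return 2 * precision * recall / (precision + recall)


def unigram_precision(candidate: str, reference: str) -> float:
    """Simplified BLEU-1: clipped unigram precision without brevity penalty."""
    n_candidate = len(tokenize(candidate))
    if n_candidate == 0:
        return 0.0
    return unigram_overlap(candidate, reference) / n_candidate


def consistency_features(response_a: str, response_b: str) -> list[float]:
    """Hypothetical feature vector comparing two responses to the same prompt."""
    return [
        rouge1_f1(response_a, response_b),
        unigram_precision(response_a, response_b),
        unigram_precision(response_b, response_a),
    ]
```

Vectors like these, paired with consistent/inconsistent labels, could then be passed to a gradient-boosted classifier such as LightGBM. The listing above does not specify the exact features or training setup the authors used.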

* This paper is not ready
