Chuanyi Li

A Comprehensive Evaluation of Parameter-Efficient Fine-Tuning on Software Engineering Tasks

Dec 25, 2023

Wentao Zou, Qi Li, Jidong Ge, Chuanyi Li, Xiaoyu Shen, Liguo Huang, Bin Luo

Abstract: Pre-trained models (PTMs) have achieved great success in various Software Engineering (SE) downstream tasks following the "pre-train then fine-tune" paradigm. As fully fine-tuning all parameters of PTMs can be computationally expensive, a widely used solution is parameter-efficient fine-tuning (PEFT), which freezes PTMs while introducing extra parameters. Though work has been done to test PEFT methods in the SE field, a comprehensive evaluation is still lacking. This paper aims to fill this gap by evaluating the effectiveness of five PEFT methods on eight PTMs and four SE downstream tasks. For different tasks and PEFT methods, we seek answers to the following research questions: 1) Is it more effective to use PTMs trained specifically on source code, or is it sufficient to use PTMs trained on natural language text? 2) What is the impact of varying model sizes? 3) How does the model architecture affect the performance? Besides effectiveness, we also discuss the efficiency of PEFT methods in terms of required training time and GPU resource consumption. We hope that our findings can provide a deeper understanding of PEFT methods on various PTMs and SE downstream tasks. All code and data are available at https://github.com/zwtnju/PEFT.git.

Judicial Intelligent Assistant System: Extracting Events from Divorce Cases to Detect Disputes for the Judge

Mar 23, 2023

Yuan Zhang, Chuanyi Li, Yu Sheng, Jidong Ge, Bin Luo

Abstract: In the formal procedure of civil cases, the textual materials provided by different parties describe how the cases developed. Extracting the key information from these materials and clarifying the dispute focus of the related parties is a difficult but necessary task. Currently, officers read the materials manually and use methods such as keyword searching and regular matching to get the target information. These approaches are time-consuming and depend heavily on the prior knowledge and carefulness of the officers. To help officers improve working efficiency and accuracy, we propose an approach to detect disputes in divorce cases based on a two-round-labeling event extraction technique. We implement the Judicial Intelligent Assistant (JIA) system according to the proposed approach to 1) automatically extract focus events from divorce case materials, 2) align events by identifying co-reference among them, and 3) detect conflicts among events brought by the plaintiff and the defendant. With the JIA system, it is convenient for judges to determine the disputed issues. Experimental results demonstrate that the proposed approach and system can obtain the focus of cases and detect conflicts more effectively and efficiently than existing methods.

* 20 pages

CrossCodeBench: Benchmarking Cross-Task Generalization of Source Code Models

Feb 10, 2023

Changan Niu, Chuanyi Li, Vincent Ng, Bin Luo

Abstract: Despite recent advances showing that a model pre-trained on large-scale source code data can gain appreciable generalization capability, it still requires a sizeable amount of data on the target task for fine-tuning. Moreover, the effectiveness of the model generalization is largely affected by the size and quality of the fine-tuning data, which is detrimental for target tasks with limited or unavailable resources. Therefore, cross-task generalization, with the goal of improving the generalization of the model to unseen tasks, is of strong research and application value. In this paper, we propose a large-scale benchmark that includes 216 existing code-related tasks. We then annotate each task with the corresponding meta information, such as task description and instruction, which contains detailed information about the task and a solution guide. This also helps us to easily create a wide variety of "training/evaluation" task splits to evaluate the various cross-task generalization capabilities of the model. We then perform some preliminary experiments to demonstrate that the cross-task generalization of models can be largely improved by in-context learning methods such as few-shot learning and learning from task instructions, which shows the promising prospects of conducting cross-task learning research on our benchmark. We hope that the collection of the datasets and our benchmark will facilitate future work that is not limited to cross-task generalization.

* ICSE 2023

Deep Learning Meets Software Engineering: A Survey on Pre-Trained Models of Source Code

May 24, 2022

Changan Niu, Chuanyi Li, Bin Luo, Vincent Ng

Abstract: Recent years have seen the successful application of deep learning to software engineering (SE). In particular, the development and use of pre-trained models of source code has enabled state-of-the-art results to be achieved on a wide variety of SE tasks. This paper provides an overview of this rapidly advancing field of research and reflects on future research directions.

* IJCAI 2022: Survey Track

Neural Program Repair: Systems, Challenges and Solutions

Feb 22, 2022

Wenkang Zhong, Chuanyi Li, Jidong Ge, Bin Luo

Abstract: Automated Program Repair (APR) aims to automatically fix bugs in source code. Recently, with advances in the Deep Learning (DL) field, there has been a rise in Neural Program Repair (NPR) studies, which formulate APR as a translation task from buggy code to correct code and adopt neural networks based on the encoder-decoder architecture. Compared with other APR techniques, NPR approaches have a great advantage in applicability because they do not need any specification (i.e., a test suite). Although NPR has been a hot research direction, there is no overview of this field yet. To help interested readers understand the architectures, challenges, and corresponding solutions of existing NPR systems, we conduct a literature review of the latest studies in this paper. We begin by introducing the background knowledge of this field. Next, for clarity, we decompose the NPR procedure into a series of modules and explicate the various design choices for each module. Furthermore, we identify several challenges and discuss the effect of existing solutions. Finally, we conclude and provide some promising directions for future research.

* 9 pages, 2 figures

Dependency Learning for Legal Judgment Prediction with a Unified Text-to-Text Transformer

Dec 13, 2021

Yunyun Huang, Xiaoyu Shen, Chuanyi Li, Jidong Ge, Bin Luo

Abstract: Given the facts of a case, Legal Judgment Prediction (LJP) involves a series of sub-tasks such as predicting violated law articles, charges, and term of penalty. We propose leveraging a unified text-to-text Transformer for LJP, where the dependencies among sub-tasks can be naturally established within the auto-regressive decoder. Compared with previous works, it has three advantages: (1) it fits the pretraining pattern of masked language models, and can thereby benefit from the semantic prompts of each sub-task rather than treating them as atomic labels, (2) it uses a single unified architecture, enabling full parameter sharing across all sub-tasks, and (3) it can incorporate both classification and generative sub-tasks. We show that this unified transformer, albeit pretrained on general-domain text, outperforms pretrained models tailored specifically for the legal domain. Through an extensive set of experiments, we find that the best order to capture dependencies differs from human intuition, and the most reasonable logical order for humans can be sub-optimal for the model. We further include two auxiliary tasks, court view generation and article content prediction, showing that they can not only improve the prediction accuracy but also provide interpretable explanations for model outputs even when an error is made. With the best configuration, our model outperforms both the previous SOTA and a single-task version of the unified transformer by a large margin.

* The first two authors contributed equally

AST-Transformer: Encoding Abstract Syntax Trees Efficiently for Code Summarization

Dec 02, 2021

Ze Tang, Chuanyi Li, Jidong Ge, Xiaoyu Shen, Zheling Zhu, Bin Luo

Abstract: Code summarization aims to generate brief natural language descriptions for source code. As source code is highly structured and follows strict programming language grammars, its Abstract Syntax Tree (AST) is often leveraged to inform the encoder about the structural information. However, ASTs are usually much longer than the source code. Current approaches ignore the size limit and simply feed the whole linearized AST into the encoder. To address this problem, we propose AST-Transformer to efficiently encode tree-structured ASTs. Experiments show that AST-Transformer outperforms state-of-the-art methods by a substantial margin while reducing the computational complexity of the encoding process by 90 to 95%.

Learning Fine-grained Fact-Article Correspondence in Legal Cases

Apr 24, 2021

Jidong Ge, Yunyun Huang, Xiaoyu Shen, Chuanyi Li, Wei Hu, Bin Luo

Abstract: Automatically recommending relevant law articles for a given legal case has attracted much attention, as it can greatly reduce the human labor of searching a large database of laws. However, current research only supports coarse-grained recommendation, where all relevant articles are predicted as a whole without explaining which specific fact each article is relevant to. Since one case can consist of many supporting facts, traversing them to verify the correctness of recommendation results can be time-consuming. We believe that learning the fine-grained correspondence between each single fact and law articles is crucial for an accurate and trustworthy AI system. With this motivation, we perform a pioneering study and create a corpus with manually annotated fact-article correspondences. We treat the learning as a text matching task and propose a multi-level matching network to address it. To help the model better digest the content of law articles, we parse articles into premise-conclusion pairs with a random forest. Experiments show that the parsed form yields better performance and the resulting model surpasses other popular text matching baselines. Furthermore, we compare with previous research and find that establishing the fine-grained fact-article correspondences can improve the recommendation accuracy by a large margin. Our best system reaches an F1 score of 96.3%, giving it great potential for practical use. It can also significantly boost the downstream task of legal decision prediction, increasing the F1 score by up to 12.7%.

* Code and dataset are available at https://github.com/gjdnju/MLMN

Delving into Variance Transmission and Normalization: Shift of Average Gradient Makes the Network Collapse

Mar 22, 2021

Yuxiang Liu, Jidong Ge, Chuanyi Li, Jie Gui

Abstract: Normalization operations are essential for state-of-the-art neural networks and enable us to train a network from scratch with a large learning rate (LR). We attempt to explain the real effect of Batch Normalization (BN) from the perspective of variance transmission by investigating the relationship between BN and Weights Normalization (WN). In this work, we demonstrate that the shift of the average gradient amplifies the variance of every convolutional (conv) layer. To solve this problem, we propose Parametric Weights Standardization (PWS), a fast module for conv filters that is robust to mini-batch size. PWS can provide the speed-up of BN. Besides, it requires less computation and does not change the output of a conv layer. PWS enables the network to converge fast without normalizing the outputs. This result strengthens the case for the shift of the average gradient and explains why BN works from the perspective of variance transmission. The code and appendix will be made available at https://github.com/lyxzzz/PWSConv.

* This paper has been accepted by AAAI21
