Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Edi Sutoyo

Deep Learning and Data Augmentation for Detecting Self-Admitted Technical Debt

Oct 21, 2024

Edi Sutoyo, Paris Avgeriou, Andrea Capiluppi

Abstract:Self-Admitted Technical Debt (SATD) refers to circumstances where developers use textual artifacts to explain why the existing implementation is not optimal. Past research in detecting SATD has focused on either identifying SATD (classifying SATD items as SATD or not) or categorizing SATD (labeling instances as SATD that pertain to requirement, design, code, test debt, etc.). However, the performance of these approaches remains suboptimal, particularly for specific types of SATD, such as test and requirement debt, primarily due to extremely imbalanced datasets. To address these challenges, we build on earlier research by utilizing BiLSTM architecture for the binary identification of SATD and BERT architecture for categorizing different types of SATD. Despite their effectiveness, both architectures struggle with imbalanced data. Therefore, we employ a large language model data augmentation strategy to mitigate this issue. Furthermore, we introduce a two-step approach to identify and categorize SATD across various datasets derived from different artifacts. Our contributions include providing a balanced dataset for future SATD researchers and demonstrating that our approach significantly improves SATD identification and categorization performance compared to baseline methods.

* Accepted to be published at the 2024 31st Asia-Pacific Software Engineering Conference (APSEC)

Via

Access Paper or Ask Questions

SATDAUG -- A Balanced and Augmented Dataset for Detecting Self-Admitted Technical Debt

Mar 12, 2024

Edi Sutoyo, Andrea Capiluppi

Abstract:Self-admitted technical debt (SATD) refers to a form of technical debt in which developers explicitly acknowledge and document the existence of technical shortcuts, workarounds, or temporary solutions within the codebase. Over recent years, researchers have manually labeled datasets derived from various software development artifacts: source code comments, messages from the issue tracker and pull request sections, and commit messages. These datasets are designed for training, evaluation, performance validation, and improvement of machine learning and deep learning models to accurately identify SATD instances. However, class imbalance poses a serious challenge across all the existing datasets, particularly when researchers are interested in categorizing the specific types of SATD. In order to address the scarcity of labeled data for SATD \textit{identification} (i.e., whether an instance is SATD or not) and \textit{categorization} (i.e., which type of SATD is being classified) in existing datasets, we share the \textit{SATDAUG} dataset, an augmented version of existing SATD datasets, including source code comments, issue tracker, pull requests, and commit messages. These augmented datasets have been balanced in relation to the available artifacts and provide a much richer source of labeled data for training machine learning or deep learning models.

* Accepted to be published at the 21st IEEE/ACM International Conference on Mining Software Repositories (MSR 2024)

Via

Access Paper or Ask Questions

Detecting Technical Debt Using Natural Language Processing Approaches -- A Systematic Literature Review

Dec 19, 2023

Edi Sutoyo, Andrea Capiluppi

Abstract:Context: Technical debt (TD) is a well-known metaphor for the long-term effects of architectural decisions in software development and the trade-off between producing high-quality, effective, and efficient code and meeting a release schedule. Thus, the code degrades and needs refactoring. A lack of resources, time, knowledge, or experience on the development team might cause TD in any software development project. Objective: In the context of TD detection, NLP has been utilized to identify the presence of TD automatically and even recognize specific types of TD. However, the enormous variety of feature extraction approaches and ML/DL algorithms employed in the literature often hinders researchers from trying to improve their performance. Method: In light of this, this SLR proposes a taxonomy of feature extraction techniques and algorithms used in technical debt detection: its objective is to compare and benchmark their performance in the examined studies. Results: We selected 55 articles that passed the quality evaluation of this SLR. We then investigated which feature extractions and algorithms were employed to identify TD in each SDLC phase. All approaches proposed in the analyzed studies were grouped into NLP, NLP+ML, and NLP+DL. This allows us to discuss the performance in three different ways. Conclusion: Overall, the NLP+DL group consistently outperforms in precision and F1-score for all projects, and in all but one project for the recall metric. Regarding the feature extraction techniques, the PTWE consistently achieves higher precision, recall, and F1-score for each project analyzed. Furthermore, TD types have been mapped, when possible, to SDLC phases: this served to determine the best-performing feature extractions and algorithms for each SDLC phase. Finally, based on the SLR results, we also identify implications that could be of concern to researchers and practitioners.

Via

Access Paper or Ask Questions