Denis Sushentsev

Stack Trace Deduplication: Faster, More Accurately, and in More Realistic Scenarios

Dec 19, 2024

Egor Shibaev, Denis Sushentsev, Yaroslav Golubev, Aleksandr Khvorov

Abstract: In large-scale software systems, there are often no fully-fledged bug reports with human-written descriptions when an error occurs. In this case, developers rely on stack traces, i.e., series of function calls that led to the error. Since there can be tens or hundreds of thousands of them describing the same issue from different users, automatic deduplication into categories is necessary to allow for processing. Recent works have proposed powerful deep learning-based approaches for this, but they are evaluated and compared in isolation from real-life workflows, and it is not clear whether they will actually work well at scale. To overcome this gap, this work presents three main contributions: a novel model, an industry-based dataset, and a multi-faceted evaluation. Our model consists of two parts: (1) an embedding model with byte-pair encoding and approximate nearest neighbor search to quickly find the most relevant stack traces to the incoming one, and (2) a reranker that re-ranks the most fitting stack traces, taking into account the repeated frames between them. To complement the existing datasets collected from open-source projects, we share with the community SlowOps, a dataset of stack traces from IntelliJ-based products developed by JetBrains, which has an order of magnitude more stack traces per category. Finally, we carry out an evaluation that strives to be realistic: measuring not only the accuracy of categorization, but also the operation time and the ability to create new categories. The evaluation shows that our model strikes a good balance: it outperforms other models on both open-source datasets and SlowOps, while also being faster than most. We release all of our code and data, and hope that our work can pave the way to further practice-oriented research in the area.

* Published at SANER'25. 11 pages, 2 figures
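
The two-stage design described in the abstract above (fast retrieval, then reranking, with the option of opening a new category) can be illustrated with a minimal sketch. This is not the authors' model: the bag-of-frames `embed`, the brute-force `retrieve`, the Jaccard-based `rerank_score`, and the `threshold` value are all hypothetical stand-ins chosen only to show the control flow.

```python
import math
from collections import Counter

def embed(frames):
    """Toy bag-of-frames vector (assumption: stands in for the learned BPE-based embedding)."""
    return Counter(frames)

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a if k in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def retrieve(query, index, k):
    """Stage 1: top-k candidates by cosine similarity.
    Brute force here; a real system would use approximate nearest neighbor search."""
    q = embed(query)
    ranked = sorted(index, key=lambda item: cosine(q, embed(item[1])), reverse=True)
    return ranked[:k]

def rerank_score(query, candidate):
    """Stage 2: toy reranker (assumption): Jaccard overlap of the frames shared by both traces."""
    q, c = set(query), set(candidate)
    return len(q & c) / len(q | c) if q | c else 0.0

def assign_category(query, index, k=2, threshold=0.4):
    """Return the best-matching category, or None to signal that a new category is needed."""
    best_category, best_score = None, 0.0
    for category, frames in retrieve(query, index, k):
        score = rerank_score(query, frames)
        if score > best_score:
            best_category, best_score = category, score
    return best_category if best_score >= threshold else None
```

With an index such as `[("A", ["a.f", "a.g", "a.h"]), ("B", ["b.x", "b.y"])]`, a trace sharing most frames with category `A` is assigned to it, while a trace with no shared frames returns `None`, which corresponds to creating a new category.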

DapStep: Deep Assignee Prediction for Stack Trace Error rePresentation

Jan 14, 2022

Denis Sushentsev, Aleksandr Khvorov, Roman Vasiliev, Yaroslav Golubev, Timofey Bryksin

Abstract: The task of finding the best developer to fix a bug is called bug triage. Most of the existing approaches consider the bug triage task as a classification problem; however, classification is not appropriate when the set of classes changes over time (as the set of developers on a project often does). Furthermore, to the best of our knowledge, all the existing models use textual sources of information, i.e., bug descriptions, which are not always available. In this work, we explore the applicability of existing solutions for the bug triage problem when stack traces are used as the main data source of bug reports. Additionally, we reformulate this task as a ranking problem and propose new deep learning models to solve it. The models are based on a bidirectional recurrent neural network with attention and on a convolutional neural network, with the weights of the models optimized using a ranking loss function. To improve the quality of ranking, we propose using additional information from version control system annotations. Two approaches are proposed for extracting features from annotations: manual and using an additional neural network. To evaluate our models, we collected two datasets of real-world stack traces. Our experiments show that the proposed models outperform existing models adapted to handle stack traces. To facilitate further research in this area, we publish the source code of our models and one of the collected datasets.

* 12 pages, 6 figures
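
To make the ranking formulation in the abstract above concrete, here is a minimal sketch of a pairwise margin (hinge) ranking loss. The abstract does not say which ranking loss the authors use, so this particular form, the `margin` parameter, and the function names are assumptions for illustration only.

```python
def pairwise_hinge_loss(correct_score, other_scores, margin=1.0):
    """Average hinge penalty over (correct developer, other developer) pairs.
    The loss is zero once the correct developer outscores every other one by at least `margin`."""
    penalties = [max(0.0, margin - (correct_score - s)) for s in other_scores]
    return sum(penalties) / len(penalties)

def rank_developers(scores):
    """Order developers by model score, best candidate first.
    Unlike a fixed-class classifier, the candidate set can change between calls."""
    return sorted(scores, key=scores.get, reverse=True)
```

Because the model only needs to produce a score per (stack trace, developer) pair, developers who join or leave the project change the candidate list passed to `rank_developers` without changing the model's output layer.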
