Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Triet Huynh Minh Le

LLMSecConfig: An LLM-Based Approach for Fixing Software Container Misconfigurations

Feb 04, 2025

Ziyang Ye, Triet Huynh Minh Le, M. Ali Babar

Abstract:Security misconfigurations in Container Orchestrators (COs) can pose serious threats to software systems. While Static Analysis Tools (SATs) can effectively detect these security vulnerabilities, the industry currently lacks automated solutions capable of fixing these misconfigurations. The emergence of Large Language Models (LLMs), with their proven capabilities in code understanding and generation, presents an opportunity to address this limitation. This study introduces LLMSecConfig, an innovative framework that bridges this gap by combining SATs with LLMs. Our approach leverages advanced prompting techniques and Retrieval-Augmented Generation (RAG) to automatically repair security misconfigurations while preserving operational functionality. Evaluation of 1,000 real-world Kubernetes configurations achieved a 94\% success rate while maintaining a low rate of introducing new misconfigurations. Our work makes a promising step towards automated container security management, reducing the manual effort required for configuration maintenance.

Via

Access Paper or Ask Questions

Automated Code-centric Software Vulnerability Assessment: How Far Are We? An Empirical Study in C/C++

Aug 03, 2024

Anh The Nguyen, Triet Huynh Minh Le, M. Ali Babar

Figure 1 for Automated Code-centric Software Vulnerability Assessment: How Far Are We? An Empirical Study in C/C++

Figure 2 for Automated Code-centric Software Vulnerability Assessment: How Far Are We? An Empirical Study in C/C++

Figure 3 for Automated Code-centric Software Vulnerability Assessment: How Far Are We? An Empirical Study in C/C++

Figure 4 for Automated Code-centric Software Vulnerability Assessment: How Far Are We? An Empirical Study in C/C++

Abstract:Background: The C and C++ languages hold significant importance in Software Engineering research because of their widespread use in practice. Numerous studies have utilized Machine Learning (ML) and Deep Learning (DL) techniques to detect software vulnerabilities (SVs) in the source code written in these languages. However, the application of these techniques in function-level SV assessment has been largely unexplored. SV assessment is increasingly crucial as it provides detailed information on the exploitability, impacts, and severity of security defects, thereby aiding in their prioritization and remediation. Aims: We conduct the first empirical study to investigate and compare the performance of ML and DL models, many of which have been used for SV detection, for function-level SV assessment in C/C++. Method: Using 9,993 vulnerable C/C++ functions, we evaluated the performance of six multi-class ML models and five multi-class DL models for the SV assessment at the function level based on the Common Vulnerability Scoring System (CVSS). We further explore multi-task learning, which can leverage common vulnerable code to predict all SV assessment outputs simultaneously in a single model, and compare the effectiveness and efficiency of this model type with those of the original multi-class models. Results: We show that ML has matching or even better performance compared to the multi-class DL models for function-level SV assessment with significantly less training time. Employing multi-task learning allows the DL models to perform significantly better, with an average of 8-22% increase in Matthews Correlation Coefficient (MCC). Conclusions: We distill the practices of using data-driven techniques for function-level SV assessment in C/C++, including the use of multi-task DL to balance efficiency and effectiveness. This can establish a strong foundation for future work in this area.

* Accepted as a full paper in the technical track at The International Symposium on Empirical Software Engineering and Measurement (ESEM) 2024

Via

Access Paper or Ask Questions

Systematic Literature Review on Application of Machine Learning in Continuous Integration

May 22, 2023

Ali Kazemi Arani, Triet Huynh Minh Le, Mansooreh Zahedi, Muhammad Ali Babar

Abstract:This research conducted a systematic review of the literature on machine learning (ML)-based methods in the context of Continuous Integration (CI) over the past 22 years. The study aimed to identify and describe the techniques used in ML-based solutions for CI and analyzed various aspects such as data engineering, feature engineering, hyper-parameter tuning, ML models, evaluation methods, and metrics. In this paper, we have depicted the phases of CI testing, the connection between them, and the employed techniques in training the ML method phases. We presented nine types of data sources and four taken steps in the selected studies for preparing the data. Also, we identified four feature types and nine subsets of data features through thematic analysis of the selected studies. Besides, five methods for selecting and tuning the hyper-parameters are shown. In addition, we summarised the evaluation methods used in the literature and identified fifteen different metrics. The most commonly used evaluation methods were found to be precision, recall, and F1-score, and we have also identified five methods for evaluating the performance of trained ML models. Finally, we have presented the relationship between ML model types, performance measurements, and CI phases. The study provides valuable insights for researchers and practitioners interested in ML-based methods in CI and emphasizes the need for further research in this area.

* 37 pages, 20 Tables, 8 Figures, submitted to Journal of Systems and Software

Via

Access Paper or Ask Questions

SoK: Machine Learning for Continuous Integration

Apr 06, 2023

Ali Kazemi Arani, Mansooreh Zahedi, Triet Huynh Minh Le, Muhammad Ali Babar

Figure 1 for SoK: Machine Learning for Continuous Integration

Figure 2 for SoK: Machine Learning for Continuous Integration

Figure 3 for SoK: Machine Learning for Continuous Integration

Abstract:Continuous Integration (CI) has become a well-established software development practice for automatically and continuously integrating code changes during software development. An increasing number of Machine Learning (ML) based approaches for automation of CI phases are being reported in the literature. It is timely and relevant to provide a Systemization of Knowledge (SoK) of ML-based approaches for CI phases. This paper reports an SoK of different aspects of the use of ML for CI. Our systematic analysis also highlights the deficiencies of the existing ML-based solutions that can be improved for advancing the state-of-the-art.

* 6 pages, 2 figures, accepted in the ICSE'23 Workshop on Cloud Intelligence / AIOps

Via

Access Paper or Ask Questions