Burn Lewis

Automated Code generation for Information Technology Tasks in YAML through Large Language Models

May 05, 2023

Saurabh Pujar, Luca Buratti, Xiaojie Guo, Nicolas Dupuis, Burn Lewis, Sahil Suneja, Atin Sood, Ganesh Nalawade, Matthew Jones, Alessandro Morari (+1 more)

Abstract: The recent improvement in code generation capabilities due to the use of large language models has mainly benefited general-purpose programming languages. Domain-specific languages, such as the ones used for IT automation, have received far less attention, despite involving many active developers and being an essential component of modern cloud platforms. This work focuses on the generation of Ansible-YAML, a widely used markup language for IT automation. We present Ansible Wisdom, a natural-language to Ansible-YAML code generation tool, aimed at improving IT automation productivity. Ansible Wisdom is a transformer-based model, extended by training with a new dataset containing Ansible-YAML. We also develop two novel performance metrics for YAML and Ansible to capture the specific characteristics of this domain. Results show that Ansible Wisdom can accurately generate Ansible scripts from natural language prompts, with performance comparable to or better than existing state-of-the-art code generation models.

D2A: A Dataset Built for AI-Based Vulnerability Detection Methods Using Differential Analysis

Feb 16, 2021

Yunhui Zheng, Saurabh Pujar, Burn Lewis, Luca Buratti, Edward Epstein, Bo Yang, Jim Laredo, Alessandro Morari, Zhong Su

Abstract: Static analysis tools are widely used for vulnerability detection as they understand programs with complex behavior and millions of lines of code. Despite their popularity, static analysis tools are known to generate an excess of false positives. The recent ability of machine learning models to understand programming languages opens new possibilities when applied to static analysis. However, existing datasets to train models for vulnerability identification suffer from multiple limitations such as limited bug context, limited size, and synthetic and unrealistic source code. We propose D2A, a differential-analysis-based approach to label issues reported by static analysis tools. The D2A dataset is built by analyzing version pairs from multiple open-source projects. From each project, we select bug-fixing commits and run static analysis on the versions before and after such commits. If some issues detected in a before-commit version disappear in the corresponding after-commit version, they are very likely to be real bugs that got fixed by the commit. We use D2A to generate a large labeled dataset to train models for vulnerability identification. We show that the dataset can be used to build a classifier to identify possible false alarms among the issues reported by static analysis, hence helping developers prioritize and investigate potential true positives first.

* Accepted to the 43rd International Conference on Software Engineering: Software Engineering in Practice (ICSE-SEIP '21)
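
The labeling rule described in the D2A abstract can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration: the `Issue` schema, the `label_issues` function, and the label strings are hypothetical and are not taken from the D2A tooling. Treating issues that persist after the commit as likely false alarms is also an assumption of this sketch; the abstract only states that disappearing issues are very likely real bugs.

```python
from typing import Dict, NamedTuple, Set


class Issue(NamedTuple):
    """A static-analysis finding (hypothetical schema, not D2A's format)."""
    bug_type: str
    file: str
    function: str


def label_issues(before: Set[Issue], after: Set[Issue]) -> Dict[Issue, str]:
    """Label every issue reported on the before-commit version.

    An issue that disappears after the bug-fixing commit is labeled as a
    likely real bug. An issue that is still reported afterwards is labeled
    as a likely false alarm (an assumption of this sketch).
    """
    return {
        issue: "likely_real_bug" if issue not in after else "likely_false_alarm"
        for issue in before
    }


# Example: two issues before a bug-fixing commit, one of which remains after it.
before_commit = {
    Issue("NULL_DEREFERENCE", "parser.c", "parse_header"),
    Issue("BUFFER_OVERRUN", "io.c", "read_block"),
}
after_commit = {
    Issue("BUFFER_OVERRUN", "io.c", "read_block"),
}

labels = label_issues(before_commit, after_commit)
for issue, label in sorted(labels.items()):
    print(issue.bug_type, issue.file, label)
```

Matching issues by exact equality is the simplest possible choice. A real pipeline would need a more tolerant comparison, since a commit can shift line numbers or rename functions without fixing the reported issue.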
