Abstract:Automatic code summarization is beneficial to software development and maintenance since it reduces the burden of manual tasks. Currently, artificial intelligence is undergoing a paradigm shift. The foundation models pretrained on massive data and finetuned to downstream tasks surpass specially customized models. This trend inspired us to consider reusing foundation models instead of learning from scratch. Based on this, we propose a flexible and robust approach for automatic code summarization based on neural networks. We assemble available foundation models, such as CodeBERT and GPT-2, into a single model named AdaMo. Moreover, we utilize Gaussian noise as the simulation of contextual information to optimize the latent representation. Furthermore, we introduce two adaptive schemes from the perspective of knowledge transfer, namely continuous pretraining and intermediate finetuning, and design intermediate stage tasks for general sequence-to-sequence learning. Finally, we evaluate AdaMo against a benchmark dataset for code summarization, by comparing it with state-of-the-art models.
Abstract:Genetic Algorithms (GAs) are a powerful technique to address hard optimisation problems. However, scalability issues might prevent them from being applied to real-world problems. Exploiting parallel GAs in the cloud might be an affordable approach to get time efficient solutions that benefit of the appealing features of the cloud, such as scalability, reliability, fault-tolerance and cost-effectiveness. Nevertheless, distributed computation is very prone to cause considerable overhead for communication and making GAs distributed in an on-demand fashion is not trivial. Aiming to keep under control the communication overhead and support GAs developers in the construction and deployment of parallel GAs in the cloud, in this paper we propose an approach to distribute GAs using the global parallelisation model, exploiting software containers and their cloud orchestration. We also devised a conceptual workflow covering each cloud GAs distribution phase, from resources allocation to actual deployment and execution, in a DevOps fashion.
Abstract:Genetic Algorithms (GAs) are powerful metaheuristic techniques mostly used in many real-world applications. The sequential execution of GAs requires considerable computational power both in time and resources. Nevertheless, GAs are naturally parallel and accessing a parallel platform such as Cloud is easy and cheap. Apache Hadoop is one of the common services that can be used for parallel applications. However, using Hadoop to develop a parallel version of GAs is not simple without facing its inner workings. Even though some sequential frameworks for GAs already exist, there is no framework supporting the development of GA applications that can be executed in parallel. In this paper is described a framework for parallel GAs on the Hadoop platform, following the paradigm of MapReduce. The main purpose of this framework is to allow the user to focus on the aspects of GA that are specific to the problem to be addressed, being sure that this task is going to be correctly executed on the Cloud with a good performance. The framework has been also exploited to develop an application for Feature Subset Selection problem. A preliminary analysis of the performance of the developed GA application has been performed using three datasets and shown very promising performance.