Abstract:Minority languages are vital to preserving cultural heritage, yet they face growing risks of extinction due to limited digital resources and the dominance of artificial intelligence models trained on high-resource languages. This white paper proposes a framework to generate linguistic tools for low-resource languages, focusing on data creation to support the development of language models that can aid in preservation efforts. Sardinian, an endangered language, serves as the case study to demonstrate the framework's effectiveness. By addressing the data scarcity that hinders intelligent applications for such languages, we contribute to promoting linguistic diversity and support ongoing efforts in language standardization and revitalization through modern technologies.
Abstract:This paper presents MindTheDApp, a toolchain designed specifically for the structural analysis of Ethereum-based Decentralized Applications (DApps), with a distinct focus on a complex network-driven approach. Unlike existing tools, our toolchain combines the power of ANTLR4 and Abstract Syntax Tree (AST) traversal techniques to transform the architecture and interactions within smart contracts into a specialized bipartite graph. This enables advanced network analytics to highlight operational efficiencies within the DApp's architecture. The bipartite graph generated by the proposed tool comprises two sets of nodes: one representing smart contracts, interfaces, and libraries, and the other including functions, events, and modifiers. Edges in the graph connect functions to smart contracts they interact with, offering a granular view of interdependencies and execution flow within the DApp. This network-centric approach allows researchers and practitioners to apply complex network theory in understanding the robustness, adaptability, and intricacies of decentralized systems. Our work contributes to the enhancement of security in smart contracts by allowing the visualisation of the network, and it provides a deep understanding of the architecture and operational logic within DApps. Given the growing importance of smart contracts in the blockchain ecosystem and the emerging application of complex network theory in technology, our toolchain offers a timely contribution to both academic research and practical applications in the field of blockchain technology.
Abstract:This paper evaluates the capability of two state-of-the-art artificial intelligence (AI) models, GPT-3.5 and Bard, in generating Java code given a function description. We sourced the descriptions from CodingBat.com, a popular online platform that provides practice problems to learn programming. We compared the Java code generated by both models based on correctness, verified through the platform's own test cases. The results indicate clear differences in the capabilities of the two models. GPT-3.5 demonstrated superior performance, generating correct code for approximately 90.6% of the function descriptions, whereas Bard produced correct code for 53.1% of the functions. While both models exhibited strengths and weaknesses, these findings suggest potential avenues for the development and refinement of more advanced AI-assisted code generation tools. The study underlines the potential of AI in automating and supporting aspects of software development, although further research is required to fully realize this potential.
Abstract:This work aims to analyse the predictability of price movements of cryptocurrencies on both hourly and daily data observed from January 2017 to January 2021, using deep learning algorithms. For our experiments, we used three sets of features: technical, trading and social media indicators, considering a restricted model of only technical indicators and an unrestricted model with technical, trading and social media indicators. We verified whether the consideration of trading and social media indicators, along with the classic technical variables (such as price's returns), leads to a significative improvement in the prediction of cryptocurrencies price's changes. We conducted the study on the two highest cryptocurrencies in volume and value (at the time of the study): Bitcoin and Ethereum. We implemented four different machine learning algorithms typically used in time-series classification problems: Multi Layers Perceptron (MLP), Convolutional Neural Network (CNN), Long Short Term Memory (LSTM) neural network and Attention Long Short Term Memory (ALSTM). We devised the experiments using the advanced bootstrap technique to consider the variance problem on test samples, which allowed us to evaluate a more reliable estimate of the model's performance. Furthermore, the Grid Search technique was used to find the best hyperparameters values for each implemented algorithm. The study shows that, based on the hourly frequency results, the unrestricted model outperforms the restricted one. The addition of the trading indicators to the classic technical indicators improves the accuracy of Bitcoin and Ethereum price's changes prediction, with an increase of accuracy from a range of 51-55% for the restricted model, to 67-84% for the unrestricted model.