Open Universiteit Nederland
Abstract:Transformer-based large language models (LLMs) have demonstrated significant potential in addressing logic problems. capitalizing on the great capabilities of LLMs for code-related activities, several frameworks leveraging logical solvers for logic reasoning have been proposed recently. While existing research predominantly focuses on viewing LLMs as natural language logic solvers or translators, their roles as logic code interpreters and executors have received limited attention. This study delves into a novel aspect, namely logic code simulation, which forces LLMs to emulate logical solvers in predicting the results of logical programs. To further investigate this novel task, we formulate our three research questions: Can LLMs efficiently simulate the outputs of logic codes? What strength arises along with logic code simulation? And what pitfalls? To address these inquiries, we curate three novel datasets tailored for the logic code simulation task and undertake thorough experiments to establish the baseline performance of LLMs in code simulation. Subsequently, we introduce a pioneering LLM-based code simulation technique, Dual Chains of Logic (DCoL). This technique advocates a dual-path thinking approach for LLMs, which has demonstrated state-of-the-art performance compared to other LLM prompt strategies, achieving a notable improvement in accuracy by 7.06% with GPT-4-Turbo.
Abstract:This report presents the results of the DeepSolaris project that was carried out under the ESS action 'Merging Geostatistics and Geospatial Information in Member States'. During the project several deep learning algorithms were evaluated to detect solar panels in remote sensing data. The aim of the project was to evaluate whether deep learning models could be developed, that worked across different member states in the European Union. Two remote sensing data sources were considered: aerial images on the one hand, and satellite images on the other. Two flavours of deep learning models were evaluated: classification models and object detection models. For the evaluation of the deep learning models we used a cross-site evaluation approach: the deep learning models where trained in one geographical area and then evaluated on a different geographical area, previously unseen by the algorithm. The cross-site evaluation was furthermore carried out twice: deep learning models trained on he Netherlands were evaluated on Germany and vice versa. While the deep learning models were able to detect solar panels successfully, false detection remained a problem. Moreover, model performance decreased dramatically when evaluated in a cross-border fashion. Hence, training a model that performs reliably across different countries in the European Union is a challenging task. That being said, the models detected quite a share of solar panels not present in current solar panel registers and therefore can already be used as-is to help reduced manual labor in checking these registers.