Abstract: Malicious URL classification is a crucial aspect of cyber security. Although existing work comprises numerous machine learning and deep learning-based URL classification models, most suffer from generalisation and domain-adaptation issues arising from the lack of representative training datasets. Furthermore, these models fail to provide explanations for a given URL classification in natural human language. In this work, we investigate and demonstrate the use of Large Language Models (LLMs) to address these issues. Specifically, we propose an LLM-based one-shot learning framework that uses Chain-of-Thought (CoT) reasoning to predict whether a given URL is benign or phishing. We evaluate our framework on three URL datasets and five state-of-the-art LLMs and show that one-shot LLM prompting achieves performance close to that of supervised models, with GPT-4 Turbo being the best-performing model, followed by Claude 3 Opus. We conduct a quantitative analysis of the LLM explanations and show that most of the explanations provided by the LLMs align with the post-hoc explanations of the supervised classifiers, and that the explanations have high readability, coherence, and informativeness.
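To make the described approach concrete, the following is a minimal sketch of what a one-shot, Chain-of-Thought-style prompt for binary URL classification might look like. The prompt wording, the example URL, the helper name `classify_url`, and the use of the OpenAI chat completions API are illustrative assumptions, not the authors' actual prompt or code.

```python
# Illustrative sketch only: one-shot CoT prompting for phishing-vs-benign
# URL classification, in the spirit of the framework described above.
# Prompt text, example URL, and model choice are assumptions for illustration.
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

ONE_SHOT_COT_PROMPT = """You are a cybersecurity analyst. Decide whether a URL is benign or phishing.
Reason step by step about its features (domain, subdomains, path, keywords), then give a verdict.

Example:
URL: http://paypa1-secure-login.example.com/verify
Reasoning: The domain imitates "paypal" with a digit substitution ("paypa1"),
uses credential-related keywords ("secure", "login", "verify"), and is not the
brand's real domain. These are common phishing indicators.
Verdict: phishing

URL: {url}
Reasoning:"""

def classify_url(url: str) -> str:
    """Return the model's CoT reasoning and verdict for a single URL."""
    response = client.chat.completions.create(
        model="gpt-4-turbo",  # the best-performing model per the abstract
        messages=[{"role": "user", "content": ONE_SHOT_COT_PROMPT.format(url=url)}],
        temperature=0,  # deterministic output for classification
    )
    return response.choices[0].message.content

if __name__ == "__main__":
    print(classify_url("http://login-verify-account.example.net/update"))
```

Because the model returns its reasoning alongside the verdict, the same call yields both the classification and the natural-language explanation that the abstract highlights.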