Abstract:This chapter provides an overview of research works that present approaches with some degree of cross-fertilisation between Computational Argumentation and Machine Learning. Our review of the literature identified two broad themes representing the purpose of the interaction between these two areas: argumentation for machine learning and machine learning for argumentation. Across these two themes, we systematically evaluate the spectrum of works along various dimensions, including the type of learning and the form of argumentation framework used. Further, we identify three types of interaction between these two areas: synergistic approaches, where the Argumentation and Machine Learning components are tightly integrated; segmented approaches, where the two are interleaved such that the outputs of one are the inputs of the other; and approximated approaches, where one component shadows the other at a chosen level of detail. We draw conclusions about the suitability of certain forms of Argumentation for supporting certain types of Machine Learning, and vice versa, with clear patterns emerging from the review. Whilst the reviewed works provide inspiration for successfully combining the two fields of research, we also identify and discuss limitations and challenges that ought to be addressed in order to ensure that they remain a fruitful pairing as AI advances.
Abstract:Gradual semantics have demonstrated great potential in argumentation, in particular for deploying quantitative bipolar argumentation frameworks (QBAFs) in a number of real-world settings, from judgmental forecasting to explainable AI. In this paper, we provide a novel methodology for obtaining gradual semantics for structured argumentation frameworks, where the building blocks of arguments and relations between them are known, unlike in QBAFs, where arguments are abstract entities. Differently from existing approaches, our methodology accommodates incomplete information about arguments' premises. We demonstrate the potential of our approach by introducing two different instantiations of the methodology, leveraging existing gradual semantics for QBAFs in these more complex frameworks. We also define a set of novel properties for gradual semantics in structured argumentation and discuss their suitability in comparison with a set of existing properties. Finally, we provide a comprehensive theoretical analysis assessing the instantiations and demonstrating their advantages over existing gradual semantics for QBAFs and structured argumentation.
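To ground the kind of gradual semantics the paper builds on, below is a minimal sketch of DF-QuAD, one existing gradual semantics for QBAFs, evaluated over an acyclic QBAF; the argument names, base scores and acyclicity assumption are illustrative, and the paper's two instantiations for structured argumentation are not shown.

```python
# A minimal sketch of the DF-QuAD gradual semantics for an acyclic QBAF.
def dfquad_strengths(base_scores, attackers, supporters):
    """base_scores: dict arg -> base score in [0, 1];
    attackers/supporters: dict arg -> list of attacking/supporting args.
    Assumes the QBAF is acyclic."""

    def aggregate(values):
        # Probabilistic-sum aggregation: 1 - prod(1 - v) over the values.
        result = 0.0
        for v in values:
            result = result + v - result * v
        return result

    def combine(base, att, sup):
        # DF-QuAD combination of the base score with aggregated attack/support.
        if att >= sup:
            return base - base * (att - sup)
        return base + (1 - base) * (sup - att)

    strengths = {}

    def strength(arg):
        if arg not in strengths:
            att = aggregate([strength(a) for a in attackers.get(arg, [])])
            sup = aggregate([strength(s) for s in supporters.get(arg, [])])
            strengths[arg] = combine(base_scores[arg], att, sup)
        return strengths[arg]

    for arg in base_scores:
        strength(arg)
    return strengths


# Toy usage: argument "c" is attacked by "a" and supported by "b".
print(dfquad_strengths({"a": 0.5, "b": 0.8, "c": 0.5},
                       attackers={"c": ["a"]},
                       supporters={"c": ["b"]}))  # strength of "c" becomes 0.65
```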
Abstract:In recent years, various methods have been introduced for explaining the outputs of "black-box" AI models. However, it is not well understood whether users actually comprehend and trust these explanations. In this paper, we focus on explanations for a regression tool for assessing cancer risk and examine the effect of the explanations' content and format on the user-centric metrics of comprehension and trust. Regarding content, we experiment with two explanation methods: the popular SHAP, based on game-theoretic notions and thus potentially complex for everyday users to comprehend, and occlusion-1, based on feature occlusion which may be more comprehensible. Regarding format, we present SHAP explanations as charts (SC), as is conventional, and occlusion-1 explanations as charts (OC) as well as text (OT), to which their simpler nature also lends itself. The experiments amount to user studies questioning participants, with two different levels of expertise (the general population and those with some medical training), on their subjective and objective comprehension of and trust in explanations for the outputs of the regression tool. In both studies we found a clear preference in terms of subjective comprehension and trust for occlusion-1 over SHAP explanations in general, when comparing based on content. However, direct comparisons of explanations when controlling for format only revealed evidence for OT over SC explanations in most cases, suggesting that the dominance of occlusion-1 over SHAP explanations may be driven by a preference for text over charts as explanations. Finally, we found no evidence of a difference between the explanation types in terms of objective comprehension. Thus overall, the choice of the content and format of explanations needs careful attention, since in some contexts format, rather than content, may play the critical role in improving user experience.
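As a rough illustration of the simpler of the two explanation methods compared here, the sketch below computes occlusion-1 attributions for a generic regression model by replacing one feature at a time with a reference value; the model, feature values and choice of reference are assumptions, not the cancer-risk tool used in the studies.

```python
# A minimal sketch of occlusion-1 feature attributions for a regression model.
import numpy as np

def occlusion_1(predict, x, reference):
    """predict: callable mapping a 2D array of inputs to predictions;
    x: 1D array, the instance to explain; reference: 1D array of fill values.
    Returns one attribution per feature: the change in the output when that
    feature alone is replaced by its reference value."""
    baseline = predict(x.reshape(1, -1))[0]
    attributions = np.zeros_like(x, dtype=float)
    for i in range(len(x)):
        occluded = x.copy()
        occluded[i] = reference[i]
        attributions[i] = baseline - predict(occluded.reshape(1, -1))[0]
    return attributions

# Toy usage with a hand-written linear "model" standing in for the regressor.
predict = lambda X: X @ np.array([0.2, -0.5, 0.3])
x = np.array([1.0, 2.0, 3.0])
print(occlusion_1(predict, x, reference=x.mean() * np.ones(3)))
```

Attributions of this form can be rendered either as a chart (as in OC) or verbalised as text (as in OT), which is what makes the format comparison possible.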
Abstract:As AI models become ever more complex and intertwined in humans' daily lives, greater levels of interactivity of explainable AI (XAI) methods are needed. In this paper, we propose the use of belief change theory as a formal foundation for operators that model the incorporation of new information, i.e. user feedback in interactive XAI, to logical representations of data-driven classifiers. We argue that this type of formalisation provides a framework and a methodology to develop interactive explanations in a principled manner, providing warranted behaviour and favouring transparency and accountability of such interactions. Concretely, we first define a novel, logic-based formalism to represent explanatory information shared between humans and machines. We then consider real world scenarios for interactive XAI, with different prioritisations of new and existing knowledge, where our formalism may be instantiated. Finally, we analyse a core set of belief change postulates, discussing their suitability for our real world settings and pointing to particular challenges that may require the relaxation or reinterpretation of some of the theoretical assumptions underlying existing operators.
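The abstract does not spell out which belief change postulates are analysed; purely as an example of the kind of postulates such an analysis typically starts from, the classic Katsuno-Mendelzon postulates for a propositional revision operator $\circ$ are:

```latex
% Classic KM revision postulates, shown only as an example of the kind of
% postulates such an analysis considers; the paper's own core set may differ.
\begin{align*}
&(\mathrm{R1})\; \psi \circ \mu \models \mu\\
&(\mathrm{R2})\; \text{if } \psi \wedge \mu \text{ is satisfiable, then } \psi \circ \mu \equiv \psi \wedge \mu\\
&(\mathrm{R3})\; \text{if } \mu \text{ is satisfiable, then } \psi \circ \mu \text{ is satisfiable}\\
&(\mathrm{R4})\; \text{if } \psi_1 \equiv \psi_2 \text{ and } \mu_1 \equiv \mu_2, \text{ then } \psi_1 \circ \mu_1 \equiv \psi_2 \circ \mu_2\\
&(\mathrm{R5})\; (\psi \circ \mu) \wedge \varphi \models \psi \circ (\mu \wedge \varphi)\\
&(\mathrm{R6})\; \text{if } (\psi \circ \mu) \wedge \varphi \text{ is satisfiable, then } \psi \circ (\mu \wedge \varphi) \models (\psi \circ \mu) \wedge \varphi
\end{align*}
```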
Abstract:Urban land use inference is a critically important task that aids in city planning and policy-making. Recently, the increased use of sensor and location technologies has facilitated the collection of multi-modal mobility data, offering valuable insights into daily activity patterns. Many studies have adopted advanced data-driven techniques to explore the potential of these multi-modal mobility data in land use inference. However, existing studies often process samples independently, ignoring the spatial correlations among neighbouring objects and heterogeneity among different services. Furthermore, the inherently low interpretability of complex deep learning methods poses a significant barrier in urban planning, where transparency and extrapolability are crucial for making long-term policy decisions. To overcome these challenges, we introduce an explainable framework for inferring land use that synergises heterogeneous graph neural networks (HGNs) with Explainable AI techniques, enhancing both accuracy and explainability. The empirical experiments demonstrate that the proposed HGNs significantly outperform baseline graph neural networks for all six land-use indicators, especially in terms of 'office' and 'sustenance'. As explanations, we consider feature attribution and counterfactual explanations. The analysis of feature attribution explanations shows that the symmetrical nature of the 'residence' and 'work' categories predicted by the framework aligns well with commuters' 'work' and 'recreation' activities in London. The analysis of the counterfactual explanations reveals that variations in node features and types are primarily responsible for the differences observed between the predicted land use distribution and the ideal mixed state. These analyses demonstrate that the proposed HGNs can suitably support urban stakeholders in their urban planning and policy-making.
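For readers unfamiliar with heterogeneous graph neural networks, the sketch below shows one message-passing layer with relation-specific weights, which is the general mechanism for handling heterogeneity among services; the relation names, dimensions and sum aggregation are assumptions rather than the architecture proposed in the paper.

```python
# A minimal sketch of one heterogeneous message-passing layer in plain PyTorch.
import torch
import torch.nn as nn

class HeteroLayer(nn.Module):
    def __init__(self, in_dim, out_dim, relations):
        super().__init__()
        # One linear transform per relation type, plus a self-transform.
        self.rel_lins = nn.ModuleDict({r: nn.Linear(in_dim, out_dim) for r in relations})
        self.self_lin = nn.Linear(in_dim, out_dim)

    def forward(self, x, edges_by_rel):
        # x: (num_nodes, in_dim); edges_by_rel: relation -> (src, dst) index tensors.
        out = self.self_lin(x)
        for rel, (src, dst) in edges_by_rel.items():
            messages = self.rel_lins[rel](x[src])   # transform neighbour features per relation
            out = out.index_add(0, dst, messages)   # sum-aggregate messages per target node
        return torch.relu(out)

# Toy usage: 4 nodes and two hypothetical relation types ('near' and 'serves').
layer = HeteroLayer(8, 16, relations=["near", "serves"])
x = torch.randn(4, 8)
edges = {"near": (torch.tensor([0, 1]), torch.tensor([1, 2])),
         "serves": (torch.tensor([3]), torch.tensor([0]))}
print(layer(x, edges).shape)  # torch.Size([4, 16])
```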
Abstract:AI has become pervasive in recent years, but state-of-the-art approaches predominantly neglect the need for AI systems to be contestable. Yet contestability is advocated by AI guidelines (e.g. by the OECD) and regulation of automated decision-making (e.g. GDPR). In this position paper we explore how contestability can be achieved computationally in and for AI. We argue that contestable AI requires dynamic (human-machine and/or machine-machine) explainability and decision-making processes, whereby machines can (i) interact with humans and/or other machines to progressively explain their outputs and/or their reasoning as well as assess grounds for contestation provided by these humans and/or other machines, and (ii) revise their decision-making processes to redress any issues successfully raised during contestation. Given that much of the current AI landscape is tailored to static AIs, the need to accommodate contestability will require a radical rethinking that, we argue, computational argumentation is ideally suited to support.
Abstract:The diversity of knowledge encoded in large language models (LLMs) and their ability to apply this knowledge zero-shot in a range of settings make them promising candidates for use in decision-making. However, they are currently limited by their inability to reliably provide outputs which are explainable and contestable. In this paper, we attempt to reconcile these strengths and weaknesses by introducing a method for supplementing LLMs with argumentative reasoning. Concretely, we introduce argumentative LLMs, a method utilising LLMs to construct argumentation frameworks, which then serve as the basis for formal reasoning in decision-making. The interpretable nature of these argumentation frameworks and formal reasoning means that any decision made by the supplemented LLM may be naturally explained to, and contested by, humans. We demonstrate the effectiveness of argumentative LLMs experimentally in the decision-making task of claim verification. We obtain results that are competitive with, and in some cases surpass, comparable state-of-the-art techniques.
Abstract:Counterfactual Explanations (CEs) have emerged as a major paradigm in explainable AI research, providing recourse recommendations for users affected by the decisions of machine learning models. However, when slight changes occur in the parameters of the underlying model, CEs found by existing methods often become invalid for the updated models. The literature lacks a way to certify deterministic robustness guarantees for CEs under model changes, in that existing methods to improve CEs' robustness are heuristic, and their robustness is evaluated only empirically, using a limited number of retrained models. To bridge this gap, we propose a novel interval abstraction technique for parametric machine learning models, which allows us to obtain provable robustness guarantees of CEs under the possibly infinite set of plausible model changes $\Delta$. We formalise our robustness notion as the $\Delta$-robustness for CEs, in both binary and multi-class classification settings. We formulate procedures to verify $\Delta$-robustness based on Mixed Integer Linear Programming, building on which we further propose two algorithms to generate CEs that are $\Delta$-robust. In an extensive empirical study, we demonstrate how our approach can be used in practice by discussing two strategies for determining the appropriate hyperparameter in our method, and we quantitatively benchmark the CEs generated by eleven methods, highlighting the effectiveness of our algorithms in finding robust CEs.
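To convey the interval-abstraction idea in its simplest form, the sketch below checks robustness of a counterfactual for a linear binary classifier by bounding its score over an interval of plausible weights; the paper's MILP-based procedures handle richer parametric models, so this linear special case and its numbers are purely illustrative.

```python
# A minimal sketch of interval abstraction for a linear binary classifier:
# does the counterfactual x stay positively classified for every model whose
# parameters lie within +/- delta of the trained ones?
import numpy as np

def provably_robust_positive(x, w, b, delta):
    """Return True iff w'.x + b' > 0 for all w' in [w - delta, w + delta]
    and b' in [b - delta, b + delta] (an interval over-approximation of the
    plausible model changes)."""
    w_lo, w_hi = w - delta, w + delta
    # Worst case: take the lower weight where x_i >= 0, the upper where x_i < 0.
    worst_score = np.where(x >= 0, w_lo * x, w_hi * x).sum() + (b - delta)
    return worst_score > 0

# Toy usage on a hypothetical counterfactual and trained weights.
x = np.array([1.0, -2.0, 0.5])
w = np.array([0.8, -0.4, 1.0])
print(provably_robust_positive(x, w, b=0.1, delta=0.05))  # True
```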
Abstract:Argument mining (AM) is the process of automatically extracting arguments, their components and/or relations amongst arguments and components from text. As the number of platforms supporting online debate increases, the need for AM becomes ever more urgent, especially in support of downstream tasks. Relation-based AM (RbAM) is a form of AM focusing on identifying agreement (support) and disagreement (attack) relations amongst arguments. RbAM is a challenging classification task, with existing methods failing to perform satisfactorily. In this paper, we show that general-purpose Large Language Models (LLMs), appropriately primed and prompted, can significantly outperform the best performing (RoBERTa-based) baseline. Specifically, we experiment with two open-source LLMs (Llama-2 and Mistral) on ten datasets.
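As a hedged illustration of what priming and prompting an LLM for RbAM might look like, the template below frames the task as a two-way support/attack decision; the actual prompts, label sets and output parsing used in the paper are not reproduced here.

```python
# A purely illustrative prompt template and label parser for relation-based AM.
def build_rbam_prompt(parent: str, child: str) -> str:
    return (
        "You are an argument mining assistant. Given a parent argument and a "
        "child argument, decide whether the child SUPPORTS or ATTACKS the parent.\n"
        f"Parent argument: {parent}\n"
        f"Child argument: {child}\n"
        "Answer with exactly one word: SUPPORT or ATTACK."
    )

def parse_rbam_label(generation: str) -> str:
    # Map the model's free-text output onto one of the two relation labels.
    return "support" if "SUPPORT" in generation.upper() else "attack"

print(build_rbam_prompt("We should tax sugary drinks.",
                        "Such taxes disproportionately hit low-income households."))
```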
Abstract:Counterfactual explanations (CEs) are advocated as being ideally suited to providing algorithmic recourse for subjects affected by the predictions of machine learning models. While CEs can be beneficial to affected individuals, recent work has exposed severe issues related to the robustness of state-of-the-art methods for obtaining CEs. Since a lack of robustness may compromise the validity of CEs, techniques to mitigate this risk are in order. In this survey, we review works in the rapidly growing area of robust CEs and perform an in-depth analysis of the forms of robustness they consider. We also discuss existing solutions and their limitations, providing a solid foundation for future developments.