Abstract:As businesses, products, and services spring up around large language models, the trustworthiness of these models hinges on the verifiability of their outputs. However, methods for explaining language model outputs largely fall across two distinct fields of study which both use the term "attribution" to refer to entirely separate techniques: citation generation and training data attribution. In many modern applications, such as legal document generation and medical question answering, both types of attributions are important. In this work, we argue for and present a unified framework of large language model attributions. We show how existing methods of different types of attribution fall under the unified framework. We also use the framework to discuss real-world use cases where one or both types of attributions are required. We believe that this unified framework will guide the use case driven development of systems that leverage both types of attribution, as well as the standardization of their evaluation.
Abstract:The 44th International Conference on Software Engineering (ICSE 2022) was held in person from May 22 to May 27, 2022 in Pittsburgh, PA, USA. Here, we summarize themes of research and the direction of research in the field of software engineering and testing that we observed at the conference.
Abstract:Brain-computer interfaces (BCIs) decode recorded neural signals from the brain and/or stimulate the brain with encoded neural signals. BCIs span both hardware and software and have a wide range of applications in restorative medicine, from restoring movement through prostheses and robotic limbs to restoring sensation and communication through spellers. BCIs also have applications in diagnostic medicine, e.g., providing clinicians with data for detecting seizures, sleep patterns, or emotions. Despite their promise, BCIs have not yet been adopted for long-term, day-to-day use because of challenges related to reliability and robustness, which are needed for safe operation in all scenarios. Ensuring safe operation currently requires hours of manual data collection and recalibration, involving both patients and clinicians. However, data collection is not targeted at eliminating specific faults in a BCI. This paper presents a new methodology for characterizing, detecting, and localizing faults in BCIs. Specifically, it proposes partial test oracles as a method for detecting faults and slice functions as a method for localizing faults to characteristic patterns in the input data or relevant tasks performed by the user. Through targeted data acquisition and retraining, the proposed methodology improves the correctness of BCIs. We evaluated the proposed methodology on five BCI applications. The results show that the proposed methodology (1) precisely localizes faults and (2) can significantly reduce the frequency of faults through retraining based on targeted, fault-based data acquisition. These results suggest that the proposed methodology is a promising step towards repairing faulty BCIs.