Engineering Mathematics, University of Bristol, affiliated with the Bristol Robotics Lab, United Kingdom
Abstract: In this paper, we present the design and benchmarking of an innovative sensor, ViTacTip, which fulfills the demand for advanced multi-modal sensing in a compact design. A notable feature of ViTacTip is its transparent skin, which incorporates a 'see-through-skin' mechanism designed to capture detailed object features upon contact, significantly improving both vision-based and proximity perception. In parallel, the biomimetic tips embedded in the sensor's skin amplify contact details, substantially augmenting tactile and derived force perception. To demonstrate the multi-modal capabilities of ViTacTip, we developed a multi-task learning model that enables simultaneous recognition of hardness, material, and texture. To assess its functionality and validate its versatility, we conducted extensive benchmarking experiments, including object recognition, contact point detection, pose regression, and grating identification. To enable seamless switching between sensing modalities, we employed a Generative Adversarial Network (GAN)-based approach that performs cross-modality interpretation, broadening the sensor's applicability across diverse environments.
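To make the multi-task setup concrete, below is a minimal sketch of one plausible realisation in PyTorch: a shared encoder with three classification heads trained under a summed cross-entropy loss. The layer sizes and class counts (`n_hardness`, `n_material`, `n_texture`) are illustrative assumptions, not the paper's architecture.

```python
# Minimal sketch of a multi-task recognition model; the shared-encoder /
# three-head layout, layer sizes, and class counts are assumptions.
import torch
import torch.nn as nn

class MultiTaskTactileNet(nn.Module):
    def __init__(self, n_hardness=3, n_material=5, n_texture=8):
        super().__init__()
        # Shared convolutional encoder over the sensor's camera image.
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        # One classification head per property.
        self.hardness = nn.Linear(32, n_hardness)
        self.material = nn.Linear(32, n_material)
        self.texture = nn.Linear(32, n_texture)

    def forward(self, x):
        z = self.encoder(x)
        return self.hardness(z), self.material(z), self.texture(z)

# Joint loss: sum of per-task cross-entropies over a dummy batch.
model = MultiTaskTactileNet()
imgs = torch.randn(4, 3, 64, 64)
outputs = model(imgs)
loss = sum(nn.functional.cross_entropy(o, torch.zeros(4, dtype=torch.long))
           for o in outputs)
loss.backward()
```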
Abstract: Large Vision-Language Models (LVLMs) have demonstrated remarkable capabilities in processing both visual and textual information. However, the critical challenge of alignment between visual and linguistic representations is not fully understood. This survey presents a comprehensive examination of alignment and misalignment in LVLMs through an explainability lens. We first examine the fundamentals of alignment, exploring its representational and behavioral aspects, training methodologies, and theoretical foundations. We then analyze misalignment phenomena across three semantic levels: object, attribute, and relational misalignment. Our investigation reveals that misalignment emerges from challenges at multiple levels: the data level, the model level, and the inference level. We provide a comprehensive review of existing mitigation strategies, categorizing them into parameter-frozen and parameter-tuning approaches. Finally, we outline promising future research directions, emphasizing the need for standardized evaluation protocols and in-depth explainability studies.
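As an illustration of the representational side of alignment discussed above, the following sketch scores image-text alignment as cosine similarity between embeddings. The embedding source is an assumption (any CLIP-style encoder pair would serve), and the random vectors merely stand in for real features.

```python
# Illustrative representational-alignment probe: cosine similarity
# between a vision embedding and a language embedding. The embedding
# source is an assumption; random vectors stand in for real features.
import numpy as np

def cosine_alignment(img_emb: np.ndarray, txt_emb: np.ndarray) -> float:
    """Cosine similarity between an image embedding and a text embedding."""
    img = img_emb / np.linalg.norm(img_emb)
    txt = txt_emb / np.linalg.norm(txt_emb)
    return float(img @ txt)

# A matched caption should score higher than a mismatched one.
rng = np.random.default_rng(0)
img = rng.normal(size=512)
matched = img + 0.1 * rng.normal(size=512)   # stand-in for a good caption
mismatched = rng.normal(size=512)            # stand-in for a wrong caption
assert cosine_alignment(img, matched) > cosine_alignment(img, mismatched)
```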
Abstract: Many bias mitigation methods have been developed to address fairness issues in machine learning. We found that using linear mixup alone, a data augmentation technique, for bias mitigation can still retain biases present in dataset labels. This paper addresses that issue by proposing a novel pre-processing strategy in which an existing mixup method and our new bias mitigation algorithm are combined to improve the generation of labels for augmented samples, making them proximity aware. Specifically, we propose ProxiMix, which preserves both pairwise and proximity relationships for fairer data augmentation. We conducted thorough experiments with three datasets, three ML models, and different hyperparameter settings. Our results demonstrate the effectiveness of ProxiMix from both the fairness-of-predictions and fairness-of-recourse perspectives.
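One way such proximity-aware labels could be produced is sketched below. The exact blending rule is an assumption: it combines the standard pairwise mixup label with the mean label of the mixed sample's k nearest training neighbours, controlled by `alpha`.

```python
# Hedged sketch of proximity-aware label mixing in the spirit of ProxiMix;
# the blending rule and all parameter names are illustrative assumptions.
import numpy as np
from sklearn.neighbors import NearestNeighbors

def proximity_aware_mixup(X, y, i, j, lam=0.5, k=5, alpha=0.5):
    x_mix = lam * X[i] + (1 - lam) * X[j]
    y_pair = lam * y[i] + (1 - lam) * y[j]      # pairwise (linear mixup) label
    nbrs = NearestNeighbors(n_neighbors=k).fit(X)
    _, idx = nbrs.kneighbors(x_mix.reshape(1, -1))
    y_prox = y[idx[0]].mean()                   # proximity (neighbourhood) label
    return x_mix, alpha * y_pair + (1 - alpha) * y_prox

rng = np.random.default_rng(1)
X = rng.normal(size=(100, 4))
y = (X[:, 0] > 0).astype(float)
x_new, y_new = proximity_aware_mixup(X, y, 3, 7, lam=0.3)
```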
Abstract: Time series forecasting, while vital in many applications, often employs complex models that are difficult for humans to understand. Effective explainable AI techniques are therefore crucial to bridging the gap between model predictions and user understanding. This paper presents TSFeatLIME, a framework extending TSLIME that is tailored specifically to explaining univariate time series forecasting. TSFeatLIME integrates an auxiliary feature into the surrogate model and considers the pairwise Euclidean distances between the queried time series and the generated samples to improve the fidelity of the surrogate. Whether such explanations are useful to human beings, however, remains an open question. We address this by conducting a user study with 160 participants through two interactive interfaces, measuring how well individuals from different backgrounds can simulate or predict model output changes in the treatment and control groups. Our results show that the surrogate model under the TSFeatLIME framework better simulates the behaviour of the black box when distance is taken into account, without sacrificing accuracy. In addition, the user study suggests that the explanations were significantly more effective for participants without a computer science background.
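The distance-weighting idea can be illustrated with a LIME-style sketch: perturbed copies of the queried series are scored by a black-box forecaster, weighted by their Euclidean distance to the query, and fitted with a weighted linear surrogate. The perturbation scale, kernel width, and the choice of auxiliary feature (here the series mean) are illustrative assumptions.

```python
# Minimal sketch of a distance-weighted surrogate in the spirit of
# TSFeatLIME; perturbation scheme, kernel, and auxiliary feature are
# assumptions, not the framework's exact design.
import numpy as np
from sklearn.linear_model import Ridge

def fit_surrogate(x, black_box, n_samples=500, sigma=1.0, rng=None):
    rng = rng or np.random.default_rng(0)
    # Perturb the queried series and record black-box outputs.
    Z = x + rng.normal(scale=0.1, size=(n_samples, x.size))
    preds = np.array([black_box(z) for z in Z])
    # Weight samples by Euclidean distance to the queried series.
    d = np.linalg.norm(Z - x, axis=1)
    w = np.exp(-(d ** 2) / (sigma ** 2))
    # Auxiliary feature appended to the surrogate's inputs.
    Z_aug = np.hstack([Z, Z.mean(axis=1, keepdims=True)])
    return Ridge(alpha=1.0).fit(Z_aug, preds, sample_weight=w)

x = np.sin(np.linspace(0, 6, 48))       # queried series
black_box = lambda s: s[-8:].mean()     # stand-in forecaster
surrogate = fit_surrogate(x, black_box)
```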
Abstract: Reinforcement learning (RL) has revolutionized decision-making in dynamic environments, yet it often struggles to autonomously detect and achieve goals without clear feedback signals. In a Source Term Estimation problem, for example, the lack of precise environmental information makes it difficult to provide clear feedback and to define and evaluate how the source's location is determined. To address this challenge, we developed the Autonomous Goal Detection and Cessation (AGDC) module, which enhances various RL algorithms by incorporating a self-feedback mechanism for autonomous goal detection and cessation upon task completion. Our method identifies undefined goals and ceases activity upon their completion by approximating the agent's belief, significantly enhancing the capabilities of RL algorithms in environments with limited feedback. To validate the effectiveness of our approach, we integrated AGDC with Deep Q-Network, Proximal Policy Optimization, and Deep Deterministic Policy Gradient, and evaluated their performance on the Source Term Estimation problem. The experimental results showed that AGDC-enhanced RL algorithms significantly outperformed traditional statistical methods such as infotaxis, entrotaxis, and dual control for exploitation and exploration, as well as a non-statistical random action selection method. These improvements were evident in terms of success rate, mean travelled distance, and search time, highlighting AGDC's effectiveness and efficiency in complex, real-world scenarios.
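A hedged sketch of belief-based cessation follows: a particle belief over the source location is reweighted by observation likelihoods, and the task is ceased once the belief has collapsed. The concentration test used here (weighted spread below a threshold) is an illustrative assumption, not the paper's exact criterion.

```python
# Sketch of belief-approximation-based goal cessation in the spirit of
# AGDC; the particle filter and the stopping test are assumptions.
import numpy as np

class BeliefCessation:
    def __init__(self, n_particles=1000, threshold=0.05, bounds=(0.0, 10.0)):
        rng = np.random.default_rng(0)
        # Particle belief over a 2-D source location.
        self.particles = rng.uniform(*bounds, size=(n_particles, 2))
        self.weights = np.full(n_particles, 1.0 / n_particles)
        self.threshold = threshold

    def update(self, likelihoods):
        """Reweight particles with the sensor-model likelihood of the
        latest observation (one likelihood value per particle)."""
        self.weights *= likelihoods
        self.weights /= self.weights.sum()

    def should_stop(self):
        # Cease the task once the belief has concentrated on one location.
        mean = self.weights @ self.particles
        spread = np.sqrt(self.weights @ ((self.particles - mean) ** 2).sum(1))
        return spread < self.threshold
```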
Abstract: Recent studies highlight the effectiveness of using in-context learning (ICL) to steer large language models (LLMs) in processing tabular data, a challenging task given the structured nature of such data. Despite advancements in performance, the fairness implications of these methods are less understood. This study investigates how varying demonstrations within ICL prompts influence the fairness outcomes of LLMs. Our findings reveal that deliberately including minority group samples in prompts significantly boosts fairness without sacrificing predictive accuracy. Further experiments demonstrate that the proportion of minority to majority samples in demonstrations affects the trade-off between fairness and prediction accuracy. Based on these insights, we introduce a mitigation technique that employs clustering and evolutionary strategies to curate a diverse and representative sample set from the training data. This approach aims to enhance both predictive performance and fairness in ICL applications. Experimental results validate that our proposed method dramatically improves fairness across various metrics, showing its efficacy in real-world scenarios.
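A minimal sketch of the clustering step is given below: cluster each demographic group and take the sample nearest each cluster centre as a demonstration, with the minority/majority counts fixed by the caller. The evolutionary refinement mentioned above is omitted for brevity, and all names and counts are illustrative assumptions.

```python
# Illustrative clustering-based demonstration curation for ICL prompts;
# the fixed group counts and KMeans-per-group selection are assumptions.
import numpy as np
from sklearn.cluster import KMeans

def curate_demonstrations(X, groups, n_minority=4, n_majority=4, seed=0):
    """Pick cluster-centre representatives from each demographic group."""
    demos = []
    for g, n in [(0, n_minority), (1, n_majority)]:   # 0 = minority group
        Xg = X[groups == g]
        km = KMeans(n_clusters=n, n_init=10, random_state=seed).fit(Xg)
        # Take the sample nearest each cluster centre as a demonstration.
        for c in km.cluster_centers_:
            demos.append(Xg[np.argmin(np.linalg.norm(Xg - c, axis=1))])
    return np.array(demos)

rng = np.random.default_rng(2)
X = rng.normal(size=(200, 6))
groups = (rng.random(200) < 0.8).astype(int)  # ~20% minority (group 0)
demos = curate_demonstrations(X, groups)
```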
Abstract: This paper introduces a novel approach, Counterfactual Shapley Values (CSV), which enhances explainability in reinforcement learning (RL) by integrating counterfactual analysis with Shapley values. The approach aims to quantify and compare the contributions of different state dimensions to various action choices. To analyze these impacts more accurately, we introduce two new characteristic value functions: the "Counterfactual Difference Characteristic Value" and the "Average Counterfactual Difference Characteristic Value". These functions are used to compute Shapley values that evaluate the differences in contributions between optimal and non-optimal actions. Experiments across several RL domains, such as GridWorld, FrozenLake, and Taxi, demonstrate the effectiveness of the CSV method. The results show that the method not only improves transparency in complex RL systems but also quantifies the differences across various decisions.
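To ground the idea, here is an exact Shapley computation over state dimensions with a simplified stand-in characteristic function: the value of a coalition is an action-value difference with out-of-coalition features replaced by a baseline. The paper's counterfactual-difference characteristic functions refine this; the stand-in below is an assumption for illustration only.

```python
# Exact Shapley attribution over state dimensions with a simplified
# stand-in characteristic function (the paper's definitions refine this).
from itertools import combinations
from math import factorial
import numpy as np

def shapley(state, baseline, value):
    """Exact Shapley value of each state dimension under `value`."""
    n = len(state)
    phi = np.zeros(n)

    def v(S):
        # Coalition features keep their actual values; the rest are baseline.
        masked = baseline.copy()
        masked[list(S)] = state[list(S)]
        return value(masked)

    for i in range(n):
        for r in range(n):
            for S in combinations([j for j in range(n) if j != i], r):
                w = factorial(len(S)) * factorial(n - len(S) - 1) / factorial(n)
                phi[i] += w * (v(S + (i,)) - v(S))
    return phi

state, baseline = np.array([1.0, 2.0, 0.5]), np.zeros(3)
q_diff = lambda s: 2 * s[0] + s[1] - s[2]   # stand-in for Q(s,a) - Q(s,a')
print(shapley(state, baseline, q_diff))     # per-dimension contributions
```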
Abstract: Miller recently proposed a definition of contrastive (counterfactual) explanations based on the well-known Halpern-Pearl (HP) definitions of causes and (non-contrastive) explanations. Crucially, Miller's definition was based on the original HP definition of explanations, which has since been modified by Halpern, presumably because the original yields counterintuitive results in many standard examples. More recently, Borner has proposed a third definition, observing that the modified HP definition may also yield counterintuitive results. In this paper we show that the Miller definition inherits issues found in the original HP definition. We address these issues by proposing two improved variants based on the more robust modified HP and Borner definitions. We analyse our new definitions and show that they retain the spirit of the Miller definition, in that all three variants satisfy an alternative unified definition that is modular with respect to an underlying definition of non-contrastive explanations. To the best of our knowledge, this paper also provides the first explicit comparison between the original and modified HP definitions.
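For readers unfamiliar with the HP framework, the sketch below shows the simplest counterfactual-dependence ("but-for") test in a toy structural causal model. The full HP definitions, original and modified, add contingency clauses that this deliberately omits; the forest-fire model is the standard toy example from that literature.

```python
# Simplified "but-for" causality test in a toy structural causal model;
# the full HP definitions add contingency clauses omitted here.
def forest_fire(lightning, match_dropped):
    """Toy structural equation: fire occurs iff either cause occurs."""
    return lightning or match_dropped

def but_for_cause(model, context, var, outcome):
    """Is var = context[var] a but-for cause of the outcome in context?"""
    actual = model(**context) == outcome
    flipped = dict(context, **{var: not context[var]})
    counterfactual = model(**flipped) == outcome
    return actual and not counterfactual

ctx = {"lightning": True, "match_dropped": False}
print(but_for_cause(forest_fire, ctx, "lightning", True))   # True
ctx2 = {"lightning": True, "match_dropped": True}
print(but_for_cause(forest_fire, ctx2, "lightning", True))  # False: overdetermined
```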
Abstract: Microsurgery involves the dexterous manipulation of delicate tissue and fragile structures, such as small blood vessels and nerves, under a microscope. To overcome the imprecision of human hands, robotic systems have been developed to assist surgeons in performing complex microsurgical tasks with greater precision and safety. However, the steep learning curve of robot-assisted microsurgery (RAMS) and the shortage of well-trained surgeons pose significant challenges to its widespread adoption. The development of a versatile training system for RAMS is therefore necessary and can bring tangible benefits to both surgeons and patients. In this paper, we present a Tactile Internet-Based Micromanipulation System (TIMS) built on a ROS-Django web-based architecture for microsurgical training. The system provides tactile feedback to operators via a wearable tactile display (WTD), while real-time data is transmitted over the internet through the ROS-Django framework. In addition, TIMS integrates haptic guidance that 'guides' trainees along a desired trajectory provided by expert surgeons; learning from demonstration based on Gaussian Process Regression (GPR) was used to generate this trajectory. User studies were conducted to verify the effectiveness of TIMS, comparing users' performance with and without tactile feedback and/or haptic guidance.
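A minimal sketch of the GPR-based trajectory generation is shown below, assuming the desired trajectory is taken as the GP posterior mean fitted to noisy expert waypoints; the kernel and noise level are illustrative assumptions.

```python
# Sketch of GPR-based learning from demonstration; kernel choice and
# noise level are assumptions, not the paper's exact configuration.
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel

# Expert demonstration: noisy 1-D positions indexed by normalised time.
t = np.linspace(0, 1, 20).reshape(-1, 1)
pos = (np.sin(2 * np.pi * t).ravel()
       + 0.05 * np.random.default_rng(3).normal(size=20))

gp = GaussianProcessRegressor(kernel=RBF(0.2) + WhiteKernel(1e-3))
gp.fit(t, pos)

# Smooth desired trajectory (posterior mean) plus an uncertainty band
# that haptic guidance could use to modulate its stiffness.
t_query = np.linspace(0, 1, 100).reshape(-1, 1)
trajectory, std = gp.predict(t_query, return_std=True)
```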
Abstract: The Dempster-Shafer theory of evidence has been used intensively to deal with uncertainty in knowledge-based systems. However, the representation of uncertain relationships between evidence and hypothesis groups (heuristic knowledge) remains a major research problem. This paper presents an approach to representing such heuristic knowledge by evidential mappings, which are defined on the basis of mass functions. The relationships between evidential mappings and multi-valued mappings, as well as between evidential mappings and Bayesian multi-valued causal link models in Bayesian theory, are discussed. Detailed procedures for constructing evidential mappings for any set of heuristic rules are then introduced, and several situations of belief propagation are discussed.
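A small sketch of belief propagation through an evidential mapping follows, assuming the mapping assigns to each evidence subset a mass distribution over hypothesis subsets and that propagated mass is the evidence-weighted mixture; modelling subsets as frozensets is an implementation convenience, not part of the theory.

```python
# Sketch of propagating a mass function through an evidential mapping:
# m_H(H) = sum over E of m_E(E) * Gamma(E)(H). Subsets are frozensets.
def propagate(m_evidence, mapping):
    """Mixture of the mapped mass distributions, weighted by m_E."""
    m_hyp = {}
    for E, mass in m_evidence.items():
        for H, cond in mapping[E].items():
            m_hyp[H] = m_hyp.get(H, 0.0) + mass * cond
    return m_hyp

# Evidence-space mass function and its evidential mapping to hypotheses.
m_E = {frozenset({"e1"}): 0.7, frozenset({"e1", "e2"}): 0.3}
gamma = {
    frozenset({"e1"}): {frozenset({"h1"}): 0.8,
                        frozenset({"h1", "h2"}): 0.2},
    frozenset({"e1", "e2"}): {frozenset({"h1", "h2"}): 1.0},
}
print(propagate(m_E, gamma))
# {h1}: 0.56, {h1, h2}: 0.44 -- the propagated masses still sum to one.
```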