Abstract:Explanation regularisation (ER) has been introduced as a way to guide models to make their predictions in a manner more akin to humans, i.e., making their attributions "plausible". This is achieved by introducing an auxiliary explanation loss that measures how well the output of an input attribution technique for the model agrees with relevant human-annotated rationales. One positive outcome of using ER appears to be improved performance in out-of-domain (OOD) settings, presumably due to an increased reliance on "plausible" tokens. However, previous work has under-explored the impact of the ER objective on model attributions, in particular when they are obtained with techniques other than the one used during ER training. In this work, we contribute a study of ER's effectiveness at informing classification decisions on plausible tokens, and of the relationship between increased plausibility and robustness to OOD conditions. Through a series of analyses, we find that the connection between ER and the ability of a classifier to rely on plausible features has been overstated, and that a stronger reliance on plausible tokens does not seem to be the cause of any perceived OOD improvements.
Abstract:Diffusion Tensor Cardiac Magnetic Resonance (DT-CMR) is the only in vivo method to non-invasively examine the microstructure of the human heart. Current research in DT-CMR aims to improve the understanding of how the cardiac microstructure relates to the macroscopic function of the healthy heart, as well as how microstructural dysfunction contributes to disease. Computing the final DT-CMR metrics requires diffusion-weighted images (DWIs) acquired along at least six diffusion directions. However, because of the low signal-to-noise ratio of DWI, the standard voxel size is large relative to the scale of the microstructure. In this study, we explore the potential of deep-learning-based methods to improve image quality volumetrically (×4 in all dimensions). We propose a novel framework for volumetric super-resolution that takes a high-resolution b0 DWI as an additional model input, and we demonstrate that this additional input yields higher super-resolved image quality. Moreover, the model is able to super-resolve DWIs of unseen b-values, demonstrating the framework's generalizability for cardiac DWI super-resolution. We therefore recommend providing a high-resolution reference image alongside the low-resolution image, for both training and inference, in super-resolution frameworks for parametric imaging whenever such a reference image is available.
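The six-direction requirement mentioned above follows from the standard diffusion tensor signal model, sketched below. The symbols are assumptions introduced here for illustration and do not appear in the abstract: $S_k$ is the signal measured along unit direction $\mathbf{g}_k$, $S_0$ is the reference (b0) signal, $b$ is the b-value, and $\mathbf{D}$ is the diffusion tensor.

```latex
S_k = S_0 \exp\!\left(-b\,\mathbf{g}_k^{\top}\mathbf{D}\,\mathbf{g}_k\right),
\qquad
\mathbf{D} =
\begin{pmatrix}
D_{xx} & D_{xy} & D_{xz} \\
D_{xy} & D_{yy} & D_{yz} \\
D_{xz} & D_{yz} & D_{zz}
\end{pmatrix}

% Taking logarithms gives one linear equation per direction:
-\frac{1}{b}\,\ln\frac{S_k}{S_0} = \mathbf{g}_k^{\top}\mathbf{D}\,\mathbf{g}_k
```

Because $\mathbf{D}$ is symmetric, it has six independent entries, so at least six non-collinear directions (plus the reference image) are needed to determine it.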
Abstract:Diffusion Tensor Cardiac Magnetic Resonance (DT-CMR) enables us to probe the microstructural arrangement of cardiomyocytes within the myocardium in vivo and non-invasively, which no other imaging modality allows. This innovative technology could revolutionise the ability to perform cardiac clinical diagnosis, risk stratification, prognosis and therapy follow-up. However, DT-CMR is currently inefficient, with over six minutes needed to acquire a single 2D static image. As a result, DT-CMR is currently confined to research and is not used clinically. We propose to reduce the number of repetitions needed to produce DT-CMR datasets and subsequently de-noise them, decreasing the acquisition time by a linear factor while maintaining acceptable image quality. Our proposed approach, based on Generative Adversarial Networks, Vision Transformers, and Ensemble Learning, performs significantly and considerably better than previously proposed approaches, bringing single breath-hold DT-CMR closer to reality.
Abstract:Understanding the structure of the heart at the microscopic scale of cardiomyocytes and their aggregates provides new insights into the mechanisms of heart disease and enables the investigation of effective therapeutics. Diffusion Tensor Cardiac Magnetic Resonance (DT-CMR) is a unique non-invasive technique that can resolve the microscopic structure, organisation, and integrity of the myocardium without the need for exogenous contrast agents. However, this technique suffers from a relatively low signal-to-noise ratio (SNR) and frequent signal loss due to respiratory and cardiac motion. Current DT-CMR techniques rely on acquiring and averaging multiple signal acquisitions to improve the SNR. Moreover, to mitigate the influence of respiratory movement, patients are required to perform many breath holds, which results in prolonged acquisition durations (e.g., ~30 mins using the existing technology). In this study, we propose a novel cascaded Convolutional Neural Network (CNN)-based compressive sensing (CS) technique and explore its applicability to improving DT-CMR acquisitions. Our simulation-based studies have achieved high reconstruction fidelity and good agreement between DT-CMR parameters obtained with the proposed reconstruction and the fully sampled ground truth. When compared to other state-of-the-art methods, our proposed deep cascaded CNN method and its stochastic variation demonstrated significant improvements. To the best of our knowledge, this is the first study using deep CNN-based CS for DT-CMR reconstruction. In addition, with relatively straightforward modifications to the acquisition scheme, our method can easily be translated into a method for online, at-the-scanner reconstruction, enabling the deployment of accelerated DT-CMR in various clinical applications.
Abstract:In neural machine translation (NMT), words are sometimes dropped from the source or generated repeatedly in the translation. We explore novel strategies to address this coverage problem that change only the attention transformation. Our approach allocates fertilities to source words, which are used to bound the attention each word can receive. We experiment with various sparse and constrained attention transformations and propose a new one, constrained sparsemax, which is shown to be differentiable and sparse. Empirical evaluation is provided on three language pairs.
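To illustrate the idea of a sparse attention transformation with per-word upper bounds, the following is a minimal sketch, assuming that constrained sparsemax is the Euclidean projection of the attention scores onto the probability simplex with an upper bound on each entry. The function name, the argument `u` (a hypothetical remaining attention budget per source word), and the bisection solver are assumptions made here for illustration, not the authors' implementation.

```python
import numpy as np

def constrained_sparsemax(z, u, iters=60):
    """Euclidean projection of scores z onto {p : sum(p) = 1, 0 <= p <= u}.

    Sketch only: solves for the threshold tau by bisection, using the fact
    that the solution has the form p_i = clip(z_i - tau, 0, u_i).
    """
    z = np.asarray(z, dtype=float)
    u = np.asarray(u, dtype=float)
    if u.sum() < 1.0:
        raise ValueError("upper bounds must sum to at least 1")
    # sum_i clip(z_i - tau, 0, u_i) is non-increasing in tau.
    lo = (z - u).min()  # every entry sits at its upper bound: sum = sum(u) >= 1
    hi = z.max()        # every entry is zero: sum = 0 <= 1
    for _ in range(iters):
        tau = 0.5 * (lo + hi)
        if np.clip(z - tau, 0.0, u).sum() > 1.0:
            lo = tau  # too much mass, raise the threshold
        else:
            hi = tau  # too little mass, lower the threshold
    return np.clip(z - 0.5 * (lo + hi), 0.0, u)

# With loose bounds this reduces to ordinary sparsemax; a tight bound on the
# first word pushes the excess attention onto the second word.
unbounded = constrained_sparsemax([1.0, 0.9, -1.0], [1.0, 1.0, 1.0])
bounded = constrained_sparsemax([1.0, 0.9, -1.0], [0.4, 1.0, 1.0])
```

In both cases the lowest-scoring word receives exactly zero attention, which is the sparsity property; the bound only changes how the remaining mass is shared.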
Abstract:Bayesian network structures are usually learned from data alone, starting from an empty network or from a naive Bayes structure. In some domains, such as medicine, prior knowledge about the structure is often already available. Such a structure can be refined, automatically or manually, in search of better-performing models. In this work, we take Bayesian networks built by specialists and show that minor perturbations to the original network can yield better classifiers at very small computational cost, while maintaining most of the intended meaning of the original model.
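As a rough illustration of what "minor perturbations" to an expert-built structure could look like, the sketch below enumerates every network that differs from the original by a single edge addition, deletion, or reversal while remaining acyclic. The choice of these three operators and the function names are assumptions made here; the abstract does not specify the perturbations used, and scoring each candidate as a classifier is left out.

```python
from itertools import permutations

def is_acyclic(nodes, edges):
    """Kahn's algorithm: True if the directed graph has no cycle."""
    indeg = {n: 0 for n in nodes}
    for _, child in edges:
        indeg[child] += 1
    frontier = [n for n in nodes if indeg[n] == 0]
    seen = 0
    while frontier:
        n = frontier.pop()
        seen += 1
        for parent, child in edges:
            if parent == n:
                indeg[child] -= 1
                if indeg[child] == 0:
                    frontier.append(child)
    return seen == len(nodes)

def single_edge_perturbations(nodes, edges):
    """Yield (kind, edge, new_edges) for each valid one-edge change."""
    edges = frozenset(edges)
    for a, b in permutations(nodes, 2):
        if (a, b) in edges:
            # Deleting an edge can never create a cycle.
            yield "delete", (a, b), edges - {(a, b)}
            flipped = (edges - {(a, b)}) | {(b, a)}
            if is_acyclic(nodes, flipped):
                yield "reverse", (a, b), flipped
        elif (b, a) not in edges:
            added = edges | {(a, b)}
            if is_acyclic(nodes, added):
                yield "add", (a, b), added

# Hypothetical three-node expert network: A -> B -> C.
nodes = ["A", "B", "C"]
expert_edges = [("A", "B"), ("B", "C")]
candidates = list(single_edge_perturbations(nodes, expert_edges))
kinds = sorted(kind for kind, _, _ in candidates)
```

Each candidate keeps all but one edge of the original network, which is one way to preserve most of the intended meaning of the specialist's model.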