Abstract:Identifying an appropriate function space for deep neural networks remains a key open question. While shallow neural networks are naturally associated with Reproducing Kernel Banach Spaces (RKBS), deep networks present unique challenges. In this work, we extend RKBS to chain RKBS (cRKBS), a new framework that composes kernels rather than functions, preserving the desirable properties of RKBS. We prove that any deep neural network function is a neural cRKBS function, and conversely, any neural cRKBS function defined on a finite dataset corresponds to a deep neural network. This approach provides a sparse solution to the empirical risk minimization problem, requiring no more than $N$ neurons per layer, where $N$ is the number of data points.
Abstract:Cardiovascular hemodynamic fields provide valuable medical decision markers for coronary artery disease. Computational fluid dynamics (CFD) is the gold standard for accurate, non-invasive evaluation of these quantities in vivo. In this work, we propose a time-efficient surrogate model, powered by machine learning, for the estimation of pulsatile hemodynamics based on steady-state priors. We introduce deep vectorised operators, a modelling framework for discretisation independent learning on infinite-dimensional function spaces. The underlying neural architecture is a neural field conditioned on hemodynamic boundary conditions. Importantly, we show how relaxing the requirement of point-wise action to permutation-equivariance leads to a family of models that can be parametrised by message passing and self-attention layers. We evaluate our approach on a dataset of 74 stenotic coronary arteries extracted from coronary computed tomography angiography (CCTA) with patient-specific pulsatile CFD simulations as ground truth. We show that our model produces accurate estimates of the pulsatile velocity and pressure while being agnostic to re-sampling of the source domain (discretisation independence). This shows that deep vectorised operators are a powerful modelling tool for cardiovascular hemodynamics estimation in coronary arteries and beyond.
Abstract:Classical model reduction techniques project the governing equations onto a linear subspace of the original state space. More recent data-driven techniques use neural networks to enable nonlinear projections. Whilst those often enable stronger compression, they may have redundant parameters and lead to suboptimal latent dimensionality. To overcome these, we propose a multistep algorithm that induces sparsity in the encoder-decoder networks for effective reduction in the number of parameters and additional compression of the latent space. This algorithm starts with sparsely initialized a network and training it using linearized Bregman iterations. These iterations have been very successful in computer vision and compressed sensing tasks, but have not yet been used for reduced-order modelling. After the training, we further compress the latent space dimensionality by using a form of proper orthogonal decomposition. Last, we use a bias propagation technique to change the induced sparsity into an effective reduction of parameters. We apply this algorithm to three representative PDE models: 1D diffusion, 1D advection, and 2D reaction-diffusion. Compared to conventional training methods like Adam, the proposed method achieves similar accuracy with 30% less parameters and a significantly smaller latent space.
Abstract:Personalized 3D vascular models can aid in a range of diagnostic, prognostic, and treatment-planning tasks relevant to cardiovascular disease management. Deep learning provides a means to automatically obtain such models. Ideally, a user should have control over the exact region of interest (ROI) to be included in a vascular model, and the model should be watertight and highly accurate. To this end, we propose a combination of a global controller leveraging voxel mask segmentations to provide boundary conditions for vessels of interest to a local, iterative vessel segmentation model. We introduce the preservation of scale- and rotational symmetries in the local segmentation model, leading to generalisation to vessels of unseen sizes and orientations. Combined with the global controller, this enables flexible 3D vascular model building, without additional retraining. We demonstrate the potential of our method on a dataset containing abdominal aortic aneurysms (AAAs). Our method performs on par with a state-of-the-art segmentation model in the segmentation of AAAs, iliac arteries and renal arteries, while providing a watertight, smooth surface segmentation. Moreover, we demonstrate that by adapting the global controller, we can easily extend vessel sections in the 3D model.
Abstract:This paper presents a method for finding a sparse representation of Barron functions. Specifically, given an $L^2$ function $f$, the inverse scale space flow is used to find a sparse measure $\mu$ minimising the $L^2$ loss between the Barron function associated to the measure $\mu$ and the function $f$. The convergence properties of this method are analysed in an ideal setting and in the cases of measurement noise and sampling bias. In an ideal setting the objective decreases strictly monotone in time to a minimizer with $\mathcal{O}(1/t)$, and in the case of measurement noise or sampling bias the optimum is achieved up to a multiplicative or additive constant. This convergence is preserved on discretization of the parameter space, and the minimizers on increasingly fine discretizations converge to the optimum on the full parameter space.
Abstract:Blood vessel orientation as visualized in 3D medical images is an important descriptor of its geometry that can be used for centerline extraction and subsequent segmentation and visualization. Arteries appear at many scales and levels of tortuosity, and determining their exact orientation is challenging. Recent works have used 3D convolutional neural networks (CNNs) for this purpose, but CNNs are sensitive to varying vessel sizes and orientations. We present SIRE: a scale-invariant, rotation-equivariant estimator for local vessel orientation. SIRE is modular and can generalise due to symmetry preservation. SIRE consists of a gauge equivariant mesh CNN (GEM-CNN) operating on multiple nested spherical meshes with different sizes in parallel. The features on each mesh are a projection of image intensities within the corresponding sphere. These features are intrinsic to the sphere and, in combination with the GEM-CNN, lead to SO(3)-equivariance. Approximate scale invariance is achieved by weight sharing and use of a symmetric maximum function to combine multi-scale predictions. Hence, SIRE can be trained with arbitrarily oriented vessels with varying radii to generalise to vessels with a wide range of calibres and tortuosity. We demonstrate the efficacy of SIRE using three datasets containing vessels of varying scales: the vascular model repository (VMR), the ASOCA coronary artery set, and a set of abdominal aortic aneurysms (AAAs). We embed SIRE in a centerline tracker which accurately tracks AAAs, regardless of the data SIRE is trained with. Moreover, SIRE can be used to track coronary arteries, even when trained only with AAAs. In conclusion, by incorporating SO(3) and scale symmetries, SIRE can determine the orientations of vessels outside of the training domain, forming a robust and data-efficient solution to geometric analysis of blood vessels in 3D medical images.
Abstract:The application of deep learning models to large-scale data sets requires means for automatic quality assurance. We have previously developed a fully automatic algorithm for carotid artery wall segmentation in black-blood MRI that we aim to apply to large-scale data sets. This method identifies nested artery walls in 3D patches centered on the carotid artery. In this study, we investigate to what extent the uncertainty in the model predictions for the contour location can serve as a surrogate for error detection and, consequently, automatic quality assurance. We express the quality of automatic segmentations using the Dice similarity coefficient. The uncertainty in the model's prediction is estimated using either Monte Carlo dropout or test-time data augmentation. We found that (1) including uncertainty measurements did not degrade the quality of the segmentations, (2) uncertainty metrics provide a good proxy of the quality of our contours if the center found during the first step is enclosed in the lumen of the carotid artery and (3) they could be used to detect low-quality segmentations at the participant level. This automatic quality assurance tool might enable the application of our model in large-scale data sets.
Abstract:Neural networks are prone to learn easy solutions from superficial statistics in the data, namely shortcut learning, which impairs generalization and robustness of models. We propose a data augmentation strategy, named DFM-X, that leverages knowledge about frequency shortcuts, encoded in Dominant Frequencies Maps computed for image classification models. We randomly select X% training images of certain classes for augmentation, and process them by retaining the frequencies included in the DFMs of other classes. This strategy compels the models to leverage a broader range of frequencies for classification, rather than relying on specific frequency sets. Thus, the models learn more deep and task-related semantics compared to their counterpart trained with standard setups. Unlike other commonly used augmentation techniques which focus on increasing the visual variations of training data, our method targets exploiting the original data efficiently, by distilling prior knowledge about destructive learning behavior of models from data. Our experimental results demonstrate that DFM-X improves robustness against common corruptions and adversarial attacks. It can be seamlessly integrated with other augmentation techniques to further enhance the robustness of models.
Abstract:Though modern microscopes have an autofocusing system to ensure optimal focus, out-of-focus images can still occur when cells within the medium are not all in the same focal plane, affecting the image quality for medical diagnosis and analysis of diseases. We propose a method that can deblur images as well as synthesize defocus blur. We train autoencoders with implicit and explicit regularization techniques to enforce linearity relations among the representations of different blur levels in the latent space. This allows for the exploration of different blur levels of an object by linearly interpolating/extrapolating the latent representations of images taken at different focal planes. Compared to existing works, we use a simple architecture to synthesize images with flexible blur levels, leveraging the linear latent space. Our regularized autoencoders can effectively mimic blur and deblur, increasing data variety as a data augmentation technique and improving the quality of microscopic images, which would be beneficial for further processing and analysis.
Abstract:Frequency analysis is useful for understanding the mechanisms of representation learning in neural networks (NNs). Most research in this area focuses on the learning dynamics of NNs for regression tasks, while little for classification. This study empirically investigates the latter and expands the understanding of frequency shortcuts. First, we perform experiments on synthetic datasets, designed to have a bias in different frequency bands. Our results demonstrate that NNs tend to find simple solutions for classification, and what they learn first during training depends on the most distinctive frequency characteristics, which can be either low- or high-frequencies. Second, we confirm this phenomenon on natural images. We propose a metric to measure class-wise frequency characteristics and a method to identify frequency shortcuts. The results show that frequency shortcuts can be texture-based or shape-based, depending on what best simplifies the objective. Third, we validate the transferability of frequency shortcuts on out-of-distribution (OOD) test sets. Our results suggest that frequency shortcuts can be transferred across datasets and cannot be fully avoided by larger model capacity and data augmentation. We recommend that future research should focus on effective training schemes mitigating frequency shortcut learning.