Abstract: Accurate volumetric image registration is highly relevant for clinical routine and computer-aided medical diagnosis. Recently, researchers have begun to use transformers in learning-based methods for medical image registration and have achieved remarkable success. Owing to their strong global modeling capability, transformers are considered a better option than convolutional neural networks (CNNs) for registration. However, they rely on bulky models with huge parameter sets, which demand high-end computational resources and complicate deployment on edge devices, whether as portable devices or in hospitals. Transformers also need large amounts of training data to produce significant results, and it is often challenging to collect suitably annotated data. Existing CNN-based registration methods, on the other hand, offer rich local information, but their poor global modeling capability hampers long-range information interaction and limits registration performance. In this work, we propose a CNN-based registration method with an enhanced receptive field, a small number of parameters, and strong results on a limited training dataset. To this end, we propose a residual U-Net with embedded parallel dilated-convolutional blocks that enlarge the receptive field. The proposed method is evaluated on inter-patient and atlas-based datasets. We show that its performance is comparable to, and in parts slightly better than, that of transformer-based methods while using only $\SI{1.5}{\percent}$ of their parameters.
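To make the notion of a parallel dilated-convolutional block concrete, the following PyTorch sketch shows one generic way such a block could be built. It is illustrative only: the branch count, the dilation rates (1, 2, 4), the ReLU activations, and the $1\times1\times1$ fusion convolution with residual connection are our own assumptions, not the architecture from the paper.
\begin{verbatim}
import torch
import torch.nn as nn

class ParallelDilatedBlock(nn.Module):
    """Generic residual block with parallel 3D dilated convolutions.

    Illustrative only: dilation rates and the 1x1x1 fusion convolution
    are assumptions, not taken from the paper.
    """
    def __init__(self, channels, dilations=(1, 2, 4)):
        super().__init__()
        self.branches = nn.ModuleList([
            nn.Conv3d(channels, channels, kernel_size=3,
                      padding=d, dilation=d)
            for d in dilations
        ])
        self.fuse = nn.Conv3d(len(dilations) * channels, channels,
                              kernel_size=1)
        self.act = nn.ReLU(inplace=True)

    def forward(self, x):
        # Each branch sees a different receptive field; concatenating
        # them enlarges the effective receptive field of the block.
        feats = torch.cat([self.act(b(x)) for b in self.branches], dim=1)
        return x + self.fuse(feats)  # residual connection
\end{verbatim}
A block of this kind could be dropped into each resolution level of a residual U-Net; the actual design choices in the paper may differ.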
Abstract: The recent emergence of deep learning has led to a great deal of work on designing supervised deep semantic segmentation algorithms. As sufficient pixel-level labels are very difficult to obtain for many tasks, we propose a method that combines a Gaussian mixture model (GMM) with unsupervised deep learning techniques. In the standard GMM, the pixel values within each sub-region are modelled by a Gaussian distribution. In order to identify the different regions, the parameter vector that minimizes the negative log-likelihood (NLL) function of the GMM has to be approximated. For this task, iterative optimization methods such as the expectation-maximization (EM) algorithm are usually used. In this paper, we propose to estimate these parameters directly from the image using a convolutional neural network (CNN). We thus change the iterative procedure of the EM algorithm, replacing the expectation step by a gradient step with respect to the network's parameters. This means that the network is trained to minimize the NLL function of the GMM, which comes with at least two advantages. First, once trained, the network can predict label probabilities very quickly compared with time-consuming iterative optimization methods. Second, due to the deep image prior, our method is able to partially overcome one of the main disadvantages of the GMM, namely that it does not take correlations between neighboring pixels into account, as it assumes independence between them. We demonstrate the advantages of our method in various experiments on the example of myocardial infarct segmentation in multi-sequence MRI.
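For context, a sketch of the loss being described: for a $K$-component GMM, the negative log-likelihood of an image $f$ with pixel values $f_i$ reads
\[
\mathrm{NLL}(f; w) \;=\; -\sum_{i} \log \sum_{k=1}^{K} \pi_{k,i}(f; w)\,
\mathcal{N}\bigl(f_i;\, \mu_k, \sigma_k^2\bigr),
\]
where, depending on the design, the means $\mu_k$, variances $\sigma_k^2$ and the (here pixelwise) weights $\pi_{k,i}$ may all be produced by the CNN with parameters $w$; writing the weights pixelwise is one plausible reading of the abstract, not a statement of the paper's exact parameterization. Training then amounts to gradient descent on the NLL with respect to $w$ rather than alternating EM updates.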
Abstract: Sparse-view computed tomography (CT) enables fast and low-dose CT imaging, an essential feature for patient-safe medical imaging and rapid non-destructive testing. In sparse-view CT, only a few projection views are acquired, causing standard reconstructions to suffer from severe artifacts and noise. To address these issues, we propose a self-supervised image reconstruction strategy. Specifically, in contrast to the established Noise2Inverse, our training strategy uses a loss function in the projection domain, thereby bypassing the nullspace component that is otherwise prescribed. We demonstrate the effectiveness of the proposed method in reducing streak artifacts and noise, even for highly sparse data.
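As a rough illustration of what a projection-domain self-supervised loss could look like (the exact data splitting and weighting used in the paper may differ): with forward projector $A$, the measured sparse-view sinogram $y$ split into disjoint view subsets $y_{\Lambda_1}$ and $y_{\Lambda_2}$, and a reconstruction network $f_\theta$,
\[
\mathcal{L}(\theta) \;=\; \bigl\| A_{\Lambda_2}\, f_\theta\bigl(\mathrm{FBP}(y_{\Lambda_1})\bigr) - y_{\Lambda_2} \bigr\|_2^2 ,
\]
so the network is supervised only by measured projections, in contrast to the image-domain targets used by Noise2Inverse.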
Abstract: Significance: Compressed sensing (CS) uses special measurement designs combined with powerful mathematical algorithms to reduce the amount of data to be collected while maintaining image quality. This is relevant to almost any imaging modality, and in this paper we focus on CS in photoacoustic projection imaging (PAPI) with integrating line detectors (ILDs). Aim: Our previous research involved rather general CS measurements, where each ILD can contribute to any measurement. In practice, however, the design of CS measurements is subject to practical constraints. Here, we aim at a CS-PAPI system in which each measurement involves only a subset of ILDs and which can be implemented in a cost-effective manner. Approach: We extend the existing PAPI system with a self-developed CS unit. The system provides structured CS matrices to which the existing recovery theory cannot be applied directly. A random search strategy is applied to select, within this class, a CS measurement matrix for which we obtain exact sparse recovery. Results: We implement a CS-PAPI system with a compression factor of $4:3$, where specific measurements are made on separate groups of 16 ILDs. We algorithmically design optimal CS measurements with proven sparse recovery capabilities. Numerical experiments are used to support our results. Conclusions: CS with proven sparse recovery capabilities can be integrated into PAPI, and numerical results support this setup. Future work will focus on applying the approach to experimental data and on utilizing data-driven methods to enhance the compression factor and generalize the signal class.
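To fix ideas, the structured measurements could be written as follows; this is an illustrative reading of the described setup, and the exact block dimensions are our assumption (12 linear combinations per group of 16 ILDs would yield the stated $4{:}3$ compression):
\[
y \;=\; M p, \qquad
M \;=\; \mathrm{diag}\bigl(M_1,\dots,M_G\bigr), \qquad
M_g \in \mathbb{R}^{12\times 16},
\]
so each measurement mixes signals from a single group of 16 ILDs only, which is what enables a cost-effective hardware implementation, while the block structure of $M$ is what the adapted recovery theory has to account for.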
Abstract: Purpose: To develop a neural network architecture for improved calibrationless reconstruction of radial data when no ground truth is available for training. Methods: NLINV-Net is a model-based neural network architecture that directly estimates images and coil sensitivities from (radial) k-space data via non-linear inversion (NLINV). Combined with a training strategy using self-supervision via data undersampling (SSDU), it can be used for imaging problems where no ground-truth reconstructions are available. We validated the method for (1) real-time cardiac imaging and (2) single-shot subspace-based quantitative T1 mapping. Furthermore, region-optimized virtual (ROVir) coils were used to suppress artifacts stemming from outside the field of view (FoV) and to focus the k-space-based SSDU loss on the region of interest. NLINV-Net-based reconstructions were compared with conventional NLINV and PI-CS (parallel imaging + compressed sensing) reconstructions, and the effects of the region-optimized virtual coils and of the type of training loss were evaluated qualitatively. Results: NLINV-Net-based reconstructions contain significantly less noise than their NLINV counterparts. ROVir coils effectively suppress streaking artifacts which are not removed by the neural networks, while the ROVir-based focused loss leads to visually sharper time series of the moving myocardial wall in cardiac real-time imaging. For quantitative imaging, T1 maps reconstructed using NLINV-Net show similar quality to PI-CS reconstructions, but NLINV-Net does not require slice-specific tuning of the regularization parameter. Conclusion: NLINV-Net is a versatile tool for calibrationless imaging that can be used in challenging imaging scenarios where a ground truth is not available.
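For orientation, the joint estimation problem behind NLINV and an SSDU-style loss can be sketched in generic notation (the precise regularization and loss in NLINV-Net may differ): with image $x$, coil sensitivities $c_j$, sampling operator $P_\Omega$ and Fourier transform $\mathcal{F}$, the nonlinear forward model is
\[
y_j \;=\; P_\Omega\, \mathcal{F}\,(c_j \odot x), \qquad j = 1,\dots,N_c .
\]
Self-supervision via data undersampling splits the acquired samples $\Omega$ into disjoint sets $\Theta$ (fed to the network) and $\Lambda$ (used only in the loss), e.g.
\[
\mathcal{L} \;=\; \sum_{j} \bigl\| P_\Lambda\, \mathcal{F}\,(\hat c_j \odot \hat x) - P_\Lambda\, y_j \bigr\|_2^2 ,
\]
where $(\hat x, \hat c_j)$ denote the network outputs computed from the data restricted to $\Theta$.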
Abstract: Medical image processing has been highlighted as an area where deep learning-based models have the greatest potential. However, in the medical field in particular, problems of data availability and privacy hamper research progress and thus rapid adoption in clinical routine. The generation of synthetic data not only ensures privacy, but also makes it possible to \textit{draw} new patients with specific characteristics, enabling the development of data-driven models on a much larger scale. This work demonstrates that three-dimensional generative adversarial networks (GANs) can be efficiently trained to generate high-resolution medical volumes with finely detailed voxel-based architectures. In addition, GAN inversion is successfully implemented for the three-dimensional setting and used for extensive research on model interpretability and on applications such as image morphing, attribute editing and style mixing. The results are comprehensively validated on a database of three-dimensional HR-pQCT instances representing the bone micro-architecture of the distal radius.
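As a brief reminder of what GAN inversion entails, stated in a generic form that is not necessarily the exact objective used in this work: given a trained generator $G$ and a real volume $x$, one seeks a latent code
\[
z^{*} \;=\; \arg\min_{z}\; \bigl\| G(z) - x \bigr\|_2^2 \;+\; \lambda\, \mathcal{R}(z),
\]
where $\mathcal{R}$ is an optional regularizer on the latent code; the recovered $z^{*}$ then serves as the starting point for image morphing, attribute editing and style mixing.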
Abstract: We investigate spatial resolution in photoacoustic tomography (PAT). Using Shannon theory, we analyze the theoretical resolution limit of sparse-view PAT and empirically demonstrate that all reconstruction methods considered exceed this limit.
Abstract: In this work, we develop an unsupervised method for the joint segmentation and denoising of a single image. To this end, we combine the advantages of a variational segmentation method with the power of a self-supervised, single-image-based deep learning approach. One major strength of our method is that, in contrast to data-driven methods, which require huge amounts of labeled samples, our model can segment an image into multiple meaningful regions without any training database. Furthermore, we introduce a novel energy functional in which denoising and segmentation are coupled in a way that lets both tasks benefit from each other. The limitations of existing single-image-based variational segmentation methods, which cannot cope with high noise or generic texture, are tackled by this specific combination with self-supervised image denoising. We propose a unified optimization strategy and show that, especially for very noisy images as encountered in microscopy, our joint approach outperforms its sequential counterpart as well as alternative methods focused purely on denoising or segmentation. A further comparison with a supervised deep learning approach designed for the same application highlights the good performance of our method.
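One illustrative way such a coupling could look, written here as a generic Chan--Vese-type example under our own assumptions and not as the functional proposed in the paper: with noisy input $f$, denoised image $u$, and a partition of the image domain into regions $\Omega_k$ with mean values $c_k$,
\[
E\bigl(u, \{\Omega_k\}\bigr) \;=\; \mathcal{L}_{\mathrm{den}}(u; f)
\;+\; \mu \sum_{k}\int_{\Omega_k} \bigl(u - c_k\bigr)^2\,dx
\;+\; \lambda \sum_{k} \mathrm{Per}(\Omega_k),
\]
where $\mathcal{L}_{\mathrm{den}}$ stands for a self-supervised denoising loss and $\mathrm{Per}$ for a boundary-length penalty; minimizing jointly over $u$ and the regions is what lets the segmentation act on a cleaner image while the region model in turn regularizes the denoiser.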
Abstract: Late gadolinium enhancement (LGE) cardiac magnetic resonance (CMR) imaging is considered the in vivo reference standard for assessing infarct size (IS) and microvascular obstruction (MVO) in ST-elevation myocardial infarction (STEMI) patients. However, the exact quantification of these markers of myocardial infarct severity remains challenging and very time-consuming. As LGE distribution patterns can be quite complex and hard to delineate from the blood pool or epicardial fat, automatic segmentation of LGE CMR images is challenging. In this work, we propose a cascaded framework of two-dimensional and three-dimensional convolutional neural networks (CNNs) that enables fully automated calculation of the extent of myocardial infarction. By artificially generating segmentation errors that are characteristic of 2D CNNs during training of the cascaded framework, we enforce the detection and correction of 2D segmentation errors and hence improve the segmentation accuracy of the entire method. The proposed method was trained and evaluated in a five-fold cross-validation using the training dataset of the EMIDEC challenge. In comparative experiments, our framework outperforms state-of-the-art methods of the EMIDEC challenge as well as 2D and 3D nnU-Net. Furthermore, extensive ablation studies demonstrate the advantages of the proposed error-correcting cascaded method.
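The cascade described above can be pictured with the following PyTorch-style sketch; the network objects \texttt{net2d} and \texttt{net3d}, the way synthetic 2D errors are injected (randomly dropped slices), and all tensor shapes are illustrative assumptions rather than the implementation from the paper.
\begin{verbatim}
import torch

def train_step(net2d, net3d, volume, labels, error_rate=0.1):
    """One illustrative training step of a 2D/3D error-correcting cascade.

    volume: (1, 1, D, H, W) image; labels: (1, D, H, W) long class indices.
    net2d segments each slice; net3d sees the image together with the
    (artificially corrupted) 2D result and learns to correct it.
    """
    # Slice-wise 2D segmentation, then re-stack to a volume.
    slices = volume[0, 0]                      # (D, H, W)
    seg2d = torch.stack(
        [net2d(s[None, None]).argmax(1)[0] for s in slices]
    )                                          # (D, H, W)

    # Inject synthetic errors typical of 2D CNNs (here: randomly
    # dropped slices) so that net3d learns to detect and fix them.
    drop = torch.rand(seg2d.shape[0]) < error_rate
    seg2d[drop] = 0

    # 3D correction network conditioned on image + 2D segmentation.
    inp = torch.cat([volume, seg2d[None, None].float()], dim=1)
    logits3d = net3d(inp)                      # (1, C, D, H, W)
    loss = torch.nn.functional.cross_entropy(logits3d, labels)
    return loss
\end{verbatim}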
Abstract: Recently, deep equilibrium methods have emerged as a new approach for solving imaging and other ill-posed inverse problems. While learned components may be a key factor in the good practical performance of these methods, a theoretical justification from a regularization point of view is still lacking. In this paper, we address this issue by providing stability and convergence results for the class of equilibrium methods. In addition, we derive convergence rates and stability estimates in the symmetric Bregman distance, and we strengthen our results for regularization operators with contractive residuals. Furthermore, we use the presented analysis to gain insight into the practical behavior of these methods, including a lower bound on the performance of the regularized solutions. Finally, we show that the convergence analysis leads to the design of a new type of loss function with several advantages over previous ones. Numerical simulations are used to support our findings.
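For reference, the two central objects of such an analysis can be stated in generic notation (the paper's precise assumptions may differ): a deep equilibrium reconstruction $x_\theta(y)$ is defined implicitly as a fixed point
\[
x_\theta(y) \;=\; f_\theta\bigl(x_\theta(y),\, y\bigr),
\]
and the symmetric Bregman distance of a convex functional $\mathcal{R}$, in which the convergence rates are measured, is
\[
D^{\mathrm{sym}}_{\mathcal{R}}(x_1, x_2) \;=\; \langle p_1 - p_2,\; x_1 - x_2 \rangle,
\qquad p_i \in \partial\mathcal{R}(x_i).
\]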