Abstract: Magnetic Resonance Fingerprinting (MRF) is a time-efficient approach to quantitative MRI, enabling the mapping of multiple tissue properties from a single, accelerated scan. However, achieving accurate reconstructions remains challenging, particularly in highly accelerated and undersampled acquisitions, which are crucial for reducing scan times. While deep learning techniques have advanced image reconstruction, the recent introduction of diffusion models offers new possibilities for imaging tasks, though their application in the medical field is still emerging. Notably, diffusion models have not yet been explored for the MRF problem. In this work, we propose for the first time a conditional diffusion probabilistic model for MRF image reconstruction. Qualitative and quantitative comparisons on in-vivo brain scan data demonstrate that the proposed approach can outperform established deep learning and compressed sensing algorithms for MRF reconstruction. Extensive ablation studies also explore strategies to improve the computational efficiency of our approach.
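To make the conditioning idea concrete, the following is a minimal sketch of a conditional diffusion (DDPM) training step for reconstruction, assuming the network is conditioned on an aliased undersampled reconstruction concatenated channel-wise; the architecture, channel counts, and noise schedule are illustrative assumptions, not the paper's exact implementation.

import torch, torch.nn as nn

T = 1000
betas = torch.linspace(1e-4, 0.02, T)
alphas_bar = torch.cumprod(1.0 - betas, dim=0)

class EpsNet(nn.Module):
    """Predicts the noise added to the clean tissue maps, conditioned on the
    undersampled MRF reconstruction concatenated channel-wise (assumed setup)."""
    def __init__(self, c_maps=2, c_cond=2):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(c_maps + c_cond + 1, 64, 3, padding=1), nn.ReLU(),
            nn.Conv2d(64, 64, 3, padding=1), nn.ReLU(),
            nn.Conv2d(64, c_maps, 3, padding=1))
    def forward(self, x_t, cond, t):
        # encode the timestep as an extra constant channel (simplification)
        t_map = (t.float() / T).view(-1, 1, 1, 1).expand(-1, 1, *x_t.shape[2:])
        return self.net(torch.cat([x_t, cond, t_map], dim=1))

def ddpm_loss(model, x0, cond):
    t = torch.randint(0, T, (x0.shape[0],))
    noise = torch.randn_like(x0)
    ab = alphas_bar[t].view(-1, 1, 1, 1)
    x_t = ab.sqrt() * x0 + (1 - ab).sqrt() * noise   # forward diffusion step
    return nn.functional.mse_loss(model(x_t, cond, t), noise)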
Abstract: Managing fluid balance in dialysis patients is crucial, as improper management can lead to severe complications. In this paper, we propose a multimodal approach that integrates visual features from lung ultrasound images with clinical data to enhance the prediction of excess body fluid. Our framework employs independent encoders to extract features for each modality and combines them through a cross-domain attention mechanism to capture complementary information. By framing the prediction as a classification task, the model achieves significantly better performance than with a regression formulation. The results demonstrate that multimodal models consistently outperform single-modality models, particularly when attention mechanisms prioritize the tabular data. Pseudo-sample generation further helps mitigate the imbalanced classification problem, yielding the highest accuracy of 88.31%. This study underscores the effectiveness of multimodal learning for fluid overload management in dialysis patients, offering valuable insights for improved clinical outcomes.
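A minimal sketch of the cross-domain attention fusion, assuming pre-extracted image patch tokens and a clinical (tabular) embedding of the same dimension; the encoder backbones, dimensions, and class count are placeholders.

import torch, torch.nn as nn

class CrossDomainFusion(nn.Module):
    def __init__(self, d=128, n_classes=3):
        super().__init__()
        # tabular embedding attends over image tokens (queries = tabular features)
        self.attn = nn.MultiheadAttention(embed_dim=d, num_heads=4, batch_first=True)
        self.head = nn.Linear(2 * d, n_classes)
    def forward(self, img_tokens, tab_emb):
        # img_tokens: (B, N, d) ultrasound patch features; tab_emb: (B, d) clinical features
        q = tab_emb.unsqueeze(1)
        fused, _ = self.attn(q, img_tokens, img_tokens)          # (B, 1, d)
        return self.head(torch.cat([fused.squeeze(1), tab_emb], dim=-1))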
Abstract: We present Sparse R-CNN OBB, a novel framework for the detection of oriented objects in SAR images leveraging sparse learnable proposals. Sparse R-CNN OBB has a streamlined architecture and is easy to train, as it utilizes a sparse set of 300 proposals instead of training a proposal generator on hundreds of thousands of anchors. To the best of our knowledge, Sparse R-CNN OBB is the first to adopt the concept of sparse learnable proposals for the detection of oriented objects, as well as for the detection of ships in Synthetic Aperture Radar (SAR) images. The detection head of the baseline model, Sparse R-CNN, is re-designed to enable the model to capture object orientation. We also fine-tune the model on the RSDD-SAR dataset and provide a performance comparison to state-of-the-art models. Experimental results show that Sparse R-CNN OBB achieves outstanding performance, surpassing other models in both inshore and offshore scenarios. The code is available at: www.github.com/ka-mirul/Sparse-R-CNN-OBB.
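The following is an illustrative sketch of the sparse learnable oriented proposals idea: a fixed set of 5-parameter boxes (cx, cy, w, h, theta) learned directly as model parameters rather than produced by an anchor-based proposal generator. The parameterisation and dimensions are assumptions, not the released implementation.

import torch, torch.nn as nn

class SparseOrientedProposals(nn.Module):
    def __init__(self, num_proposals=300, feat_dim=256):
        super().__init__()
        # learnable oriented boxes in normalised coordinates, plus per-proposal features
        self.boxes = nn.Parameter(torch.rand(num_proposals, 5))
        self.feats = nn.Parameter(torch.randn(num_proposals, feat_dim))
    def forward(self, img_h, img_w):
        cx, cy, w, h, ang = self.boxes.sigmoid().unbind(-1)
        boxes = torch.stack([cx * img_w, cy * img_h, w * img_w, h * img_h,
                             (ang - 0.5) * 3.14159], dim=-1)   # angle in radians
        return boxes, self.feats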
Abstract: Optical coherence tomography (OCT) and confocal microscopy are pivotal in retinal imaging, offering distinct advantages and limitations. In vivo OCT offers rapid, non-invasive imaging but can suffer from clarity issues and motion artifacts, while ex vivo confocal microscopy, providing high-resolution, cellular-detailed color images, is invasive and raises ethical concerns. To bridge the benefits of both modalities, we propose a novel framework based on unsupervised 3D CycleGAN for translating unpaired in vivo OCT to ex vivo confocal microscopy images. This marks the first attempt to exploit the inherent 3D information of OCT and translate it into the rich, detailed color domain of confocal microscopy. We also introduce a unique dataset, OCT2Confocal, comprising mouse OCT and confocal retinal images, facilitating the development of, and establishing a benchmark for, cross-modal image translation research. Our model has been evaluated both quantitatively and qualitatively, achieving Fréchet Inception Distance (FID) scores of 0.766 and Kernel Inception Distance (KID) scores as low as 0.153, and leading subjective Mean Opinion Scores (MOS). Our model demonstrates superior image fidelity and quality over existing methods despite limited training data. Our approach effectively synthesizes color information from 3D confocal images, closely approximating target outcomes and suggesting enhanced potential for diagnostic and monitoring applications in ophthalmology.
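A minimal sketch of the 3D cycle-consistency objective at the heart of unpaired OCT-to-confocal translation; the generators below are placeholder 3D convolutional networks, and the channel counts and loss weight are assumptions for illustration only.

import torch, torch.nn as nn

def make_g(c_in, c_out):
    return nn.Sequential(nn.Conv3d(c_in, 32, 3, padding=1), nn.ReLU(),
                         nn.Conv3d(32, c_out, 3, padding=1), nn.Tanh())

G_oct2conf = make_g(1, 3)   # grayscale OCT volume -> RGB confocal volume
G_conf2oct = make_g(3, 1)

def cycle_loss(oct_vol, conf_vol, lam=10.0):
    fake_conf = G_oct2conf(oct_vol)
    fake_oct = G_conf2oct(conf_vol)
    rec_oct = G_conf2oct(fake_conf)     # OCT -> confocal -> OCT
    rec_conf = G_oct2conf(fake_oct)     # confocal -> OCT -> confocal
    l1 = nn.functional.l1_loss
    return lam * (l1(rec_oct, oct_vol) + l1(rec_conf, conf_vol))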
Abstract: In the realm of medical image fusion, integrating information from various modalities is crucial for improving diagnostics and treatment planning, especially in retinal health, where important features manifest differently across imaging modalities. Existing deep learning-based approaches pay insufficient attention to retinal image fusion and thus fail to preserve sufficient anatomical structure and fine vessel detail. To address this, we propose the Topology-Aware Graph Attention Network (TaGAT) for multi-modal retinal image fusion, leveraging a novel Topology-Aware Encoder (TAE) with Graph Attention Networks (GAT) to effectively enhance spatial features with the graph topology of the retinal vasculature across modalities. The TAE encodes the base and detail features, extracted via a Long-short Range (LSR) encoder from retinal images, into a graph extracted from the retinal vessels. Within the TAE, the GAT-based Graph Information Update (GIU) block dynamically refines and aggregates the node features to generate topology-aware graph features. The updated graph features are combined with the base and detail features and decoded into a fused image. Our model outperforms state-of-the-art methods on Fluorescein Fundus Angiography (FFA) with Color Fundus (CF) and Optical Coherence Tomography (OCT) with confocal microscopy retinal image fusion. The source code can be accessed via https://github.com/xintian-99/TaGAT.
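A sketch of a GAT-based graph information update over a vessel graph, assuming node features are image features sampled at vessel-graph nodes and edges encode vessel connectivity; it uses torch_geometric's GATConv, and the layer sizes and residual design are assumptions rather than the released TaGAT code.

import torch
from torch_geometric.nn import GATConv

class GraphInfoUpdate(torch.nn.Module):
    def __init__(self, d_in=64, d_out=64, heads=4):
        super().__init__()
        self.gat1 = GATConv(d_in, d_out, heads=heads, concat=False)
        self.gat2 = GATConv(d_out, d_out, heads=heads, concat=False)
    def forward(self, x, edge_index):
        # x: (num_nodes, d_in) node features; edge_index: (2, num_edges) vessel connectivity
        h = torch.relu(self.gat1(x, edge_index))
        return x + self.gat2(h, edge_index)   # residual, topology-aware node update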
Abstract: The identification of artefacts, particularly B-lines, in lung ultrasound (LUS) is crucial for assisting clinical diagnosis, prompting the development of innovative methodologies. While the Cauchy proximal splitting (CPS) algorithm has demonstrated effective performance in B-line detection, the process is slow and generalizes poorly. This paper addresses these issues with a novel unsupervised deep unfolding network structure (DUCPS). The framework utilizes deep unfolding procedures to merge traditional model-based techniques with deep learning approaches. By unfolding the CPS algorithm into a deep network, DUCPS makes the parameters of the optimization algorithm learnable, thus enhancing generalization performance and facilitating rapid convergence. We conducted entirely unsupervised training using the Neighbor2Neighbor (N2N) and the Structural Similarity Index Measure (SSIM) losses. When combined with an improved line identification method proposed in this paper, state-of-the-art performance is achieved, with the recall and F2 score reaching 0.70 and 0.64, respectively. Notably, DUCPS significantly improves computational efficiency while eliminating the need for extensive data labeling, representing a notable advancement over both traditional algorithms and existing deep learning approaches.
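A generic sketch of the deep-unfolding idea: an iterative proximal-gradient scheme unrolled for a fixed number of iterations with learnable per-iteration parameters. The true Cauchy proximal step (which solves a per-pixel cubic equation) is replaced here by a soft-threshold placeholder purely for brevity; the operators A and At stand for an assumed forward model and its adjoint.

import torch, torch.nn as nn

class UnfoldedProx(nn.Module):
    def __init__(self, n_iters=10):
        super().__init__()
        self.step = nn.Parameter(torch.full((n_iters,), 0.1))   # learnable step sizes
        self.thr = nn.Parameter(torch.full((n_iters,), 0.05))   # learnable shrinkage levels
    def forward(self, y, A, At):
        x = At(y)
        for k in range(len(self.step)):
            grad = At(A(x) - y)                                  # data-fidelity gradient
            z = x - self.step[k] * grad
            x = torch.sign(z) * torch.relu(z.abs() - self.thr[k])  # placeholder for the Cauchy prox
        return x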
Abstract: Microscopy images acquired by multiple camera lenses or sensors in biological experiments offer a comprehensive understanding of the objects from diverse aspects. However, setups with multiple microscopes raise the possibility of misalignment of identical target features across modalities, making multimodal image registration essential. In this work, we build on previous successes in biological image translation (XAcGAN) and mono-modal image registration (RoTIR) to create a deep-learning-based model, Dual-Domain RoTIR (DD_RoTIR), that addresses these challenges. Because GAN-based translation models alone are inadequate for multimodal image registration, we perform the registration using a feature-matching algorithm based on Transformers and rotation-equivariant networks. Furthermore, hierarchical feature matching is employed, as multimodal image registration is more challenging. Results show that the DD_RoTIR model offers good applicability and robustness across multiple microscopy image datasets.
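A sketch of transformer-style coarse feature matching between two modalities via a dual-softmax over the similarity matrix, in the spirit of detector-free matchers; the hierarchical refinement stage and the exact DD_RoTIR matching criterion are omitted, and the temperature is an assumed hyperparameter.

import torch, torch.nn as nn

def coarse_match(feat_a, feat_b, temperature=0.1):
    # feat_a: (N, d), feat_b: (M, d) -- flattened coarse feature maps from each modality
    feat_a = nn.functional.normalize(feat_a, dim=-1)
    feat_b = nn.functional.normalize(feat_b, dim=-1)
    sim = feat_a @ feat_b.t() / temperature
    conf = sim.softmax(dim=0) * sim.softmax(dim=1)   # dual-softmax matching confidence
    matches = conf.argmax(dim=1)                     # best feat_b index per feat_a entry
    return matches, conf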
Abstract: Motivated by the challenge of filtering linear systems disturbed by non-Gaussian heavy-tailed noise, robust Kalman filters (RKFs) leveraging diverse heavy-tailed distributions have been introduced. However, RKFs rely on precise noise models, and large model errors can degrade their filtering performance. Moreover, the posterior approximation made by the employed variational Bayesian (VB) method can further reduce estimation precision. Here, we introduce an innovative RKF method, the RKFNet, which combines the heavy-tailed-distribution-based RKF framework with deep learning (DL) and eliminates the need for precise parameters of the heavy-tailed distributions. To reduce the VB approximation error, the mixing-parameter-based function and the scale matrix are estimated by the incorporated neural network structures. In addition, a stable training process is achieved by our proposed unsupervised scheduled sampling (USS) method, in which a loss function based on the Student's t (ST) distribution suppresses the influence of noise outliers and the filtering results of traditional RKFs serve as reference sequences. Furthermore, the RKFNet is evaluated against various RKFs and recurrent neural networks (RNNs) under three kinds of heavy-tailed measurement noise, and the simulation results showcase its efficacy in terms of estimation accuracy and efficiency.
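To illustrate the outlier-robust training objective, here is a minimal Student's t negative log-likelihood loss (up to an additive constant), which grows only logarithmically for large residuals and hence down-weights outliers; the degrees of freedom and scale values are illustrative hyperparameters, not the paper's settings.

import torch

def student_t_nll(pred, target, nu=3.0, sigma=1.0):
    r = (pred - target) / sigma
    # negative log-likelihood of a Student's t residual, up to a constant
    return 0.5 * (nu + 1.0) * torch.log1p(r ** 2 / nu).mean()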
Abstract: We present a novel ship wake simulation system for generating S-band Synthetic Aperture Radar (SAR) images, and demonstrate the use of such imagery for the classification of ships based on their wake signatures via a deep learning approach. Ship wakes are modeled through the linear superposition of wind-induced sea elevation and the Kelvin wake model of a moving ship. Our SAR imaging simulation takes into account frequency-dependent radar parameters, i.e., the complex dielectric constant ($\varepsilon$) and the relaxation rate ($\mu$) of seawater. The former was determined through the Debye model while the latter was estimated for S-band SAR based on pre-existing values for the L, C, and X-bands. The results show good agreement between simulated and real imagery upon visual inspection. The results of implementing different training strategies are also reported, showcasing a notable improvement in classifier accuracy achieved by integrating real and simulated SAR images during training.
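A sketch of the single-relaxation Debye model for the complex dielectric constant of seawater, with a conductivity term; the static permittivity, relaxation time, and conductivity values below are representative placeholders (they vary with temperature and salinity) and are not the paper's fitted parameters.

import numpy as np

EPS0 = 8.854e-12   # vacuum permittivity, F/m

def debye_seawater(f_hz, eps_s=72.0, eps_inf=4.9, tau=9e-12, sigma=4.0):
    w = 2 * np.pi * f_hz
    # eps = eps' - j eps'': Debye relaxation plus ionic conductivity contribution
    return eps_inf + (eps_s - eps_inf) / (1 + 1j * w * tau) - 1j * sigma / (w * EPS0)

print(debye_seawater(3.0e9))   # example evaluation at an S-band frequency (3 GHz)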
Abstract: Image registration is an essential process for aligning features of interest from multiple images. With the recent development of deep learning techniques, image registration approaches have advanced to a new level. In this work, we present 'Rotation-Equivariant network and Transformers for Image Registration' (RoTIR), a deep-learning-based method for the alignment of fish scale images captured by light microscopy. This approach overcomes the challenge of arbitrary rotation and translation detection, as well as the absence of ground truth data. We employ feature-matching approaches based on Transformers and general E(2)-equivariant steerable CNNs for model creation. In addition, an artificial training dataset is employed for semi-supervised learning. Results show that RoTIR successfully achieves fish scale image registration.
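A sketch of a rotation-equivariant feature extractor built with the e2cnn library for general E(2)-equivariant steerable CNNs; the 8-fold rotation group, layer widths, and kernel sizes are illustrative assumptions rather than the RoTIR architecture.

import torch
from e2cnn import gspaces, nn as enn

r2 = gspaces.Rot2dOnR2(N=8)                         # discrete rotations in 45-degree steps
in_type = enn.FieldType(r2, [r2.trivial_repr])      # single-channel input image
hid_type = enn.FieldType(r2, 8 * [r2.regular_repr]) # rotation-equivariant hidden features

backbone = enn.SequentialModule(
    enn.R2Conv(in_type, hid_type, kernel_size=5, padding=2),
    enn.ReLU(hid_type),
    enn.R2Conv(hid_type, hid_type, kernel_size=5, padding=2),
    enn.ReLU(hid_type),
)

x = enn.GeometricTensor(torch.randn(1, 1, 64, 64), in_type)
features = backbone(x).tensor                       # features that rotate with the input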