Abstract:This paper presents a novel shape prior segmentation method guided by the Harmonic Beltrami Signature (HBS). The HBS is a shape representation fully capturing 2D simply connected shapes, exhibiting resilience against perturbations and invariance to translation, rotation, and scaling. The proposed method integrates the HBS within a quasi-conformal topology preserving segmentation framework, leveraging shape prior knowledge to significantly enhance segmentation performance, especially for low-quality or occluded images. The key innovation lies in the bifurcation of the optimization process into two iterative stages: 1) The computation of a quasi-conformal deformation map, which transforms the unit disk into the targeted segmentation area, driven by image data and other regularization terms; 2) The subsequent refinement of this map is contingent upon minimizing the $L_2$ distance between its Beltrami coefficient and the reference HBS. This shape-constrained refinement ensures that the segmentation adheres to the reference shape(s) by exploiting the inherent invariance, robustness, and discerning shape discriminative capabilities afforded by the HBS. Extensive experiments on synthetic and real-world images validate the method's ability to improve segmentation accuracy over baselines, eliminate preprocessing requirements, resist noise corruption, and flexibly acquire and apply shape priors. Overall, the HBS segmentation framework offers an efficient strategy to robustly incorporate the shape prior knowledge, thereby advancing critical low-level vision tasks.
Abstract:This paper addresses the problem of modeling and estimating dynamic multi-valued mappings. While most mathematical models provide a unique solution for a given input, real-world applications often lack deterministic solutions. In such scenarios, estimating dynamic multi-valued mappings is necessary to suggest different reasonable solutions for each input. This paper introduces a deep neural network framework incorporating a generative network and a classification component. The objective is to model the dynamic multi-valued mapping between the input and output by providing a reliable uncertainty measurement. Generating multiple solutions for a given input involves utilizing a discrete codebook comprising finite variables. These variables are fed into a generative network along with the input, producing various output possibilities. The discreteness of the codebook enables efficient estimation of the output's conditional probability distribution for any given input using a classifier. By jointly optimizing the discrete codebook and its uncertainty estimation during training using a specially designed loss function, a highly accurate approximation is achieved. The effectiveness of our proposed framework is demonstrated through its application to various imaging problems, using both synthetic and real imaging data. Experimental results show that our framework accurately estimates the dynamic multi-valued mapping with uncertainty estimation.
Abstract:3D face registration is an important process in which a 3D face model is aligned and mapped to a template face. However, the task of 3D face registration becomes particularly challenging when dealing with partial face data, where only limited facial information is available. To address this challenge, this paper presents a novel deep learning-based approach that combines quasi-conformal geometry with deep neural networks for partial face registration. The proposed framework begins with a Landmark Detection Network that utilizes curvature information to detect the presence of facial features and estimate their corresponding coordinates. These facial landmark features serve as essential guidance for the registration process. To establish a dense correspondence between the partial face and the template surface, a registration network based on quasiconformal theories is employed. The registration network establishes a bijective quasiconformal surface mapping aligning corresponding partial faces based on detected landmarks and curvature values. It consists of the Coefficients Prediction Network, which outputs the optimal Beltrami coefficient representing the surface mapping. The Beltrami coefficient quantifies the local geometric distortion of the mapping. By controlling the magnitude of the Beltrami coefficient through a suitable activation function, the bijectivity and geometric distortion of the mapping can be controlled. The Beltrami coefficient is then fed into the Beltrami solver network to reconstruct the corresponding mapping. The surface registration enables the acquisition of corresponding regions and the establishment of point-wise correspondence between different partial faces, facilitating precise shape comparison through the evaluation of point-wise geometric differences at these corresponding regions. Experimental results demonstrate the effectiveness of the proposed method.
Abstract:Image segmentation plays a crucial role in extracting important objects of interest from images, enabling various applications. While existing methods have shown success in segmenting clean images, they often struggle to produce accurate segmentation results when dealing with degraded images, such as those containing noise or occlusions. To address this challenge, interactive segmentation has emerged as a promising approach, allowing users to provide meaningful input to guide the segmentation process. However, an important problem in interactive segmentation lies in determining how to incorporate minimal yet meaningful user guidance into the segmentation model. In this paper, we propose the quasi-conformal interactive segmentation (QIS) model, which incorporates user input in the form of positive and negative clicks. Users mark a few pixels belonging to the object region as positive clicks, indicating that the segmentation model should include a region around these clicks. Conversely, negative clicks are provided on pixels belonging to the background, instructing the model to exclude the region near these clicks from the segmentation mask. Additionally, the segmentation mask is obtained by deforming a template mask with the same topology as the object of interest using an orientation-preserving quasiconformal mapping. This approach helps to avoid topological errors in the segmentation results. We provide a thorough analysis of the proposed model, including theoretical support for the ability of QIS to include or exclude regions of interest or disinterest based on the user's indication. To evaluate the performance of QIS, we conduct experiments on synthesized images, medical images, natural images and noisy natural images. The results demonstrate the efficacy of our proposed method.
Abstract:Accurate analysis and classification of facial attributes are essential in various applications, from human-computer interaction to security systems. In this work, a novel approach to enhance facial classification and recognition tasks through the integration of 3D facial models with deep learning methods was proposed. We extract the most useful information for various tasks using the 3D Facial Model, leading to improved classification accuracy. Combining 3D facial insights with ResNet architecture, our approach achieves notable results: 100% individual classification, 95.4% gender classification, and 83.5% expression classification accuracy. This method holds promise for advancing facial analysis and recognition research.
Abstract:Images degraded by geometric distortions pose a significant challenge to imaging and computer vision tasks such as object recognition. Deep learning-based imaging models usually fail to give accurate performance for geometrically distorted images. In this paper, we propose the deformation-invariant neural network (DINN), a framework to address the problem of imaging tasks for geometrically distorted images. The DINN outputs consistent latent features for images that are geometrically distorted but represent the same underlying object or scene. The idea of DINN is to incorporate a simple component, called the quasiconformal transformer network (QCTN), into other existing deep networks for imaging tasks. The QCTN is a deep neural network that outputs a quasiconformal map, which can be used to transform a geometrically distorted image into an improved version that is closer to the distribution of natural or good images. It first outputs a Beltrami coefficient, which measures the quasiconformality of the output deformation map. By controlling the Beltrami coefficient, the local geometric distortion under the quasiconformal mapping can be controlled. The QCTN is lightweight and simple, which can be readily integrated into other existing deep neural networks to enhance their performance. Leveraging our framework, we have developed an image classification network that achieves accurate classification of distorted images. Our proposed framework has been applied to restore geometrically distorted images by atmospheric turbulence and water turbulence. DINN outperforms existing GAN-based restoration methods under these scenarios, demonstrating the effectiveness of the proposed framework. Additionally, we apply our proposed framework to the 1-1 verification of human face images under atmospheric turbulence and achieve satisfactory performance, further demonstrating the efficacy of our approach.
Abstract:Fast and accurate MRI reconstruction is a key concern in modern clinical practice. Recently, numerous Deep-Learning methods have been proposed for MRI reconstruction, however, they usually fail to reconstruct sharp details from the subsampled k-space data. To solve this problem, we propose a lightweight and accurate Edge Attention MRI Reconstruction Network (EAMRI) to reconstruct images with edge guidance. Specifically, we design an efficient Edge Prediction Network to directly predict accurate edges from the blurred image. Meanwhile, we propose a novel Edge Attention Module (EAM) to guide the image reconstruction utilizing the extracted edge priors, as inspired by the popular self-attention mechanism. EAM first projects the input image and edges into Q_image, K_edge, and V_image, respectively. Then EAM pairs the Q_image with K_edge along the channel dimension, such that 1) it can search globally for the high-frequency image features that are activated by the edge priors; 2) the overall computation burdens are largely reduced compared with the traditional spatial-wise attention. With the help of EAM, the predicted edge priors can effectively guide the model to reconstruct high-quality MR images with accurate edges. Extensive experiments show that our proposed EAMRI outperforms other methods with fewer parameters and can recover more accurate edges.
Abstract:Medical image segmentation aims to automatically extract anatomical or pathological structures in the human body. Most objects or regions of interest are of similar patterns. For example, the relative location and the relative size of the lung and the kidney differ little among subjects. Incorporating these morphology rules as prior knowledge into the segmentation model is believed to be an effective way to enhance the accuracy of the segmentation results. Motivated by this, we propose in this work the Topology-Preserving Segmentation Network (TPSN) which can predict segmentation masks with the same topology prescribed for specific tasks. TPSN is a deformation-based model that yields a deformation map through an encoder-decoder architecture to warp the template masks into a target shape approximating the region to segment. Comparing to the segmentation framework based on pixel-wise classification, deformation-based segmentation models that warp a template to enclose the regions are more convenient to enforce geometric constraints. In our framework, we carefully design the ReLU Jacobian regularization term to enforce the bijectivity of the deformation map. As such, the predicted mask by TPSN has the same topology as that of the template prior mask.
Abstract:In medical imaging, surface registration is extensively used for performing systematic comparisons between anatomical structures, with a prime example being the highly convoluted brain cortical surfaces. To obtain a meaningful registration, a common approach is to identify prominent features on the surfaces and establish a low-distortion mapping between them with the feature correspondence encoded as landmark constraints. Prior registration works have primarily focused on using manually labeled landmarks and solving highly nonlinear optimization problems, which are time-consuming and hence hinder practical applications. In this work, we propose a novel framework for the automatic landmark detection and registration of brain cortical surfaces using quasi-conformal geometry and convolutional neural networks. We first develop a landmark detection network (LD-Net) that allows for the automatic extraction of landmark curves given two prescribed starting and ending points based on the surface geometry. We then utilize the detected landmarks and quasi-conformal theory for achieving the surface registration. Specifically, we develop a coefficient prediction network (CP-Net) for predicting the Beltrami coefficients associated with the desired landmark-based registration and a mapping network called the disk Beltrami solver network (DBS-Net) for generating quasi-conformal mappings from the predicted Beltrami coefficients, with the bijectivity guaranteed by quasi-conformal theory. Experimental results are presented to demonstrate the effectiveness of our proposed framework. Altogether, our work paves a new way for surface-based morphometry and medical shape analysis.
Abstract:Medical image segmentation, which aims to automatically extract anatomical or pathological structures, plays a key role in computer-aided diagnosis and disease analysis. Despite the problem has been widely studied, existing methods are prone to topological errors. In medical imaging, the topology of the structure, such as the kidney or lung, is usually known. Preserving the topology of the structure in the segmentation process is of utmost importance for accurate image analysis. In this work, a novel learning-based segmentation model is proposed. A {\it topology-preserving segmentation network (TPSN)} is trained to give an accurate segmentation result of an input image that preserves the prescribed topology. TPSN is a deformation-based model that yields a deformation map through a UNet, which takes the medical image and a template mask as inputs. The main idea is to deform a template mask describing the prescribed topology by a diffeomorphism to segment the object in the image. The topology of the shape in the template mask is well preserved under the diffeomorphic map. The diffeomorphic property of the map is controlled by introducing a regularization term related to the Jacobian in the loss function. As such, a topology-preserving segmentation result can be guaranteed. Furthermore, a multi-scale TPSN is developed in this paper that incorporates multi-level information of images to produce more precise segmentation results. To evaluate our method, we applied the 2D TPSN on Ham10000 and 3D TPSN on KiTS21. Experimental results illustrate our method outperforms the baseline UNet segmentation model with/without connected-component analysis (CCA) by both the dice score and IoU score. Besides, results show that our method can produce reliable results even in challenging cases, where pixel-wise segmentation models by UNet and CCA fail to obtain accurate results.