Abstract: Efficiently and completely capturing the three-dimensional data of an object is a fundamental problem in industrial and robotic applications. The task of next-best-view (NBV) planning is to infer the pose of the next viewpoint from the data acquired so far and thereby progressively achieve a complete three-dimensional reconstruction. Many existing algorithms, however, suffer from a large computational burden due to their use of ray-casting. To address this, this paper proposes a projection-based NBV planning framework that selects the next best view extremely quickly while still ensuring complete scanning of the object. Specifically, the framework refits different types of voxel clusters into ellipsoids based on the voxel structure. The next best view is then selected from the candidate views using a projection-based viewpoint quality evaluation function combined with a global partitioning strategy. This process replaces ray-casting in the voxel structure and significantly improves computational efficiency. Comparative experiments against other algorithms in a simulation environment show that the proposed framework achieves roughly a tenfold efficiency improvement while capturing approximately the same coverage. Real-world experiments further demonstrate the efficiency and feasibility of the framework.
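A minimal sketch (not the paper's implementation) of the projection-based evaluation idea: ellipsoid centres fitted to voxel clusters are projected into a candidate camera pose, and the view is scored by the summed weight of ellipsoids that land inside the image, so no per-voxel ray-casting is needed. The function name, the weighting, and the pinhole-camera inputs are illustrative assumptions.

```python
import numpy as np

def score_view(centers, weights, R, t, K, width, height):
    """Score a candidate view by projecting ellipsoid centers (world frame) into the
    camera and summing the weights of those falling inside the image.
    centers: (N, 3); weights: (N,), e.g. unobserved surface area per ellipsoid;
    R, t: world-to-camera rotation and translation; K: 3x3 camera intrinsics."""
    cam = (R @ centers.T + t.reshape(3, 1)).T            # points in the camera frame
    in_front = cam[:, 2] > 0                              # keep points in front of the camera
    pix = (K @ cam.T).T
    pix = pix[:, :2] / pix[:, 2:3]                        # perspective division -> pixel coords
    in_img = (pix[:, 0] >= 0) & (pix[:, 0] < width) & \
             (pix[:, 1] >= 0) & (pix[:, 1] < height)
    return float(np.sum(weights[in_front & in_img]))      # higher = more unseen surface visible

# The next best view is then simply the candidate with the highest score, e.g.:
# best = max(candidates, key=lambda v: score_view(centers, weights, v.R, v.t, K, W, H))
```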
Abstract: Fourier Ptychographic Microscopy (FPM) is a computational imaging technique that enables high-resolution imaging over a large field of view. However, its application in the biomedical field has been limited by long image reconstruction times and poor noise robustness. In this paper, we propose a fast and robust FPM reconstruction method based on a physical neural network with a batch-update stochastic gradient descent (SGD) optimization strategy, which achieves attractive results at low signal-to-noise ratios and corrects multiple system parameters simultaneously. Our method leverages a random batch optimization approach that breaks away from the fixed sequential iteration order and gives greater attention to high-frequency information. The proposed method converges well even for low signal-to-noise-ratio data sets, such as dark-field images captured at short exposure times. As a result, it can greatly increase the image recording and reconstruction speed without any additional hardware modifications. By utilizing advanced deep learning optimizers and a parallel computation scheme, our method improves GPU computational efficiency and significantly reduces reconstruction costs. Experimental results demonstrate that our method achieves near real-time digital refocusing of a 1024 x 1024-pixel region of interest on consumer-grade GPUs. This approach significantly improves temporal resolution (by reducing the exposure time of dark-field images), noise resistance, and reconstruction speed, and can therefore efficiently promote the practical application of FPM in clinical diagnostics, digital pathology, and biomedical research. In addition, we believe our algorithmic scheme can help researchers quickly validate and implement FPM-related ideas. We invite requests for the full code via email.
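A minimal sketch, in PyTorch, of what a batch-update SGD reconstruction loop can look like under a standard FPM forward model (sub-spectrum crop, pupil multiplication, inverse FFT, intensity comparison): a random batch of LEDs is drawn at each step instead of sweeping them in a fixed sequential order. The Adam optimizer, the initialization, and all variable names are assumptions for illustration, not the authors' code; in practice the spectrum is usually initialized from an up-sampled bright-field image and the batch is processed in a vectorized rather than looped fashion.

```python
import torch

def fpm_batch_sgd(measured, kx, ky, pupil, hr_size, n_iters=200, batch=8, lr=0.05):
    """measured: (N, m, m) low-resolution intensities; kx, ky: per-LED crop centres
    in the (centred) high-res spectrum; pupil: (m, m) complex pupil; m assumed even."""
    m = measured.shape[-1]
    # Parameterize the high-res spectrum by real and imaginary parts (real leaf tensors).
    re = (torch.randn(hr_size, hr_size) * 1e-3).requires_grad_()
    im = (torch.randn(hr_size, hr_size) * 1e-3).requires_grad_()
    opt = torch.optim.Adam([re, im], lr=lr)
    N = measured.shape[0]
    for _ in range(n_iters):
        idx = torch.randperm(N)[:batch]                 # random LED batch, no fixed order
        opt.zero_grad()
        spec = torch.complex(re, im)
        loss = 0.0
        for i in idx:
            cx, cy = int(kx[i]), int(ky[i])
            crop = spec[cy - m // 2: cy + m // 2, cx - m // 2: cx + m // 2]
            lowres = torch.fft.ifft2(torch.fft.ifftshift(crop * pupil))
            loss = loss + torch.mean((lowres.abs() ** 2 - measured[i]) ** 2)
        loss.backward()
        opt.step()
    return torch.complex(re, im).detach()               # recovered complex spectrum
```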
Abstract: Single-pixel imaging (SPI) using a single-pixel detector is an unconventional imaging method with great application prospects for high-performance imaging in many fields. In particular, the recently proposed catadioptric panoramic ghost imaging (CPGI) extends the application potential of SPI to high-performance imaging with a wide field of view (FOV), for which demand is growing. However, the resolution of CPGI is limited by the hardware parameters of the digital micromirror device (DMD), which may not meet the needs of ultrahigh-resolution panoramic imaging that requires detailed information. To overcome this resolution limitation, we propose a panoramic SPI based on rotational subdivision (RSPSI). The key to the proposed RSPSI is to capture the entire panoramic scene by rotation-scanning with a rotating mirror tilted at 45°, so that a single pattern covering only one sub-FOV with a small FOV can modulate the entire panoramic FOV without interruption during a single pass of pattern projection. Then, based on temporal resolution subdivision, an image sequence of sub-FOVs subdivided from the entire panoramic FOV can be reconstructed with pixel-level or even subpixel-level horizontal shifts between adjacent sub-FOVs. Experimental results with a proof-of-concept setup show that a panoramic image can be obtained with a resolution of 10428*543, i.e., 5,662,404 pixels, which is more than 9.6 times higher than the resolution limit of CPGI using the same DMD. To the best of our knowledge, RSPSI is the first method to achieve megapixel resolution via SPI, and it can find potential applications in fields requiring ultrahigh-resolution imaging with a wide FOV.
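The scanning and modulation scheme itself is hardware-specific, but the final assembly step implied by the abstract, placing the reconstructed sub-FOV images into a panoramic canvas at horizontal offsets given by the known mirror rotation, can be sketched as below. This is only an illustration of the stitching; the rotational modulation and the SPI reconstruction of each sub-FOV are not modelled, and all names are hypothetical.

```python
import numpy as np

def assemble_panorama(sub_fovs, shifts, pano_width):
    """sub_fovs: list of (H, W) reconstructed sub-FOV images; shifts: horizontal pixel
    offset of each sub-FOV in the panorama, derived from the mirror rotation rate."""
    H, W = sub_fovs[0].shape
    pano = np.zeros((H, pano_width))
    count = np.zeros((H, pano_width))
    for img, s in zip(sub_fovs, shifts):
        s = int(round(s))
        end = min(s + W, pano_width)
        pano[:, s:end] += img[:, :end - s]      # accumulate overlapping sub-FOVs
        count[:, s:end] += 1
    return pano / np.maximum(count, 1)          # average where neighbouring sub-FOVs overlap
```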
Abstract: Infrared small target detection based on deep learning offers unique advantages in separating small targets from complex, dynamic backgrounds. However, the features of infrared small targets gradually weaken as the depth of a convolutional neural network (CNN) increases. To address this issue, we propose a novel method for detecting infrared small targets, the improved dense nested attention network (IDNANet), which is based on the transformer architecture. We preserve the dense nested structure of the dense nested attention network (DNANet) and introduce the Swin Transformer in the feature extraction stage to enhance the continuity of features. Furthermore, we integrate the ACmix attention structure into the dense nested structure to enhance the features of the intermediate layers. Additionally, we design a weighted dice binary cross-entropy (WD-BCE) loss function to mitigate the negative impact of foreground-background imbalance in the samples. Moreover, we develop a dataset specifically for infrared small targets, called BIT-SIRST. The dataset comprises a large number of real-world targets with manually annotated labels, as well as synthetic data with corresponding labels. We evaluate the effectiveness of our method through experiments on public datasets. In comparison with other state-of-the-art methods, our approach performs better in terms of probability of detection ($P_d$), false-alarm rate ($F_a$), and mean intersection over union ($mIoU$). The $mIoU$ reaches 90.89 on the NUDT-SIRST dataset and 79.72 on the NUAA-SIRST dataset.
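The abstract names a weighted dice binary cross-entropy (WD-BCE) loss but does not give its exact form; below is a minimal PyTorch sketch of one plausible weighted combination of a soft Dice term and a BCE term for binary masks. The weights and the per-sample averaging are assumptions, not the paper's definition.

```python
import torch
import torch.nn.functional as F

def wd_bce_loss(logits, target, dice_w=0.5, bce_w=0.5, eps=1e-6):
    """logits: raw network output, target: {0, 1} mask, both of shape (B, 1, H, W)."""
    prob = torch.sigmoid(logits)
    inter = (prob * target).sum(dim=(1, 2, 3))
    union = prob.sum(dim=(1, 2, 3)) + target.sum(dim=(1, 2, 3))
    dice = 1.0 - (2.0 * inter + eps) / (union + eps)            # soft Dice loss per sample
    bce = F.binary_cross_entropy_with_logits(logits, target.float())
    return dice_w * dice.mean() + bce_w * bce                   # weighted combination
```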
Abstract: Fourier single-pixel imaging (FSI) is a data-efficient form of single-pixel imaging (SPI). However, obtaining higher imaging quality from fewer measurements remains a serious challenge, which limits the development of real-time SPI. In this work, a uniform-sampling foveated FSI (UFFSI) is proposed with three features, uniform sampling, effective sampling, and a flexible fovea, to achieve high-efficiency, high-quality under-sampled SPI, even for large-scale scenes. First, by flexibly using the three proposed foveated pattern structures, data redundancy is reduced significantly because high resolution (HR) is required only on regions of interest (ROIs), which radically reduces the total number of measurements. Next, through non-uniform weight distribution processing, the non-uniform spatial sampling is transformed into uniform sampling, so the fast Fourier transform can be applied directly and accurately to obtain high imaging quality from further reduced, under-sampled measurements. Experimentally, at a sampling ratio of 0.0084 relative to HR FSI with 1024*768 pixels, UFFSI with 255*341 cells (an 89% reduction in data redundancy) achieves significantly better imaging quality in the ROI, sufficient to meet imaging needs. We hope this work can provide a breakthrough for future real-time SPI.
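For context, a minimal numpy sketch of conventional under-sampled Fourier single-pixel imaging, the baseline that UFFSI builds on: each selected Fourier coefficient is acquired by four-step phase-shifting of sinusoidal patterns, and the partial spectrum is inverted with an inverse FFT. The foveated cell structures and the non-uniform-to-uniform weighting of UFFSI are not reproduced here, and overall scale factors are omitted.

```python
import numpy as np

def fsi_reconstruct(scene, freqs):
    """Simulate under-sampled FSI: freqs is a list of integer (u, v) frequency pairs."""
    H, W = scene.shape
    y, x = np.mgrid[0:H, 0:W]
    spectrum = np.zeros((H, W), dtype=complex)
    for (u, v) in freqs:
        D = []
        for phi in (0.0, np.pi / 2, np.pi, 3 * np.pi / 2):
            pattern = 0.5 + 0.5 * np.cos(2 * np.pi * (u * x / W + v * y / H) + phi)
            D.append(np.sum(pattern * scene))           # bucket (single-pixel) measurement
        coeff = (D[0] - D[2]) + 1j * (D[1] - D[3])      # four-step phase-shifting
        spectrum[v % H, u % W] = coeff
        spectrum[(-v) % H, (-u) % W] = np.conj(coeff)   # Hermitian symmetry of a real image
    return np.real(np.fft.ifft2(spectrum))
```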
Abstract: Fourier Ptychographic Microscopy (FPM) is a computational technique that achieves imaging with a large space-bandwidth product. It addresses the trade-off between a large field of view and high resolution by fusing information from multiple images taken under varying illumination angles. Nevertheless, the conventional FPM framework suffers from long acquisition times and a heavy computational burden. In this paper, we propose a novel physical neural network that generates an adaptive illumination mode by incorporating temporally encoded illumination modes as a distinct layer, aiming to improve acquisition and computation efficiency. Both simulations and experiments have been conducted to validate the feasibility and effectiveness of the proposed method. It is worth mentioning that, unlike previous works that obtain the intensity of a multiplexed illumination by post-combining low-resolution images acquired under sequential illumination, our experimental data are captured directly by turning on multiple LEDs with a coded illumination pattern. Our method exhibits state-of-the-art performance in terms of both detail fidelity and imaging speed across a wide range of evaluation criteria.
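A minimal numpy sketch of the multiplexed-illumination forward model that such a coded illumination layer rests on: because the LEDs are mutually incoherent, the recorded intensity for one coded pattern is the weighted sum, over the switched-on LEDs, of the low-resolution intensities produced by the corresponding sub-spectrum crops. The variable names and the binary/continuous LED code are illustrative assumptions.

```python
import numpy as np

def multiplexed_intensity(hr_spectrum, pupil, led_code, kx, ky, m):
    """hr_spectrum: centred high-resolution object spectrum; pupil: (m, m) complex pupil;
    led_code: (N,) weights of the LEDs turned on together; kx, ky: per-LED crop centres."""
    total = np.zeros((m, m))
    for i, w in enumerate(led_code):
        if w == 0:
            continue
        crop = hr_spectrum[ky[i] - m // 2: ky[i] + m // 2,
                           kx[i] - m // 2: kx[i] + m // 2]
        field = np.fft.ifft2(np.fft.ifftshift(crop * pupil))
        total += w * np.abs(field) ** 2       # incoherent LEDs: intensities add
    return total
```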
Abstract: The illumination patterns of computational ghost imaging (CGI) systems suffer from reduced contrast when passing through a scattering medium, which causes the effective information in the reconstruction to be drowned out by noise. A two-dimensional (2D) Gaussian filter performs a linear smoothing operation over the whole image for denoising. It can be combined with the linear reconstruction algorithms of CGI to obtain noise-reduced results directly, without post-processing. However, it blurs image edges while denoising, and a suitable standard deviation is difficult to choose in advance, especially in an unknown scattering environment. In this work, we subtly exploit the characteristics of CGI to solve both of these problems. We develop a kind of modified Hadamard pattern based on the 2D Gaussian filter and the differential-operation features of Hadamard-based CGI. We analyze and demonstrate that illuminating with ordinary Hadamard patterns but reconstructing with our modified Hadamard patterns (MHCGI) enhances the robustness of CGI against a turbid scattering medium. Our method not only directly yields noise-reduced results without blurred edges but also requires only an approximate standard deviation, which can therefore be set in advance. Experimental results on transmitted and reflected targets demonstrate the feasibility of our method, which helps promote the practical application of CGI in scattering environments.
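A minimal numpy sketch of one plausible reading of this scheme: illuminate with ordinary Hadamard patterns, but correlate the mean-removed bucket signals with Gaussian-filtered versions of the same patterns during reconstruction, so the smoothing is folded into the linear reconstruction rather than applied as post-processing. The pattern ordering, filter boundary mode, and normalization are assumptions, not the paper's exact construction.

```python
import numpy as np
from scipy.linalg import hadamard
from scipy.ndimage import gaussian_filter

def mhcgi_reconstruct(bucket, n, sigma=1.0):
    """bucket: (n*n,) single-pixel signals recorded under the n x n Hadamard patterns
    (row i of hadamard(n*n) reshaped to n x n); n must be a power of two."""
    Hmat = hadamard(n * n).astype(float)
    s = bucket - bucket.mean()                           # differential / mean-removed signal
    img = np.zeros((n, n))
    for i in range(n * n):
        filtered = gaussian_filter(Hmat[i].reshape(n, n), sigma, mode="wrap")
        img += s[i] * filtered                           # correlate with the *modified* pattern
    return img

# Simulated acquisition with the unmodified patterns (illumination side):
# bucket[i] = np.sum(Hmat[i].reshape(n, n) * scene)
```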
Abstract: We propose a single-shot quantitative differential phase contrast (DPC) method with polarization-multiplexed illumination. In the illumination module of our system, a programmable LED array is divided into four quadrants, each covered with a polarizing film at one of four different polarization angles. In the imaging module, we use a polarization camera with polarizers placed in front of the pixels. By matching the polarization angles of the polarizing films on the custom LED array to those of the polarizers in the camera, two sets of asymmetric-illumination images can be computed from a single-shot acquisition. Combined with the phase transfer function, these allow us to calculate the quantitative phase of the sample. We present the design, the implementation, and experimental image data demonstrating the ability of our method to obtain quantitative phase images of a phase resolution target as well as HeLa cells.
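A minimal numpy sketch of the standard quantitative DPC inversion step the abstract relies on, assuming the asymmetric-illumination image pairs have already been extracted from the polarization-multiplexed single shot and that the phase transfer functions of the system are precomputed; a simple Tikhonov-regularized deconvolution is used here, and the names are illustrative.

```python
import numpy as np

def dpc_phase(pairs, tfs, reg=1e-2):
    """pairs: list of (I_a, I_b) images from complementary asymmetric illuminations;
    tfs: matching list of phase transfer functions (same shape, Fourier domain)."""
    num, den = 0.0, reg
    for (Ia, Ib), Hf in zip(pairs, tfs):
        dpc = (Ia - Ib) / (Ia + Ib + 1e-12)              # normalized DPC image
        num = num + np.conj(Hf) * np.fft.fft2(dpc)
        den = den + np.abs(Hf) ** 2
    return np.real(np.fft.ifft2(num / den))              # Tikhonov-regularized phase estimate
```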
Abstract: We propose an image resolution improvement method for optical coherence tomography (OCT) based on sparse continuous deconvolution. Traditional deconvolution techniques such as Lucy-Richardson deconvolution suffer from artifact convergence after a small number of iterations, which limits their practical application. In this work, we take advantage of prior knowledge about the sample's sparsity and continuity to constrain the deconvolution iterations. Sparsity is used to achieve the resolution improvement through a resolution-preserving regularization term, while continuity, based on the correlation of grayscale values along different directions, is introduced through a continuity regularization term to mitigate excessive image sparsity and to reduce noise. The Bregman splitting technique is then used to solve the resulting optimization problem. Both numerical simulations and experiments on phantoms and biological samples show that our method effectively suppresses the artifacts of traditional deconvolution techniques while providing a clear resolution improvement. It achieves a nearly twofold resolution improvement for the phantom bead images, which can be evaluated quantitatively.
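The abstract does not state the cost function explicitly; purely to illustrate the structure being described (data fidelity plus a sparsity term and a continuity term, solved by Bregman splitting), a generic form of such an objective is

$$\hat{x} \;=\; \arg\min_{x \ge 0}\; \| h \ast x - y \|_2^2 \;+\; \lambda_s \| x \|_1 \;+\; \lambda_c R_c(x),$$

where $y$ is the measured OCT image, $h$ the system point spread function, $\| x \|_1$ the sparsity (resolution-preserving) term, $R_c(x)$ a continuity term built from the correlation of grayscale values along different directions, and $\lambda_s$, $\lambda_c$ the balancing weights. The exact regularizers used in the paper may differ from this generic form.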
Abstract: Extensive research demonstrates that the attention mechanism in convolutional neural networks (CNNs) effectively improves accuracy, but few works design attention mechanisms with large receptive fields. In this work, we propose a novel attention method named Rega-Net that increases CNN accuracy by enlarging the receptive field. Inspired by the mechanism of the human retina, we design convolutional kernels that resemble the non-uniformly distributed structure of the human retina. We then sample variable-resolution values from the Gabor function distribution and fill these values into the retina-like kernels. This distribution makes important features more visible at the center of the receptive field. We further design an attention module built on these retina-like kernels. Experiments demonstrate that our Rega-Net achieves 79.963\% top-1 accuracy on ImageNet-1K classification and 43.1\% mAP on COCO2017 object detection. The mAP of Rega-Net is up to 3.5\% higher than that of the baseline networks.
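A minimal numpy sketch of the kernel-construction idea: a Gabor function is sampled at positions that are remapped radially so that sampling is denser near the kernel centre, mimicking a retina-like receptive field, and the sampled values fill a convolution kernel. The remapping law and all parameters are illustrative assumptions, not the paper's.

```python
import numpy as np

def retina_gabor_kernel(size=7, wavelength=4.0, sigma=2.0, gamma=0.5, alpha=1.5):
    """Build a size x size kernel: tap positions are compressed toward the centre
    (denser sampling there), then filled with Gabor-function values."""
    half = size // 2
    yy, xx = np.mgrid[-half:half + 1, -half:half + 1].astype(float)
    r = np.sqrt(xx ** 2 + yy ** 2)
    scale = ((r + 1e-6) / half) ** (alpha - 1.0)   # < 1 inside the kernel: finer steps near centre
    xs, ys = xx * scale, yy * scale                # retina-like, non-uniform sample positions
    gabor = np.exp(-(xs ** 2 + (gamma * ys) ** 2) / (2 * sigma ** 2)) \
            * np.cos(2 * np.pi * xs / wavelength)
    return gabor / np.abs(gabor).sum()             # normalize for use as a fixed conv kernel

# Such a kernel can, for example, be placed in a depthwise convolution inside an attention module.
```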