Abstract:Coded Aperture Snapshot Spectral Imaging (CASSI) system has great advantages over traditional methods in dynamically acquiring Hyper-Spectral Image (HSI), but there are the following problems. 1) Traditional mask relies on random patterns or analytical design, both of which limit the performance improvement of CASSI. 2) Existing high-quality reconstruction algorithms are slow in reconstruction and can only reconstruct scene information offline. To address the above two problems, this paper designs the AMDC-CASSI system, introducing RGB camera with CASSI based on Adaptive-Mask as multimodal input to improve the reconstruction quality. The existing SOTA reconstruction schemes are based on transformer, but the operation of self-attention pulls down the operation efficiency of the network. In order to improve the inference speed of the reconstruction network, this paper proposes An MLP Architecture for Adaptive-Mask-based Dual-Camera (MLP-AMDC) to replace the transformer structure of the network. Numerous experiments have shown that MLP performs no less well than transformer-based structures for HSI reconstruction, while MLP greatly improves the network inference speed and has less number of parameters and operations, our method has a 8 db improvement over SOTA and at least a 5-fold improvement in reconstruction speed. (https://github.com/caizeyu1992/MLP-AMDC.)
Abstract:Deep learning methods are developing rapidly in coded aperture snapshot spectral imaging (CASSI). The number of parameters and FLOPs of existing state-of-the-art methods (SOTA) continues to increase, but the reconstruction accuracy improves slowly. Current methods still face two problems: 1) The performance of the spatial light modulator (SLM) is not fully developed due to the limitation of fixed Mask coding. 2) The single input limits the network performance. In this paper we present a dynamic-mask-based dual camera system, which consists of an RGB camera and a CASSI system running in parallel. First, the system learns the spatial feature distribution of the scene based on the RGB images, then instructs the SLM to encode each scene, and finally sends both RGB and CASSI images to the network for reconstruction. We further designed the DMDC-net, which consists of two separate networks, a small-scale CNN-based dynamic mask network for dynamic adjustment of the mask and a multimodal reconstruction network for reconstruction using RGB and CASSI measurements. Extensive experiments on multiple datasets show that our method achieves more than 9 dB improvement in PSNR over the SOTA. (https://github.com/caizeyu1992/DMDC)
Abstract:Spectral image reconstruction is an important task in snapshot compressed imaging. This paper aims to propose a new end-to-end framework with iterative capabilities similar to a deep unfolding network to improve reconstruction accuracy, independent of optimization conditions, and to reduce the number of parameters. A novel framework called the reversible-prior-based method is proposed. Inspired by the reversibility of the optical path, the reversible-prior-based framework projects the reconstructions back into the measurement space, and then the residuals between the projected data and the real measurements are fed into the network for iteration. The reconstruction subnet in the network then learns the mapping of the residuals to the true values to improve reconstruction accuracy. In addition, a novel spectral-spatial transformer is proposed to account for the global correlation of spectral data in both spatial and spectral dimensions while balancing network depth and computational complexity, in response to the shortcomings of existing transformer-based denoising modules that ignore spatial texture features or learn local spatial features at the expense of global spatial features. Extensive experiments show that our SST-ReversibleNet significantly outperforms state-of-the-art methods on simulated and real HSI datasets, while requiring lower computational and storage costs. https://github.com/caizeyu1992/SST