Abstract: Novel-view synthesis based on visible light has been extensively studied. Compared with visible-light imaging, thermal infrared imaging offers all-weather operation and strong penetration, opening up more possibilities for reconstruction at night and in adverse weather. However, thermal infrared imaging is affected by physical phenomena such as atmospheric transmission effects and thermal conduction, which hinder the precise reconstruction of fine details in thermal infrared scenes and show up as floaters and indistinct edge features in synthesized images. To address these limitations, this paper introduces a physics-induced 3D Gaussian splatting method named Thermal3D-GS. Thermal3D-GS first models atmospheric transmission effects and thermal conduction in three-dimensional media using neural networks. In addition, a temperature consistency constraint is incorporated into the optimization objective to improve the reconstruction accuracy of thermal infrared images. To validate the method, we create the first large-scale benchmark dataset for this field, the Thermal Infrared Novel-view Synthesis Dataset (TI-NSD). The dataset comprises 20 authentic thermal infrared video scenes covering indoor, outdoor, and UAV (Unmanned Aerial Vehicle) scenarios, totaling 6,664 frames of thermal infrared image data. Experiments on this dataset verify the effectiveness of Thermal3D-GS: our method outperforms the baseline method by 3.03 dB in PSNR and substantially reduces the floaters and indistinct edge features present in the baseline. Our dataset and codebase will be released at \href{https://github.com/mzzcdf/Thermal3DGS}{\textcolor{red}{Thermal3DGS}}.
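The PSNR figure quoted in this abstract can be made concrete with a short sketch. The function below uses the standard definition of peak signal-to-noise ratio; the function name, the `max_value` parameter, and the assumption that images are normalized to [0, 1] are illustrative choices, not details taken from the paper or its codebase.

```python
import numpy as np


def psnr(reference, estimate, max_value=1.0):
    """Peak signal-to-noise ratio in dB between two images of equal shape.

    Assumption: pixel values lie in [0, max_value].
    """
    reference = np.asarray(reference, dtype=np.float64)
    estimate = np.asarray(estimate, dtype=np.float64)
    mse = np.mean((reference - estimate) ** 2)
    if mse == 0.0:
        # Identical images: PSNR is unbounded.
        return float("inf")
    return 10.0 * np.log10(max_value ** 2 / mse)
```

Under this definition, and for a fixed peak value, a 3.03 dB gain corresponds to roughly halving the mean squared error, since 10^(0.303) is about 2.01.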
Abstract: Most existing deep learning-based image restoration methods aim to remove degradation with uniform spatial distribution and constant intensity, and make insufficient use of prior knowledge about the degradation. Here we bootstrap deep neural networks to suppress complex image degradation whose intensity varies spatially, by exploiting prior knowledge from the degraded images. Specifically, we propose an efficient multi-frame image restoration network (DparNet) with a wide & deep architecture, which integrates degraded images and prior knowledge of the degradation to reconstruct images with ideal clarity and stability. The degradation prior is learned directly from the degraded images in the form of a key degradation parameter matrix, without requiring any external knowledge. The wide & deep architecture in DparNet lets the learned parameters directly modulate the final restoration results, enabling spatially and intensity-adaptive image restoration. We demonstrate the proposed method on two representative image restoration applications: image denoising and suppression of atmospheric turbulence effects in images. Two large datasets, containing 109,536 and 49,744 images respectively, were constructed to support our experiments. The experimental results show that DparNet significantly outperforms state-of-the-art (SoTA) methods in restoration performance and network efficiency. More importantly, by utilizing the learned degradation parameters via wide & deep learning, we improve the PSNR of image restoration by 0.6 to 1.1 dB with an increase of less than 2% in the number of model parameters and in computational complexity. Our work suggests that degraded images may hide key information about the degradation process, which can be utilized to boost spatially and intensity-adaptive image restoration.
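To illustrate what degradation with spatially variable intensity can look like, here is a minimal synthetic sketch: Gaussian noise whose standard deviation is given per pixel by a map. The function name, the ramp-shaped `sigma_map`, and the image size are hypothetical and chosen only for illustration; this is not the data generation procedure or the parameter matrix used by DparNet.

```python
import numpy as np


def add_spatially_variable_noise(image, sigma_map, seed=0):
    """Add zero-mean Gaussian noise whose std is set per pixel by sigma_map."""
    rng = np.random.default_rng(seed)
    noise = rng.standard_normal(image.shape) * sigma_map
    return image + noise


h, w = 64, 64
clean = np.full((h, w), 0.5)

# Hypothetical degradation map: noise std ramps from 0.0 (left) to 0.2 (right).
sigma_map = np.tile(np.linspace(0.0, 0.2, w), (h, 1))

noisy = add_spatially_variable_noise(clean, sigma_map)
```

A restoration method that assumes one constant noise level would be mismatched on at least one side of such an image, which is the motivation for estimating a per-pixel degradation map.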
Abstract: The Segment Anything Model (SAM) is a promptable segmentation model recently introduced by Meta AI that has demonstrated its prowess across various fields beyond image segmentation alone. SAM can accurately segment images from diverse fields and generate a variety of masks. We found that this ability can be leveraged to pretrain models for specific fields. Accordingly, we propose a framework that uses SAM to generate pseudo labels for pretraining thermal infrared image segmentation models. The proposed framework improves the segmentation accuracy of specific categories beyond that of the SOTA ImageNet-pretrained model. It presents a novel approach to collaborating with models trained on large-scale data, such as SAM, to address problems in specialized fields. We also generated a large-scale thermal infrared segmentation dataset for pretraining, which contains over 100,000 images with pixel-level annotation labels. This approach offers an effective solution for working with large models in specialized fields where label annotation is challenging. Our code is available at https://github.com/chenjzBUAA/SATIR
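One possible way to turn a set of class-agnostic masks into a pseudo-label map is sketched below. It assumes each mask is a record with a boolean `segmentation` array and an integer `area`, which is the shape of record produced by SAM's automatic mask generator; the function name and the largest-first painting order are hypothetical choices and are not taken from the SATIR code.

```python
import numpy as np


def masks_to_label_map(masks, shape):
    """Merge binary masks into one integer label map (0 means unlabeled).

    Assumption: each mask is a dict with a boolean 'segmentation' array of
    the given shape and an integer 'area' (its number of True pixels).
    """
    label_map = np.zeros(shape, dtype=np.int32)
    # Paint larger masks first so smaller, overlapping masks remain visible.
    ordered = sorted(masks, key=lambda m: m["area"], reverse=True)
    for index, mask in enumerate(ordered, start=1):
        label_map[mask["segmentation"]] = index
    return label_map
```

The resulting map assigns region indices rather than semantic classes, so it would suit pretraining on region structure; mapping regions to categories would need an additional step.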