Abstract:These recent years have witnessed that convolutional neural network (CNN)-based methods for detecting infrared small targets have achieved outstanding performance. However, these methods typically employ standard convolutions, neglecting to consider the spatial characteristics of the pixel distribution of infrared small targets. Therefore, we propose a novel pinwheel-shaped convolution (PConv) as a replacement for standard convolutions in the lower layers of the backbone network. PConv better aligns with the pixel Gaussian spatial distribution of dim small targets, enhances feature extraction, significantly increases the receptive field, and introduces only a minimal increase in parameters. Additionally, while recent loss functions combine scale and location losses, they do not adequately account for the varying sensitivity of these losses across different target scales, limiting detection performance on dim-small targets. To overcome this, we propose a scale-based dynamic (SD) Loss that dynamically adjusts the influence of scale and location losses based on target size, improving the network's ability to detect targets of varying scales. We construct a new benchmark, SIRST-UAVB, which is the largest and most challenging dataset to date for real-shot single-frame infrared small target detection. Lastly, by integrating PConv and SD Loss into the latest small target detection algorithms, we achieved significant performance improvements on IRSTD-1K and our SIRST-UAVB dataset, validating the effectiveness and generalizability of our approach. Code -- https://github.com/JN-Yang/PConv-SDloss-Data
Abstract:Photoacoustic imaging (PAI) combines the high contrast of optical imaging with the deep penetration depth of ultrasonic imaging, showing great potential in cerebrovascular disease detection. However, the ultrasonic wave suffers strong attenuation and multi-scattering when it passes through the skull tissue, resulting in the distortion of the collected photoacoustic (PA) signal. In this paper, inspired by the principles of deep learning and non-line-of-sight (NLOS) imaging, we propose an image reconstruction framework named HDN (Hybrid Deep-learning and Non-line-of-sight), which consists of the signal extraction part and difference utilization part. The signal extraction part is used to correct the distorted signal and reconstruct an initial image. The difference utilization part is used to make further use of the signal difference between the distorted signal and corrected signal, reconstructing the residual image between the initial image and the target image. The test results on a PA digital brain simulation dataset show that compared with the traditional delay-and-sum (DAS) method and deep-learning-based method, HDN achieved superior performance in both signal correction and image reconstruction. Specifically for the SSIM index, the HDN reached 0.606 in imaging results, compared to 0.154 for the DAS method and 0.307 for the deep-learning-based method.