Abstract: Recently, zero-shot anomaly detection (ZSAD) has emerged as a pivotal paradigm for identifying defects in unseen categories without requiring target samples during training. However, existing ZSAD methods struggle to delineate the boundaries of small and complex defects because their representations are insufficient. Most of them rely on a single manually designed prompt, which fails to generalize across diverse objects and anomalies. In this paper, we propose MFP-CLIP, a novel prompt-based CLIP framework that explores the efficacy of multi-form prompts for zero-shot industrial anomaly detection. We employ an image-to-text prompting (I2TP) mechanism to better represent the object in the image. MFP-CLIP enhances the perception of multi-scale and complex anomalies through self-prompting (SP) and a multi-patch feature aggregation (MPFA) module. To localize defects precisely, we introduce a mask prompting (MP) module that guides the model to focus on potential anomaly regions. Extensive experiments on two widely used industrial anomaly detection benchmarks, MVTecAD and VisA, demonstrate MFP-CLIP's superiority in ZSAD.
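As a rough illustration of the prompt-based scoring that CLIP-style ZSAD methods generally build on, the sketch below compares each image-patch embedding with a "normal" and an "abnormal" text-prompt embedding and turns the two cosine similarities into an anomaly probability. It is not MFP-CLIP itself: the I2TP, SP, MPFA, and MP modules are not modeled. The function names, the toy 2-D embeddings, the temperature of 0.07, and the max-pooling into an image-level score are all assumptions made for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def anomaly_score(patch_emb, normal_text_emb, abnormal_text_emb, temperature=0.07):
    """Two-way softmax over prompt similarities; returns P(abnormal).

    The temperature value is an assumption, not taken from the abstract.
    """
    s_normal = cosine(patch_emb, normal_text_emb) / temperature
    s_abnormal = cosine(patch_emb, abnormal_text_emb) / temperature
    shift = max(s_normal, s_abnormal)  # subtract the max for numerical stability
    e_normal = math.exp(s_normal - shift)
    e_abnormal = math.exp(s_abnormal - shift)
    return e_abnormal / (e_normal + e_abnormal)


def anomaly_map(patch_embs, normal_text_emb, abnormal_text_emb):
    """Score every patch in a 2-D grid of patch embeddings."""
    return [
        [anomaly_score(p, normal_text_emb, abnormal_text_emb) for p in row]
        for row in patch_embs
    ]


def image_score(amap):
    """Image-level score as the maximum patch score (an assumed pooling choice)."""
    return max(max(row) for row in amap)


# Toy usage with hypothetical 2-D embeddings standing in for encoder outputs.
normal_prompt = [1.0, 0.0]
abnormal_prompt = [0.0, 1.0]
patches = [[[1.0, 0.1], [0.9, 0.2]], [[0.1, 1.0], [1.0, 0.0]]]
toy_map = anomaly_map(patches, normal_prompt, abnormal_prompt)
toy_image_score = image_score(toy_map)
```

In a real pipeline the embeddings would come from the CLIP image and text encoders, and the patch-level map would be upsampled to the input resolution before localization.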
Abstract: Space-based visible cameras are important sensors for space situational awareness during proximity operations. However, visible cameras are easily affected by the low illumination of the space environment. Recently, deep learning approaches have achieved remarkable success in enhancing natural-image datasets, but they have seldom been applied in space because of the data bottleneck. In this article, we propose a data-driven method, based on a diffusion model, for low-light image enhancement (LLIE) of spinning targets in the space environment. First, a dataset collection scheme is devised. To reduce the domain gap and improve the diversity and quality of the dataset, we collect data with a camera on a ground-test system that imitates the low-light conditions and relative attitude changes of a satellite in space. The satellite motion is controlled by a 6-DoF robot. To generate different poses, an advanced sampling method is combined with collision detection in physical simulation. The entire process is automated. Based on our dataset, a novel diffusion model is proposed. The diffusion and denoising processes are conducted directly on the grayscale channel to save computational resources. To take advantage of the information in the RGB channels, we rescale the RGB feature maps and insert them into the downsampling layers to aid feature extraction. Results show that our method achieves better light enhancement and competitive image quality compared with previous methods. To the best of our knowledge, this is the first work on LLIE using a diffusion model.
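To make the idea of running diffusion on a single grayscale channel concrete, the following is a minimal sketch of the standard DDPM forward (noising) step applied to a grayscale image stored as a flat list of pixel values. It is not the paper's model: the linear beta schedule, the BT.601 luma weights used for the RGB-to-gray conversion, and all function names are assumptions, and the learned denoiser and the RGB feature injection are omitted.

```python
import math
import random


def rgb_to_gray(pixel):
    """Convert one (r, g, b) pixel to luma using ITU-R BT.601 weights (assumed)."""
    r, g, b = pixel
    return 0.299 * r + 0.587 * g + 0.114 * b


def linear_betas(num_steps, beta_start=1e-4, beta_end=0.02):
    """Linear noise schedule; the endpoints are common defaults, assumed here."""
    if num_steps == 1:
        return [beta_start]
    step = (beta_end - beta_start) / (num_steps - 1)
    return [beta_start + i * step for i in range(num_steps)]


def alpha_bars(betas):
    """Cumulative products of (1 - beta_t), one value per timestep."""
    out, prod = [], 1.0
    for beta in betas:
        prod *= 1.0 - beta
        out.append(prod)
    return out


def q_sample(x0, t, abar, rng):
    """Draw x_t ~ N(sqrt(abar_t) * x0, (1 - abar_t) * I); also return the noise."""
    signal = math.sqrt(abar[t])
    sigma = math.sqrt(1.0 - abar[t])
    noise = [rng.gauss(0.0, 1.0) for _ in x0]
    x_t = [signal * x + sigma * e for x, e in zip(x0, noise)]
    return x_t, noise


# Toy usage: three RGB pixels reduced to one channel, then noised at step 500.
rgb_pixels = [(0.2, 0.3, 0.1), (0.5, 0.5, 0.5), (0.9, 0.8, 0.7)]
gray = [rgb_to_gray(p) for p in rgb_pixels]
schedule = alpha_bars(linear_betas(1000))
noisy_gray, added_noise = q_sample(gray, 500, schedule, random.Random(0))
```

Working on one channel instead of three reduces the size of the tensor being noised and denoised by a factor of three, which is consistent with the computational saving the abstract describes.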