Abstract:Automatic visual inspection using machine learning-based methods plays a key role in achieving zero-defect policies in industry. Research on anomaly detection approaches is constrained by the availability of datasets that represent complex defect appearances and imperfect imaging conditions, which are typical to industrial processes. Recent benchmarks indicate that most publicly available datasets are biased towards optimal imaging conditions, leading to an overestimation of the methods' applicability to real-world industrial scenarios. To address this gap, we introduce the Industrial Screen Printing Anomaly Detection dataset (ISP-AD). It presents challenging small and weakly contrasted surface defects embedded within structured patterns exhibiting high permitted design variability. To the best of our knowledge, it is the largest publicly available industrial dataset to date, including both synthetic and real defects collected directly from the factory floor. In addition to the evaluation of defect detection performance of recent unsupervised anomaly detection methods, experiments on a mixed supervised training approach, incorporating both synthesized and real defects, were conducted. Even small amounts of injected real defects prove beneficial for model generalization. Furthermore, starting from training on purely synthetic defects, emerging real defective samples can be efficiently integrated into subsequent scalable training. Research findings indicate that supervision by means of both synthetic and accumulated real defects can complement each other, meeting demanded industrial inspection requirements such as low false positive rates and high recall. The presented unsupervised and supervised dataset splits are designed to emphasize research on unsupervised, self-supervised, and supervised approaches, enhancing their applicability to industrial settings.