Abstract:With the rapid growth of the PCB manufacturing industry, there is an increasing demand for computer vision inspection to detect defects during production. Improving the accuracy and generalization of PCB defect detection models remains a significant challenge. This paper proposes a high-precision, robust, and real-time end-to-end method for PCB defect detection based on deep Convolutional Neural Networks (CNN). Traditional methods often suffer from low accuracy and limited applicability. We propose a novel approach combining YOLOv5 and multiscale modules for hierarchical residual-like connections. In PCB defect detection, noise can confuse the background and small targets. The YOLOv5 model provides a strong foundation with its real-time processing and accurate object detection capabilities. The multi-scale module extends traditional approaches by incorporating hierarchical residual-like connections within a single block, enabling multiscale feature extraction. This plug-and-play module significantly enhances performance by extracting features at multiple scales and levels, which are useful for identifying defects of varying sizes and complexities. Our multi-scale architecture integrates feature extraction, defect localization, and classification into a unified network. Experiments on a large-scale PCB dataset demonstrate significant improvements in precision, recall, and F1-score compared to existing methods. This work advances computer vision inspection for PCB defect detection, providing a reliable solution for high-precision, robust, real-time, and domain-adaptive defect detection in the PCB manufacturing industry.
Abstract:Source-Free Domain Adaptation (SFDA) aims to adapt a pre-trained source model to a target domain using only unlabeled target data. Current SFDA methods face challenges in effectively leveraging pre-trained knowledge and exploiting target domain data. Multimodal Large Language Models (MLLMs) offer remarkable capabilities in understanding visual and textual information, but their applicability to SFDA poses challenges such as instruction-following failures, intensive computational demands, and difficulties in performance measurement prior to adaptation. To alleviate these issues, we propose Reliability-based Curriculum Learning (RCL), a novel framework that integrates multiple MLLMs for knowledge exploitation via pseudo-labeling in SFDA. Our framework incorporates proposed Reliable Knowledge Transfer, Self-correcting and MLLM-guided Knowledge Expansion, and Multi-hot Masking Refinement to progressively exploit unlabeled data in the target domain. RCL achieves state-of-the-art (SOTA) performance on multiple SFDA benchmarks, e.g., $\textbf{+9.4%}$ on DomainNet, demonstrating its effectiveness in enhancing adaptability and robustness without requiring access to source data. Code: https://github.com/Dong-Jie-Chen/RCL.
Abstract:Generative models have shown promising results in capturing human mobility characteristics and generating synthetic trajectories. However, it remains challenging to ensure that the generated geospatial mobility data is semantically realistic, including consistent location sequences, and reflects real-world characteristics, such as constraining on geospatial limits. To address these issues, we reformat human mobility modeling as an autoregressive generation task, leveraging Generative Pre-trained Transformer (GPT). To ensure its controllable generation to alleviate the above challenges, we propose a geospatially-aware generative model, MobilityGPT. We propose a gravity-based sampling method to train a transformer for semantic sequence similarity. Then, we constrained the training process via a road connectivity matrix that provides the connectivity of sequences in trajectory generation, thereby keeping generated trajectories in geospatial limits. Lastly, we constructed a Reinforcement Learning from Trajectory Feedback (RLTF) to minimize the travel distance between training and the synthetically generated trajectories. Our experiments on real-world datasets demonstrate that MobilityGPT outperforms state-of-the-art methods in generating high-quality mobility trajectories that are closest to real data in terms of origin-destination similarity, trip length, travel radius, link, and gravity distributions.
Abstract:High-dimensional data are widely used in the era of deep learning with numerous applications. However, certain data which has sensitive information are not allowed to be shared without privacy protection. In this paper, we propose a novel differentially private data releasing method called Differentially Private Data Publishing with Gaussian Optimized Model Inversion (DPGOMI) to address this issue. Our approach involves mapping private data to the latent space using a public generator, followed by a lower-dimensional DP-GAN with better convergence properties. We evaluate the performance of DPGOMI on standard datasets CIFAR10 and SVHN. Our results show that DPGOMI outperforms the standard DP-GAN method in terms of Inception Score, Fr\'echet Inception Distance, and classification performance, while providing the same level of privacy. Our proposed approach offers a promising solution for protecting sensitive data in GAN training while maintaining high-quality results.
Abstract:To protect sensitive data in training a Generative Adversarial Network (GAN), the standard approach is to use differentially private (DP) stochastic gradient descent method in which controlled noise is added to the gradients. The quality of the output synthetic samples can be adversely affected and the training of the network may not even converge in the presence of these noises. We propose Differentially Private Model Inversion (DPMI) method where the private data is first mapped to the latent space via a public generator, followed by a lower-dimensional DP-GAN with better convergent properties. Experimental results on standard datasets CIFAR10 and SVHN as well as on a facial landmark dataset for Autism screening show that our approach outperforms the standard DP-GAN method based on Inception Score, Fr\'echet Inception Distance, and classification accuracy under the same privacy guarantee.