Abstract: Change Detection (CD) is an essential task in remote sensing that focuses on identifying areas of change in bi-temporal image pairs of the same region captured by a satellite at different times. Data annotation for CD is both time-consuming and labor-intensive. To make better use of scarce labeled data and abundant unlabeled data, we present an adaptive dynamic semi-supervised learning method, AdaSemiCD, which improves the use of pseudo-labels and optimizes the training process. First, because of the extreme class imbalance inherent in CD, the model tends to focus on the background class and easily confuses the boundaries of target objects. To address these two issues, we develop a quantitative evaluation metric for pseudo-labels that augments information entropy with class rebalancing and amplification of confusing areas, giving larger weight to foreground change objects. Second, to enhance the reliability of sample-wise pseudo-labels, we introduce the AdaFusion module, which dynamically identifies the most uncertain region and substitutes it with more trustworthy content. Finally, to improve training stability, we introduce the AdaEMA module, which updates the teacher model using only batches of trusted samples. Experimental results on the LEVIR-CD, WHU-CD, and CDD datasets validate the efficacy and universality of the proposed adaptive training framework.
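The abstract does not give the exact form of the pseudo-label metric or the AdaEMA rule, so the following is only a minimal sketch of the general idea under stated assumptions: a class-weighted binary entropy serves as the uncertainty score, and the teacher is updated by an exponential moving average only when that score falls below a threshold. The function names, `change_weight`, `threshold`, and `decay` are hypothetical and not taken from the paper.

```python
import math


def weighted_entropy_score(probs, change_weight=2.0, eps=1e-8):
    """Weighted mean of per-pixel binary entropy (lower means more trusted).

    probs: flat list of predicted change probabilities in [0, 1].
    change_weight: hypothetical factor that up-weights pixels predicted
        as "change", the rare class (an assumption, not the paper's metric).
    """
    total, weight_sum = 0.0, 0.0
    for p in probs:
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        entropy = -(p * math.log(p) + (1.0 - p) * math.log(1.0 - p))
        weight = change_weight if p >= 0.5 else 1.0
        total += weight * entropy
        weight_sum += weight
    return total / weight_sum


def gated_ema_update(teacher, student, score, threshold=0.3, decay=0.99):
    """EMA update of teacher weights, skipped for untrusted batches.

    teacher, student: flat lists of parameter values (a stand-in for
        real model weights).
    score: batch uncertainty, e.g. from weighted_entropy_score.
    """
    if score > threshold:
        return list(teacher)  # untrusted batch: teacher stays unchanged
    return [decay * t + (1.0 - decay) * s for t, s in zip(teacher, student)]
```

In a real training loop the parameter lists would be model tensors and the threshold could itself be adapted during training; the sketch only shows the gating logic.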
Abstract: Fraudulent activities in online advertising can undermine the trust advertisers place in advertising networks and sour the gaming experience for users. Pay-Per-Click/Install (PPC/I) advertising is one of the main revenue models in game monetization, and its widespread use has led to a rise in click/install fraud in games. The majority of traffic in ad networks is non-fraudulent, so machine-learning-based fraud detection systems must cope with highly skewed labels. From the ad network's standpoint, user activities are multi-type sequences of temporal events, each consisting of an event type and a corresponding time interval. Time Long Short-Term Memory (Time-LSTM) network cells have proven effective at modeling hidden patterns in sequences with non-uniform time intervals. In this study, we propose using a variant of Time-LSTM cells in combination with a modified version of the Sequence Generative Adversarial Network (SeqGAN) to generate artificial sequences that mimic fraudulent user patterns in ad traffic. We also propose using a Critic network instead of Monte-Carlo (MC) roll-out in training SeqGAN to reduce computational cost. The GAN-generated sequences can be used to enhance the classification ability of event-based fraud detection classifiers. Our extensive experiments on synthetic data show that the trained generator can generate sequences with the desired properties, as measured by multiple criteria.
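To make the input representation concrete, the sketch below converts raw timestamped events into the (event type, time interval) pairs that a Time-LSTM-style model would consume. It is an illustration only: the function name, the record layout, and the convention that the first event has a gap of 0.0 are assumptions, not details from the paper.

```python
def to_event_intervals(events):
    """Convert (timestamp, event_type) records into (event_type, gap) pairs.

    events: iterable of (timestamp, event_type) tuples in any order.
    The gap is the time elapsed since the previous event; the first
    event gets 0.0 by convention (an assumption of this sketch).
    """
    ordered = sorted(events, key=lambda record: record[0])
    pairs = []
    previous_time = None
    for timestamp, event_type in ordered:
        gap = 0.0 if previous_time is None else timestamp - previous_time
        pairs.append((event_type, gap))
        previous_time = timestamp
    return pairs
```

For example, the records `(0.0, "impression")`, `(10.0, "click")`, `(12.5, "install")` become the pairs `("impression", 0.0)`, `("click", 10.0)`, `("install", 2.5)`.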