Abstract: Capturing changing trade patterns is critical in customs fraud detection. As new goods are imported and novel frauds arise, a drift-aware fraud detection system is needed to detect both known and unknown frauds within a limited inspection budget. This paper proposes ADAPT, an adaptive selection method that controls the balance between the exploitation and exploration strategies used for customs fraud detection. ADAPT uses model performance trends and the amount of concept drift to determine the best exploration ratio at each time step. Experiments on data from four countries over several years show that each country requires a different amount of exploration to maintain its fraud detection system. We find that a system with ADAPT gradually adapts to the dataset and finds an appropriate exploration ratio while maintaining high performance.
Abstract: Continual labeling of training examples is a costly task in supervised learning. Active learning strategies mitigate this cost by identifying the unlabeled data that are considered most useful for training a predictive model. However, sample selection via active learning may lead to an exploitation-exploration dilemma: in online settings, profitable items can be neglected when uncertain items are annotated instead. To illustrate this dilemma, we study a human-in-the-loop customs selection scenario in which an AI-based system supports customs officers by providing a set of imports to be inspected. If the inspected items are fraudulent, officers levy extra duties, and these items are used as additional training data in subsequent iterations. Inspecting highly suspicious items will inevitably lead to additional customs revenue, yet it may not give customs officers any new knowledge. Inspecting uncertain items, on the other hand, helps customs officers acquire new knowledge, which serves as supplementary training data for updating their selection systems. Through years of customs selection simulation, we show that some exploration is needed to cope with domain shift, and that our hybrid strategy of selecting both fraudulent and uncertain items eventually outperforms the exploitation-only strategy.
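The hybrid selection idea described above can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `hybrid_select`, the `explore_ratio` parameter, and the use of "predicted score closest to 0.5" as the uncertainty measure are all assumptions made for illustration. The sketch spends part of a fixed inspection budget on the most suspicious items (exploitation) and the remainder on the most uncertain of the remaining items (exploration).

```python
def hybrid_select(scores, budget, explore_ratio):
    """Split an inspection budget between suspicious and uncertain items.

    scores: predicted fraud probabilities in [0, 1] (hypothetical model output).
    budget: number of items that can be inspected.
    explore_ratio: fraction of the budget spent on uncertain items (assumed knob).
    Returns two lists of item indices: (exploit, explore).
    """
    budget = min(budget, len(scores))
    n_explore = int(round(budget * explore_ratio))
    n_exploit = budget - n_explore

    # Exploitation: items with the highest predicted fraud score.
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    exploit = order[:n_exploit]

    # Exploration: among the remaining items, those whose score is
    # closest to 0.5 (an assumed proxy for model uncertainty).
    rest = order[n_exploit:]
    explore = sorted(rest, key=lambda i: abs(scores[i] - 0.5))[:n_explore]
    return exploit, explore


if __name__ == "__main__":
    scores = [0.95, 0.10, 0.52, 0.80, 0.45, 0.05]
    exploit, explore = hybrid_select(scores, budget=4, explore_ratio=0.25)
    print("exploit:", exploit, "explore:", explore)
```

Setting `explore_ratio` to 0 recovers a pure exploitation strategy in this sketch. Both sets of inspected items would then be labeled by officers and fed back as training data.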