Abstract:This research explores Cost-Sensitive Learning (CSL) in the fraud detection domain to decrease the fraud class's incorrect predictions and increase its accuracy. Notably, we concentrate on shill bidding fraud that is challenging to detect because the behavior of shill and legitimate bidders are similar. We investigate CSL within the Semi-Supervised Classification (SSC) framework to address the scarcity of labeled fraud data. Our paper is the first attempt to integrate CSL with SSC for fraud detection. We adopt a meta-CSL approach to manage the costs of misclassification errors, while SSC algorithms are trained with imbalanced data. Using an actual shill bidding dataset, we assess the performance of several hybrid models of CSL and SSC and then compare their misclassification error and accuracy rates statistically. The most efficient CSL+SSC model was able to detect 99% of fraudsters and with the lowest total cost.
Abstract:Given the magnitude of online auction transactions, it is difficult to safeguard consumers from dishonest sellers, such as shill bidders. To date, the application of machine learning to auction fraud detection has been limited. Shill Bidding (SB) is a severe auction fraud, which is driven by modern-day technologies and clever scammers. The difficulty of identifying the behavior of sophisticated fraudsters and the unavailability of training datasets hinder the research on SB detection. The aim of this study is to develop a high-quality SB dataset. To do so, first, we crawled and preprocessed a large number of commercial auctions and bidders' history as well. We thoroughly preprocessed both datasets to make them usable for the computation of the SB metrics. Nevertheless, this operation requires a deep understanding of the behavior of auctions and bidders. Second, we introduced two new SB patterns and implemented other existing ones. Finally, we removed outliers to improve the quality of the training fraud data.