Abstract: Handling clustering problems is important in data statistics, pattern recognition, and image processing. The mean-shift algorithm, a common unsupervised algorithm, is widely used to solve clustering problems. However, the mean-shift algorithm is restricted by its huge computational resource cost. In previous research [10], we proposed a novel GPU-accelerated Faster Mean-shift algorithm, which greatly speeds up the cosine-embedding clustering problem. In this study, we extend and improve the previous algorithm to handle Euclidean distance metrics. Unlike conventional GPU-based mean-shift algorithms, our algorithm adopts novel Seed Selection & Early Stopping approaches, which greatly increase computing speed and reduce GPU memory consumption. In simulation testing, when processing a clustering problem with 200K points, our algorithm achieved around a 3 times speedup compared to state-of-the-art GPU-based mean-shift algorithms, with optimized GPU memory consumption. Moreover, in this study, we implemented a plug-and-play model for the Faster Mean-shift algorithm, which can be easily deployed. (The plug-and-play model is available at https://github.com/masqm/Faster-Mean-Shift-Euc)
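For readers unfamiliar with mean-shift, the following is a minimal CPU-only sketch of the generic algorithm under a Euclidean metric, not the GPU implementation described above. The function names (`mean_shift`, `merge_modes`), the flat kernel, the tolerance-based stop, and the use of every second point as a seed are all illustrative assumptions; the abstract does not specify how Seed Selection or Early Stopping actually work.

```python
import math

def mean_shift(points, seeds, bandwidth, tol=1e-3, max_iter=100):
    """Shift each seed toward the mean of the points within `bandwidth`.

    Uses a flat kernel and Euclidean distance. Each seed stops early once
    its shift falls below `tol` (an assumed stopping rule, for illustration).
    """
    modes = []
    for seed in seeds:
        center = list(seed)
        for _ in range(max_iter):
            neighbors = [p for p in points if math.dist(p, center) <= bandwidth]
            if not neighbors:
                break
            # Coordinate-wise mean of the neighbors.
            new_center = [sum(c) / len(neighbors) for c in zip(*neighbors)]
            shift = math.dist(new_center, center)
            center = new_center
            if shift < tol:
                break  # converged: stop iterating this seed
        modes.append(tuple(center))
    return modes

def merge_modes(modes, bandwidth):
    """Keep one representative for modes closer than `bandwidth` to each other."""
    merged = []
    for m in modes:
        if all(math.dist(m, k) > bandwidth for k in merged):
            merged.append(m)
    return merged

# Two small, well-separated groups of 2-D points.
points = [(0.0, 0.0), (0.2, 0.0), (0.0, 0.2),
          (10.0, 10.0), (10.2, 10.0), (10.0, 10.2)]
seeds = points[::2]  # assumed stand-in for seed selection: every second point
modes = mean_shift(points, seeds, bandwidth=1.0)
centers = merge_modes(modes, bandwidth=1.0)
```

Running only a subset of points as seeds reduces the number of trajectories to iterate, which is the general idea behind seeding strategies; the merging step then collapses seeds that converged to the same mode.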
Abstract: User behaviour targeting is essential in online advertising. Compared with sponsored search keyword targeting and contextual advertising page content targeting, user behaviour targeting builds users' interest profiles by tracking their online behaviour and then delivers relevant ads according to each user's interests, which leads to higher targeting accuracy and thus improved advertising performance. Current user profiling methods include building keywords and topic tags or mapping users onto a hierarchical taxonomy. However, to our knowledge, no previous work explicitly investigates the similarity of users' online visits and incorporates such similarity into ad response prediction. In this work, we propose a general framework that learns user profiles based on their online browsing behaviour and transfers the learned knowledge to the prediction of their ad response. Technically, we propose a transfer learning model based on probabilistic latent factor graphical models, in which the users' ad response profiles are generated from their online browsing profiles. Large-scale experiments based on real-world data demonstrate significant improvement of our solution over some strong baselines.