Abstract: As the digitization of the travel industry accelerates, analyzing and understanding travelers' behaviors becomes increasingly important. However, traveler data are frequently sparse because users interact with travel providers relatively infrequently. Compounding this effect, the proliferation of devices, accounts, and platforms used while browsing travel products online also disperses the data. Probabilistic traveler matching can address these challenges. Most existing user matching solutions are not suitable for traveler matching, however, because a traveler's browsing history is typically short and URLs in the travel industry are highly heterogeneous and contain many tokens. We therefore propose similarity-based multi-view information fusion, which learns a better user representation from URLs by treating the URLs as multi-view data. The experimental results show that the proposed multi-view user representation learning takes advantage of the complementary information from different views, highlights the key information in URLs, and performs significantly better than other representation learning solutions on the user matching task.
Abstract: Face recognition has been used increasingly in real-world applications in recent years. However, when skin color bias is coupled with intra-personal variations such as harsh illumination, the face recognition task is more likely to fail, even under human inspection. Face normalization methods try to address such challenges by removing intra-personal variations from an input image while preserving the identity. However, most face normalization methods can remove only one or two variations and ignore dataset biases such as skin color bias. The outputs of many face normalization methods also do not look realistic to human observers. In this work, a style-based face normalization model (StyleFNM) is proposed to remove most intra-personal variations, including large changes in pose, poor or harsh illumination, low resolution, blur, facial expressions, and accessories such as sunglasses. The dataset bias is also addressed by controlling a pretrained GAN to generate a balanced dataset of passport-like images. The experimental results show that StyleFNM can generate more realistic outputs and can significantly improve the accuracy and fairness of face recognition systems.
Abstract: In this paper, we initiate the study of local model reconstruction attacks for federated learning, where an honest-but-curious adversary eavesdrops on the messages exchanged between a targeted client and the server, and then reconstructs the local/personalized model of the victim. The local model reconstruction attack allows the adversary to trigger other classical attacks more effectively, since the local model depends only on the client's data and can leak more private information than the global model learned by the server. Additionally, we propose a novel model-based attribute inference attack in federated learning that leverages the local model reconstruction attack. We provide an analytical lower bound for this attribute inference attack. Empirical results on real-world datasets confirm that our local reconstruction attack works well for both regression and classification tasks. Moreover, we benchmark our novel attribute inference attack against state-of-the-art attacks in federated learning. Our attack achieves higher reconstruction accuracy, especially when the clients' datasets are heterogeneous. Our work provides a new angle for designing powerful and explainable attacks to effectively quantify the privacy risk in federated learning.
Abstract: With the digitization of the travel industry, it is increasingly important to understand users from their online behavior. However, online travel industry data are more challenging to analyze because of greater sparsity, dispersed user action histories, rapidly changing user interests, and a lack of direct or indirect feedback. In this work, a new similarity method is proposed to measure destination similarity in terms of implicit user interest. Compared with several other similarity measures widely used in recommender systems, the proposed method achieves a significant improvement on travel data. Keywords: Destination similarity, Travel industry, Recommender system, Implicit user interest