Abstract:Passenger Name Records (PNRs) are at the heart of the travel industry. Created when an itinerary is booked, they contain travel and passenger information. It is usual for airlines and other actors in the industry to inter-exchange and access each other's PNR, creating the challenge of using them without infringing data ownership laws. To address this difficulty, we propose a method to generate realistic synthetic PNRs using Generative Adversarial Networks (GANs). Unlike other GAN applications, PNRs consist of categorical and numerical features with missing/NaN values, which makes the use of GANs challenging. We propose a solution based on Cram\'{e}r GANs, categorical feature embedding and a Cross-Net architecture. The method was tested on a real PNR dataset, and evaluated in terms of distribution matching, memorization, and performance of predictive models for two real business problems: client segmentation and passenger nationality prediction. Results show that the generated data matches well with the real PNRs without memorizing them, and that it can be used to train models for real business applications.
Abstract:Travel providers such as airlines and on-line travel agents are becoming more and more interested in understanding how passengers choose among alternative itineraries when searching for flights. This knowledge helps them better display and adapt their offer, taking into account market conditions and customer needs. Some common applications are not only filtering and sorting alternatives, but also changing certain attributes in real-time (e.g., changing the price). In this paper, we concentrate with the problem of modeling air passenger choices of flight itineraries. This problem has historically been tackled using classical Discrete Choice Modelling techniques. Traditional statistical approaches, in particular the Multinomial Logit model (MNL), is widely used in industrial applications due to its simplicity and general good performance. However, MNL models present several shortcomings and assumptions that might not hold in real applications. To overcome these difficulties, we present a new choice model based on Pointer Networks. Given an input sequence, this type of deep neural architecture combines Recurrent Neural Networks with the Attention Mechanism to learn the conditional probability of an output whose values correspond to positions in an input sequence. Therefore, given a sequence of different alternatives presented to a customer, the model can learn to point to the one most likely to be chosen by the customer. The proposed method was evaluated on a real dataset that combines on-line user search logs and airline flight bookings. Experimental results show that the proposed model outperforms the traditional MNL model on several metrics.