Abstract: Search is a prominent channel for discovering products on an e-commerce platform. Ranking the products retrieved by search is therefore crucial for addressing customers' needs and optimizing business metrics. While Learning to Rank (LETOR) models have been studied extensively and have demonstrated efficacy in web search, they remain a relatively new research area in e-commerce. In this paper, we present a framework for building a LETOR model for an e-commerce platform. We analyze user queries and propose a mechanism to segment queries into broad and narrow based on the user's intent. We discuss different types of features (query, product and query-product) and the challenges in using them. We show that sparsity in product features can be tackled through a denoising auto-encoder, while skip-gram based word embeddings help solve the query-product sparsity issues. We also present various target metrics that can be employed for evaluating search results and compare their robustness. Further, we build and compare the performance of both pointwise and pairwise LETOR models on a fashion category data set. We also build and compare distinct models for broad and narrow queries, analyze feature importance across them, and show that these specialized models perform better than a combined model in the fashion world.
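To make the pointwise versus pairwise distinction in the abstract above concrete, here is a minimal sketch. It is not the paper's implementation: the squared-error and logistic losses, the function names, and the toy scores are all assumptions chosen for illustration.

```python
import math

def pointwise_loss(score, label):
    """Squared error between one product's predicted score and its relevance label."""
    return (score - label) ** 2

def pairwise_loss(score_preferred, score_other):
    """Logistic loss on the score difference of a (preferred, other) product pair."""
    return math.log1p(math.exp(-(score_preferred - score_other)))

# Hypothetical model scores for two products retrieved for the same query.
scores = {"product_a": 0.9, "product_b": 0.4}
# Hypothetical relevance labels: product_a is the more relevant result.
labels = {"product_a": 1.0, "product_b": 0.0}

# Pointwise: each product is judged on its own against its label.
pointwise_total = sum(pointwise_loss(scores[p], labels[p]) for p in scores)

# Pairwise: only the relative order of the two products matters.
pairwise_correct = pairwise_loss(scores["product_a"], scores["product_b"])
pairwise_swapped = pairwise_loss(scores["product_b"], scores["product_a"])
```

In this sketch the pairwise loss is lower when the preferred product is scored higher, regardless of the absolute score values, whereas the pointwise loss depends on how close each individual score is to its label.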
Abstract: With the rapid growth in fashion e-commerce and customer-friendly product return policies, the cost of handling returned products has become a significant challenge. E-tailers incur huge losses in reverse logistics costs and in liquidation costs due to damaged returns or fraudulent behavior. Accurate prediction of product returns prior to order placement can be critical for companies: it enables e-tailers to take preemptive measures even before the order is placed, hence reducing overall returns. Furthermore, computing the return probability for millions of customers at the cart page in real time can be difficult. To address this problem, we propose a novel approach based on a deep neural network. Users' tastes and products' latent features were captured using product embeddings based on Bayesian Personalized Ranking (BPR). Another set of embeddings, learned with a skip-gram based model, captured users' body shape and size. The deep neural network incorporates these embeddings along with the engineered features to predict return probability. Using this return probability, several live experiments were conducted on one of the major fashion e-commerce platforms in order to reduce overall returns.
Abstract: Online shopping caters to the needs of millions of users daily. Search, recommendations and personalization have become essential building blocks for serving customer needs. The efficacy of such systems depends on a thorough understanding of products and their representation. Multiple information sources and data types together provide a complete picture of a product on the platform. While these tasks share some common characteristics, product embeddings are typically trained and used in isolation. In this paper, we propose a framework to combine multiple data sources and learn unified embeddings for products on our e-commerce platform. Our product embeddings are built from three types of data sources: catalog text data, users' clickstream session data and product images. We use denoising auto-encoders for text, Bayesian Personalized Ranking (BPR) for clickstream data, a Siamese neural network architecture for image data, and a combined ensemble over these methods for unified embeddings. Further, we compare and analyze the performance of these embeddings across three unrelated real-world e-commerce tasks: checking product attribute coverage, finding similar products and predicting returns. We show that unified product embeddings perform uniformly well across all these tasks.
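As a rough illustration of how Bayesian Personalized Ranking can learn embeddings from clickstream-style data, the sketch below applies one gradient step of the standard BPR objective to a single (user, interacted product, non-interacted product) triple. The vectors, learning rate, and function names are hypothetical, regularization is omitted, and nothing here is taken from the paper's actual setup.

```python
import math

def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))

def bpr_loss(user, pos_item, neg_item):
    """Negative log-sigmoid of the score margin between an interacted
    product (pos_item) and a non-interacted product (neg_item)."""
    margin = dot(user, pos_item) - dot(user, neg_item)
    return math.log1p(math.exp(-margin))

def bpr_sgd_step(user, pos_item, neg_item, lr=0.1):
    """One gradient-descent step on the BPR loss; returns updated copies."""
    margin = dot(user, pos_item) - dot(user, neg_item)
    # sigmoid(-margin): magnitude of the loss gradient with respect to the margin
    g = 1.0 / (1.0 + math.exp(margin))
    new_user = [u + lr * g * (p - n) for u, p, n in zip(user, pos_item, neg_item)]
    new_pos = [p + lr * g * u for p, u in zip(pos_item, user)]
    new_neg = [n - lr * g * u for n, u in zip(neg_item, user)]
    return new_user, new_pos, new_neg

# Hypothetical 2-dimensional embeddings for one user and two products.
user = [0.1, 0.2]
clicked = [0.3, -0.1]      # product the user interacted with
not_clicked = [0.2, 0.4]   # product the user did not interact with

loss_before = bpr_loss(user, clicked, not_clicked)
user2, clicked2, not_clicked2 = bpr_sgd_step(user, clicked, not_clicked)
loss_after = bpr_loss(user2, clicked2, not_clicked2)
```

The update pushes the user's score for the interacted product above the score for the non-interacted one, so the loss on this triple decreases after the step.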
Abstract: Fashion preference is a fuzzy concept that depends on customer taste, prevailing norms in fashion product/style (henceforth used interchangeably), and a customer's perception of utility or fashionability. Yet fashion e-retail relies on algorithmically generated search and recommendation systems that process structured data and images to best match customer preference. Retailers study tastes solely as a function of what sold versus what did not, and take it to represent customer preference. Such explicit modeling, however, belies the underlying user preference, which is a complicated interplay of preference and commercials such as brand, price point, promotions, other sale events, and competitor push/marketing. It is hard to infer a notion of utility or even customer preference by looking at sales data. In search and recommendation systems for fashion e-retail, customer preference is implicitly derived from user-user similarity or item-item similarity. In this work, we aim to derive a metric that separates the buying preferences of users from the commercials of the merchandise (price, promotions, etc.). We extend our earlier work on explicit signals to gauge sellability or preference with implicit signals from user behaviour.