Abstract:We study the problem of semantic matching in product search, that is, given a customer query, retrieve all semantically related products from the catalog. Pure lexical matching via an inverted index falls short in this respect due to several factors: a) lack of understanding of hypernyms, synonyms, and antonyms, b) fragility to morphological variants (e.g. "woman" vs. "women"), and c) sensitivity to spelling errors. To address these issues, we train a deep learning model for semantic matching using customer behavior data. Much of the recent work on large-scale semantic search using deep learning focuses on ranking for web search. In contrast, semantic matching for product search presents several novel challenges, which we elucidate in this paper. We address these challenges by a) developing a new loss function that has an inbuilt threshold to differentiate between random negative examples, impressed but not purchased examples, and positive examples (purchased items), b) using average pooling in conjunction with n-grams to capture short-range linguistic patterns, c) using hashing to handle out of vocabulary tokens, and d) using a model parallel training architecture to scale across 8 GPUs. We present compelling offline results that demonstrate at least 4.7% improvement in Recall@100 and 14.5% improvement in mean average precision (MAP) over baseline state-of-the-art semantic search methods using the same tokenization method. Moreover, we present results and discuss learnings from online A/B tests which demonstrate the efficacy of our method.
Abstract:We introduce an approach for the real-time (2Hz) creation of a dense map and alignment of a moving robotic agent within that map by rendering using a Graphics Processing Unit (GPU). This is done by recasting the scan alignment part of the dense mapping process as a rendering task. Alignment errors are computed from rendering the scene, comparing with range data from the sensors, and minimized by an optimizer. The proposed approach takes advantage of the advances in rendering techniques for computer graphics and GPU hardware to accelerate the algorithm. Moreover, it allows one to exploit information not used in classic dense mapping algorithms such as Iterative Closest Point (ICP) by rendering interfaces between the free space, occupied space and the unknown. The proposed approach leverages directly the rendering capabilities of the GPU, in contrast to other GPU-based approaches that deploy the GPU as a general purpose parallel computation platform. We argue that the proposed concept is a general consequence of treating perception problems as inverse problems of rendering. Many perception problems can be recast into a form where much of the computation is replaced by render operations. This is not only efficient since rendering is fast, but also simpler to implement and will naturally benefit from future advancements in GPU speed and rendering techniques. Furthermore, this general concept can go beyond addressing perception problems and can be used for other problem domains such as path planning.