Abstract: In recommendation systems, there has been growth in the number of recommendable items (e.g., movies, music, products). When the set of recommendable items is large, training and evaluation of item recommendation models becomes computationally expensive. To lower this cost, it has become common to sample negative items. However, recommendation quality can suffer from the biases introduced by traditional negative sampling mechanisms. In this work, we demonstrate the benefits of correcting the bias introduced by sampling negatives. We first provide sampled-batch versions of the well-studied WARP and LambdaRank methods. Then, we present how these methods can benefit from improved ranking estimates. Finally, we evaluate the recommendation quality that results from correcting rank estimates and demonstrate that WARP and LambdaRank can be learned efficiently with negative sampling and our proposed correction technique.
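The abstract describes sampled-batch variants of WARP in which an item's rank is estimated from a small set of sampled negatives rather than from the full catalog. As an illustration only, the sketch below shows one common way to form such an estimate: scaling the in-sample violation rate up to the full item set and weighting the hinge loss by a rank-dependent weight. The function name, the uniform-sampling assumption, and the log-based approximation of the WARP weight are ours; the sketch does not reproduce the paper's proposed correction technique.

```python
# Minimal sketch of a sampled-batch WARP-style loss with a naive rank estimate.
# All names and choices here are illustrative assumptions, not the paper's method.
import torch

def warp_loss_sampled(pos_scores, neg_scores, n_items, margin=1.0):
    """
    pos_scores: (B,)   scores of the positive item per example
    neg_scores: (B, m) scores of m uniformly sampled negative items
    n_items:    total number of recommendable items |I|
    """
    B, m = neg_scores.shape

    # Hinge violations: a sampled negative "violates" when it comes within the margin.
    hinge = torch.clamp(margin - pos_scores.unsqueeze(1) + neg_scores, min=0.0)  # (B, m)
    violations = (hinge > 0).float()

    # Naive rank estimate: scale the sampled violation rate up to the full catalog.
    est_rank = torch.floor(violations.sum(dim=1) * (n_items - 1) / m)            # (B,)

    # WARP weight Phi(r) = sum_{i=1}^{r} 1/i, approximated here by log(1 + r).
    weight = torch.log1p(est_rank)

    # Rank-weighted average hinge over the sampled negatives.
    return (weight * hinge.mean(dim=1)).mean()
```

A biased estimate like the one above systematically misjudges the true rank when m is much smaller than the catalog; the abstract's point is that correcting such rank estimates improves the resulting WARP and LambdaRank models.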
Abstract: Large Language Models (LLMs) have been extremely successful at tasks like complex dialogue understanding, reasoning, and coding due to their emergent abilities. These emergent abilities have been extended with multi-modality to include image, audio, and video capabilities. Recommender systems, on the other hand, have been critical for information seeking and item discovery needs. Recently, there have been attempts to apply LLMs for recommendations. One difficulty of current attempts is that the underlying LLM is usually not trained on recommender system data, which largely contains user interaction signals and is often not publicly available. Another difficulty is that user interaction signals often follow a different pattern from natural language text, and it is currently unclear whether the LLM training setup can learn more non-trivial knowledge from interaction signals compared with traditional recommender system methods. Finally, it is difficult to train multiple LLMs for different use cases and to retain the original language and reasoning abilities when learning from recommender system data. To address these three limitations, we propose an Item-Language Model (ILM), composed of an item encoder that produces text-aligned item representations encoding user interaction signals, and a frozen LLM that can understand those item representations with preserved pretrained knowledge. We conduct extensive experiments which demonstrate the importance of both language alignment and user interaction knowledge in the item encoder.
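The abstract outlines ILM as a trainable item encoder feeding text-aligned item representations into a frozen LLM. The sketch below is only a structural illustration of that idea under our own assumptions: a simple MLP encoder that maps interaction-based item embeddings to a few "soft" item tokens, and an LLM interface that accepts precomputed input embeddings (e.g., `inputs_embeds` in HuggingFace transformers). It is not the paper's architecture and does not include the language-alignment training the abstract refers to.

```python
# Illustrative sketch of the ILM idea: a trainable item encoder maps
# interaction-based item embeddings into the token-embedding space of a frozen LLM.
# Module names, dimensions, and the MLP encoder are assumptions for illustration.
import torch
import torch.nn as nn

class ItemEncoder(nn.Module):
    def __init__(self, item_dim: int, llm_dim: int, n_tokens: int = 4):
        super().__init__()
        self.n_tokens = n_tokens
        self.proj = nn.Sequential(
            nn.Linear(item_dim, llm_dim * n_tokens),
            nn.GELU(),
            nn.Linear(llm_dim * n_tokens, llm_dim * n_tokens),
        )

    def forward(self, item_emb: torch.Tensor) -> torch.Tensor:
        # item_emb: (B, item_dim) embeddings learned from user interaction signals
        out = self.proj(item_emb)                                   # (B, llm_dim * n_tokens)
        return out.view(-1, self.n_tokens, out.shape[-1] // self.n_tokens)

class ItemLanguageModel(nn.Module):
    def __init__(self, llm: nn.Module, item_dim: int, llm_dim: int):
        super().__init__()
        self.llm = llm
        for p in self.llm.parameters():   # keep the pretrained LLM frozen
            p.requires_grad = False
        self.item_encoder = ItemEncoder(item_dim, llm_dim)

    def forward(self, item_emb, text_token_embs):
        # Prepend soft item tokens to the text prompt embeddings and run the frozen LLM.
        item_tokens = self.item_encoder(item_emb)                   # (B, n, llm_dim)
        inputs = torch.cat([item_tokens, text_token_embs], dim=1)   # (B, n+T, llm_dim)
        # Assumes the wrapped LLM exposes an `inputs_embeds`-style argument.
        return self.llm(inputs_embeds=inputs)
```

Freezing the LLM is what preserves its pretrained language and reasoning abilities; only the item encoder is trained to align interaction-derived representations with the LLM's input space.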