Abstract: As user content and queries become increasingly multi-modal, the need for effective multi-modal search systems has grown. Traditional search systems often rely on textual and metadata annotations for indexed images, while multi-modal embeddings such as CLIP enable direct search over text and image embeddings. However, embedding-based approaches face challenges in integrating contextual features such as user locale and recency, and building a scalable multi-modal search system requires fine-tuning several components. This paper presents a multi-modal search architecture and a series of A/B tests that optimize embeddings and multi-modal technologies in Adobe Express template search. We address considerations such as embedding model selection, the roles of embeddings in matching and ranking, and the balance between dense and sparse embeddings. Our iterative approach demonstrates how combining sparse, dense, and contextual features enhances both short- and long-query search, significantly reduces null rates (by over 70\%), and increases click-through rate (CTR). Our findings provide insights into developing robust multi-modal search systems, thereby enhancing relevance for complex queries.
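To make the dense/sparse balance mentioned above concrete, the following is a minimal, illustrative sketch and not the paper's implementation: the embeddings, the bag-of-words vectors, and the blending weight `alpha` are all hypothetical placeholders standing in for whatever encoder and lexical index a real system would use.

```python
import numpy as np

def cosine(query_vec, doc_mat):
    # Cosine similarity between one query vector and a matrix of document vectors.
    query_vec = query_vec / np.linalg.norm(query_vec)
    doc_mat = doc_mat / np.linalg.norm(doc_mat, axis=1, keepdims=True)
    return doc_mat @ query_vec

def hybrid_score(query_dense, doc_dense, query_sparse, doc_sparse, alpha=0.5):
    """Blend a dense (CLIP-style) similarity with a sparse lexical score.

    query_dense: (d,) dense query embedding            -- hypothetical encoder output
    doc_dense:   (n, d) dense document embeddings
    query_sparse / doc_sparse: term-weight vectors over a shared vocabulary
    alpha: weight on the dense signal (a tuning knob, not a value from the paper)
    """
    dense = cosine(query_dense, doc_dense)   # semantic match
    sparse = doc_sparse @ query_sparse       # exact-term match
    # Min-max normalize each signal so neither dominates purely by scale.
    dense = (dense - dense.min()) / (dense.max() - dense.min() + 1e-9)
    sparse = (sparse - sparse.min()) / (sparse.max() - sparse.min() + 1e-9)
    return alpha * dense + (1 - alpha) * sparse

# Toy usage: 3 templates, 4-dim dense embeddings, 5-term vocabulary.
rng = np.random.default_rng(0)
scores = hybrid_score(rng.normal(size=4), rng.normal(size=(3, 4)),
                      rng.random(5), rng.random((3, 5)))
print(np.argsort(-scores))  # candidate ordering passed on to a re-ranking stage
```

In a production system the blended score would typically feed a ranking stage that also incorporates contextual features such as locale and recency.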
Abstract: Link prediction in graphs is a widely investigated task with applications in domains such as knowledge graph completion, content and item recommendation, and social network recommendation. Most early research focused on link prediction in static graphs. Recently, however, there has been abundant work on modeling temporal graphs, and link prediction in temporal graphs has become one of the tasks studied in this setting. Most existing work, though, only predicts whether links will exist and does not consider the order in which they form. In this study, we aim to predict the order of node interactions.
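As a toy illustration of the task, and not the authors' method, the sketch below ranks candidate node pairs by a hypothetical scoring rule and reads the resulting ranking as the predicted order of link formation; the node embeddings are assumed to come from some temporal graph encoder.

```python
import numpy as np

def predict_interaction_order(node_emb, candidate_pairs):
    """Toy order-of-formation predictor.

    node_emb: dict mapping node id -> embedding vector (assumed given)
    candidate_pairs: list of (u, v) pairs whose future interaction order we rank
    Returns the pairs sorted from earliest to latest predicted formation.
    """
    scores = []
    for u, v in candidate_pairs:
        # Hypothetical rule: a higher dot product means the link forms sooner.
        scores.append(float(node_emb[u] @ node_emb[v]))
    order = np.argsort(-np.asarray(scores))
    return [candidate_pairs[i] for i in order]

# Toy usage with random 8-dimensional node embeddings.
rng = np.random.default_rng(1)
emb = {n: rng.normal(size=8) for n in "abcd"}
print(predict_interaction_order(emb, [("a", "b"), ("a", "c"), ("b", "d")]))
```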