Abstract: In most advertising and recommendation systems, the multi-task learning (MTL) paradigm is widely employed to model diverse user behaviors (e.g., click, view, and purchase). Existing MTL models typically use task-shared networks with shared parameters or a routing mechanism to learn the commonalities between tasks, while applying task-specific networks to learn the unique characteristics of each task. However, the potential relevance among task-specific networks is ignored, even though it is intuitively crucial to overall performance. Given that this relevance is both task-complex and instance-specific, we present a novel learning paradigm to address these issues. In this paper, we propose the Personalized Inter-task COntrastive Learning (PICO) framework, which effectively models the inter-task relationship and is used to jointly estimate the click-through rate (CTR) and post-click conversion rate (CVR) in advertising systems. PICO uses contrastive learning to implicitly integrate inter-task knowledge from the task representations in the task-specific networks. In addition, we introduce an auxiliary network to capture inter-task relevance at the instance level and transform it into personalized temperature parameters for contrastive learning. With this method, fine-grained knowledge can be transferred to improve MTL performance without incurring additional inference costs. Both offline and online experiments show that PICO significantly outperforms previous multi-task models.
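The core mechanism described above, an InfoNCE-style contrastive loss between the CTR-task and CVR-task representations of the same instance, scaled by a per-instance temperature, can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's implementation: the function name `personalized_contrastive_loss` and the exact loss form are hypothetical, and we assume the representations are already L2-normalized and the temperatures come from the auxiliary network.

```python
import numpy as np

def personalized_contrastive_loss(ctr_repr, cvr_repr, temps):
    """InfoNCE-style loss between two task-specific representations.

    ctr_repr, cvr_repr: (B, d) L2-normalized per-instance representations
                        from the CTR and CVR task-specific networks.
    temps:              (B,) personalized temperatures, assumed to be
                        produced by the auxiliary network.
    """
    # Cosine similarity between every CTR representation and every
    # CVR representation in the batch.
    sim = ctr_repr @ cvr_repr.T                      # (B, B)
    # Instance-specific temperature scaling (row i uses temps[i]).
    logits = sim / temps[:, None]
    # The matching pair (same instance in both task views) is the positive;
    # all other instances in the batch act as negatives.
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    idx = np.arange(len(ctr_repr))
    return -log_prob[idx, idx].mean()
```

A smaller temperature sharpens the softmax, so instances the auxiliary network judges to have strongly related tasks can contribute a stronger alignment signal; because the loss is only used during training, serving incurs no extra cost.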
Abstract: Click-through rate (CTR) prediction is a fundamental technique in recommendation and advertising systems. Recent studies have shown that learning a unified model to serve multiple domains effectively improves overall performance. However, it remains challenging to improve generalization across domains with limited training data, and current solutions are hard to deploy due to their computational complexity. In this paper, we propose AdaSparse, a simple yet effective framework for multi-domain CTR prediction that learns an adaptively sparse structure for each domain, achieving better generalization across domains at lower computational cost. In AdaSparse, we introduce domain-aware neuron-level weighting factors to measure the importance of neurons; with these, the model can prune redundant neurons for each domain to improve generalization. We further add flexible sparsity regularizations to control the sparsity ratio of the learned structures. Offline and online experiments show that AdaSparse significantly outperforms previous multi-domain CTR models.
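The domain-aware neuron-level weighting described above can be sketched as a gating layer conditioned on a domain embedding: each hidden neuron gets an importance factor, and neurons whose factor falls below a threshold are pruned for that domain. This is a simplified illustration, not the paper's exact pruner; the names `domain_aware_prune` and `W_gate`, the sigmoid gate, and the hard threshold are assumptions for the sketch.

```python
import numpy as np

def domain_aware_prune(hidden, domain_emb, W_gate, threshold=0.5):
    """Apply domain-conditioned neuron-level weighting with pruning.

    hidden:     (B, d_hidden) activations of one hidden layer.
    domain_emb: (d_e,) embedding of the current domain.
    W_gate:     (d_e, d_hidden) hypothetical gating weights that map the
                domain embedding to one importance factor per neuron.
    threshold:  factors below this are treated as redundant and pruned.
    """
    # Sigmoid gate: per-neuron importance in (0, 1), specific to the domain.
    factors = 1.0 / (1.0 + np.exp(-(domain_emb @ W_gate)))   # (d_hidden,)
    # Hard pruning: zero out low-importance neurons for this domain.
    mask = (factors >= threshold).astype(hidden.dtype)
    # Surviving neurons are additionally re-weighted by their factors.
    return hidden * (factors * mask)
```

Because the mask depends only on the domain, the pruned structure can be materialized once per domain at serving time, which is where the lower computational cost comes from; the sparsity regularizers mentioned in the abstract would push the learned factors toward the pruned region.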