Abstract:Slate recommendation is a technique commonly used on streaming platforms and e-commerce sites to present multiple items together. A significant challenge with slate recommendation is managing the complex combinatorial choice space. Traditional methods often simplify this problem by assuming users engage with only one item at a time. However, this simplification does not reflect the reality, as users often interact with multiple items simultaneously. In this paper, we address the general slate recommendation problem, which accounts for simultaneous engagement with multiple items. We propose a generative approach using Diffusion Models, leveraging their ability to learn structures in high-dimensional data. Our model generates high-quality slates that maximize user satisfaction by overcoming the challenges of the combinatorial choice space. Furthermore, our approach enhances the diversity of recommendations. Extensive offline evaluations on applications such as music playlist generation and e-commerce bundle recommendations show that our model outperforms state-of-the-art baselines in both relevance and diversity.
Abstract:In-context learning is a promising approach for online policy learning of offline reinforcement learning (RL) methods, which can be achieved at inference time without gradient optimization. However, this method is hindered by significant computational costs resulting from the gathering of large training trajectory sets and the need to train large Transformer models. We address this challenge by introducing an In-context Exploration-Exploitation (ICEE) algorithm, designed to optimize the efficiency of in-context policy learning. Unlike existing models, ICEE performs an exploration-exploitation trade-off at inference time within a Transformer model, without the need for explicit Bayesian inference. Consequently, ICEE can solve Bayesian optimization problems as efficiently as Gaussian process biased methods do, but in significantly less time. Through experiments in grid world environments, we demonstrate that ICEE can learn to solve new RL tasks using only tens of episodes, marking a substantial improvement over the hundreds of episodes needed by the previous in-context learning method.
Abstract:Personalization of playlists is a common feature in music streaming services, but conventional techniques, such as collaborative filtering, rely on explicit assumptions regarding content quality to learn how to make recommendations. Such assumptions often result in misalignment between offline model objectives and online user satisfaction metrics. In this paper, we present a reinforcement learning framework that solves for such limitations by directly optimizing for user satisfaction metrics via the use of a simulated playlist-generation environment. Using this simulator we develop and train a modified Deep Q-Network, the action head DQN (AH-DQN), in a manner that addresses the challenges imposed by the large state and action space of our RL formulation. The resulting policy is capable of making recommendations from large and dynamic sets of candidate items with the expectation of maximizing consumption metrics. We analyze and evaluate agents offline via simulations that use environment models trained on both public and proprietary streaming datasets. We show how these agents lead to better user-satisfaction metrics compared to baseline methods during online A/B tests. Finally, we demonstrate that performance assessments produced from our simulator are strongly correlated with observed online metric results.
Abstract:Neural architecture search (NAS) is a recent methodology for automating the design of neural network architectures. Differentiable neural architecture search (DARTS) is a promising NAS approach that dramatically increases search efficiency. However, it has been shown to suffer from performance collapse, where the search often leads to detrimental architectures. Many recent works try to address this issue of DARTS by identifying indicators for early stopping, regularising the search objective to reduce the dominance of some operations, or changing the parameterisation of the search problem. In this work, we hypothesise that performance collapses can arise from poor local optima around typical initial architectures and weights. We address this issue by developing a more global optimisation scheme that is able to better explore the space without changing the DARTS problem formulation. Our experiments show that our changes in the search algorithm allow the discovery of architectures with both better test performance and fewer parameters.
Abstract:Complex systems may contain heterogeneous types of variables that interact in a multi-level and multi-scale manner. In this context, high-level layers may considered as groups of variables interacting in lower-level layers. This is particularly true in biology, where, for example, genes are grouped in pathways and two types of interactions are present: pathway-pathway interactions and gene-gene interactions. However, from data it is only possible to measure the expression of genes while it is impossible to directly measure the activity of pathways. Nevertheless, the knowledge on the inter-dependence between the groups and the variables allows for a multi-layer network inference, on both observed variables and groups, even if no direct information on the latter is present in the data (hence groups are considered as latent). In this paper, we propose an extension of the latent graphical lasso method that leverages on the knowledge of the inter-links between the hidden (groups) and observed layers. The method exploits the knowledge of group structure that influence the behaviour of observed variables to retrieve a two layers network. Its efficacy was tested on synthetic data to check its ability in retrieving the network structure compared to the ground truth. We present a case study on Neuroblastoma, which shows how our multi-level inference is relevant in real contexts to infer biologically meaningful connections.
Abstract:In many applications of finance, biology and sociology, complex systems involve entities interacting with each other. These processes have the peculiarity of evolving over time and of comprising latent factors, which influence the system without being explicitly measured. In this work we present latent variable time-varying graphical lasso (LTGL), a method for multivariate time-series graphical modelling that considers the influence of hidden or unmeasurable factors. The estimation of the contribution of the latent factors is embedded in the model which produces both sparse and low-rank components for each time point. In particular, the first component represents the connectivity structure of observable variables of the system, while the second represents the influence of hidden factors, assumed to be few with respect to the observed variables. Our model includes temporal consistency on both components, providing an accurate evolutionary pattern of the system. We derive a tractable optimisation algorithm based on alternating direction method of multipliers, and develop a scalable and efficient implementation which exploits proximity operators in closed form. LTGL is extensively validated on synthetic data, achieving optimal performance in terms of accuracy, structure learning and scalability with respect to ground truth and state-of-the-art methods for graphical inference. We conclude with the application of LTGL to real case studies, from biology and finance, to illustrate how our method can be successfully employed to gain insights on multivariate time-series data.