Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Rodrigo Alves

The Future is Sparse: Embedding Compression for Scalable Retrieval in Recommender Systems

May 16, 2025

Petr Kasalický, Martin Spišák, Vojtěch Vančura, Daniel Bohuněk, Rodrigo Alves, Pavel Kordík

Abstract:Industry-scale recommender systems face a core challenge: representing entities with high cardinality, such as users or items, using dense embeddings that must be accessible during both training and inference. However, as embedding sizes grow, memory constraints make storage and access increasingly difficult. We describe a lightweight, learnable embedding compression technique that projects dense embeddings into a high-dimensional, sparsely activated space. Designed for retrieval tasks, our method reduces memory requirements while preserving retrieval performance, enabling scalable deployment under strict resource constraints. Our results demonstrate that leveraging sparsity is a promising approach for improving the efficiency of large-scale recommenders. We release our code at https://github.com/recombee/CompresSAE.

Via

Access Paper or Ask Questions

Reasoning-Grounded Natural Language Explanations for Language Models

Mar 14, 2025

Vojtech Cahlik, Rodrigo Alves, Pavel Kordik

Abstract:We propose a large language model explainability technique for obtaining faithful natural language explanations by grounding the explanations in a reasoning process. When converted to a sequence of tokens, the outputs of the reasoning process can become part of the model context and later be decoded to natural language as the model produces either the final answer or the explanation. To improve the faithfulness of the explanations, we propose to use a joint predict-explain approach, in which the answers and explanations are inferred directly from the reasoning sequence, without the explanations being dependent on the answers and vice versa. We demonstrate the plausibility of the proposed technique by achieving a high alignment between answers and explanations in several problem domains, observing that language models often simply copy the partial decisions from the reasoning sequence into the final answers or explanations. Furthermore, we show that the proposed use of reasoning can also improve the quality of the answers.

Via

Access Paper or Ask Questions

Bridging Offline-Online Evaluation with a Time-dependent and Popularity Bias-free Offline Metric for Recommenders

Aug 14, 2023

Petr Kasalický, Rodrigo Alves, Pavel Kordík

Abstract:The evaluation of recommendation systems is a complex task. The offline and online evaluation metrics for recommender systems are ambiguous in their true objectives. The majority of recently published papers benchmark their methods using ill-posed offline evaluation methodology that often fails to predict true online performance. Because of this, the impact that academic research has on the industry is reduced. The aim of our research is to investigate and compare the online performance of offline evaluation metrics. We show that penalizing popular items and considering the time of transactions during the evaluation significantly improves our ability to choose the best recommendation model for a live recommender system. Our results, averaged over five large-size real-world live data procured from recommenders, aim to help the academic community to understand better offline evaluation and optimization criteria that are more relevant for real applications of recommender systems.

* Accepted to evalRS 2023@KDD

Via

Access Paper or Ask Questions

Generalization Bounds for Inductive Matrix Completion in Low-noise Settings

Dec 16, 2022

Antoine Ledent, Rodrigo Alves, Yunwen Lei, Yann Guermeur, Marius Kloft

Abstract:We study inductive matrix completion (matrix completion with side information) under an i.i.d. subgaussian noise assumption at a low noise regime, with uniform sampling of the entries. We obtain for the first time generalization bounds with the following three properties: (1) they scale like the standard deviation of the noise and in particular approach zero in the exact recovery case; (2) even in the presence of noise, they converge to zero when the sample size approaches infinity; and (3) for a fixed dimension of the side information, they only have a logarithmic dependence on the size of the matrix. Differently from many works in approximate recovery, we present results both for bounded Lipschitz losses and for the absolute loss, with the latter relying on Talagrand-type inequalities. The proofs create a bridge between two approaches to the theoretical analysis of matrix completion, since they consist in a combination of techniques from both the exact recovery literature and the approximate recovery literature.

* AAAI 2023
* 30 Pages, 1 figure; Accepted for publication at AAAI 2023

Via

Access Paper or Ask Questions

Orthogonal Inductive Matrix Completion

Apr 23, 2020

Antoine Ledent, Rodrigo Alves, Marius Kloft

Figure 1 for Orthogonal Inductive Matrix Completion

Figure 2 for Orthogonal Inductive Matrix Completion

Figure 3 for Orthogonal Inductive Matrix Completion

Figure 4 for Orthogonal Inductive Matrix Completion

Abstract:We propose orthogonal inductive matrix completion (OMIC), an interpretable model composed of a sum of matrix completion terms, each with orthonormal side information. We can inject prior knowledge about the eigenvectors of the ground truth matrix, whilst maintaining the representation capability of the model. We present a provably converging algorithm that optimizes all components of the model simultaneously, using nuclear-norm regularisation. Our method is backed up by \textit{distribution-free} learning guarantees that improve with the quality of the injected knowledge. As a special case of our general framework, we study a model consisting of a sum of user and item biases (generic behaviour), a non-inductive term (specific behaviour), and an inductive term using side information. Our theoretical analysis shows that $\epsilon$-recovering the ground truth matrix requires at most $O\left( \frac{n+m+(\sqrt{n}+\sqrt{m})\sqrt{rmn}C}{\epsilon^2}\right)$ entries, where $r$ (resp. $C$) is the rank (resp. maximum entry) of the bias-free part of the ground truth matrix. We analyse the performance of OMIC on several synthetic and real datasets. On synthetic datasets with a sliding scale of user bias relevance, we show that OMIC better adapts to different regimes than other methods and can recover the ground truth. On real life datasets containing user/items recommendations and relevant side information, we find that OMIC surpasses the state of the art, with the added benefit of greater interpretability.

Via

Access Paper or Ask Questions