Andrey Gulin

Which Tricks are Important for Learning to Rank?

Apr 04, 2022

Ivan Lyzhin, Aleksei Ustimenko, Andrey Gulin, Liudmila Prokhorenkova

Abstract: Nowadays, state-of-the-art learning-to-rank (LTR) methods are based on gradient-boosted decision trees (GBDT). The best-known algorithm is LambdaMART, which was proposed more than a decade ago. Recently, several other GBDT-based ranking algorithms have been proposed. In this paper, we conduct a thorough analysis of these methods in a unified setup. In particular, we address the following questions. Is direct optimization of a smoothed ranking loss preferable to optimizing a convex surrogate? How should surrogate ranking losses be constructed and smoothed? To address these questions, we compare LambdaMART with the YetiRank and StochasticRank methods and their modifications. We also improve the YetiRank approach to allow for optimizing specific ranking loss functions. As a result, we gain insights into learning-to-rank approaches and obtain a new state-of-the-art algorithm.

CatBoost: unbiased boosting with categorical features

Oct 31, 2018

Liudmila Prokhorenkova, Gleb Gusev, Aleksandr Vorobev, Anna Veronika Dorogush, Andrey Gulin

Abstract: This paper presents the key algorithmic techniques behind CatBoost, a new gradient boosting toolkit. Their combination leads to CatBoost outperforming other publicly available boosting implementations in terms of quality on a variety of datasets. Two critical algorithmic advances introduced in CatBoost are the implementation of ordered boosting, a permutation-driven alternative to the classic algorithm, and an innovative algorithm for processing categorical features. Both techniques were created to fight a prediction shift caused by a special kind of target leakage present in all currently existing implementations of gradient boosting algorithms. In this paper, we provide a detailed analysis of this problem and demonstrate that the proposed algorithms solve it effectively, leading to excellent empirical results.
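
The permutation-driven idea in the abstract above can be pictured with a small sketch of an ordered target statistic for one categorical feature: each example is encoded using only the targets of examples that precede it in a random permutation, so its own label never leaks into its encoding. This is an illustration under stated assumptions, not CatBoost's implementation; the function name `ordered_target_stats` and the parameters `prior`, `prior_weight`, and `seed` are hypothetical.

```python
import random
from collections import defaultdict


def ordered_target_stats(categories, targets, prior=0.5, prior_weight=1.0, seed=0):
    """Encode a categorical column with a leakage-free running mean (sketch).

    Assumption: a single random permutation and a simple smoothed mean;
    a real library may use several permutations and other details.
    """
    n = len(categories)
    order = list(range(n))
    random.Random(seed).shuffle(order)  # random "arrival order" of examples

    sums = defaultdict(float)   # running sum of targets seen so far, per category
    counts = defaultdict(int)   # running count of examples seen so far, per category
    encoded = [0.0] * n

    for idx in order:
        c = categories[idx]
        # Use only examples that came earlier in the permutation,
        # smoothed toward the prior so unseen categories get `prior`.
        encoded[idx] = (sums[c] + prior_weight * prior) / (counts[c] + prior_weight)
        # Only now reveal this example's own target to later examples.
        sums[c] += targets[idx]
        counts[c] += 1

    return encoded


if __name__ == "__main__":
    cats = ["a", "a", "b"]
    ys = [1.0, 1.0, 0.0]
    print(ordered_target_stats(cats, ys))
```

The first example of each category in the permutation receives the prior alone, which is the price paid for never using an example's own target in its encoding.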

CatBoost: gradient boosting with categorical features support

Oct 24, 2018

Anna Veronika Dorogush, Vasily Ershov, Andrey Gulin

Abstract: In this paper we present CatBoost, a new open-source gradient boosting library that successfully handles categorical features and outperforms existing publicly available implementations of gradient boosting in terms of quality on a set of popular publicly available datasets. The library has a GPU implementation of the learning algorithm and a CPU implementation of the scoring algorithm, both of which are significantly faster than other gradient boosting libraries on ensembles of similar sizes.
