Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Angelo Cardoso

Generalising Random Forest Parameter Optimisation to Include Stability and Cost

Jul 13, 2017

C. H. Bryan Liu, Benjamin Paul Chamberlain, Duncan A. Little, Angelo Cardoso

Figure 1 for Generalising Random Forest Parameter Optimisation to Include Stability and Cost

Figure 2 for Generalising Random Forest Parameter Optimisation to Include Stability and Cost

Figure 3 for Generalising Random Forest Parameter Optimisation to Include Stability and Cost

Figure 4 for Generalising Random Forest Parameter Optimisation to Include Stability and Cost

Abstract:Random forests are among the most popular classification and regression methods used in industrial applications. To be effective, the parameters of random forests must be carefully tuned. This is usually done by choosing values that minimize the prediction error on a held out dataset. We argue that error reduction is only one of several metrics that must be considered when optimizing random forest parameters for commercial applications. We propose a novel metric that captures the stability of random forests predictions, which we argue is key for scenarios that require successive predictions. We motivate the need for multi-criteria optimization by showing that in practical applications, simply choosing the parameters that lead to the lowest error can introduce unnecessary costs and produce predictions that are not stable across independent runs. To optimize this multi-criteria trade-off, we present a new framework that efficiently finds a principled balance between these three considerations using Bayesian optimisation. The pitfalls of optimising forest parameters purely for error reduction are demonstrated using two publicly available real world datasets. We show that our framework leads to parameter settings that are markedly different from the values discovered by error reduction metrics.

* Machine Learning and Knowledge Discovery in Databases. ECML PKDD 2017. LNCS vol 10536, pp. 102-113 (2017)
* To appear in ECML-PKDD 2017

Via

Access Paper or Ask Questions

Customer Lifetime Value Prediction Using Embeddings

Jul 06, 2017

Benjamin Paul Chamberlain, Angelo Cardoso, C. H. Bryan Liu, Roberto Pagliari, Marc Peter Deisenroth

Figure 1 for Customer Lifetime Value Prediction Using Embeddings

Figure 2 for Customer Lifetime Value Prediction Using Embeddings

Figure 3 for Customer Lifetime Value Prediction Using Embeddings

Figure 4 for Customer Lifetime Value Prediction Using Embeddings

Abstract:We describe the Customer LifeTime Value (CLTV) prediction system deployed at ASOS.com, a global online fashion retailer. CLTV prediction is an important problem in e-commerce where an accurate estimate of future value allows retailers to effectively allocate marketing spend, identify and nurture high value customers and mitigate exposure to losses. The system at ASOS provides daily estimates of the future value of every customer and is one of the cornerstones of the personalised shopping experience. The state of the art in this domain uses large numbers of handcrafted features and ensemble regressors to forecast value, predict churn and evaluate customer loyalty. Recently, domains including language, vision and speech have shown dramatic advances by replacing handcrafted features with features that are learned automatically from data. We detail the system deployed at ASOS and show that learning feature representations is a promising extension to the state of the art in CLTV modelling. We propose a novel way to generate embeddings of customers, which addresses the issue of the ever changing product catalogue and obtain a significant improvement over an exhaustive set of handcrafted features.

* Proceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining Pages 1753-1762, 2017
* 10 pages, 11 figures

Via

Access Paper or Ask Questions