Abstract:Since their introduction by Breiman, Random Forests (RFs) have proven to be useful for both classification and regression tasks. The RF prediction of a previously unseen observation can be represented as a weighted sum of all training sample observations. This nearest-neighbor-type representation is useful, among other things, for constructing forecast distributions (Meinshausen, 2006). In this paper, we consider simplifying RF-based forecast distributions by sparsifying them. That is, we focus on a small subset of nearest neighbors while setting the remaining weights to zero. This sparsification step greatly improves the interpretability of RF predictions. It can be applied to any forecasting task without re-training existing RF models. In empirical experiments, we document that the simplified predictions can be similar to or exceed the original ones in terms of forecasting performance. We explore the statistical sources of this finding via a stylized analytical model of RFs. The model suggests that simplification is particularly promising if the unknown true forecast distribution contains many small weights that are estimated imprecisely.
Abstract:In the practice of point prediction, it is desirable that forecasters receive a directive in the form of a statistical functional, such as the mean or a quantile of the predictive distribution. When evaluating and comparing competing forecasts, it is then critical that the scoring function used for these purposes be consistent for the functional at hand, in the sense that the expected score is minimized when following the directive. We show that any scoring function that is consistent for a quantile or an expectile functional, respectively, can be represented as a mixture of extremal scoring functions that form a linearly parameterized family. Scoring functions for the mean value and probability forecasts of binary events constitute important examples. The quantile and expectile functionals along with the respective extremal scoring functions admit appealing economic interpretations in terms of thresholds in decision making. The Choquet type mixture representations give rise to simple checks of whether a forecast dominates another in the sense that it is preferable under any consistent scoring function. In empirical settings it suffices to compare the average scores for only a finite number of extremal elements. Plots of the average scores with respect to the extremal scoring functions, which we call Murphy diagrams, permit detailed comparisons of the relative merits of competing forecasts.