Abstract: We propose and discuss Bayesian machine learning methods for mixed data sampling (MIDAS) regressions. This involves handling frequency mismatches with restricted and unrestricted MIDAS variants and specifying functional relationships between many predictors and the dependent variable. We use Gaussian processes (GP) and Bayesian additive regression trees (BART) as flexible extensions to linear penalized estimation. In a nowcasting and forecasting exercise, we focus on quarterly US output growth and inflation in the GDP deflator. The new models leverage macroeconomic Big Data in a computationally efficient way and offer gains in predictive accuracy along several dimensions.
Abstract: This study benchmarks 23 statistical and machine learning methods in a credit scoring application. The models' performance is evaluated on four data sets in combination with five data sampling strategies that address class imbalance in the data. Six performance measures are used to cover different aspects of predictive performance. The results indicate a strong superiority of ensemble methods and show that simple sampling strategies deliver better results than more sophisticated ones.