Xiaozhe Yin

Quantile Extreme Gradient Boosting for Uncertainty Quantification

Apr 23, 2023

Xiaozhe Yin, Masoud Fallah-Shorshani, Rob McConnell, Scott Fruin, Yao-Yi Chiang, Meredith Franklin

Figures 1-4 for Quantile Extreme Gradient Boosting for Uncertainty Quantification

Abstract: As the availability, size, and complexity of data have increased in recent years, machine learning (ML) techniques have become popular for modeling. Predictions from ML models are often used for inference, decision-making, and downstream applications. A crucial yet often overlooked aspect of ML is uncertainty quantification, which can significantly affect how model predictions are used and interpreted. Extreme Gradient Boosting (XGBoost) is one of the most popular ML methods given its simple implementation, fast computation, and sequential learning, which make its predictions highly accurate compared to other methods. However, there is not yet a universally agreed-upon technique for determining uncertainty in ML models such as XGBoost across their varied applications. We propose enhancements to XGBoost whereby a modified quantile regression is used as the objective function to estimate uncertainty (QXGBoost). Specifically, we included the Huber norm in the quantile regression model to construct a differentiable approximation to the quantile regression error function. This key step allows XGBoost, which uses a gradient-based optimization algorithm, to make probabilistic predictions efficiently. QXGBoost was applied to create 90% prediction intervals for one simulated dataset and one real-world environmental dataset of measured traffic noise. Our proposed method performed comparably to or better than the uncertainty estimates generated by regular and quantile light gradient boosting. For both the simulated and traffic noise datasets, the overall performance of the prediction intervals from QXGBoost was better than that of the other models based on the coverage width-based criterion.
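The abstract's central idea, a Huber-smoothed quantile loss that supplies usable gradients to a boosting algorithm, can be illustrated with a minimal sketch. The abstract does not give the paper's exact formulation, so everything below is an assumption: one common smoothing in which the Huber function replaces the absolute value inside the pinball loss, a hypothetical threshold `delta`, a hypothetical floor `min_hess` to keep second derivatives positive, and hypothetical function names. The last helper computes empirical coverage and mean interval width, the two quantities that coverage width-based criteria typically combine; it is not the paper's criterion itself.

```python
import numpy as np


def huber_quantile_loss(y_true, y_pred, tau, delta=1.0):
    """Huber-smoothed pinball loss for quantile level tau (assumed form)."""
    u = y_true - y_pred  # residual
    abs_u = np.abs(u)
    # Huber function: quadratic near zero, linear in the tails.
    huber = np.where(abs_u <= delta, u ** 2 / (2.0 * delta), abs_u - delta / 2.0)
    # Asymmetric weights: tau for under-prediction, 1 - tau for over-prediction.
    weight = np.where(u >= 0, tau, 1.0 - tau)
    return weight * huber


def huber_quantile_grad_hess(y_true, y_pred, tau, delta=1.0, min_hess=1e-6):
    """First and second derivatives of the loss with respect to y_pred."""
    u = y_true - y_pred
    inside = np.abs(u) <= delta
    weight = np.where(u >= 0, tau, 1.0 - tau)
    # Derivative of the Huber function with respect to u.
    dh_du = np.where(inside, u / delta, np.sign(u))
    grad = -weight * dh_du  # chain rule: du/dy_pred = -1
    # The second derivative is zero in the linear tails; floor it
    # (an assumption) so a Newton-style booster never divides by zero.
    hess = np.maximum(np.where(inside, weight / delta, 0.0), min_hess)
    return grad, hess


def coverage_and_mean_width(y_true, lower, upper):
    """Empirical coverage and mean width of prediction intervals."""
    covered = (y_true >= lower) & (y_true <= upper)
    return covered.mean(), (upper - lower).mean()
```

As `delta` shrinks toward zero, this loss approaches the standard pinball loss. A pair of models trained with `tau=0.05` and `tau=0.95` would give a nominal 90% interval. In XGBoost's native Python API, a custom objective is a callable taking `(preds, dtrain)` and returning `(grad, hess)`, so the second function could plausibly be wrapped as `lambda preds, dtrain: huber_quantile_grad_hess(dtrain.get_label(), preds, tau=0.95)` and passed through the `obj` argument of `xgboost.train`.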

* 15 pages, 7 figures and 4 tables
