Tianning Xu

Calibrate and Debias Layer-wise Sampling for Graph Convolutional Networks

Jun 01, 2022

Yifan Chen, Tianning Xu, Dilek Hakkani-Tur, Di Jin, Yun Yang, Ruoqing Zhu

Abstract: To accelerate the training of graph convolutional networks (GCNs), many sampling-based methods have been developed for approximating the embedding aggregation. Among them, a layer-wise approach recursively performs importance sampling to select neighbors jointly for existing nodes in each layer. This paper revisits the approach from a matrix approximation perspective. We identify two issues in the existing layer-wise sampling methods: sub-optimal sampling probabilities and the approximation bias induced by sampling without replacement. We propose two remedies: new sampling probabilities and a debiasing algorithm, to address these issues, and provide the statistical analysis of the estimation variance. The improvements are demonstrated by extensive analyses and experiments on common benchmarks.
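The matrix approximation view mentioned in the abstract can be illustrated with a minimal sketch: the aggregation `P @ H` (a propagation matrix times node embeddings) is approximated by sampling a few columns of `P` and the matching rows of `H`. The function name `sampled_aggregation`, the variable names, and the squared-column-norm probabilities are assumptions taken from standard randomized matrix multiplication; they are not the sampling probabilities or the debiasing algorithm proposed in the paper.

```python
import numpy as np

def sampled_aggregation(P, H, s, rng, replace=True):
    """Approximate P @ H by sampling s columns of P and the matching rows of H.

    Illustrative sketch only: the probabilities below are the classical
    squared-column-norm choice, assumed here for demonstration.
    """
    n = P.shape[1]
    # Assumed importance-sampling probabilities, proportional to ||P[:, i]||^2.
    q = np.linalg.norm(P, axis=0) ** 2
    q = q / q.sum()
    idx = rng.choice(n, size=s, replace=replace, p=q)
    # Rescaling each sampled column by 1 / (s * q_i) gives an unbiased
    # estimate of P @ H when sampling WITH replacement.
    scale = 1.0 / (s * q[idx])
    return (P[:, idx] * scale) @ H[idx, :]

rng = np.random.default_rng(0)
P = rng.uniform(size=(20, 30))        # stand-in for a propagation matrix
H = rng.standard_normal((30, 4))      # stand-in for node embeddings
approx = sampled_aggregation(P, H, s=10, rng=rng)
print(approx.shape)
```

Calling the function with `replace=False` keeps the same rescaling but samples distinct nodes; in that setting the estimate is generally no longer unbiased, which is the kind of bias the abstract refers to.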

On Variance Estimation of Random Forests

Feb 18, 2022

Tianning Xu, Ruoqing Zhu, Xiaofeng Shao

Abstract: Ensemble methods based on subsampling, such as random forests, are popular in applications due to their high predictive accuracy. Existing literature views a random forest prediction as an infinite-order incomplete U-statistic to quantify its uncertainty. However, these methods focus on a small subsampling size of each tree, which is theoretically valid but practically limited. This paper develops an unbiased variance estimator based on incomplete U-statistics, which allows the tree size to be comparable with the overall sample size, making statistical inference possible in a broader range of real applications. Simulation results demonstrate that our estimators enjoy lower bias and more accurate confidence interval coverage without additional computational costs. We also propose a local smoothing procedure to reduce the variation of our estimator, which shows improved numerical performance when the number of trees is relatively small. Further, we investigate the ratio consistency of our proposed variance estimator under specific scenarios. In particular, we develop a new "double U-statistic" formulation to analyze the Hoeffding decomposition of the estimator's variance.
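To make the "incomplete U-statistic" view concrete, here is a minimal sketch: a kernel is evaluated on `B` random subsamples of size `k` drawn without replacement, and the results are averaged. The function name `incomplete_u_statistic` and the use of a subsample mean as the kernel are assumptions for illustration; in a random forest the kernel would be a single tree's prediction, and the paper's variance estimator is not reproduced here.

```python
import numpy as np

def incomplete_u_statistic(x, kernel, k, B, rng):
    """Average a kernel over B random size-k subsamples of x.

    Illustrative sketch: `kernel` is a stand-in for one tree's prediction
    built from a subsample (a hypothetical simplification).
    """
    n = len(x)
    values = []
    for _ in range(B):
        # Each subsample is drawn without replacement, as in subsampled ensembles.
        subset = rng.choice(n, size=k, replace=False)
        values.append(kernel(x[subset]))
    return float(np.mean(values))

rng = np.random.default_rng(0)
x = rng.standard_normal(100)
estimate = incomplete_u_statistic(x, np.mean, k=20, B=500, rng=rng)
print(estimate, x.mean())
```

With the mean kernel, the complete U-statistic (averaging over all size-`k` subsets) equals the sample mean, so the incomplete version should land close to `x.mean()` as `B` grows.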
