Natalia da Silva

A Projection Pursuit Forest Algorithm for Supervised Classification

Jul 25, 2018

Natalia da Silva, Dianne Cook, Eun-Kyung Lee

Abstract: This paper presents a new ensemble learning method for classification problems called projection pursuit random forest (PPF). PPF uses the PPtree algorithm introduced in Lee et al. (2013). In PPF, trees are constructed by splitting on linear combinations of randomly chosen variables. Projection pursuit is used to choose a projection of the variables that best separates the classes. Using linear combinations of variables to separate classes takes the correlation between variables into account, which allows PPF to outperform a traditional random forest when separations between groups occur in combinations of variables. The method presented here can be used in multi-class problems and is implemented in an R (R Core Team, 2018) package, PPforest, which is available on CRAN, with development versions at https://github.com/natydasilva/PPforest.
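
The idea of splitting on a linear combination of variables, rather than on one variable at a time, can be illustrated with a minimal Python sketch. It is not the PPforest code or its API. It assumes Fisher's linear discriminant as a stand-in for the projection pursuit index, uses only two variables (so the random selection of variables at each node is omitted), and runs on synthetic data built for the illustration. All function names are hypothetical.

```python
def dot(w, p):
    return w[0] * p[0] + w[1] * p[1]

def mean(points):
    n = len(points)
    return (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)

def scatter(points):
    """Sums of squares and cross-products about the class mean (2-D)."""
    m = mean(points)
    sxx = sum((p[0] - m[0]) ** 2 for p in points)
    sxy = sum((p[0] - m[0]) * (p[1] - m[1]) for p in points)
    syy = sum((p[1] - m[1]) ** 2 for p in points)
    return sxx, sxy, syy

def lda_split(class_a, class_b):
    """Return (w, cutoff) with w = Sw^{-1} (mean_a - mean_b).

    Sw is the pooled within-class scatter matrix; the cutoff is the
    midpoint of the two projected class means. This is an assumed
    stand-in for a projection pursuit index, not the PPtree code.
    """
    ma, mb = mean(class_a), mean(class_b)
    sxx, sxy, syy = (a + b for a, b in zip(scatter(class_a), scatter(class_b)))
    det = sxx * syy - sxy * sxy
    dx, dy = ma[0] - mb[0], ma[1] - mb[1]
    # Closed-form inverse of the 2x2 matrix [[sxx, sxy], [sxy, syy]].
    w = ((syy * dx - sxy * dy) / det, (sxx * dy - sxy * dx) / det)
    cutoff = 0.5 * (dot(w, ma) + dot(w, mb))
    return w, cutoff

def best_axis_accuracy(class_a, class_b, axis):
    """Best accuracy of any single-variable threshold split on `axis`."""
    points = [(p, "a") for p in class_a] + [(p, "b") for p in class_b]
    best = 0.0
    for thr in sorted({p[axis] for p, _ in points}):
        hits = sum((p[axis] > thr) == (label == "a") for p, label in points)
        acc = hits / len(points)
        best = max(best, acc, 1 - acc)  # allow either orientation
    return best

# Synthetic data: both classes lie along the diagonal x2 = x1, offset
# from each other in the direction x2 - x1 (strongly correlated variables).
ts = [-5 + 0.1 * i for i in range(101)]
es = [0.4 if i % 2 == 0 else -0.4 for i in range(101)]
class_a = [(t, t + 1 + e) for t, e in zip(ts, es)]
class_b = [(t, t - 1 - e) for t, e in zip(ts, es)]

w, cutoff = lda_split(class_a, class_b)
hits = sum(dot(w, p) > cutoff for p in class_a)
hits += sum(dot(w, p) <= cutoff for p in class_b)
oblique_acc = hits / (len(class_a) + len(class_b))
axis_acc = max(best_axis_accuracy(class_a, class_b, axis) for axis in (0, 1))

print("oblique split accuracy:", oblique_acc)
print("best single-variable split accuracy:", round(axis_acc, 3))
```

On this toy data the single oblique split separates the two classes, while no threshold on either variable alone comes close, which is the situation the abstract describes as separation occurring in combinations of variables.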

Interactive Graphics for Visually Diagnosing Forest Classifiers in R

Apr 08, 2017

Natalia da Silva, Dianne Cook, Eun-Kyung Lee

Abstract: This paper describes structuring data and constructing plots to explore forest classification models interactively. A forest classifier is an example of an ensemble, produced by bagging multiple trees. The process of bagging and combining results from multiple trees produces numerous diagnostics which, with interactive graphics, can provide considerable insight into class structure in high dimensions. Various aspects are explored in this paper to assess model complexity, individual model contributions, variable importance and dimension reduction, and uncertainty in prediction associated with individual observations. The ideas are applied to the random forest algorithm and to the projection pursuit forest, but could be more broadly applied to other bagged ensembles. Interactive graphics are built in R, using the ggplot2, plotly, and shiny packages.
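
As a rough illustration of the kind of diagnostics bagging produces, the following Python sketch computes out-of-bag (OOB) vote proportions per observation and an OOB error rate. It is a sketch under stated assumptions, not the paper's R code: the base learner is a one-split tree (a stump) chosen for brevity rather than a random forest or projection pursuit tree, the data are a hypothetical one-variable toy set with one deliberately ambiguous observation, and all function names are made up for the example.

```python
import random
from collections import Counter

def fit_stump(xs, ys):
    """Fit a one-split tree: keep the threshold with the fewest training errors."""
    majority = Counter(ys).most_common(1)[0][0]
    best = (len(ys) + 1, None, majority, majority)  # errors, threshold, left, right
    values = sorted(set(xs))
    for lo, hi in zip(values, values[1:]):
        thr = (lo + hi) / 2
        left = [y for x, y in zip(xs, ys) if x <= thr]
        right = [y for x, y in zip(xs, ys) if x > thr]
        left_label = Counter(left).most_common(1)[0][0]
        right_label = Counter(right).most_common(1)[0][0]
        errors = sum(y != left_label for y in left)
        errors += sum(y != right_label for y in right)
        if errors < best[0]:
            best = (errors, thr, left_label, right_label)
    return best[1:]

def predict_stump(stump, x):
    thr, left_label, right_label = stump
    if thr is None:  # sample had a single distinct x value
        return left_label
    return left_label if x <= thr else right_label

def bagged_oob_votes(xs, ys, n_trees=200, seed=1):
    """Bag stumps and collect, per observation, votes from trees that did not see it."""
    rng = random.Random(seed)
    n = len(xs)
    votes = [Counter() for _ in range(n)]
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]  # bootstrap sample
        stump = fit_stump([xs[i] for i in idx], [ys[i] for i in idx])
        in_bag = set(idx)
        for i in range(n):
            if i not in in_bag:
                votes[i][predict_stump(stump, xs[i])] += 1
    return votes

# Toy data: class 0 at x = 0..9, class 1 at x = 20..29, plus one
# observation at x = 15 labelled 0 that sits between the two groups.
xs = [float(i) for i in range(10)] + [float(i) for i in range(20, 30)] + [15.0]
ys = [0] * 10 + [1] * 10 + [0]
ambiguous = len(xs) - 1

votes = bagged_oob_votes(xs, ys)
classes = sorted(set(ys))
proportions = [
    {c: v[c] / sum(v.values()) for c in classes} if v else None
    for v in votes
]

# OOB error: majority OOB vote compared with the true label.
scored = [(p, y) for p, y in zip(proportions, ys) if p is not None]
oob_error = sum(max(p, key=p.get) != y for p, y in scored) / len(scored)

print("OOB error:", round(oob_error, 3))
print("vote proportions for the ambiguous observation:", proportions[ambiguous])
```

Observations deep inside a class receive nearly unanimous OOB votes, while the ambiguous one receives split votes. Per-observation vote proportions of this kind are one way to express the prediction uncertainty that the abstract mentions.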
