Abstract: We address the problem of computing a single linkage dendrogram. A possible approach is to: (i) Form an edge-weighted graph $G$ over the data, with edge weights reflecting dissimilarities. (ii) Calculate the MST $T$ of $G$. (iii) Break the longest edge of $T$, thereby splitting it into subtrees $T_L$, $T_R$. (iv) Apply the splitting process recursively to the subtrees. This approach has the attractive feature that Prim's algorithm for MST construction calculates distances as needed, so the inter-point distance matrix never needs to be stored. The recursive partitioning algorithm requires us to determine the vertices (and edges) of $T_L$ and $T_R$. We show how this can be done easily and efficiently using information generated by Prim's algorithm, without any additional computational cost.
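The pipeline in steps (i)-(iv) can be sketched as follows. This is a minimal illustration, not the paper's method: it computes the MST with Prim's algorithm (evaluating Euclidean distances on demand, so no distance matrix is stored) and then splits recursively by removing the longest edge with a plain flood-fill, rather than the paper's trick of reusing Prim's bookkeeping to identify $T_L$ and $T_R$. All function names here are hypothetical.

```python
import math
from collections import defaultdict

def prim_mst(points, dist=math.dist):
    """Prim's algorithm; distances are computed on demand,
    so the full inter-point distance matrix is never stored."""
    n = len(points)
    in_tree = [False] * n
    best = [math.inf] * n        # cheapest known connection cost to the tree
    parent = [-1] * n
    best[0] = 0.0
    edges = []                   # MST edges as (weight, u, v)
    for _ in range(n):
        u = min((i for i in range(n) if not in_tree[i]), key=lambda i: best[i])
        in_tree[u] = True
        if parent[u] != -1:
            edges.append((best[u], parent[u], u))
        for v in range(n):
            if not in_tree[v]:
                d = dist(points[u], points[v])
                if d < best[v]:
                    best[v], parent[v] = d, u
    return edges

def split_recursive(vertices, edges):
    """Break the longest MST edge and recurse on the two subtrees.
    Returns a nested tuple (edge weight, left dendrogram, right dendrogram)."""
    if len(vertices) <= 1:
        return vertices[0] if vertices else None
    longest = max(edges)                     # (weight, u, v) with largest weight
    w, u, v = longest
    rest = [e for e in edges if e != longest]
    adj = defaultdict(list)
    for _, a, b in rest:
        adj[a].append(b)
        adj[b].append(a)
    # Flood-fill from u to find the vertices of one subtree (T_L).
    left, stack = {u}, [u]
    while stack:
        x = stack.pop()
        for y in adj[x]:
            if y not in left:
                left.add(y)
                stack.append(y)
    L = [x for x in vertices if x in left]
    R = [x for x in vertices if x not in left]
    eL = [e for e in rest if e[1] in left]
    eR = [e for e in rest if e[1] not in left]
    return (w, split_recursive(L, eL), split_recursive(R, eR))
```

For example, on the four points `[(0,0), (0,1), (5,0), (5,1)]` the longest MST edge (weight 5) is broken first, separating the two unit-distance pairs.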
Abstract: Bagging is a device intended for reducing the prediction error of learning algorithms. In its simplest form, bagging draws bootstrap samples from the training sample, applies the learning algorithm to each bootstrap sample, and then averages the resulting prediction rules. We extend the definition of bagging from statistics to statistical functionals and study the von Mises expansion of bagged statistical functionals. We show that the expansion is related to the Efron-Stein ANOVA expansion of the raw (unbagged) functional. The basic observation is that a bagged functional is always smooth, in the sense that its von Mises expansion exists and is finite of length $M + 1$, where $M$ is the resample size. This holds even if the raw functional is rough or unstable. The resample size $M$ acts as a smoothing parameter, where a smaller $M$ means more smoothing.
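In its simplest form, the bagged version of a statistic can be approximated by Monte Carlo: average the raw statistic over many bootstrap resamples of size $M$. The sketch below is only this Monte Carlo approximation (the paper studies the exact bagged functional), applied to a "rough" statistic such as the sample median; the function names are hypothetical.

```python
import random
import statistics

def bagged(functional, sample, M, B=2000, seed=0):
    """Monte Carlo approximation of the bagged functional:
    the average of `functional` over B bootstrap resamples of size M.
    A smaller resample size M corresponds to more smoothing."""
    rng = random.Random(seed)
    return statistics.fmean(
        functional(rng.choices(sample, k=M))  # bootstrap resample of size M
        for _ in range(B)
    )
```

Usage: `bagged(statistics.median, data, M=len(data))` smooths the raw median, which jumps discontinuously as data points move; the bagged version varies smoothly with the sample.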