Joe Eaton

cuSLINK: Single-linkage Agglomerative Clustering on the GPU

Jun 28, 2023

Corey J. Nolet, Divye Gala, Alex Fender, Mahesh Doijade, Joe Eaton, Edward Raff, John Zedlewski, Brad Rees, Tim Oates

Abstract: In this paper, we propose cuSLINK, a novel and state-of-the-art reformulation of the SLINK algorithm on the GPU which requires only $O(Nk)$ space and uses a parameter $k$ to trade off space and time. We also propose a set of novel and reusable building blocks that compose cuSLINK. These building blocks include highly optimized computational patterns for $k$-NN graph construction, spanning trees, and dendrogram cluster extraction. We show how we used our primitives to implement cuSLINK end-to-end on the GPU, further enabling a wide range of real-world data mining and machine learning applications that were once intractable. In addition to being a primary computational bottleneck in the popular HDBSCAN algorithm, the impact of our end-to-end cuSLINK algorithm spans a large range of important applications, including cluster analysis in social and computer networks, natural language processing, and computer vision. Users can obtain cuSLINK at https://docs.rapids.ai/api/cuml/latest/api/#agglomerative-clustering

* To appear in ECML PKDD 2023 by Springer Nature
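
The abstract above names spanning trees and dendrogram cluster extraction as building blocks. As a rough CPU illustration of that idea, and not the GPU implementation the paper describes, the sketch below builds a minimum spanning tree with Kruskal's algorithm and stops merging once the requested number of clusters remains, which is the standard way to obtain single-linkage clusters. The function name `single_linkage`, the toy points, and the use of all pairwise edges (rather than a $k$-NN graph) are assumptions made for the example.

```python
from itertools import combinations
from math import dist


def single_linkage(points, n_clusters):
    """Single-linkage clustering via a minimum spanning tree (Kruskal)."""
    n = len(points)
    parent = list(range(n))

    def find(i):
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # All pairwise edges, sorted by distance. This takes O(N^2) space;
    # restricting the edges to a k-NN graph is what would bring it to O(Nk).
    edges = sorted(
        (dist(points[i], points[j]), i, j) for i, j in combinations(range(n), 2)
    )

    merges = []  # dendrogram steps as (distance, root_a, root_b)
    components = n
    for d, i, j in edges:
        if components <= n_clusters:
            break  # "cut" the dendrogram at the requested cluster count
        ri, rj = find(i), find(j)
        if ri != rj:
            parent[rj] = ri
            merges.append((d, ri, rj))
            components -= 1

    # Relabel roots to consecutive cluster ids in order of first appearance.
    ids = {}
    labels = [ids.setdefault(find(i), len(ids)) for i in range(n)]
    return labels, merges


# Hypothetical toy data: two tight groups and one outlier.
points = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.2), (5.0, 5.0), (5.1, 5.0), (9.0, 0.0)]
labels, merges = single_linkage(points, n_clusters=3)
print(labels)  # [0, 0, 0, 1, 1, 2]
```

The all-pairs edge list keeps the sketch short; on a sparse $k$-NN graph the tree may come out disconnected, and a real implementation would need a step to reconnect the components.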

Semiring Primitives for Sparse Neighborhood Methods on the GPU

Apr 13, 2021

Corey J. Nolet, Divye Gala, Edward Raff, Joe Eaton, Brad Rees, John Zedlewski, Tim Oates

Abstract: High-performance primitives for mathematical operations on sparse vectors must deal with the challenges of skewed degree distributions and limits on memory consumption that are typically not issues in dense operations. We demonstrate that a sparse semiring primitive can be flexible enough to support a wide range of critical distance measures while maintaining performance and memory efficiency on the GPU. We further show that this primitive is a foundational component for enabling many neighborhood-based information retrieval and machine learning algorithms to accept sparse input. To our knowledge, this is the first work aiming to unify the computation of several critical distance measures on the GPU under a single flexible design paradigm, and we hope that it provides a good baseline for future research in this area. Our implementation is fully open source and publicly available at https://github.com/rapidsai/cuml.
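
To make the "single flexible design paradigm" in the abstract above more concrete, here is a minimal CPU sketch of how one routine can produce several distance measures by swapping a pairwise "product" operation and a "reduce" operation. It is an illustration under assumptions, not the paper's GPU primitive: sparse vectors are plain dicts, and the names `semiring_reduce`, `dot`, `manhattan`, and `chebyshev` and the `union` flag are hypothetical.

```python
def semiring_reduce(a, b, product, reduce_op, identity, union=False):
    """Combine two sparse vectors (dicts of index -> value) with swappable ops."""
    # Dot-product-like measures only need indices present in both vectors;
    # difference-based measures must also visit indices present in just one.
    keys = (a.keys() | b.keys()) if union else (a.keys() & b.keys())
    acc = identity
    for k in sorted(keys):
        acc = reduce_op(acc, product(a.get(k, 0.0), b.get(k, 0.0)))
    return acc


def dot(a, b):
    # (multiply, add) over the intersection of nonzero indices.
    return semiring_reduce(a, b, lambda x, y: x * y, lambda s, t: s + t, 0.0)


def manhattan(a, b):
    # (absolute difference, add) over the union of nonzero indices.
    return semiring_reduce(
        a, b, lambda x, y: abs(x - y), lambda s, t: s + t, 0.0, union=True
    )


def chebyshev(a, b):
    # (absolute difference, max) over the union of nonzero indices.
    return semiring_reduce(a, b, lambda x, y: abs(x - y), max, 0.0, union=True)


# Hypothetical sparse vectors with partially overlapping nonzeros.
u = {0: 1.0, 3: 2.0, 7: 4.0}
v = {3: 5.0, 7: 1.0, 9: 3.0}
print(dot(u, v), manhattan(u, v), chebyshev(u, v))  # 14.0 10.0 3.0
```

The `union` flag marks the main difference between the measures: multiplying by a missing (zero) entry contributes nothing, while subtracting one does, so the difference-based distances cannot skip indices that appear in only one vector.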

Attack Graph Convolutional Networks by Adding Fake Nodes

Oct 26, 2018

Xiaoyun Wang, Joe Eaton, Cho-Jui Hsieh, Felix Wu

Abstract: Graph convolutional networks (GCNs) have been widely used for classifying graph nodes in the semi-supervised setting. Previous work has shown that GCNs are vulnerable to perturbations of the adjacency and feature matrices of existing nodes. However, it is unrealistic to change existing nodes in many applications, such as existing users in social networks. In this paper, we design algorithms to attack GCNs by adding fake nodes. A greedy algorithm is proposed to generate adjacency and feature matrices of fake nodes, aiming to minimize the classification accuracy on the existing nodes. In addition, we introduce a discriminator to classify fake nodes from real nodes, and propose a Greedy-GAN attack to simultaneously update the discriminator and the attacker, to make fake nodes indistinguishable from the real ones. Our non-targeted attack decreases the accuracy of GCN down to 0.10, and our targeted attack reaches a success rate of 99% on the whole datasets, and 94% on average for attacking a single target node.
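
The greedy idea in the abstract above can be illustrated with a heavily simplified toy, which should not be read as the paper's algorithm. The assumptions are: a single linear GCN-style layer with fixed weights, one fake node with fixed features (the paper also generates the fake features), and a classification-margin objective standing in for accuracy. All names (`propagate`, `total_margin`, `greedy_fake_node`) and the toy graph are hypothetical.

```python
from math import sqrt


def propagate(adj, feats, weights):
    """One linear GCN-style layer: logits = D^-1/2 (A + I) D^-1/2 X W."""
    n = len(adj)
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)] for i in range(n)]
    deg = [sum(row) for row in a_hat]
    logits = []
    for i in range(n):
        # Aggregate normalized neighbor features, then apply the fixed weights.
        agg = [
            sum(a_hat[i][j] / sqrt(deg[i] * deg[j]) * feats[j][f] for j in range(n))
            for f in range(len(feats[0]))
        ]
        logits.append(
            [sum(agg[f] * weights[f][c] for f in range(len(agg)))
             for c in range(len(weights[0]))]
        )
    return logits


def total_margin(adj, feats, weights, labels):
    """Sum over real nodes of (true-class logit - best other-class logit)."""
    logits = propagate(adj, feats, weights)
    total = 0.0
    for i, y in enumerate(labels):  # labels covers the real nodes only
        other = max(v for c, v in enumerate(logits[i]) if c != y)
        total += logits[i][y] - other
    return total


def greedy_fake_node(adj, feats, weights, labels, fake_feat, budget):
    """Append one fake node, then greedily add up to `budget` edges from it."""
    n = len(adj)
    adj = [row[:] + [0.0] for row in adj] + [[0.0] * (n + 1)]  # copy and grow
    feats = feats + [fake_feat]
    best = total_margin(adj, feats, weights, labels)
    for _ in range(budget):
        choice = None
        for v in range(n):
            if adj[n][v]:
                continue  # edge already present
            adj[n][v] = adj[v][n] = 1.0  # try the edge ...
            score = total_margin(adj, feats, weights, labels)
            adj[n][v] = adj[v][n] = 0.0  # ... and undo it
            if score < best:
                best, choice = score, v
        if choice is None:
            break  # no remaining edge lowers the margin
        adj[n][choice] = adj[choice][n] = 1.0
    return adj, best


# Hypothetical toy graph: nodes 0-1 are class 0, nodes 2-3 are class 1.
adj = [[0.0, 1.0, 0.0, 0.0], [1.0, 0.0, 0.0, 0.0],
       [0.0, 0.0, 0.0, 1.0], [0.0, 0.0, 1.0, 0.0]]
feats = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]]
weights = [[1.0, 0.0], [0.0, 1.0]]
labels = [0, 0, 1, 1]

clean = total_margin(adj, feats, weights, labels)
attacked_adj, attacked = greedy_fake_node(adj, feats, weights, labels, [0.0, 1.0], budget=2)
print(clean, attacked)
```

Because an isolated fake node leaves the real nodes' logits unchanged and an edge is kept only when it lowers the margin, the attacked margin in this sketch can never exceed the clean one.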
