Abstract: Diffusions and related random walk procedures are of central importance in many areas of machine learning, data analysis, and applied mathematics. Because they spread mass agnostically at each step in an iterative manner, they can sometimes spread mass "too aggressively," thereby failing to find the "right" clusters. We introduce a novel Capacity Releasing Diffusion (CRD) Process, which is both faster and stays more local than the classical spectral diffusion process. As an application, we use the CRD Process to develop an improved local algorithm for graph clustering. Our method finds local clusters when the CRD Process is started inside a cluster whose vertices are connected better internally than externally by an $O(\log^2 n)$ factor, where $n$ is the number of nodes in the cluster. The CRD Process is thus the first local graph clustering algorithm that is not subject to the well-known quadratic Cheeger barrier. Our result requires a certain smoothness condition, which we expect to be an artifact of our analysis. Our empirical evaluation demonstrates improved results, in particular on realistic social graphs that contain moderately good, but not very good, clusters.
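To make the idea concrete, here is a minimal, hedged sketch of a capacity-releasing-style diffusion. It is not the paper's exact CRD Process: the function name `crd_like_diffusion`, the single push sweep per round, and the toy graph are all illustrative choices. The sketch only captures the two ingredients named in the abstract: mass above a node's capacity (its degree) is pushed to neighbours, and capacity is "released" by doubling the mass each round, so the diffusion grows while staying local.

```python
from collections import defaultdict

def crd_like_diffusion(adj, seed, rounds=4):
    """Hedged sketch of a capacity-releasing-style diffusion (not the
    paper's exact algorithm).  A node's capacity is its degree; mass
    above capacity is spread evenly to neighbours, and all mass is
    doubled each round to release capacity."""
    mass = defaultdict(float)
    mass[seed] = float(len(adj[seed]))  # start with mass equal to the seed's degree
    for _ in range(rounds):
        # "Release capacity": double all current mass.
        for v in list(mass):
            mass[v] *= 2.0
        # One push sweep: each node keeps mass up to its degree and
        # spreads the excess evenly across its neighbours.
        pushed = defaultdict(float)
        for v, m in list(mass.items()):
            deg = len(adj[v])
            excess = m - deg
            if excess > 0:
                mass[v] = float(deg)
                for u in adj[v]:
                    pushed[u] += excess / deg
        for u, x in pushed.items():
            mass[u] += x
    return dict(mass)

if __name__ == "__main__":
    # Two triangles joined by a single edge (2-3); seed in the left one.
    adj = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3],
           3: [2, 4, 5], 4: [3, 5], 5: [3, 4]}
    support = crd_like_diffusion(adj, seed=0)
    print(sorted(support.items()))  # mass concentrates on nodes 0, 1, 2
```

On this toy graph the mass stays almost entirely on the seed's triangle, which is the locality behaviour the abstract contrasts with classical spectral diffusion.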
Abstract: Classifiers are often used to detect miscreant activities. We study how an adversary can systematically query a classifier to elicit information that allows the adversary to evade detection while incurring a near-minimal cost of modifying their intended malfeasance. We generalize the theory of Lowd and Meek (2005) to the family of convex-inducing classifiers, which partition input space into two sets, one of which is convex. We present query algorithms for this family that construct undetected instances of approximately minimal cost using only polynomially many queries in the dimension of the space and in the level of approximation. Our results demonstrate that near-optimal evasion can be accomplished without reverse-engineering the classifier's decision boundary. We also consider general $\ell_p$ costs and show that near-optimal evasion on the family of convex-inducing classifiers is generally efficient for both positive and negative convexity, for all levels of approximation, when $p = 1$.
Abstract: Classifiers are often used to detect miscreant activities. We study how an adversary can efficiently query a classifier to elicit information that allows the adversary to evade detection at near-minimal cost. We generalize results of Lowd and Meek (2005) to convex-inducing classifiers. We present algorithms that construct undetected instances of near-minimal cost using only polynomially many queries in the dimension of the space, and without reverse-engineering the decision boundary.
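The two abstracts above rely on membership queries rather than boundary reconstruction. The following is a hedged sketch of the core query primitive in this line of work: a binary search along the segment between a known-negative instance and the attacker's target, using only classifier queries. It is a simplified stand-in, not the papers' full algorithms; the names `classify`, `x_attack`, and `x_benign` and the toy ball detector are assumptions for illustration. When one of the induced sets is convex, the segment crosses the decision boundary exactly once, so the search finds a near-boundary undetected instance in $O(\log(1/\mathrm{tol}))$ queries.

```python
import numpy as np

def evade_by_line_search(classify, x_attack, x_benign, tol=1e-3):
    """Binary search from a known-negative point toward the attack
    target using only membership queries.  `classify` returns True
    for detected (positive) instances.  Illustrative sketch only."""
    assert not classify(x_benign) and classify(x_attack)
    lo, hi = 0.0, 1.0  # fraction of the way from x_benign to x_attack
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        x = x_benign + mid * (x_attack - x_benign)
        if classify(x):
            hi = mid   # detected: back off toward the benign side
        else:
            lo = mid   # undetected: move closer to the target
    return x_benign + lo * (x_attack - x_benign)

if __name__ == "__main__":
    # Toy convex-inducing detector: flags points inside the unit ball
    # around the attack target (the positive set is convex).
    target = np.array([2.0, 0.0])
    classify = lambda x: np.linalg.norm(x - target) < 1.0
    x_neg = np.array([0.0, 0.0])
    x_evade = evade_by_line_search(classify, target, x_neg)
    print(x_evade, bool(classify(x_evade)))  # near the boundary, undetected
```

With tol = 1e-3 this issues about ten queries, independent of how the boundary is parameterized, which is the sense in which evasion succeeds without reverse-engineering the classifier.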