Abstract: This paper presents a practical and theoretically well-founded approach to improving the speed of kernel manifold learning algorithms that rely on spectral decomposition. Building on recent insights in kernel smoothing and learning with integral operators, we propose Reduced Set KPCA (RSKPCA); the same analysis suggests an easy-to-implement method for removing or replacing samples with minimal effect on the empirical operator. A simple data point selection procedure is given to generate a substitute density for the data, with accuracy governed by a user-tunable parameter. The effect of the approximation on the quality of the KPCA solution, in terms of spectral and operator errors, can be expressed directly in terms of the density estimate error and as a function of this parameter. We show experimentally that RSKPCA can improve both the training and evaluation times of KPCA by up to an order of magnitude, and that it compares favorably to the widely used Nyström and density-weighted Nyström methods.
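As a point of reference for the comparison above, here is a minimal NumPy sketch of the baseline Nyström approximation to KPCA (not the proposed RSKPCA): the RBF kernel, the bandwidth gamma, and the uniform landmark sampling are all illustrative assumptions, not choices taken from the paper.

```python
import numpy as np

def rbf(A, B, gamma):
    """RBF kernel matrix between the rows of A and the rows of B."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

def nystrom_kpca(X, landmarks, gamma, n_components):
    """Project X onto approximate kernel principal components using the
    Nystrom feature map phi(x) = W^{-1/2} k(landmarks, x)."""
    W = rbf(landmarks, landmarks, gamma)      # m x m landmark kernel
    C = rbf(X, landmarks, gamma)              # n x m cross kernel
    evals, evecs = np.linalg.eigh(W)
    evals = np.clip(evals, 1e-12, None)       # guard tiny/negative eigenvalues
    U = C @ evecs / np.sqrt(evals)            # n x m approximate feature map
    Uc = U - U.mean(axis=0)                   # center in feature space
    _, _, vt = np.linalg.svd(Uc, full_matrices=False)
    return Uc @ vt[:n_components].T

rng = np.random.default_rng(0)
X = rng.standard_normal((500, 10))
Z = nystrom_kpca(X, X[rng.choice(500, 50, replace=False)],
                 gamma=0.1, n_components=2)
print(Z.shape)                                # (500, 2)
```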
Abstract: Over the past five decades, k-means has become the clustering algorithm of choice in many application domains, primarily due to its simplicity, time/space efficiency, and invariance to the ordering of the data points. Unfortunately, the algorithm's sensitivity to the initial selection of the cluster centers remains its most serious drawback. Numerous initialization methods have been proposed to address this drawback. Many of these methods, however, have time complexity superlinear in the number of data points, which makes them impractical for large data sets. On the other hand, linear methods are often random and/or sensitive to the ordering of the data points. These methods are generally unreliable in that the quality of their results is unpredictable. Therefore, it is common practice to perform multiple runs of such methods and take the output of the run that produces the best results. Such a practice, however, greatly increases the computational requirements of the otherwise highly efficient k-means algorithm. In this chapter, we investigate the empirical performance of six linear, deterministic (non-random), and order-invariant k-means initialization methods on a large and diverse collection of data sets from the UCI Machine Learning Repository. The results demonstrate that two relatively unknown hierarchical initialization methods due to Su and Dy outperform the remaining four methods with respect to two objective effectiveness criteria. In addition, a recent method due to Erisoglu et al. performs surprisingly poorly.
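For concreteness, the best-of-r-runs practice discussed above looks like the following sketch, using scikit-learn's KMeans with random initialization and within-cluster SSE (inertia_) as the selection criterion; the number of runs and the seeding scheme are illustrative.

```python
from sklearn.cluster import KMeans

def best_of_r_runs(X, k, r=10):
    """Run randomly initialized k-means r times and keep the run with
    the lowest within-cluster sum of squared errors (SSE)."""
    best = None
    for seed in range(r):
        km = KMeans(n_clusters=k, init="random", n_init=1,
                    random_state=seed).fit(X)
        if best is None or km.inertia_ < best.inertia_:
            best = km
    return best
```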
Abstract: K-means is undoubtedly the most widely used partitional clustering algorithm. Unfortunately, due to its gradient descent nature, this algorithm is highly sensitive to the initial placement of the cluster centers. Numerous initialization methods have been proposed to address this problem. Many of these methods, however, have superlinear complexity in the number of data points, making them impractical for large data sets. On the other hand, linear methods are often random and/or order-sensitive, which renders their results unrepeatable. Recently, Su and Dy proposed two highly successful hierarchical initialization methods named Var-Part and PCA-Part that are not only linear, but also deterministic (non-random) and order-invariant. In this paper, we propose a discriminant analysis based approach that addresses a common deficiency of these two methods. Experiments on a large and diverse collection of data sets from the UCI Machine Learning Repository demonstrate that Var-Part and PCA-Part are highly competitive with one of the best random initialization methods to date, i.e., k-means++, and that the proposed approach significantly improves the performance of both hierarchical methods.
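As a rough illustration of the hierarchical scheme, here is a minimal NumPy sketch in the spirit of PCA-Part: repeatedly split the cluster with the largest SSE along its principal direction at the projected mean, and use the resulting cluster means as initial centers. This is a simplified reading of Su and Dy's method, not a reference implementation, and it assumes reasonably non-degenerate data.

```python
import numpy as np

def pca_part(X, k):
    """PCA-Part-style initializer: split the cluster with the largest
    SSE along its principal axis until k clusters remain, then return
    the cluster means as the initial centers."""
    clusters = [np.arange(len(X))]
    while len(clusters) < k:
        sse = [((X[c] - X[c].mean(0)) ** 2).sum() for c in clusters]
        idx = clusters.pop(int(np.argmax(sse)))
        pts = X[idx] - X[idx].mean(0)                 # center the cluster
        _, _, vt = np.linalg.svd(pts, full_matrices=False)
        proj = pts @ vt[0]                            # project on principal axis
        left, right = idx[proj <= 0], idx[proj > 0]   # split at projected mean
        if len(left) == 0 or len(right) == 0:         # degenerate split guard
            left, right = idx[: len(idx) // 2], idx[len(idx) // 2:]
        clusters += [left, right]
    return np.vstack([X[c].mean(0) for c in clusters])
```

Var-Part follows the same recursion but splits along the single coordinate with the greatest variance instead of the principal axis.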
Abstract: K-means is undoubtedly the most widely used partitional clustering algorithm. Unfortunately, due to its gradient descent nature, this algorithm is highly sensitive to the initial placement of the cluster centers. Numerous initialization methods have been proposed to address this problem. In this paper, we first present an overview of these methods with an emphasis on their computational efficiency. We then compare eight commonly used linear time complexity initialization methods on a large and diverse collection of data sets using various performance criteria. Finally, we analyze the experimental results using non-parametric statistical tests and provide recommendations for practitioners. We demonstrate that popular initialization methods often perform poorly and that there are in fact strong alternatives to these methods.
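As an illustration of the kind of non-parametric analysis used, the sketch below applies the Friedman test to a matrix of scores with data sets as blocks and initialization methods as treatments; the score matrix here is synthetic and purely illustrative, not data from the paper.

```python
import numpy as np
from scipy.stats import friedmanchisquare

# Rows: data sets (blocks); columns: initialization methods (treatments);
# entries: e.g., final SSE. Random values stand in for real results.
rng = np.random.default_rng(0)
scores = rng.random((20, 8))

stat, p = friedmanchisquare(*scores.T)    # one sample per method
print(f"Friedman chi-square = {stat:.2f}, p = {p:.4f}")
```

A small p-value indicates that at least one method differs systematically, after which pairwise post-hoc comparisons are typically performed.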
Abstract: Mukherjee (Pattern Recognition Letters, vol. 32, pp. 824-831, 2011) recently introduced a class of distance functions called weighted t-cost distances that generalize m-neighbor, octagonal, and t-cost distances. He proved that weighted t-cost distances form a family of metrics and derived an approximation for the Euclidean norm in $\mathbb{Z}^n$. In this note we compare this approximation to two previously proposed Euclidean norm approximations and demonstrate that the empirical average errors given by Mukherjee are significantly optimistic in $\mathbb{R}^n$. We also propose a simple normalization scheme that improves the accuracy of his approximation substantially with respect to both average and maximum relative errors.
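For illustration, the sketch below evaluates a weighted t-cost norm, assuming the standard definition of the t-cost norm as the sum of the t largest absolute coordinates and using the illustrative weights $w_t = 1/\sqrt{t}$; the error measurement mirrors the average/maximum relative error comparison described above. The normalization discussed in the note amounts to rescaling such an estimate by a constant, which is not reproduced here.

```python
import numpy as np

def weighted_t_cost_norm(x, w):
    """max over t of w[t-1] times the t-cost norm of x, where the
    t-cost norm is the sum of the t largest absolute coordinates."""
    d = np.cumsum(np.sort(np.abs(x))[::-1])   # d[t-1] = t-cost norm
    return np.max(w * d)

n = 5
w = 1.0 / np.sqrt(np.arange(1, n + 1))        # illustrative weights
rng = np.random.default_rng(0)
xs = rng.standard_normal((20000, n))
true = np.linalg.norm(xs, axis=1)
approx = np.array([weighted_t_cost_norm(x, w) for x in xs])
rel = np.abs(approx - true) / true
print(f"avg rel. error = {rel.mean():.4f}, max rel. error = {rel.max():.4f}")
```

With these weights the estimate never exceeds the Euclidean norm (by the Cauchy-Schwarz inequality), which is why a constant upward rescaling can reduce both error measures.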
Abstract: In this paper, a comprehensive survey of 48 filters for impulsive noise removal from color images is presented. The filters are formulated using a uniform notation and categorized into 8 families. The performance of these filters is compared on a large set of images that cover a variety of domains using three effectiveness criteria and one efficiency criterion. In order to ensure a fair efficiency comparison, a fast and accurate approximation for the inverse cosine function is introduced. In addition, commonly used distance measures (Minkowski, angular, and directional-distance) are analyzed and evaluated. Finally, suggestions are provided on how to choose a filter given certain requirements.
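As one classical example of such an approximation (not necessarily the one introduced in the survey), the sketch below uses an Abramowitz and Stegun style polynomial for the inverse cosine, which is accurate to roughly 7e-5 radians and suffices for angular distance computations between color vectors.

```python
import math

def fast_acos(x):
    """Polynomial approximation to acos(x) on [-1, 1] in the style of
    Abramowitz & Stegun 4.4.45; max error is about 7e-5 rad."""
    neg = x < 0.0
    x = abs(x)
    r = math.sqrt(1.0 - x) * (1.5707288 + x * (-0.2121144
            + x * (0.0742610 - 0.0187293 * x)))
    return math.pi - r if neg else r

# Angular distance between two color vectors (values are illustrative).
u, v = (10.0, 20.0, 30.0), (12.0, 19.0, 33.0)
dot = sum(a * b for a, b in zip(u, v))
nu = math.sqrt(sum(a * a for a in u))
nv = math.sqrt(sum(b * b for b in v))
theta = fast_acos(max(-1.0, min(1.0, dot / (nu * nv))))
print(theta)
```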
Abstract: In this paper, we present a fast switching filter for impulsive noise removal from color images. The filter exploits the HSL color space and is based on the peer group concept, which allows for the fast detection of noise in a neighborhood without resorting to pairwise distance computations among its pixels. Experiments on a large set of diverse images demonstrate that the proposed approach is not only extremely fast, but also gives excellent results in comparison to various state-of-the-art filters.
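A minimal sketch of a peer-group switching filter is given below. It works in RGB with a 3x3 window and illustrative thresholds, whereas the proposed filter operates in HSL; note that only center-to-neighbor distances are computed, never all pairwise distances, which is the source of the speedup mentioned above.

```python
import numpy as np

def peer_group_switch(img, d=45.0, m=3):
    """Flag a pixel as noisy if fewer than m of its 3x3 neighbors lie
    within distance d of it, and replace flagged pixels with the
    componentwise median of the window. Thresholds are illustrative."""
    H, W, _ = img.shape
    f = img.astype(np.float64)
    out = f.copy()
    for i in range(1, H - 1):
        for j in range(1, W - 1):
            win = f[i - 1:i + 2, j - 1:j + 2].reshape(9, 3)
            dist = np.linalg.norm(win - f[i, j], axis=1)
            if (dist <= d).sum() - 1 < m:     # exclude the pixel itself
                out[i, j] = np.median(win, axis=0)
    return out.astype(img.dtype)
```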
Abstract: Vector operators based on robust order statistics have proved successful in digital multichannel imaging applications, particularly color image filtering and enhancement, in dealing with impulsive noise while preserving edges and fine image details. These operators often have very high computational requirements, however, which limits their use in time-critical applications. This paper introduces techniques to speed up vector filters using the minimax approximation theory. Extensive experiments on a large and diverse set of color images show that the proposed approximations achieve an excellent balance among ease of implementation, accuracy, and computational speed.
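As a rough illustration of the idea, the sketch below builds a near-minimax polynomial for exp(-x), a kernel that appears in fuzzy weighted vector filters, using Chebyshev interpolation as a practical stand-in for a true Remez (minimax) fit; the interval and degree are illustrative assumptions, not values from the paper.

```python
import numpy as np
from numpy.polynomial import chebyshev as C

# Degree-7 Chebyshev interpolant of exp(-x) on [0, 8]; Chebyshev
# interpolation is near-minimax, i.e., within a small factor of the
# best possible maximum error for this degree.
coef = C.chebinterpolate(lambda t: np.exp(-4.0 * (t + 1.0)), 7)

def fast_exp_neg(x):
    """Approximate exp(-x) for x in [0, 8] via the interpolant
    (the argument is mapped from [0, 8] back to [-1, 1])."""
    return C.chebval(x / 4.0 - 1.0, coef)

x = np.linspace(0.0, 8.0, 10001)
print("max abs error:", np.abs(fast_exp_neg(x) - np.exp(-x)).max())
```

Replacing the transcendental call with such a polynomial trades a small, controlled error for a large reduction in per-pixel cost.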
Abstract: Euclidean norm calculations arise frequently in scientific and engineering applications. Several approximations for this norm with differing complexity and accuracy have been proposed in the literature. Earlier approaches were based on minimizing the maximum error. Recently, Seol and Cheun proposed an approximation based on minimizing the average error. In this paper, we first examine these approximations in detail, show that they fit into a single mathematical formulation, and compare their average and maximum errors. We then show that the maximum errors given by Seol and Cheun are significantly optimistic.
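For the two-dimensional case, the competing criteria can be illustrated with the classic alpha-max-plus-beta-min form; the sketch below measures average and maximum relative errors on the unit circle for the commonly quoted minimax coefficient pair and for a cheap shift-friendly pair. Neither pair is claimed to be Seol and Cheun's; the comparison merely shows how the two error criteria pull the coefficients in different directions.

```python
import numpy as np

def amb_norm(x, y, alpha, beta):
    """alpha*max(|x|,|y|) + beta*min(|x|,|y|) approximation of
    sqrt(x**2 + y**2)."""
    ax, ay = np.abs(x), np.abs(y)
    return alpha * np.maximum(ax, ay) + beta * np.minimum(ax, ay)

theta = np.linspace(0.0, np.pi / 4, 100001)   # one octant suffices by symmetry
x, y = np.cos(theta), np.sin(theta)           # unit vectors, true norm = 1

for a, b, label in [(0.9604, 0.3978, "minimax pair"),
                    (1.0, 0.5, "cheap max + min/2 pair")]:
    err = np.abs(amb_norm(x, y, a, b) - 1.0)  # relative error on unit circle
    print(f"{label}: avg = {err.mean():.4f}, max = {err.max():.4f}")
```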