Abstract: Sparse subspace clustering (SSC) using greedy neighbor selection, such as orthogonal matching pursuit (OMP), is a popular computationally efficient alternative to L1-minimization based methods. This paper proposes a new SSC scheme using generalized OMP (GOMP), an enhanced version of OMP that identifies multiple neighbors per iteration, together with a new stopping rule requiring nothing more than knowledge of the ambient signal dimension. Compared to conventional OMP, which identifies one neighbor per iteration, the proposed GOMP method requires fewer iterations and thus enjoys lower algorithmic complexity; moreover, the proposed stopping rule is free from off-line estimation of the subspace dimension and noise power. Under the semi-random model, analytic performance guarantees, in terms of neighbor recovery rates, are established to justify the advantages of the proposed GOMP. The results show that, with high probability, GOMP (i) is halted by the proposed stopping rule, and (ii) can retrieve more true neighbors than OMP, consequently yielding higher final data clustering accuracy. Computer simulations using both synthetic data and real human face data are provided to validate our analytic study and demonstrate the effectiveness of the proposed approach.
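To make the GOMP-based neighbor selection concrete, the sketch below follows the standard greedy recipe: in each iteration, pick the L candidate points most correlated with the current residual, refit the query point on the enlarged support by least squares, and repeat. The stopping rule used here (halt once the number of selected neighbors reaches the ambient dimension n, or the residual becomes negligible) is only a simplified stand-in consistent with the dimension-only rule described in the abstract, not necessarily the authors' exact rule; the function name `gomp_neighbors` and the default `L=3` are illustrative choices.

```python
# Minimal GOMP-style neighbor selection sketch for SSC (assumptions noted above).
import numpy as np

def gomp_neighbors(Y, i, L=3, tol=1e-10):
    """Return indices of neighbors selected for data point Y[:, i].

    Y : (n, N) array of column-stacked data points (assumed unit-norm).
    i : index of the query point.
    L : number of neighbors identified per iteration (L = 1 reduces to OMP).
    """
    n, N = Y.shape
    y = Y[:, i]
    candidates = np.delete(np.arange(N), i)   # never select the point itself
    support = []                              # indices selected so far
    residual = y.copy()

    # Simplified stopping rule: ambient-dimension cap or negligible residual.
    while len(support) < n and np.linalg.norm(residual) > tol:
        # Correlate the residual with all remaining candidate points.
        scores = np.abs(Y[:, candidates].T @ residual)
        # Identify the L most correlated candidates in this iteration.
        picked = candidates[np.argsort(scores)[-L:]]
        support.extend(picked.tolist())
        candidates = np.setdiff1d(candidates, picked)
        # Refit y on the enlarged support and update the residual.
        coeffs, *_ = np.linalg.lstsq(Y[:, support], y, rcond=None)
        residual = y - Y[:, support] @ coeffs
    return support
```

In a full SSC pipeline, the neighbor sets returned for all data points would be assembled into a sparse affinity matrix and passed to spectral clustering to obtain the final segmentation.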
Abstract: Sparse subspace clustering (SSC) using greedy neighbor selection, such as matching pursuit (MP) and orthogonal matching pursuit (OMP), is a popular computationally efficient alternative to conventional L1-minimization based methods. In this paper, we derive coherence-based sufficient conditions guaranteeing correct neighbor identification using MP/OMP under deterministic bounded noise corruption. Our analyses exploit the maximum/minimum inner product between two noisy data points subject to a known upper bound on the noise level. The obtained sufficient conditions clearly reveal the impact of noise on greedy neighbor recovery. Specifically, they assert that, as long as the noise is small enough that the resultant perturbed residual vectors stay close to the desired subspace, both MP and OMP succeed in returning a correct neighbor subset. A striking finding is that, when the ground-truth subspaces are well separated from each other and the noise is not large, MP-based iterations, while enjoying lower algorithmic complexity, yield smaller perturbation of the residuals and are thus better able to identify correct neighbors, in turn achieving higher global data clustering accuracy. Extensive numerical experiments corroborate our theoretical study.
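The complexity gap between MP and OMP noted above comes from their residual updates. The sketch below contrasts the two, assuming unit-norm data columns: MP subtracts only the projection onto the single newly chosen atom, whereas OMP re-solves a least-squares fit over the whole selected support. The function names are illustrative, not taken from the paper.

```python
# Minimal sketch contrasting MP and OMP residual updates (assumptions noted above).
import numpy as np

def mp_update(residual, atom):
    # MP: peel off the component of the residual along the single new atom.
    return residual - (atom @ residual) * atom

def omp_update(y, Y_support):
    # OMP: re-project y onto the span of all atoms selected so far.
    coeffs, *_ = np.linalg.lstsq(Y_support, y, rcond=None)
    return y - Y_support @ coeffs
```

The lighter MP update avoids a least-squares solve per iteration, which is the source of its lower per-iteration cost discussed in the abstract.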