Nicolas Gillis

Identifiability of Nonnegative Tucker Decompositions -- Part I: Theory

May 19, 2025

Subhayan Saha, Giovanni Barbarino, Nicolas Gillis

Abstract: Tensor decompositions have become a central tool in data science, with applications in areas such as data analysis, signal processing, and machine learning. A key property of many tensor decompositions, such as the canonical polyadic decomposition, is identifiability: the factors are unique, up to trivial scaling and permutation ambiguities. This allows one to recover the ground-truth sources that generated the data. The Tucker decomposition (TD) is a central and widely used tensor decomposition model. However, it is in general not identifiable. In this paper, we study the identifiability of the nonnegative TD (nTD). By adapting and extending identifiability results of nonnegative matrix factorization (NMF), we provide uniqueness results for nTD. Our results require the nonnegative matrix factors to have some degree of sparsity (namely, satisfy the separability condition, or the sufficiently scattered condition), while the core tensor only needs to have some slices (or linear combinations of them) or unfoldings with full column rank (but does not need to be nonnegative). Under such conditions, we derive several procedures, using either unfoldings or slices of the input tensor, to obtain identifiable nTDs by minimizing the volume of unfoldings or slices of the core tensor.

* 40 pages, 2 figures
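
To make the non-identifiability of the plain Tucker model concrete, here is a minimal NumPy sketch. It is not taken from the paper: the helper `tucker`, the factor names `A`, `B`, `C`, the core `G`, the matrix `Q`, and all sizes are illustrative assumptions. It builds a third-order tensor from a core and three nonnegative factors, checks the mode-1 unfolding identity under NumPy's row-major ordering (other texts order the Kronecker factors differently), and shows that a change of basis on one factor, compensated in the core, leaves the tensor unchanged while keeping the factor nonnegative.

```python
import numpy as np

def tucker(G, A, B, C):
    """T[i,j,k] = sum over p,q,r of G[p,q,r] * A[i,p] * B[j,q] * C[k,r]."""
    return np.einsum("pqr,ip,jq,kr->ijk", G, A, B, C)

rng = np.random.default_rng(0)
A = rng.random((4, 2))               # nonnegative factors
B = rng.random((5, 3))
C = rng.random((6, 2))
G = rng.standard_normal((2, 3, 2))   # core, not required to be nonnegative
T = tucker(G, A, B, C)

# Mode-1 unfolding with row-major reshapes: T_(1) = A G_(1) (B kron C)^T.
T1 = T.reshape(4, -1)
G1 = G.reshape(2, -1)
unfolding_ok = np.allclose(T1, A @ G1 @ np.kron(B, C).T)

# A nonnegative, invertible Q keeps A @ Q nonnegative; applying Q^{-1} to the
# first mode of the core compensates exactly, so the tensor does not change.
Q = np.array([[1.0, 0.5],
              [0.0, 1.0]])
A2 = A @ Q
G2 = np.einsum("pq,qjk->pjk", np.linalg.inv(Q), G)
same_tensor = np.allclose(T, tucker(G2, A2, B, C))
```

Here `(A, G)` and `(A2, G2)` are two different nonnegative-factor Tucker models of the same tensor, which is why additional conditions on the factors (such as the sparsity conditions mentioned in the abstract) are needed for uniqueness.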

Efficient algorithms for the Hadamard decomposition

Apr 22, 2025

Samuel Wertz, Arnaud Vandaele, Nicolas Gillis

Abstract: The Hadamard decomposition is a powerful technique for data analysis and matrix compression, which decomposes a given matrix into the element-wise product of two or more low-rank matrices. In this paper, we develop an efficient algorithm to solve this problem, leveraging an alternating optimization approach that decomposes the global non-convex problem into a series of convex sub-problems. To improve performance, we explore advanced initialization strategies inspired by the singular value decomposition (SVD) and incorporate acceleration techniques by introducing momentum-based updates. Beyond optimizing the two-matrix case, we also extend the Hadamard decomposition framework to support more than two low-rank matrices, enabling approximations with higher effective ranks while preserving computational efficiency. Finally, we conduct extensive experiments to compare our method with the existing gradient descent-based approaches for the Hadamard decomposition and with traditional low-rank approximation techniques. The results highlight the effectiveness of our proposed method across diverse datasets.

* 7 pages, preprint submitted to IEEE MLSP 2025, code available from https://github.com/WertzSamuel/HadamardDecompositions
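
The "higher effective rank" remark can be illustrated with a small NumPy sketch. This is not the authors' algorithm (see their repository for that); the sizes and the names `W1`, `H1`, `W2`, `H2` are assumptions chosen for illustration. The element-wise product of a rank-`r1` and a rank-`r2` matrix has rank at most `r1 * r2`, yet it is described by only `(m + n) * (r1 + r2)` parameters.

```python
import numpy as np

rng = np.random.default_rng(0)
m, n, r1, r2 = 20, 25, 3, 3

# Two low-rank matrices, each stored through its factors.
W1, H1 = rng.standard_normal((m, r1)), rng.standard_normal((r1, n))
W2, H2 = rng.standard_normal((m, r2)), rng.standard_normal((r2, n))

# Hadamard (element-wise) product of the two low-rank matrices.
X = (W1 @ H1) * (W2 @ H2)

rank_X = np.linalg.matrix_rank(X)        # bounded above by r1 * r2
n_params_hadamard = (m + n) * (r1 + r2)  # parameters in W1, H1, W2, H2
n_params_lowrank = (m + n) * (r1 * r2)   # plain factorization of rank r1 * r2
```

For generic factors the rank of `X` exceeds both `r1` and `r2`, while the Hadamard model uses fewer parameters than a standard factorization of rank `r1 * r2`.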

An extrapolated and provably convergent algorithm for nonlinear matrix decomposition with the ReLU function

Mar 31, 2025

Nicolas Gillis, Margherita Porcelli, Giovanni Seraghiti

Abstract: Nonlinear matrix decomposition (NMD) with the ReLU function, denoted ReLU-NMD, is the following problem: given a sparse, nonnegative matrix $X$ and a factorization rank $r$, identify a rank-$r$ matrix $\Theta$ such that $X\approx \max(0,\Theta)$. This decomposition finds application in data compression, matrix completion with entries missing not at random, and manifold learning. The standard ReLU-NMD model minimizes the least squares error, that is, $\|X - \max(0,\Theta)\|_F^2$. The corresponding optimization problem is nondifferentiable and highly nonconvex. This motivated Saul to propose an alternative model, Latent-ReLU-NMD, where a latent variable $Z$ is introduced and satisfies $\max(0,Z)=X$ while minimizing $\|Z - \Theta\|_F^2$ ("A nonlinear matrix decomposition for mining the zeros of sparse data", SIAM J. Math. Data Sci., 2022). Our first contribution is to show that the two formulations may yield different low-rank solutions $\Theta$; in particular, we show that Latent-ReLU-NMD can be ill-posed when ReLU-NMD is not, meaning that there are instances in which the infimum of Latent-ReLU-NMD is not attained while that of ReLU-NMD is. We also consider another alternative model, called 3B-ReLU-NMD, which parameterizes $\Theta=WH$, where $W$ has $r$ columns and $H$ has $r$ rows, allowing one to get rid of the rank constraint in Latent-ReLU-NMD. Our second contribution is to prove the convergence of a block coordinate descent (BCD) method applied to 3B-ReLU-NMD and referred to as BCD-NMD. Our third contribution is a novel extrapolated variant of BCD-NMD, dubbed eBCD-NMD, which we prove is also convergent under mild assumptions. We illustrate the significant acceleration effect of eBCD-NMD compared to BCD-NMD, and also show that eBCD-NMD performs well against the state of the art on synthetic and real-world data sets.

* 27 pages. Codes and data available from https://github.com/giovanniseraghiti/ReLU-NMD
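
The latent formulation can be sketched in a few lines of NumPy. The block below is a naive alternating scheme for Latent-ReLU-NMD written only for illustration; it is neither BCD-NMD nor eBCD-NMD from the paper, and the function names, sizes, and iteration count are assumptions. Given $\Theta$, the closest $Z$ with $\max(0,Z)=X$ equals $X$ on its positive entries and $\min(0,\Theta)$ elsewhere; given $Z$, the closest rank-$r$ matrix is its truncated SVD.

```python
import numpy as np

def relu_nmd_error(X, Theta):
    """Least squares ReLU-NMD objective ||X - max(0, Theta)||_F^2."""
    return np.linalg.norm(X - np.maximum(0.0, Theta)) ** 2

def latent_update(X, Theta):
    """Closest Z to Theta subject to max(0, Z) = X."""
    return np.where(X > 0, X, np.minimum(0.0, Theta))

def truncated_svd(Z, r):
    """Best rank-r approximation of Z in the Frobenius norm."""
    U, s, Vt = np.linalg.svd(Z, full_matrices=False)
    return (U[:, :r] * s[:r]) @ Vt[:r]

def naive_latent_relu_nmd(X, r, iters=50):
    Theta = truncated_svd(X, r)
    history = []
    for _ in range(iters):
        Z = latent_update(X, Theta)       # exact minimization over Z
        Theta = truncated_svd(Z, r)       # exact minimization over Theta
        history.append(np.linalg.norm(Z - Theta) ** 2)
    return Theta, Z, history

# Synthetic sparse nonnegative data generated as X = max(0, low-rank matrix).
rng = np.random.default_rng(0)
Theta_true = rng.standard_normal((30, 3)) @ rng.standard_normal((3, 20))
X = np.maximum(0.0, Theta_true)

Theta, Z, history = naive_latent_relu_nmd(X, r=3)
final_error = relu_nmd_error(X, Theta)
```

Because both updates are exact minimizations of $\|Z-\Theta\|_F^2$ over one block, the values in `history` are non-increasing; this says nothing about convergence to a global minimizer.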

On the Robustness of the Successive Projection Algorithm

Nov 25, 2024

Giovanni Barbarino, Nicolas Gillis

Abstract: The successive projection algorithm (SPA) is a workhorse algorithm to learn the $r$ vertices of the convex hull of a set of $(r-1)$-dimensional data points, a.k.a. a latent simplex, which has numerous applications in data science. In this paper, we revisit the robustness to noise of SPA and several of its variants. In particular, when $r \geq 3$, we prove the tightness of the existing error bounds for SPA and for two more robust preconditioned variants of SPA. We also provide significantly improved error bounds for SPA, by a factor proportional to the conditioning of the $r$ vertices, in two special cases: for the first extracted vertex, and when $r \leq 2$. We then provide further improvements for the error bounds of a translated version of SPA proposed by Arora et al. ("A practical algorithm for topic modeling with provable guarantees", ICML, 2013) in two special cases: for the first two extracted vertices, and when $r \leq 3$. Finally, we propose a new, more robust variant of SPA that first shifts and lifts the data points in order to minimize the conditioning of the problem. We illustrate our results on synthetic data.

* 23 pages
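
For readers unfamiliar with SPA, the following is a minimal NumPy sketch of its standard form: repeatedly pick the column with the largest norm, then project all columns onto the orthogonal complement of the picked one. It does not include the preconditioned, translated, or shift-and-lift variants discussed in the abstract, and the function name `spa`, the synthetic data, and the sizes are assumptions made for illustration.

```python
import numpy as np

def spa(X, r):
    """Return the indices of r columns of X selected by successive projections."""
    R = X.astype(float).copy()
    selected = []
    for _ in range(r):
        # Pick the column of the residual with the largest Euclidean norm.
        j = int(np.argmax(np.sum(R ** 2, axis=0)))
        selected.append(j)
        # Project every column onto the orthogonal complement of that column.
        u = R[:, j] / np.linalg.norm(R[:, j])
        R = R - np.outer(u, u @ R)
    return selected

# Noiseless synthetic data: the first r columns are the vertices W,
# the remaining columns are convex combinations of them.
rng = np.random.default_rng(0)
r, m, n = 3, 5, 40
W = rng.random((m, r))
H = np.hstack([np.eye(r), rng.dirichlet(np.ones(r), size=n - r).T])
X = W @ H

idx = spa(X, r)
```

In this noiseless setting with linearly independent vertices, `idx` contains the positions of the vertex columns (in some order); the paper is concerned with how far the selected columns can drift from the true vertices once noise is added.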

Orthogonal Nonnegative Matrix Factorization with the Kullback-Leibler divergence

Oct 10, 2024

Jean Pacifique Nkurunziza, Fulgence Nahayo, Nicolas Gillis

Abstract: Orthogonal nonnegative matrix factorization (ONMF) has become a standard approach for clustering. As far as we know, most works on ONMF rely on the Frobenius norm to assess the quality of the approximation. This paper presents a new model and algorithm for ONMF that minimizes the Kullback-Leibler (KL) divergence. As opposed to the Frobenius norm, which assumes Gaussian noise, minimizing the KL divergence corresponds to maximum likelihood estimation for Poisson-distributed data, which can better model vectors of word counts in document data sets and photon counting processes in imaging. We have developed an algorithm based on alternating optimization, KL-ONMF, and show that it compares favorably with the Frobenius-norm-based ONMF for document classification and hyperspectral image unmixing.

* 10 pages

Dual Simplex Volume Maximization for Simplex-Structured Matrix Factorization

Mar 29, 2024

Maryam Abdolali, Giovanni Barbarino, Nicolas Gillis

Abstract: Simplex-structured matrix factorization (SSMF) is a generalization of nonnegative matrix factorization, a fundamental interpretable data analysis model, and has applications in hyperspectral unmixing and topic modeling. To obtain identifiable solutions, a standard approach is to find minimum-volume solutions. By taking advantage of the duality/polarity concept for polytopes, we convert minimum-volume SSMF in the primal space to a maximum-volume problem in the dual space. We first prove the identifiability of this maximum-volume dual problem. Then, we use this dual formulation to provide a novel optimization approach which bridges the gap between two existing families of algorithms for SSMF, namely volume minimization and facet identification. Numerical experiments show that the proposed approach performs favorably compared to the state-of-the-art SSMF algorithms.

* 31 pages, 10 figures

Checking the Sufficiently Scattered Condition using a Global Non-Convex Optimization Software

Feb 08, 2024

Nicolas Gillis, Robert Luce

Abstract: The sufficiently scattered condition (SSC) is a key condition in the study of identifiability of various matrix factorization problems, including nonnegative, minimum-volume, symmetric, simplex-structured, and polytopic matrix factorizations. The SSC allows one to guarantee that the computed matrix factorization is unique/identifiable, up to trivial ambiguities. However, this condition is NP-hard to check in general. In this paper, we show that it can nevertheless be checked in a reasonable amount of time in realistic scenarios, when the factorization rank is not too large. This is achieved by formulating the problem as a non-convex quadratic optimization problem over a bounded set. We use the global non-convex optimization software Gurobi, and showcase the usefulness of this code on synthetic data sets and on real-world hyperspectral images.

* 14 pages, code available from https://gitlab.com/ngillis/check-ssc

Block Majorization Minimization with Extrapolation and Application to $β$-NMF

Jan 12, 2024

Le Thi Khanh Hien, Valentin Leplat, Nicolas Gillis

Abstract: We propose a Block Majorization Minimization method with Extrapolation (BMMe) for solving a class of multi-convex optimization problems. The extrapolation parameters of BMMe are updated using a novel adaptive update rule. By showing that block majorization minimization can be reformulated as a block mirror descent method, with the Bregman divergence adaptively updated at each iteration, we establish subsequential convergence for BMMe. We use this method to design efficient algorithms to tackle nonnegative matrix factorization problems with the $\beta$-divergences ($\beta$-NMF) for $\beta\in [1,2]$. These algorithms, which are multiplicative updates with extrapolation, benefit from our novel results that offer convergence guarantees. We also empirically illustrate the significant acceleration of BMMe for $\beta$-NMF through extensive experiments.

* 23 pages, code available from https://github.com/vleplat/BMMe
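
As background for the extrapolated schemes, here is a minimal NumPy sketch of the classical multiplicative updates (MU) for $\beta$-NMF with $\beta\in[1,2]$, without any extrapolation. It is a baseline illustration only, not BMMe (the authors' code is in the repository above); the function names, the strictly positive synthetic data, and the choice $\beta=1.5$ are assumptions.

```python
import numpy as np

def beta_divergence(X, Y, beta):
    """Sum of entrywise beta-divergences d_beta(X | Y) for beta in [1, 2]."""
    if beta == 1:  # Kullback-Leibler divergence
        return float(np.sum(X * np.log(X / Y) - X + Y))
    num = X ** beta + (beta - 1) * Y ** beta - beta * X * Y ** (beta - 1)
    return float(np.sum(num) / (beta * (beta - 1)))

def mu_step(X, W, H, beta):
    """One multiplicative update of H, then of W (no extrapolation)."""
    WH = W @ H
    H = H * (W.T @ (WH ** (beta - 2) * X)) / (W.T @ (WH ** (beta - 1)))
    WH = W @ H
    W = W * ((WH ** (beta - 2) * X) @ H.T) / ((WH ** (beta - 1)) @ H.T)
    return W, H

# Strictly positive data and initialization keep all ratios well defined.
rng = np.random.default_rng(0)
X = rng.random((20, 30)) + 0.1
W = rng.random((20, 4)) + 0.1
H = rng.random((4, 30)) + 0.1

beta = 1.5
losses = [beta_divergence(X, W @ H, beta)]
for _ in range(30):
    W, H = mu_step(X, W, H, beta)
    losses.append(beta_divergence(X, W @ H, beta))
```

For $\beta\in[1,2]$ these updates are known to never increase the objective, so `losses` is non-increasing; extrapolation, as studied in the paper, aims to make the decrease faster.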

Subtractive Mixture Models via Squaring: Representation and Learning

Oct 01, 2023

Lorenzo Loconte, Aleksanteri M. Sladek, Stefan Mengel, Martin Trapp, Arno Solin, Nicolas Gillis, Antonio Vergari

Abstract: Mixture models are traditionally represented and learned by adding several distributions as components. Allowing mixtures to subtract probability mass or density can drastically reduce the number of components needed to model complex distributions. However, learning such subtractive mixtures while ensuring they still encode a non-negative function is challenging. We investigate how to learn and perform inference on deep subtractive mixtures by squaring them. We do this in the framework of probabilistic circuits, which enable us to represent tensorized mixtures and generalize several other subtractive models. We theoretically prove that the class of squared circuits allowing subtractions can be exponentially more expressive than traditional additive mixtures, and we empirically show this increased expressiveness on a series of real-world distribution estimation tasks.
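
The squaring idea can be illustrated on a one-dimensional toy case. The sketch below is a shallow squared Gaussian mixture with one negative weight, not the deep probabilistic-circuit construction of the paper; the weights, means, standard deviations, and function names are assumptions. Squaring guarantees non-negativity, and the normalizing constant has a closed form because the integral of a product of two Gaussian densities is itself a Gaussian density evaluated at the difference of the means.

```python
import numpy as np

def gauss(x, mean, std):
    """Density of a univariate Gaussian."""
    return np.exp(-0.5 * ((x - mean) / std) ** 2) / (std * np.sqrt(2.0 * np.pi))

def mixture(x, weights, means, stds):
    """Signed mixture sum_i w_i N(x; m_i, s_i^2); may be negative."""
    x = np.asarray(x, dtype=float)
    return sum(w * gauss(x, mu, s) for w, mu, s in zip(weights, means, stds))

def normalizer(weights, means, stds):
    """Integral of mixture(x)^2, using int N_i N_j dx = N(m_i; m_j, s_i^2 + s_j^2)."""
    var_sum = stds[:, None] ** 2 + stds[None, :] ** 2
    overlap = gauss(means[:, None], means[None, :], np.sqrt(var_sum))
    return float(weights @ overlap @ weights)

def squared_mixture_pdf(x, weights, means, stds):
    """Normalized squared mixture: non-negative by construction."""
    return mixture(x, weights, means, stds) ** 2 / normalizer(weights, means, stds)

weights = np.array([1.0, -0.6])   # the second component subtracts mass
means = np.array([0.0, 0.5])
stds = np.array([1.0, 0.4])

# Numerical check on a fine grid (simple Riemann sum).
x = np.linspace(-12.0, 12.0, 240001)
dx = x[1] - x[0]
pdf = squared_mixture_pdf(x, weights, means, stds)
total_mass = float(np.sum(pdf) * dx)
raw_at_half = float(mixture(0.5, weights, means, stds))
```

The unsquared signed mixture is negative near `x = 0.5`, so it is not a valid density on its own, whereas the squared and renormalized version is non-negative everywhere and integrates to one up to discretization error.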

Deep Nonnegative Matrix Factorization with Beta Divergences

Sep 15, 2023

Valentin Leplat, Le Thi Khanh Hien, Akwum Onwunta, Nicolas Gillis

Abstract: Deep Nonnegative Matrix Factorization (deep NMF) has recently emerged as a valuable technique for extracting multiple layers of features across different scales. However, all existing deep NMF models and algorithms have primarily centered their evaluation on the least squares error, which may not be the most appropriate metric for assessing the quality of approximations on diverse datasets. For instance, when dealing with data types such as audio signals and documents, it is widely acknowledged that $\beta$-divergences offer a more suitable alternative. In this paper, we develop new models and algorithms for deep NMF using $\beta$-divergences. Subsequently, we apply these techniques to the extraction of facial features, the identification of topics within document collections, and the identification of materials within hyperspectral images.

* 30 pages, 11 figures, 4 tables
