Abstract:Recent advances in large language models (LLMs) have provided new opportunities for decision-making, particularly in the task of automated feature selection. In this paper, we first comprehensively evaluate LLM-based feature selection methods, covering the state-of-the-art DeepSeek-R1, GPT-o3-mini, and GPT-4.5. Then, we propose a novel hybrid strategy called LLM4FS that integrates LLMs with traditional data-driven methods. Specifically, we input data samples into LLMs and have them directly call traditional data-driven techniques such as random forest and forward sequential selection. Notably, our analysis reveals that the hybrid strategy leverages the contextual understanding of LLMs and the high statistical reliability of traditional data-driven methods to achieve excellent feature selection performance, even surpassing LLMs and traditional data-driven methods used on their own. Finally, we point out the limitations of its application in decision-making.
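To make the "forward sequential selection" component concrete, the following is a minimal sketch of a generic greedy forward selection loop. It illustrates only the traditional data-driven step, not the LLM4FS pipeline itself; the function name, the `score` callback, and the toy relevance values are all assumptions introduced for illustration.

```python
from typing import Callable, List


def forward_sequential_selection(
    n_features: int,
    k: int,
    score: Callable[[List[int]], float],
) -> List[int]:
    """Greedily add the feature that maximizes `score` until k are chosen."""
    selected: List[int] = []
    remaining = list(range(n_features))
    while remaining and len(selected) < k:
        # Evaluate every one-feature extension of the current subset.
        best = max(remaining, key=lambda j: score(selected + [j]))
        selected.append(best)
        remaining.remove(best)
    return selected


# Hypothetical toy score: per-feature relevance minus a redundancy penalty
# when features 0 and 3 (assumed near-duplicates) are selected together.
RELEVANCE = [0.9, 0.1, 0.8, 0.85]


def toy_score(subset: List[int]) -> float:
    penalty = 0.5 if 0 in subset and 3 in subset else 0.0
    return sum(RELEVANCE[j] for j in subset) - penalty


chosen = forward_sequential_selection(4, 2, toy_score)
print(chosen)
```

In practice the `score` callback would be something like the cross-validated accuracy of a model trained on the candidate subset; the toy score here only shows how redundancy can steer the greedy choice away from a feature that looks strong in isolation.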
Abstract:Although federated learning has gained prominence as a privacy-preserving framework tailored for distributed Internet of Things (IoT) environments, current federated principal component analysis (PCA) methods lack integration of sparsity, a critical feature for robust anomaly detection. To address this limitation, we propose a novel federated structured sparse PCA (FedSSP) approach for anomaly detection in IoT networks. The proposed model uniquely integrates double sparsity regularization: (1) row-wise sparsity governed by $\ell_{2,p}$-norm with $p\in[0,1)$ to eliminate redundant feature dimensions, and (2) element-wise sparsity via $\ell_{q}$-norm with $q\in[0,1)$ to suppress noise-sensitive components. To efficiently solve this non-convex optimization problem in a distributed setting, we devise a proximal alternating minimization (PAM) algorithm with rigorous theoretical proofs establishing its convergence guarantees. Experiments on real datasets validate that incorporating structured sparsity enhances both model interpretability and detection accuracy.
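As an illustration of the two sparsity terms named above, here is a small sketch that evaluates a row-wise $\ell_{2,p}$ penalty and an element-wise $\ell_q$ penalty on a matrix. It assumes the common definitions $\sum_i \|x^i\|_2^p$ and $\sum_{i,j} |x_{ij}|^q$, with the $p=0$ and $q=0$ cases counting nonzero rows and nonzero entries; the paper's exact formulation may differ, and the function names are hypothetical.

```python
import math
from typing import List

Matrix = List[List[float]]


def l2p_penalty(X: Matrix, p: float) -> float:
    """Row-wise penalty: sum of (row l2-norm)^p; counts nonzero rows if p == 0."""
    row_norms = [math.sqrt(sum(v * v for v in row)) for row in X]
    if p == 0:
        return float(sum(1 for r in row_norms if r > 0))
    return sum(r ** p for r in row_norms)


def lq_penalty(X: Matrix, q: float) -> float:
    """Element-wise penalty: sum of |x_ij|^q; counts nonzero entries if q == 0."""
    if q == 0:
        return float(sum(1 for row in X for v in row if v != 0))
    return sum(abs(v) ** q for row in X for v in row)


X = [[3.0, 4.0], [0.0, 0.0], [0.0, 1.0]]
print(l2p_penalty(X, 0), l2p_penalty(X, 0.5))
print(lq_penalty(X, 0), lq_penalty(X, 0.5))
```

The row-wise term is zero for an all-zero row regardless of how many columns it has, which is why it is associated with discarding whole feature dimensions, while the element-wise term penalizes individual entries.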
Abstract:Hyperspectral unmixing (HU) is a critical yet challenging task in remote sensing. However, existing nonnegative matrix factorization (NMF) methods with graph learning mostly focus on first-order or second-order nearest neighbor relationships and usually require manual parameter tuning, and thus fail to characterize intrinsic data structures. To address these issues, we propose a novel adaptive multi-order graph regularized NMF method (MOGNMF) with three key features. First, multi-order graph regularization is introduced into the NMF framework to exploit global and local information comprehensively. Second, the parameters associated with the multi-order graph are learned adaptively through a data-driven approach. Third, dual sparsity is embedded to obtain better robustness, i.e., $\ell_{1/2}$-norm on the abundance matrix and $\ell_{2,1}$-norm on the noise matrix. To solve the proposed model, we develop an alternating minimization algorithm whose subproblems have explicit solutions, thus ensuring effectiveness. Experiments on simulated and real hyperspectral data indicate that the proposed method delivers better unmixing results.
Abstract:Sparse principal component analysis (PCA) is a well-established dimensionality reduction technique that is often used for unsupervised feature selection (UFS). However, determining the regularization parameters is rather challenging, and conventional approaches, including grid search and Bayesian optimization, not only incur high computational costs but also exhibit high sensitivity. To address these limitations, we first establish a structured sparse PCA formulation by integrating $\ell_1$-norm and $\ell_{2,1}$-norm to capture the local and global structures, respectively. Building upon the off-the-shelf alternating direction method of multipliers (ADMM) optimization framework, we then design an interpretable deep unfolding network that translates iterative optimization steps into trainable neural architectures. This innovation enables automatic learning of the regularization parameters, effectively bypassing the empirical tuning requirements of conventional methods. Numerical experiments on benchmark datasets validate the advantages of our proposed method over existing state-of-the-art methods. Our code will be accessible at https://github.com/xianchaoxiu/SPCA-Net.
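ADMM-style solvers for $\ell_1$ and $\ell_{2,1}$ regularization typically rely on two closed-form proximal steps, sketched below: element-wise soft thresholding and row-wise shrinkage. This is a generic illustration, not the SPCA-Net implementation; the function names are hypothetical, and the threshold `tau` stands in for the regularization parameter that a deep unfolding network would treat as a trainable quantity.

```python
import math
from typing import List

Matrix = List[List[float]]


def soft_threshold(X: Matrix, tau: float) -> Matrix:
    """Proximal operator of tau * ||X||_1: shrink each entry toward zero by tau."""
    return [[math.copysign(max(abs(v) - tau, 0.0), v) for v in row] for row in X]


def row_shrink(X: Matrix, tau: float) -> Matrix:
    """Proximal operator of tau * ||X||_{2,1}: shrink each row's l2-norm by tau."""
    out = []
    for row in X:
        norm = math.sqrt(sum(v * v for v in row))
        # Rows whose norm is at most tau are set entirely to zero.
        scale = max(1.0 - tau / norm, 0.0) if norm > 0 else 0.0
        out.append([scale * v for v in row])
    return out


A = soft_threshold([[3.0, -1.0], [0.5, -2.0]], 1.0)
B = row_shrink([[3.0, 4.0], [0.3, 0.4]], 1.0)
print(A)
print(B)
```

Because both steps are simple differentiable-almost-everywhere maps parameterized by `tau`, they are natural candidates for layers whose thresholds are learned from data rather than tuned by grid search.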
Abstract:Unsupervised feature selection (UFS) is widely applied in machine learning and pattern recognition. However, most existing methods consider only a single type of sparsity, which makes it difficult to select valuable and discriminative feature subsets from the original high-dimensional feature set. In this paper, we propose a new UFS method called DSCOFS by embedding double sparsity constrained optimization into the classical principal component analysis (PCA) framework. Double sparsity refers to constraining the variables simultaneously with the $\ell_{2,0}$-norm and the $\ell_0$-norm, so that combining two different types of sparsity improves the accuracy of identifying discriminative features. The core idea is that the $\ell_{2,0}$-norm can remove irrelevant and redundant features, while the $\ell_0$-norm can filter out irregular noisy features, thereby complementing the $\ell_{2,0}$-norm to improve discrimination. An effective proximal alternating minimization method is proposed to solve the resulting nonconvex nonsmooth model. Theoretically, we rigorously prove that the sequence generated by our method globally converges to a stationary point. Numerical experiments on three synthetic datasets and eight real-world datasets demonstrate the effectiveness, stability, and convergence of the proposed method. In particular, the average clustering accuracy (ACC) and normalized mutual information (NMI) are improved by at least 3.34% and 3.02%, respectively, compared with the state-of-the-art methods. More importantly, two common statistical tests and a new feature similarity metric verify the advantages of double sparsity. All results suggest that our proposed DSCOFS provides a new perspective for feature selection.
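The difference between the two constraints can be seen from their projections, sketched below under the standard definitions: an $\ell_{2,0}$ constraint keeps the `s` rows with the largest $\ell_2$ norms, whereas an $\ell_0$ constraint keeps the `k` entries with the largest magnitudes. This is a generic illustration rather than the DSCOFS algorithm; the function names and the budgets `s` and `k` are assumptions.

```python
import math
from typing import List

Matrix = List[List[float]]


def keep_top_rows(X: Matrix, s: int) -> Matrix:
    """Project onto {||X||_{2,0} <= s}: keep the s rows with largest l2-norm."""
    norms = [math.sqrt(sum(v * v for v in row)) for row in X]
    order = sorted(range(len(X)), key=lambda i: -norms[i])
    keep = set(order[:s])
    return [list(row) if i in keep else [0.0] * len(row) for i, row in enumerate(X)]


def keep_top_entries(X: Matrix, k: int) -> Matrix:
    """Project onto {||X||_0 <= k}: keep the k entries with largest magnitude."""
    coords = [(abs(v), i, j) for i, row in enumerate(X) for j, v in enumerate(row)]
    coords.sort(key=lambda t: -t[0])  # stable sort, so ties keep reading order
    keep = {(i, j) for _, i, j in coords[:k]}
    return [
        [v if (i, j) in keep else 0.0 for j, v in enumerate(row)]
        for i, row in enumerate(X)
    ]


X = [[3.0, 4.0], [1.0, 0.0], [0.0, 2.0]]
rows_kept = keep_top_rows(X, 2)
entries_kept = keep_top_entries(X, 2)
print(rows_kept)
print(entries_kept)
```

The row-wise projection discards whole features (rows), while the element-wise projection removes individual small entries inside the rows that remain, which matches the complementary roles described in the abstract.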
Abstract:To efficiently deal with high-dimensional datasets in many areas, unsupervised feature selection (UFS) has become an increasingly popular technique for dimension reduction. Even though there are many UFS methods, most of them only consider the global structure of datasets by embedding a single sparse regularization or constraint. In this paper, we introduce a novel bi-sparse UFS method, called BSUFS, to simultaneously characterize both global and local structures. The core idea of BSUFS is to incorporate $\ell_{2,p}$-norm and $\ell_q$-norm into the classical principal component analysis (PCA), which enables our proposed method to select relevant features and filter out irrelevant noise accurately. Here, the parameters $p$ and $q$ are within the range of [0,1). Therefore, BSUFS not only constructs a unified framework for bi-sparse optimization, but also includes some existing works as special cases. To solve the resulting non-convex model, we propose an efficient proximal alternating minimization (PAM) algorithm using Riemannian manifold optimization and sparse optimization techniques. Theoretically, PAM is proven to have global convergence, i.e., for any initial point, the generated sequence converges to a critical point that satisfies the first-order optimality condition. Extensive numerical experiments on synthetic and real-world datasets demonstrate the effectiveness of our proposed BSUFS. Specifically, compared to the existing UFS competitors, the average accuracy (ACC) is improved by at least 4.71% and the average normalized mutual information (NMI) is improved by at least 3.14%. The results validate the advantages of bi-sparse optimization in feature selection and show its potential for other fields in image processing. Our code will be available at https://github.com/xianchaoxiu.