Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Michael Muma

Cross-Channel Unlabeled Sensing over a Union of Signal Subspaces

Jun 11, 2025

Taulant Koka, Manolis C. Tsakiris, Benjamín Béjar Haro, Michael Muma

Abstract:Cross-channel unlabeled sensing addresses the problem of recovering a multi-channel signal from measurements that were shuffled across channels. This work expands the cross-channel unlabeled sensing framework to signals that lie in a union of subspaces. The extension allows for handling more complex signal structures and broadens the framework to tasks like compressed sensing. These mismatches between samples and channels often arise in applications such as whole-brain calcium imaging of freely moving organisms or multi-target tracking. We improve over previous models by deriving tighter bounds on the required number of samples for unique reconstruction, while supporting more general signal types. The approach is validated through an application in whole-brain calcium imaging, where organism movements disrupt sample-to-neuron mappings. This demonstrates the utility of our framework in real-world settings with imprecise sample-channel associations, achieving accurate signal reconstruction.

* ICASSP 2025 - 2025 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
* Accepted to ICASSP 2025. \copyright 2025 IEEE. Personal use of this material is permitted

Via

Access Paper or Ask Questions

The Informed Elastic Net for Fast Grouped Variable Selection and FDR Control in Genomics Research

Oct 07, 2024

Jasin Machkour, Michael Muma, Daniel P. Palomar

Figure 1 for The Informed Elastic Net for Fast Grouped Variable Selection and FDR Control in Genomics Research

Figure 2 for The Informed Elastic Net for Fast Grouped Variable Selection and FDR Control in Genomics Research

Figure 3 for The Informed Elastic Net for Fast Grouped Variable Selection and FDR Control in Genomics Research

Figure 4 for The Informed Elastic Net for Fast Grouped Variable Selection and FDR Control in Genomics Research

Abstract:Modern genomics research relies on genome-wide association studies (GWAS) to identify the few genetic variants among potentially millions that are associated with diseases of interest. Only reproducible discoveries of groups of associations improve our understanding of complex polygenic diseases and enable the development of new drugs and personalized medicine. Thus, fast multivariate variable selection methods that have a high true positive rate (TPR) while controlling the false discovery rate (FDR) are crucial. Recently, the T-Rex+GVS selector, a version of the T-Rex selector that uses the elastic net (EN) as a base selector to perform grouped variable election, was proposed. Although it significantly increased the TPR in simulated GWAS compared to the original T-Rex, its comparably high computational cost limits scalability. Therefore, we propose the informed elastic net (IEN), a new base selector that significantly reduces computation time while retaining the grouped variable selection property. We quantify its grouping effect and derive its formulation as a Lasso-type optimization problem, which is solved efficiently within the T-Rex framework by the terminated LARS algorithm. Numerical simulations and a GWAS study demonstrate that the proposed T-Rex+GVS (IEN) exhibits the desired grouping effect, reduces computation time, and achieves the same TPR as T-Rex+GVS (EN) but with lower FDR, which makes it a promising method for large-scale GWAS.

* Published in IEEE International Workshop on Computational Advances in Multi-Sensor Adaptive Processing (CAMSAP), 10-13 December 2023, Los Sue\~nos, Costa Rica

Via

Access Paper or Ask Questions

Solving FDR-Controlled Sparse Regression Problems with Five Million Variables on a Laptop

Sep 27, 2024

Fabian Scheidt, Jasin Machkour, Michael Muma

Figure 1 for Solving FDR-Controlled Sparse Regression Problems with Five Million Variables on a Laptop

Figure 2 for Solving FDR-Controlled Sparse Regression Problems with Five Million Variables on a Laptop

Figure 3 for Solving FDR-Controlled Sparse Regression Problems with Five Million Variables on a Laptop

Figure 4 for Solving FDR-Controlled Sparse Regression Problems with Five Million Variables on a Laptop

Abstract:Currently, there is an urgent demand for scalable multivariate and high-dimensional false discovery rate (FDR)-controlling variable selection methods to ensure the repro-ducibility of discoveries. However, among existing methods, only the recently proposed Terminating-Random Experiments (T-Rex) selector scales to problems with millions of variables, as encountered in, e.g., genomics research. The T-Rex selector is a new learning framework based on early terminated random experiments with computer-generated dummy variables. In this work, we propose the Big T-Rex, a new implementation of T-Rex that drastically reduces its Random Access Memory (RAM) consumption to enable solving FDR-controlled sparse regression problems with millions of variables on a laptop. We incorporate advanced memory-mapping techniques to work with matrices that reside on solid-state drive and two new dummy generation strategies based on permutations of a reference matrix. Our nu-merical experiments demonstrate a drastic reduction in memory demand and computation time. We showcase that the Big T-Rex can efficiently solve FDR-controlled Lasso-type problems with five million variables on a laptop in thirty minutes. Our work empowers researchers without access to high-performance clusters to make reproducible discoveries in large-scale high-dimensional data.

* 2023 IEEE 9th International Workshop on Computational Advances in Multi-Sensor Adaptive Processing (CAMSAP)
* Conference article (IEEE CAMSAP 2023), 5 pages, 7 figures

Via

Access Paper or Ask Questions

High-Dimensional False Discovery Rate Control for Dependent Variables

Jan 30, 2024

Jasin Machkour, Michael Muma, Daniel P. Palomar

Abstract:Algorithms that ensure reproducible findings from large-scale, high-dimensional data are pivotal in numerous signal processing applications. In recent years, multivariate false discovery rate (FDR) controlling methods have emerged, providing guarantees even in high-dimensional settings where the number of variables surpasses the number of samples. However, these methods often fail to reliably control the FDR in the presence of highly dependent variable groups, a common characteristic in fields such as genomics and finance. To tackle this critical issue, we introduce a novel framework that accounts for general dependency structures. Our proposed dependency-aware T-Rex selector integrates hierarchical graphical models within the T-Rex framework to effectively harness the dependency structure among variables. Leveraging martingale theory, we prove that our variable penalization mechanism ensures FDR control. We further generalize the FDR-controlling framework by stating and proving a clear condition necessary for designing both graphical and non-graphical models that capture dependencies. Additionally, we formulate a fully integrated optimal calibration algorithm that concurrently determines the parameters of the graphical model and the T-Rex framework, such that the FDR is controlled while maximizing the number of selected variables. Numerical experiments and a breast cancer survival analysis use-case demonstrate that the proposed method is the only one among the state-of-the-art benchmark methods that controls the FDR and reliably detects genes that have been previously identified to be related to breast cancer. An open-source implementation is available within the R package TRexSelector on CRAN.

Via

Access Paper or Ask Questions

FDR-Controlled Portfolio Optimization for Sparse Financial Index Tracking

Jan 30, 2024

Jasin Machkour, Daniel P. Palomar, Michael Muma

Abstract:In high-dimensional data analysis, such as financial index tracking or biomedical applications, it is crucial to select the few relevant variables while maintaining control over the false discovery rate (FDR). In these applications, strong dependencies often exist among the variables (e.g., stock returns), which can undermine the FDR control property of existing methods like the model-X knockoff method or the T-Rex selector. To address this issue, we have expanded the T-Rex framework to accommodate overlapping groups of highly correlated variables. This is achieved by integrating a nearest neighbors penalization mechanism into the framework, which provably controls the FDR at the user-defined target level. A real-world example of sparse index tracking demonstrates the proposed method's ability to accurately track the S&P 500 index over the past 20 years based on a small number of stocks. An open-source implementation is provided within the R package TRexSelector on CRAN.

Via

Access Paper or Ask Questions

False Discovery Rate Control for Gaussian Graphical Models via Neighborhood Screening

Jan 18, 2024

Taulant Koka, Jasin Machkour, Michael Muma

Abstract:Gaussian graphical models emerge in a wide range of fields. They model the statistical relationships between variables as a graph, where an edge between two variables indicates conditional dependence. Unfortunately, well-established estimators, such as the graphical lasso or neighborhood selection, are known to be susceptible to a high prevalence of false edge detections. False detections may encourage inaccurate or even incorrect scientific interpretations, with major implications in applications, such as biomedicine or healthcare. In this paper, we introduce a nodewise variable selection approach to graph learning and provably control the false discovery rate of the selected edge set at a self-estimated level. A novel fusion method of the individual neighborhoods outputs an undirected graph estimate. The proposed method is parameter-free and does not require tuning by the user. Benchmarks against competing false discovery rate controlling methods in numerical experiments considering different graph topologies show a significant gain in performance.

Via

Access Paper or Ask Questions

Sparse PCA with False Discovery Rate Controlled Variable Selection

Jan 16, 2024

Jasin Machkour, Arnaud Breloy, Michael Muma, Daniel P. Palomar, Frédéric Pascal

Figure 1 for Sparse PCA with False Discovery Rate Controlled Variable Selection

Figure 2 for Sparse PCA with False Discovery Rate Controlled Variable Selection

Figure 3 for Sparse PCA with False Discovery Rate Controlled Variable Selection

Figure 4 for Sparse PCA with False Discovery Rate Controlled Variable Selection

Abstract:Sparse principal component analysis (PCA) aims at mapping large dimensional data to a linear subspace of lower dimension. By imposing loading vectors to be sparse, it performs the double duty of dimension reduction and variable selection. Sparse PCA algorithms are usually expressed as a trade-off between explained variance and sparsity of the loading vectors (i.e., number of selected variables). As a high explained variance is not necessarily synonymous with relevant information, these methods are prone to select irrelevant variables. To overcome this issue, we propose an alternative formulation of sparse PCA driven by the false discovery rate (FDR). We then leverage the Terminating-Random Experiments (T-Rex) selector to automatically determine an FDR-controlled support of the loading vectors. A major advantage of the resulting T-Rex PCA is that no sparsity parameter tuning is required. Numerical experiments and a stock market data example demonstrate a significant performance improvement.

* Published in ICASSP 2024 - 2024 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), scheduled for 14-19 April 2024 in Seoul, Korea

Via

Access Paper or Ask Questions

Fast and Robust Sparsity-Aware Block Diagonal Representation

Dec 02, 2023

Aylin Tastan, Michael Muma, Abdelhak M. Zoubir

Figure 1 for Fast and Robust Sparsity-Aware Block Diagonal Representation

Figure 2 for Fast and Robust Sparsity-Aware Block Diagonal Representation

Figure 3 for Fast and Robust Sparsity-Aware Block Diagonal Representation

Figure 4 for Fast and Robust Sparsity-Aware Block Diagonal Representation

Abstract:The block diagonal structure of an affinity matrix is a commonly desired property in cluster analysis because it represents clusters of feature vectors by non-zero coefficients that are concentrated in blocks. However, recovering a block diagonal affinity matrix is challenging in real-world applications, in which the data may be subject to outliers and heavy-tailed noise that obscure the hidden cluster structure. To address this issue, we first analyze the effect of different fundamental outlier types in graph-based cluster analysis. A key idea that simplifies the analysis is to introduce a vector that represents a block diagonal matrix as a piece-wise linear function of the similarity coefficients that form the affinity matrix. We reformulate the problem as a robust piece-wise linear fitting problem and propose a Fast and Robust Sparsity-Aware Block Diagonal Representation (FRS-BDR) method, which jointly estimates cluster memberships and the number of blocks. Comprehensive experiments on a variety of real-world applications demonstrate the effectiveness of FRS-BDR in terms of clustering accuracy, robustness against corrupted features, computation time and cluster enumeration performance.

* 16 pages article, 4 pages supplementary, 51 pages accompanying material

Via

Access Paper or Ask Questions

Identifying the Complete Correlation Structure in Large-Scale High-Dimensional Data Sets with Local False Discovery Rates

Jun 01, 2023

Martin Gölz, Tanuj Hasija, Michael Muma, Abdelhak M. Zoubir

Figure 1 for Identifying the Complete Correlation Structure in Large-Scale High-Dimensional Data Sets with Local False Discovery Rates

Figure 2 for Identifying the Complete Correlation Structure in Large-Scale High-Dimensional Data Sets with Local False Discovery Rates

Figure 3 for Identifying the Complete Correlation Structure in Large-Scale High-Dimensional Data Sets with Local False Discovery Rates

Figure 4 for Identifying the Complete Correlation Structure in Large-Scale High-Dimensional Data Sets with Local False Discovery Rates

Abstract:The identification of the dependent components in multiple data sets is a fundamental problem in many practical applications. The challenge in these applications is that often the data sets are high-dimensional with few observations or available samples and contain latent components with unknown probability distributions. A novel mathematical formulation of this problem is proposed, which enables the inference of the underlying correlation structure with strict false positive control. In particular, the false discovery rate is controlled at a pre-defined threshold on two levels simultaneously. The deployed test statistics originate in the sample coherence matrix. The required probability models are learned from the data using the bootstrap. Local false discovery rates are used to solve the multiple hypothesis testing problem. Compared to the existing techniques in the literature, the developed technique does not assume an a priori correlation structure and work well when the number of data sets is large while the number of observations is small. In addition, it can handle the presence of distributional uncertainties, heavy-tailed noise, and outliers.

Via

Access Paper or Ask Questions

Emergency Response Person Localization and Vital Sign Estimation Using a Semi-Autonomous Robot Mounted SFCW Radar

May 25, 2023

Christian A. Schroth, Christian Eckrich, Ibrahim Kakouche, Stefan Fabian, Oskar von Stryk, Abdelhak M. Zoubir, Michael Muma

Figure 1 for Emergency Response Person Localization and Vital Sign Estimation Using a Semi-Autonomous Robot Mounted SFCW Radar

Figure 2 for Emergency Response Person Localization and Vital Sign Estimation Using a Semi-Autonomous Robot Mounted SFCW Radar

Figure 3 for Emergency Response Person Localization and Vital Sign Estimation Using a Semi-Autonomous Robot Mounted SFCW Radar

Figure 4 for Emergency Response Person Localization and Vital Sign Estimation Using a Semi-Autonomous Robot Mounted SFCW Radar

Abstract:The large number and scale of natural and man-made disasters have led to an urgent demand for technologies that enhance the safety and efficiency of search and rescue teams. Semi-autonomous rescue robots are beneficial, especially when searching inaccessible terrains, or dangerous environments, such as collapsed infrastructures. For search and rescue missions in degraded visual conditions or non-line of sight scenarios, radar-based approaches may contribute to acquire valuable, and otherwise unavailable information. This article presents a complete signal processing chain for radar-based multi-person detection, 2D-MUSIC localization and breathing frequency estimation. The proposed method shows promising results on a challenging emergency response dataset that we collected using a semi-autonomous robot equipped with a commercially available through-wall radar system. The dataset is composed of 62 scenarios of various difficulty levels with up to five persons captured in different postures, angles and ranges including wooden and stone obstacles that block the radar line of sight. Ground truth data for reference locations, respiration, electrocardiogram, and acceleration signals are included. The full emergency response benchmark data set as well as all codes to reproduce our results, are publicly available at https://doi.org/10.21227/4bzd-jm32.

* Dataset availabe at https://doi.org/10.21227/4bzd-jm32, code available at https://github.com/schrchr/radar-vitals-estimation

Via

Access Paper or Ask Questions