Ding Zhu

An Efficient Training Algorithm for Models with Block-wise Sparsity

Mar 27, 2025

Ding Zhu, Zhiqun Zuo, Mohammad Mahdi Khalili

Abstract: Large-scale machine learning (ML) models are increasingly used in critical domains such as education, lending, recruitment, healthcare, and criminal justice. However, training, deploying, and using these models demands substantial computational resources. To decrease computation and memory costs, machine learning models with sparse weight matrices are widely used in the literature. Among sparse models, those with special sparse structures (e.g., models with block-wise sparse weight matrices) fit hardware accelerators better and can decrease memory and computation costs during inference. Unfortunately, while there are several efficient training methods, none of them is designed to train a block-wise sparse model efficiently. As a result, current methods for training block-wise sparse models start from full, dense models, which makes training inefficient. In this work, we focus on training models with block-wise sparse matrices and propose an efficient training algorithm that decreases both computation and memory costs during training and inference. In addition, we show that our proposed method enables us to efficiently find the right block size for the sparsity pattern during training. Our extensive empirical and theoretical analyses show that our algorithms can decrease computation and memory costs significantly without a performance drop compared to baselines.

* 24 pages, submitted to Transactions on Machine Learning Research
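
To make the idea of a block-wise sparse weight matrix concrete, here is a minimal sketch of a matrix-vector product that stores and touches only the non-zero blocks. It is an illustration of the general data structure, not the training algorithm proposed in the paper; the names `block_sparse_matvec` and `stored_values`, the dictionary layout, and the example values are all assumptions made for this sketch.

```python
from typing import Dict, List, Tuple

Block = List[List[float]]  # one dense block_size x block_size block


def block_sparse_matvec(
    blocks: Dict[Tuple[int, int], Block],
    x: List[float],
    n_rows: int,
    block_size: int,
) -> List[float]:
    """Compute y = W @ x where W is stored as its non-zero blocks only.

    `blocks` maps (block_row, block_col) to a dense block. Blocks that are
    absent from the dictionary are treated as all zeros and never visited.
    """
    y = [0.0] * n_rows
    for (bi, bj), block in blocks.items():
        row0, col0 = bi * block_size, bj * block_size
        for r in range(block_size):
            acc = 0.0
            for c in range(block_size):
                acc += block[r][c] * x[col0 + c]
            y[row0 + r] += acc
    return y


def stored_values(blocks: Dict[Tuple[int, int], Block], block_size: int) -> int:
    """Number of weights kept in memory for the block-sparse layout."""
    return len(blocks) * block_size * block_size


# Hypothetical 4 x 4 matrix with 2 x 2 blocks; only 2 of the 4 blocks are non-zero.
blocks = {
    (0, 0): [[1.0, 2.0], [3.0, 4.0]],
    (1, 1): [[5.0, 6.0], [7.0, 8.0]],
}
x = [1.0, 1.0, 1.0, 1.0]
y = block_sparse_matvec(blocks, x, n_rows=4, block_size=2)
```

In this toy case the layout keeps 8 weights instead of the 16 a dense 4 x 4 matrix would need, and the multiply-accumulate count shrinks by the same factor, which is the kind of saving the abstract refers to.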

ECG Signal Denoising Using Multi-scale Patch Embedding and Transformers

Jul 12, 2024

Ding Zhu, Vishnu Kabir Chhabra, Mohammad Mahdi Khalili

Abstract: Cardiovascular disease is a major life-threatening condition that is commonly monitored using electrocardiogram (ECG) signals. However, these signals are often contaminated by various types of noise at different intensities, which significantly interferes with downstream tasks. Therefore, denoising ECG signals and increasing the signal-to-noise ratio is crucial for cardiovascular monitoring. In this paper, we propose a deep learning method that combines a one-dimensional convolutional layer with a transformer architecture for denoising ECG signals. The convolutional layer processes the ECG signal with various kernel/patch sizes and generates an embedding called multi-scale patch embedding. This embedding is then used as the input to a transformer network and enhances the transformer's capability to denoise the ECG signal.
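
As a rough sketch of what a multi-scale embedding from 1D convolutions could look like, the code below filters one signal with several kernel sizes and stacks the results into one feature vector per time step. The abstract does not specify stride, padding, or channel counts, so everything here is an assumption: stride 1, zero padding that preserves length, odd kernel lengths, one filter per scale, and fixed averaging weights standing in for learned ones. The function names are hypothetical.

```python
from typing import List


def conv1d_same(signal: List[float], kernel: List[float]) -> List[float]:
    """1D cross-correlation with zero padding so output length equals input length.

    Assumes an odd kernel length (an assumption of this sketch).
    """
    k = len(kernel)
    pad = k // 2
    padded = [0.0] * pad + list(signal) + [0.0] * pad
    return [
        sum(kernel[j] * padded[i + j] for j in range(k))
        for i in range(len(signal))
    ]


def multi_scale_embedding(
    signal: List[float], kernels: List[List[float]]
) -> List[List[float]]:
    """Return one token per time step, with one feature per kernel size."""
    per_scale = [conv1d_same(signal, kernel) for kernel in kernels]
    # Transpose from (scale, time) to (time, scale).
    return [list(features) for features in zip(*per_scale)]


# Toy signal and placeholder filters of length 1, 3, and 5.
signal = [0.0, 1.0, 0.0, -1.0, 0.0, 1.0]
kernels = [[1.0], [1 / 3] * 3, [0.2] * 5]
tokens = multi_scale_embedding(signal, kernels)
```

Each entry of `tokens` could then serve as one input position for a sequence model such as a transformer; in a trained network the filter weights would be learned rather than fixed.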
