Title: Low-Rank Compression for IMC Arrays

Feb 10, 2025

Kang Eun Jeon, Johnny Rhe, Jong Hwan Ko

Abstract: In this study, we address the challenge of low-rank model compression in the context of in-memory computing (IMC) architectures. Traditional pruning approaches, while effective at reducing model size, require additional peripheral circuitry to manage complex dataflows and mitigate dislocation issues, which increases area and energy overheads. To circumvent these drawbacks, we propose leveraging low-rank compression techniques, which, unlike pruning, streamline the dataflow and integrate seamlessly with IMC architectures. However, low-rank compression presents its own set of challenges, namely i) suboptimal IMC array utilization and ii) compromised accuracy. To address these issues, we introduce a novel approach that i) employs the shift and duplicate kernel (SDK) mapping technique, which exploits idle IMC columns for parallel processing, and ii) uses group low-rank convolution, which mitigates the information imbalance in the decomposed matrices. Our experimental results demonstrate that the proposed method achieves up to a 2.5x speedup or a +20.9% accuracy boost over existing pruning techniques.

* Accepted to appear at DATE'25 (Lyon, France)
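
To make the low-rank idea in the abstract concrete, the following is a minimal NumPy sketch, not the authors' implementation. It factors an unrolled convolution weight matrix with a truncated SVD and then, as one plausible reading of "group low-rank", factors column blocks of the same matrix independently. The function names, the layer dimensions, the choice of SVD, and the column-wise grouping scheme are all assumptions made for illustration. The SDK mapping step is not modeled.

```python
import numpy as np


def low_rank_factors(weight, rank):
    """Truncated SVD: weight (m x n) ~= left (m x rank) @ right (rank x n)."""
    u, s, vt = np.linalg.svd(weight, full_matrices=False)
    left = u[:, :rank] * s[:rank]  # fold singular values into the left factor
    right = vt[:rank, :]
    return left, right


def grouped_low_rank(weight, rank, groups):
    """Hypothetical grouping (an assumption, not taken from the paper):
    split the columns into `groups` blocks and factor each block
    independently with rank // groups."""
    blocks = np.array_split(weight, groups, axis=1)
    per_group_rank = rank // groups
    return [low_rank_factors(block, per_group_rank) for block in blocks]


def reconstruct_grouped(factors):
    """Rebuild the full matrix by concatenating the per-block products."""
    return np.concatenate([left @ right for left, right in factors], axis=1)


# Illustrative layer: 64 output channels, 32 input channels, 3x3 kernel,
# unrolled so that each row holds one output channel's weights.
rng = np.random.default_rng(0)
out_ch, in_ch, k = 64, 32, 3
weight = rng.standard_normal((out_ch, in_ch * k * k))

left, right = low_rank_factors(weight, rank=16)
dense_params = weight.size
low_rank_params = left.size + right.size
rel_err = np.linalg.norm(weight - left @ right) / np.linalg.norm(weight)

factors = grouped_low_rank(weight, rank=16, groups=4)
grouped_params = sum(l.size + r.size for l, r in factors)
grouped_err = (
    np.linalg.norm(weight - reconstruct_grouped(factors)) / np.linalg.norm(weight)
)

print(f"dense: {dense_params} weights")
print(f"low-rank: {low_rank_params} weights, relative error {rel_err:.3f}")
print(f"grouped: {grouped_params} weights, relative error {grouped_err:.3f}")
```

A random matrix has no low-rank structure, so the printed errors only show the mechanics of the factorization. They say nothing about the accuracy of a trained network compressed this way.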
