Abstract: Computing-in-Memory (CIM) accelerators are a promising solution for accelerating Machine Learning (ML) workloads, as they perform Matrix-Vector Multiplications (MVMs) on crossbar arrays directly in memory. Although the bit widths of crossbar inputs and cells are very limited, most CIM compilers do not support quantization below 8 bits. As a result, a single MVM requires many compute cycles, and weights cannot be stored efficiently in a single crossbar cell. To address this problem, we propose a mixed-precision training and compilation framework for CIM architectures. The biggest challenge is the massive search space, which makes it difficult to find good quantization parameters. We therefore introduce a reinforcement learning-based strategy that finds quantization configurations balancing latency and accuracy. In the best case, our approach achieves up to a 2.48x speedup over existing state-of-the-art solutions, with an accuracy loss of only 0.086%.
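
The following is a minimal sketch, not the paper's implementation, of why quantization drives CIM latency: in a bit-serial crossbar MVM, inputs are applied one bit per cycle and each weight is split across cells of limited bit width, so the cycle count scales with both precisions. All names (`crossbar_mvm`, the bit-width parameters) are illustrative assumptions.

```python
# Illustrative bit-serial MVM on a crossbar (unsigned integers for simplicity).
import numpy as np

def crossbar_mvm(weights, inputs, w_bits=8, cell_bits=2, in_bits=8):
    """Toy model: cycles = in_bits * ceil(w_bits / cell_bits)."""
    n_cols = int(np.ceil(w_bits / cell_bits))          # cells per weight
    # Split each weight into cell-sized slices (LSB slice first).
    w_slices = [(weights >> (cell_bits * c)) & ((1 << cell_bits) - 1)
                for c in range(n_cols)]
    acc = np.zeros(weights.shape[0], dtype=np.int64)
    cycles = 0
    for b in range(in_bits):                           # one input bit per cycle
        x_bit = (inputs >> b) & 1
        for c, w_s in enumerate(w_slices):
            acc += (w_s @ x_bit) << (b + cell_bits * c)
            cycles += 1
    return acc, cycles

rng = np.random.default_rng(0)
W = rng.integers(0, 256, size=(4, 16))                 # 8-bit weights
x = rng.integers(0, 256, size=16)                      # 8-bit inputs
y, c8 = crossbar_mvm(W, x, w_bits=8, in_bits=8)
assert np.array_equal(y, W @ x)                        # exact reconstruction
_, c4 = crossbar_mvm(W & 0xF, x & 0xF, w_bits=4, in_bits=4)
print(c8, c4)  # 32 vs. 8 cycles: 4-bit quantization quarters the MVM latency
```

In this toy model, halving both bit widths cuts the cycle count by 4x, which is the latency lever a mixed-precision search can exploit per layer while trading off accuracy.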




Abstract: The demand for efficient machine learning (ML) accelerators is growing rapidly, driving the development of novel computing concepts such as resistive random access memory (RRAM)-based tiled computing-in-memory (CIM) architectures. CIM allows computation to be performed within the memory unit, resulting in faster data processing and reduced power consumption. Efficient compiler algorithms are essential to exploit the potential of tiled CIM architectures. While conventional ML compilers focus on code generation for CPUs, GPUs, and other von Neumann architectures, adaptations are needed to cover CIM architectures. Cross-layer scheduling is a promising approach, as it enhances the utilization of CIM cores and thereby accelerates computations. Although similar concepts are used implicitly in previous work, clear and quantifiable algorithmic definitions of cross-layer scheduling for tiled CIM architectures are still missing. To close this gap, we present CLSA-CIM, a cross-layer scheduling algorithm for tiled CIM architectures. We integrate CLSA-CIM with existing weight-mapping strategies and compare its performance against state-of-the-art (SOTA) scheduling algorithms. CLSA-CIM improves utilization by up to 17.9x, resulting in an overall speedup of up to 29.2x compared to SOTA.
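
The following toy model is our own sketch, not the CLSA-CIM algorithm itself. It contrasts layer-by-layer scheduling with cross-layer pipelining on a tiled CIM device under simple assumptions: each layer occupies one core, an input is processed as a stream of tiles, and every tile takes one cycle. All function names are hypothetical.

```python
# Compare makespan and core utilization of two scheduling policies.

def makespan_layer_by_layer(n_layers, n_tiles):
    # Each layer must finish all tiles before its successor starts.
    return n_layers * n_tiles

def makespan_cross_layer(n_layers, n_tiles):
    # Layer l may consume tile t as soon as layer l-1 has produced it,
    # so the cores form a pipeline over the tile stream.
    return n_tiles + n_layers - 1

def utilization(n_layers, n_tiles, makespan):
    busy = n_layers * n_tiles            # total busy core-cycles
    return busy / (n_layers * makespan)  # fraction of core-cycles doing work

L, T = 16, 64
seq = makespan_layer_by_layer(L, T)
pipe = makespan_cross_layer(L, T)
print(f"layer-by-layer: {seq} cycles, util {utilization(L, T, seq):.2%}")
print(f"cross-layer:    {pipe} cycles, util {utilization(L, T, pipe):.2%}")
print(f"speedup {seq / pipe:.1f}x")
```

Under these assumptions, pipelining raises utilization from 1/L (6.25% for 16 layers) to roughly T/(T + L - 1) (about 81%), yielding a ~13x speedup in this example; this is the utilization effect that cross-layer scheduling targets, although the real algorithm must also handle weight mapping and inter-tile dependencies.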