Sanghyeon Park

CURing Large Models: Compression via CUR Decomposition

Jan 08, 2025

Sanghyeon Park, Soo-Mook Moon

Abstract: Large deep learning models have achieved remarkable success but are resource-intensive, posing challenges in computational cost and memory usage. We introduce CURing, a novel model compression method based on CUR matrix decomposition, which approximates weight matrices as the product of selected columns (C) and rows (R), and a small linking matrix (U). We apply this decomposition to weights chosen based on the combined influence of their magnitudes and activations. By identifying and retaining informative rows and columns, CURing significantly reduces model size with minimal performance loss. It preserves the original network's input/output structures and retains important features such as non-negativity; the compressed model's activation patterns also align with the original, thereby enhancing interpretability.
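To make the C, U, R factorization in the abstract concrete, here is a minimal NumPy sketch of a generic CUR decomposition. It is an illustration under stated assumptions, not the CURing procedure itself: the function name `cur_decompose`, the squared-norm scoring used to pick rows and columns, and the pseudoinverse formula for U are all choices made for this example, whereas the paper selects weights using magnitudes combined with activations.

```python
import numpy as np

def cur_decompose(W, k):
    """Approximate W (m x n) as C @ U @ R using k actual columns and k actual rows.

    Assumption: columns/rows are ranked by squared L2 norm. This is a simple
    stand-in for the magnitude-and-activation criterion the abstract mentions.
    """
    col_scores = np.sum(W ** 2, axis=0)            # one score per column
    row_scores = np.sum(W ** 2, axis=1)            # one score per row
    cols = np.sort(np.argsort(col_scores)[-k:])    # indices of the k top-scoring columns
    rows = np.sort(np.argsort(row_scores)[-k:])    # indices of the k top-scoring rows
    C = W[:, cols]                                 # m x k, copied straight from W
    R = W[rows, :]                                 # k x n, copied straight from W
    # Linking matrix that minimizes ||W - C U R||_F for the chosen C and R.
    U = np.linalg.pinv(C) @ W @ np.linalg.pinv(R)  # k x k
    return C, U, R, cols, rows

# Hypothetical example: a 64 x 48 "weight matrix" of exact rank 4.
rng = np.random.default_rng(0)
W = rng.standard_normal((64, 4)) @ rng.standard_normal((4, 48))

C, U, R, cols, rows = cur_decompose(W, k=4)
rel_err = np.linalg.norm(W - C @ U @ R) / np.linalg.norm(W)
params_before = W.size
params_after = C.size + U.size + R.size
print(f"relative error: {rel_err:.2e}")
print(f"parameters: {params_before} -> {params_after}")
```

Because C and R are copied directly from W, they keep properties of the original entries such as non-negativity, which is the kind of structure preservation the abstract refers to. Only the small matrix U contains newly computed values.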

A Blockchain-based Platform for Reliable Inference and Training of Large-Scale Models

May 06, 2023

Sanghyeon Park, Junmo Lee, Soo-Mook Moon

Abstract: As artificial intelligence (AI) continues to permeate various domains, concerns surrounding trust and transparency in AI-driven inference and training processes have emerged, particularly with respect to potential biases and traceability challenges. Decentralized solutions such as blockchain have been proposed to tackle these issues, but they often struggle when dealing with large-scale models, leading to time-consuming inference and inefficient training verification. To overcome these limitations, we introduce BRAIN, a Blockchain-based Reliable AI Network, a novel platform specifically designed to ensure reliable inference and training of large models. BRAIN harnesses a unique two-phase transaction mechanism, allowing real-time processing via pipelining by separating request and response transactions. Each randomly selected inference committee commits and reveals its inference results; once agreement is reached through a smart contract, the requested operation is executed using the consensus result. Additionally, BRAIN carries out training by employing a randomly selected training committee. Its members submit commit and reveal transactions along with their respective scores, enabling local model aggregation based on the median value of the scores. Experimental results demonstrate that BRAIN delivers considerably higher inference throughput at reasonable gas fees. In particular, BRAIN's tasks-per-second performance is 454.4293 times greater than that of a naive single-phase implementation.
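The commit-and-reveal pattern and the median-based scoring described in the abstract can be sketched off-chain in plain Python. This is only an illustration under assumptions: the SHA-256 commitment with a random salt, the simple quorum rule in `agree`, and all names and sample values below are hypothetical and do not reproduce BRAIN's smart contract.

```python
import hashlib
import secrets
import statistics
from collections import Counter

def commit(value: str, salt: bytes) -> str:
    """Commit phase: publish only a hash, hiding the value until reveal."""
    return hashlib.sha256(salt + value.encode()).hexdigest()

def verify_reveal(commitment: str, value: str, salt: bytes) -> bool:
    """Reveal phase: check the revealed value and salt against the commitment."""
    return commit(value, salt) == commitment

def agree(revealed, quorum):
    """Assumed agreement rule: accept the most common value if it meets the quorum."""
    if not revealed:
        return None
    value, count = Counter(revealed).most_common(1)[0]
    return value if count >= quorum else None

# Hypothetical committee of three members reporting an inference result.
outputs = ["cat", "cat", "dog"]
salts = [secrets.token_bytes(16) for _ in outputs]
commitments = [commit(o, s) for o, s in zip(outputs, salts)]

# Members later reveal (value, salt); only reveals matching a commitment count.
valid = [o for c, o, s in zip(commitments, outputs, salts) if verify_reveal(c, o, s)]
result = agree(valid, quorum=2)

# Training side: take the median of the revealed scores (hypothetical values).
scores = [0.71, 0.64, 0.90]
median_score = statistics.median(scores)

print(result, median_score)
```

Committing before revealing prevents a member from copying another member's answer, since a hash cannot be changed to match a value seen later. Using the median rather than the mean limits the effect of a single outlying score.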

* 12 pages, 2 figures
