Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Paolo Faraboschi

Reducing the Barriers to Entry for Foundation Model Training

Apr 12, 2024

Paolo Faraboschi, Ellis Giles, Justin Hotard, Konstanty Owczarek, Andrew Wheeler

Abstract:The world has recently witnessed an unprecedented acceleration in demands for Machine Learning and Artificial Intelligence applications. This spike in demand has imposed tremendous strain on the underlying technology stack in supply chain, GPU-accelerated hardware, software, datacenter power density, and energy consumption. If left on the current technological trajectory, future demands show insurmountable spending trends, further limiting market players, stifling innovation, and widening the technology gap. To address these challenges, we propose a fundamental change in the AI training infrastructure throughout the technology ecosystem. The changes require advancements in supercomputing and novel AI training approaches, from high-end software to low-level hardware, microprocessor, and chip design, while advancing the energy efficiency required by a sustainable infrastructure. This paper presents the analytical framework that quantitatively highlights the challenges and points to the opportunities to reduce the barriers to entry for training large language models.

Via

Access Paper or Ask Questions

X-TIME: An in-memory engine for accelerating machine learning on tabular data with CAMs

Apr 05, 2023

Giacomo Pedretti, John Moon, Pedro Bruel, Sergey Serebryakov, Ron M. Roth, Luca Buonanno, Tobias Ziegler, Cong Xu, Martin Foltin, Paolo Faraboschi(+2 more)

Figure 1 for X-TIME: An in-memory engine for accelerating machine learning on tabular data with CAMs

Figure 2 for X-TIME: An in-memory engine for accelerating machine learning on tabular data with CAMs

Figure 3 for X-TIME: An in-memory engine for accelerating machine learning on tabular data with CAMs

Figure 4 for X-TIME: An in-memory engine for accelerating machine learning on tabular data with CAMs

Abstract:Structured, or tabular, data is the most common format in data science. While deep learning models have proven formidable in learning from unstructured data such as images or speech, they are less accurate than simpler approaches when learning from tabular data. In contrast, modern tree-based Machine Learning (ML) models shine in extracting relevant information from structured data. An essential requirement in data science is to reduce model inference latency in cases where, for example, models are used in a closed loop with simulation to accelerate scientific discovery. However, the hardware acceleration community has mostly focused on deep neural networks and largely ignored other forms of machine learning. Previous work has described the use of an analog content addressable memory (CAM) component for efficiently mapping random forests. In this work, we focus on an overall analog-digital architecture implementing a novel increased precision analog CAM and a programmable network on chip allowing the inference of state-of-the-art tree-based ML models, such as XGBoost and CatBoost. Results evaluated in a single chip at 16nm technology show 119x lower latency at 9740x higher throughput compared with a state-of-the-art GPU, with a 19W peak power consumption.

Via

Access Paper or Ask Questions