Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Larry Ruckman

Embedded FPGA Developments in 130nm and 28nm CMOS for Machine Learning in Particle Detector Readout

Apr 26, 2024

Julia Gonski, Aseem Gupta, Haoyi Jia, Hyunjoon Kim, Lorenzo Rota, Larry Ruckman, Angelo Dragone, Ryan Herbst

Figure 1 for Embedded FPGA Developments in 130nm and 28nm CMOS for Machine Learning in Particle Detector Readout

Figure 2 for Embedded FPGA Developments in 130nm and 28nm CMOS for Machine Learning in Particle Detector Readout

Figure 3 for Embedded FPGA Developments in 130nm and 28nm CMOS for Machine Learning in Particle Detector Readout

Figure 4 for Embedded FPGA Developments in 130nm and 28nm CMOS for Machine Learning in Particle Detector Readout

Abstract:Embedded field programmable gate array (eFPGA) technology allows the implementation of reconfigurable logic within the design of an application-specific integrated circuit (ASIC). This approach offers the low power and efficiency of an ASIC along with the ease of FPGA configuration, particularly beneficial for the use case of machine learning in the data pipeline of next-generation collider experiments. An open-source framework called "FABulous" was used to design eFPGAs using 130 nm and 28 nm CMOS technology nodes, which were subsequently fabricated and verified through testing. The capability of an eFPGA to act as a front-end readout chip was tested using simulation of high energy particles passing through a silicon pixel sensor. A machine learning-based classifier, designed for reduction of sensor data at the source, was synthesized and configured onto the eFPGA. A successful proof-of-concept was demonstrated through reproduction of the expected algorithm result on the eFPGA with perfect accuracy. Further development of the eFPGA technology and its application to collider detector readout is discussed.

* 15 pages, 12 figures

Via

Access Paper or Ask Questions

Implementation of a framework for deploying AI inference engines in FPGAs

May 30, 2023

Ryan Herbst, Ryan Coffee, Nathan Fronk, Kukhee Kim, Kuktae Kim, Larry Ruckman, J. J. Russell

Abstract:The LCLS2 Free Electron Laser FEL will generate xray pulses to beamline experiments at up to 1Mhz These experimentals will require new ultrahigh rate UHR detectors that can operate at rates above 100 kHz and generate data throughputs upwards of 1 TBs a data velocity which requires prohibitively large investments in storage infrastructure Machine Learning has demonstrated the potential to digest large datasets to extract relevant insights however current implementations show latencies that are too high for realtime data reduction objectives SLAC has endeavored on the creation of a software framework which translates MLs structures for deployment on Field Programmable Gate Arrays FPGAs deployed at the Edge of the data chain close to the instrumentation This framework leverages Xilinxs HLS framework presenting an API modeled after the open source Keras interface to the TensorFlow library This SLAC Neural Network Library SNL framework is designed with a streaming data approach optimizing the data flow between layers while minimizing the buffer data buffering requirements The goal is to ensure the highest possible framerate while keeping the maximum latency constrained to the needs of the experiment Our framework is designed to ensure the RTL implementation of the network layers supporting full redeployment of weights and biases without requiring resynthesis after training The ability to reduce the precision of the implemented networks through quantization is necessary to optimize the use of both DSP and memory resources in the FPGA We currently have a preliminary version of the toolset and are experimenting with both general purpose example networks and networks being designed for specific LCLS2 experiments.

* 15 pages, 1 figure 2 tables

Via

Access Paper or Ask Questions