Title: Reuse Kernels or Activations? A Flexible Dataflow for Low-latency Spectral CNN Acceleration

Oct 17, 2023

Yue Niu, Rajgopal Kannan, Ajitesh Srivastava, Viktor Prasanna

Abstract: Spectral-domain CNNs have been shown to be more efficient than traditional spatial CNNs in terms of reducing computational complexity. However, they come with a "kernel explosion" problem that, even after compression (pruning), imposes a high memory burden and off-chip bandwidth requirement for kernel access. This creates a performance gap between the potential acceleration offered by compression and actual FPGA implementation performance, especially for low-latency CNN inference. In this paper, we develop a principled approach to overcoming this performance gap and designing a low-latency, low-bandwidth, spectral sparse CNN accelerator on FPGAs. First, we analyze the bandwidth-storage tradeoff of sparse convolutional layers and locate communication bottlenecks. We then develop a dataflow for flexibly optimizing data reuse in different layers to minimize off-chip communication. Finally, we propose a novel scheduling algorithm to optimally schedule the on-chip memory access of multiple sparse kernels and minimize read conflicts. On a state-of-the-art FPGA platform, our design reduces data transfers by 42% with DSP utilization up to 90% and achieves inference latency of 9 ms for VGG16, compared to the baseline state-of-the-art latency of 68 ms.

* 11 pages, 11 figures. Accepted to ACM/SIGDA International Symposium on Field-Programmable Gate Arrays (FPGA) 2020
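
As background for the abstract's terms, the following is a minimal sketch of spectral-domain convolution in general, not the paper's accelerator or dataflow. It checks, on a single tile, that an element-wise product in the frequency domain reproduces a direct circular convolution, and it counts the weights stored per kernel in each domain, which is the usual reading of "kernel explosion". The tile size of 8, the 3x3 kernel values, and all function names are illustrative assumptions; the abstract does not state the tile sizes or transform implementation the authors use.

```python
import cmath

def dft2(x, inverse=False):
    """Naive 2D DFT of a square N x N tile (list of lists); O(N^4), for illustration only."""
    n = len(x)
    sign = 1 if inverse else -1
    out = [[0j] * n for _ in range(n)]
    for u in range(n):
        for v in range(n):
            acc = 0j
            for r in range(n):
                for c in range(n):
                    angle = sign * 2 * cmath.pi * (u * r + v * c) / n
                    acc += x[r][c] * cmath.exp(1j * angle)
            out[u][v] = acc / (n * n) if inverse else acc
    return out

def pad_kernel(k, n):
    """Zero-pad a small spatial kernel to the N x N tile size."""
    out = [[0.0] * n for _ in range(n)]
    for r, row in enumerate(k):
        for c, val in enumerate(row):
            out[r][c] = val
    return out

def circular_conv2(x, k):
    """Reference result: direct circular convolution in the spatial domain."""
    n = len(x)
    return [
        [
            sum(
                k[r][c] * x[(i - r) % n][(j - c) % n]
                for r in range(len(k))
                for c in range(len(k[0]))
            )
            for j in range(n)
        ]
        for i in range(n)
    ]

def spectral_conv2(x, k):
    """Same convolution computed as an element-wise product of spectra."""
    n = len(x)
    x_hat = dft2(x)
    k_hat = dft2(pad_kernel(k, n))
    y_hat = [[x_hat[u][v] * k_hat[u][v] for v in range(n)] for u in range(n)]
    return [[val.real for val in row] for row in dft2(y_hat, inverse=True)]

TILE = 8  # assumed tile size, chosen only to keep the example small
tile = [[float((3 * r + 5 * c) % 7) for c in range(TILE)] for r in range(TILE)]
kernel = [[1.0, 0.0, -1.0], [2.0, 0.0, -2.0], [1.0, 0.0, -1.0]]  # assumed 3x3 kernel

spatial = circular_conv2(tile, kernel)
spectral = spectral_conv2(tile, kernel)
max_err = max(abs(a - b) for ra, rb in zip(spatial, spectral) for a, b in zip(ra, rb))

spatial_weights = len(kernel) * len(kernel[0])  # real values stored per spatial kernel
spectral_weights = TILE * TILE                  # complex values stored per unpruned spectral kernel
```

In this sketch each kernel grows from 9 real weights to 64 complex ones once transformed, before any pruning. That growth is the kind of storage and bandwidth pressure the abstract describes.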
