Abstract: Hyperspectral imaging is attracting significant attention due to its potential in domains such as geology, agriculture, ecology, and surveillance. However, the associated processing algorithms, which are essential for enhancing output quality and extracting relevant information, are often computationally intensive and must handle substantial data volumes. Our focus is on reconfigurable hardware, particularly recent FPGAs. While FPGA design can be complex, High-Level Synthesis (HLS) workflows have emerged as a solution, abstracting low-level design details and improving productivity. Despite successful prior efforts to accelerate hyperspectral imaging with HLS, a comprehensive study that benchmarks various algorithms and architectures within a unified framework is still lacking. This study quantitatively evaluates performance across different inversion algorithms and design architectures, providing insight into the best trade-offs for specific applications. We apply this analysis to a case study: the reconstruction of spectra from interferometric acquisitions taken by Fourier transform spectrometers.
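To make the case study concrete, the following is a minimal sketch of spectrum reconstruction from an interferogram, assuming an idealized two-beam instrument in which a single spectral line produces a cosine fringe. It uses a plain discrete Fourier transform as the inversion step, which is only the simplest possible choice and is not claimed to be one of the algorithms benchmarked in the study. The function names `synthetic_interferogram` and `reconstruct_spectrum` and all parameter values are hypothetical.

```python
import cmath
import math


def synthetic_interferogram(n_samples, line_bin):
    """Idealized interferogram of one spectral line (assumed model):
    a constant offset plus a cosine fringe with `line_bin` cycles."""
    return [1.0 + math.cos(2 * math.pi * line_bin * m / n_samples)
            for m in range(n_samples)]


def reconstruct_spectrum(interferogram):
    """Recover a magnitude spectrum with a direct DFT (O(n^2), for clarity).

    The mean is removed first so the constant offset does not dominate bin 0.
    Only the non-negative frequency bins are returned.
    """
    n = len(interferogram)
    mean = sum(interferogram) / n
    centered = [x - mean for x in interferogram]
    spectrum = []
    for k in range(n // 2 + 1):
        acc = sum(centered[m] * cmath.exp(-2j * math.pi * k * m / n)
                  for m in range(n))
        spectrum.append(abs(acc) / n)
    return spectrum


interferogram = synthetic_interferogram(n_samples=64, line_bin=5)
spectrum = reconstruct_spectrum(interferogram)
# Index of the strongest reconstructed spectral component.
peak_bin = max(range(len(spectrum)), key=spectrum.__getitem__)
```

In this toy setting the strongest bin coincides with the injected line. A hardware implementation would replace the direct DFT with an FFT or another inversion method, which is where the algorithm and architecture trade-offs discussed above arise.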
Abstract: Computer vision applications are one of the key drivers for embedded multicore architectures. Although the number of available cores keeps increasing in new architectures, designing an application that maximizes platform utilization remains a challenge. In this context, parallel performance prediction tools can help developers understand the characteristics of an application and find the most suitable parallelization strategy. In this work, we present a method for early parallel performance estimation on embedded multiprocessors from sequential application traces. We describe its implementation in Parana, a fast trace-driven simulator targeting OpenMP applications on the STMicroelectronics STxP70 Application-Specific Multiprocessor (ASMP). Results for the FAST keypoint detector application show an error margin of less than 10% compared to the reference cycle-approximate simulator, with lower modeling effort and up to 20x faster execution time.
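The idea of estimating parallel performance from a sequential trace can be illustrated with a small sketch. It is not Parana's actual model: it assumes a hypothetical trace format in which each region records per-iteration cycle counts, distributes the iterations of parallel regions in contiguous, near-equal chunks (a common static scheduling policy), and charges a fixed, assumed fork/join overhead. The names `Region`, `static_chunks`, and `estimate_cycles` are illustrative only.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Region:
    """One region of a sequential trace (hypothetical format)."""
    parallel: bool                # True if the region is a parallel loop
    iteration_cycles: List[int]   # measured cycles per iteration


def static_chunks(costs, n_threads):
    """Split iterations into contiguous, near-equal chunks, one per thread."""
    base, extra = divmod(len(costs), n_threads)
    chunks, start = [], 0
    for t in range(n_threads):
        size = base + (1 if t < extra else 0)
        chunks.append(costs[start:start + size])
        start += size
    return chunks


def estimate_cycles(trace, n_threads, fork_join_overhead=0):
    """Estimate total cycles: serial regions add up, parallel regions
    cost as much as their slowest thread plus an assumed overhead."""
    total = 0
    for region in trace:
        if region.parallel:
            chunks = static_chunks(region.iteration_cycles, n_threads)
            total += max(sum(c) for c in chunks) + fork_join_overhead
        else:
            total += sum(region.iteration_cycles)
    return total


trace = [
    Region(parallel=False, iteration_cycles=[100]),
    Region(parallel=True, iteration_cycles=[10] * 8),
    Region(parallel=False, iteration_cycles=[20]),
]
sequential = estimate_cycles(trace, n_threads=1)
parallel = estimate_cycles(trace, n_threads=4, fork_join_overhead=5)
speedup = sequential / parallel
```

For this toy trace the estimate drops from 200 to 145 cycles on four threads, since only the middle region benefits from parallel execution. A real trace-driven simulator would also need to account for effects such as memory contention and synchronization, which this sketch ignores.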