Abstract: Gaze estimation is a valuable technology with numerous applications in fields such as human-computer interaction, virtual reality, and medicine. This report presents the implementation of a gaze estimation system on the Sony Spresense microcontroller board and evaluates its performance in terms of latency, MAC/cycle, and power consumption. The report also describes the system's architecture, including the gaze estimation model used, and presents a demonstration of the system's functionality and performance. Our lightweight model, TinyTrackerS, is only 169Kb in size, uses 85.8k parameters, and runs on the Spresense platform at 3 FPS.
Abstract: Intelligent edge vision tasks face the critical challenge of remaining power- and latency-efficient, because they typically impose a heavy computational load on edge platforms. This work leverages one of the first "AI in sensor" vision platforms, the Sony IMX500, to achieve ultra-fast and ultra-low-power end-to-end edge vision applications. We evaluate the IMX500 and compare it to other edge platforms, such as the Google Coral Dev Micro and Sony Spresense, using gaze estimation as a case study. We propose TinyTracker, a highly efficient, fully quantized model for 2D gaze estimation designed to maximize the performance of the edge vision systems considered in this study. TinyTracker achieves a 41x size reduction (600Kb) compared to iTracker [1] without significant loss in gaze estimation accuracy (at most 0.16 cm when fully quantized). Deploying TinyTracker on the Sony IMX500 vision sensor results in an end-to-end latency of around 19 ms: the camera takes around 17.9 ms to read, process, and transmit the pixels to the accelerator, the network inference takes 0.86 ms, and retrieving the results from the sensor takes an additional 0.24 ms. The overall energy consumption of the end-to-end system is 4.9 mJ, including 0.06 mJ for inference. The end-to-end study shows that the IMX500 is 1.7x faster than the CoralMicro (19 ms vs. 34.4 ms) and 7x more energy efficient (4.9 mJ vs. 34.2 mJ).
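The latency and energy figures quoted above can be cross-checked with a few lines of arithmetic. The sketch below is only illustrative: it uses the rounded values from the abstract, and the variable names are hypothetical rather than taken from the authors' code.

```python
# Rounded figures quoted in the abstract (variable names are illustrative).
camera_ms = 17.9      # read, process and transmit pixels to the accelerator
inference_ms = 0.86   # network inference on the IMX500
readback_ms = 0.24    # retrieving the results from the sensor

# End-to-end latency is the sum of the three stages.
end_to_end_ms = camera_ms + inference_ms + readback_ms

imx500_mj = 4.9       # end-to-end energy on the IMX500
inference_mj = 0.06   # portion of that energy spent on inference
coral_mj = 34.2       # end-to-end energy on the CoralMicro

# Energy ratio between the two platforms and the share spent on inference.
energy_ratio = coral_mj / imx500_mj
inference_share = inference_mj / imx500_mj

print(f"end-to-end latency: {end_to_end_ms:.2f} ms")
print(f"energy ratio (CoralMicro / IMX500): {energy_ratio:.1f}x")
print(f"inference share of IMX500 energy: {inference_share:.1%}")
```

With these rounded inputs, the three stages add up to the reported latency of about 19 ms, and inference accounts for only a small fraction of the end-to-end energy; most of the cost lies in image acquisition and transfer.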