Abstract:Detecting persons in images or video with neural networks is a well-studied subject in literature. However, such works usually assume the availability of a camera of decent resolution and a high-performance processor or GPU to run the detection algorithm, which significantly increases the cost of a complete detection system. However, many applications require low-cost solutions, composed of cheap sensors and simple microcontrollers. In this paper, we demonstrate that even on such hardware we are not condemned to simple classic image processing techniques. We propose a novel ultra-lightweight CNN-based person detector that processes thermal video from a low-cost 32x24 pixel static imager. Trained and compressed on our own recorded dataset, our model achieves up to 91.62% accuracy (F1-score), has less than 10k parameters, and runs as fast as 87ms and 46ms on low-cost microcontrollers STM32F407 and STM32F746, respectively.
Abstract:Using hand gestures to answer a call or to control the radio while driving a car, is nowadays an established feature in more expensive cars. High resolution time-of-flight cameras and powerful embedded processors usually form the heart of these gesture recognition systems. This however comes with a price tag. We therefore investigate the possibility to design an algorithm that predicts hand gestures using a cheap low-resolution thermal camera with only 32x24 pixels, which is light-weight enough to run on a low-cost processor. We recorded a new dataset of over 1300 video clips for training and evaluation and propose a light-weight low-latency prediction algorithm. Our best model achieves 95.9% classification accuracy and 83% mAP detection accuracy while its processing pipeline has a latency of only one frame.