Jorge Leonid Aching Samatelo

Dynamic Gesture Recognition by Using CNNs and Star RGB: a Temporal Information Condensation

Apr 10, 2019

Clebeson Canuto dos Santos, Jorge Leonid Aching Samatelo, Raquel Frizera Vassallo

Abstract: With the advance of technology, machines are increasingly present in people's daily lives. As a result, there has been growing effort to develop interfaces, such as dynamic gestures, that provide an intuitive way of interacting with them. Currently, the most common trend is to use multimodal data, such as depth and skeleton information, to recognize dynamic gestures. However, using only color information would be more attractive, since RGB cameras are found in almost every public place and could be used for gesture recognition without installing additional equipment. The main problem with this approach is the difficulty of representing spatio-temporal information using color alone. With this in mind, we propose a technique called Star RGB, capable of describing a video clip containing a dynamic gesture as a single RGB image. This image is then passed to a classifier formed by two ResNet CNNs, a soft-attention ensemble, and a multilayer perceptron, which returns the predicted class label indicating the type of gesture to which the input video belongs. Experiments were carried out using the Montalbano and GRIT datasets. On the Montalbano dataset, the proposed approach achieved an accuracy of 94.58%, which reaches the state of the art on this dataset when only color information is considered. On the GRIT dataset, our proposal achieves more than 98% accuracy, recall, precision, and F1-score, outperforming the reference approach by more than 6%.
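
The abstract does not say how Star RGB condenses a clip into one image, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes that motion is summarized by accumulating absolute differences between consecutive grayscale frames, and that the clip is split into three temporal segments, each filling one color channel. The function names (`accumulate_motion`, `to_uint8`, `condense_clip`) and the segment boundaries are hypothetical, and the code is kept dependency-free for readability.

```python
from typing import List

Frame = List[List[int]]  # grayscale frame: rows of pixel intensities in 0..255


def accumulate_motion(frames: List[Frame]) -> List[List[float]]:
    """Sum absolute differences between consecutive frames, per pixel."""
    height, width = len(frames[0]), len(frames[0][0])
    acc = [[0.0] * width for _ in range(height)]
    for prev, curr in zip(frames, frames[1:]):
        for y in range(height):
            for x in range(width):
                acc[y][x] += abs(curr[y][x] - prev[y][x])
    return acc


def to_uint8(channel: List[List[float]]) -> List[List[int]]:
    """Rescale so the largest value maps to 255 (an all-zero channel stays zero)."""
    peak = max(max(row) for row in channel)
    if peak == 0:
        return [[0 for _ in row] for row in channel]
    return [[round(255 * value / peak) for value in row] for row in channel]


def condense_clip(frames: List[Frame]) -> List[List[List[int]]]:
    """Condense a clip into one H x W x 3 image, one channel per temporal third.

    Assumption: this segment-per-channel layout is an illustrative choice,
    not a detail taken from the abstract.
    """
    n = len(frames)
    if n < 4:
        raise ValueError("need at least 4 frames to fill three segments")
    # Neighbouring segments share a boundary frame so that no
    # frame-to-frame transition is dropped.
    bounds = [0, n // 3, (2 * n) // 3, n - 1]
    channels = [
        to_uint8(accumulate_motion(frames[bounds[i]:bounds[i + 1] + 1]))
        for i in range(3)
    ]
    height, width = len(frames[0]), len(frames[0][0])
    return [
        [[channels[c][y][x] for c in range(3)] for x in range(width)]
        for y in range(height)
    ]
```

Under these assumptions, early motion ends up in the first channel and late motion in the last, so the order of a gesture's movements is encoded as color in a single image that a standard image classifier can consume.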

* 17 pages, 12 figures, submitted to Neurocomputing Journal
