Oğulcan Özdemir

Score-level Multi Cue Fusion for Sign Language Recognition

Sep 29, 2020

Çağrı Gökçe, Oğulcan Özdemir, Ahmet Alp Kındıroğlu, Lale Akarun

Abstract: Sign Languages are expressed through hand and upper body gestures as well as facial expressions. Therefore, Sign Language Recognition (SLR) needs to focus on all such cues. Previous work uses hand-crafted mechanisms or network aggregation to extract the different cue features and increase SLR performance. This is slow and involves complicated architectures. We propose a more straightforward approach that focuses on training separate cue models specializing in the dominant hand, hands, face, and upper body regions. We compare the performance of 3D Convolutional Neural Network (CNN) models specializing in these regions, combine them through score-level fusion, and also evaluate its weighted alternative. Our experimental results show the effectiveness of mixed convolutional models. Their fusion yields up to 19% accuracy improvement over the baseline using the full upper body. Furthermore, we include a discussion of fusion settings, which can help future work on Sign Language Translation (SLT).
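
The score-level fusion mentioned in the abstract can be illustrated with a minimal sketch. It assumes each cue model outputs per-class probabilities and that fusion is a plain or weighted average of those scores; the function name `fuse_scores`, the cue names, and the weights are hypothetical and are not taken from the paper.

```python
import numpy as np


def softmax(logits, axis=-1):
    """Numerically stable softmax over the class axis."""
    shifted = logits - np.max(logits, axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return exp / np.sum(exp, axis=axis, keepdims=True)


def fuse_scores(cue_scores, weights=None):
    """Average class scores across cue models (hypothetical helper).

    cue_scores: dict mapping cue name -> array of shape (n_samples, n_classes)
    weights:    optional dict mapping cue name -> non-negative weight;
                if omitted, all cues contribute equally.
    Returns the fused scores and the predicted class index per sample.
    """
    names = sorted(cue_scores)
    # Shape: (n_cues, n_samples, n_classes)
    stacked = np.stack([cue_scores[name] for name in names], axis=0)
    if weights is None:
        w = np.full(len(names), 1.0 / len(names))
    else:
        w = np.array([weights[name] for name in names], dtype=float)
        w = w / w.sum()  # normalize so the fused scores still sum to 1
    fused = np.tensordot(w, stacked, axes=1)  # (n_samples, n_classes)
    return fused, fused.argmax(axis=-1)


# Toy scores for one sample and three classes (made-up numbers).
scores = {
    "hands": softmax(np.array([[2.0, 1.0, 0.0]])),
    "face": softmax(np.array([[0.0, 3.0, 0.0]])),
}
fused, pred = fuse_scores(scores)
weighted, weighted_pred = fuse_scores(scores, weights={"hands": 0.9, "face": 0.1})
```

In this toy case the unweighted average follows the confident face model, while weighting the hands model heavily flips the prediction. How the weights are actually chosen is not stated in the abstract.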

BosphorusSign22k Sign Language Recognition Dataset

Apr 09, 2020

Oğulcan Özdemir, Ahmet Alp Kındıroğlu, Necati Cihan Camgöz, Lale Akarun

Abstract: Sign Language Recognition is a challenging research domain. It has recently seen several advancements with the increased availability of data. In this paper, we introduce BosphorusSign22k, a publicly available large-scale sign language dataset aimed at the computer vision, video recognition, and deep learning research communities. The primary objective of this dataset is to serve as a new benchmark in Turkish Sign Language Recognition for its vast lexicon, the high number of repetitions by native signers, high recording quality, and the unique syntactic properties of the signs it encompasses. We also provide state-of-the-art human pose estimates to encourage other tasks such as Sign Language Production. We survey other publicly available datasets and expand on how BosphorusSign22k can contribute to future research that is being made possible through the widespread availability of similar Sign Language resources. We have conducted extensive experiments and present baseline results to underpin future research on our dataset.

* 8 pages

Temporal Accumulative Features for Sign Language Recognition

Apr 02, 2020

Ahmet Alp Kındıroğlu, Oğulcan Özdemir, Lale Akarun

Abstract: In this paper, we propose a set of features called temporal accumulative features (TAF) for representing and recognizing isolated sign language gestures. By incorporating sign-language-specific constructs to better represent the unique linguistic characteristics of sign language videos, we have devised an efficient and fast SLR method for recognizing isolated sign language gestures. The proposed method is an HSV-based accumulative video representation where keyframes based on the linguistic movement-hold model are represented by different colors. We also incorporate hand shape information and, using a small-scale convolutional neural network, demonstrate that sequential modeling of accumulative features for linguistic subunits improves upon baseline classification results.

* 10 pages
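
The idea of an accumulative video representation in which keyframes are represented by different colors can be sketched roughly as follows. This is a simplified illustration and not the authors' implementation: it assumes the keyframes are already selected and given as intensity maps in [0, 1], assigns each keyframe an evenly spaced hue, and merges them with a per-pixel maximum. The function name `accumulate_keyframes` is hypothetical.

```python
import colorsys

import numpy as np


def accumulate_keyframes(keyframes):
    """Collapse k keyframes into one RGB image, one hue per keyframe.

    keyframes: array-like of shape (k, height, width) with values in [0, 1],
               e.g. a foreground or motion intensity map per keyframe
               (how keyframes are chosen is outside this sketch).
    Returns an array of shape (height, width, 3) with values in [0, 1].
    """
    keyframes = np.asarray(keyframes, dtype=float)
    k, height, width = keyframes.shape
    image = np.zeros((height, width, 3))
    for index, frame in enumerate(keyframes):
        hue = index / k  # evenly spaced hues in [0, 1), an assumption
        rgb = np.array(colorsys.hsv_to_rgb(hue, 1.0, 1.0))
        colored = frame[..., None] * rgb  # tint the intensity map
        image = np.maximum(image, colored)  # keep the strongest response
    return image


# Two tiny 2x2 keyframes: the first is active top-left, the second bottom-right.
frames = np.zeros((2, 2, 2))
frames[0, 0, 0] = 1.0
frames[1, 1, 1] = 1.0
taf_like = accumulate_keyframes(frames)
```

The resulting single image encodes temporal order through color, so a small 2D convolutional network could consume it in place of the full frame sequence.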
