Soumitra Samanta

Hierarchical Windowed Graph Attention Network and a Large Scale Dataset for Isolated Indian Sign Language Recognition

Jul 19, 2024

Suvajit Patra, Arkadip Maitra, Megha Tiwari, K. Kumaran, Swathy Prabhu, Swami Punyeshwarananda, Soumitra Samanta

Abstract: Automatic sign language (SL) recognition is an important task in the computer vision community. Building a robust SL recognition system requires a considerable amount of data, which is lacking, particularly for Indian Sign Language (ISL). In this paper, we propose a large-scale isolated ISL dataset and a novel SL recognition model based on the skeleton graph structure. The dataset covers 2,002 common words used daily in the deaf community, recorded by 20 deaf adult signers (10 male and 10 female), and contains 40,033 videos. We propose an SL recognition model, the Hierarchical Windowed Graph Attention Network (HWGAT), which utilizes the human upper-body skeleton graph structure. HWGAT aims to capture distinctive motions by attending to different body parts induced by the human skeleton graph structure. The utility of the proposed dataset and the usefulness of our model are evaluated through extensive experiments. We pre-trained the proposed model on the proposed dataset and fine-tuned it on different sign language datasets, improving performance by 1.10, 0.46, 0.78, and 6.84 percentage points on INCLUDE, LSA64, AUTSL, and WLASL, respectively, compared to the existing state-of-the-art skeleton-based models.
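
The idea of attending within body parts can be illustrated with a minimal sketch. This is not the HWGAT architecture: the joint grouping in `PART_WINDOWS`, the function names, and the use of plain dot-product attention without learned projections are all assumptions made for illustration. It only shows each joint attending to the joints in its own window.

```python
import math

# Hypothetical grouping of upper-body joints into part windows (illustrative only).
PART_WINDOWS = {
    "torso": [0, 1, 2],
    "left_arm": [3, 4, 5],
    "right_arm": [6, 7, 8],
}


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def windowed_attention(features, windows):
    """Self-attention in which each joint attends only to joints in its own window.

    features: one feature vector (list of floats) per joint.
    windows:  mapping from part name to the joint indices in that part.
    """
    dim = len(features[0])
    out = [None] * len(features)
    for joints in windows.values():
        for i in joints:
            # Scaled dot-product scores against joints of the same window only.
            scores = [dot(features[i], features[j]) / math.sqrt(dim) for j in joints]
            weights = softmax(scores)
            # Output is the attention-weighted average of the window's features.
            out[i] = [
                sum(w * features[j][d] for w, j in zip(weights, joints))
                for d in range(dim)
            ]
    return out
```

A hierarchical variant could apply the same function again with coarser windows (for example, one window covering all joints), though whether this matches the paper's hierarchy is not stated in the abstract.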

A Data-driven Approach for Human Pose Tracking Based on Spatio-temporal Pictorial Structure

Jul 31, 2016

Soumitra Samanta, Bhabatosh Chanda

Abstract: In this paper, we present a data-driven approach for human pose tracking in video data. We formulate human pose tracking as a discrete optimization problem based on a spatio-temporal pictorial structure model and solve it very efficiently in a greedy framework. The model tracks the human pose by combining human pose estimation from a single image with traditional object tracking in video. Our pose tracking objective function consists of the following terms: the likelihood of a part's appearance within a frame, the temporal displacement of the part from the previous frame to the current frame, and the spatial dependency of a part on its parent in the graph structure. Experimental evaluation on benchmark datasets (VideoPose2, Poses in the Wild, and Outdoor Pose) as well as on our newly built ICDPose dataset shows the usefulness of our proposed method.
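
The three terms of the objective and the greedy solution can be sketched roughly as follows. This is a sketch under stated assumptions, not the paper's formulation: the part tree `PARENTS`, the weights `w_t` and `w_s`, the rest lengths, and the exact form of each cost term are hypothetical. Parts are placed root first, and each part takes the candidate location with the lowest combined cost given its already placed parent.

```python
import math

# Hypothetical part tree: child -> parent, listed root first (illustrative only).
PARENTS = {"torso": None, "upper_arm": "torso", "lower_arm": "upper_arm"}


def dist(p, q):
    """Euclidean distance between two 2D points."""
    return math.hypot(p[0] - q[0], p[1] - q[1])


def part_cost(cand, appearance, prev_pos, parent_pos, rest_len, w_t=1.0, w_s=1.0):
    """Cost of placing one part at `cand` (lower is better)."""
    cost = -appearance                      # appearance term: higher score lowers cost
    cost += w_t * dist(cand, prev_pos)      # temporal term: displacement from previous frame
    if parent_pos is not None:
        # Spatial term: deviation from the assumed rest length to the parent.
        cost += w_s * abs(dist(cand, parent_pos) - rest_len)
    return cost


def track_frame(candidates, prev_pose, rest_lengths):
    """Greedily choose a location for every part in the current frame.

    candidates:   part -> list of ((x, y), appearance_score)
    prev_pose:    part -> (x, y) in the previous frame
    rest_lengths: part -> assumed distance to its parent
    """
    pose = {}
    for part, parent in PARENTS.items():    # dicts keep insertion order: root first
        parent_pos = pose[parent] if parent is not None else None
        best = min(
            candidates[part],
            key=lambda c: part_cost(
                c[0], c[1], prev_pose[part], parent_pos, rest_lengths.get(part, 0.0)
            ),
        )
        pose[part] = best[0]
    return pose
```

Because each part is fixed before its children are considered, one frame costs time linear in the total number of candidates, at the price of not revisiting an early choice that later turns out to be poor.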
