Gokce Dane

DIREG3D: DIrectly REGress 3D Hands from Multiple Cameras

Jan 26, 2022

Ashar Ali, Upal Mahbub, Gokce Dane, Gerhard Reitmayr

Abstract: In this paper, we present DIREG3D, a holistic framework for 3D hand tracking. The proposed framework is capable of utilizing camera intrinsic parameters, 3D geometry, intermediate 2D cues, and visual information to regress parameters for accurately representing a hand mesh model. Our experiments show that information such as the size of the 2D hand, its distance from the optical center, and radial distortion is useful for deriving highly reliable 3D poses in camera space from just monocular information. Furthermore, we extend these results to a multi-view camera setup by fusing features from different viewpoints.
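
To illustrate why the 2D hand size and its offset from the optical center carry 3D information, here is a minimal pinhole-camera sketch. It is not the DIREG3D method (which regresses mesh parameters with a learned model); it is textbook geometry under stated assumptions: no radial distortion, a known physical hand size, and hypothetical names (`Intrinsics`, `depth_from_size`, `back_project`) and values chosen only for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Intrinsics:
    """Pinhole intrinsics in pixels (hypothetical example values below)."""
    fx: float  # focal length along x
    fy: float  # focal length along y
    cx: float  # principal point x (optical center)
    cy: float  # principal point y (optical center)


def depth_from_size(focal_px: float, real_size_m: float, size_px: float) -> float:
    """Estimate depth from apparent size: z = f * real_size / pixel_size."""
    return focal_px * real_size_m / size_px


def back_project(k: Intrinsics, u: float, v: float, z: float) -> tuple:
    """Lift pixel (u, v) at depth z to a 3D point in camera space."""
    x = (u - k.cx) / k.fx * z
    y = (v - k.cy) / k.fy * z
    return (x, y, z)


# Assumed camera and an assumed 0.2 m hand that spans 200 px in the image.
k = Intrinsics(fx=500.0, fy=500.0, cx=320.0, cy=240.0)
z = depth_from_size(k.fx, real_size_m=0.2, size_px=200.0)

# Hand center observed 100 px to the right of the optical center.
point = back_project(k, u=420.0, v=240.0, z=z)
print(point)
```

With these assumed numbers the hand sits about 0.5 m from the camera and about 0.1 m to the right of the optical axis, which shows how intrinsics plus 2D cues constrain a camera-space position from a single view.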

LCDet: Low-Complexity Fully-Convolutional Neural Networks for Object Detection in Embedded Systems

May 16, 2017

Subarna Tripathi, Gokce Dane, Byeongkeun Kang, Vasudev Bhaskaran, Truong Nguyen

Abstract: Deep convolutional neural networks (CNNs) are the state-of-the-art performers for the object detection task. It is well known that object detection requires more computation and memory than image classification. Thus, integrating a CNN-based object detector into an embedded system is more challenging. In this work, we propose LCDet, a fully-convolutional neural network for generic object detection that aims to work in embedded systems. We design and develop an end-to-end TensorFlow (TF)-based model. Additionally, we employ 8-bit quantization on the learned weights. We use face detection as a use case. Our TF-Slim-based network can predict faces of different shapes and sizes in a single forward pass. Our experimental results show that the proposed method achieves accuracy comparable to state-of-the-art CNN-based face detection methods, while reducing the model size by 3x and memory bandwidth by ~4x compared with YOLO, one of the best real-time CNN-based object detectors. The TF 8-bit quantized model provides an additional 4x memory reduction while keeping the accuracy as good as that of the floating-point model. The proposed model thus becomes amenable to embedded implementations.

* Embedded Vision Workshop in CVPR
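
The 4x memory reduction from 8-bit quantization follows from storing each 32-bit floating-point weight in a single byte. The sketch below shows a generic min/max linear quantization in plain Python; it is an assumption for illustration, not the TensorFlow quantization scheme used by the authors, and the function names and weight values are hypothetical.

```python
from array import array


def quantize(weights):
    """Map float weights linearly onto 8-bit codes in [0, 255]."""
    lo, hi = min(weights), max(weights)
    scale = (hi - lo) / 255.0 or 1.0  # fall back to 1.0 if all weights are equal
    codes = array("B", (min(255, max(0, round((w - lo) / scale))) for w in weights))
    return codes, scale, lo


def dequantize(codes, scale, lo):
    """Recover approximate float weights from the 8-bit codes."""
    return [lo + c * scale for c in codes]


# Hypothetical weights stored as 32-bit floats (4 bytes each).
weights = array("f", [-1.0, -0.25, 0.0, 0.5, 1.0])
codes, scale, lo = quantize(weights)
restored = dequantize(codes, scale, lo)

# Storage ratio: 4 bytes per float32 weight versus 1 byte per code.
ratio = (len(weights) * weights.itemsize) / (len(codes) * codes.itemsize)
max_error = max(abs(a - b) for a, b in zip(weights, restored))
print(ratio, max_error)
```

The rounding error per weight is bounded by half the quantization step (`scale / 2`), which is why accuracy can stay close to the floating-point model when the weight range is modest.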