Abstract: We present BlazePose GHUM Holistic, a lightweight neural network pipeline for 3D human body landmark and pose estimation, specifically tailored to real-time on-device inference. BlazePose GHUM Holistic enables motion capture from a single RGB image, supporting applications such as avatar control, fitness tracking, and AR/VR effects. Our main contributions include i) a novel method for 3D ground truth data acquisition, ii) updated 3D body tracking with additional hand landmarks, and iii) full body pose estimation from a monocular image.
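For illustration, the following is a minimal sketch of driving holistic body-plus-hand tracking from a single image through the public MediaPipe Python solutions API; the input path is a placeholder, and the API shown is the generic wrapper rather than this paper's internals.

```python
# Minimal sketch: holistic body + hand landmarks from one RGB image.
# "person.jpg" is a placeholder input path.
import cv2
import mediapipe as mp

image = cv2.imread("person.jpg")
with mp.solutions.holistic.Holistic(static_image_mode=True) as holistic:
    results = holistic.process(cv2.cvtColor(image, cv2.COLOR_BGR2RGB))

# 33 body landmarks in metric world coordinates, plus 21 landmarks
# per hand whenever the hands are visible.
if results.pose_world_landmarks:
    for lm in results.pose_world_landmarks.landmark:
        print(lm.x, lm.y, lm.z)
```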
Abstract: We present an on-device real-time hand gesture recognition (HGR) system, which detects a set of predefined static gestures from a single RGB camera. The system consists of two parts: a hand skeleton tracker and a gesture classifier. We use MediaPipe Hands as the basis of the hand skeleton tracker, improve the keypoint accuracy, and add the estimation of 3D keypoints in a world metric space. We create two different gesture classifiers, one based on heuristics and the other using neural networks (NNs).
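As a concrete but hedged illustration of the heuristic route (not the paper's exact rules), one can classify a simple "open palm" gesture by testing whether each fingertip lies farther from the wrist than its PIP joint, using the standard 21-landmark MediaPipe hand topology:

```python
# Hedged sketch (not the paper's actual heuristics): detect an "open palm"
# gesture from 21 hand landmarks by checking that every fingertip lies
# farther from the wrist than the corresponding PIP joint.
import numpy as np

WRIST = 0
# (tip, pip) landmark indices for thumb..pinky in the MediaPipe hand topology
FINGERS = [(4, 2), (8, 6), (12, 10), (16, 14), (20, 18)]

def is_open_palm(landmarks: np.ndarray) -> bool:
    """landmarks: (21, 3) array of 3D hand keypoints."""
    wrist = landmarks[WRIST]
    extended = [
        np.linalg.norm(landmarks[tip] - wrist) > np.linalg.norm(landmarks[pip] - wrist)
        for tip, pip in FINGERS
    ]
    return all(extended)
```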
Abstract: We present a real-time on-device hand tracking pipeline that predicts the hand skeleton from a single RGB camera for AR/VR applications. The pipeline consists of two models: 1) a palm detector and 2) a hand landmark model. It is implemented via MediaPipe, a framework for building cross-platform ML solutions. The proposed model and pipeline architecture demonstrate real-time inference speed on mobile GPUs and high prediction quality. MediaPipe Hands is open sourced at https://mediapipe.dev.
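A minimal sketch of consuming this two-stage pipeline through the open-sourced MediaPipe Python API follows; "hands.jpg" is a placeholder input and the confidence defaults are left at the library's values.

```python
# Minimal sketch: run the palm-detector + hand-landmark pipeline on an image.
import cv2
import mediapipe as mp

image = cv2.imread("hands.jpg")  # placeholder input path
with mp.solutions.hands.Hands(static_image_mode=True, max_num_hands=2) as hands:
    # Internally, the palm detector proposes hand regions and the landmark
    # model regresses 21 keypoints per detected hand.
    results = hands.process(cv2.cvtColor(image, cv2.COLOR_BGR2RGB))

for hand in results.multi_hand_landmarks or []:
    print([(lm.x, lm.y, lm.z) for lm in hand.landmark])
```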
Abstract: We present BlazePose, a lightweight convolutional neural network architecture for human pose estimation that is tailored for real-time inference on mobile devices. During inference, the network produces 33 body keypoints for a single person and runs at over 30 frames per second on a Pixel 2 phone. This makes it particularly suited to real-time use cases like fitness tracking and sign language recognition. Our main contributions include a novel body pose tracking solution and a lightweight body pose estimation neural network that uses both heatmaps and regression to keypoint coordinates.
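To make the heatmap-plus-regression idea concrete, below is a hedged PyTorch sketch of a dual-output head; the stand-in encoder and layer sizes are illustrative, not the paper's architecture. The heatmap branch supplies dense supervision during training, while the regression branch yields coordinates directly at inference.

```python
# Hedged sketch of a dual heatmap + regression head (illustrative sizes).
import torch
import torch.nn as nn

class DualHeadPose(nn.Module):
    def __init__(self, channels: int = 64, num_keypoints: int = 33):
        super().__init__()
        self.backbone = nn.Conv2d(3, channels, 3, padding=1)  # stand-in encoder
        self.heatmap_head = nn.Conv2d(channels, num_keypoints, 1)
        self.regress_head = nn.Sequential(
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(channels, num_keypoints * 3),  # x, y, visibility per keypoint
        )

    def forward(self, x):
        feat = self.backbone(x)
        # Heatmaps supervise training; regression drives inference-time output.
        return self.heatmap_head(feat), self.regress_head(feat)
```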
Abstract: We present a novel approach for neural network-based hair segmentation from a single camera input, specifically designed for real-time mobile applications. Our relatively small neural network produces a high-quality hair segmentation mask that is well suited for AR effects, e.g. virtual hair recoloring. The proposed model achieves real-time inference speed on mobile GPUs (30-100+ FPS, depending on the device) with high accuracy. We also propose a realistic hair recoloring scheme. Our method has been deployed in a major AR application and is used by millions of users.
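As a simplified stand-in for mask-driven recoloring (not the paper's actual scheme, which is more elaborate), the sketch below alpha-blends a target color into the image using a soft segmentation mask, which is assumed to come from a model like the one described above:

```python
# Hedged sketch: naive mask-based hair recoloring via alpha blending.
# mask: (H, W) float soft hair mask in [0, 1], assumed given by a segmenter.
import numpy as np

def recolor_hair(image: np.ndarray, mask: np.ndarray,
                 color=(180, 60, 200), strength: float = 0.6) -> np.ndarray:
    """image: (H, W, 3) uint8; returns the recolored image."""
    alpha = (mask * strength)[..., None]        # per-pixel blend weight
    tint = np.full_like(image, color)           # solid target-color layer
    out = image * (1 - alpha) + tint * alpha    # blend toward target color
    return out.astype(np.uint8)
```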
Abstract: We present BlazeFace, a lightweight and well-performing face detector tailored for mobile GPU inference. It runs at a speed of 200-1000+ FPS on flagship devices. This super-realtime performance enables it to be applied to any augmented reality pipeline that requires an accurate facial region of interest as an input for task-specific models, such as 2D/3D facial keypoint or geometry estimation, facial features or expression classification, and face region segmentation. Our contributions include a lightweight feature extraction network inspired by, but distinct from MobileNetV1/V2, a GPU-friendly anchor scheme modified from Single Shot MultiBox Detector (SSD), and an improved tie resolution strategy alternative to non-maximum suppression.
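The sketch below is a hedged reconstruction of the blending-style tie resolution idea: instead of discarding overlapping candidates as NMS does, boxes with high mutual IoU are merged into a score-weighted average (the IoU threshold is illustrative).

```python
# Hedged sketch: merge overlapping detections into a weighted-average box
# instead of suppressing them (blending alternative to NMS).
import numpy as np

def iou(a: np.ndarray, b: np.ndarray) -> float:
    x1, y1 = np.maximum(a[:2], b[:2])
    x2, y2 = np.minimum(a[2:], b[2:])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area = lambda r: (r[2] - r[0]) * (r[3] - r[1])
    return inter / (area(a) + area(b) - inter + 1e-9)

def blend_detections(boxes: np.ndarray, scores: np.ndarray,
                     iou_thresh: float = 0.3) -> list:
    """boxes: (N, 4) as [x1, y1, x2, y2]; scores: (N,)."""
    order = np.argsort(-scores)
    used, merged = np.zeros(len(boxes), bool), []
    for i in order:
        if used[i]:
            continue
        group = [j for j in order if not used[j] and iou(boxes[i], boxes[j]) >= iou_thresh]
        for j in group:
            used[j] = True
        w = scores[group] / scores[group].sum()  # score-weighted average
        merged.append((boxes[group] * w[:, None]).sum(axis=0))
    return merged
```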