Antonis A. Argyros

Y-MAP-Net: Real-time depth, normals, segmentation, multi-label captioning and 2D human pose in RGB images

Nov 15, 2024

Ammar Qammaz, Nikolaos Vasilikopoulos, Iason Oikonomidis, Antonis A. Argyros

Abstract: We present Y-MAP-Net, a Y-shaped neural network architecture designed for real-time multi-task learning on RGB images. Y-MAP-Net simultaneously predicts depth, surface normals, human pose, and semantic segmentation, and generates multi-label captions, all from a single network evaluation. To achieve this, we adopt a multi-teacher, single-student training paradigm, where task-specific foundation models supervise the network's learning, enabling it to distill their capabilities into a lightweight architecture suitable for real-time applications. Y-MAP-Net exhibits strong generalization, simplicity and computational efficiency, making it ideal for robotics and other practical scenarios. To support future research, we will release our code publicly.

* 8-page paper, 6 figures, 3 tables
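
The multi-teacher, single-student paradigm described in the abstract can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the task names, the use of mean squared error for every task, the per-task weights, and the function names `mse` and `multi_teacher_loss` do not come from the paper, whose actual losses and architecture are not given in this listing. The idea shown is only that one student produces an output per task, each output is compared against the prediction of that task's teacher, and the weighted per-task terms are summed into a single training objective.

```python
from typing import Dict, List

Vector = List[float]


def mse(pred: Vector, target: Vector) -> float:
    """Mean squared error between two equally sized vectors."""
    if len(pred) != len(target):
        raise ValueError("prediction and target must have the same length")
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)


def multi_teacher_loss(
    student_outputs: Dict[str, Vector],
    teacher_outputs: Dict[str, Vector],
    weights: Dict[str, float],
) -> float:
    """Weighted sum of per-task distillation losses (hypothetical form).

    Each key is a task name; the student's output for that task is pulled
    towards the output of the corresponding task-specific teacher.
    Tasks without an explicit weight default to 1.0.
    """
    total = 0.0
    for task, pred in student_outputs.items():
        total += weights.get(task, 1.0) * mse(pred, teacher_outputs[task])
    return total


# Toy example with two tasks and made-up numbers.
student = {"depth": [1.0, 2.0], "normals": [0.0, 0.0]}
teachers = {"depth": [1.0, 4.0], "normals": [1.0, 1.0]}
task_weights = {"depth": 0.5, "normals": 2.0}
loss = multi_teacher_loss(student, teachers, task_weights)
```

In a real system each task would likely use its own loss (for example a classification loss for segmentation), and the teacher outputs would be precomputed or produced on the fly by the foundation models.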

Multi-view Image-based Hand Geometry Refinement using Differentiable Monte Carlo Ray Tracing

Jul 12, 2021

Giorgos Karvounas, Nikolaos Kyriazis, Iason Oikonomidis, Aggeliki Tsoli, Antonis A. Argyros

Abstract: The number and quality of datasets and tools available for hand pose and shape estimation attest to the significant progress that has been made. We find that there is still room for improvement on both fronts, and even beyond. Even the highest-quality datasets reported to date have shortcomings in annotation. There are tools in the literature that can assist in that direction, yet they have not been considered so far. To demonstrate how these gaps can be bridged, we take one such publicly available multi-camera dataset of hands (InterHand2.6M) and perform effective image-based refinement to improve on the imperfect ground truth annotations, yielding a better dataset. The image-based refinement is achieved through ray tracing, a method that has not so far been applied to related problems and is shown here to be superior to the approximate alternatives employed in the past. To tackle the lack of reliable ground truth, we resort to realistic synthetic data to show that the improvement we induce is indeed significant, both qualitatively and quantitatively.

Hybrid One-Shot 3D Hand Pose Estimation by Exploiting Uncertainties

Oct 27, 2015

Georg Poier, Konstantinos Roditakis, Samuel Schulter, Damien Michel, Horst Bischof, Antonis A. Argyros

Abstract: Model-based approaches to 3D hand tracking have been shown to perform well in a wide range of scenarios. However, they require initialisation and cannot recover easily from tracking failures that occur due to fast hand motions. Data-driven approaches, on the other hand, can quickly deliver a solution, but the results often suffer from lower accuracy or a lack of anatomical validity compared to those obtained from model-based approaches. In this work we propose a hybrid approach for hand pose estimation from a single depth image. First, a learned regressor is employed to deliver multiple initial hypotheses for the 3D position of each hand joint. Subsequently, the kinematic parameters of a 3D hand model are found by deliberately exploiting the inherent uncertainty of the inferred joint proposals. This way, the method provides anatomically valid and accurate solutions without requiring manual initialisation or suffering from track losses. Quantitative results on several standard datasets demonstrate that the proposed method outperforms state-of-the-art representatives of the model-based, data-driven and hybrid paradigms.

* BMVC 2015 (oral); see also http://lrs.icg.tugraz.at/research/hybridhape/
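
The two-stage idea in the abstract (several weighted proposals per joint, then a kinematic model fitted to them) can be sketched on a toy problem. This is not the authors' objective or optimizer: the planar two-segment chain, the cost (squared distance scaled by an assumed `sigma`, minus the log of the proposal weight, minimised over proposals), the brute-force grid search, and all names below are hypothetical choices made for illustration. The point is only that fixed segment lengths keep the solution anatomically consistent, while proposal weights let the fit favour confident hypotheses without discarding the others.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, float]
Proposal = Tuple[Point, float]  # (proposed joint position, confidence weight > 0)


def forward_kinematics(angles: Sequence[float], lengths: Sequence[float]) -> List[Point]:
    """Joint positions of a planar chain rooted at the origin."""
    x = y = heading = 0.0
    joints = []
    for angle, length in zip(angles, lengths):
        heading += angle  # each angle is relative to the previous segment
        x += length * math.cos(heading)
        y += length * math.sin(heading)
        joints.append((x, y))
    return joints


def fit_cost(joints: Sequence[Point], proposals: Sequence[Sequence[Proposal]],
             sigma: float = 0.1) -> float:
    """Sum over joints of the cost of the best-matching proposal.

    Per proposal: squared distance / (2 * sigma^2) - log(weight), so a
    nearby but unlikely proposal can lose to a confident one.
    """
    total = 0.0
    for (jx, jy), hyps in zip(joints, proposals):
        total += min(
            ((jx - px) ** 2 + (jy - py) ** 2) / (2.0 * sigma ** 2) - math.log(w)
            for (px, py), w in hyps
        )
    return total


def fit_chain(proposals: Sequence[Sequence[Proposal]], lengths: Tuple[float, float],
              steps: int = 72) -> Tuple[Tuple[float, float], float]:
    """Exhaustive grid search over the two joint angles (toy optimizer)."""
    grid = [-math.pi + 2.0 * math.pi * i / steps for i in range(steps)]
    best_angles, best_cost = (0.0, 0.0), math.inf
    for a0 in grid:
        for a1 in grid:
            cost = fit_cost(forward_kinematics((a0, a1), lengths), proposals)
            if cost < best_cost:
                best_angles, best_cost = (a0, a1), cost
    return best_angles, best_cost


# Two proposals per joint; the higher-weight ones are consistent with
# segment lengths of 1.0 and angles (0, pi/2).
joint_proposals = [
    [((1.0, 0.0), 0.9), ((0.0, 1.0), 0.1)],
    [((1.0, 1.0), 0.8), ((2.0, 0.0), 0.2)],
]
angles, cost = fit_chain(joint_proposals, lengths=(1.0, 1.0))
```

A practical system would replace the grid search with a proper optimizer over a full 3D hand model; the grid is used here only to keep the example short and deterministic.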
