Viktor Varga

Fast Interactive Video Object Segmentation with Graph Neural Networks

Mar 05, 2021

Viktor Varga, András Lőrincz

Abstract: Pixelwise annotation of image sequences can be very tedious for humans. Interactive video object segmentation aims to utilize automatic methods to speed up the process and reduce the workload of the annotators. Most contemporary approaches rely on deep convolutional networks to collect and process information from human annotations throughout the video. However, such networks contain millions of parameters and need huge amounts of labeled training data to avoid overfitting. Beyond that, label propagation is usually executed as a series of frame-by-frame inference steps, which is difficult to parallelize and thus time-consuming. In this paper we present an approach based on graph neural networks for tackling the problem of interactive video object segmentation. Our network operates on superpixel graphs, which allows us to reduce the dimensionality of the problem by several orders of magnitude. We show that our network, with only a few thousand parameters, achieves state-of-the-art performance, while inference remains fast and the network can be trained quickly with very little data.
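
To make the idea of a superpixel graph concrete, the following is a minimal sketch, not the authors' implementation. It assumes a superpixel label map is already available (for example from an off-the-shelf oversegmentation method) and turns it into graph nodes with a mean-intensity feature and edges between touching superpixels. The function name `superpixel_graph`, the single intensity feature, and the 4-connectivity rule are all illustrative assumptions.

```python
def superpixel_graph(labels, intensity):
    """Build a region adjacency graph from a superpixel label map.

    labels:    2D list of ints, the superpixel id of every pixel (assumed given).
    intensity: 2D list of floats with the same shape (hypothetical node feature).
    Returns (features, edges): mean intensity per superpixel, and a sorted list
    of undirected edges between superpixels that share a pixel border.
    """
    sums, counts, edges = {}, {}, set()
    height, width = len(labels), len(labels[0])
    for y in range(height):
        for x in range(width):
            a = labels[y][x]
            sums[a] = sums.get(a, 0.0) + intensity[y][x]
            counts[a] = counts.get(a, 0) + 1
            # 4-connectivity: compare with the right and the lower neighbour.
            for ny, nx in ((y, x + 1), (y + 1, x)):
                if ny < height and nx < width:
                    b = labels[ny][nx]
                    if a != b:
                        edges.add((min(a, b), max(a, b)))
    features = {k: sums[k] / counts[k] for k in sums}
    return features, sorted(edges)


# Toy 4x4 frame split into four 2x2 superpixels: 16 pixels become 4 nodes.
labels = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [2, 2, 3, 3],
    [2, 2, 3, 3],
]
intensity = [[10.0 * v for v in row] for row in labels]
features, edges = superpixel_graph(labels, intensity)
```

On real frames the node count is far smaller than the pixel count, which is the kind of dimensionality reduction the abstract refers to. A graph neural network would then propagate annotation labels over `edges` rather than over the pixel grid.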

3D Human Pose Estimation with Siamese Equivariant Embedding

Sep 19, 2018

Márton Véges, Viktor Varga, András Lőrincz

Abstract: In monocular 3D human pose estimation, a common setup is to first detect 2D positions and then lift the detections into 3D coordinates. Many algorithms suffer from overfitting to camera positions in the training set. We propose a siamese architecture that learns a rotation-equivariant hidden representation to reduce the need for data augmentation. Our method is evaluated on multiple databases with different base networks and shows a consistent improvement in error metrics. It achieves a state-of-the-art cross-camera error rate among algorithms that use only estimated 2D joint coordinates.

* 15 pages, 4 figures, submitted to Neurocomputing
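
The rotation-equivariance property mentioned in the abstract can be illustrated with a small sketch. This is not the authors' network or loss. It assumes that two cameras differ by a known rotation about the vertical axis, and it uses a hypothetical stand-in embedding (`toy_embedding`, which centers a pose on its first joint) to show what a siamese consistency term could check: rotating the embedding from one view should reproduce the embedding from the other view.

```python
import math


def rotate_y(points, angle):
    """Rotate 3D points about the vertical (y) axis by `angle` radians."""
    c, s = math.cos(angle), math.sin(angle)
    return [(c * x + s * z, y, -s * x + c * z) for x, y, z in points]


def toy_embedding(pose):
    """Hypothetical stand-in for a learned embedding: center on the first joint.

    Centering commutes with rotation, so this map is rotation equivariant.
    """
    rx, ry, rz = pose[0]
    return [(x - rx, y - ry, z - rz) for x, y, z in pose]


def equivariance_loss(emb_a, emb_b, angle):
    """Mean squared distance per joint between rotated emb_a and emb_b."""
    rotated = rotate_y(emb_a, angle)
    total = sum((p - q) ** 2 for u, v in zip(rotated, emb_b) for p, q in zip(u, v))
    return total / len(emb_a)


# The same three-joint pose seen from two cameras that differ by 60 degrees.
angle = math.pi / 3
pose_cam_a = [(0.0, 1.0, 3.0), (0.2, 1.5, 3.1), (-0.2, 0.5, 2.9)]
pose_cam_b = rotate_y(pose_cam_a, angle)

loss_matched = equivariance_loss(
    toy_embedding(pose_cam_a), toy_embedding(pose_cam_b), angle
)
loss_mismatched = equivariance_loss(
    toy_embedding(pose_cam_a), toy_embedding(pose_cam_b), 0.0
)
```

With the correct relative rotation the loss is zero up to floating-point error. With a wrong rotation it is clearly positive, which is the signal a siamese training setup could use to push a learned representation toward equivariance.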
