Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Roman Kaskman

Learn to Predict Sets Using Feed-Forward Neural Networks

Jan 30, 2020

Hamid Rezatofighi, Roman Kaskman, Farbod T. Motlagh, Qinfeng Shi, Anton Milan, Daniel Cremers, Laura Leal-Taixé, Ian Reid

Figure 1 for Learn to Predict Sets Using Feed-Forward Neural Networks

Figure 2 for Learn to Predict Sets Using Feed-Forward Neural Networks

Figure 3 for Learn to Predict Sets Using Feed-Forward Neural Networks

Figure 4 for Learn to Predict Sets Using Feed-Forward Neural Networks

Abstract:This paper addresses the task of set prediction using deep feed-forward neural networks. A set is a collection of elements which is invariant under permutation and the size of a set is not fixed in advance. Many real-world problems, such as image tagging and object detection, have outputs that are naturally expressed as sets of entities. This creates a challenge for traditional deep neural networks which naturally deal with structured outputs such as vectors, matrices or tensors. We present a novel approach for learning to predict sets with unknown permutation and cardinality using deep neural networks. In our formulation we define a likelihood for a set distribution represented by a) two discrete distributions defining the set cardinally and permutation variables, and b) a joint distribution over set elements with a fixed cardinality. Depending on the problem under consideration, we define different training models for set prediction using deep neural networks. We demonstrate the validity of our set formulations on relevant vision problems such as: 1)multi-label image classification where we achieve state-of-the-art performance on the PASCAL VOC and MS COCO datasets, 2) object detection, for which our formulation outperforms state-of-the-art detectors such as Faster R-CNN and YOLO v3, and 3) a complex CAPTCHA test, where we observe that, surprisingly, our set-based network acquired the ability of mimicking arithmetics without any rules being coded.

* arXiv admin note: substantial text overlap with arXiv:1805.00613

Via

Access Paper or Ask Questions

HomebrewedDB: RGB-D Dataset for 6D Pose Estimation of 3D Objects

Apr 05, 2019

Roman Kaskman, Sergey Zakharov, Ivan Shugurov, Slobodan Ilic

Figure 1 for HomebrewedDB: RGB-D Dataset for 6D Pose Estimation of 3D Objects

Figure 2 for HomebrewedDB: RGB-D Dataset for 6D Pose Estimation of 3D Objects

Figure 3 for HomebrewedDB: RGB-D Dataset for 6D Pose Estimation of 3D Objects

Figure 4 for HomebrewedDB: RGB-D Dataset for 6D Pose Estimation of 3D Objects

Abstract:One of the most important prerequisites for creating and evaluating 6D object pose detectors are datasets with labeled 6D poses. In the advent of deep learning methods, demand for such datasets is consinuously arising. Despite the fact that some of those exist, they are scarce and typically have restricted setups, e.g. a single object per sequence, or focus on specific object types, such as textureless industrial parts. Besides, two significant components are often ignored: training only from available 3D models instead of real data and scalability, i.e. training one method to detect all objects rather than training one detector per object. Other challenges, such as occlusions, changing light conditions and object appearance changes, as well as precisely defined benchmarks are either not present or scattered among different datasets. In this paper we present dataset for 6D pose estimation that covers the above-mentioned challenges, mainly targeting training from 3D models (both textured and textureless), scalability, occlusions, light and object appearance changes. The dataset features 33 objects (17 toy, 8 household and 8 industry-relevant objects) over 13 scenes of various difficulty. Moreover, we present a set benchmarks with the purpose of testing various desired properties of the detectors, particularly focusing on scalability with respect to the number of objects, resistance to changing light conditions, occlusions and clutter. We also set a baseline for the presented benchmarks using a publicly available state of the art detector. Considering difficulties in making such datasets, we plan to release the code allowing other researchers to extend this dataset or make their own datasets in the future.

Via

Access Paper or Ask Questions

Deep Perm-Set Net: Learn to predict sets with unknown permutation and cardinality using deep neural networks

Oct 02, 2018

S. Hamid Rezatofighi, Roman Kaskman, Farbod T. Motlagh, Qinfeng Shi, Daniel Cremers, Laura Leal-Taixé, Ian Reid

Figure 1 for Deep Perm-Set Net: Learn to predict sets with unknown permutation and cardinality using deep neural networks

Figure 2 for Deep Perm-Set Net: Learn to predict sets with unknown permutation and cardinality using deep neural networks

Figure 3 for Deep Perm-Set Net: Learn to predict sets with unknown permutation and cardinality using deep neural networks

Figure 4 for Deep Perm-Set Net: Learn to predict sets with unknown permutation and cardinality using deep neural networks

Abstract:Many real-world problems, e.g. object detection, have outputs that are naturally expressed as sets of entities. This creates a challenge for traditional deep neural networks which naturally deal with structured outputs such as vectors, matrices or tensors. We present a novel approach for learning to predict sets with unknown permutation and cardinality using deep neural networks. Specifically, in our formulation we incorporate the permutation as unobservable variable and estimate its distribution during the learning process using alternating optimization. We demonstrate the validity of this new formulation on two relevant vision problems: object detection, for which our formulation outperforms state-of-the-art detectors such as Faster R-CNN and YOLO, and a complex CAPTCHA test, where we observe that, surprisingly, our set based network acquired the ability of mimicking arithmetics without any rules being coded.

Via

Access Paper or Ask Questions