Slawomir Bak

INRIA Sophia Antipolis

Argoverse: 3D Tracking and Forecasting with Rich Maps

Nov 06, 2019

Ming-Fang Chang, John Lambert, Patsorn Sangkloy, Jagjeet Singh, Slawomir Bak, Andrew Hartnett, De Wang, Peter Carr, Simon Lucey, Deva Ramanan(+1 more)

Abstract: We present Argoverse -- two datasets designed to support autonomous vehicle machine learning tasks such as 3D tracking and motion forecasting. Argoverse was collected by a fleet of autonomous vehicles in Pittsburgh and Miami. The Argoverse 3D Tracking dataset includes 360-degree images from 7 cameras with overlapping fields of view, 3D point clouds from long-range LiDAR, 6-DOF pose, and 3D track annotations. Notably, it is the only modern AV dataset that provides forward-facing stereo imagery. The Argoverse Motion Forecasting dataset includes more than 300,000 5-second tracked scenarios with a particular vehicle identified for trajectory forecasting. Argoverse is the first autonomous vehicle dataset to include "HD maps" with 290 km of mapped lanes with geometric and semantic metadata. All data is released under a Creative Commons license at www.argoverse.org. In our baseline experiments, we illustrate how detailed map information such as lane direction, driveable area, and ground height improves the accuracy of 3D object tracking and motion forecasting. Our tracking and forecasting experiments represent only an initial exploration of the use of rich maps in robotic perception. We hope that Argoverse will enable the research community to explore these problems in greater depth.

* CVPR 2019

Domain Adaptation through Synthesis for Unsupervised Person Re-identification

Apr 26, 2018

Slawomir Bak, Peter Carr, Jean-Francois Lalonde

Abstract: Drastic variations in illumination across surveillance cameras make the person re-identification problem extremely challenging. Current large-scale re-identification datasets have a significant number of training subjects, but lack diversity in lighting conditions. As a result, a trained model requires fine-tuning to become effective under an unseen illumination condition. To alleviate this problem, we introduce a new synthetic dataset that contains hundreds of illumination conditions. Specifically, we use 100 virtual humans illuminated with multiple HDR environment maps which accurately model realistic indoor and outdoor lighting. To achieve better accuracy in unseen illumination conditions, we propose a novel domain adaptation technique that takes advantage of our synthetic data and performs fine-tuning in a completely unsupervised way. Our approach yields significantly higher accuracy than semi-supervised and unsupervised state-of-the-art methods, and is very competitive with supervised techniques.

Diversity Regularized Spatiotemporal Attention for Video-based Person Re-identification

Mar 27, 2018

Shuang Li, Slawomir Bak, Peter Carr, Xiaogang Wang

Abstract: Video-based person re-identification matches video clips of people across non-overlapping cameras. Most existing methods tackle this problem by encoding each video frame in its entirety and computing an aggregate representation across all frames. In practice, people are often partially occluded, which can corrupt the extracted features. Instead, we propose a new spatiotemporal attention model that automatically discovers a diverse set of distinctive body parts. This allows useful information to be extracted from all frames without succumbing to occlusions and misalignments. The network learns multiple spatial attention models and employs a diversity regularization term to ensure multiple models do not discover the same body part. Features extracted from local image regions are organized by spatial attention model and are combined using temporal attention. As a result, the network learns latent representations of the face, torso and other body parts using the best available image patches from the entire video sequence. Extensive evaluations on three datasets show that our framework outperforms the state-of-the-art approaches by large margins on multiple metrics.
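
The abstract above does not give the exact form of the diversity regularization term, so the following is only a minimal sketch of one plausible choice, not the authors' implementation. It assumes each spatial attention model outputs a probability distribution over image locations and penalizes pairs of models whose distributions overlap. The function name `diversity_penalty` and the toy attention maps are hypothetical.

```python
import math


def diversity_penalty(attn):
    """Overlap penalty for K attention maps (an assumed form, for illustration).

    attn: list of K rows; each row is a distribution over L locations
          (non-negative weights summing to 1).
    Returns the sum over all pairs (i, j) of (overlap_ij - target_ij)^2, where
    overlap_ij = sum_l sqrt(attn[i][l] * attn[j][l]) and the target is 1 for
    i == j and 0 otherwise. The value is near 0 when the maps attend to
    disjoint locations and grows as they attend to the same ones.
    """
    k = len(attn)
    roots = [[math.sqrt(w) for w in row] for row in attn]
    penalty = 0.0
    for i in range(k):
        for j in range(k):
            overlap = sum(a * b for a, b in zip(roots[i], roots[j]))
            target = 1.0 if i == j else 0.0
            penalty += (overlap - target) ** 2
    return penalty


# Two toy attention maps over four locations (hypothetical data).
disjoint = [[1.0, 0.0, 0.0, 0.0], [0.0, 0.0, 1.0, 0.0]]
identical = [[0.5, 0.5, 0.0, 0.0], [0.5, 0.5, 0.0, 0.0]]

print(diversity_penalty(disjoint))
print(diversity_penalty(identical))
```

Added to a training loss with a positive weight, a term of this kind pushes the attention models toward different regions, which is the behavior the abstract describes.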

Automatic Tracker Selection w.r.t Object Detection Performance

Apr 08, 2014

Duc Phu Chau, François Bremond, Monique Thonnat, Slawomir Bak

Abstract: The performance of a tracking algorithm depends on video content. This paper presents a new multi-object tracking approach that copes with variations in video content. First, object detection is improved using Kanade-Lucas-Tomasi (KLT) feature tracking. Second, for each mobile object, an appropriate tracker is selected between a KLT-based tracker and a discriminative appearance-based tracker. This selection is supported by an online tracking evaluation. The approach was evaluated on three public video datasets. The experimental results show that the proposed approach performs better than recent state-of-the-art trackers.

* IEEE Winter Conference on Applications of Computer Vision (WACV 2014) (2014)
