Idoia Ruiz

Weakly Supervised Multi-Object Tracking and Segmentation

Jan 03, 2021

Idoia Ruiz, Lorenzo Porzi, Samuel Rota Bulò, Peter Kontschieder, Joan Serrat

Abstract: We introduce the problem of weakly supervised Multi-Object Tracking and Segmentation, i.e., joint weakly supervised instance segmentation and multi-object tracking, in which no mask annotations of any kind are provided. To address it, we design a novel synergistic training strategy that takes advantage of multi-task learning: the classification and tracking tasks guide the training of the unsupervised instance segmentation. To that end, we extract weak foreground localization information from Grad-CAM heatmaps to generate a partial ground truth to learn from. Additionally, RGB image-level information is used to refine the mask prediction at object edges. We evaluate our method on KITTI MOTS, the most representative benchmark for this task, reducing the performance gap on the MOTSP metric between the fully supervised and weakly supervised approaches to just 12% and 12.7% for cars and pedestrians, respectively.

* Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV) 2021
* Accepted at Autonomous Vehicle Vision WACV 2021 Workshop
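
As a rough illustration of how Grad-CAM heatmaps could be turned into a partial ground truth, the sketch below computes the standard Grad-CAM map (channel weights from spatially averaged gradients, followed by a ReLU) and then thresholds it into foreground, background, and ignored pixels. It is not the authors' pipeline: the function names, the two thresholds, and the three-way labelling are assumptions made for this example.

```python
def grad_cam(activations, gradients):
    """Standard Grad-CAM on plain nested lists.

    activations, gradients: K feature maps, each an H x W list of lists.
    gradients[k][i][j] is the gradient of the class score with respect to
    activations[k][i][j].
    """
    k = len(activations)
    h, w = len(activations[0]), len(activations[0][0])
    # Channel weight = spatial average of that channel's gradients.
    weights = [sum(sum(row) for row in g) / (h * w) for g in gradients]
    # Weighted sum over channels, then ReLU to keep positive evidence only.
    return [
        [max(0.0, sum(weights[c] * activations[c][i][j] for c in range(k)))
         for j in range(w)]
        for i in range(h)
    ]


def partial_labels(heat, fg_thresh=0.6, bg_thresh=0.1):
    """Hypothetical labelling rule: 1 = foreground, 0 = background, -1 = ignore.

    The heatmap is normalised by its peak; both thresholds are illustrative.
    """
    peak = max(max(row) for row in heat)
    labels = []
    for row in heat:
        out = []
        for value in row:
            score = value / peak if peak > 0 else 0.0
            if score >= fg_thresh:
                out.append(1)
            elif score <= bg_thresh:
                out.append(0)
            else:
                out.append(-1)  # uncertain pixel, left unlabelled
        labels.append(out)
    return labels
```

Pixels labelled -1 would simply be excluded from the segmentation loss, which is one way a "partial" ground truth can be used during training.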

Learning Multi-Object Tracking and Segmentation from Automatic Annotations

Dec 04, 2019

Lorenzo Porzi, Markus Hofinger, Idoia Ruiz, Joan Serrat, Samuel Rota Bulò, Peter Kontschieder

Abstract: In this work we contribute a novel pipeline that automatically generates training data and improves over state-of-the-art multi-object tracking and segmentation (MOTS) methods. Our proposed tracklet mining algorithm turns raw street-level videos into high-fidelity MOTS training data; it is scalable and removes the need for expensive and time-consuming manual annotation. We leverage state-of-the-art instance segmentation results in combination with optical flow obtained from models that are also trained on automatically harvested data. Our second major contribution is MOTSNet, a deep learning, tracking-by-detection architecture for MOTS that deploys a novel mask-pooling layer for improved object association over time. Training MOTSNet with our automatically extracted data leads to significantly improved sMOTSA scores on the novel KITTI MOTS dataset (+1.9%/+7.5% on cars/pedestrians). Even without learning from a single manually annotated MOTS training example, we still improve over the prior state of the art, confirming the compelling properties of our pipeline. On the MOTSChallenge dataset we improve by +4.1%, further confirming the efficacy of our proposed MOTSNet.
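
The abstract does not spell out the mask-pooling layer, so the following is only a minimal sketch under the assumption that mask pooling averages feature vectors over the pixels inside an instance mask (rather than over a whole bounding box). The function name `mask_pool` and the data layout are hypothetical.

```python
def mask_pool(features, mask):
    """Average the C-dimensional feature vectors at pixels where mask is 1.

    features: H x W x C nested lists; mask: H x W nested lists of 0/1.
    Returns a zero vector of length C when the mask is empty.
    """
    channels = len(features[0][0])
    total = [0.0] * channels
    count = 0
    for feature_row, mask_row in zip(features, mask):
        for vector, inside in zip(feature_row, mask_row):
            if inside:
                count += 1
                for ch in range(channels):
                    total[ch] += vector[ch]
    if count == 0:
        return total
    return [value / count for value in total]
```

Embeddings pooled this way ignore background pixels inside the box, which is plausibly why such a layer helps when associating the same object across frames.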

Optimizing Speed/Accuracy Trade-Off for Person Re-identification via Knowledge Distillation

Dec 07, 2018

Idoia Ruiz, Bogdan Raducanu, Rakesh Mehta, Jaume Amores

Abstract: Finding a person across a camera network plays an important role in video surveillance. For a real-world person re-identification application to guarantee an adequate response time, it is crucial to balance accuracy and speed. We analyse this trade-off by comparing a classical method, which combines hand-crafted feature description with metric learning (specifically LOMO and XQDA), against state-of-the-art deep learning techniques built on the image classification networks ResNet and MobileNets. Additionally, we propose and analyse network distillation as a learning strategy to reduce the computational cost of the deep learning approach at test time. We evaluate both methods on the large-scale Market-1501 and DukeMTMC-reID datasets.
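
For readers unfamiliar with network distillation, here is a small sketch of the widely used temperature-based formulation, in which a compact student is trained to match the softened outputs of a larger teacher while also fitting the true label. The abstract does not state which loss the authors use, so the temperature, the weighting `alpha`, and the function names are assumptions for this example.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax with an optional temperature."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(z - peak) for z in scaled]
    norm = sum(exps)
    return [e / norm for e in exps]


def distillation_loss(student_logits, teacher_logits, label,
                      temperature=4.0, alpha=0.5):
    """Weighted sum of a soft-target term and a hard-label term.

    soft: cross-entropy between softened teacher and student distributions,
          scaled by temperature**2 to keep gradient magnitudes comparable.
    hard: ordinary cross-entropy of the student against the true label.
    """
    teacher_probs = softmax(teacher_logits, temperature)
    student_probs = softmax(student_logits, temperature)
    soft = -sum(t * math.log(s) for t, s in zip(teacher_probs, student_probs))
    hard = -math.log(softmax(student_logits)[label])
    return alpha * temperature ** 2 * soft + (1.0 - alpha) * hard
```

At test time only the student is run, which is where the reduction in computational cost comes from.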

Metric Learning for Novelty and Anomaly Detection

Aug 16, 2018

Marc Masana, Idoia Ruiz, Joan Serrat, Joost van de Weijer, Antonio M. Lopez

Abstract: When neural networks process images that do not resemble the distribution seen during training, so-called out-of-distribution images, they often make wrong predictions, and do so too confidently. The capability to detect out-of-distribution images is therefore crucial for many real-world applications. We divide out-of-distribution detection into novelty detection (images of classes that are not in the training set but are related to those that are) and anomaly detection (images of classes that are unrelated to the training set). By related we mean that they contain the same type of objects, like digits in MNIST and SVHN. Most existing work has focused on anomaly detection and has addressed the problem with networks trained using the cross-entropy loss. In contrast, we propose to use metric learning, which avoids the drawback of the softmax layer (inherent to cross-entropy methods) of forcing the network to divide its prediction power over the learned classes. We perform extensive experiments and evaluate both novelty and anomaly detection, including in a relevant application such as traffic sign recognition, obtaining results comparable to or better than previous work.

* Accepted at BMVC 2018, 10 pages main article and 4 pages supplementary material
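
To make the metric-learning idea concrete, the sketch below pairs a classic contrastive loss with a simple distance-based out-of-distribution score. Both are illustrative assumptions rather than the paper's exact formulation: the abstract names metric learning but not the loss, and scoring by distance to the nearest class centroid is one common choice among several.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two embedding vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def contrastive_loss(a, b, same_class, margin=1.0):
    """Pull same-class pairs together; push other pairs at least `margin` apart."""
    distance = euclidean(a, b)
    if same_class:
        return 0.5 * distance ** 2
    return 0.5 * max(0.0, margin - distance) ** 2


def ood_score(embedding, class_centroids):
    """Distance to the nearest class centroid; larger means more likely OOD."""
    return min(euclidean(embedding, centroid) for centroid in class_centroids)
```

Because the score is a distance in embedding space rather than a softmax probability, it is not forced to sum to one over the known classes, which is the drawback of cross-entropy methods that the abstract points to.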
