Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Mohammed E. Fathy

TDT: Teaching Detectors to Track without Fully Annotated Videos

May 11, 2022

Shuzhi Yu, Guanhang Wu, Chunhui Gu, Mohammed E. Fathy

Figure 1 for TDT: Teaching Detectors to Track without Fully Annotated Videos

Figure 2 for TDT: Teaching Detectors to Track without Fully Annotated Videos

Figure 3 for TDT: Teaching Detectors to Track without Fully Annotated Videos

Figure 4 for TDT: Teaching Detectors to Track without Fully Annotated Videos

Abstract:Recently, one-stage trackers that use a joint model to predict both detections and appearance embeddings in one forward pass received much attention and achieved state-of-the-art results on the Multi-Object Tracking (MOT) benchmarks. However, their success depends on the availability of videos that are fully annotated with tracking data, which is expensive and hard to obtain. This can limit the model generalization. In comparison, the two-stage approach, which performs detection and embedding separately, is slower but easier to train as their data are easier to annotate. We propose to combine the best of the two worlds through a data distillation approach. Specifically, we use a teacher embedder, trained on Re-ID datasets, to generate pseudo appearance embedding labels for the detection datasets. Then, we use the augmented dataset to train a detector that is also capable of regressing these pseudo-embeddings in a fully-convolutional fashion. Our proposed one-stage solution matches the two-stage counterpart in quality but is 3 times faster. Even though the teacher embedder has not seen any tracking data during training, our proposed tracker achieves competitive performance with some popular trackers (e.g. JDE) trained with fully labeled tracking data.

* Workshop on Learning with Limited Labelled Data for Image and Video Understanding (L3D-IVU), CVPR2022 Workshop

Via

Access Paper or Ask Questions

Hierarchical Metric Learning and Matching for 2D and 3D Geometric Correspondences

Aug 02, 2018

Mohammed E. Fathy, Quoc-Huy Tran, M. Zeeshan Zia, Paul Vernaza, Manmohan Chandraker

Figure 1 for Hierarchical Metric Learning and Matching for 2D and 3D Geometric Correspondences

Figure 2 for Hierarchical Metric Learning and Matching for 2D and 3D Geometric Correspondences

Figure 3 for Hierarchical Metric Learning and Matching for 2D and 3D Geometric Correspondences

Figure 4 for Hierarchical Metric Learning and Matching for 2D and 3D Geometric Correspondences

Abstract:Interest point descriptors have fueled progress on almost every problem in computer vision. Recent advances in deep neural networks have enabled task-specific learned descriptors that outperform hand-crafted descriptors on many problems. We demonstrate that commonly used metric learning approaches do not optimally leverage the feature hierarchies learned in a Convolutional Neural Network (CNN), especially when applied to the task of geometric feature matching. While a metric loss applied to the deepest layer of a CNN, is often expected to yield ideal features irrespective of the task, in fact the growing receptive field as well as striding effects cause shallower features to be better at high precision matching tasks. We leverage this insight together with explicit supervision at multiple levels of the feature hierarchy for better regularization, to learn more effective descriptors in the context of geometric matching tasks. Further, we propose to use activation maps at different layers of a CNN, as an effective and principled replacement for the multi-resolution image pyramids often used for matching tasks. We propose concrete CNN architectures employing these ideas, and evaluate them on multiple datasets for 2D and 3D geometric matching as well as optical flow, demonstrating state-of-the-art results and generalization across datasets.

* Accepted to ECCV 2018

Via

Access Paper or Ask Questions

Fundamental Matrix Estimation: A Study of Error Criteria

Jun 24, 2017

Mohammed E. Fathy, Ashraf S. Hussein, Mohammed F. Tolba

Figure 1 for Fundamental Matrix Estimation: A Study of Error Criteria

Figure 2 for Fundamental Matrix Estimation: A Study of Error Criteria

Figure 3 for Fundamental Matrix Estimation: A Study of Error Criteria

Figure 4 for Fundamental Matrix Estimation: A Study of Error Criteria

Abstract:The fundamental matrix (FM) describes the geometric relations that exist between two images of the same scene. Different error criteria are used for estimating FMs from an input set of correspondences. In this paper, the accuracy and efficiency aspects of the different error criteria were studied. We mathematically and experimentally proved that the most popular error criterion, the symmetric epipolar distance, is biased. It was also shown that despite the similarity between the algebraic expressions of the symmetric epipolar distance and Sampson distance, they have different accuracy properties. In addition, a new error criterion, Kanatani distance, was proposed and was proved to be the most effective for use during the outlier removal phase from accuracy and efficiency perspectives. To thoroughly test the accuracy of the different error criteria, we proposed a randomized algorithm for Reprojection Error-based Correspondence Generation (RE-CG). As input, RE-CG takes an FM and a desired reprojection error value $d$. As output, RE-CG generates a random correspondence having that error value. Mathematical analysis of this algorithm revealed that the success probability for any given trial is 1 - (2/3)^2 at best and is 1 - (6/7)^2 at worst while experiments demonstrated that the algorithm often succeeds after only one trial.

* 15 pages, 7 figures, Pattern Recognition Letters, 2011

Via

Access Paper or Ask Questions