Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Naoki Kato

Gr-IoU: Ground-Intersection over Union for Robust Multi-Object Tracking with 3D Geometric Constraints

Sep 05, 2024

Keisuke Toida, Naoki Kato, Osamu Segawa, Takeshi Nakamura, Kazuhiro Hotta

Figure 1 for Gr-IoU: Ground-Intersection over Union for Robust Multi-Object Tracking with 3D Geometric Constraints

Figure 2 for Gr-IoU: Ground-Intersection over Union for Robust Multi-Object Tracking with 3D Geometric Constraints

Figure 3 for Gr-IoU: Ground-Intersection over Union for Robust Multi-Object Tracking with 3D Geometric Constraints

Figure 4 for Gr-IoU: Ground-Intersection over Union for Robust Multi-Object Tracking with 3D Geometric Constraints

Abstract:We propose a Ground IoU (Gr-IoU) to address the data association problem in multi-object tracking. When tracking objects detected by a camera, it often occurs that the same object is assigned different IDs in consecutive frames, especially when objects are close to each other or overlapping. To address this issue, we introduce Gr-IoU, which takes into account the 3D structure of the scene. Gr-IoU transforms traditional bounding boxes from the image space to the ground plane using the vanishing point geometry. The IoU calculated with these transformed bounding boxes is more sensitive to the front-to-back relationships of objects, thereby improving data association accuracy and reducing ID switches. We evaluated our Gr-IoU method on the MOT17 and MOT20 datasets, which contain diverse tracking scenarios including crowded scenes and sequences with frequent occlusions. Experimental results demonstrated that Gr-IoU outperforms conventional real-time methods without appearance features.

* Accepted for the ECCV 2024 Workshop on Affective Behavior Analysis in-the-wild(ABAW)

Via

Access Paper or Ask Questions

Leveraging Temporal Joint Depths for Improving 3D Human Pose Estimation in Video

Nov 04, 2020

Naoki Kato, Hiroto Honda, Yusuke Uchida

Figure 1 for Leveraging Temporal Joint Depths for Improving 3D Human Pose Estimation in Video

Figure 2 for Leveraging Temporal Joint Depths for Improving 3D Human Pose Estimation in Video

Figure 3 for Leveraging Temporal Joint Depths for Improving 3D Human Pose Estimation in Video

Abstract:The effectiveness of the approaches to predict 3D poses from 2D poses estimated in each frame of a video has been demonstrated for 3D human pose estimation. However, 2D poses without appearance information of persons have much ambiguity with respect to the joint depths. In this paper, we propose to estimate a 3D pose in each frame of a video and refine it considering temporal information. The proposed approach reduces the ambiguity of the joint depths and improves the 3D pose estimation accuracy.

Via

Access Paper or Ask Questions

Improving Multi-Person Pose Estimation using Label Correction

Nov 08, 2018

Naoki Kato, Tianqi Li, Kohei Nishino, Yusuke Uchida

Figure 1 for Improving Multi-Person Pose Estimation using Label Correction

Figure 2 for Improving Multi-Person Pose Estimation using Label Correction

Figure 3 for Improving Multi-Person Pose Estimation using Label Correction

Figure 4 for Improving Multi-Person Pose Estimation using Label Correction

Abstract:Significant attention is being paid to multi-person pose estimation methods recently, as there has been rapid progress in the field owing to convolutional neural networks. Especially, recent method which exploits part confidence maps and Part Affinity Fields (PAFs) has achieved accurate real-time prediction of multi-person keypoints. However, human annotated labels are sometimes inappropriate for learning models. For example, if there is a limb that extends outside an image, a keypoint for the limb may not have annotations because it exists outside of the image, and thus the labels for the limb can not be generated. If a model is trained with data including such missing labels, the output of the model for the location, even though it is correct, is penalized as a false positive, which is likely to cause negative effects on the performance of the model. In this paper, we point out the existence of some patterns of inappropriate labels, and propose a novel method for correcting such labels with a teacher model trained on such incomplete data. Experiments on the COCO dataset show that training with the corrected labels improves the performance of the model and also speeds up training.

Via

Access Paper or Ask Questions