Naga Venkata Sai Raviteja Chappa

LiGAR: LiDAR-Guided Hierarchical Transformer for Multi-Modal Group Activity Recognition

Oct 28, 2024

Naga Venkata Sai Raviteja Chappa, Khoa Luu

Abstract: Group Activity Recognition (GAR) remains challenging in computer vision due to the complex nature of multi-agent interactions. This paper introduces LiGAR, a LiDAR-Guided Hierarchical Transformer for Multi-Modal Group Activity Recognition. LiGAR leverages LiDAR data as a structural backbone to guide the processing of visual and textual information, enabling robust handling of occlusions and complex spatial arrangements. Our framework incorporates a Multi-Scale LiDAR Transformer, Cross-Modal Guided Attention, and an Adaptive Fusion Module to effectively integrate multi-modal data at different semantic levels. LiGAR's hierarchical architecture captures group activities at various granularities, from individual actions to scene-level dynamics. Extensive experiments on the JRDB-PAR, Volleyball, and NBA datasets demonstrate LiGAR's superior performance, achieving state-of-the-art results with improvements of up to 10.6% in F1-score on JRDB-PAR and 5.9% in Mean Per Class Accuracy on the NBA dataset. Notably, LiGAR maintains high performance even when LiDAR data is unavailable during inference, showcasing its adaptability. Our ablation studies highlight the significant contribution of each component and the effectiveness of our multi-modal, multi-scale approach in advancing group activity recognition.

* 14 pages, 4 figures, 10 tables

OTAdapt: Optimal Transport-based Approach For Unsupervised Domain Adaptation

May 22, 2022

Thanh-Dat Truong, Naga Venkata Sai Raviteja Chappa, Xuan Bac Nguyen, Ngan Le, Ashley Dowling, Khoa Luu

Abstract: Unsupervised domain adaptation is one of the challenging problems in computer vision. This paper presents a novel approach to unsupervised domain adaptation based on an optimal transport-based distance. Our approach aligns the target and source domains without requiring meaningful metrics across domains. In addition, the proposed method can associate the correct mapping between source and target domains and guarantee a topological constraint between them. The method is evaluated on different datasets across various problems: (i) digit recognition on the MNIST, MNIST-M, and USPS datasets, (ii) object recognition on the Amazon, Webcam, DSLR, and VisDA datasets, and (iii) insect recognition on the IP102 dataset. The experimental results show that our proposed method consistently improves accuracy. Our framework can also be incorporated into other CNN frameworks within an end-to-end deep network design for recognition problems to improve their performance.

* Accepted to ICPR 2022
