Rohit Bharadwaj

VANE-Bench: Video Anomaly Evaluation Benchmark for Conversational LMMs

Jun 14, 2024

Rohit Bharadwaj, Hanan Gani, Muzammal Naseer, Fahad Shahbaz Khan, Salman Khan

Abstract: The recent developments in Large Multi-modal Video Models (Video-LMMs) have significantly enhanced our ability to interpret and analyze video data. Despite their impressive capabilities, current Video-LMMs have not been evaluated for anomaly detection tasks, which is critical to their deployment in practical scenarios, e.g., identifying deepfakes, manipulated video content, traffic accidents, and crimes. In this paper, we introduce VANE-Bench, a benchmark designed to assess the proficiency of Video-LMMs in detecting and localizing anomalies and inconsistencies in videos. Our dataset comprises an array of videos synthetically generated using existing state-of-the-art text-to-video generation models, encompassing a variety of subtle anomalies and inconsistencies grouped into five categories: unnatural transformations, unnatural appearance, pass-through, disappearance, and sudden appearance. Additionally, our benchmark features real-world samples from existing anomaly detection datasets, focusing on crime-related irregularities, atypical pedestrian behavior, and unusual events. The task is structured as a visual question-answering challenge to gauge the models' ability to accurately detect and localize the anomalies within the videos. We evaluate nine existing Video-LMMs, both open- and closed-source, on this benchmarking task and find that most of the models encounter difficulties in effectively identifying the subtle anomalies. In conclusion, our research offers significant insights into the current capabilities of Video-LMMs in the realm of anomaly detection, highlighting the importance of our work in evaluating and improving these models for real-world applications. Our code and data are available at https://hananshafi.github.io/vane-benchmark/

* Data: https://huggingface.co/datasets/rohit901/VANE-Bench

Enhancing Novel Object Detection via Cooperative Foundational Models

Nov 22, 2023

Rohit Bharadwaj, Muzammal Naseer, Salman Khan, Fahad Shahbaz Khan

Abstract: In this work, we address the challenging and emergent problem of novel object detection (NOD), focusing on the accurate detection of both known and novel object categories during inference. Traditional object detection algorithms are inherently closed-set, limiting their capability to handle NOD. We present a novel approach to transform existing closed-set detectors into open-set detectors. This transformation is achieved by leveraging the complementary strengths of pre-trained foundational models, specifically CLIP and SAM, through our cooperative mechanism. Furthermore, by integrating this mechanism with state-of-the-art open-set detectors such as GDINO, we establish new benchmarks in object detection performance. Our method achieves 17.42 mAP in novel object detection and 42.08 mAP for known objects on the challenging LVIS dataset. Adapting our approach to the COCO OVD split, we surpass the current state-of-the-art by a margin of 7.2 $\text{AP}_{50}$ for novel classes. Our code is available at https://github.com/rohit901/cooperative-foundational-models

* Code: https://github.com/rohit901/cooperative-foundational-models
