Title: Event Stream based Human Action Recognition: A High-Definition Benchmark Dataset and Algorithms

Aug 19, 2024

Xiao Wang, Shiao Wang, Pengpeng Shao, Bo Jiang, Lin Zhu, Yonghong Tian

Abstract: Human Action Recognition (HAR) is a pivotal research domain in both computer vision and artificial intelligence, with RGB cameras dominating as the preferred sensor in this field. However, in real-world applications, RGB cameras encounter numerous challenges, including lighting conditions, fast motion, and privacy concerns. Consequently, bio-inspired event cameras have garnered increasing attention due to advantages such as low energy consumption and high dynamic range. Nevertheless, most existing event-based HAR datasets are low resolution ($346 \times 260$). In this paper, we propose a large-scale, high-definition ($1280 \times 800$) human action recognition dataset based on the CeleX-V event camera, termed CeleX-HAR. It encompasses 150 commonly occurring action categories, comprising a total of 124,625 video sequences. Various factors such as multiple views, illumination, action speed, and occlusion are considered when recording these data. To build a more comprehensive benchmark, we report results for over 20 mainstream HAR models for future works to compare against. In addition, we propose a novel Mamba vision backbone network for event stream based HAR, termed EVMamba, which is equipped with spatial-plane multi-directional scanning and a novel voxel temporal scanning mechanism. By encoding and mining the spatio-temporal information of event streams, EVMamba achieves favorable results across multiple datasets. Both the dataset and source code will be released at https://github.com/Event-AHU/CeleX-HAR
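The abstract mentions voxel representations and multi-directional scanning without giving details. As a rough illustration only, the sketch below shows one common way an event stream can be binned into a voxel grid, and how a 2D plane might be traversed in several orders. It is not the EVMamba implementation: the event layout `(x, y, t, polarity)`, the function names `events_to_voxel_grid` and `scan_orders`, and the choice of four scan directions are all assumptions made for this example.

```python
from typing import Dict, List, Tuple

# Assumed event layout (hypothetical): (x, y, timestamp, polarity in {-1, +1}).
Event = Tuple[int, int, float, int]


def events_to_voxel_grid(
    events: List[Event], width: int, height: int, num_bins: int
) -> List[List[List[int]]]:
    """Sum event polarities into a grid indexed as [time_bin][y][x]."""
    grid = [[[0] * width for _ in range(height)] for _ in range(num_bins)]
    if not events:
        return grid
    t_min = min(e[2] for e in events)
    t_max = max(e[2] for e in events)
    span = (t_max - t_min) or 1.0  # avoid division by zero for a single timestamp
    for x, y, t, p in events:
        # Map the timestamp to a bin; the latest event falls in the last bin.
        b = min(int((t - t_min) / span * num_bins), num_bins - 1)
        grid[b][y][x] += p
    return grid


def scan_orders(height: int, width: int) -> Dict[str, List[Tuple[int, int]]]:
    """Four illustrative (row, col) traversal orders over a 2D plane."""
    row_major = [(r, c) for r in range(height) for c in range(width)]
    col_major = [(r, c) for c in range(width) for r in range(height)]
    return {
        "row": row_major,
        "row_reversed": row_major[::-1],
        "col": col_major,
        "col_reversed": col_major[::-1],
    }


if __name__ == "__main__":
    demo = [(0, 0, 0.0, 1), (1, 0, 0.5, -1), (1, 1, 1.0, 1)]
    print(events_to_voxel_grid(demo, width=2, height=2, num_bins=2))
    print(scan_orders(2, 2)["col"])
```

A real pipeline for a $1280 \times 800$ sensor would use array libraries rather than nested lists; plain Python is used here only to keep the sketch dependency-free.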

* In Peer Review
