Greg Castanon

Fine-grained Activities of People Worldwide

Jul 11, 2022

Jeffrey Byrne, Greg Castanon, Zhongheng Li, Gil Ettinger

Abstract: Every day, humans perform many closely related activities that involve subtle discriminative motions, such as putting on a shirt vs. putting on a jacket, or shaking hands vs. giving a high five. Activity recognition by ethical visual AI could provide insights into our patterns of daily life; however, existing activity recognition datasets do not capture the massive diversity of these human activities around the world. To address this limitation, we introduce Collector, a free mobile app to record video while simultaneously annotating objects and activities of consented subjects. This new data collection platform was used to curate the Consented Activities of People (CAP) dataset, the first large-scale, fine-grained activity dataset of people worldwide. The CAP dataset contains 1.45M video clips of 512 fine-grained activity labels of daily life, collected by 780 subjects in 33 countries. We provide activity classification and activity detection benchmarks for this dataset, and analyze baseline results to gain insight into how people around the world perform common activities. The dataset, benchmarks, evaluation tools, public leaderboards and mobile apps are available for use at visym.github.io/cap.

Retrieval in Long Surveillance Videos using User Described Motion and Object Attributes

May 01, 2014

Greg Castanon, Mohamed Elgharib, Venkatesh Saligrama, Pierre-Marc Jodoin

Abstract: We present a content-based retrieval method for long surveillance videos, for both wide-area (Airborne) and near-field (CCTV) imagery. Our goal is to retrieve video segments, with a focus on detecting objects moving on routes, that match user-defined events of interest. The sheer size of surveillance videos and the remote locations where they are acquired necessitate highly compressed representations that are also meaningful for supporting user-defined queries. To address these challenges, we archive long surveillance video through lightweight processing based on low-level local spatio-temporal extraction of motion and object features. These are then hashed into an inverted index using locality-sensitive hashing (LSH). This local approach allows for query flexibility and leads to significant gains in compression. Our second task is to extract partial matches to the user-created query and assemble them into full matches using Dynamic Programming (DP). DP exploits causality to assemble the indexed low-level features into a video segment that matches the query route. We examine CCTV and Airborne footage, whose low contrast makes motion extraction more difficult. We generate robust motion estimates for Airborne data using a tracklet generation algorithm, while we use the Horn and Schunck approach to generate motion estimates for CCTV. Our approach handles long routes, low contrast and occlusion. We derive bounds on the rate of false positives and demonstrate the effectiveness of the approach for counting, motion pattern recognition and abandoned object applications.
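As a rough illustration of the indexing step the abstract mentions, the sketch below hashes feature vectors into an inverted index using sign-random-projection LSH. It is not the authors' implementation: the class name `LSHInvertedIndex`, the choice of random-hyperplane hashing, the number of bits, and the toy feature vectors are all assumptions made for the example.

```python
import random
from collections import defaultdict


class LSHInvertedIndex:
    """Toy inverted index keyed by sign-random-projection hashes.

    Hypothetical sketch: the hash family and parameters are assumptions,
    not details taken from the paper.
    """

    def __init__(self, dim, num_bits=8, seed=0):
        rng = random.Random(seed)
        # One random hyperplane (normal vector) per hash bit.
        self.planes = [
            [rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(num_bits)
        ]
        # Maps a hash key to the set of frame ids whose features landed there.
        self.buckets = defaultdict(set)

    def _hash(self, feature):
        # Each bit records which side of a hyperplane the feature falls on.
        return tuple(
            sum(p * x for p, x in zip(plane, feature)) >= 0.0
            for plane in self.planes
        )

    def add(self, feature, frame_id):
        """Store frame_id under the hash bucket of its feature vector."""
        self.buckets[self._hash(feature)].add(frame_id)

    def query(self, feature):
        """Return frame ids whose stored features share the query's bucket."""
        return sorted(self.buckets.get(self._hash(feature), set()))


# Usage: two frames with the same (made-up) motion feature share a bucket.
index = LSHInvertedIndex(dim=3, num_bits=8, seed=0)
index.add([1.0, 0.0, 0.5], frame_id=10)
index.add([1.0, 0.0, 0.5], frame_id=42)
matches = index.query([2.0, 0.0, 1.0])  # same direction, different magnitude
```

Because each bit depends only on the sign of a dot product, a query vector pointing in the same direction as a stored feature lands in the same bucket regardless of its magnitude. Only the bucket keys and frame ids need to be stored, which hints at why an index of this kind can be compact; the compression figures themselves are the paper's, not something this sketch measures.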

* 13 pages
