Bharath Ramesh

Unsupervised Motion Segmentation for Neuromorphic Aerial Surveillance

May 24, 2024

Sami Arja, Alexandre Marcireau, Saeed Afshar, Bharath Ramesh, Gregory Cohen

Abstract: Achieving optimal performance with frame-based vision sensors on aerial platforms poses a significant challenge due to the fundamental tradeoffs between bandwidth and latency. Event cameras, which draw inspiration from biological vision systems, present a promising alternative due to their exceptional temporal resolution, superior dynamic range, and minimal power requirements. Due to these properties, they are well-suited for processing and segmenting fast motions that require rapid reactions. However, previous methods for event-based motion segmentation encountered limitations, such as the need for per-scene parameter tuning or manual labelling to achieve satisfactory results. To overcome these issues, our proposed method leverages features from self-supervised transformers on both event data and optical flow information, eliminating the need for human annotations and reducing the parameter tuning problem. In this paper, we use an event camera with HD resolution onboard a highly dynamic aerial platform in an urban setting. We conduct extensive evaluations of our framework across multiple datasets, demonstrating state-of-the-art performance compared to existing works. Our method can effectively handle various types of motion and an arbitrary number of moving objects. Code and dataset are available at: https://samiarja.github.io/evairborne/

* 31 pages, 11 figures, 8 tables

Superevents: Towards Native Semantic Segmentation for Event-based Cameras

May 13, 2021

Weng Fei Low, Ankit Sonthalia, Zhi Gao, André van Schaik, Bharath Ramesh

Abstract: Most successful computer vision models transform low-level features, such as Gabor filter responses, into richer representations of intermediate or mid-level complexity for downstream visual tasks. These mid-level representations have not been explored for event cameras, although they are especially relevant to the visually sparse and often disjoint spatial information in the event stream. By making use of locally consistent intermediate representations, termed superevents, numerous visual tasks, ranging from semantic segmentation and visual tracking to depth estimation, stand to benefit. In essence, superevents are perceptually consistent local units that delineate parts of an object in a scene. Inspired by recent deep learning architectures, we present a novel method that employs lifetime augmentation to obtain an event stream representation that is fed to a fully convolutional network to extract superevents. Our qualitative and quantitative experimental results on several sequences of a benchmark dataset highlight the significant potential for event-based downstream applications.

e-TLD: Event-based Framework for Dynamic Object Tracking

Sep 02, 2020

Bharath Ramesh, Shihao Zhang, Hong Yang, Andres Ussa, Matthew Ong, Garrick Orchard, Cheng Xiang

Abstract: This paper presents a long-term object tracking framework with a moving event camera under general tracking conditions. A first of its kind for these revolutionary cameras, the tracking framework uses a discriminative representation for the object with online learning, and detects and re-tracks the object when it comes back into the field-of-view. One of the key novelties is the use of an event-based local sliding window technique that tracks reliably in scenes with cluttered and textured backgrounds. In addition, Bayesian bootstrapping is used to assist real-time processing and boost the discriminative power of the object representation. When the object re-enters the field-of-view of the camera, a data-driven, global sliding window detector locates the object for subsequent tracking. Extensive experiments demonstrate the ability of the proposed framework to track and detect arbitrary objects of various shapes and sizes, including dynamic objects such as a human. This is a significant improvement compared to earlier works that simply track objects as long as they are visible under simpler background settings. Using the ground truth locations for five different objects under three motion settings, namely translation, rotation and 6-DOF, quantitative measurements are reported for the event-based tracking framework with critical insights on various performance issues. Finally, a real-time implementation in C++ highlights tracking ability under scale, rotation, view-point and occlusion scenarios in a lab setting.

* 11 pages, 10 figures

A Hybrid Neuromorphic Object Tracking and Classification Framework for Real-time Systems

Jul 21, 2020

Andres Ussa, Chockalingam Senthil Rajen, Deepak Singla, Jyotibdha Acharya, Gideon Fu Chuanrong, Arindam Basu, Bharath Ramesh

Abstract: Deep learning inference that needs to largely take place on the 'edge' is a highly compute- and memory-intensive workload, making it intractable for low-power, embedded platforms such as mobile nodes and remote security applications. To address this challenge, this paper proposes a real-time, hybrid neuromorphic framework for object tracking and classification using event-based cameras that possess properties such as low power consumption (5-14 mW) and high dynamic range (120 dB). Nonetheless, unlike traditional approaches of using event-by-event processing, this work uses a mixed frame and event approach to get energy savings with high performance. Using a frame-based region proposal method based on the density of foreground events, a hardware-friendly object tracking scheme is implemented using the apparent object velocity while tackling occlusion scenarios. The object track input is converted back to spikes for TrueNorth classification via the energy-efficient deep network (EEDN) pipeline. Using originally collected datasets, we train the TrueNorth model on the hardware track outputs, instead of using ground truth object locations as commonly done, and demonstrate the ability of our system to handle practical surveillance scenarios. As an optional paradigm, to exploit the low latency and asynchronous nature of neuromorphic vision sensors (NVS), we also propose a continuous-time tracker with a C++ implementation where each event is processed individually. We then extensively compare the proposed methodologies to state-of-the-art event-based and frame-based methods for object tracking and classification, and demonstrate the use case of our neuromorphic approach for real-time and embedded applications without sacrificing performance. Finally, we also showcase the efficacy of the proposed system against a standard RGB camera setup when evaluated over several hours of traffic recordings.

* 11 pages, 8 figures. arXiv admin note: substantial text overlap with arXiv:1910.09806

EBBINNOT: A Hardware Efficient Hybrid Event-Frame Tracker for Stationary Neuromorphic Vision Sensors

May 31, 2020

Deepak Singla, Vivek Mohan, Tarun Pulluri, Andres Ussa, Bharath Ramesh, Arindam Basu

Abstract: In this paper, we present a hybrid event-frame approach for detecting and tracking objects recorded by a stationary neuromorphic vision sensor (NVS) used for traffic monitoring, with a hardware-efficient processing pipeline that optimizes memory and computational needs. Using an NVS gives the advantage of rejecting the background, but it has the unique disadvantage of fragmenting objects because smooth areas such as glass windows generate few events. To exploit the background removal, we propose an event-based binary image (EBBI) creation that signals the presence or absence of events in a frame duration. This reduces the memory requirement and enables the use of simple algorithms such as median filtering and connected component labeling (CCL) for denoising and region proposal (RP), respectively. To overcome the fragmentation issue, a YOLO-inspired neural network based detector and classifier (NNDC) that merges fragmented region proposals is proposed. Finally, a simplified version of the Kalman filter, termed overlap-based tracker (OT), which exploits the overlap between detections and tracks, is proposed with heuristics to overcome occlusion. The proposed pipeline is evaluated using more than 5 hours of traffic recordings. Our proposed hybrid architecture outperformed (AUC = 0.45) the deep learning (DL) based tracker SiamMask (AUC = 0.33) operating on simultaneously recorded RGB frames while requiring 2200× less computation. Compared to pure event-based mean shift (AUC = 0.31), our approach requires 68× more computation but provides much better performance. Finally, we also evaluated our performance on two different NVS, DAVIS and CeleX, and demonstrated similar gains. To the best of our knowledge, this is the first report where an NVS-based solution is directly compared to a simultaneously recorded frame-based method, and it shows tremendous promise.

* 15 pages, 12 figures
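
The front end described in this abstract (EBBI creation, median filtering, then connected component labeling for region proposals) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the `(x, y)` event format, the 3x3 filter window and the 8-connectivity are all assumptions made for illustration, and the NNDC and tracker stages are not shown.

```python
import numpy as np
from collections import deque

def make_ebbi(events, height, width):
    """Binary image: 1 where at least one event hit the pixel during the frame window.
    `events` is an iterable of (x, y) pixel coordinates (assumed format)."""
    ebbi = np.zeros((height, width), dtype=np.uint8)
    for x, y in events:
        ebbi[y, x] = 1
    return ebbi

def median_filter_3x3(img):
    """On a binary image, a 3x3 median is a majority vote over the 9 neighbours."""
    padded = np.pad(img, 1)  # zero padding at the borders
    h, w = img.shape
    total = np.zeros((h, w), dtype=np.int32)
    for dy in range(3):
        for dx in range(3):
            total += padded[dy:dy + h, dx:dx + w]
    return (total >= 5).astype(np.uint8)

def region_proposals(img):
    """8-connected component labeling by flood fill.
    Returns one bounding box (x_min, y_min, x_max, y_max) per component."""
    h, w = img.shape
    seen = np.zeros((h, w), dtype=bool)
    boxes = []
    for y0 in range(h):
        for x0 in range(w):
            if img[y0, x0] == 0 or seen[y0, x0]:
                continue
            queue = deque([(y0, x0)])
            seen[y0, x0] = True
            x_min = x_max = x0
            y_min = y_max = y0
            while queue:
                y, x = queue.popleft()
                x_min, x_max = min(x_min, x), max(x_max, x)
                y_min, y_max = min(y_min, y), max(y_max, y)
                for ny in range(max(0, y - 1), min(h, y + 2)):
                    for nx in range(max(0, x - 1), min(w, x + 2)):
                        if img[ny, nx] and not seen[ny, nx]:
                            seen[ny, nx] = True
                            queue.append((ny, nx))
            boxes.append((x_min, y_min, x_max, y_max))
    return boxes
```

Because each EBBI pixel holds a single bit, the median reduces to counting set neighbours, which is one way to see why the abstract calls these algorithms simple and memory-friendly.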

HyNNA: Improved Performance for Neuromorphic Vision Sensor based Surveillance using Hybrid Neural Network Architecture

Mar 19, 2020

Deepak Singla, Soham Chatterjee, Lavanya Ramapantulu, Andres Ussa, Bharath Ramesh, Arindam Basu

Abstract: Applications in the Internet of Video Things (IoVT) domain have very tight constraints with respect to power and area. While neuromorphic vision sensors (NVS) may offer advantages over traditional imagers in this domain, the existing NVS systems either do not meet the power constraints or have not demonstrated end-to-end system performance. To address this, we improve on a recently proposed hybrid event-frame approach by using morphological image processing algorithms for region proposal and address the low-power requirement for object detection and classification by exploring various convolutional neural network (CNN) architectures. Specifically, we compare the results obtained from our object detection framework against the state-of-the-art low-power NVS surveillance system and show an accuracy improvement from 63.1% to 82.16%. Moreover, we show that using multiple bits does not improve accuracy, and thus, system designers can save power and area by using only single-bit event polarity information. In addition, we explore the CNN architecture space for object classification and offer useful insights into trading off accuracy for lower power by using less memory and fewer arithmetic operations.

* 4 pages, 2 figures

A low-power end-to-end hybrid neuromorphic framework for surveillance applications

Oct 25, 2019

Andres Ussa, Luca Della Vedova, Vandana Reddy Padala, Deepak Singla, Jyotibdha Acharya, Charles Zhang Lei, Garrick Orchard, Arindam Basu, Bharath Ramesh

Abstract: With the success of deep learning, object recognition systems that can be deployed for real-world applications are becoming commonplace. However, inference that needs to largely take place on the 'edge' (not processed on servers) is a highly compute- and memory-intensive workload, making it intractable for low-power mobile nodes and remote security applications. To address this challenge, this paper proposes a low-power (5W) end-to-end neuromorphic framework for object tracking and classification using event-based cameras that possess desirable properties such as low power consumption (5-14 mW) and high dynamic range (120 dB). Nonetheless, unlike traditional approaches of using event-by-event processing, this work uses a mixed frame and event approach to get energy savings with high performance. Using a frame-based region proposal method based on the density of foreground events, a hardware-friendly object tracker is implemented using the apparent object velocity while tackling occlusion scenarios. For low-power classification of the tracked objects, the event camera is interfaced to IBM TrueNorth, which is time-multiplexed to tackle up to eight instances for a traffic monitoring application. The frame-based object track input is converted back to spikes for TrueNorth classification via the energy-efficient deep network (EEDN) pipeline. Using originally collected datasets, we train the TrueNorth model on the hardware track outputs, instead of using ground truth object locations as commonly done, and demonstrate the efficacy of our system in handling practical surveillance scenarios. Finally, we compare the proposed methodologies to state-of-the-art event-based systems for object tracking and classification, and demonstrate the use case of our neuromorphic approach for low-power applications without sacrificing performance.

* 12 pages, 3 figures

EBBIOT: A Low-complexity Tracking Algorithm for Surveillance in IoVT Using Stationary Neuromorphic Vision Sensors

Oct 04, 2019

Jyotibdha Acharya, Andres Ussa Caycedo, Vandana Reddy Padala, Rishi Raj Sidhu Singh, Garrick Orchard, Bharath Ramesh, Arindam Basu

Abstract: In this paper, we present EBBIOT, a novel paradigm for object tracking using stationary neuromorphic vision sensors in low-power sensor nodes for the Internet of Video Things (IoVT). Different from fully event-based tracking or fully frame-based approaches, we propose a mixed approach where we create event-based binary images (EBBI) that can use memory-efficient noise filtering algorithms. We exploit the motion-triggering aspect of neuromorphic sensors to generate region proposals based on event density counts with more than 1000X less memory and computation than frame-based approaches. We also propose a simple overlap-based tracker (OT) with prediction-based handling of occlusion. Our overall approach requires 7X less memory and 3X less computation than conventional noise filtering and event-based mean shift (EBMS) tracking. Finally, we show that our approach results in significantly higher precision and recall than the EBMS approach as well as a Kalman filter tracker when evaluated over 1.1 hours of traffic recordings at two different locations.

* 6 pages, 5 figures
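
To make the idea of an overlap-based tracker with prediction-based occlusion handling concrete, here is a minimal sketch. It is only an illustration under stated assumptions, not the OT described in the paper: the `(x, y, w, h)` box format, the constant-velocity prediction, the greedy largest-overlap matching, and the hypothetical `max_missed` parameter are all choices made for this example.

```python
def overlap_area(a, b):
    """Intersection area of two boxes given as (x, y, w, h)."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    dx = min(ax + aw, bx + bw) - max(ax, bx)
    dy = min(ay + ah, by + bh) - max(ay, by)
    return max(dx, 0) * max(dy, 0)

class Track:
    def __init__(self, box):
        self.box = box
        self.velocity = (0, 0)  # pixels per frame, updated on every match
        self.missed = 0         # consecutive frames without a matching detection

    def predict(self):
        """Constant-velocity prediction of the box in the next frame (assumption)."""
        x, y, w, h = self.box
        vx, vy = self.velocity
        return (x + vx, y + vy, w, h)

def step(tracks, detections, max_missed=3):
    """Advance all tracks by one frame given the region proposals of that frame."""
    unused = list(detections)
    survivors = []
    for track in tracks:
        predicted = track.predict()
        # Greedy choice: the detection with the largest overlap with the prediction.
        best = max(unused, key=lambda d: overlap_area(predicted, d), default=None)
        if best is not None and overlap_area(predicted, best) > 0:
            track.velocity = (best[0] - track.box[0], best[1] - track.box[1])
            track.box = best
            track.missed = 0
            unused.remove(best)
        else:
            # No overlapping detection (e.g. occlusion): coast on the prediction.
            track.box = predicted
            track.missed += 1
        if track.missed <= max_missed:
            survivors.append(track)
    # Every unmatched detection starts a new track.
    survivors.extend(Track(d) for d in unused)
    return survivors
```

Calling `step` once per frame with the current region proposals keeps a track alive through a short occlusion by moving it along its last observed velocity, and drops it after `max_missed` consecutive misses.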

PCA-RECT: An Energy-efficient Object Detection Approach for Event Cameras

Apr 24, 2019

Bharath Ramesh, Andres Ussa, Luca Della Vedova, Hong Yang, Garrick Orchard

Abstract: We present the first purely event-based, energy-efficient approach for object detection and categorization using an event camera. Compared to traditional frame-based cameras, event cameras offer attractive properties such as high temporal resolution (order of microseconds), low power consumption (few hundred mW) and wide dynamic range (120 dB). However, event-based object recognition systems are far behind their frame-based counterparts in terms of accuracy. To this end, this paper presents an event-based feature extraction method devised by accumulating local activity across the image frame and then applying principal component analysis (PCA) to the normalized neighborhood region. Subsequently, we propose a backtracking-free k-d tree mechanism for efficient feature matching by taking advantage of the low dimensionality of the feature representation. Additionally, the proposed k-d tree mechanism allows for feature selection to obtain a lower-dimensional dictionary representation when hardware resources are too limited to implement dimensionality reduction. Consequently, the proposed system can be realized on a field-programmable gate array (FPGA) device, leading to a high performance-to-resource ratio. The proposed system is tested on real-world event-based datasets for object categorization, showing superior classification performance and relevance to state-of-the-art algorithms. Additionally, we verified the object detection method and real-time FPGA performance in lab settings under non-controlled illumination conditions with limited training data and ground truth annotations.

* Accepted in ACCV 2018 Workshops, to appear
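
The feature extraction step outlined in this abstract (accumulate local activity, normalize the neighborhood of each event, then apply PCA) can be sketched roughly as follows. This is a loose illustration and not the PCA-RECT descriptor itself: the per-pixel event count as the activity measure, the square patch of hypothetical radius `radius`, the L2 normalization and the SVD-based PCA are all assumptions, and the k-d tree matching stage is not shown.

```python
import numpy as np

def accumulate_activity(events, height, width):
    """Per-pixel event count over a time window; `events` holds (x, y) pairs (assumed format)."""
    counts = np.zeros((height, width), dtype=np.float64)
    xs = np.array([e[0] for e in events])
    ys = np.array([e[1] for e in events])
    np.add.at(counts, (ys, xs), 1.0)  # unbuffered add so repeated pixels accumulate
    return counts

def local_patches(counts, events, radius):
    """L2-normalized square neighborhoods around each event, flattened to vectors.
    Events closer than `radius` to the border are skipped."""
    h, w = counts.shape
    patches = []
    for x, y in events:
        if radius <= x < w - radius and radius <= y < h - radius:
            patch = counts[y - radius:y + radius + 1, x - radius:x + radius + 1].ravel()
            norm = np.linalg.norm(patch)
            if norm > 0:
                patches.append(patch / norm)
    return np.array(patches)

def fit_pca(patches, n_components):
    """PCA via SVD of the centered data; returns the mean and the top principal axes."""
    mean = patches.mean(axis=0)
    _, _, vt = np.linalg.svd(patches - mean, full_matrices=False)
    return mean, vt[:n_components]

def project(patches, mean, components):
    """Low-dimensional descriptors: coordinates of each patch along the principal axes."""
    return (patches - mean) @ components.T
```

With `radius=2`, each 25-dimensional patch is reduced to `n_components` values, which illustrates the kind of low-dimensional representation that the abstract says makes k-d tree matching efficient.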

DART: Distribution Aware Retinal Transform for Event-based Cameras

Oct 30, 2017

Bharath Ramesh, Hong Yang, Garrick Orchard, Ngoc Anh Le Thi, Cheng Xiang

Abstract: We introduce a new event-based visual descriptor, termed distribution aware retinal transform (DART), for pattern recognition using silicon retina cameras. The DART descriptor captures the information of the spatio-temporal distribution of events, and forms a rich structural representation. Consequently, the event context encoded by DART greatly simplifies the feature correspondence problem, which is highly relevant to many event-based vision problems. The proposed descriptor is robust to scale and rotation variations without the need for spectral analysis. To demonstrate the effectiveness of the DART descriptors, they are employed as local features in the bag-of-features classification framework. The proposed framework is tested on the N-MNIST, MNIST-DVS, CIFAR10-DVS and NCaltech-101 datasets, as well as a new object dataset, N-SOD (Neuromorphic-Single Object Dataset), collected to test unconstrained viewpoint recognition. We report a competitive classification accuracy of 97.95% on N-MNIST and the best classification accuracy compared to existing works on MNIST-DVS (99%), CIFAR10-DVS (65.9%) and NCaltech-101 (70.3%). Using the in-house N-SOD, we demonstrate real-time classification performance on an Intel Compute Stick directly interfaced to an event camera flying on-board a quadcopter. In addition, taking advantage of the high temporal resolution of event cameras, the classification system is extended to tackle object tracking. Finally, we demonstrate efficient feature matching for event-based cameras using k-d trees.

* 12 pages, submitted to TPAMI in Oct 2017
