Arslan Munir

ViT-ReT: Vision and Recurrent Transformer Neural Networks for Human Activity Recognition in Videos

Aug 25, 2022

James Wensel, Hayat Ullah, Arslan Munir

Abstract: Human activity recognition is an emerging and important area in computer vision that seeks to determine the activity an individual or group of individuals is performing. Applications of this field range from generating highlight videos in sports to intelligent surveillance and gesture recognition. Most activity recognition systems rely on a combination of convolutional neural networks (CNNs) to extract features from the data and recurrent neural networks (RNNs) to capture the time-dependent nature of the data. This paper proposes and designs two transformer neural networks for human activity recognition: a recurrent transformer (ReT), a specialized neural network used to make predictions on sequences of data, and a vision transformer (ViT), a transformer optimized for extracting salient features from images, to improve the speed and scalability of activity recognition. We provide an extensive comparison of the proposed transformer neural networks with contemporary CNN- and RNN-based human activity recognition models in terms of speed and accuracy.

Human Activity Recognition Using Cascaded Dual Attention CNN and Bi-Directional GRU Framework

Aug 09, 2022

Hayat Ullah, Arslan Munir

Abstract: Vision-based human activity recognition has emerged as one of the essential research areas in the video analytics domain. Over the last decade, numerous advanced deep learning algorithms have been introduced to recognize complex human actions from video streams. These deep learning algorithms have shown impressive performance on the human activity recognition task. However, these newly introduced methods focus exclusively either on model performance or on computational efficiency and robustness, resulting in a biased tradeoff in their proposals for dealing with the challenging human activity recognition problem. To overcome the limitations of contemporary deep learning models for human activity recognition, this paper presents a computationally efficient yet generic spatial-temporal cascaded framework that exploits deep discriminative spatial and temporal features for human activity recognition. For efficient representation of human actions, we propose an efficient dual attentional convolutional neural network (CNN) architecture that leverages a unified channel-spatial attention mechanism to extract human-centric salient features in video frames. The dual channel-spatial attention layers, together with the convolutional layers, learn to be more attentive to the spatial receptive fields containing objects across the feature maps. The extracted discriminative salient features are then forwarded to a stacked bi-directional gated recurrent unit (Bi-GRU) for long-term temporal modeling and recognition of human actions using both forward and backward pass gradient learning. Extensive experiments are conducted, and the results show that the proposed framework attains an improvement in execution time of up to 167 times, measured in frames per second, compared with most contemporary action recognition methods.

Phantom: A High-Performance Computational Core for Sparse Convolutional Neural Networks

Nov 09, 2021

Mahmood Azhar Qureshi, Arslan Munir

Abstract: Sparse convolutional neural networks (CNNs) have gained significant traction over the past few years because, if exploited properly, they can drastically decrease model size and computations compared to their dense counterparts. Sparse CNNs often introduce variations in layer shapes and sizes, which can prevent dense accelerators from performing well on sparse CNN models. Recently proposed sparse accelerators such as SCNN, Eyeriss v2, and SparTen actively exploit two-sided or full sparsity, that is, sparsity in both weights and activations, for performance gains. These accelerators, however, either have an inefficient micro-architecture that limits their performance, lack support for non-unit stride convolutions and fully-connected (FC) layers, or suffer massively from systematic load imbalance. To circumvent these issues and support both sparse and dense models, we propose Phantom, a multi-threaded, dynamic, and flexible neural computational core. Phantom uses a sparse binary mask representation to actively look ahead into sparse computations and dynamically schedule its computational threads to maximize thread utilization and throughput. We also generate a two-dimensional (2D) mesh architecture of Phantom neural computational cores, which we refer to as the Phantom-2D accelerator, and propose a novel dataflow that supports all layers of a CNN, including unit and non-unit stride convolutions and FC layers. In addition, Phantom-2D uses a two-level load balancing strategy to minimize computational idling, thereby further improving hardware utilization. To show support for different types of layers, we evaluate the performance of the Phantom architecture on VGG16 and MobileNet. Our simulations show that the Phantom-2D accelerator attains a performance gain of 12x, 4.1x, 1.98x, and 2.36x over dense architectures, SCNN, SparTen, and Eyeriss v2, respectively.
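
The idea of a binary mask exposing two-sided sparsity can be illustrated with a minimal Python sketch. This is not Phantom's scheduler or dataflow, which the abstract does not detail; the function names `to_mask` and `sparse_dot` and the compressed storage layout are assumptions made for illustration. ANDing the weight mask with the activation mask yields only the positions where a multiplication is actually needed.

```python
def to_mask(values):
    """Return (bitmask, nonzeros); bit i of the mask is set when values[i] != 0."""
    mask = 0
    nonzeros = []
    for i, v in enumerate(values):
        if v != 0:
            mask |= 1 << i
            nonzeros.append(v)
    return mask, nonzeros


def sparse_dot(weights, activations):
    """Dot product that multiplies only where both operands are nonzero.

    Returns (result, number_of_multiplications_performed).
    """
    w_mask, w_nz = to_mask(weights)
    a_mask, a_nz = to_mask(activations)
    both = w_mask & a_mask  # positions with a nonzero weight AND activation
    total = 0
    macs = 0
    while both:
        low = both & -both  # isolate the lowest set bit
        below = low - 1     # mask of all positions before it
        # Index into each compressed list = count of nonzeros before this position.
        wi = bin(w_mask & below).count("1")
        ai = bin(a_mask & below).count("1")
        total += w_nz[wi] * a_nz[ai]
        macs += 1
        both ^= low         # clear the bit we just handled
    return total, macs


result, macs = sparse_dot([0, 2, 0, 3, 5, 0], [1, 0, 0, 4, 6, 7])
print(result, macs)
```

In this example only two of the six positions require a multiplication, which is the kind of saving a mask-based lookahead can expose before any arithmetic is issued.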

* A version of this work is currently under review at the ACM Transactions on Embedded Computing Systems (TECS)

NeuroMAX: A High Throughput, Multi-Threaded, Log-Based Accelerator for Convolutional Neural Networks

Jul 19, 2020

Mahmood Azhar Qureshi, Arslan Munir

Abstract: Convolutional neural networks (CNNs) require high-throughput hardware accelerators for real-time applications owing to their huge computational cost. Most traditional CNN accelerators rely on single-core, linear processing elements (PEs) in conjunction with 1D dataflows for accelerating convolution operations. This limits the maximum achievable ratio of peak throughput per PE count to unity. Most past works optimize their dataflows to attain close to 100% hardware utilization in order to reach this ratio. In this paper, we introduce a high-throughput, multi-threaded, log-based PE core. The designed core provides a 200% increase in peak throughput per PE count while incurring only a 6% increase in area overhead compared to a single, linear multiplier PE core with the same output bit precision. We also present a 2D weight broadcast dataflow that exploits the multi-threaded nature of the PE cores to achieve high hardware utilization per layer for various CNNs. The entire architecture, which we refer to as NeuroMAX, is implemented on a Xilinx Zynq 7020 SoC at a 200 MHz processing clock. Detailed analysis is performed on throughput, hardware utilization, area and power breakdown, and latency to show the performance improvement compared to previous FPGA and ASIC designs.
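
As a rough illustration of why a log-based PE can be cheaper than a linear multiplier, the sketch below stores each operand as a sign and a base-2 exponent, so that multiplication reduces to adding exponents. The abstract does not give NeuroMAX's log base, precision, or encoding; base 2, the `frac_bits` parameter, and the function names are assumptions made here.

```python
import math


def log2_quantize(x, frac_bits=0):
    """Encode x as (sign, exponent), rounding log2|x| to a multiple of 2**-frac_bits.

    Zero is encoded as (0, None) because log2(0) is undefined.
    """
    if x == 0:
        return 0, None
    sign = 1 if x > 0 else -1
    scale = 1 << frac_bits
    exponent = round(math.log2(abs(x)) * scale) / scale
    return sign, exponent


def log_multiply(a, b):
    """Multiply two log-encoded values by adding their exponents."""
    sign_a, exp_a = a
    sign_b, exp_b = b
    if sign_a == 0 or sign_b == 0:
        return 0.0
    return sign_a * sign_b * 2.0 ** (exp_a + exp_b)


# Exact for powers of two: 8 * -0.5
print(log_multiply(log2_quantize(8), log2_quantize(-0.5)))
# Approximate otherwise: 6 * 3 is rounded to 2**3 * 2**2
print(log_multiply(log2_quantize(6), log2_quantize(3)))
```

The second call shows the cost of the encoding: operands that are not powers of two are rounded, so the product is approximate. Raising `frac_bits` trades extra exponent bits for a smaller quantization error.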

* To be published in ICCAD 2020

TrolleyMod v1.0: An Open-Source Simulation and Data-Collection Platform for Ethical Decision Making in Autonomous Vehicles

Nov 14, 2018

Vahid Behzadan, James Minton, Arslan Munir

Abstract: This paper presents TrolleyMod v1.0, an open-source platform based on the CARLA simulator for the collection of ethical decision-making data for autonomous vehicles. This platform is designed to facilitate experiments that aim to observe and record human decisions and actions in high-fidelity simulations of ethical dilemmas that occur in the context of driving. Targeting experiments in the class of trolley problems, TrolleyMod provides a seamless approach to creating new experimental settings and environments with the realistic physics engine and high-quality graphical capabilities of CARLA and the Unreal Engine. TrolleyMod also provides a straightforward interface between the CARLA environment and Python to enable the implementation of custom controllers, such as deep reinforcement learning agents. The results of such experiments can be used for sociological analyses, as well as for the training and tuning of value-aligned autonomous vehicles based on social values that are inferred from observations.

Emergence of Addictive Behaviors in Reinforcement Learning Agents

Nov 14, 2018

Vahid Behzadan, Roman V. Yampolskiy, Arslan Munir

Abstract: This paper presents a novel approach to the technical analysis of wireheading in intelligent agents. Inspired by the natural analogues of wireheading and their prevalent manifestations, we propose the modeling of such phenomena in Reinforcement Learning (RL) agents as psychological disorders. In a preliminary step towards evaluating this proposal, we study the feasibility and dynamics of emergent addictive policies in Q-learning agents in the tractable environment of the game of Snake. We consider a slightly modified setting for this game, in which the environment provides a "drug" seed alongside the original "healthy" seed for the consumption of the snake. We adopt and extend an RL-based model of natural addiction to Q-learning agents in this setting, and derive sufficient parametric conditions for the emergence of addictive behaviors in such agents. Furthermore, we evaluate our theoretical analysis with three sets of simulation-based experiments. The results demonstrate the feasibility of addictive wireheading in RL agents, and provide promising avenues for further research on the psychopathological modeling of complex AI safety problems.
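
For readers unfamiliar with Q-learning, the standard tabular update that such agents build on can be sketched as follows. This is the textbook rule only and does not reproduce the paper's addiction model or its parametric conditions; the state and action labels, the reward value, and the defaults for `alpha` and `gamma` are hypothetical.

```python
from collections import defaultdict


def q_update(q, state, action, reward, next_state, actions, alpha=0.1, gamma=0.9):
    """Apply one tabular Q-learning step and return the temporal-difference error.

    Q(s, a) <- Q(s, a) + alpha * (r + gamma * max_a' Q(s', a') - Q(s, a))
    """
    best_next = max(q[(next_state, a)] for a in actions)
    td_error = reward + gamma * best_next - q[(state, action)]
    q[(state, action)] += alpha * td_error
    return td_error


# Hypothetical two-action setting: the snake may eat either kind of seed.
actions = ["eat_healthy", "eat_drug"]
q = defaultdict(float)  # unseen state-action pairs start at 0.0
td = q_update(q, "s0", "eat_healthy", 1.0, "s1", actions)
print(td, q[("s0", "eat_healthy")])
```

Because the update depends only on observed rewards, any change in how the two seeds are rewarded feeds directly into the learned values, which is the lever an analysis of addictive policies would examine.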

The Faults in Our Pi Stars: Security Issues and Open Challenges in Deep Reinforcement Learning

Oct 23, 2018

Vahid Behzadan, Arslan Munir

Abstract: Since the inception of Deep Reinforcement Learning (DRL) algorithms, there has been growing interest in both research and industrial communities in the promising potential of this paradigm. The list of current and envisioned applications of deep RL ranges from autonomous navigation and robotics to control applications in critical infrastructure, air traffic control, defense technologies, and cybersecurity. While the landscape of opportunities and the advantages of deep RL algorithms are justifiably vast, the security risks and issues in such algorithms remain largely unexplored. To facilitate and motivate further research on these critical challenges, this paper presents a foundational treatment of the security problem in DRL. We formulate the security requirements of DRL, and provide a high-level threat model through the classification and identification of vulnerabilities, attack vectors, and adversarial capabilities. Furthermore, we present a review of the current literature on the security of deep RL from both offensive and defensive perspectives. Lastly, we enumerate critical research avenues and open problems in the mitigation and prevention of intentional attacks against deep RL as a roadmap for further research in this area.

* arXiv admin note: text overlap with arXiv:1807.06064, arXiv:1712.03632, arXiv:1803.02811, arXiv:1710.00814 by other authors

Mitigation of Policy Manipulation Attacks on Deep Q-Networks with Parameter-Space Noise

Jun 04, 2018

Vahid Behzadan, Arslan Munir

Abstract: Recent developments have established the vulnerability of deep reinforcement learning to policy manipulation attacks via intentionally perturbed inputs, known as adversarial examples. In this work, we propose a technique for mitigating such attacks based on the addition of noise to the parameter space of deep reinforcement learners during training. We experimentally verify the effect of parameter-space noise in reducing the transferability of adversarial examples, and demonstrate the promising performance of this technique in mitigating the impact of whitebox and blackbox attacks at both test and training times.
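
The basic operation behind parameter-space noise can be sketched in a few lines of Python: instead of perturbing actions or inputs, Gaussian noise is added to the model's weights. The abstract does not say how the noise is scaled, adapted, or applied inside a deep Q-network, so the flat weight list, the fixed `sigma`, and the function name below are assumptions for illustration only.

```python
import random


def perturb_parameters(params, sigma, rng):
    """Return a noisy copy of params, adding N(0, sigma^2) noise to each weight.

    The input list is left unchanged so the clean parameters remain available.
    """
    return [w + rng.gauss(0.0, sigma) for w in params]


rng = random.Random(0)           # seeded for repeatable runs
weights = [0.5, -1.2, 0.0, 3.3]  # hypothetical flattened network weights
noisy = perturb_parameters(weights, sigma=0.05, rng=rng)
print(noisy)
```

A training loop would typically draw a fresh perturbation at intervals and act with the noisy weights, while keeping the clean copy for updates.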

* arXiv admin note: substantial text overlap with arXiv:1701.04143, arXiv:1712.09344

Adversarial Reinforcement Learning Framework for Benchmarking Collision Avoidance Mechanisms in Autonomous Vehicles

Jun 04, 2018

Vahid Behzadan, Arslan Munir

Abstract: With the rapidly growing interest in autonomous navigation, research on motion planning and collision avoidance techniques has seen an accelerating rate of novel proposals and developments. However, the complexity of new techniques and their safety requirements render the bulk of current benchmarking frameworks inadequate, leaving the need for efficient comparison techniques unanswered. This work proposes a novel framework based on deep reinforcement learning for benchmarking the behavior of collision avoidance mechanisms under the worst-case scenario of dealing with an optimal adversarial agent trained to drive the system into unsafe states. We describe the architecture and flow of this framework as a benchmarking solution, and demonstrate its efficacy via a practical case study comparing the reliability of two collision avoidance mechanisms in response to intentional collision attempts.

A Psychopathological Approach to Safety Engineering in AI and AGI

May 23, 2018

Vahid Behzadan, Arslan Munir, Roman V. Yampolskiy

Abstract: The complexity of dynamics in AI techniques is already approaching that of complex adaptive systems, thus curtailing the feasibility of formal controllability and reachability analysis in the context of AI safety. It follows that the envisioned instances of Artificial General Intelligence (AGI) will also suffer from challenges of complexity. To tackle such issues, we propose modeling deleterious behaviors in AI and AGI as psychological disorders, thereby enabling the use of psychopathological approaches to the analysis and control of misbehaviors. Accordingly, we discuss the feasibility of psychopathological approaches to AI safety, and propose general directions for research on the modeling, diagnosis, and treatment of psychological disorders in AGI.
