Mohammadreza Baharani

MoFM: A Large-Scale Human Motion Foundation Model

Feb 08, 2025

Mohammadreza Baharani, Ghazal Alinezhad Noghre, Armin Danesh Pazho, Gabriel Maldonado, Hamed Tabkhi

Abstract: Foundation Models (FMs) have increasingly drawn the attention of researchers due to their scalability and generalization across diverse tasks. Inspired by the success of FMs and the principles that have driven advancements in Large Language Models (LLMs), we introduce MoFM, a novel Motion Foundation Model. MoFM is designed for the semantic understanding of complex human motions in both time and space. To facilitate large-scale training, we design and employ MotionBook, a comprehensive human motion dictionary of discretized motions. MotionBook utilizes Thermal Cubes to capture spatio-temporal motion heatmaps, applying principles from discrete variational models to encode human movements into discrete units for a more efficient and scalable representation. MoFM, trained on a large corpus of motion data, provides a foundational backbone adaptable to diverse downstream tasks, supporting paradigms such as one-shot, unsupervised, and supervised tasks. This versatility makes MoFM well-suited for a wide range of motion-based applications.
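The idea of encoding continuous motion features into discrete units can be illustrated with a nearest-neighbour codebook lookup, the core step of vector-quantized (discrete variational) models. This is only a minimal sketch under stated assumptions, not the MotionBook implementation: the codebook entries, the 2-D feature vectors, and the `quantize` helper are all hypothetical.

```python
from math import dist

def quantize(vectors, codebook):
    """Map each continuous vector to the index of its nearest codebook entry."""
    return [
        min(range(len(codebook)), key=lambda k: dist(v, codebook[k]))
        for v in vectors
    ]

# Hypothetical 3-entry codebook and three 2-D feature vectors (illustration only).
codebook = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
motion_features = [(0.1, -0.1), (0.9, 0.2), (0.2, 0.8)]

# Each feature vector becomes one discrete token (a codebook index).
tokens = quantize(motion_features, codebook)
print(tokens)
```

Once motion is represented as a sequence of integer tokens like this, it can be fed to sequence models in the same way as text tokens, which is what makes large-scale training practical.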

Ancilia: Scalable Intelligent Video Surveillance for the Artificial Intelligence of Things

Jan 09, 2023

Armin Danesh Pazho, Christopher Neff, Ghazal Alinezhad Noghre, Babak Rahimi Ardabili, Shanle Yao, Mohammadreza Baharani, Hamed Tabkhi

Abstract: With the advancement of vision-based artificial intelligence, the proliferation of Internet of Things-connected cameras, and the increasing societal need for rapid and equitable security, the demand for accurate real-time intelligent surveillance has never been higher. This article presents Ancilia, an end-to-end scalable, intelligent video surveillance system for the Artificial Intelligence of Things. Ancilia brings state-of-the-art artificial intelligence to real-world surveillance applications while respecting ethical concerns and performing high-level cognitive tasks in real time. Ancilia aims to revolutionize the surveillance landscape by bringing more effective, intelligent, and equitable security to the field, resulting in safer and more secure communities without requiring people to compromise their right to privacy.

DeepTrack: Lightweight Deep Learning for Vehicle Path Prediction in Highways

Aug 01, 2021

Mohammadreza Baharani, Vinit Katariya, Nichole Morris, Omidreza Shoghli, Hamed Tabkhi

Abstract: Vehicle trajectory prediction is an essential task for enabling many intelligent transportation systems. While there have been some promising advances in the field, there is a need for new agile algorithms with smaller model sizes and lower computational requirements. This article presents DeepTrack, a novel deep learning algorithm customized for real-time vehicle trajectory prediction on highways. In contrast to previous methods, the vehicle dynamics are encoded using Agile Temporal Convolutional Networks (ATCNs) to provide more robust time prediction with less computation. ATCN also uses depthwise convolution, which reduces model size and operation count compared to existing approaches. Overall, our experimental results demonstrate that DeepTrack achieves accuracy comparable to state-of-the-art trajectory prediction models but with smaller model sizes and lower computational complexity, making it more suitable for real-world deployment.

ATCN: Agile Temporal Convolutional Networks for Processing of Time Series on Edge

Nov 11, 2020

Mohammadreza Baharani, Steven Furgurson, Babak Parkhideh, Hamed Tabkhi

Abstract: This paper presents a scalable deep learning model called Agile Temporal Convolutional Network (ATCN) for highly accurate, fast classification and time series prediction in resource-constrained embedded systems. ATCN is primarily designed for mobile embedded systems with performance and memory constraints, such as wearable biomedical devices and real-time reliability monitoring systems. It makes fundamental improvements over mainstream temporal convolutional neural networks, including the incorporation of separable depth-wise convolution to reduce the computational complexity of the model, and residual connections that act as time attention machines to increase the network depth and accuracy. This configurability makes ATCN a family of compact networks with formalized hyper-parameters that allow the model architecture to be adjusted to the application requirements. We demonstrate the accuracy and performance trade-offs of the proposed ATCN on three embedded applications: transistor reliability monitoring, heartbeat classification of ECG signals, and digit classification. Our comparison against state-of-the-art approaches demonstrates much lower computation and memory demand, for faster processing with better prediction and classification accuracy. The source code of the ATCN model is publicly available at https://github.com/TeCSAR-UNCC/ATCN.
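To see why separable depth-wise convolution reduces model complexity, it helps to compare weight counts for a single 1-D convolution layer. The sketch below uses the standard textbook formulas (bias terms ignored); the layer sizes are hypothetical and are not taken from the ATCN paper.

```python
def standard_conv1d_params(c_in, c_out, kernel):
    """Weights in a standard 1-D convolution: every output channel sees every input channel."""
    return c_in * c_out * kernel

def separable_conv1d_params(c_in, c_out, kernel):
    """Weights in a depthwise separable 1-D convolution."""
    depthwise = c_in * kernel   # one kernel per input channel
    pointwise = c_in * c_out    # 1x1 convolution that mixes channels
    return depthwise + pointwise

# Hypothetical layer sizes, chosen only for illustration.
c_in, c_out, kernel = 64, 128, 3
std = standard_conv1d_params(c_in, c_out, kernel)
sep = separable_conv1d_params(c_in, c_out, kernel)
print(f"standard: {std}, separable: {sep}, reduction: {std / sep:.2f}x")
```

The saving grows with kernel size and with the number of output channels, since the ratio of the two counts is `c_out * kernel / (kernel + c_out)`.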

REVAMP$^2$T: Real-time Edge Video Analytics for Multi-camera Privacy-aware Pedestrian Tracking

Nov 25, 2019

Christopher Neff, Matías Mendieta, Shrey Mohan, Mohammadreza Baharani, Samuel Rogers, Hamed Tabkhi

Abstract: This article presents REVAMP$^2$T, Real-time Edge Video Analytics for Multi-camera Privacy-aware Pedestrian Tracking, an integrated end-to-end IoT system for privacy-built-in decentralized situational awareness. REVAMP$^2$T presents novel algorithmic and system constructs to push deep learning and video analytics next to IoT devices (i.e., video cameras). On the algorithm side, REVAMP$^2$T proposes a unified integrated computer vision pipeline for detection, re-identification, and tracking across multiple cameras without the need for storing the streaming data. At the same time, it avoids facial recognition, and tracks and re-identifies pedestrians based on their key features at runtime. On the IoT system side, REVAMP$^2$T provides infrastructure to maximize hardware utilization on the edge, orchestrates global communications, and provides system-wide re-identification, without the use of personally identifiable information, for a distributed IoT network. For the results and evaluation, this article also proposes a new metric, Accuracy$\cdot$Efficiency (Æ), for holistic evaluation of IoT systems for real-time video analytics based on accuracy, performance, and power efficiency. REVAMP$^2$T outperforms the current state of the art with as much as a thirteen-fold Æ improvement.

* Published as an article in IEEE Internet of Things Journal: Special Issue on Privacy and Security in Distributed Edge Computing and Evolving IoT

Real-time Person Re-identification at the Edge: A Mixed Precision Approach

Aug 19, 2019

Mohammadreza Baharani, Shrey Mohan, Hamed Tabkhi

Abstract: A critical part of multi-person multi-camera tracking is the person re-identification (re-ID) algorithm, which recognizes and retains the identities of all detected unknown people throughout the video stream. Many re-ID algorithms today achieve state-of-the-art results, but not much work has been done to explore the deployment of such algorithms in computation- and power-constrained real-time scenarios. In this paper, we study the effect of using a lightweight model, MobileNet-V2, for re-ID and investigate the impact of single (FP32) precision versus half (FP16) precision for training on the server and inference on the edge nodes. We further compare the results with a baseline model that uses ResNet-50 on state-of-the-art benchmarks including CUHK03, Market-1501, and Duke-MTMC. The MobileNet-V2 mixed-precision training method improves inference throughput on the edge node by $3.25\times$ (reaching 27.77 fps) and training time on the server by $1.75\times$, and decreases power consumption on the edge node by $1.45\times$, while reducing accuracy by only 5.6% relative to single-precision ResNet-50, averaged over the three datasets. The code and pre-trained networks are publicly available at https://github.com/TeCSAR-UNCC/person-reid.
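The storage and precision trade-off between FP32 and FP16 can be seen by round-tripping a value through each format. This sketch uses only Python's standard `struct` module and illustrates the number formats themselves, not the paper's mixed-precision training procedure; the helper names and the sample value are hypothetical.

```python
import struct

def to_fp32(x):
    """Round a Python float to IEEE 754 single precision and back."""
    return struct.unpack("<f", struct.pack("<f", x))[0]

def to_fp16(x):
    """Round a Python float to IEEE 754 half precision and back."""
    return struct.unpack("<e", struct.pack("<e", x))[0]

fp32_bytes = struct.calcsize("<f")  # storage per value in single precision
fp16_bytes = struct.calcsize("<e")  # storage per value in half precision

value = 0.1  # arbitrary sample value, not exactly representable in binary
err32 = abs(to_fp32(value) - value)
err16 = abs(to_fp16(value) - value)
print(f"FP32: {fp32_bytes} bytes, rounding error {err32:.3e}")
print(f"FP16: {fp16_bytes} bytes, rounding error {err16:.3e}")
```

Half precision halves the memory traffic per value at the cost of a larger rounding error, which is the basic tension behind the throughput gain and accuracy loss reported above.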

* International Conference on Image Analysis and Recognition (ICIAR 2019), Lecture Notes in Computer Science
* This is a pre-print of an article published in International Conference on Image Analysis and Recognition (ICIAR 2019), Lecture Notes in Computer Science. The final authenticated version is available online at https://doi.org/10.1007/978-3-030-27272-2_3

Real-time Deep Learning at the Edge for Scalable Reliability Modeling of Si-MOSFET Power Electronics Converters

Aug 03, 2019

Mohammadreza Baharani, Mehrdad Biglarbegian, Babak Parkhideh, Hamed Tabkhi

Abstract: With the significant growth of advanced high-frequency power converters, on-line monitoring and active reliability assessment of power electronic devices are crucial. This article presents a transformative approach, named Deep Learning Reliability Awareness of Converters at the Edge (Deep RACE), for real-time reliability modeling and prediction of high-frequency MOSFET power electronic converters. Deep RACE offers a holistic solution comprising algorithmic advances and full system integration (from the cloud down to the edge node) to create near real-time reliability awareness. On the algorithm side, this paper proposes a deep learning solution based on stacked LSTM for collective reliability training and inference across MOSFET converters, based on device resistance changes. Deep RACE also proposes an integrative edge-to-cloud solution to offer scalable, decentralized, device-specific reliability monitoring, awareness, and modeling. The MOSFET converters are IoT devices that have been empowered with real-time deep learning processing capabilities at the edge. The proposed Deep RACE solution has been prototyped and implemented by learning from a MOSFET data set provided by NASA. Our experimental results show an average misprediction of $8.9\%$ over five different devices, which is much more accurate than well-known classical approaches (Kalman Filter and Particle Filter). Deep RACE requires only $26$ ms of processing time and $1.87$ W of computing power on the edge IoT device.
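An "average misprediction over several devices" can be computed as a per-device percentage error averaged across devices. The abstract does not define the metric, so the sketch below assumes mean absolute percentage error; the device names and resistance values are hypothetical and are not drawn from the NASA data set.

```python
def mean_abs_pct_error(actual, predicted):
    """Mean absolute percentage error between two equal-length sequences."""
    errors = [abs(a - p) / abs(a) for a, p in zip(actual, predicted)]
    return 100.0 * sum(errors) / len(errors)

# Hypothetical (actual, predicted) resistance traces per device, in ohms.
devices = {
    "dev_a": ([0.050, 0.052, 0.055], [0.048, 0.053, 0.050]),
    "dev_b": ([0.040, 0.044, 0.047], [0.041, 0.042, 0.049]),
}

# Score each device separately, then average the per-device scores.
per_device = {
    name: mean_abs_pct_error(actual, predicted)
    for name, (actual, predicted) in devices.items()
}
average = sum(per_device.values()) / len(per_device)

for name, err in per_device.items():
    print(f"{name}: {err:.2f}%")
print(f"average over devices: {average:.2f}%")
```

Averaging per device rather than pooling all samples keeps a device with a long trace from dominating the score.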

* IEEE Internet of Things Journal, 2019
* © 2019 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.
