Riley Tavassoli

Active shooter detection and robust tracking utilizing supplemental synthetic data

Sep 06, 2023

Joshua R. Waite, Jiale Feng, Riley Tavassoli, Laura Harris, Sin Yong Tan, Subhadeep Chakraborty, Soumik Sarkar

Abstract: The increasing concern surrounding gun violence in the United States has led to a focus on developing systems to improve public safety. One approach to developing such a system is to detect and track shooters, which would help prevent or mitigate the impact of violent incidents. In this paper, we propose detecting shooters as a whole, rather than just guns, which would allow for improved tracking robustness, as obscuring the gun would no longer cause the system to lose sight of the threat. However, publicly available data on shooters is much more limited and challenging to create than a gun dataset alone. Therefore, we explore the use of domain randomization and transfer learning to improve the effectiveness of training with synthetic data obtained from Unreal Engine environments. This enables the model to be trained on a wider range of data, increasing its ability to generalize to different situations. Using these techniques with YOLOv8 and Deep OC-SORT, we implemented an initial version of a shooter tracking system capable of running on edge hardware, including both a Raspberry Pi and a Jetson Nano.

* 11 pages, 6 figures

Expanding Frozen Vision-Language Models without Retraining: Towards Improved Robot Perception

Aug 31, 2023

Riley Tavassoli, Mani Amani, Reza Akhavian

Abstract: Vision-language models (VLMs) have shown powerful capabilities in visual question answering and reasoning tasks by combining visual representations with the abstract skill set large language models (LLMs) learn during pretraining. Vision, while the most popular modality to augment LLMs with, is only one representation of a scene. In human-robot interaction scenarios, robot perception requires accurate scene understanding by the robot. In this paper, we define and demonstrate a method of aligning the embedding spaces of different modalities (in this case, inertial measurement unit (IMU) data) to the vision embedding space through a combination of supervised and contrastive training, enabling the VLM to understand and reason about these additional modalities without retraining. We give the model IMU embeddings directly, rather than using a separate human activity recognition model whose output is fed into the prompt, to allow for any nonlinear interactions between the query, image, and IMU signal that would be lost by mapping the IMU data to a discrete activity label. Further, we demonstrate our methodology's efficacy through experiments involving human activity recognition using IMU data and visual inputs. Our results show that using multiple modalities as input improves the VLM's scene understanding and enhances its overall performance in various tasks, thus paving the way for more versatile and capable language models in multi-modal contexts.

* Preprint submitted to Information Fusion
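
The contrastive part of the alignment described in the abstract above can be illustrated with a small sketch. The abstract does not state which loss the authors use; the code below assumes a symmetric InfoNCE-style objective over paired IMU and vision embeddings, and every name in it (`symmetric_info_nce`, `temperature`, the toy vectors) is hypothetical and chosen only for illustration. It is not the paper's implementation.

```python
import math


def l2_normalize(vec):
    """Scale a non-zero vector to unit length (assumes the norm is not zero)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def cosine_logits(a_batch, b_batch, temperature):
    """Pairwise cosine similarities between two batches, divided by temperature."""
    a_unit = [l2_normalize(v) for v in a_batch]
    b_unit = [l2_normalize(v) for v in b_batch]
    return [
        [sum(x * y for x, y in zip(u, w)) / temperature for w in b_unit]
        for u in a_unit
    ]


def diagonal_cross_entropy(logits):
    """Mean cross-entropy where row i's correct class is column i."""
    total = 0.0
    for i, row in enumerate(logits):
        row_max = max(row)  # subtract the max for numerical stability
        log_sum_exp = row_max + math.log(sum(math.exp(z - row_max) for z in row))
        total += log_sum_exp - row[i]
    return total / len(logits)


def symmetric_info_nce(imu_emb, vision_emb, temperature=0.07):
    """Assumed objective: average of IMU-to-vision and vision-to-IMU losses.

    Row i of imu_emb and row i of vision_emb are treated as a matched pair;
    all other rows in the batch act as negatives.
    """
    logits = cosine_logits(imu_emb, vision_emb, temperature)
    transposed = [list(col) for col in zip(*logits)]
    return 0.5 * (diagonal_cross_entropy(logits) + diagonal_cross_entropy(transposed))


# Toy example: two vision embeddings and two candidate sets of IMU embeddings.
vision = [[1.0, 0.0], [0.0, 1.0]]
imu_aligned = [[0.9, 0.1], [0.1, 0.9]]   # each IMU vector points near its pair
imu_swapped = [[0.1, 0.9], [0.9, 0.1]]   # pairs are mismatched

loss_aligned = symmetric_info_nce(imu_aligned, vision)
loss_swapped = symmetric_info_nce(imu_swapped, vision)
```

In this toy setup the aligned pairs yield a much smaller loss than the swapped pairs, which is the signal that would pull an IMU encoder's outputs toward the vision embedding space during training. The supervised component mentioned in the abstract is not covered by this sketch.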

Robust Activity Recognition for Adaptive Worker-Robot Interaction using Transfer Learning

Aug 28, 2023

Farid Shahnavaz, Riley Tavassoli, Reza Akhavian

Abstract: Human activity recognition (HAR) using machine learning has shown tremendous promise in detecting construction workers' activities. HAR has many applications in human-robot interaction research to enable robots' understanding of human counterparts' activities. However, many existing HAR approaches lack robustness, generalizability, and adaptability. This paper proposes a transfer learning methodology for activity recognition of construction workers that requires orders of magnitude less data and compute time for comparable or better classification accuracy. The developed algorithm transfers features from a model pre-trained by the original authors and fine-tunes them for the downstream task of activity recognition in construction. The model was pre-trained on Kinetics-400, a large-scale video-based human activity recognition dataset with 400 distinct classes. The model was fine-tuned and tested using videos captured from manual material handling (MMH) activities found on YouTube. Results indicate that the fine-tuned model can recognize distinct MMH tasks in a robust and adaptive manner, which is crucial for the widespread deployment of collaborative robots in construction.

* 2023 ASCE International Conference on Computing in Civil Engineering (I3CE)
