Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Shohreh Deldari

SensorLLM: Aligning Large Language Models with Motion Sensors for Human Activity Recognition

Oct 14, 2024

Zechen Li, Shohreh Deldari, Linyao Chen, Hao Xue, Flora D. Salim

Abstract:In this work, we bridge the gap between wearable sensor technology and personalized AI assistants by enabling Large Language Models (LLMs) to understand time-series tasks like human activity recognition (HAR). Despite the strong reasoning and generalization capabilities of LLMs, leveraging them for sensor data tasks remains largely unexplored. This gap stems from challenges like the lack of semantic context in time-series data, computational limitations, and LLMs' difficulty processing numerical inputs. To address these issues, we introduce SensorLLM, a two-stage framework to unlock LLMs' potential for sensor data tasks. In the Sensor-Language Alignment Stage, we introduce special tokens for each sensor channel and automatically generate trend-descriptive text to align sensor data with textual inputs, enabling SensorLLM to capture numerical changes, channel-specific information, and sensor data of varying lengths-capabilities that existing LLMs typically struggle with, all without the need for human annotations. Next, in Task-Aware Tuning Stage, we refine the model for HAR classification using the frozen LLM and alignment module, achieving performance on par with or surpassing state-of-the-art models. We further demonstrate that SensorLLM evolves into an effective sensor learner, reasoner, and classifier through Sensor-Language Alignment, enabling it to generalize across diverse datasets for HAR tasks. We strongly believe our work lays the stepstone for future time-series and text alignment research, offering a path toward foundation models for sensor data.

Via

Access Paper or Ask Questions

AuditNet: A Conversational AI-based Security Assistant [DEMO]

Jul 19, 2024

Shohreh Deldari, Mohammad Goudarzi, Aditya Joshi, Arash Shaghaghi, Simon Finn, Flora D. Salim, Sanjay Jha

Abstract:In the age of information overload, professionals across various fields face the challenge of navigating vast amounts of documentation and ever-evolving standards. Ensuring compliance with standards, regulations, and contractual obligations is a critical yet complex task across various professional fields. We propose a versatile conversational AI assistant framework designed to facilitate compliance checking on the go, in diverse domains, including but not limited to network infrastructure, legal contracts, educational standards, environmental regulations, and government policies. By leveraging retrieval-augmented generation using large language models, our framework automates the review, indexing, and retrieval of relevant, context-aware information, streamlining the process of verifying adherence to established guidelines and requirements. This AI assistant not only reduces the manual effort involved in compliance checks but also enhances accuracy and efficiency, supporting professionals in maintaining high standards of practice and ensuring regulatory compliance in their respective fields. We propose and demonstrate AuditNet, the first conversational AI security assistant designed to assist IoT network security experts by providing instant access to security standards, policies, and regulations.

Via

Access Paper or Ask Questions

ViLCo-Bench: VIdeo Language COntinual learning Benchmark

Jun 19, 2024

Tianqi Tang, Shohreh Deldari, Hao Xue, Celso De Melo, Flora D. Salim

Figure 1 for ViLCo-Bench: VIdeo Language COntinual learning Benchmark

Figure 2 for ViLCo-Bench: VIdeo Language COntinual learning Benchmark

Figure 3 for ViLCo-Bench: VIdeo Language COntinual learning Benchmark

Figure 4 for ViLCo-Bench: VIdeo Language COntinual learning Benchmark

Abstract:Video language continual learning involves continuously adapting to information from video and text inputs, enhancing a model's ability to handle new tasks while retaining prior knowledge. This field is a relatively under-explored area, and establishing appropriate datasets is crucial for facilitating communication and research in this field. In this study, we present the first dedicated benchmark, ViLCo-Bench, designed to evaluate continual learning models across a range of video-text tasks. The dataset comprises ten-minute-long videos and corresponding language queries collected from publicly available datasets. Additionally, we introduce a novel memory-efficient framework that incorporates self-supervised learning and mimics long-term and short-term memory effects. This framework addresses challenges including memory complexity from long video clips, natural language complexity from open queries, and text-video misalignment. We posit that ViLCo-Bench, with greater complexity compared to existing continual learning benchmarks, would serve as a critical tool for exploring the video-language domain, extending beyond conventional class-incremental tasks, and addressing complex and limited annotation issues. The curated data, evaluations, and our novel method are available at https://github.com/cruiseresearchgroup/ViLCo .

* 14 pages, 4 figures, 8 tables, under review

Via

Access Paper or Ask Questions

A collection of the accepted papers for the Human-Centric Representation Learning workshop at AAAI 2024

Mar 14, 2024

Dimitris Spathis, Aaqib Saeed, Ali Etemad, Sana Tonekaboni, Stefanos Laskaridis, Shohreh Deldari, Chi Ian Tang, Patrick Schwab, Shyam Tailor

Abstract:This non-archival index is not complete, as some accepted papers chose to opt-out of inclusion. The list of all accepted papers is available on the workshop website.

Via

Access Paper or Ask Questions

Latent Masking for Multimodal Self-supervised Learning in Health Timeseries

Jul 31, 2023

Shohreh Deldari, Dimitris Spathis, Mohammad Malekzadeh, Fahim Kawsar, Flora Salim, Akhil Mathur

Abstract:Limited availability of labeled data for machine learning on biomedical time-series hampers progress in the field. Self-supervised learning (SSL) is a promising approach to learning data representations without labels. However, current SSL methods require expensive computations for negative pairs and are designed for single modalities, limiting their versatility. To overcome these limitations, we introduce CroSSL (Cross-modal SSL). CroSSL introduces two novel concepts: masking intermediate embeddings from modality-specific encoders and aggregating them into a global embedding using a cross-modal aggregator. This enables the handling of missing modalities and end-to-end learning of cross-modal patterns without prior data preprocessing or time-consuming negative-pair sampling. We evaluate CroSSL on various multimodal time-series benchmarks, including both medical-grade and consumer biosignals. Our results demonstrate superior performance compared to previous SSL techniques and supervised benchmarks with minimal labeled data. We additionally analyze the impact of different masking ratios and strategies and assess the robustness of the learned representations to missing modalities. Overall, our work achieves state-of-the-art performance while highlighting the benefits of masking latent embeddings for cross-modal learning in temporal health data.

* Presented at ML4MHD workshop at ICML2023

Via

Access Paper or Ask Questions

Self-supervised Activity Representation Learning with Incremental Data: An Empirical Study

May 01, 2023

Jason Liu, Shohreh Deldari, Hao Xue, Van Nguyen, Flora D. Salim

Figure 1 for Self-supervised Activity Representation Learning with Incremental Data: An Empirical Study

Figure 2 for Self-supervised Activity Representation Learning with Incremental Data: An Empirical Study

Figure 3 for Self-supervised Activity Representation Learning with Incremental Data: An Empirical Study

Figure 4 for Self-supervised Activity Representation Learning with Incremental Data: An Empirical Study

Abstract:In the context of mobile sensing environments, various sensors on mobile devices continually generate a vast amount of data. Analyzing this ever-increasing data presents several challenges, including limited access to annotated data and a constantly changing environment. Recent advancements in self-supervised learning have been utilized as a pre-training step to enhance the performance of conventional supervised models to address the absence of labelled datasets. This research examines the impact of using a self-supervised representation learning model for time series classification tasks in which data is incrementally available. We proposed and evaluated a workflow in which a model learns to extract informative features using a corpus of unlabeled time series data and then conducts classification on labelled data using features extracted by the model. We analyzed the effect of varying the size, distribution, and source of the unlabeled data on the final classification performance across four public datasets, including various types of sensors in diverse applications.

* 6 pages, accepted in the 24th IEEE International Conference on Mobile Data Management (MDM2023)

Via

Access Paper or Ask Questions

COCOA: Cross Modality Contrastive Learning for Sensor Data

Aug 03, 2022

Shohreh Deldari, Hao Xue, Aaqib Saeed, Daniel V. Smith, Flora D. Salim

Figure 1 for COCOA: Cross Modality Contrastive Learning for Sensor Data

Figure 2 for COCOA: Cross Modality Contrastive Learning for Sensor Data

Figure 3 for COCOA: Cross Modality Contrastive Learning for Sensor Data

Figure 4 for COCOA: Cross Modality Contrastive Learning for Sensor Data

Abstract:Self-Supervised Learning (SSL) is a new paradigm for learning discriminative representations without labelled data and has reached comparable or even state-of-the-art results in comparison to supervised counterparts. Contrastive Learning (CL) is one of the most well-known approaches in SSL that attempts to learn general, informative representations of data. CL methods have been mostly developed for applications in computer vision and natural language processing where only a single sensor modality is used. A majority of pervasive computing applications, however, exploit data from a range of different sensor modalities. While existing CL methods are limited to learning from one or two data sources, we propose COCOA (Cross mOdality COntrastive leArning), a self-supervised model that employs a novel objective function to learn quality representations from multisensor data by computing the cross-correlation between different data modalities and minimizing the similarity between irrelevant instances. We evaluate the effectiveness of COCOA against eight recently introduced state-of-the-art self-supervised models, and two supervised baselines across five public datasets. We show that COCOA achieves superior classification performance to all other approaches. Also, COCOA is far more label-efficient than the other baselines including the fully supervised model using only one-tenth of available labelled data.

* 27 pages, 10 figures, 6 tables, Accepted with minor revision at IMWUT Vol. 6 No. 3

Via

Access Paper or Ask Questions

Beyond Just Vision: A Review on Self-Supervised Representation Learning on Multimodal and Temporal Data

Jun 08, 2022

Shohreh Deldari, Hao Xue, Aaqib Saeed, Jiayuan He, Daniel V. Smith, Flora D. Salim

Figure 1 for Beyond Just Vision: A Review on Self-Supervised Representation Learning on Multimodal and Temporal Data

Figure 2 for Beyond Just Vision: A Review on Self-Supervised Representation Learning on Multimodal and Temporal Data

Figure 3 for Beyond Just Vision: A Review on Self-Supervised Representation Learning on Multimodal and Temporal Data

Figure 4 for Beyond Just Vision: A Review on Self-Supervised Representation Learning on Multimodal and Temporal Data

Abstract:Recently, Self-Supervised Representation Learning (SSRL) has attracted much attention in the field of computer vision, speech, natural language processing (NLP), and recently, with other types of modalities, including time series from sensors. The popularity of self-supervised learning is driven by the fact that traditional models typically require a huge amount of well-annotated data for training. Acquiring annotated data can be a difficult and costly process. Self-supervised methods have been introduced to improve the efficiency of training data through discriminative pre-training of models using supervisory signals that have been freely obtained from the raw data. Unlike existing reviews of SSRL that have pre-dominately focused upon methods in the fields of CV or NLP for a single modality, we aim to provide the first comprehensive review of multimodal self-supervised learning methods for temporal data. To this end, we 1) provide a comprehensive categorization of existing SSRL methods, 2) introduce a generic pipeline by defining the key components of a SSRL framework, 3) compare existing models in terms of their objective function, network architecture and potential applications, and 4) review existing multimodal techniques in each category and various modalities. Finally, we present existing weaknesses and future opportunities. We believe our work develops a perspective on the requirements of SSRL in domains that utilise multimodal and/or temporal data

* 36 pages, 5 figures, 9 tables, Survey paper

Via

Access Paper or Ask Questions

Time-series Change Point Detection with Self-Supervised Contrastive Predictive Coding

Dec 01, 2020

Shohreh Deldari, Daniel V. Smith, Hao Xue, Flora D. Salim

Figure 1 for Time-series Change Point Detection with Self-Supervised Contrastive Predictive Coding

Figure 2 for Time-series Change Point Detection with Self-Supervised Contrastive Predictive Coding

Figure 3 for Time-series Change Point Detection with Self-Supervised Contrastive Predictive Coding

Figure 4 for Time-series Change Point Detection with Self-Supervised Contrastive Predictive Coding

Abstract:Change Point Detection techniques aim to capture changes in trends and sequences in time-series data to describe the underlying behaviour of the system. Detecting changes and anomalies in the web services, the trend of applications usage can provide valuable insight towards the system, however, many existing approaches are done in a supervised manner, requiring well-labelled data. As the amount of data produced and captured by sensors are growing rapidly, it is getting harder and even impossible to annotate the data. Therefore, coming up with a self-supervised solution is a necessity these days. In this work, we propose TSCP2 a novel self-supervised technique for temporal change point detection, based on representation learning with Temporal Convolutional Network (TCN). To the best of our knowledge, our proposed method is the first method which employs Contrastive Learning for prediction with the aim change point detection. Through extensive evaluations, we demonstrate that our method outperforms multiple state-of-the-art change point detection and anomaly detection baselines, including those adopting either unsupervised or semi-supervised approach. TSCP2 is shown to improve both non-Deep learning- and Deep learning-based methods by 0.28 and 0.12 in terms of average F1-score across three datasets.

* 12 pages

Via

Access Paper or Ask Questions

ESPRESSO: Entropy and ShaPe awaRe timE-Series SegmentatiOn for processing heterogeneous sensor data

Jul 24, 2020

Shohreh Deldari, Daniel V. Smith, Amin Sadri, Flora D. Salim

Figure 1 for ESPRESSO: Entropy and ShaPe awaRe timE-Series SegmentatiOn for processing heterogeneous sensor data

Figure 2 for ESPRESSO: Entropy and ShaPe awaRe timE-Series SegmentatiOn for processing heterogeneous sensor data

Figure 3 for ESPRESSO: Entropy and ShaPe awaRe timE-Series SegmentatiOn for processing heterogeneous sensor data

Figure 4 for ESPRESSO: Entropy and ShaPe awaRe timE-Series SegmentatiOn for processing heterogeneous sensor data

Abstract:Extracting informative and meaningful temporal segments from high-dimensional wearable sensor data, smart devices, or IoT data is a vital preprocessing step in applications such as Human Activity Recognition (HAR), trajectory prediction, gesture recognition, and lifelogging. In this paper, we propose ESPRESSO (Entropy and ShaPe awaRe timE-Series SegmentatiOn), a hybrid segmentation model for multi-dimensional time-series that is formulated to exploit the entropy and temporal shape properties of time-series. ESPRESSO differs from existing methods that focus upon particular statistical or temporal properties of time-series exclusively. As part of model development, a novel temporal representation of time-series $WCAC$ was introduced along with a greedy search approach that estimate segments based upon the entropy metric. ESPRESSO was shown to offer superior performance to four state-of-the-art methods across seven public datasets of wearable and wear-free sensing. In addition, we undertake a deeper investigation of these datasets to understand how ESPRESSO and its constituent methods perform with respect to different dataset characteristics. Finally, we provide two interesting case-studies to show how applying ESPRESSO can assist in inferring daily activity routines and the emotional state of humans.

* 23 pages, 11 figures, accepted at IMWUT Volume(4) issue(3)

Via

Access Paper or Ask Questions