Abstract:As large language models (LLMs) increasingly shape the AI landscape, fine-tuning pretrained models has become more popular than in the pre-LLM era for achieving optimal performance in domain-specific tasks. However, pretrained LLMs such as ChatGPT are periodically evolved (i.e., model parameters are frequently updated), making it challenging for downstream users with limited resources to keep up with fine-tuning the newest LLMs for their domain applications. Even though fine-tuning costs have been reduced thanks to innovations in parameter-efficient fine-tuning such as LoRA, not all downstream users have adequate computing resources for frequent personalization. Moreover, access to fine-tuning datasets, particularly in sensitive domains such as healthcare, could be time-restricted, making it crucial to retain the knowledge encoded in earlier fine-tuned rounds for future adaptation. In this paper, we present PortLLM, a training-free framework that (i) creates an initial lightweight model update patch to capture domain-specific knowledge, and (ii) allows this patch to be plugged seamlessly into an evolved LLM for continual personalization at minimal cost. Our extensive experiments cover seven representative datasets, from easier question-answering tasks {BoolQ, SST2} to harder reasoning tasks {WinoGrande, GSM8K}, and models including {Mistral-7B, Llama2, Llama3.1, and Gemma2}, validating the portability of our designed model patches and showcasing the effectiveness of our proposed framework. For instance, PortLLM achieves comparable performance to LoRA fine-tuning with reductions of up to 12.2x in GPU memory usage. Finally, we provide theoretical justifications to understand the portability of our model update patches, offering new insights into the theoretical dimension of LLM personalization.
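The idea of plugging a patch into an evolved model can be illustrated with a minimal sketch. It assumes, purely for illustration, that the patch is a LoRA-style additive low-rank update (the product of two small matrices) that is added to the evolved model's weights without further training; the abstract does not spell out the patch's form, and the names `matmul`, `add`, and `port_patch` are hypothetical.

```python
from typing import List

Matrix = List[List[float]]


def matmul(x: Matrix, y: Matrix) -> Matrix:
    """Plain matrix product of two nested-list matrices."""
    return [[sum(p * q for p, q in zip(row, col)) for col in zip(*y)] for row in x]


def add(x: Matrix, y: Matrix) -> Matrix:
    """Element-wise sum of two matrices of the same shape."""
    return [[p + q for p, q in zip(rx, ry)] for rx, ry in zip(x, y)]


def port_patch(w_evolved: Matrix, b: Matrix, a: Matrix) -> Matrix:
    """Apply a low-rank patch (B @ A), assumed to come from fine-tuning an
    older base model, to the weights of an evolved model. No training step."""
    return add(w_evolved, matmul(b, a))


# Toy example: a 2x2 weight matrix and a rank-1 patch.
w_evolved = [[1.0, 0.0], [0.0, 1.0]]
b = [[1.0], [2.0]]          # shape (2, 1)
a = [[0.5, -1.0]]           # shape (1, 2)
w_personalized = port_patch(w_evolved, b, a)
```

In this toy setting the patch stores 4 numbers instead of a full matrix; for realistic layer sizes and small ranks, the factored form is far smaller than the weights it modifies.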
Abstract:Edge devices, with their widely varying capabilities, support a diverse range of edge AI models. This raises the question: how does an edge model differ from a high-accuracy (base) model for the same task? We introduce XDELTA, a novel explainable AI tool that explains differences between a high-accuracy base model and a computationally efficient but lower-accuracy edge model. To achieve this, we propose a learning-based approach to characterize the model difference, named the DELTA network, which complements the feature representation capability of the edge network in a compact form. To construct DELTA, we propose a sparsity optimization framework that extracts the essence of the base model to ensure compactness and sufficient feature representation capability of DELTA, and implement a negative correlation learning approach to ensure it complements the edge model. We conduct a comprehensive evaluation to test XDELTA's ability to explain model discrepancies, using over 1.2 million images and 24 models, and assessing real-world deployments with six participants. XDELTA excels in explaining differences between base and edge models (arbitrary pairs as well as compressed base models) through geometric and concept-level analysis, proving effective in real-world applications.
Abstract:Recent research has demonstrated that physiological signals can be used to infer users' emotional and attention responses. This presents an opportunity to leverage the widely available physiological sensors in smartwatches to detect real-time emotional cues in users, such as stress and excitement. In this paper, we introduce SensEmo, a smartwatch-based system designed for affective learning. SensEmo utilizes data from multiple physiological sensors, including heart rate and galvanic skin response, to recognize a student's motivation and concentration levels during class. This recognition is facilitated by a personalized emotion recognition model that predicts emotional states based on degrees of valence and arousal. With real-time emotion and attention feedback from students, we design a Markov decision process-based algorithm to enhance student learning effectiveness and experience by offering suggestions to the teacher regarding teaching content and pacing. We evaluate SensEmo with 22 participants in real-world classroom environments. Evaluation results show that SensEmo recognizes student emotion with an average accuracy of 88.9%. More importantly, SensEmo helps students achieve better online learning outcomes, e.g., an average of 40.0% higher grades in quizzes, compared with traditional learning without student emotional feedback.
Abstract:This paper examines the application of WiFi signals for real-world monitoring of daily activities in home healthcare scenarios. While the state of the art in WiFi-based activity recognition is promising in lab environments, challenges arise in real-world settings due to environmental, subject, and system configuration variables, which affect accuracy and adaptability. The research involved deploying systems in various settings and analyzing data shifts. It aims to guide realistic development of robust, context-aware WiFi sensing systems for elderly care. The findings suggest a shift in WiFi-based activity sensing, bridging the gap between academic research and practical applications and enhancing quality of life through technology.
Abstract:A fundamental problem of every intermittently-powered sensing system is that signals acquired by these systems over a long period of time are also intermittent. As a consequence, these systems fail to capture parts of a longer-duration event that spans multiple charge-discharge cycles of the capacitor that stores the harvested energy. From an application's perspective, this is viewed as sporadic bursts of missing values in the input data -- which may not be recoverable using statistical interpolation or imputation methods. In this paper, we study this problem in the context of an intermittent audio classification system and design an end-to-end system -- SoundSieve -- that is capable of accurately classifying audio events that span multiple on-off cycles of the intermittent system. SoundSieve employs an offline audio analyzer that learns to identify and predict important segments of an audio clip that must be sampled to ensure accurate classification of the audio. At runtime, SoundSieve employs a lightweight, energy- and content-aware audio sampler that decides when the system should wake up to capture the next chunk of audio, and a lightweight, intermittence-aware audio classifier that performs imputation and on-device inference. Through extensive evaluations using popular audio datasets as well as real systems, we demonstrate that SoundSieve yields 5%--30% more accurate inference results than the state-of-the-art.
Abstract:With the rise of ride-hailing services, people increasingly rely on shared mobility (e.g., Uber, Lyft) drivers for transportation. However, such drivers and riders have difficulty finding each other in urban areas where GPS signals are blocked by skyscrapers, in crowded environments (e.g., in stadiums, airports, and bars), at night, and in bad weather. This wastes their time, creates a bad user experience, and causes more CO2 emissions due to idle driving. In this work, we explore the potential of Wi-Fi to help drivers determine which side of the street the rider is on. Our proposed system, CarFi, uses Wi-Fi CSI from two antennas placed inside a moving vehicle and a data-driven technique to determine the street side of the rider. Using real-world data collected in realistic and challenging settings, where the signal is blocked by other people and parked cars, we show that CarFi is 95.44% accurate in rider-side determination in both line of sight (LoS) and non-line of sight (nLoS) conditions, and can run on an embedded GPU in real time.
Abstract:We propose SmartON, a batteryless system that learns to wake up proactively at the right moment in order to detect events of interest. It does so by adapting the duty cycle to match the distribution of event arrival times under the constraints of harvested energy. While existing energy harvesting systems either wake up periodically at a fixed rate to sense and process the data, or wake up only in accordance with the availability of the energy source, SmartON employs a three-phase learning framework to learn the energy harvesting pattern as well as the pattern of events at run-time, and uses that knowledge to wake itself up when events are most likely to occur. The three-phase learning framework enables rapid adaptation to environmental changes in both the short and long term. By remaining asleep more often than a CTID (charging-then-immediate-discharging) wake-up system and adapting to the event pattern, SmartON is able to reduce energy waste, increase energy efficiency, and capture more events. To realize SmartON, we have developed a dedicated hardware platform whose power management module activates capacitors on the fly to dynamically increase its storage capacitance. We conduct both simulation-driven and real-system experiments to demonstrate that SmartON captures 1X--7X more events and is 8X--17X more energy-efficient than a CTID system.
Abstract:The lack of adequate training data is one of the major hurdles in WiFi-based activity recognition systems. In this paper, we propose Wi-Fringe, a WiFi CSI-based device-free human gesture recognition system that recognizes named gestures, i.e., activities and gestures that have a semantically meaningful name in the English language, as opposed to arbitrary free-form gestures. Given a list of activities (only their names in English text), along with zero or more training examples (WiFi CSI values) per activity, Wi-Fringe is able to detect all activities at runtime. In other words, a subset of the activities that Wi-Fringe detects does not require any training examples at all.
Abstract:In this paper, we propose a time-, energy-, and accuracy-aware scheduling algorithm for intermittently powered systems that execute compressed deep learning tasks that are suitable for MCUs and are powered solely by harvested energy. The sporadic nature of harvested energy, resource constraints of the embedded platform, and the computational demand of deep neural networks (even though compressed) pose a unique and challenging real-time scheduling problem for which no solutions have been proposed in the literature. We empirically study the problem and model the energy harvesting pattern as well as the trade-off between the accuracy and execution of a deep neural network. We develop an imprecise computing-based scheduling algorithm that improves the schedulability of deep learning tasks on intermittently powered systems. We also exploit the fact that the computation a deep learning model needs depends on the data sample, and propose early termination of deep neural networks. We further propose a semi-supervised machine learning model that exploits the deep features and contributes to determining the imprecise partition of a task. We implement our proposed algorithms on two different datasets and real-life scenarios and show that they increase the accuracy by 9.45%-3.19%, decrease the execution time by 14%, and successfully schedule 33%-12% more tasks.
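As a rough illustration of how imprecise computing and early termination can interact under an energy budget, the sketch below splits a task into mandatory and optional units and stops once the result is confident enough or energy runs out. This is a generic sketch under assumed names (`Unit`, `run_task`, the confidence threshold, and the energy units are all hypothetical), not the scheduling algorithm the abstract describes.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Unit:
    energy_cost: float  # energy needed to execute this unit (arbitrary units)
    confidence: float   # classifier confidence available once this unit finishes


def run_task(units: List[Unit], mandatory: int, budget: float,
             threshold: float) -> Tuple[int, float]:
    """Execute units in order; return (units_executed, final_confidence).

    The first `mandatory` units always run if energy allows. Later (optional)
    units are skipped once confidence reaches `threshold`.
    """
    executed, confidence = 0, 0.0
    for index, unit in enumerate(units):
        is_optional = index >= mandatory
        if is_optional and confidence >= threshold:
            break  # early termination: the result is already confident enough
        if unit.energy_cost > budget:
            break  # not enough energy left for this unit
        budget -= unit.energy_cost
        executed, confidence = index + 1, unit.confidence
    return executed, confidence


units = [Unit(2.0, 0.5), Unit(2.0, 0.7), Unit(2.0, 0.92), Unit(2.0, 0.97)]
ample = run_task(units, mandatory=2, budget=10.0, threshold=0.9)
scarce = run_task(units, mandatory=2, budget=5.0, threshold=0.9)
```

With ample energy the task stops after the third unit because the confidence threshold is met; with scarce energy it stops after the mandatory part. A caller can treat `units_executed < mandatory` as a failed task.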
Abstract:In this paper, we introduce the concept of intermittent learning, which enables energy harvested computing platforms to execute certain classes of machine learning tasks. We identify unique challenges to intermittent learning relating to the data and application semantics of machine learning tasks. To address these challenges, we devise an algorithm that determines a sequence of actions to achieve the desired learning objective under tight energy constraints. We further increase the energy efficiency of the system by proposing three heuristics that help an intermittent learner decide whether to learn or discard training examples at run-time. In order to provide a probabilistic bound on the completion of a learning task, we perform an energy event-based analysis that helps us analyze intermittent learning systems where the uncertainty lies in both energy and training example generation. We implement and evaluate three intermittent learning applications that learn the air quality, human presence, and vibration using solar, RF, and kinetic energy harvesters, respectively. We demonstrate that the proposed framework improves the energy efficiency of a learner by up to 100% and cuts down the number of learning examples by up to 50% when compared to state-of-the-art intermittent computing systems without our framework.