Abstract:Federated learning (FL) is a distributed model training paradigm that preserves clients' data privacy. It has gained tremendous attention from both academia and industry. FL hyper-parameters (e.g., the number of selected clients and the number of training passes) significantly affect the training overhead in terms of computation time, transmission time, computation load, and transmission load. However, the current practice of manually selecting FL hyper-parameters imposes a heavy burden on FL practitioners because applications have different training preferences. In this paper, we propose FedTune, an automatic FL hyper-parameter tuning algorithm tailored to applications' diverse system requirements in FL training. FedTune iteratively adjusts FL hyper-parameters during FL training and can be easily integrated into existing FL systems. Through extensive evaluations of FedTune for diverse applications and FL aggregation algorithms, we show that FedTune is lightweight and effective, achieving 8.48%-26.75% system overhead reduction compared to using fixed FL hyper-parameters. This paper assists FL practitioners in designing high-performance FL training solutions. The source code of FedTune is available at https://github.com/DataSysTech/FedTune.
Abstract:As a privacy-preserving paradigm for training Machine Learning (ML) models, Federated Learning (FL) has received tremendous attention from both industry and academia. In a typical FL scenario, clients exhibit significant heterogeneity in terms of data distribution and hardware configurations. Thus, randomly sampling clients in each training round may not fully exploit the local updates from heterogeneous clients, resulting in lower model accuracy, slower convergence rate, degraded fairness, etc. To tackle the FL client heterogeneity problem, various client selection algorithms have been developed, showing promising performance improvement. In this paper, we systematically present recent advances in the emerging field of FL client selection and its challenges and research opportunities. We hope to facilitate practitioners in choosing the most suitable client selection mechanisms for their applications, as well as inspire researchers and newcomers to better understand this exciting research topic.
Abstract:We propose STAF, a Spatio-Temporal Attention Fusion network for few-shot video classification. STAF first extracts coarse-grained spatial and temporal features of videos by applying a 3D Convolution Neural Networks embedding network. It then fine-tunes the extracted features using self-attention and cross-attention networks. Last, STAF applies a lightweight fusion network and a nearest neighbor classifier to classify each query video. To evaluate STAF, we conduct extensive experiments on three benchmarks (UCF101, HMDB51, and Something-Something-V2). The experimental results show that STAF improves state-of-the-art accuracy by a large margin, e.g., STAF increases the five-way one-shot accuracy by 5.3% and 7.0% for UCF101 and HMDB51, respectively.
Abstract:The past decade witnesses a rapid development in the measurement and monitoring technologies for food science. Among these technologies, spectroscopy has been widely used for the analysis of food quality, safety, and nutritional properties. Due to the complexity of food systems and the lack of comprehensive predictive models, rapid and simple measurements to predict complex properties in food systems are largely missing. Machine Learning (ML) has shown great potential to improve classification and prediction of these properties. However, the barriers to collect large datasets for ML applications still persists. In this paper, we explore different approaches of data annotation and model training to improve data efficiency for ML applications. Specifically, we leverage Active Learning (AL) and Semi-Supervised Learning (SSL) and investigate four approaches: baseline passive learning, AL, SSL, and a hybrid of AL and SSL. To evaluate these approaches, we collect two spectroscopy datasets: predicting plasma dosage and detecting foodborne pathogen. Our experimental results show that, compared to the de facto passive learning approach, AL and SSL methods reduce the number of labeled samples by 50% and 25% for each ML application, respectively.
Abstract:Federated learning (FL) is a distributed model training paradigm that preserves clients' data privacy. FL hyper-parameters significantly affect the training overheads in terms of time, computation, and communication. However, the current practice of manually selecting FL hyper-parameters puts a high burden on FL practitioners since various applications prefer different training preferences. In this paper, we propose FedTuning, an automatic FL hyper-parameter tuning algorithm tailored to applications' diverse system requirements of FL training. FedTuning is lightweight and flexible, achieving an average of 41% improvement for different training preferences on time, computation, and communication compared to fixed FL hyper-parameters. FedTuning is available at https://github.com/dtczhl/FedTuning.
Abstract:With the development of the Internet of Things(IoT) and Artificial Intelligence(AI) technologies, human activity recognition has enabled various applications, such as smart homes and assisted living. In this paper, we target a new healthcare application of human activity recognition, early mobility recognition for Intensive Care Unit(ICU) patients. Early mobility is essential for ICU patients who suffer from long-time immobilization. Our system includes accelerometer-based data collection from ICU patients and an AI model to recognize patients' early mobility. To improve the model accuracy and stability, we identify features that are insensitive to sensor orientations and propose a segment voting process that leverages a majority voting strategy to recognize each segment's activity. Our results show that our system improves model accuracy from 77.78\% to 81.86\% and reduces the model instability (standard deviation) from 16.69\% to 6.92\%, compared to the same AI model without our feature engineering and segment voting process.
Abstract:Federated Learning (FL) has received a significant amount of attention in the industry and research community due to its capability of keeping data on local devices. To aggregate the gradients of local models to train the global model, existing works require that the global model and the local models are the same. However, Internet of Things (IoT) devices are inherently diverse regarding computation speed and onboard memory. In this paper, we propose an FL framework targeting the heterogeneity of IoT devices. Specifically, local models are compressed from the global model, and the gradients of the compressed local models are used to update the global model. We conduct preliminary experiments to illustrate that our framework can facilitate the design of IoT-aware FL.
Abstract:Differences in gait patterns of children with Duchenne muscular dystrophy (DMD) and typically developing (TD) peers are visible to the eye, but quantification of those differences outside of the gait laboratory has been elusive. We measured vertical, mediolateral, and anteroposterior acceleration using a waist-worn iPhone accelerometer during ambulation across a typical range of velocities. Six TD and six DMD children from 3-15 years of age underwent seven walking/running tasks, including five 25m walk/run tests at a slow walk to running speeds, a 6-minute walk test (6MWT), and a 100-meter-run/walk (100MRW). We extracted temporospatial clinical gait features (CFs) and applied multiple Artificial Intelligence (AI) tools to differentiate between DMD and TD control children using extracted features and raw data. Extracted CFs showed reduced step length and a greater mediolateral component of total power (TP) consistent with shorter strides and Trendelenberg-like gait commonly observed in DMD. AI methods using CFs and raw data varied ineffectiveness at differentiating between DMD and TD controls at different speeds, with an accuracy of some methods exceeding 91%. We demonstrate that by using AI tools with accelerometer data from a consumer-level smartphone, we can identify DMD gait disturbance in toddlers to early teens.
Abstract:With the rapid development of the internet of things (IoT) and artificial intelligence (AI) technologies, human activity recognition (HAR) has been applied in a variety of domains such as security and surveillance, human-robot interaction, and entertainment. Even though a number of surveys and review papers have been published, there is a lack of HAR overview paper focusing on healthcare applications that use wearable sensors. Therefore, we fill in the gap by presenting this overview paper. In particular, we present our emerging HAR projects for healthcare: identification of human activities for intensive care unit (ICU) patients and Duchenne muscular dystrophy (DMD) patients. Our HAR systems include hardware design to collect sensor data from ICU patients and DMD patients and accurate AI models to recognize patients' activities. This overview paper covers considerations and settings for building a HAR healthcare system, including sensor factors, AI model comparison, and system challenges.
Abstract:As a massive number of the Internet of Things (IoT) devices are deployed, the security and privacy issues in IoT arouse more and more attention. The IoT attacks are causing tremendous loss to the IoT networks and even threatening human safety. Compared to traditional networks, IoT networks have unique characteristics, which make the attack detection more challenging. First, the heterogeneity of platforms, protocols, software, and hardware exposes various vulnerabilities. Second, in addition to the traditional high-rate attacks, the low-rate attacks are also extensively used by IoT attackers to obfuscate the legitimate and malicious traffic. These low-rate attacks are challenging to detect and can persist in the networks. Last, the attackers are evolving to be more intelligent and can dynamically change their attack strategies based on the environment feedback to avoid being detected, making it more challenging for the defender to discover a consistent pattern to identify the attack. In order to adapt to the new characteristics in IoT attacks, we propose a reinforcement learning-based attack detection model that can automatically learn and recognize the transformation of the attack pattern. Therefore, we can continuously detect IoT attacks with less human intervention. In this paper, we explore the crucial features of IoT traffics and utilize the entropy-based metrics to detect both the high-rate and low-rate IoT attacks. Afterward, we leverage the reinforcement learning technique to continuously adjust the attack detection threshold based on the detection feedback, which optimizes the detection and the false alarm rate. We conduct extensive experiments over a real IoT attack dataset and demonstrate the effectiveness of our IoT attack detection framework.