Weitao Xu

Beyond Models! Explainable Data Valuation and Metric Adaption for Recommendation

Feb 12, 2025

Renqi Jia, Xiaokun Zhang, Bowei He, Qiannan Zhu, Weitao Xu, Jiehao Chen, Chen Ma

Abstract: User behavior records serve as the foundation for recommender systems. While behavior data is easy to acquire, it often suffers from varying quality. Current methods employ data valuation to discern high-quality data from low-quality data. However, they tend to employ a black-box design, lacking transparency and interpretability. Besides, they are typically tailored to specific evaluation metrics, leading to limited generality across various tasks. To overcome these issues, we propose an explainable and versatile framework, DVR, which can enhance the efficiency of data utilization tailored to any requirements of the model architectures and evaluation metrics. For explainable data valuation, a data valuator is presented to evaluate the data quality via calculating its Shapley value from the game-theoretic perspective, ensuring robust mathematical properties and reliability. In order to accommodate various evaluation metrics, including differentiable and non-differentiable ones, a metric adapter is devised based on reinforcement learning, where a metric is treated as the reinforcement reward that guides model optimization. Extensive experiments conducted on various benchmarks verify that our framework can improve the performance of current recommendation algorithms on various metrics including ranking accuracy, diversity, and fairness. Specifically, our framework achieves up to 34.7% improvement over existing methods in terms of the representative NDCG metric. The code is available at https://github.com/renqii/DVR.
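To make the Shapley-value idea in the abstract above concrete, here is a minimal sketch of exact Shapley computation over a handful of data records. It is not the DVR implementation: the record names, the `QUALITY` table, and `toy_utility` are hypothetical stand-ins for "model performance when trained on this subset", and exact enumeration over all orderings is only feasible for very small sets.

```python
from itertools import permutations
from math import factorial


def shapley_values(players, utility):
    """Exact Shapley values: average each player's marginal
    contribution over every possible ordering of the players."""
    totals = {p: 0.0 for p in players}
    for order in permutations(players):
        coalition = frozenset()
        previous = utility(coalition)
        for p in order:
            coalition = coalition | {p}
            current = utility(coalition)
            totals[p] += current - previous  # marginal contribution of p
            previous = current
    n_orders = factorial(len(players))
    return {p: total / n_orders for p, total in totals.items()}


# Hypothetical per-record quality scores (assumption, for illustration only).
QUALITY = {"a": 0.5, "b": 0.25, "c": -0.25}


def toy_utility(subset):
    """Hypothetical utility of a data subset: summed quality, plus a
    bonus when records "a" and "b" are used together."""
    score = sum(QUALITY[p] for p in subset)
    if "a" in subset and "b" in subset:
        score += 0.5
    return score


values = shapley_values(list(QUALITY), toy_utility)
for record, value in sorted(values.items()):
    print(record, round(value, 4))
```

The values sum to the utility of the full set minus the utility of the empty set (the efficiency property), which is one of the mathematical guarantees that makes Shapley values attractive for valuation. The synergy bonus is split equally between "a" and "b".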

Communication-Efficient and Privacy-Adaptable Mechanism for Federated Learning

Jan 21, 2025

Chih Wei Ling, Youqi Wu, Jiande Sun, Cheuk Ting Li, Linqi Song, Weitao Xu

Abstract: Training machine learning models on decentralized private data via federated learning (FL) poses two key challenges: communication efficiency and privacy protection. In this work, we address these challenges within the trusted aggregator model by introducing a novel approach called the Communication-Efficient and Privacy-Adaptable Mechanism (CEPAM), which achieves both objectives simultaneously. In particular, CEPAM leverages the rejection-sampled universal quantizer (RSUQ), a randomized vector quantizer whose resulting distortion is equivalent to a prescribed noise, such as Gaussian or Laplace noise, enabling joint differential privacy and compression. Moreover, we analyze the trade-offs among user privacy, global utility, and transmission rate of CEPAM by defining appropriate metrics for FL with differential privacy and compression. CEPAM provides the additional benefit of privacy adaptability, allowing clients and the server to customize privacy protection based on required accuracy and protection. We assess CEPAM's utility performance using the MNIST dataset, demonstrating that CEPAM surpasses baseline models in terms of learning accuracy.
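The claim that a quantizer's distortion can behave like a prescribed noise is easiest to see in a simpler, related construction: scalar subtractive dithered quantization, where the reconstruction error is uniform on half a step either side of zero regardless of the input. The sketch below shows only that building block. It is not RSUQ (it has no rejection sampling and produces uniform rather than Gaussian or Laplace noise), and the function names and the `step` parameter are assumptions chosen for illustration.

```python
import random


def dithered_quantize(x, step, rng):
    """Encode x as an integer index using a random dither.
    The dither is assumed to be shared with the decoder
    (for example via a common random seed)."""
    dither = rng.uniform(-step / 2, step / 2)
    index = round((x + dither) / step)  # the only value that must be sent
    return index, dither


def dithered_reconstruct(index, dither, step):
    """Decode by scaling the index back and subtracting the shared dither."""
    return index * step - dither


rng = random.Random(0)
step = 0.5
errors = []
for _ in range(1000):
    x = rng.uniform(-10.0, 10.0)
    index, dither = dithered_quantize(x, step, rng)
    errors.append(dithered_reconstruct(index, dither, step) - x)

# The error never exceeds half a quantization step in magnitude.
print(max(abs(e) for e in errors) <= step / 2 + 1e-12)
```

Because the error distribution does not depend on `x`, the quantization noise can be analyzed like additive noise, which is the property that lets compression and a noise-based privacy mechanism be treated jointly.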

* 18 pages, 3 figures, Submitted to 2025 IEEE International Symposium on Information Theory

EmbedGenius: Towards Automated Software Development for Generic Embedded IoT Systems

Dec 12, 2024

Huanqi Yang, Mingzhe Li, Mingda Han, Zhenjiang Li, Weitao Xu

Abstract: Embedded IoT system development is crucial for enabling seamless connectivity and functionality across a wide range of applications. However, such a complex process requires cross-domain knowledge of hardware and software and hence often necessitates direct developer involvement, making it labor-intensive, time-consuming, and error-prone. To address this challenge, this paper introduces EmbedGenius, the first fully automated software development platform for general-purpose embedded IoT systems. The key idea is to leverage the reasoning ability of Large Language Models (LLMs) and embedded system expertise to automate the hardware-in-the-loop development process. The main methods include a component-aware library resolution method for addressing hardware dependencies, a library knowledge generation method that injects utility domain knowledge into LLMs, and an auto-programming method that ensures successful deployment. We evaluate EmbedGenius's performance across 71 modules and four mainstream embedded development platforms with over 350 IoT tasks. Experimental results show that EmbedGenius can generate code with an accuracy of 95.7% and complete tasks with a success rate of 86.5%, surpassing human-in-the-loop baselines by 15.6% to 37.7% and 25.5% to 53.4%, respectively. We also show EmbedGenius's potential through case studies in environmental monitoring and remote control systems development.

TransCompressor: LLM-Powered Multimodal Data Compression for Smart Transportation

Nov 25, 2024

Huanqi Yang, Rucheng Wu, Weitao Xu

Abstract: The incorporation of Large Language Models (LLMs) into smart transportation systems has paved the way for improving data management and operational efficiency. This study introduces TransCompressor, a novel framework that leverages LLMs for efficient compression and decompression of multimodal transportation sensor data. TransCompressor has undergone thorough evaluation with diverse sensor data types, including barometer, speed, and altitude measurements, across various transportation modes like buses, taxis, and MTRs. Comprehensive evaluation illustrates the effectiveness of TransCompressor in reconstructing transportation sensor data at different compression ratios. The results highlight that, with well-crafted prompts, LLMs can utilize their vast knowledge base to contribute to data compression processes, enhancing data storage, analysis, and retrieval in smart transportation settings.

* 6 pages

EarDA: Towards Accurate and Data-Efficient Earable Activity Sensing

Jun 18, 2024

Shengzhe Lyu, Yongliang Chen, Di Duan, Renqi Jia, Weitao Xu

Abstract: In the realm of smart sensing with the Internet of Things, earable devices are empowered with the capability of multi-modality sensing and the intelligence of context-aware computing, leading to their wide usage in Human Activity Recognition (HAR). Nonetheless, unlike the movements captured by Inertial Measurement Unit (IMU) sensors placed on the upper or lower body, the motion signals obtained from earable devices show significant changes in amplitudes and patterns, especially in the presence of dynamic and unpredictable head movements, posing a significant challenge for activity classification. In this work, we present EarDA, an adversarial-based domain adaptation system to extract the domain-independent features across different sensor locations. Moreover, while most deep learning methods commonly rely on training with substantial amounts of labeled data to offer good accuracy, the proposed scheme can unlock the potential of publicly available smartphone-based IMU datasets. Furthermore, we explore the feasibility of applying a filter-based data processing method to mitigate the impact of head movement. EarDA, the proposed system, enables more data-efficient and accurate activity sensing. It achieves an accuracy of 88.8% on the HAR task, demonstrating a significant 43% improvement over methods without domain adaptation. This clearly showcases its effectiveness in mitigating domain gaps.

* accepted by 2024 IEEE Coupling of Sensing & Computing in AIoT Systems (CSCAIoT)

Are You Being Tracked? Discover the Power of Zero-Shot Trajectory Tracing with LLMs!

Mar 10, 2024

Huanqi Yang, Sijie Ji, Rucheng Wu, Weitao Xu

Abstract: There is a burgeoning discussion around the capabilities of Large Language Models (LLMs) in acting as fundamental components that can be seamlessly incorporated into Artificial Intelligence of Things (AIoT) to interpret complex trajectories. This study introduces LLMTrack, a model that illustrates how LLMs can be leveraged for Zero-Shot Trajectory Recognition by employing a novel single-prompt technique that combines role-play and think step-by-step methodologies with unprocessed Inertial Measurement Unit (IMU) data. We evaluate the model using real-world datasets designed to challenge it with distinct trajectories characterized by indoor and outdoor scenarios. In both test scenarios, LLMTrack not only meets but exceeds the performance benchmarks set by traditional machine learning approaches and even contemporary state-of-the-art deep learning models, all without the requirement of training on specialized datasets. The results of our research suggest that, with strategically designed prompts, LLMs can tap into their extensive knowledge base and are well-equipped to analyze raw sensor data with remarkable effectiveness.

Adaptive Top-K in SGD for Communication-Efficient Distributed Learning

Oct 24, 2022

Mengzhe Ruan, Guangfeng Yan, Yuanzhang Xiao, Linqi Song, Weitao Xu

Abstract: Distributed stochastic gradient descent (SGD) with gradient compression has emerged as a communication-efficient solution to accelerate distributed learning. Top-K sparsification is one of the most popular gradient compression methods; it sparsifies the gradient to a fixed degree during model training. However, there is no approach that adaptively adjusts the degree of sparsification to maximize the potential of model performance or training speed. This paper addresses this issue by proposing a novel adaptive Top-K SGD framework, enabling an adaptive degree of sparsification for each gradient descent step to maximize the convergence performance by exploring the trade-off between communication cost and convergence error. Firstly, we derive an upper bound of the convergence error for the adaptive sparsification scheme and the loss function. Secondly, we design the algorithm by minimizing the convergence error under the communication cost constraints. Finally, numerical results show that the proposed adaptive Top-K in SGD achieves a significantly better convergence rate compared with the state-of-the-art methods.
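For readers unfamiliar with the operator the abstract above builds on, the following is a minimal sketch of plain Top-K sparsification: keep the k largest-magnitude gradient entries and zero the rest. The function name `top_k_sparsify` is an assumption, and the rule the paper uses to choose k at each step is not reproduced here; k is simply passed in by the caller.

```python
def top_k_sparsify(gradient, k):
    """Keep the k largest-magnitude entries of the gradient and zero the rest.
    Only the surviving (index, value) pairs would need to be communicated."""
    if k <= 0:
        return [0.0] * len(gradient)
    # Indices sorted by descending magnitude; ties keep their original order.
    ranked = sorted(range(len(gradient)),
                    key=lambda i: abs(gradient[i]),
                    reverse=True)
    keep = set(ranked[:k])
    return [g if i in keep else 0.0 for i, g in enumerate(gradient)]


gradient = [0.1, -2.0, 0.5, 1.5]
for k in (1, 2, 4):
    print(k, top_k_sparsify(gradient, k))
```

A smaller k means fewer values to transmit but a larger gap between the sparsified and true gradient, which is the communication versus convergence-error trade-off the abstract refers to.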

* 10 pages, 4 figures

Simultaneous Energy Harvesting and Gait Recognition using Piezoelectric Energy Harvester

Sep 06, 2020

Dong Ma, Guohao Lan, Weitao Xu, Mahbub Hassan, Wen Hu

Abstract: The piezoelectric energy harvester (PEH), which generates electricity from stress or vibrations, is gaining increasing attention as a viable solution to extend battery life in wearables. Recent research further reveals that, besides generating energy, a PEH can also serve as a passive sensor to detect human gait power-efficiently because its stress or vibration patterns are significantly influenced by the gait. However, as PEHs are not designed for precise measurement of motion, achievable gait recognition accuracy remains low with conventional classification algorithms. The accuracy deteriorates further when the generated electricity is stored simultaneously. To classify gait reliably while simultaneously storing generated energy, we make two distinct contributions. First, we propose a preprocessing algorithm to filter out the effect of energy storage on the PEH electricity signal. Second, we propose a long short-term memory (LSTM) network-based classifier to accurately capture temporal information in gait-induced electricity generation. We prototype the proposed gait recognition architecture in the form factor of an insole and evaluate its gait recognition as well as energy harvesting performance with 20 subjects. Our results show that the proposed architecture detects human gait with 12% higher recall and harvests up to 127% more energy while consuming 38% less power compared to the state-of-the-art.

* 13 pages, 17 figures, and 2 tables

A Multi-view CNN-based Acoustic Classification System for Automatic Animal Species Identification

Feb 23, 2020

Weitao Xu, Xiang Zhang, Lina Yao, Wanli Xue, Bo Wei

Abstract: Automatic identification of animal species by their vocalization is an important and challenging task. Although many kinds of audio monitoring systems have been proposed in the literature, they suffer from several disadvantages such as non-trivial feature selection, accuracy degradation because of environmental noise, or intensive local computation. In this paper, we propose a deep-learning-based acoustic classification framework for Wireless Acoustic Sensor Networks (WASN). The proposed framework is based on a cloud architecture, which relaxes the computational burden on the wireless sensor node. To improve the recognition accuracy, we design a multi-view Convolutional Neural Network (CNN) to extract the short-, middle-, and long-term dependencies in parallel. The evaluation on two real datasets shows that the proposed architecture can achieve high accuracy and outperforms traditional classification systems significantly when the environmental noise dominates the audio signal (low SNR). Moreover, we implement and deploy the proposed system on a testbed and analyse the system performance in real-world environments. Both simulation and real-world evaluation demonstrate the accuracy and robustness of the proposed acoustic classification system in distinguishing species of animals.

* Ad Hoc Networks 2020
