Abstract:Mobile traffic forecasting allows operators to anticipate network dynamics and performance in advance, offering substantial potential for enhancing service quality and improving user experience. However, existing models are often task-oriented and are trained with tailored data, which limits their effectiveness in diverse mobile network tasks of Base Station (BS) deployment, resource allocation, energy optimization, etc. and hinders generalization across different urban environments. Foundation models have made remarkable strides across various domains of NLP and CV due to their multi-tasking adaption and zero/few-shot learning capabilities. In this paper, we propose an innovative Foundation model for Mo}bile traffic forecasting (FoMo), aiming to handle diverse forecasting tasks of short/long-term predictions and distribution generation across multiple cities to support network planning and optimization. FoMo combines diffusion models and transformers, where various spatio-temporal masks are proposed to enable FoMo to learn intrinsic features of different tasks, and a contrastive learning strategy is developed to capture the correlations between mobile traffic and urban contexts, thereby improving its transfer learning capability. Extensive experiments on 9 real-world datasets demonstrate that FoMo outperforms current models concerning diverse forecasting tasks and zero/few-shot learning, showcasing a strong universality. We further deploy the FoMo on the JiuTian optimization platform of China Mobile, where we use the predicted mobile data to formulate network planning and optimization applications, including BS deployment, resource block scheduling, and BS sleep control.
Abstract:Data attribution methods aim to quantify the influence of individual training samples on the prediction of artificial intelligence (AI) models. As training data plays an increasingly crucial role in the modern development of large-scale AI models, data attribution has found broad applications in improving AI performance and safety. However, despite a surge of new data attribution methods being developed recently, there lacks a comprehensive library that facilitates the development, benchmarking, and deployment of different data attribution methods. In this work, we introduce $\texttt{dattri}$, an open-source data attribution library that addresses the above needs. Specifically, $\texttt{dattri}$ highlights three novel design features. Firstly, $\texttt{dattri}$ proposes a unified and easy-to-use API, allowing users to integrate different data attribution methods into their PyTorch-based machine learning pipeline with a few lines of code changed. Secondly, $\texttt{dattri}$ modularizes low-level utility functions that are commonly used in data attribution methods, such as Hessian-vector product, inverse-Hessian-vector product or random projection, making it easier for researchers to develop new data attribution methods. Thirdly, $\texttt{dattri}$ provides a comprehensive benchmark framework with pre-trained models and ground truth annotations for a variety of benchmark settings, including generative AI settings. We have implemented a variety of state-of-the-art efficient data attribution methods that can be applied to large-scale neural network models, and will continuously update the library in the future. Using the developed $\texttt{dattri}$ library, we are able to perform a comprehensive and fair benchmark analysis across a wide range of data attribution methods. The source code of $\texttt{dattri}$ is available at https://github.com/TRAIS-Lab/dattri.
Abstract:Channel State Information (CSI) of WiFi signals becomes increasingly attractive in human sensing applications due to the pervasiveness of WiFi, robustness to illumination and view points, and little privacy concern comparing to cameras. In majority of existing works, CSI sequences are analyzed by traditional signal processing approaches. These approaches rely on strictly imposed assumption on propagation paths, reflection and attenuation of signal interacting with human bodies and indoor background. This makes existing approaches very difficult to model the delicate body characteristics and activities in the real applications. To address these issues, we build CSI-Net, a unified Deep Neural Network (DNN), that fully utilizes the strength of deep feature representation and the power of existing DNN architectures for CSI-based human sensing problems. Using CSI-Net, we jointly solved two body characterization problems: biometrics estimation (including body fat, muscle, water and bone rates) and human identification. We also demonstrated the application of CSI-Net on two distinctive action recognition tasks: the hand sign recognition (fine-scaled action of the hand) and falling detection (coarse-scaled motion of the body). Besides the technical contribution of CSI-Net, we present major discoveries and insights on how the multi-frequency CSI signals are encoded and processed in DNNs, which, to the best of our knowledge, is the first attempt that bridges the WiFi sensing and deep learning in human sensing problems.