Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Liang Qian

Diffusion Model-Based Data Synthesis Aided Federated Semi-Supervised Learning

Jan 04, 2025

Zhongwei Wang, Tong Wu, Zhiyong Chen, Liang Qian, Yin Xu, Meixia Tao

Figure 1 for Diffusion Model-Based Data Synthesis Aided Federated Semi-Supervised Learning

Figure 2 for Diffusion Model-Based Data Synthesis Aided Federated Semi-Supervised Learning

Figure 3 for Diffusion Model-Based Data Synthesis Aided Federated Semi-Supervised Learning

Figure 4 for Diffusion Model-Based Data Synthesis Aided Federated Semi-Supervised Learning

Abstract:Federated semi-supervised learning (FSSL) is primarily challenged by two factors: the scarcity of labeled data across clients and the non-independent and identically distribution (non-IID) nature of data among clients. In this paper, we propose a novel approach, diffusion model-based data synthesis aided FSSL (DDSA-FSSL), which utilizes a diffusion model (DM) to generate synthetic data, bridging the gap between heterogeneous local data distributions and the global data distribution. In DDSA-FSSL, clients address the challenge of the scarcity of labeled data by employing a federated learning-trained classifier to perform pseudo labeling for unlabeled data. The DM is then collaboratively trained using both labeled and precision-optimized pseudo-labeled data, enabling clients to generate synthetic samples for classes that are absent in their labeled datasets. This process allows clients to generate more comprehensive synthetic datasets aligned with the global distribution. Extensive experiments conducted on multiple datasets and varying non-IID distributions demonstrate the effectiveness of DDSA-FSSL, e.g., it improves accuracy from 38.46% to 52.14% on CIFAR-10 datasets with 10% labeled data.

* accepted by IEEE WCNC 2025

Via

Access Paper or Ask Questions

WDMoE: Wireless Distributed Mixture of Experts for Large Language Models

Nov 11, 2024

Nan Xue, Yaping Sun, Zhiyong Chen, Meixia Tao, Xiaodong Xu, Liang Qian, Shuguang Cui, Wenjun Zhang, Ping Zhang

Figure 1 for WDMoE: Wireless Distributed Mixture of Experts for Large Language Models

Figure 2 for WDMoE: Wireless Distributed Mixture of Experts for Large Language Models

Figure 3 for WDMoE: Wireless Distributed Mixture of Experts for Large Language Models

Figure 4 for WDMoE: Wireless Distributed Mixture of Experts for Large Language Models

Abstract:Large Language Models (LLMs) have achieved significant success in various natural language processing tasks, but the role of wireless networks in supporting LLMs has not been thoroughly explored. In this paper, we propose a wireless distributed Mixture of Experts (WDMoE) architecture to enable collaborative deployment of LLMs across edge servers at the base station (BS) and mobile devices in wireless networks. Specifically, we decompose the MoE layer in LLMs by placing the gating network and the preceding neural network layer at BS, while distributing the expert networks among the devices. This deployment leverages the parallel inference capabilities of expert networks on mobile devices, effectively utilizing the limited computing and caching resources of these devices. Accordingly, we develop a performance metric for WDMoE-based LLMs, which accounts for both model capability and latency. To minimize the latency while maintaining accuracy, we jointly optimize expert selection and bandwidth allocation based on the performance metric. Moreover, we build a hardware testbed using NVIDIA Jetson kits to validate the effectiveness of WDMoE. Both theoretical simulations and practical hardware experiments demonstrate that the proposed method can significantly reduce the latency without compromising LLM performance.

Via

Access Paper or Ask Questions

WDMoE: Wireless Distributed Large Language Models with Mixture of Experts

May 06, 2024

Nan Xue, Yaping Sun, Zhiyong Chen, Meixia Tao, Xiaodong Xu, Liang Qian, Shuguang Cui, Ping Zhang

Abstract:Large Language Models (LLMs) have achieved significant success in various natural language processing tasks, but how wireless communications can support LLMs has not been extensively studied. In this paper, we propose a wireless distributed LLMs paradigm based on Mixture of Experts (MoE), named WDMoE, deploying LLMs collaboratively across edge servers of base station (BS) and mobile devices in the wireless communications system. Specifically, we decompose the MoE layer in LLMs by deploying the gating network and the preceding neural network layer at BS, while distributing the expert networks across the devices. This arrangement leverages the parallel capabilities of expert networks on distributed devices. Moreover, to overcome the instability of wireless communications, we design an expert selection policy by taking into account both the performance of the model and the end-to-end latency, which includes both transmission delay and inference delay. Evaluations conducted across various LLMs and multiple datasets demonstrate that WDMoE not only outperforms existing models, such as Llama 2 with 70 billion parameters, but also significantly reduces end-to-end latency.

* submitted to IEEE conference

Via

Access Paper or Ask Questions

CDDM: Channel Denoising Diffusion Models for Wireless Semantic Communications

Sep 16, 2023

Tong Wu, Zhiyong Chen, Dazhi He, Liang Qian, Yin Xu, Meixia Tao, Wenjun Zhang

Abstract:Diffusion models (DM) can gradually learn to remove noise, which have been widely used in artificial intelligence generated content (AIGC) in recent years. The property of DM for eliminating noise leads us to wonder whether DM can be applied to wireless communications to help the receiver mitigate the channel noise. To address this, we propose channel denoising diffusion models (CDDM) for semantic communications over wireless channels in this paper. CDDM can be applied as a new physical layer module after the channel equalization to learn the distribution of the channel input signal, and then utilizes this learned knowledge to remove the channel noise. We derive corresponding training and sampling algorithms of CDDM according to the forward diffusion process specially designed to adapt the channel models and theoretically prove that the well-trained CDDM can effectively reduce the conditional entropy of the received signal under small sampling steps. Moreover, we apply CDDM to a semantic communications system based on joint source-channel coding (JSCC) for image transmission. Extensive experimental results demonstrate that CDDM can further reduce the mean square error (MSE) after minimum mean square error (MMSE) equalizer, and the joint CDDM and JSCC system achieves better performance than the JSCC system and the traditional JPEG2000 with low-density parity-check (LDPC) code approach.

* submitted to IEEE Transactions on Wireless Communications. arXiv admin note: substantial text overlap with arXiv:2305.09161

Via

Access Paper or Ask Questions

CDDM: Channel Denoising Diffusion Models for Wireless Communications

May 16, 2023

Tong Wu, Zhiyong Chen, Dazhi He, Liang Qian, Yin Xu, Meixia Tao, Wenjun Zhang

Figure 1 for CDDM: Channel Denoising Diffusion Models for Wireless Communications

Figure 2 for CDDM: Channel Denoising Diffusion Models for Wireless Communications

Figure 3 for CDDM: Channel Denoising Diffusion Models for Wireless Communications

Figure 4 for CDDM: Channel Denoising Diffusion Models for Wireless Communications

Abstract:Diffusion models (DM) can gradually learn to remove noise, which have been widely used in artificial intelligence generated content (AIGC) in recent years. The property of DM for removing noise leads us to wonder whether DM can be applied to wireless communications to help the receiver eliminate the channel noise. To address this, we propose channel denoising diffusion models (CDDM) for wireless communications in this paper. CDDM can be applied as a new physical layer module after the channel equalization to learn the distribution of the channel input signal, and then utilizes this learned knowledge to remove the channel noise. We design corresponding training and sampling algorithms for the forward diffusion process and the reverse sampling process of CDDM. Moreover, we apply CDDM to a semantic communications system based on joint source-channel coding (JSCC). Experimental results demonstrate that CDDM can further reduce the mean square error (MSE) after minimum mean square error (MMSE) equalizer, and the joint CDDM and JSCC system achieves better performance than the JSCC system and the traditional JPEG2000 with low-density parity-check (LDPC) code approach.

Via

Access Paper or Ask Questions