Abstract:Generative semantic communication models are reshaping semantic communication frameworks by moving beyond pixel-wise optimization to align with human perception. However, many existing approaches prioritize image-level perceptual quality, often neglecting alignment with downstream tasks, which can lead to suboptimal semantic representation. This paper introduces an Ultra-Low Bitrate Semantic Communication (ULBSC) system that employs a conditional generative model and a learnable condition codebook.By integrating saliency conditions and image-level semantic information, the proposed method enables high-perceptual-quality and controllable task-oriented image transmission. Recognizing shared patterns among objects, we propose a codebook-assisted condition transmission method, integrated with joint source-channel coding (JSCC)-based text transmission to establish ULBSC. The codebook serves as a knowledge base, reducing communication costs to achieve ultra-low bitrate while enhancing robustness against noise and inaccuracies in saliency detection. Simulation results indicate that, under ultra-low bitrate conditions with an average compression ratio of 0.57Z%o, the proposed system delivers superior visual quality compared to traditional JSCC techniques and achieves higher saliency similarity between the generated and source images compared to state-of-the-art generative semantic communication methods.
Abstract:Accurate forecasting of spatiotemporal data remains challenging due to complex spatial dependencies and temporal dynamics. The inherent uncertainty and variability in such data often render deterministic models insufficient, prompting a shift towards probabilistic approaches, where diffusion-based generative models have emerged as effective solutions. In this paper, we present ProGen, a novel framework for probabilistic spatiotemporal time series forecasting that leverages Stochastic Differential Equations (SDEs) and diffusion-based generative modeling techniques in the continuous domain. By integrating a novel denoising score model, graph neural networks, and a tailored SDE, ProGen provides a robust solution that effectively captures spatiotemporal dependencies while managing uncertainty. Our extensive experiments on four benchmark traffic datasets demonstrate that ProGen outperforms state-of-the-art deterministic and probabilistic models. This work contributes a continuous, diffusion-based generative approach to spatiotemporal forecasting, paving the way for future research in probabilistic modeling and stochastic processes.
Abstract:Semantic communication (SemComm) has emerged as new paradigm shifts.Most existing SemComm systems transmit continuously distributed signals in analog fashion.However, the analog paradigm is not compatible with current digital communication frameworks. In this paper, we propose an alternating multi-phase training strategy (AMP) to enable the joint training of the networks in the encoder and decoder through non-differentiable digital processes. AMP contains three training phases, aiming at feature extraction (FE), robustness enhancement (RE), and training-testing alignment (TTA), respectively. AMP contains three training phases, aiming at feature extraction (FE), robustness enhancement (RE), and training-testing alignment (TTA), respectively. In particular, in the FE stage, we learn the representation ability of semantic information by end-to-end training the encoder and decoder in an analog manner. When we take digital communication into consideration, the domain shift between digital and analog demands the fine-tuning for encoder and decoder. To cope with joint training process within the non-differentiable digital processes, we propose the alternation between updating the decoder individually and jointly training the codec in RE phase. To boost robustness further, we investigate a mask-attack (MATK) in RE to simulate an evident and severe bit-flipping effect in a differentiable manner. To address the training-testing inconsistency introduced by MATK, we employ an additional TTA phase, fine-tuning the decoder without MATK. Combining with AMP and an information restoration network, we propose a digital SemComm system for image transmission, named AMP-SC. Comparing with the representative benchmark, AMP-SC achieves $0.82 \sim 1.65$dB higher average reconstruction performance among various representative datasets at different scales and a wide range of signal-to-noise ratio.
Abstract:Semantic communication has emerged as new paradigm shifts in 6G from the conventional syntax-oriented communications. Recently, the wireless broadcast technology has been introduced to support semantic communication system toward higher communication efficiency. Nevertheless, existing broadcast semantic communication systems target on general representation within one stage and fail to balance the inference accuracy among users. In this paper, the broadcast encoding process is decomposed into compression and fusion to improves communication efficiency with adaptation to tasks and channels.Particularly, we propose multiple task-channel-aware sub-encoders (TCE) and a channel-aware feature fusion sub-encoder (CFE) towards compression and fusion, respectively. In TCEs, multiple local-channel-aware attention blocks are employed to extract and compress task-relevant information for each user. In GFE, we introduce a global-channel-aware fine-tuning block to merge these compressed task-relevant signals into a compact broadcast signal. Notably, we retrieve the bottleneck in DeepBroadcast and leverage information bottleneck theory to further optimize the parameter tuning of TCEs and CFE.We substantiate our approach through experiments on a range of heterogeneous tasks across various channels with additive white Gaussian noise (AWGN) channel, Rayleigh fading channel, and Rician fading channel. Simulation results evidence that the proposed DeepBroadcast outperforms the state-of-the-art methods.