Abstract:Multimodal information extraction (IE) tasks have attracted increasing attention because many studies have shown that multimodal information benefits text information extraction. However, existing multimodal IE datasets mainly focus on sentence-level image-facilitated IE in English text, and pay little attention to video-based multimodal IE and fine-grained visual grounding. Therefore, to promote the development of multimodal IE, we construct a multimodal multilingual multitask dataset, named M$^{3}$D, which has the following features: (1) It contains paired document-level text and video to enrich multimodal information; (2) It supports two widely used languages, namely English and Chinese; (3) It includes more multimodal IE tasks, such as entity recognition, entity chain extraction, relation extraction, and visual grounding. In addition, our dataset introduces an unexplored theme, i.e., biography, enriching the domains of multimodal IE resources. To establish a benchmark for our dataset, we propose an innovative hierarchical multimodal IE model. This model effectively leverages and integrates multimodal information through a Denoised Feature Fusion Module (DFFM). Furthermore, in non-ideal scenarios, modal information is often incomplete. Thus, we design a Missing Modality Construction Module (MMCM) to alleviate the issues caused by missing modalities. Our model achieves average performances of 53.80% and 53.77% across the four tasks on the English and Chinese datasets, respectively, which sets a reasonable standard for subsequent research. In addition, we conduct further analytical experiments to verify the effectiveness of our proposed modules. We believe that our work can promote the development of the field of multimodal IE.
Abstract:The Dynamic Zero-COVID Policy in China spanned three years and diverse emotional responses have been observed at different times. In this paper, we retrospectively analyzed public sentiments and perceptions of the policy, especially regarding how they evolved over time, and how they related to people's lived experiences. Through sentiment analysis of 2,358 collected Weibo posts, we identified four representative points, i.e., policy initialization, sharp sentiment change, lowest sentiment score, and policy termination, for an in-depth discourse analysis through the lens of appraisal theory. In the end, we reflected on the evolving public sentiments toward the Dynamic Zero-COVID Policy and proposed implications for effective epidemic prevention and control measures for future crises.
Abstract:Timing synchronization (TS) is vital for orthogonal frequency division multiplexing (OFDM) systems, as it makes the discrete Fourier transform (DFT) window start within the inter-symbol-interference (ISI)-free region. However, multi-path uncertainty in wireless communication scenarios degrades TS correctness. To alleviate this degradation, we propose a learning-based TS method enhanced by an improved training-label design. In the proposed method, a classic cross-correlator extracts the initial TS features for the subsequent machine learning stage, and the network architecture unfolds one classic cross-correlation process. To counter multi-path uncertainty, a novel training label is designed that represents the ISI-free region and highlights its approximate midpoint: the closer a sample is to the boundary of the ISI-free region, the smaller its label value, so that the maximum network output is expected to fall within the ISI-free region with high probability. Then, to guarantee the correctness of the labeling, we exploit a priori line-of-sight (LOS) information to form a LOS-aided labeling. Numerical results confirm that the proposed training label effectively enhances the correctness of the proposed TS learner against multi-path uncertainty.
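The label shape described in this abstract can be illustrated with a minimal sketch. The abstract does not give the exact label function, so the triangular profile, the function name `isi_free_label`, and the parameters `length`, `start`, and `end` are all assumptions made for illustration only.

```python
def isi_free_label(length, start, end):
    """Hypothetical training label over `length` candidate timing indices.

    Indices outside the ISI-free region [start, end] get 0.0. Inside the
    region the value peaks at the approximate midpoint and shrinks toward
    the region boundaries (a triangular profile, assumed here).
    """
    if not 0 <= start <= end < length:
        raise ValueError("need 0 <= start <= end < length")
    mid = (start + end) // 2                 # approximate midpoint
    half = max(mid - start, end - mid) + 1   # keeps boundary values above 0
    label = [0.0] * length
    for n in range(start, end + 1):
        label[n] = 1.0 - abs(n - mid) / half
    return label


# Example: 16 candidate indices, ISI-free region spanning indices 4..10.
label = isi_free_label(length=16, start=4, end=10)
peak_index = max(range(len(label)), key=label.__getitem__)
```

With this shape, a network trained to reproduce the label is encouraged to place its largest output near the middle of the ISI-free region, so small estimation errors still land inside the region.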
Abstract:Timing synchronization (TS) is one of the key tasks in orthogonal frequency division multiplexing (OFDM) systems. However, multi-path uncertainty corrupts TS correctness, causing OFDM systems to suffer from severe inter-symbol interference (ISI). To tackle this issue, we propose a timing-metric learning-based TS method assisted by a lightweight one-dimensional convolutional neural network (1-D CNN). Specifically, the receptive field of the 1-D CNN is designed to extract the metric features from the classic synchronizer. Then, to combat multi-path uncertainty, we employ the varying delays and gains of the multi-path components (the characteristics of multi-path uncertainty) to design the timing-metric objective, and thus form the training labels. This differs from existing timing-metric objectives with respect to the timing synchronization point. Because the metric information is completely preserved, our method substantially increases the completeness of the training data under multi-path uncertainty. By this means, TS correctness is improved against multi-path uncertainty. Numerical results demonstrate the effectiveness and generalization of the proposed TS method against multi-path uncertainty.
Abstract:Due to the implementation bottleneck of training data collection in realistic wireless communication systems, supervised learning-based timing synchronization (TS) is challenged by the incompleteness of training data. To tackle this bottleneck, we extend the computer-aided approach, with which the local device can generate the training data instead of generating learning labels from the received samples collected in realistic systems, and then construct an extreme learning machine (ELM)-based TS network in orthogonal frequency division multiplexing (OFDM) systems. Specifically, by leveraging rough information about the channel impulse responses (CIRs), i.e., the root-mean-square (r.m.s.) delay, we propose loose constraint-based and flexible constraint-based training strategies for the learning-label design against the maximum multi-path delay. The underlying mechanism is to improve the completeness of the multi-path delays that may appear in realistic wireless channels and thus increase the statistical efficiency of the designed TS learner. By this means, the proposed ELM-based TS network can alleviate the degradation of generalization performance. Numerical results reveal the robustness and generalization of the proposed scheme against varying parameters.
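For readers unfamiliar with the r.m.s. delay mentioned in this abstract, the sketch below computes the usual power-weighted r.m.s. delay spread of a power delay profile. It assumes the abstract refers to this standard quantity; the function name `rms_delay_spread` and its arguments are hypothetical, and how the paper feeds this value into its training strategies is not specified in the abstract.

```python
import math


def rms_delay_spread(delays, powers):
    """Power-weighted r.m.s. delay spread of a multi-path profile.

    delays: path delays (any consistent unit, e.g. samples)
    powers: linear-scale path powers, one per delay
    """
    total = sum(powers)
    if total <= 0:
        raise ValueError("total path power must be positive")
    mean = sum(p * t for p, t in zip(powers, delays)) / total
    second = sum(p * t * t for p, t in zip(powers, delays)) / total
    # Clamp tiny negative values caused by floating-point rounding.
    return math.sqrt(max(second - mean * mean, 0.0))


# Two equal-power paths at delays 0 and 2: mean delay 1, spread 1.
two_path = rms_delay_spread(delays=[0.0, 2.0], powers=[1.0, 1.0])
# A single path has no delay spread.
single_path = rms_delay_spread(delays=[5.0], powers=[0.7])
```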
Abstract:Due to the interdependency of frame synchronization (FS) and channel estimation (CE), joint FS and CE (JFSCE) schemes have been proposed to enhance both functions and thereby boost the overall performance of wireless communication systems. Although traditional JFSCE schemes alleviate the mutual influence between FS and CE, they show deficiencies in dealing with hardware imperfection (HI) and a deterministic line-of-sight (LOS) path. To tackle this challenge, we propose a cascaded ELM-based JFSCE to alleviate the influence of HI in Rician fading channels. Specifically, the conventional JFSCE method is first employed to extract the initial features, and thus forms the non-neural-network (non-NN) solutions for FS and CE, respectively. Then, the ELM-based networks, named FS-NET and CE-NET, are cascaded to capture the NN solutions of FS and CE. Simulation and analysis results show that, compared with the conventional JFSCE methods, the proposed cascaded ELM-based JFSCE significantly reduces the error probability of FS and the normalized mean square error (NMSE) of CE, even against the impacts of parameter variations.
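The two metrics named in this abstract can be sketched as follows. The NMSE uses the common definition (squared estimation error normalized by the energy of the true channel). The FS error probability is assumed here to be the fraction of trials in which the estimated frame offset differs from the true one, since the abstract does not state the exact criterion. All function and variable names are hypothetical.

```python
def nmse(h_true, h_est):
    """Normalized mean square error between true and estimated channel taps."""
    if len(h_true) != len(h_est):
        raise ValueError("channel vectors must have equal length")
    ref = sum(abs(a) ** 2 for a in h_true)
    if ref == 0:
        raise ValueError("true channel must have nonzero energy")
    err = sum(abs(a - b) ** 2 for a, b in zip(h_true, h_est))
    return err / ref


def fs_error_probability(true_offsets, est_offsets):
    """Fraction of trials with a wrong frame offset (assumed criterion)."""
    if len(true_offsets) != len(est_offsets) or not true_offsets:
        raise ValueError("need two non-empty lists of equal length")
    wrong = sum(1 for t, e in zip(true_offsets, est_offsets) if t != e)
    return wrong / len(true_offsets)


# Toy example: one tap estimated exactly, one tap with half the amplitude.
ce_nmse = nmse([1 + 0j, 0 + 1j], [1 + 0j, 0 + 0.5j])
# Toy example: one wrong frame offset out of four trials.
fs_error = fs_error_probability([3, 3, 3, 3], [3, 4, 3, 3])
```

In a simulation study, both quantities would typically be averaged over many channel realizations; the toy inputs above only show the arithmetic.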
Abstract:Multi-path fading seriously affects the accuracy of timing synchronization (TS) in orthogonal frequency division multiplexing (OFDM) systems. To tackle this issue, we propose a convolutional neural network (CNN)-based TS scheme assisted by initial path acquisition. Specifically, the classic cross-correlation method is first employed to estimate a coarse timing offset and capture an initial path, which shrinks the TS search region. Then, a one-dimensional (1-D) CNN is developed to optimize the TS of OFDM systems. Because the TS search region is narrowed, the CNN-based TS effectively locates the accurate TS point, which motivates a lightweight network in terms of computational complexity and online running time. Simulation results show that, compared with the compressed sensing-based TS method and the extreme learning machine-based TS method, the proposed method effectively improves TS performance with reduced computational complexity and online running time. In addition, the proposed TS method is robust against varying parameters of multi-path fading channels.
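The coarse-timing step described in this abstract rests on classic cross-correlation against a known training sequence. The sketch below shows that step only, for a single-path channel with light noise. The random-phase preamble, the noise level, and all names are assumptions for illustration; the paper's initial path acquisition under multi-path fading and its CNN refinement are not reproduced here.

```python
import cmath
import random


def cross_correlate(received, preamble):
    """Magnitude of sum_k r[d + k] * conj(p[k]) for each candidate offset d."""
    span = len(received) - len(preamble) + 1
    return [
        abs(sum(received[d + k] * preamble[k].conjugate()
                for k in range(len(preamble))))
        for d in range(span)
    ]


def coarse_timing_offset(received, preamble):
    """Offset that maximizes the cross-correlation metric."""
    metric = cross_correlate(received, preamble)
    return max(range(len(metric)), key=metric.__getitem__)


rng = random.Random(0)  # fixed seed so the toy example is repeatable


def noise():
    """Small complex Gaussian noise sample (assumed noise level)."""
    return complex(rng.gauss(0, 0.05), rng.gauss(0, 0.05))


# Hypothetical unit-modulus preamble with random phases.
preamble = [cmath.exp(2j * cmath.pi * rng.random()) for _ in range(32)]
true_offset = 11
received = [noise() for _ in range(true_offset)]
received += [p + noise() for p in preamble]
received += [noise() for _ in range(20)]

estimated = coarse_timing_offset(received, preamble)
```

A refinement stage would then only need to search a small window around `estimated`, which is the reason a narrowed search region permits a lighter network.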