Xiaofeng Li

PAD: Phase-Amplitude Decoupling Fusion for Multi-Modal Land Cover Classification

Apr 27, 2025

Huiling Zheng, Xian Zhong, Bin Liu, Yi Xiao, Bihan Wen, Xiaofeng Li

Abstract:The fusion of Synthetic Aperture Radar (SAR) and RGB imagery for land cover classification remains challenging due to modality heterogeneity and the underutilization of spectral complementarity. Existing methods often fail to decouple shared structural features from modality-specific radiometric attributes, leading to feature conflicts and information loss. To address this issue, we propose Phase-Amplitude Decoupling (PAD), a frequency-aware framework that separates phase (modality-shared) and amplitude (modality-specific) components in the Fourier domain. Specifically, PAD consists of two key components: 1) Phase Spectrum Correction (PSC), which aligns cross-modal phase features through convolution-guided scaling to enhance geometric consistency, and 2) Amplitude Spectrum Fusion (ASF), which dynamically integrates high-frequency details and low-frequency structures using frequency-adaptive multilayer perceptrons. This approach leverages SAR's sensitivity to morphological features and RGB's spectral richness. Extensive experiments on WHU-OPT-SAR and DDHR-SK datasets demonstrate state-of-the-art performance. Our work establishes a new paradigm for physics-aware multi-modal fusion in remote sensing. The code will be available at https://github.com/RanFeng2/PAD.
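The phase/amplitude split that the abstract describes can be illustrated on a one-dimensional signal. The sketch below is not the authors' PAD implementation: it uses a naive discrete Fourier transform from the standard library, the sample rows `sar_row` and `rgb_row` are made-up values, and the helper names `decouple` and `recombine` are hypothetical. It only shows how a spectrum separates into amplitude and phase, and how the amplitude of one signal can be paired with the phase of another.

```python
import cmath
import math


def dft(signal):
    """Naive O(n^2) discrete Fourier transform of a real or complex sequence."""
    n = len(signal)
    return [
        sum(signal[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def idft(spectrum):
    """Inverse of dft(); returns a list of complex samples."""
    n = len(spectrum)
    return [
        sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)) / n
        for t in range(n)
    ]


def decouple(signal):
    """Split a signal's spectrum into (amplitude, phase) lists."""
    spectrum = dft(signal)
    amplitude = [abs(c) for c in spectrum]
    phase = [cmath.phase(c) for c in spectrum]
    return amplitude, phase


def recombine(amplitude, phase):
    """Rebuild a signal from amplitude and phase, keeping the real part."""
    spectrum = [cmath.rect(a, p) for a, p in zip(amplitude, phase)]
    return [value.real for value in idft(spectrum)]


# Hypothetical one-row samples standing in for co-registered SAR and RGB data.
sar_row = [0.0, 1.0, 4.0, 1.0, 0.0, 0.0, 2.0, 0.0]
rgb_row = [0.2, 0.9, 3.1, 1.4, 0.3, 0.1, 1.2, 0.4]

amp_sar, pha_sar = decouple(sar_row)
amp_rgb, pha_rgb = decouple(rgb_row)

# Same-signal amplitude and phase reproduce the input.
roundtrip = recombine(amp_sar, pha_sar)

# Cross-modal mix: RGB amplitude carried on the SAR phase.
mixed = recombine(amp_rgb, pha_sar)
```

In this toy setting the phase carries where the structures sit along the row, while the amplitude carries how strongly each frequency is present; the learned correction and fusion modules named in the abstract are not modeled here.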

* 13 pages, 8 figures

LangYa: Revolutionizing Cross-Spatiotemporal Ocean Forecasting

Dec 24, 2024

Nan Yang, Chong Wang, Meihua Zhao, Zimeng Zhao, Huiling Zheng, Bin Zhang, Jianing Wang, Xiaofeng Li

Abstract: Ocean forecasting is crucial for both scientific research and societal benefits. Currently, the most accurate forecasting systems are global ocean forecasting systems (GOFSs), which represent the ocean state variables (OSVs) as discrete grids and solve partial differential equations (PDEs) governing the transitions of oceanic state variables using numerical methods. However, GOFSs are computationally expensive and prone to cumulative errors. Recently, large artificial intelligence (AI)-based models have significantly boosted forecasting speed and accuracy. Unfortunately, building a large AI ocean forecasting system that supports cross-spatiotemporal and air-sea coupled forecasts remains a significant challenge. Here, we introduce LangYa, a cross-spatiotemporal and air-sea coupled ocean forecasting system. Results demonstrate that the time embedding module in LangYa enables a single model to make forecasts with lead times ranging from 1 to 7 days. The air-sea coupled module effectively simulates air-sea interactions. The ocean self-attention module improves network stability and accelerates convergence during training, and the adaptive thermocline loss function improves the accuracy of thermocline forecasting. Compared to existing numerical and AI-based ocean forecasting systems, LangYa uses 27 years of global ocean data from the Global Ocean Reanalysis and Simulation version 12 (GLORYS12) for training and achieves more reliable deterministic forecasting results for OSVs. The LangYa forecasting system provides global ocean researchers with access to a powerful software tool for accurate ocean forecasting and opens a new paradigm for ocean science.

* 18 pages, 5 figures

Paths of A Million People: Extracting Life Trajectories from Wikipedia

May 25, 2024

Ying Zhang, Xiaofeng Li, Zhaoyang Liu, Haipeng Zhang

Abstract:Notable people's life trajectories have been a focus of study -- the locations and times of various activities, such as birth, death, education, marriage, competition, work, delivering a speech, making a scientific discovery, finishing a masterpiece, and fighting a battle, and how these people interact with others, carry important messages for the broad research related to human dynamics. However, the scarcity of trajectory data in terms of volume, density, and inter-person interactions, limits relevant studies from being comprehensive and interactive. We mine millions of biography pages from Wikipedia and tackle the generalization problem stemming from the variety and heterogeneity of the trajectory descriptions. Our ensemble model COSMOS, which combines the idea of semi-supervised learning and contrastive learning, achieves an F1 score of 85.95%. For this task, we also create a hand-curated dataset, WikiLifeTrajectory, consisting of 8,852 (person, time, location) triplets as ground truth. Besides, we perform an empirical analysis on the trajectories of 8,272 historians to demonstrate the validity of the extracted results. To facilitate the research on trajectory extractions and help the analytical studies to construct grand narratives, we make our code, the million-level extracted trajectories, and the WikiLifeTrajectory dataset publicly available.

* Preprint, under review. 15 pages

A Deep Learning-Driven Pipeline for Differentiating Hypertrophic Cardiomyopathy from Cardiac Amyloidosis Using 2D Multi-View Echocardiography

Apr 25, 2024

Bo Peng, Xiaofeng Li, Xinyu Li, Zhenghan Wang, Hui Deng, Xiaoxian Luo, Lixue Yin, Hongmei Zhang

Abstract: Hypertrophic cardiomyopathy (HCM) and cardiac amyloidosis (CA) are both heart conditions that can progress to heart failure if untreated. They exhibit similar echocardiographic characteristics, often leading to diagnostic challenges. This paper introduces a novel multi-view deep learning approach that utilizes 2D echocardiography for differentiating between HCM and CA. The method begins by classifying 2D echocardiography data into five distinct echocardiographic views: apical 4-chamber, parasternal long axis of left ventricle, and parasternal short axis at the levels of the mitral valve, papillary muscle, and apex. It then extracts features from each view separately and combines the five feature sets for disease classification. A total of 212 patients diagnosed with HCM and 30 patients diagnosed with CA, along with 200 individuals with normal cardiac function (Normal), were enrolled in this study from 2018 to 2022. This approach achieved a precision and recall of 0.905 and a micro-F1 score of 0.904, demonstrating its effectiveness in accurately identifying HCM and CA using a multi-view analysis.
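For readers unfamiliar with the micro-averaged metrics quoted in the abstract, the following minimal sketch shows how they are usually computed: per-class true positives, false positives, and false negatives are pooled before precision, recall, and F1 are taken. The counts in `counts` are invented for illustration and are not results from the study; the function name `micro_scores` is likewise hypothetical.

```python
def micro_scores(per_class_counts):
    """Pool tp/fp/fn over all classes, then compute precision, recall, F1."""
    tp = sum(c["tp"] for c in per_class_counts.values())
    fp = sum(c["fp"] for c in per_class_counts.values())
    fn = sum(c["fn"] for c in per_class_counts.values())
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical confusion counts for a three-class problem (HCM, CA, Normal).
counts = {
    "HCM": {"tp": 40, "fp": 3, "fn": 2},
    "CA": {"tp": 4, "fp": 1, "fn": 2},
    "Normal": {"tp": 38, "fp": 2, "fn": 2},
}

precision, recall, f1 = micro_scores(counts)
```

When every sample receives exactly one predicted label, each misclassification is one false positive for the predicted class and one false negative for the true class, so the pooled precision and recall coincide, as they do for the made-up counts above.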

Deep Learning for Joint Design of Pilot, Channel Feedback, and Hybrid Beamforming in FDD Massive MIMO-OFDM Systems

Dec 10, 2023

Junyi Yang, Weifeng Zhu, Shu Sun, Xiaofeng Li, Xingqin Lin, Meixia Tao

Abstract:This letter considers the transceiver design in frequency division duplex (FDD) massive multiple-input multiple-output (MIMO) orthogonal frequency division multiplexing (OFDM) systems for high-quality data transmission. We propose a novel deep learning based framework where the procedures of pilot design, channel feedback, and hybrid beamforming are realized by carefully crafted deep neural networks. All the considered modules are jointly learned in an end-to-end manner, and a graph neural network is adopted to effectively capture interactions between beamformers based on the built graphical representation. Numerical results validate the effectiveness of our method.

* 5 pages, 4 figures, accepted by IEEE Communications Letters

SNR-based beaconless multi-scan link acquisition model with vibration for LEO-to-ground laser communication

Aug 06, 2023

Sen Yang, Xiaofeng Li

Abstract: We propose, for the first time, a link acquisition time model that covers the full process from transmitted power to received signal-to-noise ratio (SNR) for LEO-to-ground laser communication. Compared with conventional acquisition models founded on geometric analysis with a divergence angle threshold, utilizing SNR as the decision criterion is more appropriate for practical engineering requirements. Specifically, under the combined effects of platform vibration and turbulence, we decouple the parameters of beam divergence angle, spiral pitch, and coverage factor at a fixed transmitted power for a given average received SNR threshold. Then the single-scan acquisition probability is obtained by integrating the field of uncertainty (FOU), the probability distribution of the coverage factor, and the receiver field angle. Consequently, the closed-form analytical expression of the expected acquisition time under multi-scan, which ensures acquisition success, with the essential reset time between single scans, is derived. The optimizations concerning the beam divergence angle, spiral pitch, and FOU are presented. Moreover, the influence of platform vibration is investigated. All the analytical derivations are confirmed by Monte Carlo simulations. Notably, we provide a theoretical method for designing the minimum divergence angle modulated by the laser, which not only improves the acquisition performance within a certain vibration range, but also achieves a good trade-off with the system complexity.
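The multi-scan idea in the abstract (repeat single scans, with a reset between them, until acquisition succeeds) can be illustrated with a deliberately simplified model. The sketch below is not the paper's closed-form expression: it assumes independent scans with a constant success probability `p_single`, charges the full `scan_time` to every scan including the successful one, and inserts `reset_time` only between scans. All three parameter names are hypothetical.

```python
def expected_acquisition_time(p_single, scan_time, reset_time):
    """Expected total time when scans repeat until the first success.

    Simplifying assumptions (not taken from the paper):
    - each scan succeeds independently with probability p_single,
    - every scan, successful or not, lasts scan_time,
    - a reset of reset_time separates consecutive scans.
    The number of scans is then geometric with mean 1 / p_single.
    """
    if not 0.0 < p_single <= 1.0:
        raise ValueError("p_single must be in (0, 1]")
    expected_scans = 1.0 / p_single
    return expected_scans * scan_time + (expected_scans - 1.0) * reset_time


# Example with made-up numbers: 50% single-scan success, 10 s scan, 2 s reset.
example_time = expected_acquisition_time(0.5, 10.0, 2.0)
```

Under these assumptions a lower single-scan probability raises both the scan and the reset contributions, which is one way to see why the abstract treats divergence angle, spiral pitch, and FOU as quantities to optimize.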

* 15 pages, 7 figures

CoCoPIE XGen: A Full-Stack AI-Oriented Optimizing Framework

Jun 21, 2022

Xiaofeng Li, Bin Ren, Xipeng Shen, Yanzhi Wang

Abstract:There is a growing demand for shifting the delivery of AI capability from data centers on the cloud to edge or end devices, exemplified by the fast emerging real-time AI-based apps running on smartphones, AR/VR devices, autonomous vehicles, and various IoT devices. The shift has however been seriously hampered by the large growing gap between DNN computing demands and the computing power on edge or end devices. This article presents the design of XGen, an optimizing framework for DNN designed to bridge the gap. XGen takes cross-cutting co-design as its first-order consideration. Its full-stack AI-oriented optimizations consist of a number of innovative optimizations at every layer of the DNN software stack, all designed in a cooperative manner. The unique technology makes XGen able to optimize various DNNs, including those with an extreme depth (e.g., BERT, GPT, other transformers), and generate code that runs several times faster than those from existing DNN frameworks, while delivering the same level of accuracy.

Dual Branch Neural Network for Sea Fog Detection in Geostationary Ocean Color Imager

May 04, 2022

Yuan Zhou, Keran Chen, Xiaofeng Li

Abstract: Sea fog significantly threatens the safety of maritime activities. This paper develops a sea fog dataset (SFDD) and a dual branch sea fog detection network (DB-SFNet). We investigate all the observed sea fog events in the Yellow Sea and the Bohai Sea (118.1°E-128.1°E, 29.5°N-43.8°N) from 2010 to 2020, and collect the sea fog images for each event from the Geostationary Ocean Color Imager (GOCI) to comprise the dataset SFDD. The location of the sea fog in each image in SFDD is accurately marked. The proposed dataset is characterized by a long time span, a large number of samples, and accurate labeling, which can substantially improve the robustness of various sea fog detection models. Furthermore, this paper proposes a dual branch sea fog detection network to achieve accurate and holistic sea fog detection. The proposed DB-SFNet is composed of a knowledge extraction module and a dual branch optional encoding decoding module. The two modules jointly extract discriminative features from both the visual and statistical domains. Experiments show promising sea fog detection results with an F1-score of 0.77 and a critical success index of 0.63. Compared with existing advanced deep learning networks, DB-SFNet is superior in detection performance and stability, particularly in the mixed cloud and fog areas.
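The two scores quoted in the abstract are closely related. Using the standard definitions F1 = 2TP / (2TP + FP + FN) and CSI = TP / (TP + FP + FN), it follows that F1 = 2·CSI / (1 + CSI) whenever both are computed from the same pooled counts. The sketch below checks this with made-up pixel counts (they are not taken from the paper); whether the paper pools its counts this way is an assumption.

```python
def f1_score(tp, fp, fn):
    """F1 from true positives, false positives, and false negatives."""
    return 2 * tp / (2 * tp + fp + fn)


def critical_success_index(tp, fp, fn):
    """CSI (also called threat score): hits over hits + false alarms + misses."""
    return tp / (tp + fp + fn)


def f1_from_csi(csi):
    """Convert CSI to F1, valid when both use the same tp/fp/fn counts."""
    return 2 * csi / (1 + csi)


# Hypothetical counts chosen so that CSI equals 0.63.
tp, fp, fn = 63, 20, 17
csi = critical_success_index(tp, fp, fn)
f1_direct = f1_score(tp, fp, fn)
f1_converted = f1_from_csi(csi)
```

With a CSI of 0.63 the conversion gives an F1 of about 0.77, which agrees with the pair of values reported in the abstract.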

Asymptotic analysis of V-BLAST MIMO for coherent optical wireless communications in Gamma-Gamma turbulence

Jul 12, 2021

Yiming Li, Chao Gao, Mark S. Leeson, Xiaofeng Li

Abstract:This paper investigates the asymptotic BER performance of coherent optical wireless communication systems in Gamma-Gamma turbulence when applying the V-BLAST MIMO scheme. A new method is proposed to quantify the performance of the system and mathematical solutions for asymptotic BER performance are derived. Counterintuitive results are shown since the diversity gain of the V-BLAST MIMO system is equal to the number of the receivers. As a consequence, it is shown that when applying the V-BLAST MIMO scheme, the symbol rate per transmission can be equal to the number of transmitters with some cost to diversity gain. This means that we can simultaneously exploit the spatial multiplexing and diversity properties of the MIMO system to achieve a higher data rate than existing schemes in a channel that displays severe turbulence and moderate attenuation.

Deep-Reinforcement-Learning-Based Scheduling with Contiguous Resource Allocation for Next-Generation Cellular Systems

Oct 11, 2020

Shu Sun, Xiaofeng Li

Abstract:In this work, we propose a novel scheduling algorithm with contiguous frequency-domain resource allocation (FDRA) based on deep reinforcement learning (DRL) that jointly selects users and allocates resource blocks (RBs). The scheduling problem is modeled as a Markov decision process, and a DRL agent determines which user and how many consecutive RBs for that user should be scheduled at each RB allocation step. The state, action, and reward sets are delicately designed to train the DRL network. More specifically, the originally quasicontinuous action space, which is inherent to contiguous FDRA, is refined into a finite and discrete action space to obtain a tradeoff between the inference latency and system performance. Simulation results show that the proposed DRL-based algorithm outperforms other representative baseline schemes while having lower online computational complexity.
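To make the discrete action space concrete, the sketch below enumerates (user, number of consecutive RBs) pairs and maps a sequence of such actions to contiguous RB ranges. It is an illustration under stated assumptions, not the authors' design: the quantized sizes in `rb_group_sizes`, the rule that a user receives at most one block, and the function names `build_action_space` and `allocate` are all hypothetical, and no learning agent is included.

```python
from itertools import product


def build_action_space(num_users, rb_group_sizes):
    """Enumerate every (user index, number of consecutive RBs) action."""
    return list(product(range(num_users), rb_group_sizes))


def allocate(actions_taken, total_rbs):
    """Turn a sequence of actions into contiguous [start, end) RB ranges.

    RBs are handed out left to right; a user that already holds a block
    is skipped so that each user's allocation stays contiguous.
    """
    allocation = {}
    start = 0
    for user, n_rbs in actions_taken:
        if start >= total_rbs:
            break  # no RBs left to schedule
        if user in allocation:
            continue  # keep a single contiguous block per user
        end = min(start + n_rbs, total_rbs)
        allocation[user] = (start, end)
        start = end
    return allocation


# Three users, block sizes restricted to 1, 2, or 4 RBs: 9 discrete actions.
actions = build_action_space(3, [1, 2, 4])

# A hand-picked action sequence standing in for an agent's decisions.
result = allocate([(0, 4), (2, 2), (1, 4)], total_rbs=8)
```

Restricting block sizes to a small set keeps the number of actions at `num_users * len(rb_group_sizes)`, which illustrates the kind of trade-off between action-space size and allocation granularity that the abstract mentions.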

* 6 pages, 4 figures
