Xinjie Li

Joint Channel Estimation and Signal Detection for MIMO-OFDM: A Novel Data-Aided Approach with Reduced Computational Overhead

Apr 20, 2025

Xinjie Li, Jing Zhang, Xingyu Zhou, Chao-Kai Wen, Shi Jin

Abstract:The acquisition of channel state information (CSI) is essential in MIMO-OFDM communication systems. Data-aided enhanced receivers, by incorporating domain knowledge, effectively mitigate performance degradation caused by imperfect CSI, particularly in dynamic wireless environments. However, existing methodologies face notable challenges: they either refine channel estimates within MIMO subsystems separately, which proves ineffective due to deviations from assumptions regarding the time-varying nature of channels, or fully exploit the time-frequency characteristics but incur significantly high computational overhead due to dimensional concatenation. To address these issues, this study introduces a novel data-aided method aimed at reducing complexity, particularly suited for fast-fading scenarios in fifth-generation (5G) and beyond networks. We derive a general form of a data-aided linear minimum mean-square error (LMMSE)-based algorithm, optimized for iterative joint channel estimation and signal detection. Additionally, we propose a computationally efficient alternative to this algorithm, which achieves comparable performance with significantly reduced complexity. Empirical evaluations reveal that our proposed algorithms outperform several state-of-the-art approaches across various MIMO-OFDM configurations, pilot sequence lengths, and in the presence of time variability. Comparative analysis with basis expansion model-based iterative receivers highlights the superiority of our algorithms in achieving an effective trade-off between accuracy and computational complexity.

* This work has been submitted to the IEEE for possible publication
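As a rough illustration of the LMMSE principle the abstract refers to, the sketch below estimates a single flat-fading channel coefficient from one known pilot symbol. It is not the paper's data-aided iterative algorithm: the scalar model `y = h * p + n`, the function names, and the parameters `var_h` and `var_n` are all assumptions made for this example.

```python
# Minimal scalar LMMSE sketch (hypothetical helper names, not the paper's method).
# Assumed model: y = h * p + n, with h ~ CN(0, var_h) and n ~ CN(0, var_n).

def ls_channel_estimate(y: complex, p: complex) -> complex:
    """Least-squares estimate: divide the observation by the known pilot."""
    return y / p


def lmmse_channel_estimate(y: complex, p: complex, var_h: float, var_n: float) -> complex:
    """LMMSE estimate of h given one pilot observation.

    The gain is Cov(h, y) / Var(y) = var_h * conj(p) / (var_h * |p|^2 + var_n).
    """
    gain = var_h * p.conjugate() / (var_h * abs(p) ** 2 + var_n)
    return gain * y


if __name__ == "__main__":
    h_true = 0.5 + 0.2j          # channel coefficient used only for this demo
    pilot = 1 + 1j               # known pilot symbol
    y_obs = h_true * pilot       # noiseless observation for simplicity
    print(ls_channel_estimate(y_obs, pilot))
    print(lmmse_channel_estimate(y_obs, pilot, var_h=1.0, var_n=0.1))
```

With `var_n = 0` the LMMSE gain reduces to `1 / p`, so the estimate coincides with least squares; as the assumed noise variance grows, the estimate shrinks toward the prior mean of zero.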

Sub-Clustering for Class Distance Recalculation in Long-Tailed Drug Classification

Apr 07, 2025

Yujia Su, Xinjie Li, Lionel Z. Wang

Abstract:In the real world, long-tailed data distributions are prevalent, making it challenging for models to effectively learn and classify tail classes. However, we discover that in the field of drug chemistry, certain tail classes exhibit higher identifiability during training due to their unique molecular structural features, a finding that significantly contrasts with the conventional understanding that tail classes are generally difficult to identify. Existing imbalance learning methods, such as resampling and cost-sensitive reweighting, overly rely on sample quantity priors, causing models to excessively focus on tail classes at the expense of head class performance. To address this issue, we propose a novel method that breaks away from the traditional static evaluation paradigm based on sample size. Instead, we establish a dynamical inter-class separability metric using feature distances between different classes. Specifically, we employ a sub-clustering contrastive learning approach to thoroughly learn the embedding features of each class, and we dynamically compute the distances between class embeddings to capture the relative positional evolution of samples from different classes in the feature space, thereby rebalancing the weights of the classification loss function. We conducted experiments on multiple existing long-tailed drug datasets and achieved competitive results by improving the accuracy of tail classes without compromising the performance of dominant classes.

Are Conditional Latent Diffusion Models Effective for Image Restoration?

Dec 13, 2024

Yunchen Yuan, Junyuan Xiao, Xinjie Li

Abstract:Recent advancements in image restoration (IR) increasingly employ conditional latent diffusion models (CLDMs). While these models have demonstrated notable performance improvements in recent years, this work questions their suitability for IR tasks. CLDMs excel in capturing high-level semantic correlations, making them effective for tasks like text-to-image generation with spatial conditioning. However, in IR, where the goal is to enhance image perceptual quality, these models have difficulty modeling the relationship between degraded images and ground truth images using a low-level representation. To support our claims, we compare state-of-the-art CLDMs with traditional image restoration models through extensive experiments. Results reveal that despite the scaling advantages of CLDMs, they suffer from high distortion and semantic deviation, especially in cases with minimal degradation, where traditional methods outperform them. Additionally, we perform empirical studies to examine the impact of various CLDM design elements on their restoration performance. We hope this finding inspires a reexamination of current CLDM-based IR solutions, opening up more opportunities in this field.

Controlling the Latent Diffusion Model for Generative Image Shadow Removal via Residual Generation

Dec 03, 2024

Xinjie Li, Yang Zhao, Dong Wang, Yuan Chen, Li Cao, Xiaoping Liu

Abstract:Large-scale generative models have achieved remarkable advancements in various visual tasks, yet their application to shadow removal in images remains challenging. These models often generate diverse, realistic details without adequate focus on fidelity, failing to meet the crucial requirements of shadow removal, which necessitates precise preservation of image content. In contrast to prior approaches that aimed to regenerate shadow-free images from scratch, this paper utilizes diffusion models to generate and refine image residuals. This strategy fully uses the inherent detailed information within shadowed images, resulting in a more efficient and faithful reconstruction of shadow-free content. Additionally, to prevent the accumulation of errors during the generation process, a cross-timestep self-enhancement training strategy is proposed. This strategy leverages the network itself to augment the training data, not only increasing the volume of data but also enabling the network to dynamically correct its generation trajectory, ensuring a more accurate and robust output. In addition, to address the loss of original details in the process of image encoding and decoding of large generative models, a content-preserved encoder-decoder structure is designed with a control mechanism and multi-scale skip connections to achieve high-fidelity shadow-free image reconstruction. Experimental results demonstrate that the proposed method can reproduce high-quality results based on a large latent diffusion prior and faithfully preserve the original contents in shadow regions.

* 13 pages, 10 figures

EllipBench: A Large-scale Benchmark for Machine-learning based Ellipsometry Modeling

Jul 25, 2024

Yiming Ma, Xinjie Li, Xin Sun, Zhiyong Wang, Lionel Z. Wang

Abstract:Ellipsometry is used to indirectly measure the optical properties and thickness of thin films. However, solving the inverse problem of ellipsometry is time-consuming since it involves human expertise to apply the data fitting techniques. Many studies use traditional machine learning-based methods to model the complex mathematical fitting process. In our work, we approach this problem from a deep learning perspective. First, we introduce a large-scale benchmark dataset to facilitate deep learning methods. The proposed dataset encompasses 98 types of thin film materials and 4 types of substrate materials, including metals, alloys, compounds, and polymers, among others. Additionally, we propose a deep learning framework that leverages residual connections and self-attention mechanisms to learn the massive data points. We also introduce a reconstruction loss to address the common challenge of multiple solutions in thin film thickness prediction. Compared to traditional machine learning methods, our framework achieves state-of-the-art (SOTA) performance on our proposed dataset. The dataset and code will be available upon acceptance.

DiffBody: Human Body Restoration by Imagining with Generative Diffusion Prior

Apr 04, 2024

Yiming Zhang, Zhe Wang, Xinjie Li, Yunchen Yuan, Chengsong Zhang, Xiao Sun, Zhihang Zhong, Jian Wang

Abstract:Human body restoration plays a vital role in various applications related to the human body. Despite recent advances in general image restoration using generative models, their performance in human body restoration remains mediocre, often resulting in foreground and background blending, over-smoothed surface textures, missing accessories, and distorted limbs. Addressing these challenges, we propose a novel approach by constructing a human body-aware diffusion model that leverages domain-specific knowledge to enhance performance. Specifically, we employ a pretrained body attention module to guide the diffusion model's focus on the foreground, addressing issues caused by blending between the subject and background. We also demonstrate the value of revisiting the language modality of the diffusion model in restoration tasks by seamlessly incorporating text prompts to improve the quality of surface texture and of additional clothing and accessory details. Additionally, we introduce a diffusion sampler tailored for fine-grained human body parts, utilizing local semantic information to rectify limb distortions. Lastly, we collect a comprehensive dataset for benchmarking and advancing the field of human body restoration. Extensive experimental validation showcases the superiority of our approach, both quantitatively and qualitatively, over existing methods.

CI w/o TN: Context Injection without Task Name for Procedure Planning

Feb 23, 2024

Xinjie Li

Abstract:This paper explores the challenge of procedure planning in instructional videos, which involves creating goal-directed plans based on visual start and goal observations from videos. Previous research has tackled this problem with gradually weaker training supervision, from heavy intermediate visual observations or language instructions to task class supervision. However, with the advent of large language models, even given only the task name, these models can produce a detailed plan. In this study, we propose a much weaker setting without task name as supervision, which is not currently solvable by existing large language models since they require good prompts with sufficient information. Specifically, we hypothesize that previous intermediate supervisions can serve as context information, and we use captions of visual start and goal observations as a much cheaper form of supervision. This approach greatly reduces the labeling cost since the captions can be easily obtained by large pre-trained vision-language foundation models. Technically, we apply BLIP to generate captions as supervision to train the context feature with contrastive learning loss. Afterward, the context feature is fed into the generator to aid in plan generation. Our experiments on two datasets with varying scales demonstrate that our model can achieve comparable performance on multiple metrics, which validates our hypothesis.

On the Sweet Spot of Contrastive Views for Knowledge-enhanced Recommendation

Sep 23, 2023

Haibo Ye, Xinjie Li, Yuan Yao, Hanghang Tong

Abstract:In recommender systems, a knowledge graph (KG) can offer critical information that is lacking in the original user-item interaction graph (IG). Recent progress has explored this direction and shown that contrastive learning is a promising way to integrate both. However, we observe that existing KG-enhanced recommenders struggle to balance the two contrastive views of IG and KG, making them sometimes even less effective than simply applying contrastive learning on IG without using KG. In this paper, we propose a new contrastive learning framework for KG-enhanced recommendation. Specifically, to make full use of the knowledge, we construct two separate contrastive views for KG and IG, and maximize their mutual information; to ease the contrastive learning on the two views, we further fuse KG information into IG in a one-directional manner. Extensive experimental results on three real-world datasets demonstrate the effectiveness and efficiency of our method, compared to the state-of-the-art. Our code is available through the anonymous link: https://figshare.com/articles/conference_contribution/SimKGCL/22783382

Enhancing Underwater Image via Adaptive Color and Contrast Enhancement, and Denoising

Apr 02, 2021

Xinjie Li, Guojia Hou, Kunqian Li

Abstract:Images captured underwater are often characterized by low contrast, color distortion, and noise. To address these visual degradations, we propose a novel scheme by constructing an adaptive color and contrast enhancement, and denoising (ACCE-D) framework for underwater image enhancement. In the proposed framework, a Gaussian filter and a bilateral filter are respectively employed to decompose the high-frequency and low-frequency components. Benefiting from this separation, we use a soft-thresholding operation to suppress the noise in the high-frequency component. Accordingly, the low-frequency component is enhanced by using an adaptive color and contrast enhancement (ACCE) strategy. The proposed ACCE is a new adaptive variational framework implemented in the HSI color space, in which we design a Gaussian weight function and a Heaviside function to adaptively adjust the roles of the data item and the regularized item. Moreover, we derive a numerical solution for ACCE, and adopt a pyramid-based strategy to accelerate the solving procedure. Experimental results demonstrate that our strategy is effective in color correction, visibility improvement, and detail revealing. Comparisons with state-of-the-art techniques also validate the superiority of the proposed method. Furthermore, we have verified the utility of our proposed ACCE-D for enhancing other types of degraded scenes, including foggy, sandstorm, and low-light scenes.
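The soft-thresholding operation mentioned in the abstract can be sketched as follows. This is the standard shrinkage operator applied to a plain list of coefficients; how ACCE-D chooses the threshold or represents the high-frequency component is not stated in the abstract, so the function names and the threshold value `t` here are assumptions for illustration only.

```python
# Standard soft-thresholding (shrinkage) sketch; names and threshold are hypothetical.

def soft_threshold(x: float, t: float) -> float:
    """Shrink x toward zero by t; values with |x| <= t become exactly zero."""
    if x > t:
        return x - t
    if x < -t:
        return x + t
    return 0.0


def soft_threshold_all(values: list[float], t: float) -> list[float]:
    """Apply soft-thresholding to each coefficient of a high-frequency component."""
    return [soft_threshold(v, t) for v in values]


if __name__ == "__main__":
    high_freq = [3.0, -0.4, 0.9, -2.5]   # toy coefficients, not real image data
    print(soft_threshold_all(high_freq, t=1.0))
```

Small coefficients, which are more likely to be noise, are set to zero, while larger ones are kept but reduced in magnitude by `t`.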

Semantic Bilinear Pooling for Fine-Grained Recognition

Apr 03, 2019

Xinjie Li, Chun Yang, Songlu Chen, Chao Zhu, Xucheng Yin

Abstract:Fine-grained recognition, e.g., vehicle identification or bird classification, naturally has specific hierarchical labels, where fine levels are always much harder to classify than coarse levels. However, most of the recent deep learning-based methods neglect the semantic structure of fine-grained objects, and do not take advantage of the traditional fine-grained recognition techniques (e.g., coarse-to-fine classification). In this paper, we propose a novel framework, i.e., semantic bilinear pooling, for fine-grained recognition with hierarchical multi-label learning. This framework can adaptively learn the semantic information from the hierarchical labels. Specifically, a generalized softmax loss is designed for the training of the proposed framework, in order to fully exploit the semantic priors by considering the relevance between adjacent levels. A variety of experiments on several public datasets show that our proposed method has very impressive performance with low feature dimensions compared to other state-of-the-art methods.
