Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Jin Hong

ERANet: Edge Replacement Augmentation for Semi-Supervised Meniscus Segmentation with Prototype Consistency Alignment and Conditional Self-Training

Feb 11, 2025

Siyue Li, Yongcheng Yao, Junru Zhong, Shutian Zhao, Yudong Zhang, Shuihua Wang, Jin Hong, Weitian Chen

Abstract:Manual segmentation is labor-intensive, and automatic segmentation remains challenging due to the inherent variability in meniscal morphology, partial volume effects, and low contrast between the meniscus and surrounding tissues. To address these challenges, we propose ERANet, an innovative semi-supervised framework for meniscus segmentation that effectively leverages both labeled and unlabeled images through advanced augmentation and learning strategies. ERANet integrates three key components: edge replacement augmentation (ERA), prototype consistency alignment (PCA), and a conditional self-training (CST) strategy within a mean teacher architecture. ERA introduces anatomically relevant perturbations by simulating meniscal variations, ensuring that augmentations align with the structural context. PCA enhances segmentation performance by aligning intra-class features and promoting compact, discriminative feature representations, particularly in scenarios with limited labeled data. CST improves segmentation robustness by iteratively refining pseudo-labels and mitigating the impact of label noise during training. Together, these innovations establish ERANet as a robust and scalable solution for meniscus segmentation, effectively addressing key barriers to practical implementation. We validated ERANet comprehensively on 3D Double Echo Steady State (DESS) and 3D Fast/Turbo Spin Echo (FSE/TSE) MRI sequences. The results demonstrate the superior performance of ERANet compared to state-of-the-art methods. The proposed framework achieves reliable and accurate segmentation of meniscus structures, even when trained on minimal labeled data. Extensive ablation studies further highlight the synergistic contributions of ERA, PCA, and CST, solidifying ERANet as a transformative solution for semi-supervised meniscus segmentation in medical imaging.

Via

Access Paper or Ask Questions

WaveNet-SF: A Hybrid Network for Retinal Disease Detection Based on Wavelet Transform in the Spatial-Frequency Domain

Jan 21, 2025

Jilan Cheng, Guoli Long, Zeyu Zhang, Zhenjia Qi, Hanyu Wang, Libin Lu, Shuihua Wang, Yudong Zhang, Jin Hong

Figure 1 for WaveNet-SF: A Hybrid Network for Retinal Disease Detection Based on Wavelet Transform in the Spatial-Frequency Domain

Figure 2 for WaveNet-SF: A Hybrid Network for Retinal Disease Detection Based on Wavelet Transform in the Spatial-Frequency Domain

Figure 3 for WaveNet-SF: A Hybrid Network for Retinal Disease Detection Based on Wavelet Transform in the Spatial-Frequency Domain

Figure 4 for WaveNet-SF: A Hybrid Network for Retinal Disease Detection Based on Wavelet Transform in the Spatial-Frequency Domain

Abstract:Retinal diseases are a leading cause of vision impairment and blindness, with timely diagnosis being critical for effective treatment. Optical Coherence Tomography (OCT) has become a standard imaging modality for retinal disease diagnosis, but OCT images often suffer from issues such as speckle noise, complex lesion shapes, and varying lesion sizes, making interpretation challenging. In this paper, we propose a novel framework, WaveNet-SF, to enhance retinal disease detection by integrating spatial-domain and frequency-domain learning. The framework utilizes wavelet transforms to decompose OCT images into low- and high-frequency components, enabling the model to extract both global structural features and fine-grained details. To improve lesion detection, we introduce a multi-scale wavelet spatial attention (MSW-SA) module, which enhances the model's focus on regions of interest at multiple scales. Additionally, a high-frequency feature compensation block (HFFC) is incorporated to recover edge information lost during wavelet decomposition, suppress noise, and preserve fine details crucial for lesion detection. Our approach achieves state-of-the-art (SOTA) classification accuracies of 97.82% and 99. 58% on the OCT-C8 and OCT2017 datasets, respectively, surpassing existing methods. These results demonstrate the efficacy of WaveNet-SF in addressing the challenges of OCT image analysis and its potential as a powerful tool for retinal disease diagnosis.

Via

Access Paper or Ask Questions

Rethinking domain generalization in medical image segmentation: One image as one domain

Jan 08, 2025

Jin Hong, Bo Liu, Guoli Long

Abstract:Domain shifts in medical image segmentation, particularly when data comes from different centers, pose significant challenges. Intra-center variability, such as differences in scanner models or imaging protocols, can cause domain shifts as large as, or even larger than, those between centers. To address this, we propose the "one image as one domain" (OIOD) hypothesis, which treats each image as a unique domain, enabling flexible and robust domain generalization. Based on this hypothesis, we develop a unified disentanglement-based domain generalization (UniDDG) framework, which simultaneously handles both multi-source and single-source domain generalization without requiring explicit domain labels. This approach simplifies training with a fixed architecture, independent of the number of source domains, reducing complexity and enhancing scalability. We decouple each input image into content representation and style code, then exchange and combine these within the batch for segmentation, reconstruction, and further disentanglement. By maintaining distinct style codes for each image, our model ensures thorough decoupling of content representations and style codes, improving domain invariance of the content representations. Additionally, we enhance generalization with expansion mask attention (EMA) for boundary preservation and style augmentation (SA) to simulate diverse image styles, improving robustness to domain shifts. Extensive experiments show that our method achieves Dice scores of 84.43% and 88.91% for multi-source to single-center and single-center generalization in optic disc and optic cup segmentation, respectively, and 86.96% and 88.56% for prostate segmentation, outperforming current state-of-the-art domain generalization methods, offering superior performance and adaptability across clinical settings.

Via

Access Paper or Ask Questions

DGSSA: Domain generalization with structural and stylistic augmentation for retinal vessel segmentation

Jan 07, 2025

Bo Liu, Yudong Zhang, Shuihua Wang, Siyue Li, Jin Hong

Figure 1 for DGSSA: Domain generalization with structural and stylistic augmentation for retinal vessel segmentation

Figure 2 for DGSSA: Domain generalization with structural and stylistic augmentation for retinal vessel segmentation

Figure 3 for DGSSA: Domain generalization with structural and stylistic augmentation for retinal vessel segmentation

Figure 4 for DGSSA: Domain generalization with structural and stylistic augmentation for retinal vessel segmentation

Abstract:Retinal vascular morphology is crucial for diagnosing diseases such as diabetes, glaucoma, and hypertension, making accurate segmentation of retinal vessels essential for early intervention. Traditional segmentation methods assume that training and testing data share similar distributions, which can lead to poor performance on unseen domains due to domain shifts caused by variations in imaging devices and patient demographics. This paper presents a novel approach, DGSSA, for retinal vessel image segmentation that enhances model generalization by combining structural and style augmentation strategies. We utilize a space colonization algorithm to generate diverse vascular-like structures that closely mimic actual retinal vessels, which are then used to generate pseudo-retinal images with an improved Pix2Pix model, allowing the segmentation model to learn a broader range of structure distributions. Additionally, we utilize PixMix to implement random photometric augmentations and introduce uncertainty perturbations, thereby enriching stylistic diversity and significantly enhancing the model's adaptability to varying imaging conditions. Our framework has been rigorously evaluated on four challenging datasets-DRIVE, CHASEDB, HRF, and STARE-demonstrating state-of-the-art performance that surpasses existing methods. This validates the effectiveness of our proposed approach, highlighting its potential for clinical application in automated retinal vessel analysis.

Via

Access Paper or Ask Questions

Embodied AI in Mobile Robots: Coverage Path Planning with Large Language Models

Jul 02, 2024

Xiangrui Kong, Wenxiao Zhang, Jin Hong, Thomas Braunl

Figure 1 for Embodied AI in Mobile Robots: Coverage Path Planning with Large Language Models

Figure 2 for Embodied AI in Mobile Robots: Coverage Path Planning with Large Language Models

Figure 3 for Embodied AI in Mobile Robots: Coverage Path Planning with Large Language Models

Figure 4 for Embodied AI in Mobile Robots: Coverage Path Planning with Large Language Models

Abstract:In recent years, Large Language Models (LLMs) have demonstrated remarkable capabilities in understanding and solving mathematical problems, leading to advancements in various fields. We propose an LLM-embodied path planning framework for mobile agents, focusing on solving high-level coverage path planning issues and low-level control. Our proposed multi-layer architecture uses prompted LLMs in the path planning phase and integrates them with the mobile agents' low-level actuators. To evaluate the performance of various LLMs, we propose a coverage-weighted path planning metric to assess the performance of the embodied models. Our experiments show that the proposed framework improves LLMs' spatial inference abilities. We demonstrate that the proposed multi-layer framework significantly enhances the efficiency and accuracy of these tasks by leveraging the natural language understanding and generative capabilities of LLMs. Our experiments show that this framework can improve LLMs' 2D plane reasoning abilities and complete coverage path planning tasks. We also tested three LLM kernels: gpt-4o, gemini-1.5-flash, and claude-3.5-sonnet. The experimental results show that claude-3.5 can complete the coverage planning task in different scenarios, and its indicators are better than those of the other models.

* 7 pages, 2 figures, conference

Via

Access Paper or Ask Questions

Quantum-Inspired Machine Learning: a Survey

Sep 08, 2023

Larry Huynh, Jin Hong, Ajmal Mian, Hajime Suzuki, Yanqiu Wu, Seyit Camtepe

Figure 1 for Quantum-Inspired Machine Learning: a Survey

Figure 2 for Quantum-Inspired Machine Learning: a Survey

Figure 3 for Quantum-Inspired Machine Learning: a Survey

Figure 4 for Quantum-Inspired Machine Learning: a Survey

Abstract:Quantum-inspired Machine Learning (QiML) is a burgeoning field, receiving global attention from researchers for its potential to leverage principles of quantum mechanics within classical computational frameworks. However, current review literature often presents a superficial exploration of QiML, focusing instead on the broader Quantum Machine Learning (QML) field. In response to this gap, this survey provides an integrated and comprehensive examination of QiML, exploring QiML's diverse research domains including tensor network simulations, dequantized algorithms, and others, showcasing recent advancements, practical applications, and illuminating potential future research avenues. Further, a concrete definition of QiML is established by analyzing various prior interpretations of the term and their inherent ambiguities. As QiML continues to evolve, we anticipate a wealth of future developments drawing from quantum mechanics, quantum computing, and classical machine learning, enriching the field further. This survey serves as a guide for researchers and practitioners alike, providing a holistic understanding of QiML's current landscape and future directions.

* 59 pages, 13 figures, 9 tables. - Edited for spelling, grammar, and corrected minor typos in formulas - Adjusted wording in places for better clarity - Corrected contact info - Added Table 1 to clarify variables used in dequantized algs. - Added subsections in QVAS discussing QCBMs and TN-based VQC models - Included additional references as requested by authors to ensure a more exhaustive survey

Via

Access Paper or Ask Questions

Multi-resolution Spatiotemporal Enhanced Transformer Denoising with Functional Diffusive GANs for Constructing Brain Effective Connectivity in MCI analysis

May 18, 2023

Qiankun Zuo, Chi-Man Pun, Yudong Zhang, Hongfei Wang, Jin Hong

Figure 1 for Multi-resolution Spatiotemporal Enhanced Transformer Denoising with Functional Diffusive GANs for Constructing Brain Effective Connectivity in MCI analysis

Figure 2 for Multi-resolution Spatiotemporal Enhanced Transformer Denoising with Functional Diffusive GANs for Constructing Brain Effective Connectivity in MCI analysis

Figure 3 for Multi-resolution Spatiotemporal Enhanced Transformer Denoising with Functional Diffusive GANs for Constructing Brain Effective Connectivity in MCI analysis

Figure 4 for Multi-resolution Spatiotemporal Enhanced Transformer Denoising with Functional Diffusive GANs for Constructing Brain Effective Connectivity in MCI analysis

Abstract:Effective connectivity can describe the causal patterns among brain regions. These patterns have the potential to reveal the pathological mechanism and promote early diagnosis and effective drug development for cognitive disease. However, the current studies mainly focus on using empirical functional time series to calculate effective connections, which may not comprehensively capture the complex causal relationships between brain regions. In this paper, a novel Multi-resolution Spatiotemporal Enhanced Transformer Denoising (MSETD) network with an adversarially functional diffusion model is proposed to map functional magnetic resonance imaging (fMRI) into effective connectivity for mild cognitive impairment (MCI) analysis. To be specific, the denoising framework leverages a conditional diffusion process that progressively translates the noise and conditioning fMRI to effective connectivity in an end-to-end manner. To ensure reverse diffusion quality and diversity, the multi-resolution enhanced transformer generator is designed to extract local and global spatiotemporal features. Furthermore, a multi-scale diffusive transformer discriminator is devised to capture the temporal patterns at different scales for generation stability. Evaluations of the ADNI datasets demonstrate the feasibility and efficacy of the proposed model. The proposed model not only achieves superior prediction performance compared with other competing methods but also identifies MCI-related causal connections that are consistent with clinical studies.

* 12 pages, 14 figures

Via

Access Paper or Ask Questions

Source-free unsupervised domain adaptation for cross-modality abdominal multi-organ segmentation

Nov 30, 2021

Jin Hong, Yu-Dong Zhang, Weitian Chen

Figure 1 for Source-free unsupervised domain adaptation for cross-modality abdominal multi-organ segmentation

Figure 2 for Source-free unsupervised domain adaptation for cross-modality abdominal multi-organ segmentation

Figure 3 for Source-free unsupervised domain adaptation for cross-modality abdominal multi-organ segmentation

Figure 4 for Source-free unsupervised domain adaptation for cross-modality abdominal multi-organ segmentation

Abstract:It is valuable to achieve domain adaptation to transfer the learned knowledge from the source labeled CT dataset to the target unlabeled MR dataset for abdominal multi-organ segmentation. Meanwhile, it is highly desirable to avoid high annotation cost of target dataset and protect privacy of source dataset. Therefore, we propose an effective source-free unsupervised domain adaptation method for cross-modality abdominal multi-organ segmentation without accessing the source dataset. The process of the proposed framework includes two stages. At the first stage, the feature map statistics loss is used to align the distributions of the source and target features in the top segmentation network, and entropy minimization loss is used to encourage high confidence segmentations. The pseudo-labels outputted from the top segmentation network is used to guide the style compensation network to generate source-like images. The pseudo-labels outputted from the middle segmentation network is used to supervise the learning of the desired model (the bottom segmentation network). At the second stage, the circular learning and the pixel-adaptive mask refinement are used to further improve the performance of the desired model. With this approach, we achieve satisfactory performances on the segmentations of liver, right kidney, left kidney, and spleen with the dice similarity coefficients of 0.884, 0.891, 0.864, and 0.911, respectively. In addition, the proposed approach can be easily extended to the situation when there exists target annotation data. The performance improves from 0.888 to 0.922 in average dice similarity coefficient, close to the supervised learning (0.929), with only one labeled MR volume.

Via

Access Paper or Ask Questions

Unsupervised domain adaptation for cross-modality liver segmentation via joint adversarial learning and self-learning

Sep 20, 2021

Jin Hong, Simon Chun-Ho Yu, Weitian Chen

Figure 1 for Unsupervised domain adaptation for cross-modality liver segmentation via joint adversarial learning and self-learning

Figure 2 for Unsupervised domain adaptation for cross-modality liver segmentation via joint adversarial learning and self-learning

Figure 3 for Unsupervised domain adaptation for cross-modality liver segmentation via joint adversarial learning and self-learning

Figure 4 for Unsupervised domain adaptation for cross-modality liver segmentation via joint adversarial learning and self-learning

Abstract:Liver segmentation on images acquired using computed tomography (CT) and magnetic resonance imaging (MRI) plays an important role in clinical management of liver diseases. Compared to MRI, CT images of liver are more abundant and readily available. However, MRI can provide richer quantitative information of the liver compared to CT. Thus, it is desirable to achieve unsupervised domain adaptation for transferring the learned knowledge from the source domain containing labeled CT images to the target domain containing unlabeled MR images. In this work, we report a novel unsupervised domain adaptation framework for cross-modality liver segmentation via joint adversarial learning and self-learning. We propose joint semantic-aware and shape-entropy-aware adversarial learning with post-situ identification manner to implicitly align the distribution of task-related features extracted from the target domain with those from the source domain. In proposed framework, a network is trained with the above two adversarial losses in an unsupervised manner, and then a mean completer of pseudo-label generation is employed to produce pseudo-labels to train the next network (desired model). Additionally, semantic-aware adversarial learning and two self-learning methods, including pixel-adaptive mask refinement and student-to-partner learning, are proposed to train the desired model. To improve the robustness of the desired model, a low-signal augmentation function is proposed to transform MRI images as the input of the desired model to handle hard samples. Using the public data sets, our experiments demonstrated the proposed unsupervised domain adaptation framework outperformed four supervised learning methods with a Dice score 0.912 plus or minus 0.037 (mean plus or minus standard deviation).

Via

Access Paper or Ask Questions