Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Deqiang Xiao

Ran Score: a LLM-based Evaluation Score for Radiology Report Generation

Mar 24, 2026

Ran Zhang, Yucong Lin, Zhaoli Su, Bowen Liu, Danni Ai, Tianyu Fu, Deqiang Xiao, Jingfan Fan, Yuanyuan Wang, Mingwei Gao(+6 more)

Abstract:Chest X-ray report generation and automated evaluation are limited by poor recognition of low-prevalence abnormalities and inadequate handling of clinically important language, including negation and ambiguity. We develop a clinician-guided framework combining human expertise and large language models for multi-label finding extraction from free-text chest X-ray reports and use it to define Ran Score, a finding-level metric for report evaluation. Using three non-overlapping MIMIC-CXR-EN cohorts from a public chest X-ray dataset and an independent ChestX-CN validation cohort, we optimize prompts, establish radiologist-derived reference labels and evaluate report generation models. The optimized framework improves the macro-averaged score from 0.753 to 0.956 on the MIMIC-CXR-EN development cohort, exceeds the CheXbert benchmark by 15.7 percentage points on directly comparable labels, and shows robust generalization on the ChestX-CN validation cohort. Here we show that clinician-guided prompt optimization improves agreement with a radiologist-derived reference standard and that Ran Score enables finding-level evaluation of report fidelity, particularly for low-prevalence abnormalities.

* 4 pages, 5 figures

Via

Access Paper or Ask Questions

Robust Point Cloud Registration via Geometric Overlapping Guided Rotation Search

Aug 24, 2025

Zhao Zheng, Jingfan Fan, Long Shao, Hong Song, Danni Ai, Tianyu Fu, Deqiang Xiao, Yongtian Wang, Jian Yang

Abstract:Point cloud registration based on correspondences computes the rigid transformation that maximizes the number of inliers constrained within the noise threshold. Current state-of-the-art (SOTA) methods employing spatial compatibility graphs or branch-and-bound (BnB) search mainly focus on registration under high outlier ratios. However, graph-based methods require at least quadratic space and time complexity for graph construction, while multi-stage BnB search methods often suffer from inaccuracy due to local optima between decomposed stages. This paper proposes a geometric maximum overlapping registration framework via rotation-only BnB search. The rigid transformation is decomposed using Chasles' theorem into a translation along rotation axis and a 2D rigid transformation. The optimal rotation axis and angle are searched via BnB, with residual parameters formulated as range maximum query (RMQ) problems. Firstly, the top-k candidate rotation axes are searched within a hemisphere parameterized by cube mapping, and the translation along each axis is estimated through interval stabbing of the correspondences projected onto that axis. Secondly, the 2D registration is relaxed to 1D rotation angle search with 2D RMQ of geometric overlapping for axis-aligned rectangles, which is solved deterministically in polynomial time using sweep line algorithm with segment tree. Experimental results on 3DMatch, 3DLoMatch, and KITTI datasets demonstrate superior accuracy and efficiency over SOTA methods, while the time complexity is polynomial and the space complexity increases linearly with the number of points, even in the worst case.

Via

Access Paper or Ask Questions

Sculpting Margin Penalty: Intra-Task Adapter Merging and Classifier Calibration for Few-Shot Class-Incremental Learning

Aug 07, 2025

Liang Bai, Hong Song, Jinfu Li, Yucong Lin, Jingfan Fan, Tianyu Fu, Danni Ai, Deqiang Xiao, Jian Yang

Abstract:Real-world applications often face data privacy constraints and high acquisition costs, making the assumption of sufficient training data in incremental tasks unrealistic and leading to significant performance degradation in class-incremental learning. Forward-compatible learning, which prospectively prepares for future tasks during base task training, has emerged as a promising solution for Few-Shot Class-Incremental Learning (FSCIL). However, existing methods still struggle to balance base-class discriminability and new-class generalization. Moreover, limited access to original data during incremental tasks often results in ambiguous inter-class decision boundaries. To address these challenges, we propose SMP (Sculpting Margin Penalty), a novel FSCIL method that strategically integrates margin penalties at different stages within the parameter-efficient fine-tuning paradigm. Specifically, we introduce the Margin-aware Intra-task Adapter Merging (MIAM) mechanism for base task learning. MIAM trains two sets of low-rank adapters with distinct classification losses: one with a margin penalty to enhance base-class discriminability, and the other without margin constraints to promote generalization to future new classes. These adapters are then adaptively merged to improve forward compatibility. For incremental tasks, we propose a Margin Penalty-based Classifier Calibration (MPCC) strategy to refine decision boundaries by fine-tuning classifiers on all seen classes' embeddings with a margin penalty. Extensive experiments on CIFAR100, ImageNet-R, and CUB200 demonstrate that SMP achieves state-of-the-art performance in FSCIL while maintaining a better balance between base and new classes.

* 13 pages, 7 figures

Via

Access Paper or Ask Questions

MetaFE-DE: Learning Meta Feature Embedding for Depth Estimation from Monocular Endoscopic Images

Feb 05, 2025

Dawei Lu, Deqiang Xiao, Danni Ai, Jingfan Fan, Tianyu Fu, Yucong Lin, Hong Song, Xujiong Ye, Lei Zhang, Jian Yang

Figure 1 for MetaFE-DE: Learning Meta Feature Embedding for Depth Estimation from Monocular Endoscopic Images

Figure 2 for MetaFE-DE: Learning Meta Feature Embedding for Depth Estimation from Monocular Endoscopic Images

Figure 3 for MetaFE-DE: Learning Meta Feature Embedding for Depth Estimation from Monocular Endoscopic Images

Figure 4 for MetaFE-DE: Learning Meta Feature Embedding for Depth Estimation from Monocular Endoscopic Images

Abstract:Depth estimation from monocular endoscopic images presents significant challenges due to the complexity of endoscopic surgery, such as irregular shapes of human soft tissues, as well as variations in lighting conditions. Existing methods primarily estimate the depth information from RGB images directly, and often surffer the limited interpretability and accuracy. Given that RGB and depth images are two views of the same endoscopic surgery scene, in this paper, we introduce a novel concept referred as ``meta feature embedding (MetaFE)", in which the physical entities (e.g., tissues and surgical instruments) of endoscopic surgery are represented using the shared features that can be alternatively decoded into RGB or depth image. With this concept, we propose a two-stage self-supervised learning paradigm for the monocular endoscopic depth estimation. In the first stage, we propose a temporal representation learner using diffusion models, which are aligned with the spatial information through the cross normalization to construct the MetaFE. In the second stage, self-supervised monocular depth estimation with the brightness calibration is applied to decode the meta features into the depth image. Extensive evaluation on diverse endoscopic datasets demonstrates that our approach outperforms the state-of-the-art method in depth estimation, achieving superior accuracy and generalization. The source code will be publicly available.

Via

Access Paper or Ask Questions

Efficient Non-Exemplar Class-Incremental Learning with Retrospective Feature Synthesis

Nov 03, 2024

Liang Bai, Hong Song, Yucong Lin, Tianyu Fu, Deqiang Xiao, Danni Ai, Jingfan Fan, Jian Yang

Figure 1 for Efficient Non-Exemplar Class-Incremental Learning with Retrospective Feature Synthesis

Figure 2 for Efficient Non-Exemplar Class-Incremental Learning with Retrospective Feature Synthesis

Figure 3 for Efficient Non-Exemplar Class-Incremental Learning with Retrospective Feature Synthesis

Figure 4 for Efficient Non-Exemplar Class-Incremental Learning with Retrospective Feature Synthesis

Abstract:Despite the outstanding performance in many individual tasks, deep neural networks suffer from catastrophic forgetting when learning from continuous data streams in real-world scenarios. Current Non-Exemplar Class-Incremental Learning (NECIL) methods mitigate forgetting by storing a single prototype per class, which serves to inject previous information when sequentially learning new classes. However, these stored prototypes or their augmented variants often fail to simultaneously capture spatial distribution diversity and precision needed for representing old classes. Moreover, as the model acquires new knowledge, these prototypes gradually become outdated, making them less effective. To overcome these limitations, we propose a more efficient NECIL method that replaces prototypes with synthesized retrospective features for old classes. Specifically, we model each old class's feature space using a multivariate Gaussian distribution and generate deep representations by sampling from high-likelihood regions. Additionally, we introduce a similarity-based feature compensation mechanism that integrates generated old class features with similar new class features to synthesize robust retrospective representations. These retrospective features are then incorporated into our incremental learning framework to preserve the decision boundaries of previous classes while learning new ones. Extensive experiments on CIFAR-100, TinyImageNet, and ImageNet-Subset demonstrate that our method significantly improves the efficiency of non-exemplar class-incremental learning and achieves state-of-the-art performance.

* 13 pages, 9 figures

Via

Access Paper or Ask Questions

Double-Shot 3D Shape Measurement with a Dual-Branch Network

Jul 19, 2024

Mingyang Lei, Jingfan Fan, Long Shao, Hong Song, Deqiang Xiao, Danni Ai, Tianyu Fu, Ying Gu, Jian Yang

Figure 1 for Double-Shot 3D Shape Measurement with a Dual-Branch Network

Figure 2 for Double-Shot 3D Shape Measurement with a Dual-Branch Network

Figure 3 for Double-Shot 3D Shape Measurement with a Dual-Branch Network

Figure 4 for Double-Shot 3D Shape Measurement with a Dual-Branch Network

Abstract:The structured light (SL)-based 3D measurement techniques with deep learning have been widely studied, among which speckle projection profilometry (SPP) and fringe projection profilometry (FPP) are two popular methods. However, they generally use a single projection pattern for reconstruction, resulting in fringe order ambiguity or poor reconstruction accuracy. To alleviate these problems, we propose a parallel dual-branch Convolutional Neural Network (CNN)-Transformer network (PDCNet), to take advantage of convolutional operations and self-attention mechanisms for processing different SL modalities. Within PDCNet, a Transformer branch is used to capture global perception in the fringe images, while a CNN branch is designed to collect local details in the speckle images. To fully integrate complementary features, we design a double-stream attention aggregation module (DAAM) that consist of a parallel attention subnetwork for aggregating multi-scale spatial structure information. This module can dynamically retain local and global representations to the maximum extent. Moreover, an adaptive mixture density head with bimodal Gaussian distribution is proposed for learning a representation that is precise near discontinuities. Compared to the standard disparity regression strategy, this adaptive mixture head can effectively improves performance at object boundaries. Extensive experiments demonstrate that our method can reduce fringe order ambiguity while producing high-accuracy results on a self-made dataset. We also show that the proposed architecture reveals the potential in infrared-visible image fusion task.

Via

Access Paper or Ask Questions

High-Resolution Boundary Detection for Medical Image Segmentation with Piece-Wise Two-Sample T-Test Augmented Loss

Nov 04, 2022

Yucong Lin, Jinhua Su, Yuhang Li, Yuhao Wei, Hanchao Yan, Saining Zhang, Jiaan Luo, Danni Ai, Hong Song, Jingfan Fan(+5 more)

Figure 1 for High-Resolution Boundary Detection for Medical Image Segmentation with Piece-Wise Two-Sample T-Test Augmented Loss

Figure 2 for High-Resolution Boundary Detection for Medical Image Segmentation with Piece-Wise Two-Sample T-Test Augmented Loss

Figure 3 for High-Resolution Boundary Detection for Medical Image Segmentation with Piece-Wise Two-Sample T-Test Augmented Loss

Figure 4 for High-Resolution Boundary Detection for Medical Image Segmentation with Piece-Wise Two-Sample T-Test Augmented Loss

Abstract:Deep learning methods have contributed substantially to the rapid advancement of medical image segmentation, the quality of which relies on the suitable design of loss functions. Popular loss functions, including the cross-entropy and dice losses, often fall short of boundary detection, thereby limiting high-resolution downstream applications such as automated diagnoses and procedures. We developed a novel loss function that is tailored to reflect the boundary information to enhance the boundary detection. As the contrast between segmentation and background regions along the classification boundary naturally induces heterogeneity over the pixels, we propose the piece-wise two-sample t-test augmented (PTA) loss that is infused with the statistical test for such heterogeneity. We demonstrate the improved boundary detection power of the PTA loss compared to benchmark losses without a t-test component.

Via

Access Paper or Ask Questions

SkullEngine: A Multi-stage CNN Framework for Collaborative CBCT Image Segmentation and Landmark Detection

Oct 07, 2021

Qin Liu, Han Deng, Chunfeng Lian, Xiaoyang Chen, Deqiang Xiao, Lei Ma, Xu Chen, Tianshu Kuang, Jaime Gateno, Pew-Thian Yap(+1 more)

Figure 1 for SkullEngine: A Multi-stage CNN Framework for Collaborative CBCT Image Segmentation and Landmark Detection

Figure 2 for SkullEngine: A Multi-stage CNN Framework for Collaborative CBCT Image Segmentation and Landmark Detection

Figure 3 for SkullEngine: A Multi-stage CNN Framework for Collaborative CBCT Image Segmentation and Landmark Detection

Figure 4 for SkullEngine: A Multi-stage CNN Framework for Collaborative CBCT Image Segmentation and Landmark Detection

Abstract:We propose a multi-stage coarse-to-fine CNN-based framework, called SkullEngine, for high-resolution segmentation and large-scale landmark detection through a collaborative, integrated, and scalable JSD model and three segmentation and landmark detection refinement models. We evaluated our framework on a clinical dataset consisting of 170 CBCT/CT images for the task of segmenting 2 bones (midface and mandible) and detecting 175 clinically common landmarks on bones, teeth, and soft tissues.

* 10 pages, 5 figures, accepted by MLMI 2021

Via

Access Paper or Ask Questions

A Self-Supervised Deep Framework for Reference Bony Shape Estimation in Orthognathic Surgical Planning

Sep 11, 2021

Deqiang Xiao, Hannah Deng, Tianshu Kuang, Lei Ma, Qin Liu, Xu Chen, Chunfeng Lian, Yankun Lang, Daeseung Kim, Jaime Gateno(+4 more)

Figure 1 for A Self-Supervised Deep Framework for Reference Bony Shape Estimation in Orthognathic Surgical Planning

Figure 2 for A Self-Supervised Deep Framework for Reference Bony Shape Estimation in Orthognathic Surgical Planning

Figure 3 for A Self-Supervised Deep Framework for Reference Bony Shape Estimation in Orthognathic Surgical Planning

Figure 4 for A Self-Supervised Deep Framework for Reference Bony Shape Estimation in Orthognathic Surgical Planning

Abstract:Virtual orthognathic surgical planning involves simulating surgical corrections of jaw deformities on 3D facial bony shape models. Due to the lack of necessary guidance, the planning procedure is highly experience-dependent and the planning results are often suboptimal. A reference facial bony shape model representing normal anatomies can provide an objective guidance to improve planning accuracy. Therefore, we propose a self-supervised deep framework to automatically estimate reference facial bony shape models. Our framework is an end-to-end trainable network, consisting of a simulator and a corrector. In the training stage, the simulator maps jaw deformities of a patient bone to a normal bone to generate a simulated deformed bone. The corrector then restores the simulated deformed bone back to normal. In the inference stage, the trained corrector is applied to generate a patient-specific normal-looking reference bone from a real deformed bone. The proposed framework was evaluated using a clinical dataset and compared with a state-of-the-art method that is based on a supervised point-cloud network. Experimental results show that the estimated shape models given by our approach are clinically acceptable and significantly more accurate than that of the competing method.

* The 24th International Conference on Medical Image Computing and Computer-Assisted Intervention (MICCAI), 2021

Via

Access Paper or Ask Questions