Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Shaoyu Wang

Ordered-subsets Multi-diffusion Model for Sparse-view CT Reconstruction

May 15, 2025

Pengfei Yu, Bin Huang, Minghui Zhang, Weiwen Wu, Shaoyu Wang, Qiegen Liu

Abstract:Score-based diffusion models have shown significant promise in the field of sparse-view CT reconstruction. However, the projection dataset is large and riddled with redundancy. Consequently, applying the diffusion model to unprocessed data results in lower learning effectiveness and higher learning difficulty, frequently leading to reconstructed images that lack fine details. To address these issues, we propose the ordered-subsets multi-diffusion model (OSMM) for sparse-view CT reconstruction. The OSMM innovatively divides the CT projection data into equal subsets and employs multi-subsets diffusion model (MSDM) to learn from each subset independently. This targeted learning approach reduces complexity and enhances the reconstruction of fine details. Furthermore, the integration of one-whole diffusion model (OWDM) with complete sinogram data acts as a global information constraint, which can reduce the possibility of generating erroneous or inconsistent sinogram information. Moreover, the OSMM's unsupervised learning framework provides strong robustness and generalizability, adapting seamlessly to varying sparsity levels of CT sinograms. This ensures consistent and reliable performance across different clinical scenarios. Experimental results demonstrate that OSMM outperforms traditional diffusion models in terms of image quality and noise resilience, offering a powerful and versatile solution for advanced CT imaging in sparse-view scenarios.

Via

Access Paper or Ask Questions

Virtual-mask Informed Prior for Sparse-view Dual-Energy CT Reconstruction

Apr 10, 2025

Zini Chen, Yao Xiao, Junyan Zhang, Shaoyu Wang, Liu Shi, Qiegen Liu

Abstract:Sparse-view sampling in dual-energy computed tomography (DECT) significantly reduces radiation dose and increases imaging speed, yet is highly prone to artifacts. Although diffusion models have demonstrated potential in effectively handling incomplete data, most existing methods in this field focus on the image do-main and lack global constraints, which consequently leads to insufficient reconstruction quality. In this study, we propose a dual-domain virtual-mask in-formed diffusion model for sparse-view reconstruction by leveraging the high inter-channel correlation in DECT. Specifically, the study designs a virtual mask and applies it to the high-energy and low-energy data to perform perturbation operations, thus constructing high-dimensional tensors that serve as the prior information of the diffusion model. In addition, a dual-domain collaboration strategy is adopted to integrate the information of the randomly selected high-frequency components in the wavelet domain with the information in the projection domain, for the purpose of optimizing the global struc-tures and local details. Experimental results indicated that the present method exhibits excellent performance across multiple datasets.

Via

Access Paper or Ask Questions

HyperFree: A Channel-adaptive and Tuning-free Foundation Model for Hyperspectral Remote Sensing Imagery

Mar 27, 2025

Jingtao Li, Yingyi Liu, Xinyu Wang, Yunning Peng, Chen Sun, Shaoyu Wang, Zhendong Sun, Tian Ke, Xiao Jiang, Tangwei Lu(+2 more)

Abstract:Advanced interpretation of hyperspectral remote sensing images benefits many precise Earth observation tasks. Recently, visual foundation models have promoted the remote sensing interpretation but concentrating on RGB and multispectral images. Due to the varied hyperspectral channels,existing foundation models would face image-by-image tuning situation, imposing great pressure on hardware and time resources. In this paper, we propose a tuning-free hyperspectral foundation model called HyperFree, by adapting the existing visual prompt engineering. To process varied channel numbers, we design a learned weight dictionary covering full-spectrum from $0.4 \sim 2.5 \, \mu\text{m}$, supporting to build the embedding layer dynamically. To make the prompt design more tractable, HyperFree can generate multiple semantic-aware masks for one prompt by treating feature distance as semantic-similarity. After pre-training HyperFree on constructed large-scale high-resolution hyperspectral images, HyperFree (1 prompt) has shown comparable results with specialized models (5 shots) on 5 tasks and 11 datasets.Code and dataset are accessible at https://rsidea.whu.edu.cn/hyperfree.htm.

* Accepted by CVPR2025

Via

Access Paper or Ask Questions

Goal-Driven Reasoning in DatalogMTL with Magic Sets

Dec 10, 2024

Shaoyu Wang, Kaiyue Zhao, Dongliang Wei, Przemysław Andrzej Wałęga, Dingmin Wang, Hongmin Cai, Pan Hu

Figure 1 for Goal-Driven Reasoning in DatalogMTL with Magic Sets

Figure 2 for Goal-Driven Reasoning in DatalogMTL with Magic Sets

Abstract:DatalogMTL is a powerful rule-based language for temporal reasoning. Due to its high expressive power and flexible modeling capabilities, it is suitable for a wide range of applications, including tasks from industrial and financial sectors. However, due its high computational complexity, practical reasoning in DatalogMTL is highly challenging. To address this difficulty, we introduce a new reasoning method for DatalogMTL which exploits the magic sets technique -- a rewriting approach developed for (non-temporal) Datalog to simulate top-down evaluation with bottom-up reasoning. We implement this approach and evaluate it on several publicly available benchmarks, showing that the proposed approach significantly and consistently outperforms performance of the state-of-the-art reasoning techniques.

Via

Access Paper or Ask Questions

RED: Residual Estimation Diffusion for Low-Dose PET Sinogram Reconstruction

Nov 08, 2024

Xingyu Ai, Bin Huang, Fang Chen, Liu Shi, Binxuan Li, Shaoyu Wang, Qiegen Liu

Figure 1 for RED: Residual Estimation Diffusion for Low-Dose PET Sinogram Reconstruction

Figure 2 for RED: Residual Estimation Diffusion for Low-Dose PET Sinogram Reconstruction

Figure 3 for RED: Residual Estimation Diffusion for Low-Dose PET Sinogram Reconstruction

Figure 4 for RED: Residual Estimation Diffusion for Low-Dose PET Sinogram Reconstruction

Abstract:Recent advances in diffusion models have demonstrated exceptional performance in generative tasks across vari-ous fields. In positron emission tomography (PET), the reduction in tracer dose leads to information loss in sino-grams. Using diffusion models to reconstruct missing in-formation can improve imaging quality. Traditional diffu-sion models effectively use Gaussian noise for image re-constructions. However, in low-dose PET reconstruction, Gaussian noise can worsen the already sparse data by introducing artifacts and inconsistencies. To address this issue, we propose a diffusion model named residual esti-mation diffusion (RED). From the perspective of diffusion mechanism, RED uses the residual between sinograms to replace Gaussian noise in diffusion process, respectively sets the low-dose and full-dose sinograms as the starting point and endpoint of reconstruction. This mechanism helps preserve the original information in the low-dose sinogram, thereby enhancing reconstruction reliability. From the perspective of data consistency, RED introduces a drift correction strategy to reduce accumulated prediction errors during the reverse process. Calibrating the inter-mediate results of reverse iterations helps maintain the data consistency and enhances the stability of reconstruc-tion process. Experimental results show that RED effec-tively improves the quality of low-dose sinograms as well as the reconstruction results. The code is available at: https://github.com/yqx7150/RED.

Via

Access Paper or Ask Questions

Zero-shot Dynamic MRI Reconstruction with Global-to-local Diffusion Model

Nov 06, 2024

Yu Guan, Kunlong Zhang, Qi Qi, Dong Wang, Ziwen Ke, Shaoyu Wang, Dong Liang, Qiegen Liu

Figure 1 for Zero-shot Dynamic MRI Reconstruction with Global-to-local Diffusion Model

Figure 2 for Zero-shot Dynamic MRI Reconstruction with Global-to-local Diffusion Model

Figure 3 for Zero-shot Dynamic MRI Reconstruction with Global-to-local Diffusion Model

Figure 4 for Zero-shot Dynamic MRI Reconstruction with Global-to-local Diffusion Model

Abstract:Diffusion models have recently demonstrated considerable advancement in the generation and reconstruction of magnetic resonance imaging (MRI) data. These models exhibit great potential in handling unsampled data and reducing noise, highlighting their promise as generative models. However, their application in dynamic MRI remains relatively underexplored. This is primarily due to the substantial amount of fully-sampled data typically required for training, which is difficult to obtain in dynamic MRI due to its spatio-temporal complexity and high acquisition costs. To address this challenge, we propose a dynamic MRI reconstruction method based on a time-interleaved acquisition scheme, termed the Glob-al-to-local Diffusion Model. Specifically, fully encoded full-resolution reference data are constructed by merging under-sampled k-space data from adjacent time frames, generating two distinct bulk training datasets for global and local models. The global-to-local diffusion framework alternately optimizes global information and local image details, enabling zero-shot reconstruction. Extensive experiments demonstrate that the proposed method performs well in terms of noise reduction and detail preservation, achieving reconstruction quality comparable to that of supervised approaches.

* 11 pages, 9 figures

Via

Access Paper or Ask Questions

Optimus: Accelerating Large-Scale Multi-Modal LLM Training by Bubble Exploitation

Aug 07, 2024

Weiqi Feng, Yangrui Chen, Shaoyu Wang, Yanghua Peng, Haibin Lin, Minlan Yu

Figure 1 for Optimus: Accelerating Large-Scale Multi-Modal LLM Training by Bubble Exploitation

Figure 2 for Optimus: Accelerating Large-Scale Multi-Modal LLM Training by Bubble Exploitation

Figure 3 for Optimus: Accelerating Large-Scale Multi-Modal LLM Training by Bubble Exploitation

Figure 4 for Optimus: Accelerating Large-Scale Multi-Modal LLM Training by Bubble Exploitation

Abstract:Multimodal large language models (MLLMs) have extended the success of large language models (LLMs) to multiple data types, such as image, text and audio, achieving significant performance in various domains, including multimodal translation, visual question answering and content generation. Nonetheless, existing systems are inefficient to train MLLMs due to substantial GPU bubbles caused by the heterogeneous modality models and complex data dependencies in 3D parallelism. This paper proposes Optimus, a distributed MLLM training system that reduces end-to-end MLLM training time. Optimus is based on our principled analysis that scheduling the encoder computation within the LLM bubbles can reduce bubbles in MLLM training. To make scheduling encoder computation possible for all GPUs, Optimus searches the separate parallel plans for encoder and LLM, and adopts a bubble scheduling algorithm to enable exploiting LLM bubbles without breaking the original data dependencies in the MLLM model architecture. We further decompose encoder layer computation into a series of kernels, and analyze the common bubble pattern of 3D parallelism to carefully optimize the sub-millisecond bubble scheduling, minimizing the overall training time. Our experiments in a production cluster show that Optimus accelerates MLLM training by 20.5%-21.3% with ViT-22B and GPT-175B model over 3072 GPUs compared to baselines.

Via

Access Paper or Ask Questions

One-Step Detection Paradigm for Hyperspectral Anomaly Detection via Spectral Deviation Relationship Learning

Mar 22, 2023

Jingtao Li, Xinyu Wang, Shaoyu Wang, Hengwei Zhao, Liangpei Zhang, Yanfei Zhong

Figure 1 for One-Step Detection Paradigm for Hyperspectral Anomaly Detection via Spectral Deviation Relationship Learning

Figure 2 for One-Step Detection Paradigm for Hyperspectral Anomaly Detection via Spectral Deviation Relationship Learning

Figure 3 for One-Step Detection Paradigm for Hyperspectral Anomaly Detection via Spectral Deviation Relationship Learning

Figure 4 for One-Step Detection Paradigm for Hyperspectral Anomaly Detection via Spectral Deviation Relationship Learning

Abstract:Hyperspectral anomaly detection (HAD) involves identifying the targets that deviate spectrally from their surroundings, without prior knowledge. Recently, deep learning based methods have become the mainstream HAD methods, due to their powerful spatial-spectral feature extraction ability. However, the current deep detection models are optimized to complete a proxy task (two-step paradigm), such as background reconstruction or generation, rather than achieving anomaly detection directly. This leads to suboptimal results and poor transferability, which means that the deep model is trained and tested on the same image. In this paper, an unsupervised transferred direct detection (TDD) model is proposed, which is optimized directly for the anomaly detection task (one-step paradigm) and has transferability. Specially, the TDD model is optimized to identify the spectral deviation relationship according to the anomaly definition. Compared to learning the specific background distribution as most models do, the spectral deviation relationship is universal for different images and guarantees the model transferability. To train the TDD model in an unsupervised manner, an anomaly sample simulation strategy is proposed to generate numerous pairs of anomaly samples. Furthermore, a global self-attention module and a local self-attention module are designed to help the model focus on the "spectrally deviating" relationship. The TDD model was validated on four public HAD datasets. The results show that the proposed TDD model can successfully overcome the limitation of traditional model training and testing on a single image, and the model has a powerful detection ability and excellent transferability.

Via

Access Paper or Ask Questions

Anomaly Segmentation for High-Resolution Remote Sensing Images Based on Pixel Descriptors

Jan 31, 2023

Jingtao Li, Xinyu Wang, Hengwei Zhao, Shaoyu Wang, Yanfei Zhong

Figure 1 for Anomaly Segmentation for High-Resolution Remote Sensing Images Based on Pixel Descriptors

Figure 2 for Anomaly Segmentation for High-Resolution Remote Sensing Images Based on Pixel Descriptors

Figure 3 for Anomaly Segmentation for High-Resolution Remote Sensing Images Based on Pixel Descriptors

Figure 4 for Anomaly Segmentation for High-Resolution Remote Sensing Images Based on Pixel Descriptors

Abstract:Anomaly segmentation in high spatial resolution (HSR) remote sensing imagery is aimed at segmenting anomaly patterns of the earth deviating from normal patterns, which plays an important role in various Earth vision applications. However, it is a challenging task due to the complex distribution and the irregular shapes of objects, and the lack of abnormal samples. To tackle these problems, an anomaly segmentation model based on pixel descriptors (ASD) is proposed for anomaly segmentation in HSR imagery. Specifically, deep one-class classification is introduced for anomaly segmentation in the feature space with discriminative pixel descriptors. The ASD model incorporates the data argument for generating virtual ab-normal samples, which can force the pixel descriptors to be compact for normal data and meanwhile to be diverse to avoid the model collapse problems when only positive samples participated in the training. In addition, the ASD introduced a multi-level and multi-scale feature extraction strategy for learning the low-level and semantic information to make the pixel descriptors feature-rich. The proposed ASD model was validated using four HSR datasets and compared with the recent state-of-the-art models, showing its potential value in Earth vision applications.

* to be published in AAAI2023

Via

Access Paper or Ask Questions

Stabilizing Deep Tomographic Reconstruction Networks

Aug 19, 2020

Weiwen Wu, Dianlin Hu, Hengyong Yu, Hongming Shan, Shaoyu Wang, Wenxiang Cong, Chuang Niu, Pingkun Yan, Vince Vardhanabhuti, Ge Wang

Figure 1 for Stabilizing Deep Tomographic Reconstruction Networks

Figure 2 for Stabilizing Deep Tomographic Reconstruction Networks

Figure 3 for Stabilizing Deep Tomographic Reconstruction Networks

Figure 4 for Stabilizing Deep Tomographic Reconstruction Networks

Abstract:Tomographic image reconstruction with deep learning is an emerging field of applied artificial intelligence but a recent study reveals that deep reconstruction networks, such as well-known AUTOMAP, are unstable for computed tomography (CT) and magnetic resonance imaging (MRI). Specifically, three kinds of instabilities were identified: (1) strong output artefacts from tiny perturbation, (2) poor detection of small features, and (3) decreased performance with increased input data. These instabilities are believed to be from lacking kernel awareness and nontrivial to overcome, but compressed sensing (CS) reconstruction was reported to be stable due to its kernel awareness. Since deep reconstruction may potentially become the main driving force to achieve better image quality, stabilizing deep reconstruction networks is an urgent challenge. Here we propose an Analytic, Compressive, Iterative Deep (ACID) network to fundamentally address this challenge. Instead of only using deep learning or compressed sensing, ACID consists of four modules including deep reconstruction, CS, analytic mapping, and iterative refinement. In our experiments, ACID eliminated all three kinds of instabilities and significantly improved image quality relative to the methods in the aforementioned PNAS study. ACID is only an example of integrating diverse algorithmic ingredients but it has clearly demonstrated that data-driven reconstruction can be stabilized to outperform reconstruction using CS alone. The power of ACID comes from a unique combination of a deep reconstruction network trained on big data, CS via advanced optimization, and iterative refinement that stabilizes the whole workflow. We anticipate that this integrative closed-loop data driven approach will add great value to clinical and other applications.

* 34 pages, 11 figures, 1 table, 2 appendices, 62 references

Via

Access Paper or Ask Questions